We were recently notified about a discussion on a few mailing lists around overlay tunnel scale in SD-WAN networks. As the number of nodes in the network increases, a full-mesh overlay leads to an O(n^2) explosion in the number of tunnels. In what follows, we provide our (short) perspective on some of these scale properties.
A bit of theory on tunnel numbers
Multiple uplink (access) circuits are commonplace in SD-WAN networks. Consequently, each node attempts to establish a tunnel over each of its uplinks to every uplink of a peer node, leading to a tunnel for each uplink combination. That does lead to the full-mesh complexity of O(n^2), but that’s the number of tunnels in the network as a whole. The number of tunnels on each individual node is a different quantity.
Instead of “42. (The answer to life, the universe, and everything.) - Douglas Adams”, we can actually express these two numbers in mathematical terms.
The number of tunnels on any given node, j, can be expressed as:
T_j = numUplinks_j * ( SUM(numUplinks_i) from i=1 to i=numNodes, i != j )
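As a quick sanity check, the per-node formula can be sketched in a few lines of Python. The function name and the 4-node uplink counts below are made up for illustration; nodes are indexed from 0 here rather than 1.

```python
def tunnels_on_node(uplinks, j):
    """Tunnels terminating on node j.

    Each uplink of node j pairs with every uplink of every *other* node,
    so T_j = numUplinks_j * sum(numUplinks_i for i != j).
    """
    remote_uplinks = sum(u for i, u in enumerate(uplinks) if i != j)
    return uplinks[j] * remote_uplinks


# Hypothetical network: number of uplinks on each of 4 nodes.
uplinks = [2, 2, 3, 1]
per_node = [tunnels_on_node(uplinks, j) for j in range(len(uplinks))]
print(per_node)  # [12, 12, 15, 7]
```

Note that the node with the most uplinks carries the most tunnels, even though every node has the same number of peers.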
As for the total number of tunnels in the network, it’s a bit more subtle than the plain combination formula C(N, 2) (that is, “N choose 2”), as a node doesn’t establish tunnels between its own uplinks. The network can, in fact, be represented as a complete multipartite graph in which each set contains the corresponding node’s uplinks. Accordingly, the total number of tunnels in the network can be expressed as follows:
N = SUM(numUplinks_i) from i=1 to i=numNodes,
T = C(N, 2) - (SUM(C(numUplinks_i, 2)) from i=1 to i=numNodes)
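To convince ourselves that the total is right, here is a small Python sketch that evaluates the closed form and cross-checks it against a brute-force count of uplink pairs that sit on different nodes. The function names and the sample uplink counts are ours, chosen purely for illustration.

```python
from itertools import combinations
from math import comb


def total_tunnels(uplinks):
    """Closed form: all uplink pairs minus the pairs that share a node."""
    n_total = sum(uplinks)
    return comb(n_total, 2) - sum(comb(u, 2) for u in uplinks)


def total_tunnels_brute_force(uplinks):
    """Enumerate every uplink pair and keep those on different nodes."""
    # One entry per uplink, labelled with the index of its owning node.
    owners = [node for node, count in enumerate(uplinks) for _ in range(count)]
    return sum(1 for a, b in combinations(owners, 2) if a != b)


uplinks = [2, 2, 3, 1]  # hypothetical 4-node network
total = total_tunnels(uplinks)
print(total)  # 23

# Every tunnel has two endpoints, so the per-node counts sum to twice the total.
per_node_sum = sum(u * (sum(uplinks) - u) for u in uplinks)
print(per_node_sum == 2 * total)  # True
```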
In summary, the number of tunnels a node needs to manage is in O(n). The total number of tunnels in the network is in O(n^2).
If you follow all the rules, you’ll miss all the fun!
The real world isn’t always as perfect as the above numbers. The actual number of tunnels deviates from the theoretical numbers for the following reasons:
- Disparate networks may lead to unreachability between some uplink combinations. E.g. one side is an Internet/broadband link and the other side is MPLS.
- NATs can have their way. E.g. the nodes sit behind symmetric NATs.
- Some links may simply not be in the UP state.
The actual number of tunnels, in reality, is thus less than the theoretical max.
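The gap between the theoretical and the actual count can be illustrated with a toy model. Everything here is an assumption made for the sake of the sketch: the `Uplink` fields, the transport labels, and in particular the reachability rules (links on different transports never reach each other, and two endpoints that are both behind symmetric NAT cannot connect directly). Real deployments are more nuanced.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Uplink:
    node: str
    transport: str                      # illustrative labels: "internet", "mpls"
    up: bool = True
    behind_symmetric_nat: bool = False


def can_form_tunnel(a, b):
    """Simplified reachability check between two uplinks (assumed rules)."""
    if a.node == b.node:
        return False                    # no tunnels between a node's own uplinks
    if not (a.up and b.up):
        return False                    # a link that is down carries no tunnel
    if a.transport != b.transport:
        return False                    # assumption: disparate networks are unreachable
    if a.behind_symmetric_nat and b.behind_symmetric_nat:
        return False                    # assumption: no direct path in this case
    return True


def theoretical_max(uplinks):
    return sum(1 for a, b in combinations(uplinks, 2) if a.node != b.node)


def actual_tunnels(uplinks):
    return sum(1 for a, b in combinations(uplinks, 2) if can_form_tunnel(a, b))


network = [
    Uplink("A", "internet"), Uplink("A", "mpls"),
    Uplink("B", "internet"), Uplink("B", "mpls", up=False),
    Uplink("C", "internet"),
]
print(theoretical_max(network), actual_tunnels(network))  # 8 3
```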
What are the tradeoffs?
In other words, can an SD-WAN system or CPE support tunnel scale even when the number of nodes in the network grows arbitrarily high? The good news is that the right implementation strategy goes a long way. In particular, the overlay tunnels should not be exposed as independent interfaces in the system, as that would be quite restrictive in terms of system resources.
For example, SPAN devices utilize DTLS to create overlay tunnels. It is lightweight, lives completely in user space, and takes up less encapsulation space compared to other VPN technologies. In addition, it lets us express each tunnel, logically, as simply a data structure (instead of creating a separate interface/device construct in the system).
That said, as the saying goes, “there is no such thing as a free lunch.” Each overlay tunnel will include at least the following state:
- Encapsulation context
- Crypto attributes
Now enter the “feature creep” that constantly leads to more state and processing overheads to be considered per tunnel, including the following:
- DPI-level stats
- Active SLA monitoring
- Optimization functions such as packet-level load balancing and FEC
The following table summarizes how these affect the base system resources:
A good implementation will want to keep all of the optional features pluggable and tunable to achieve scale.
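To make the “tunnel as a data structure” idea concrete, here is a minimal Python sketch. It is not how SPAN devices are implemented; the class names, fields, and feature labels are all hypothetical. The point is only that the base state is mandatory while each optional feature carries its own state and can be left off to save resources.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class CryptoState:
    cipher: str          # hypothetical fields standing in for "crypto attributes"
    key_id: int


@dataclass
class Tunnel:
    # Base state every tunnel carries.
    local_uplink: str
    remote_uplink: str
    encap_overhead_bytes: int
    crypto: CryptoState
    # Optional, pluggable per-tunnel state; None/False means the feature is off.
    dpi_stats: Optional[Dict[str, int]] = None
    sla_probe_interval_ms: Optional[int] = None
    fec_enabled: bool = False

    def enabled_features(self):
        features = []
        if self.dpi_stats is not None:
            features.append("dpi_stats")
        if self.sla_probe_interval_ms is not None:
            features.append("sla_monitoring")
        if self.fec_enabled:
            features.append("fec")
        return features


# The tunnel table is a plain dictionary keyed by the uplink pair,
# rather than one interface/device per tunnel.
tunnels: Dict[Tuple[str, str], Tunnel] = {}

lean = Tunnel("wan0", "peer7:wan1", 64, CryptoState("aes-gcm", 1))
full = Tunnel("wan0", "peer9:wan0", 64, CryptoState("aes-gcm", 2),
              dpi_stats={}, sla_probe_interval_ms=1000, fec_enabled=True)
for t in (lean, full):
    tunnels[(t.local_uplink, t.remote_uplink)] = t

print(lean.enabled_features())  # []
print(full.enabled_features())  # ['dpi_stats', 'sla_monitoring', 'fec']
```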
Most SPAN devices scale to thousands of overlay tunnels easily. See the following figure for a snapshot of CPU and memory usage as we scale the number of tunnels on a mid-range SPAN device. This is with a fully-featured configuration. Maintaining DPI-level stats per tunnel contributes the most to the memory usage in our system.
SPAN also provides an extensible policy framework to build dynamic topology groups by matching on specific attributes. E.g. if the link is LTE, the topology should be hub-and-spoke. This is quite useful, as LTE links are quite sensitive to the amount of data being sent on them.
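The attribute-matching idea can be sketched as a first-match rule list. This is only an illustration of the concept, not SPAN’s actual policy syntax or API; the `Rule` class, the attribute names, and the `full-mesh` default are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Rule:
    match: Dict[str, str]   # attribute name -> required value
    topology: str           # topology to apply when every attribute matches


def select_topology(link_attrs: Dict[str, str], rules: List[Rule],
                    default: str = "full-mesh") -> str:
    """Return the topology of the first rule whose attributes all match."""
    for rule in rules:
        if all(link_attrs.get(key) == value for key, value in rule.match.items()):
            return rule.topology
    return default


rules = [Rule(match={"link_type": "lte"}, topology="hub-and-spoke")]

print(select_topology({"link_type": "lte", "site": "branch-12"}, rules))  # hub-and-spoke
print(select_topology({"link_type": "broadband"}, rules))                 # full-mesh
```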
Now to the real question: does it make sense to always go for a full-mesh overlay topology irrespective of the number of nodes in the network? The answer lies in a property of system design that’s often overlooked: debuggability. If each node in the network has thousands of tunnels, how does the network administrator really debug questions such as (a) are all the tunnels up, and (b) is the data going over the right set of tunnels? Although the SD-WAN system provides a substantial set of tools and visibility, they are not enough to troubleshoot such issues at scale.
For large networks, it thus makes more practical sense to decompose into smaller subnetworks and build a hierarchical topology.
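A back-of-the-envelope comparison shows how much a hierarchy can shrink the per-node tunnel count. The decomposition modelled here is just one possibility that we assume for illustration: nodes are split into equal-sized regions, each region is a full mesh, and one hub per region additionally meshes with the hubs of all other regions. Every node is assumed to have the same number of uplinks.

```python
def full_mesh_per_node(num_nodes, uplinks_per_node):
    """Per-node tunnels in a flat full mesh with uniform uplink counts."""
    return uplinks_per_node * uplinks_per_node * (num_nodes - 1)


def hierarchical_per_node(num_nodes, uplinks_per_node, region_size):
    """Per-node tunnels for (regular member, regional hub) in the assumed hierarchy."""
    if num_nodes % region_size != 0:
        raise ValueError("this sketch assumes equal-sized regions")
    num_regions = num_nodes // region_size
    per_peer = uplinks_per_node * uplinks_per_node
    member = per_peer * (region_size - 1)        # full mesh inside the region
    hub = member + per_peer * (num_regions - 1)  # plus a mesh among the hubs
    return member, hub


# Hypothetical network: 1000 nodes, 2 uplinks each, regions of 50 nodes.
print(full_mesh_per_node(1000, 2))         # 3996
print(hierarchical_per_node(1000, 2, 50))  # (196, 272)
```

Even the hubs end up with an order of magnitude fewer tunnels than a node in the flat full mesh, which is a far more tractable number to inspect by hand.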