13 Interesting Questions About SD-WAN

Thanks to everyone who participated in Networking Field Day 9. As a first-time participant, I must say it was both fun and informative to exchange ideas and approaches with the group. Several important and consequential questions were raised in the discussions; a great summary can be found on Ethan Banks’s blog, “Questions I’m Asking Myself About SD-WAN solutions”. We thought we’d take the opportunity to continue the discussion by providing CloudGenix’s view below…

SD-WAN Solutions

  1. What’s the impact to hosts on virtual machine based endpoints, i.e. how much CPU does an SD-WAN VM eat for solutions that use VMs? Not a simple question to answer anymore, as there’s usually cryptography involved. CloudGenix: By taking advantage of Intel data plane and crypto capabilities, and by implementing encryption algorithms intelligently, encryption at WAN speeds is now both practical and cost-effective on x86 architectures.
  2. How much latency does the SD-WAN controller introduce, and under what circumstances? CloudGenix: One of the key architectural requirements of an SD-WAN controller is that it should introduce NO latency to the data plane, which means that data plane traffic must not be sent to the controller. Policies should be evaluated locally & forwarding decisions should be made locally without requiring round trips across the WAN to the controller.
  3. When WAN-based SD-WAN tunnel endpoints are inevitably separated from the controller due to a network fault, what happens? CloudGenix: Because network faults over the WAN are a certainty, SD-WAN architectures should NOT require:
     The data plane to traverse the WAN to the controller
     Reachability between the endpoints and the controller for ongoing network operation
    The CloudGenix architecture provides such separation of tunnel endpoints and data plane from the controller.
  4. How does the SD-WAN infrastructure track tunnel availability, and how quickly does the controller react when a tunnel is down? CloudGenix: The CloudGenix architecture does NOT require the controller to track tunnels or respond to changes in path availability. Each endpoint tracks not only tunnel availability but also application availability and application performance on each tunnel. This allows us not only to leverage traditional techniques like BFD for network path failures but also to incorporate more comprehensive methods of rapidly detecting and mitigating application-specific brownouts. Because each endpoint constantly monitors both path availability and application performance on each of the paths, path changes can be made locally without incurring the performance and latency penalties of a round trip to the controller.
  5. What happens to in-flight traffic when a tunnel dies? (Every financial services organization is going to ask that question.) CloudGenix: In the CloudGenix model, if a tunnel dies, traffic will be moved to alternate available paths.
  6. How does the SD-WAN solution get traffic into the system? As in, routers attract traffic by being default gateways or being in the best path for a remote destination. SD-WAN tunnel endpoints need to attract traffic somehow, just like a WAN optimizer would. How is it done? WCCP? PBR? Static routing? (All 3 of those are mostly awful if you think about them for about 2.5 seconds.) Or do the SD-WAN endpoints interact with the underlay routing system with BGP or OSPF and advertise low cost routes across tunnels? Or are they placed inline? Or is some other method used? CloudGenix: CloudGenix provides 3 modes of deployment in customers’ networks:
    1. Discovery mode. In this mode, CloudGenix is placed in one or more branches (no presence in the data center or cloud is required). CloudGenix will automatically identify applications that are running on the network and measure application performance for SaaS, enterprise, and Unified Communications applications. No policy enforcement or optimization is performed.
    2. Overlay mode. In this mode, CloudGenix interoperates with the existing WAN infrastructure in the branch and data center providing visibility as well as policy enforcement and optimization.
    3. Router replacement mode. In this mode, CloudGenix provides visibility as well as policy enforcement and optimization. CloudGenix replaces legacy routers in the branch and/or the data center.

    In the discovery and overlay modes described above, CloudGenix can be inserted into customers’ environments with minimal changes to the existing infrastructure and in a phased manner, by selecting a subset of applications and/or branches to be carried on the CloudGenix network. Customers can use “in path” insertion in the branch and selective peering using protocols such as BGP in the data center. WCCP and PBR can also be used; however, they add complexity to the integration and can have unintended consequences on the rest of the infrastructure.

    Path selection decisions aren’t made using traditional Link State or Distance Vector Protocol “cost” metrics or link specific latency and jitter. Path selection decisions are made based upon specific application performance metrics (Application Response Time for transactional apps and CODEC conformance for media and Unified Communications apps).

  7. What about traffic I don’t want to go through the overlay fabric? How do I exempt it? CloudGenix: It is very easy to include or exclude traffic from the overlay fabric. We can do this with application-specific policies (down to the sub-application level) on the CloudGenix devices, or by selectively peering with specific branches at the data center CloudGenix device. This provides very granular control, down to the application and branch level, over what traffic gets placed on the fabric.
  8. Double-encryption is often a bad thing for application performance. Can certain traffic flows be exempted from encryption? As in, encrypted application traffic is tunneled across the overlay fabric, but not encrypted a second time by the tunnel? CloudGenix: Encryption can be selectively enabled/disabled based upon policy. That said, the CloudGenix architecture could support the additional encryption at enterprise WAN speeds without compromising throughput or application performance.
  9. Is the encapsulation type standard or proprietary? If it’s proprietary, convince me I don’t care. CloudGenix: Standard: VXLAN.
  10. Assuming unique keys per tunnel (and I’d hate to imagine a single key per tunnel fabric), how are these keys managed and by whom? CloudGenix: Our controller has a full PKI. Each link / context combination has a unique encryption key. The controller rotates keys frequently and automatically, without requiring administrator intervention.
  11. Is path symmetry important when traversing an SD-WAN infrastructure? Why or why not? Depending on how the controller handles flow state and reflects it to various endpoints in the tunnel overlay fabric, this could be an interesting answer. CloudGenix: Preserving symmetry is absolutely critical when traversing the WAN. Stateful network services such as firewalls require symmetry to work, and asymmetric routing can greatly complicate troubleshooting. Our solution operates at the application session/flow level. As such, flow symmetry is maintained across the fabric.
  12. Selectively forcing certain flows to traverse firewalls or other security devices is part of the SD-WAN unicorn. How, exactly, does this happen, and what are the network underlay dependencies required to bring it about? Ergo, SD-WAN service chaining differs from service chaining through a hypervisor-based vSwitch where a controller can direct the traffic inside of a nice, tidy ecosystem wherever it wants. SD-WAN service chaining has to work on traditional IP fabrics that have no inherent notion of service chaining, and all you’ve got to work with are overlay tunnel endpoints. CloudGenix: Our approach is to provide for granular classification of flows down to the sub-application and user group level and then forward the flows of interest to the location of the specific service sets. Each service set may comprise physical, virtual, and cloud-based services across one or more vendors. An SD-WAN solution should therefore bring the traffic directly to the service set without requiring any modifications to the existing network, and it should interoperate with the service chaining solutions available (or soon to be available) in the market.
  13. Just how granularly can I identify applications, considering progressively more applications are encrypted as they traverse the wire? CloudGenix: CloudGenix fingerprints applications at the application flow & session level. This allows us to classify encrypted applications down to the sub-application level without decrypting the streams.
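
To make the local decision-making in answers 4 through 6 a little more concrete, here is a small illustration. This is not CloudGenix code. It is a minimal Python sketch, assuming each endpoint keeps a per-path table of recently measured application response times and picks a path without consulting the controller. The names `PathState` and `select_path`, the path names, and the sample figures are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PathState:
    """Locally held state for one WAN path (hypothetical structure)."""
    name: str
    up: bool = True
    # Latest measured application response time in ms, keyed by application.
    app_response_ms: Dict[str, float] = field(default_factory=dict)


def select_path(paths: List[PathState], app: str) -> Optional[PathState]:
    """Choose a path for `app` using only measurements held on the endpoint.

    Paths that are down are skipped. Among the live paths, the one with the
    lowest measured response time for this application wins. A live path with
    no measurement for the application is treated as the last resort.
    """
    live = [p for p in paths if p.up]
    if not live:
        return None
    return min(live, key=lambda p: p.app_response_ms.get(app, float("inf")))


# Illustrative data only: two paths carrying a hypothetical "crm" application.
mpls = PathState("mpls", app_response_ms={"crm": 120.0})
internet = PathState("internet", app_response_ms={"crm": 80.0})
paths = [mpls, internet]

first_choice = select_path(paths, "crm")   # lower response time is preferred

internet.up = False                        # simulate a tunnel failure
after_failure = select_path(paths, "crm")  # decision is re-made locally
```

In this sketch the same function covers both a hard failure (the path is marked down) and a brownout (the path stays up but its measured response time for the application degrades), which is the distinction answer 4 draws. A real implementation would also need measurement smoothing and hysteresis to avoid flapping between paths; those are left out here.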

Looking forward to some more great discussions on SD-WAN.

Vijay Sagar & the team at CloudGenix