Test – Throughput alchemy using a snake topology
Sometimes it’s best not to trust network vendor datasheets. Nothing quite beats a controlled test of a network device in your lab with your config and your required features. But if you want to load test multiple ports on your 10G device-under-test (or DUT), then things can get very expensive, very fast.
In this post I’ll show a test topology that will help you turn 10Gbps of test traffic into 640Gbps or more.
Test every port at line rate
So before you get excited you should note this is a ‘niche’ test topology. Two or three test ports will cover the vast majority of test scenarios. There are very few reasons why you’d need to run line-rate throughput tests on all ports of a DUT at the same time.
Furthermore, this test is hard to pull off on L4-L7 networking devices, as they may lack the ‘internal loopback’ capability needed by the topology. The CPU-processed path of an L4-L7 device is the part most likely to fail under high throughput conditions. ASIC-switched L2 or L3 traffic will most likely transit your DUT unhindered.
Here are some reasons to test using the ‘snake’ topology.
- Burn-in testing. You may want to burn-in a network device before it gets into service. i.e. get it forwarding on all ports at line rate for 24-hours before entering live service.
- Acceptance testing for a customer. I have heard of some customers demanding proof that a device gives the full throughput promised by the datasheet. Probably something a VAR would charge extra for, but hard to execute cost-effectively.
- If full line-rate forwarding is critical to your design. Most of the time it simply isn’t that important. If your design hinges on this assumption, you’d better test it.
- Power draw. If you have a keen electrical engineer, they may want to know the ‘actual’ power draw of your network device when operating at full tilt. The PHY chips on a high-density switch can consume a lot of power, so exercising them will increase your power draw. Over-provisioning power can cost your company a lot of money, so this test may deliver the highest return.
Layer 2 Snake test
So that’s the caveats out of the way. How would you test a modern 64-port 10Gbps switch at line rate on all ports at the same time? That’s a whopping 640Gbps of test traffic. In our example we’ll use the Spirent Axon as I know it’s targeted at enterprise customers and is available in a 2 x 10Gbps variant. I’m sure there are other entry-level 10G testers out there which can do a similar job.
The test device connects to the first port on the DUT and then ‘snakes’ the same stream of 10Gbps traffic from the tester through the DUT. The stream of test traffic exits the DUT on its last port before returning to the test device.
There are some points to note here. You can see that I’ve used vlans to provide the internal loopbacks, so this is a layer-2 switching test. You need n/2 VLANS where ‘n’ is the number of ports you want to test. In the diagram above n = 8, but the main point here is that ‘n’ isn’t limited by the number of test device ports available to you.
By externally bridging the VLANs together, you force the device to switch the same frame once per VLAN, and exercise all ‘n’ ports at once. Most testers can run a test port bi-directionally so that you can send and receive a duplex 10Gbps flow.
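To make the port-to-VLAN layout concrete, here is a minimal Python sketch that generates it for any even port count. The numbering is my assumption (ports 1 and 2 in VLAN 1, ports 3 and 4 in VLAN 2, and so on, with the tester on the first and last port); your switch and diagram may number things differently.

```python
def snake_plan(n_ports):
    """Return (vlans, loops) for a layer-2 snake across ports 1..n_ports.

    vlans maps a VLAN id to the pair of access ports it joins.
    loops lists the external patch cables that bridge neighbouring VLANs.
    The port and VLAN numbering is an assumed convention, not a vendor rule.
    """
    if n_ports < 2 or n_ports % 2:
        raise ValueError("the snake needs an even number of ports")
    vlans = {v: (2 * v - 1, 2 * v) for v in range(1, n_ports // 2 + 1)}
    # The second port of each VLAN is patched to the first port of the next.
    loops = [(p, p + 1) for p in range(2, n_ports, 2)]
    return vlans, loops


vlans, loops = snake_plan(8)
for vlan_id, (a, b) in vlans.items():
    print(f"VLAN {vlan_id}: access ports {a} and {b}")
for a, b in loops:
    print(f"patch cable: port {a} <-> port {b}")
print("tester attaches to ports 1 and 8")
```

Calling `snake_plan(64)` gives the layout for the 64-port example without needing any more tester ports.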
EDIT: If you’re finding this difficult to understand, remember that these are VLAN access ports. As such the frame that leaves the port from VLAN1 into the loopback patch is ‘untagged’. At this point, the frame doesn’t belong to ‘any’ VLAN. VLAN2 receives the frame from the loopback port and, based on switch configuration for the port, regards that received frame as part of VLAN2… and so on.
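If it helps, the same idea can be sketched as a few lines of Python. This is a toy model, not switch code: it assumes an 8-port DUT with two access ports per VLAN (the port numbers and VLAN ids are hypothetical) and follows one untagged frame in the forward direction only.

```python
# Hypothetical 8-port DUT: access-port VLAN membership and external loop cables.
PORT_VLAN = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3, 7: 4, 8: 4}
LOOP_CABLES = {2: 3, 4: 5, 6: 7}  # egress port -> port at the far end of the patch


def walk_frame(ingress_port, port_vlan, cables):
    """Follow one untagged frame; return a list of (in_port, vlan, out_port) hops."""
    hops = []
    port = ingress_port
    while True:
        # The VLAN comes from the receiving port's config, not from the frame.
        vlan = port_vlan[port]
        # With two ports per VLAN, the frame can only leave via the other one.
        egress = next(p for p, v in port_vlan.items() if v == vlan and p != port)
        hops.append((port, vlan, egress))
        if egress not in cables:
            return hops  # no loop cable here, so the frame returns to the tester
        # On the wire the frame is untagged; the next port reclassifies it.
        port = cables[egress]


for in_port, vlan, out_port in walk_frame(1, PORT_VLAN, LOOP_CABLES):
    print(f"in port {in_port} -> classified as VLAN {vlan} -> out port {out_port}")
```

The frame itself never changes. Only the switch's per-port classification does, which is why the same frame can pass through every VLAN in turn.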
You’ll need n transceivers for the DUT and n/2 + 1 patch cables: n/2 - 1 external loopbacks plus the two links to the tester. NOTE: you are deliberately creating a looped topology with the external patch cables. I just disable STP, but in theory PVST should allow this topology. Just don’t forget to reset the config after the DUT leaves the lab. Also, please, please isolate your DUT from all networks except serial console before doing any kind of testing.
Layer 3 snake test
You can also test at layer 3 if you want. This time you need to use n/2 VRF instances instead of vlans. You will also need a route in each VRF to reach the test-device’s target IP interface. Note that some devices won’t let you configure this many VRFs, so your mileage may vary. If you’re testing for power consumption or trying to burn-in, then the Layer 2 test will probably suffice.
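As a rough illustration of the per-VRF routing, here is a Python sketch that lays out one possible addressing plan. Everything specific in it is an assumption of mine: a /30 per cable carved from 10.0.0.0/24, two ports per VRF, and the forward direction only. It just shows which next hop each VRF would need to reach the tester's target IP.

```python
import ipaddress


def l3_snake_plan(n_ports, supernet="10.0.0.0/24"):
    """Return (target_ip, plan) for a forward-direction layer-3 snake.

    Assumed layout: link 0 is tester -> port 1, link k joins port 2k to
    port 2k+1, and the last link joins port n to the tester's target IP.
    """
    if n_ports < 2 or n_ports % 2:
        raise ValueError("the snake needs an even number of ports")
    n_vrfs = n_ports // 2
    links = list(ipaddress.ip_network(supernet).subnets(new_prefix=30))
    if len(links) < n_vrfs + 1:
        raise ValueError("supernet too small for this many links")
    links = links[: n_vrfs + 1]
    target = list(links[-1].hosts())[1]  # tester's receive-side address
    plan = []
    for v in range(1, n_vrfs + 1):
        ingress_ip = list(links[v - 1].hosts())[1]
        egress_ip, far_end = list(links[v].hosts())
        plan.append({
            "vrf": v,
            "ports": (2 * v - 1, 2 * v),
            "ingress_ip": ingress_ip,
            "egress_ip": egress_ip,
            # The last VRF is directly connected to the target, so no route.
            "next_hop": far_end if v < n_vrfs else None,
        })
    return target, plan


target, plan = l3_snake_plan(8)
for vrf in plan:
    via = vrf["next_hop"] or "directly connected"
    print(f"VRF {vrf['vrf']} ports {vrf['ports']}: route to {target} via {via}")
```

A bidirectional test would need the mirror-image routes back towards the tester's source address as well.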
The snake topology allows you to leverage an entry-level tester for big results. It can help you avoid a big spend if you need to blast a lot of traffic at a modern high-density switch. Regard it as another tool in your testing toolbox.
Disclaimer: Spirent presented the Axon at NFD4 which I attended as a delegate. I have received no incentive for this post. See http://thenetworksherpa.com/disclaimer/ for more details.
21 thoughts on “Test – Throughput alchemy using a snake topology”
Snake tests are not particularly useful for any sort of switch testing. For multi-ASIC devices, traffic will never hit the backplane, and thus you’re not providing any sort of meaningful stress to the device at all.
When I wrote this post I was thinking of a single RU switch-on-chip style device with a single stage architecture. However, I don’t agree that ‘snake tests are not particularly useful’ to multi-stage/backplane devices.
You can use the snake topology to test your backplane if you so wish. Take a Nx7010 with 2 x Fab-1 fabric modules and 2 x M108 I/O modules. If you wanted to stress both line cards and the fabric you could use the following topology.
Slot1-port1 -> Vlan_1 -> Slot2-port1 -> External loop -> Slot2-port2 -> Vlan_2 -> Slot1-port2 -> External loop ….. etc. That topology would stress all 16 x M108 10G ports at line rate whilst giving the backplane 80Gbps of traffic.
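A quick Python sketch of that hop sequence, assuming the pattern simply continues for all eight ports on each module, with each VLAN pairing the same port number on both slots. The function name and the single-direction fabric figure are mine.

```python
def cross_fabric_snake(ports_per_module, port_gbps=10):
    """Hop list for a snake that crosses between slot 1 and slot 2 in every VLAN."""
    hops = []
    for vlan in range(1, ports_per_module + 1):
        # Odd VLANs run slot 1 -> slot 2; even VLANs run slot 2 -> slot 1.
        src, dst = (1, 2) if vlan % 2 else (2, 1)
        hops.append((vlan, f"Slot{src}-port{vlan}", f"Slot{dst}-port{vlan}"))
    ports_used = 2 * len(hops)
    # Each VLAN carries the stream across the fabric once (one direction).
    fabric_gbps = len(hops) * port_gbps
    return hops, ports_used, fabric_gbps


hops, ports_used, fabric_gbps = cross_fabric_snake(8)
for vlan, ingress, egress in hops:
    print(f"Vlan_{vlan}: {ingress} -> {egress}")
print(f"{ports_used} ports exercised, {fabric_gbps}Gbps across the fabric")
```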
Thanks again for the comment.
I stand corrected; your detailed scenario is more stressful than what I had originally envisioned. However, that still doesn’t load up the backplane as much as a full-mesh test would.
I guess the point is that the snake test can produce a big, impressive bandwidth number with little in the way of test hardware, but it isn’t as stressful on the device as an RFC style full-mesh test. The forwarding numbers measured by the two different tests come with some important distinctions.
Thanks for the extra details. I’m always happy to learn more so I’ll dig a little deeper into the full mesh tests.
Good stuff. I think that with limited test resources, the snake test is a way of adding load to the switch so you can understand power draw at full load, and also throughput and latency at full line rate. When running an RFC 2544 test with a snake setup, we can see that the latency is that of one line-rate stream passing through the whole path (one-hop latency * the number of hops in the snake), and the counters on all ports on the path increase. So it kind of simulates traffic passing through all of the switch’s ports.
As for fully stressing the switch and loading the backplane correctly, I think designing the test topology around an understanding of the backplane architecture would help. Just my two cents.
Great point Maggie. Whilst the black box test idea is useful, you can’t really stress the box properly without knowing the internal architecture. Thanks for the comments!
This is an interesting bit. Please can you advise if you were able to find any leads for the full mesh test? It would be nice to know more about it.
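To put a number on the latency point above, a minimal Python sketch. It assumes one switching hop per VLAN in the layer-2 snake and ignores cable and transceiver delay; the 48 microsecond reading is a made-up example, not a measurement.

```python
def per_hop_latency_us(end_to_end_us, n_ports):
    """Split a snake's end-to-end latency evenly across its switching hops."""
    hops = n_ports // 2  # one switching hop per VLAN in the layer-2 snake
    return end_to_end_us / hops


# Hypothetical reading: 48 microseconds end to end across a 64-port snake.
print(per_hop_latency_us(48.0, 64))  # 1.5
```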
The most popular full mesh test by far is RFC 2544. There’s a nice overview on the packet pushers blog. http://packetpushers.net/smoke-and-mirrors-questions-and-rfcs-that-help-you-interpret-vendor-performance-claims/
I can’t understand what Loopback is?
plz can you share your configuration…..
I really need it…
I don’t have the configuration snippet available right now. The loopback patch is just a physical cable, either fiber or copper, depending on the port type of your switch.
If you’re doing a L2 test I recommend:
1) disconnect this device from all production networks and then disable spanning-tree
2) use the CLI of the switch in question to configure the vlan membership as shown in the diagram. If you were using
3) before you load test, try a simple ARP. If it doesn’t work, re-check your cabling and try again.
If you’re having difficulty then scale down the test and run it through a single VLAN. When that works, scale it up until you have the full snake topology.
Any particular reason for using VLANs for internal loopback in the case of an L2 switch test? From the Spirent tester ports, traffic streams with reversed MAC addresses will be sent, and the MAC addresses will get learnt on the appropriate ports. Will VLANs only be used for keeping two ports in one VLAN, to restrict flooding? Because we are certainly not interested in unknown-unicast flooding throughput, even if there are only 2 ports in one VLAN.
The starting point for these tests is the ARP reply (or gratuitous ARP) sent by the tester’s return port.
If all ports shared a common VLAN, say VLAN1, then VLAN1 would become aware of the tester’s return port, and would map its MAC address to that port. Upon receiving a frame from the tester’s input port, the switch would forward the frame directly to the return port, bypassing all the intermediate ports. You would end up with a two-port test.
We need to use multiple VLANs to hide the return port’s location. Each VLAN learns from the original ARP reply to forward frames destined for the return port’s MAC out of its own external loopback interface. The next VLAN makes a similar decision, and so on, until you get to the true return port in the final VLAN.
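Here is a small Python sketch of the difference. It is a toy model that takes the forwarding table each case ends up with (whether learned from the ARP reply or configured statically) and counts the DUT ports a frame touches. The MAC address, port numbers and VLAN ids are all hypothetical.

```python
def ports_exercised(port_vlan, fdb, cables, ingress, dmac):
    """List the DUT ports one unicast frame touches.

    fdb maps (vlan, destination MAC) -> egress port, so the same MAC can
    have a different entry in every VLAN but only one entry per VLAN.
    """
    touched = []
    port = ingress
    while port is not None:
        if port in touched:
            raise RuntimeError(f"forwarding loop: frame re-entered port {port}")
        egress = fdb[(port_vlan[port], dmac)]
        touched += [port, egress]
        port = cables.get(egress)  # None when the egress port faces the tester
    return touched


RX = "02:00:00:00:00:08"  # hypothetical MAC of the tester's return port
CABLES = {2: 3, 4: 5, 6: 7}

# One VLAN per port pair: each VLAN points the MAC at its own loopback port.
pair_vlan = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3, 7: 4, 8: 4}
pair_fdb = {(1, RX): 2, (2, RX): 4, (3, RX): 6, (4, RX): 8}

# One shared VLAN: there is only room for a single entry for the MAC.
shared_vlan = {p: 1 for p in range(1, 9)}
shared_fdb = {(1, RX): 8}

print(ports_exercised(pair_vlan, pair_fdb, CABLES, 1, RX))      # [1, 2, 3, 4, 5, 6, 7, 8]
print(ports_exercised(shared_vlan, shared_fdb, CABLES, 1, RX))  # [1, 8]
```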
Let me know if I haven’t understood your question or feel free to propose an alternative configuration. Thanks for your comment.
Hi John, Thanks for the quick response. I was under the impression that traffic streams for L2 switching can be configured on traffic generator with user defined DMAC, SMAC and other header fields, and ARP would play no role for MAC learning. Alternatively, if mac addresses can be appropriately configured statically on switch ports, then within same VLAN (=1), only data traffic throughput can be tested without bothering about VLANs and MAC ageing. Please share your views.
You’re right about the ARP. It’s not mandatory to do an ARP operation; it’s just a useful way to perform MAC learning. I guess I’m used to applying IP addresses to the test end-points. If you statically map MACs to ports, or send traffic from the DMAC in the opposite direction, you can also perform MAC learning.
However, you still have a snake-test challenge even if you bypass MAC ageing and move to static MAC-to-port assignments. If everyone is in VLAN 1, which port would you map the tester’s DMAC to? If it’s only the tester’s destination port, then you’re back to a two-port switch test.
The instrument port is not enough. Can I test as L3 (full mesh) + L2 (snake)?
Not sure what you mean by ‘instrument port’ is not enough. Are you saying that the snake test is not enough? It depends on what you’re trying to achieve (test requirements). An RFC 2889 full mesh test is much more comprehensive.
Are you looking to do full-mesh and snake in parallel? I don’t see a huge benefit to be honest, unless you also run each of them separately. You get to claim you put the box under more stress, but if anything failed in the combined test you may have difficulty isolating the root cause.
Can you compare the snake test with PRBS?
How do they differ?
This is a good post.
I have one quick question: will Spirent send untagged or vlan1 traffic into the switch?
If Spirent sends untagged traffic, since port1 is in vlan1, so switch drops the traffic?
or if Spirent sends vlan1 tagged traffic, when this traffic comes out of port2 on to a physical loop to port3, port3 should drop the packet as it is configured with vlan3?
Appreciate some insight into this..
Thanks in advance.
You would send untagged frames to port1. If vlan1 ties port1 and port2 together, then that frame remains untagged as it egresses port 2. Under the hood, vlan1 is most likely tagged to the frame on ingress to port1 along with other meta-data, but this is internal-only and stripped from the frame before egress.
If you really wanted to, you could configure Q-in-Q on the switch and send tagged frames to the switch. However I’m not sure that’s going to provide much benefit for a throughput test like this, unless you suspect that throughput would drop when switching tagged frames.
Hi, how is it that different VLANs can communicate with one another? I thought they are not supposed to. If that is so, why would patch cables allow communication? Thank you.
You should note that an access port (trunk ports behave differently) is configured to belong to a VLAN. When a frame is received on a port, the switch looks up its config and decides that the frame belongs to that port’s VLAN.
It’s an internal configuration check… the switch isn’t reading a ‘tag’ which tells it the VLAN membership of the received frame.
The reverse is also true: assuming the VLAN allows a frame to be sent out a given access port, there is no tag attached.
The snake topology ‘hacks’ this behavior. So the frame could leave a given port belonging to VLAN_A but it is untagged, thus when it arrives on a port configured for VLAN_B then according to the switch config, that frame is ‘temporarily’ part of VLAN_B.
Hard to understand..yes. But if you understand this you ‘really’ understand VLANs.