The Myth of Network Speed
Speed is the most common network metric, yet it does not actually exist as a well-defined scalar. Speed is a statistic whose value and significance change depending on how it is observed. Network problems are most often characterized as speed problems, but finding the cause requires an understanding of the ways speed can be measured and what they mean.
IT people usually talk about speed as the signal rate of a physical link: "I have a 1 gigabit NIC" or "I have a 100 megabit uplink." This is speed in the purest sense: A series of events occurring at some rate over time. But in IP networks, data moves as packets, not bits. Packets have a number of properties that make speed difficult to measure:
- Packets move as whole units, with each packet's bits appearing in an instant. Network-level speed must be calculated by averaging these events over time.
- Not every bit is user data. Ethernet, IP, PPPoE, UDP, TCP, VPN, MPLS, and other layers take up space. That overhead is impossible to predict and can change over time. A good rule of thumb is that between 6% and 10% of the signal rate will be consumed by overhead.
- Not every bit of user data is useful. With packet loss and duplication, 100 megabits per second might be passing through a router while nothing is being delivered to the end user. Packet loss itself could be an entire article. For the purposes of speed, the data rate you observe at any point in a network path may be very different from how quickly user data is actually being received.
When looking at a switch, router, modem, or interface, you are only seeing bits and packets. The difference between the throughput at each node and the "goodput" of the entire path can be orders of magnitude.
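To make the overhead arithmetic concrete, here is a minimal Python sketch of the best-case goodput for full-size packets on standard Ethernet. It assumes IPv4 and TCP headers with no options and a 1500-byte MTU. The `extra_headers` parameter is a hypothetical allowance for additional layers such as PPPoE, VPN, or MPLS, modeled simply as bytes taken out of each packet's payload. Real traffic with smaller packets, retransmissions, or more layers will do worse.

```python
# Per-packet byte counts for standard Ethernet carrying IPv4 and TCP.
ETHERNET_WIRE_BYTES = 8 + 14 + 4 + 12  # preamble+SFD, header, FCS, inter-frame gap
IPV4_HEADER = 20  # assumes no IP options
TCP_HEADER = 20   # assumes no TCP options
MTU = 1500


def goodput_fraction(mtu=MTU, extra_headers=0):
    """Fraction of the signal rate left for user data in a full-size packet.

    extra_headers is an illustrative allowance, in bytes, for additional
    layers (PPPoE, VPN, MPLS, ...) that reduce the payload of each packet.
    """
    payload = mtu - IPV4_HEADER - TCP_HEADER - extra_headers
    on_wire = mtu + ETHERNET_WIRE_BYTES
    return payload / on_wire


def best_case_goodput(signal_bps, mtu=MTU, extra_headers=0):
    """Upper bound on user data rate, in bits per second, for a given signal rate."""
    return signal_bps * goodput_fraction(mtu, extra_headers)


if __name__ == "__main__":
    cases = [("plain TCP/IPv4", 0), ("with 60 bytes of extra headers", 60)]
    for label, extra in cases:
        frac = goodput_fraction(extra_headers=extra)
        print(f"{label}: {frac:.1%} payload, {1 - frac:.1%} overhead")
```

Under these assumptions the overhead starts at roughly 5% and grows as layers are added, which is in line with the rule of thumb above. This is only an upper bound on goodput: it says nothing about loss, duplication, or anything outside the network.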
End users view speed from the perspective of productivity: "I pushed go on my one gigabyte file and it took three minutes to finish." That involves a beginning, an end, and a set of data, none of which are as well defined as they may seem.
- Data paths do not begin or end with the network. Data almost always moves between storage devices, and these are often the slowest components in the path. As with networks, there is a big difference between the raw signal rate and the practical goodput.
- Caching and compression in devices and operating systems can give some bits a head start before the user hits "go."
- The end isn't really the end. Write caches in storage and operating systems may build up seconds or even minutes of data when there is enough RAM. Your transfer isn't really finished until all that data is flushed to storage. Some file transfer software takes this into account, but most does not.
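The effect of write caching on the apparent "end" of a transfer can be seen with a small timing experiment. The following Python sketch writes the same data twice, once stopping the clock when the write call returns and once after asking the operating system to flush to storage with `os.fsync`. The function names and the 64 MB test size are illustrative choices, and how thoroughly `fsync` flushes device-level caches varies by platform and hardware.

```python
import os
import tempfile
import time


def timed_write(path, data, flush_to_storage):
    """Write data to path and return the elapsed time in seconds.

    When flush_to_storage is True, the clock keeps running until the
    operating system reports that its write cache has been flushed.
    """
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(data)
        if flush_to_storage:
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # ask the OS to flush its cache to the device
    return time.perf_counter() - start


def megabits_per_second(num_bytes, seconds):
    """Convert a byte count and duration into megabits per second."""
    return num_bytes * 8 / seconds / 1_000_000


if __name__ == "__main__":
    payload = os.urandom(64 * 1024 * 1024)  # fresh random data for the test
    with tempfile.TemporaryDirectory() as tmp:
        for flush in (False, True):
            path = os.path.join(tmp, f"test_{flush}.bin")
            elapsed = timed_write(path, payload, flush)
            rate = megabits_per_second(len(payload), elapsed)
            print(f"fsync={flush}: {elapsed:.3f} s, {rate:.0f} Mbit/s")
```

For reference, `megabits_per_second(1_000_000_000, 180)` gives about 44 Mbit/s for the one gigabyte, three minute example above. Whether that number describes the network, the storage, or the cache depends on where the clock started and stopped.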
I am often in meetings where end users say the network is too slow at the same time that IT managers say it's saturated. Each has graphs and measurements to prove their point. But these contradictory measurements are actually a good thing; they reveal which components of the path require closer scrutiny.
- Isolate storage. Test with the type of storage that will be used in production, which may include NAS, SAN, HDDs, or other bottlenecks. Then test with a RAM disk, /dev/zero, or a fast SSD. A difference here tells you whether storage is part of the problem.
- Use fresh, random data for each test. This eliminates caching and compression as factors.
- Minimize write caching. Caches only improve performance if the data being transferred is smaller than the cache. Linux especially may freeze all I/O for tens of seconds while flushing its cache, so consider clamping vm.dirty_bytes to something like 125000000 to minimize those effects.
- Be aware of layers that pile on headers. They can push packet sizes high enough to cause fragmentation, which increases overhead, magnifies congestion, and can trigger firewall rules and operating system bugs. Minimize layers and use a packet sniffer to check for fragmentation. Make sure there is a good reason for any network link with an MTU other than 1500.
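The advice to use fresh, random data can be checked with a quick experiment. This Python sketch uses `zlib` as a stand-in for whatever compression a device or protocol along the path might apply, which is an assumption for illustration only. It compares how much a block of zeros shrinks against a block from `os.urandom`.

```python
import os
import zlib


def compressed_fraction(data):
    """Return the zlib-compressed size divided by the original size."""
    return len(zlib.compress(data)) / len(data)


def fresh_test_data(num_bytes):
    """Return new random bytes; call again for every test run."""
    return os.urandom(num_bytes)


if __name__ == "__main__":
    size = 1_000_000
    print(f"zeros:  {compressed_fraction(bytes(size)):.4f}")
    print(f"random: {compressed_fraction(fresh_test_data(size)):.4f}")
```

Zeros compress to a tiny fraction of their size, so a compressing link can appear far faster than it really is. Random bytes do not compress, and generating a new block for each run also keeps earlier runs from being served out of a cache.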
There are many ways to measure network speed, but the one that matters most is the speed at which work gets done. The best way to optimize network performance is to observe speed in as many ways and in as many places as possible, and let the differences be your guide.
Seth Noble, PhD, is the creator of the patented Multipurpose Transaction Protocol (MTP) technology and a top data transport expert. He is founder and president of Data Expedition, Inc., with a dual BS-MS degree from Caltech and a doctorate in computer science from the University of Oklahoma for his work developing MTP.