08/30/2017, 7:00 AM

How Accurate Is Your Protocol Analyzer?

When it comes to packet capture, a hardware analyzer will produce different results than a software-based tool.

Capturing packets for network analysis seems pretty straightforward: Install your favorite packet capture tool, hit start, hit stop, and save your trace. What could possibly go wrong?

Unfortunately, not all protocol analyzers are created equal. For the purposes of this article, I will refer to two broad categories of network analyzers: dedicated hardware-based tools and general software-based tools.

Hardware-based protocol analyzers are straightforward since they are specifically built to capture packets. General software-based tools are exactly that: software you can download and install on various systems. Both types of analyzers have their place.

When I need to perform network latency analysis or troubleshoot a performance-related issue that requires millisecond or better accuracy, I reach for a hardware-based analyzer. Many times, I use a hardware-based analyzer to capture packets and then use Wireshark to analyze the data. Wireshark is user-friendly, and I can run it on my laptop instead of lugging around a hardware analyzer. If a trace file is too large, I use Wireshark's editcap utility to break it into smaller files.
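Splitting a big capture by packet count is easy to script. Here is a minimal sketch that shells out to editcap; the file names and chunk size are placeholders, and it assumes editcap is on your PATH:

```python
# Minimal sketch: split a large trace into smaller files with editcap.
# File names and packets-per-file are placeholders; assumes editcap is on the PATH.
import subprocess

def split_capture(infile: str, outfile: str, packets_per_file: int = 100_000) -> None:
    # editcap -c <n> writes consecutive output files of at most <n> packets each
    subprocess.run(
        ["editcap", "-c", str(packets_per_file), infile, outfile],
        check=True,
    )

if __name__ == "__main__":
    split_capture("big_trace.pcapng", "big_trace_split.pcapng")
```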

When I need to capture data and perform protocol analysis or application profiling, I'm comfortable using software analyzers on my laptop. That's because all I need are the packets and addresses to determine application flow or dependencies; I don't need the timings.

Software analyzer limitations

Since a software-based analyzer isn't designed for a specific operating system or computer, be aware that it will have some limitations. For example, if you install Wireshark on a Windows-based computer, it has to contend with other Windows services while capturing packets.

In addition, configuration issues may arise due to interference from endpoint security, firewalls and other CPU-intensive applications. Some people I have spoken to think that running the software analyzer on Linux or macOS is the answer. Unfortunately, this may not help, since you need to understand quite a bit about your computer's hardware, configuration and operating system behavior to get an accurate picture of capture performance.

Then there are wildcard issues, like using USB Ethernet adapters without knowing which version they are, or capturing remotely from an interface (e.g., with rpcapd). Both scenarios may result in lost packets or inaccurate time stamps. Dropped packets are the best-case scenario, because you might notice them and then figure out why it's happening or try a different tool. Inaccurate time stamps are the worst, because you will probably be unaware when they occur. I've seen many analysts assume that if they capture all the packets, then the delta times must be accurate.

Generally speaking, in a Windows environment, Wireshark uses WinPcap software and Microsoft’s NDIS to capture packets. According to the WinPcap website, “WinPcap consists of a driver that extends the operating system to provide low-level network access, and a library that is used to easily access the low-level network layers.”
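To make the software capture path concrete, here is a minimal sketch (using the Python scapy library rather than Wireshark itself, with a placeholder interface name) that shows where a software tool gets its timestamps: they are assigned by the capture stack on the host, after the NIC driver and operating system have already handled the frame, so they inherit whatever scheduling jitter the host is experiencing.

```python
# Sketch of a software capture: timestamps come from the host's capture stack,
# not from dedicated capture hardware. Interface name and count are placeholders;
# requires scapy and capture privileges.
from scapy.all import sniff

def print_capture_times(iface: str = "eth0", count: int = 10) -> None:
    packets = sniff(iface=iface, count=count)   # blocks until `count` frames arrive
    for pkt in packets:
        # pkt.time is the host-assigned epoch timestamp for this frame
        print(f"{float(pkt.time):.6f}  {pkt.summary()}")

if __name__ == "__main__":
    print_capture_times()
```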

Lab setup

To demonstrate the different results you can get with hardware vs. software analyzers, I set up a lab and ran a couple of tests. I used NetScout OptiView XG units because I happened to have access to two of them while writing this article. My first test was a control: I generated packets with one OptiView XG connected via a crossover cable to another OptiView XG that captured them. I deliberately did not send a huge amount of data, to show that it doesn't take much to see a difference: only 100 packets per second for 10 seconds with a packet size of 1,024 bytes.
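For readers without a hardware generator, a similar stream can be approximated in software (with the caveat that a software sender jitters too, which is exactly the point of this article). A rough scapy sketch, with placeholder interface, destination address and port:

```python
# Rough software approximation of the test stream: 100 packets per second
# for 10 seconds, 1,024-byte frames. Interface, address, port and payload
# padding are placeholders; a software sender like this also jitters.
from scapy.all import Ether, IP, UDP, Raw, sendp

FRAME_SIZE = 1024
HEADERS = 14 + 20 + 8                        # Ethernet + IPv4 + UDP headers
payload = b"\x00" * (FRAME_SIZE - HEADERS)   # pad the frame to 1,024 bytes

frame = Ether() / IP(dst="192.0.2.10") / UDP(dport=5001) / Raw(load=payload)

# 0.01 s between frames = 100 pps; 1,000 frames = 10 seconds of traffic.
sendp(frame, inter=0.01, count=1000, iface="eth0", verbose=False)
```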

 

In the second test, I had the OptiView XG generate the same stream, but captured the packets with Wireshark on my laptop: an Alienware gaming laptop with a Killer e2200 Gigabit Ethernet controller, an i7 processor, 16GB of RAM, and minimal software/services installed.

 

The methodology I used was to take the captured packets, filter out just the test traffic stream, export a CSV file, and graph the delta time between packets.
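Here is a sketch of that methodology in Python, reusing the placeholder UDP port from the generator sketch above to isolate the stream (the file names are placeholders as well):

```python
# Sketch of the methodology: read a saved capture, keep only the test stream,
# and write per-packet delta times to a CSV for graphing. Requires scapy.
import csv
from scapy.all import rdpcap, UDP

packets = rdpcap("laptop_capture.pcapng")
stream = [p for p in packets if UDP in p and p[UDP].dport == 5001]

with open("delta_times.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["packet", "delta_ms"])
    for i in range(1, len(stream)):
        delta_ms = (float(stream[i].time) - float(stream[i - 1].time)) * 1000
        writer.writerow([i, f"{delta_ms:.3f}"])
```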

The first chart shows the XG back-to-back test results: a very consistent delta time of 10 milliseconds, with no dips or spikes.

 

The second chart shows the delta times from the laptop test, which are all over the place. It is important to remember that the only application running was Wireshark and the computer wasn't doing anything other than capturing packets.
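If you would rather put numbers on the spread than eyeball the chart, the delta-time CSV from the methodology sketch above can be summarized in a few lines:

```python
# Summarize the spread of the delta times exported above (file name is a placeholder).
import csv
import statistics

with open("delta_times.csv") as f:
    deltas = [float(row["delta_ms"]) for row in csv.DictReader(f)]

print(f"mean:  {statistics.mean(deltas):.3f} ms")
print(f"stdev: {statistics.stdev(deltas):.3f} ms")
print(f"min:   {min(deltas):.3f} ms   max: {max(deltas):.3f} ms")
```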


Comments

Nice work Tony!

Great way to show the timing difference in capturing with a laptop, even at lower speeds. Thanks for the post.

Re: Nice work Tony!

Thanks, Chris.

Re: Nice work Tony!

I don't see how this is "at lower speeds". The variance is less than 2% (unless I'm missing something obvious here), and something that would be interesting would be to average the response time to see if it was consistent, slower, or faster than the hardware system. Perhaps that was the point?

Re: Nice work Tony!

Maybe "light loads" instead "lower speeds" would explain it better.
i was referring to the "100 packets per second for 10 seconds with a packet size of 1,024 bytes" that already caused the variance. .

Thanks for the comment.

What variance?

Hi, again I may be missing something, but is 2% really a variance, especially when the average of all the plotted points seems to show that the average is consistent? The chart shows 10ms vs. 9.8ms to 10.2ms, so one could even argue that there are plot points showing the software-based solution performing better.

Re: What variance?

I was referring to the difference between the OptiView and laptop graphs. The OptiView was a flat, consistent 10ms line, whereas the laptop's was not flat. Hope that helps.

Re: What variance?

Thanks - but now I must ask - who cares? In some cases the laptop was 2% faster, in some cases a little less than 2% slower. What is the net impact? I think something more useful would be "the general purpose laptop falls apart when capturing packets at rate x, whereas the blah blah does fine". I don't understand the relevance of this chart. 2% is not substantial and the argument is invalid when that 2% is 2% worse OR 2% better depending on the timeslice.

Further, it says *nothing* about the "accuracy" of the protocol analyzer, which is what your article appears to be targeting.

Re: What variance?

Fair question. If all you want is the packets, I agree: who cares? But when you are concerned about accuracy, then I care. Cheers

2% is huge when you consider it was only 100 packets/sec for 10 seconds. I've already written other articles demonstrating when the analyzer drops packets, so I thought this was new.

When the OptiView has no variance and the laptop does, I argue it does demonstrate the accuracy issue.

I guess we can agree to disagree on this one.

Regards

Re: What variance?

Hi Tony,

Variance doesn't equal inaccuracy. Inaccuracy is incorrect reporting or incorrect data or misrepresented data.

If you've already written articles on how software solutions fall apart vs hardware solutions, forgive me for not knowing - I literally stumbled upon your article in isolation. Being in networking I was naturally curious so I read it. And after reading the article in isolation, had absolutely no idea what your point was regarding A) accuracy of the analyzer or B) why using a software solution was bad.

2% variance between the two under low load (did you state that this was a light load in the article?) is nothing, especially given that we all understand how general purpose operating systems work AND that 2% variance could be better or worse than baseline AND that variance does not mean the software system would be dropping packets on the floor AND variance does not mean the software system would misrepresent packets or their data. Variance does not equal inaccuracy. Dropped packets or misrepresented packets equals inaccuracy.

This isn't an agree-to-disagree matter; these are facts. It would have been more helpful had you referenced your previous work showing how software systems fall apart compared to hardware systems, or shown a graph that would have actually led us to the conclusion that "this is a problem". A 2% variance is not a problem on a general-purpose OS, and coming to the conclusion that there is an "accuracy problem" from a chart showing 2% variance from baseline (which is quite good, actually), highlighting an operating system behaving exactly as one would hope it would, does absolutely nothing.