Part 2.2: Adapters' Performance, CPU load, and PI
The results are divided into two sections. The first contains diagrams of the adapters' average maximum speeds, the CPU load at that moment, and the performance index (PI). The second examines how adapter performance changes with the size of the transferred data packets at different MTUs (using NetPIPE under Linux).
During the tests we recorded the average value of each adapter's maximum throughput over a selected period. If the program allowed it, the CPU load for that period was measured as well. Taken separately, though, throughput and CPU load pull in opposite directions (who needs gigabit speed if the CPU is 100% loaded?), which is why we decided to combine the two characteristics in one formula.
The adapter performance formula seems evident: just divide the data transmission speed by the CPU load. In reality it turned out to be a bit more complicated. Take, for example, an adapter throughput of 100 Mbit/s (and that's for a gigabit adapter!) with a correspondingly low CPU load of, say, 5%. The division gives a very high index, but that value does not reflect reality. This made Alexey Kuznetsov and me introduce the notion of a relative index (RI) and multiply the result by the ratio of the real transmission speed to the peak speed.
Thus, the performance index (PI) can be calculated in the following way:
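PI = T/C * T/1000, where T is the measured throughput in Mbit/s, C is the CPU load in percent, and 1000 is the peak speed of a gigabit adapter in Mbit/s.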
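To see why the second factor matters, here is a minimal Python sketch of the formula. It assumes T is the throughput in Mbit/s and C is the CPU load in percent; the function name `performance_index` and the second set of sample numbers (900 Mbit/s at 45%) are made up for illustration and are not test results.

```python
def performance_index(throughput_mbps: float, cpu_load_pct: float,
                      peak_mbps: float = 1000.0) -> float:
    """PI = T/C * T/peak, with peak = 1000 Mbit/s for a gigabit adapter."""
    if cpu_load_pct <= 0:
        raise ValueError("CPU load must be a positive percentage")
    naive_index = throughput_mbps / cpu_load_pct   # T/C
    relative_speed = throughput_mbps / peak_mbps   # T/1000
    return naive_index * relative_speed


# The slow-but-idle case from the text: 100 Mbit/s at 5% CPU load.
slow = performance_index(100, 5)
# A hypothetical fast adapter: 900 Mbit/s at 45% CPU load.
fast = performance_index(900, 45)

print(f"slow: T/C = {100 / 5:.1f}, PI = {slow:.1f}")
print(f"fast: T/C = {900 / 45:.1f}, PI = {fast:.1f}")
```

Both cards have the same plain T/C ratio, yet the PI of the card that actually approaches gigabit speed comes out nine times higher.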
The diagrams differ a little from those in the 32-bit bus tests: they now include results for a 16 KB Jumbo frame size. Unfortunately, only the adapters from Intel supported this frame size, so they were the only participants in those tests.
NTTTCP results, Windows
NTTTCP test results come first. I must admit that I was disappointed by the NTTTCP test in this particular case. Most results were worse than those obtained in other tests, and all attempts to improve them failed. Next time we'll probably do without this test.
Adapters from Intel are almost always the leaders. The strangest thing is that the adapter on the CSA bus outscores the PCI adapter. Strictly speaking, the embedded adapter on the PCI bus does not show very high results; it's a pity we didn't manage to test full-fledged external solutions from this company. Neither chip loads the CPU very much, but compared with each other, the CSA solution loads the CPU a tad more heavily than the chip on the PCI bus.
The 3Com adapter demonstrated high results only with frame sizes of 6000 bytes and above, though with smaller sizes this card offers low CPU load.
The other adapters are roughly on a par, except for CNET (on RTL8169), Hardlink and TRENDnet TEG-PCITX2 (old model): all these cards load the CPU heavily. You can see it all for yourself on the summary diagram "Performance Index".
Iperf results, Windows
Next are the results of the adapters' performance in Iperf. Unfortunately, the program does not register CPU load, so there are no CPU load or PI diagrams.
The results are similar to those of NTTTCP, only a little better.
NetIQ Chariot results, Windows
This test registers both adapter throughput and CPU load, so we provide all three diagrams, including the Performance Index diagram.
Test results are almost identical to the previous ones. In this case, though, 3Com offers the lowest CPU load, which lets this adapter shoot ahead in the PI test even though its performance is a tad lower than that of the adapters from SysKonnect.
With Jumbo frames disabled, the best performance is shown by the chips from Intel. Note that with 16 KB frames these cards show even lower results than with 9 KB Jumbo frames. For some reason such giant Jumbo frames become ineffective.
Adapters from SysKonnect perform quite well: with frames of 6 KB and larger they score on a level with Intel, though they do not leave these chips behind.
NetIQ Chariot results, Linux
In our tests last year, the adapters in most cases showed faster speeds in Linux. This time the situation has not changed, and it would be interesting to know why.
The adapters from Intel failed unexpectedly. They were almost always slower than in Windows and steadily lag behind almost all the other cards.
The CNET adapter, on the contrary, improved its performance a little and considerably lowered its CPU load, which suggests that in Windows the problem lies with the drivers. Unfortunately, the current Linux driver version did not allow us to enable Jumbo frames, so we can see the RTL8169 chip's performance only with common frame sizes.
The majority of the other cards display steady, high performance. The exceptions are Hardlink and Zyxel, which lag slightly behind (and with frames over 6 KB Zyxel's performance drops disastrously), Intel and CNET, and the old dual-chip version of TRENDnet TEG-PCITX2.
Comparing performance with CPU load, the leaders are 3Com and, strange as it may seem, Intel. The latter offered very low CPU load, but its performance let the card down (though not enough to ruin its PI).
Peak performance results in NetPIPE, Linux
The NetPIPE utility generates traffic while steadily increasing the packet size. It thus helps us find the bottlenecks, the packet sizes at which an adapter's performance may drop. As a result, we can estimate the adapter's peak speed (usually reached on large packets) and how the speed changes between the minimum and maximum packet sizes.
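As a rough illustration of how such a sweep can be read, the Python sketch below takes a list of (packet size, throughput) pairs, picks out the peak and flags the sizes where speed falls noticeably compared with the previous, smaller size. The helper names, the 10% threshold and all the numbers are made up for the example; this is neither NetPIPE's own output format nor our measured data.

```python
from typing import List, Tuple

# (packet size in bytes, throughput in Mbit/s); hypothetical sample type
Sample = Tuple[int, float]


def peak(samples: List[Sample]) -> Sample:
    """Return the sample with the highest throughput."""
    return max(samples, key=lambda s: s[1])


def find_dips(samples: List[Sample], min_drop: float = 0.10) -> List[Sample]:
    """Return samples whose throughput is at least `min_drop` (as a fraction)
    below the throughput at the previous, smaller packet size."""
    ordered = sorted(samples)  # sort by packet size
    dips = []
    for (_, prev_tp), (size, tp) in zip(ordered, ordered[1:]):
        if prev_tp > 0 and (prev_tp - tp) / prev_tp >= min_drop:
            dips.append((size, tp))
    return dips


# Made-up sweep, for illustration only.
samples = [(64, 40.0), (512, 210.0), (1024, 380.0),
           (1536, 300.0), (4096, 610.0), (65536, 720.0)]

print("peak:", peak(samples))
print("dips:", find_dips(samples))
```

With these invented numbers the peak falls on the largest packet, and the only flagged dip is at 1536 bytes, where throughput drops from 380 to 300 Mbit/s.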
The situation is almost identical to Chariot: the speed of most adapters stops at a certain threshold, which rises slightly with increasing MTU size. I wonder what stops them from growing higher.
Evgeniy Zaitsev (email@example.com)
27 May, 2004