Saturday 6 August 2011

Gigabit Ethernet efficiency with standard 1500 bytes MTU

In my experiments, I have found that using jumbo frames (9k MTU) eclipses many of the gains from other improvements in the network stack such as GRO, LRO and interrupt coalescing. Because the MTU is large, the per-packet overhead is very small, which makes other receive-side optimizations a little less effective.

Now I have started experimenting with the standard MTU size (1500 bytes) on 10 GbE. With a 9K MTU, I can easily reach the line speed of ~9850-9900 Mbps with message sizes of 4K or larger. But with a 1500 MTU, I cannot get past an odd ~9400 Mbps, even with a large message size of 1MB. It is not CPU bound: both the rx and tx side CPUs were less than 100% loaded. Upon further investigation and calculation I understood that ~9400 Mbps is the theoretical application data limit on 10 GbE with a 1500 MTU. Let's break it down point by point:

- 10 Gigabits per second or 10^10 bits/sec transmission speed refers to the raw bit transmission capacity of the link. This is layer 1 (L1) in the OSI model.
- L2 is Ethernet. The standard Ethernet frame structure is shown here: http://en.wikipedia.org/wiki/Ethernet_frame#Structure. As can be seen, for every 1500 (MTU) bytes of payload transferred, Ethernet requires transmitting an additional 38 bytes (= 7 bytes (preamble) + 1 byte (delimiter) + 12 bytes (src+dst MAC addresses) + 2 bytes (EtherType) + 4 bytes (CRC) + 12 bytes (interframe gap, YES this also takes up time on the wire)).
- L3, routing, aka the omnipresent IP stack. The IP header adds another 20 bytes to the protocol overhead.
- L4, transport, aka everyone's favorite TCP stack. By default Linux has the timestamp option enabled for TCP, which adds another 12 bytes to the standard 5-word (20-byte) TCP header. So in total the TCP header becomes 32 bytes.

So for every 1500 bytes transmitted on the wire, Ethernet transmits an additional 38 bytes. And within the 1500-byte payload, apart from user data, we have 52 (20+32) bytes of IP and TCP headers. So the net efficiency of the stack becomes

(1500 - 52) / (1500 + 38) = 0.9415 or 94.15%, which matches the ~9400 Mbps I get as the end-to-end application data rate. The remaining ~5.85% is the "protocol overhead". With jumbo frames we have the same calculation but with a 9k MTU, so

(9000 - 52) / ( 9000 + 38) = 0.9900 or 99%. Less than 1% protocol overhead.

In these calculations I have ignored the VLAN extension to Ethernet, which adds another (optional) 4 bytes to the Ethernet frame, and various other optional TCP/IP header fields. Any such additions would only increase the protocol overhead.
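
The arithmetic above can be sketched in a few lines of Python. This is only a minimal sketch: the function name `tcp_efficiency` and its `timestamps` and `vlan` parameters are made up for illustration, and the byte counts are the ones listed in this post (IP header without options, TCP header with or without the 12-byte timestamp option).

```python
# Per-frame Ethernet cost outside the MTU, as itemized in the post:
# preamble + delimiter + src/dst MAC + EtherType + CRC + interframe gap
ETH_OVERHEAD = 7 + 1 + 12 + 2 + 4 + 12  # 38 bytes
IP_HEADER = 20             # IP header without options
TCP_HEADER = 20            # standard 5-word TCP header
TCP_TIMESTAMP_OPTION = 12  # timestamp option, padded
VLAN_TAG = 4               # optional VLAN tag (ignored by default)


def tcp_efficiency(mtu, timestamps=True, vlan=False):
    """Fraction of the raw link rate left for application data (hypothetical helper)."""
    headers = IP_HEADER + TCP_HEADER + (TCP_TIMESTAMP_OPTION if timestamps else 0)
    on_wire = mtu + ETH_OVERHEAD + (VLAN_TAG if vlan else 0)
    return (mtu - headers) / on_wire


for mtu in (1500, 9000):
    eff = tcp_efficiency(mtu)
    print(f"MTU {mtu}: efficiency {eff:.4f}, about {eff * 10_000:.0f} Mbps on 10 GbE")
```

Calling `tcp_efficiency(1500, vlan=True)` shows how the optional VLAN tag pushes the number slightly lower, as noted above.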

3 comments:

  2. Nice analysis ! Informative.

    Another metric you could look at is packets per second (PPS).
    For example, with 64-byte packets you cannot get anywhere close to 10 Gbps of data; instead, PPS becomes the important metric to consider.

    For example, transmitting a 64-byte packet on 10G takes 64 + 20 (7-byte preamble, 1-byte delimiter, and the 12-byte inter-packet gap) = 84 bytes on the wire, so 10 Gbps / (84 x 8 bits) = 14.88 Mpps.

    If you are close to 14.88 Mpps on a 10G network, then you can say you are transferring at line rate and pretty much saturating the network!
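
A minimal Python sketch of the packet-rate arithmetic in the comment above. The helper name `max_pps` is hypothetical, and it assumes the 64-byte figure is the full minimum Ethernet frame (headers and CRC included), so only the preamble, delimiter and inter-packet gap are added on top.

```python
LINK_BPS = 10**10         # 10 GbE raw bit rate
MIN_FRAME = 64            # minimum Ethernet frame, including headers and CRC
WIRE_EXTRA = 7 + 1 + 12   # preamble + delimiter + inter-packet gap


def max_pps(frame_bytes, link_bps=LINK_BPS):
    """Upper bound on packets per second for a given frame size (hypothetical helper)."""
    bits_on_wire = (frame_bytes + WIRE_EXTRA) * 8
    return link_bps / bits_on_wire


print(f"{max_pps(MIN_FRAME) / 1e6:.2f} Mpps")  # 14.88 Mpps
```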
