How to receive a million packets per second
Last week, during a casual conversation, I overheard a colleague say: "The Linux network stack is slow! You can't expect it to do more than 50 thousand packets per second per core!"
That got me thinking. While I agree that 50kpps per core is probably the limit for any practical application, what is the Linux networking stack capable of? Let's rephrase that to make it more fun:
On Linux, how hard is it to write a program that receives 1 million UDP packets per second?
Hopefully, answering this question will be a good lesson about the design of a modern networking stack.
CC BY-SA 2.0 image by Bob McCaffrey
First, let us assume:
Measuring packets per second (pps) is much more interesting than measuring bytes per second (Bps). You can achieve high Bps by better pipelining and sending longer packets. Improving pps is much harder.
Since we're interested in pps, our experiments will use short UDP messages. To be precise: 32 bytes of UDP payload. That means 74 bytes on the Ethernet layer.
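To see where the 74 comes from, here is a quick back-of-the-envelope sketch. It assumes plain IPv4 with no IP options and no VLAN tag, and it counts the frame the way a packet capture shows it, without preamble or frame check sequence. The function names are made up for this illustration.

```python
# Header sizes in bytes. Assumptions: IPv4 without options, untagged Ethernet.
UDP_HEADER = 8
IPV4_HEADER = 20
ETHERNET_HEADER = 14  # destination MAC (6) + source MAC (6) + EtherType (2)

# Extra bytes each frame occupies on the wire but that captures usually omit.
FCS = 4               # frame check sequence (CRC32)
PREAMBLE_SFD = 8      # preamble plus start-of-frame delimiter
INTERFRAME_GAP = 12   # mandatory idle time, expressed in byte times


def ethernet_frame_size(udp_payload: int) -> int:
    """Frame size at the Ethernet layer, excluding preamble and FCS."""
    return udp_payload + UDP_HEADER + IPV4_HEADER + ETHERNET_HEADER


def wire_bits_per_second(pps: int, udp_payload: int) -> int:
    """Link bandwidth consumed at a given packet rate, with all wire overhead."""
    per_packet = (ethernet_frame_size(udp_payload)
                  + FCS + PREAMBLE_SFD + INTERFRAME_GAP)
    return pps * per_packet * 8


if __name__ == "__main__":
    print(ethernet_frame_size(32))                   # 74
    print(wire_bits_per_second(1_000_000, 32))       # 784000000
```

Under these assumptions, a million such packets per second take up roughly 784 Mbit/s of link capacity, so with packets this small the challenge is the packet rate rather than the byte rate.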
For the experiments we will use two physical servers: "receiver" and "sender".
Both have two six-core 2GHz Xeon processors. With hyperthreading (HT) enabled