Tuning TCP performance

Overview

This page looks at the impact of message size, TCP window size and the Nagle algorithm on network utilisation.

The tests were performed using iperf between Linux systems on a 1 Gbit network.

Results

TCP window size.

In this test, data is sent from one Linux machine to another.

TCP window size (KB)   Bandwidth (Mb/s)   Utilisation relative to peak
4096                   942                100%
1024                   942                100%
256                    941                100%
64                     928                99%
32                     935                99%
16                     772                82%
8                      481                51%
4                      346                37%
2                      344                37%
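
The window size in these tests is generally understood to correspond to the socket buffer size an application can request. As a minimal sketch, and not the commands used for the measurements above, the following Python shows how an application might ask for 32 KB buffers. The function name `open_socket_with_buffers` is hypothetical. The operating system may round or adjust the value it applies (Linux, for example, reports a larger figure than requested), so the sketch reads the result back rather than assuming it.

```python
import socket


def open_socket_with_buffers(buffer_bytes):
    """Create a TCP socket and request send/receive buffers of buffer_bytes.

    Hypothetical helper for illustration; the kernel decides the final size.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Request the buffer sizes before connecting so they can influence
    # the window negotiated for the connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    return sock


sock = open_socket_with_buffers(32 * 1024)
# Read back what the operating system actually granted.
print("send buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
print("recv buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
sock.close()
```

Reading the values back is the useful habit here: system-wide limits can cap the request, so the granted size is the one to compare against the table.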

Message size.

In each test, the TCP window size was 256 KB.

Message size   Bandwidth (Mb/s), Nagle on   Bandwidth (Mb/s), Nagle off
4096 KB        933                          939
1024 KB        917                          939
256 KB         941                          941
64 KB          942                          941
32 KB          942                          941
16 KB          942                          941
8 KB           942                          941
4 KB           942                          941
2 KB           941                          857
1 KB           941                          907
512 bytes      816                          462
256 bytes      530                          256
128 bytes      349                          124
64 bytes       209                          62
16 bytes       71                           25
4 bytes        17.5                         6.5
1 byte         4                            1
  • On this system, 1-byte messages were slower than 4-byte messages on a packets-per-second basis.
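
The packets-per-second observation can be checked with a little arithmetic on the last two rows of the table. This sketch assumes 1 Mb/s means 10^6 bits per second and counts application-level messages; with Nagle on, several messages may share one packet on the wire, so the figures are messages per second rather than a measured packet count. The function name is hypothetical.

```python
def messages_per_second(bandwidth_mbps, message_bytes):
    """Convert a bandwidth in Mb/s into messages per second.

    Assumes 1 Mb = 1,000,000 bits and 8 bits per byte.
    """
    return bandwidth_mbps * 1_000_000 / (8 * message_bytes)


# (message size in bytes, Mb/s with Nagle on, Mb/s with Nagle off),
# taken from the last two rows of the table above.
rows = [(1, 4, 1), (4, 17.5, 6.5)]

for size, nagle_on, nagle_off in rows:
    print(
        size, "byte(s):",
        messages_per_second(nagle_on, size), "msg/s with Nagle on,",
        messages_per_second(nagle_off, size), "msg/s with Nagle off",
    )
```

Under these assumptions the 4-byte case moves more messages per second than the 1-byte case with Nagle both on and off, which matches the note above.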

What is not considered in this test.

  • Latency. Latency impacts performance, and a larger window size may help with high-latency connections, e.g. over the internet.
  • The behaviour of Java performing the same tests. Java can demonstrate a higher overhead on a per system call basis than a C application. Java is more likely to need larger message sizes to achieve high network utilisation.
  • The behaviour of a real application. Real applications need to perform work to create a message and will perform work on receiving a message. This reduces the peak utilisation which can be achieved, but also reduces the sensitivity of the application to these tuning parameters.
  • Different hardware choices are likely to affect the optimal tuning point for the parameters considered. If concerned about this, I would suggest making the buffer two to four times what this test suggests is optimal.
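
To illustrate why latency matters for the window size, a common rule of thumb is the bandwidth-delay product: the amount of data that must be in flight to keep a link busy is roughly the bandwidth multiplied by the round-trip time. The sketch below applies it to a 1 Gbit link. The round-trip times are hypothetical examples, not measurements from these tests, and the function name is an assumption for illustration.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bytes that must be in flight to fill a link at the given RTT."""
    # bits/s * ms / 1000 gives bits in flight; divide by 8 for bytes.
    return bandwidth_bits_per_s * rtt_ms / 8000


GIGABIT = 1_000_000_000  # 1 Gbit/s, as on the test network

# Hypothetical round-trip times in milliseconds: a longer LAN path,
# a regional link and a long-distance internet link.
for rtt_ms in (1, 10, 50):
    bdp = bandwidth_delay_product_bytes(GIGABIT, rtt_ms)
    print(f"RTT {rtt_ms} ms -> about {bdp / 1024:.0f} KB in flight")
```

A window smaller than this product caps throughput at roughly the window size divided by the round-trip time, which is why a window that is ample on a LAN can become the bottleneck on a long-distance connection.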

What is a disadvantage of a large TCP window size?

  • Each socket connection needs two buffers, one for sending and one for receiving. This means each socket will use more than double the buffer size. If you have only a few connections, this is unlikely to be important, but 100 connections with a window of 512 KB will use 100 MB in buffers, and 1,000 connections will use 1 GB.
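
The memory figures above follow from simple multiplication. This sketch counts only the two buffers per connection and ignores any additional per-socket overhead; the function name is hypothetical.

```python
def buffer_memory_bytes(connections, window_bytes):
    """Memory used by send plus receive buffers across all connections."""
    return connections * 2 * window_bytes


WINDOW = 512 * 1024  # 512 KB window, as in the example above
MB = 1024 * 1024

for connections in (100, 1_000):
    total = buffer_memory_bytes(connections, WINDOW)
    print(connections, "connections ->", total // MB, "MB of buffers")
```

Dropping the window to 32 KB in the same calculation shows how much memory a server with many connections could save.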

Conclusion.

  • If your server has a default buffer size/TCP window of 32 KB or more, this is likely to be fine. If you have a large number of connections, you may want to bring the buffer sizes down to 32 KB.
  • If you leave the Nagle algorithm on, a block size of 1 KB or more is fine. For writes smaller than this, consider using buffering.
  • If you turn Nagle off to reduce latency, batching/buffering in your application up to 1 KB - 4 KB could provide a performance improvement.
  • If your typical message size is 4 KB or more, you may get better performance by turning off the Nagle algorithm. If your application is very latency sensitive, you may consider turning off Nagle in all cases.
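
As a rough sketch of the last three points, the Python below turns Nagle off with the standard `TCP_NODELAY` socket option and batches small writes in the application before sending. The names `make_low_latency_socket` and `BatchingWriter` are hypothetical, and the 4 KB default batch size is simply the upper end of the range suggested above, not a tuned value.

```python
import socket


def make_low_latency_socket():
    """Create a TCP socket with the Nagle algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 switches Nagle off for this socket.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


class BatchingWriter:
    """Collect small messages and send them in larger blocks.

    Works with any object that provides sendall(bytes).
    """

    def __init__(self, sock, batch_bytes=4096):
        self._sock = sock
        self._batch_bytes = batch_bytes
        self._pending = bytearray()

    def write(self, message):
        # Queue the message; send once enough data has accumulated.
        self._pending += message
        if len(self._pending) >= self._batch_bytes:
            self.flush()

    def flush(self):
        # Send whatever is queued, e.g. at the end of a burst of messages.
        if self._pending:
            self._sock.sendall(bytes(self._pending))
            self._pending.clear()
```

A caller would wrap a connected socket, call `write` for each message and call `flush` when a burst ends, so that a latency-sensitive message is never left waiting in the application buffer.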

Future work.

  • Test the behaviour of a Java application under similar conditions with: no serialization, binary data, Java serialization, and XML serialization.

Hardware used.

Two HP DL385 G2, AMD 8218 2.6 GHz, 8 GB with 1 Gb Ethernet.
Both systems were running RedHat 5 RT, 2.6.24.7-65.el5rt.
The server and network hardware was about three years old at the time of writing (Dec 2008), i.e. not bleeding edge.
