TCP WINDOW SIZE (SCALING, TAKING ADVANTAGE FOR HIGH PERFORMANCE, TELNET AND SSH UNDER LARGE TCP WINDOW SIZES)






THE TCP WINDOW SCALE OPTION

The TCP window scale option is an option to increase the receive window size allowed in the Transmission Control Protocol above its former maximum value of 65,535 bytes. This TCP option, along with several others, is defined in IETF RFC 1323, which deals with long fat networks.

TCP windows

The throughput of a communication is limited by two windows: the congestion window and the receive window. The former tries not to exceed the capacity of the network (congestion control) and the latter tries not to exceed the capacity of the receiver to process data (flow control). The receiver may be overwhelmed by data if, for example, it is very busy (such as a Web server). Each TCP segment contains the current value of the receive window. If, for example, a sender receives an ACK which acknowledges byte 4000 and specifies a receive window of 10,000 bytes, the sender will not send packets past byte 14,000, even if the congestion window allows it.

Theory

The TCP window scale option is needed for efficient transfer of data when the bandwidth-delay product is greater than 64 KB. For instance, if a T1 transmission line of 1.5 Mbit/s is used over a satellite link with a 513 millisecond round-trip time (RTT), the bandwidth-delay product is 1,572,864 * 0.513 = 806,879 bits, or about 100,860 bytes. Using a maximum buffer size of 64 KB only allows the pipe to be filled to 65,535 / 100,860 = 65% of the theoretical maximum speed of 1.5 Mbit/s, i.e. about 1.02 Mbit/s.

By using the window scale option, the receive window size may be increased up to a maximum value of 1,073,725,440 bytes. This is done by specifying a one-byte shift count in the header options field: the true receive window size is the advertised window left-shifted by the shift count. A maximum value of 14 may be used for the shift count. This would allow a single TCP connection to transfer data over the example satellite link at 1.5 Mbit/s, utilizing all of the available bandwidth.

Possible side effects

Because some routers and firewalls do not properly implement TCP window scaling, it can cause a user's Internet connection to malfunction intermittently for a few minutes, then appear to start working again for no reason.
There is also an issue if a firewall doesn't support the TCP extensions. [1]

SCALING

Configuration of operating systems

TCP window scaling has been implemented in Windows since Windows 2000.[2][3] It is enabled by default in Windows Vista / Server 2008 and newer, but can be turned off manually if required.[4] Windows Vista and Windows 7 have a fixed default TCP receive buffer of 64 KB, scaling up to 16 MB through "autotuning", which limits manual TCP tuning over long fat networks.[5]

Linux kernels (from 2.6.8, August 2004) have TCP window scaling enabled by default. The configuration parameters are found in the /proc filesystem: see the pseudo-file /proc/sys/net/ipv4/tcp_window_scaling and its companions /proc/sys/net/ipv4/tcp_rmem and /proc/sys/net/ipv4/tcp_wmem (more information: man tcp, section sysctl).[6] Scaling can be turned off by issuing the command sysctl -w net.ipv4.tcp_window_scaling=0 as root. To keep the change after a restart, include the line net.ipv4.tcp_window_scaling=0 in /etc/sysctl.conf (or /etc/sysctl.d/99-sysctl.conf as of systemd 207).

The default setting for FreeBSD, OpenBSD, NetBSD and Mac OS X is to have window scaling (and other features related to RFC 1323) enabled. To verify its status, a user can check the value of the net.inet.tcp.rfc1323 variable via the sysctl command: sysctl net.inet.tcp.rfc1323. A value of 1 (output "net.inet.tcp.rfc1323=1") means scaling is enabled; 0 means disabled. If enabled, it can be turned off by issuing the command sudo sysctl -w net.inet.tcp.rfc1323=0. This setting is lost across a system restart; to ensure that it is set at boot time, add the line net.inet.tcp.rfc1323=0 to /etc/sysctl.conf.

Squeeze Your Gigabit NIC for Top Performance

Gigabit network cards are becoming more and more common, but getting maximum speed depends on the right mix of hardware, software, and finesse. Here's how to squeeze top performance out of your gigabit gear using Linux, FreeBSD, and Windows.
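The sysctl checks above are easy to interpret mechanically. The helper below is hypothetical (not part of any OS tooling) and simply parses a sysctl output line of the form "name=value" to report whether scaling is on:

```python
# Hypothetical helper: interpret a sysctl output line such as
# "net.inet.tcp.rfc1323=1" and report whether window scaling is enabled.
def scaling_enabled(sysctl_line: str) -> bool:
    key, _, value = sysctl_line.strip().partition("=")
    return value.strip() == "1"

print(scaling_enabled("net.inet.tcp.rfc1323=1"))           # → True
print(scaling_enabled("net.ipv4.tcp_window_scaling = 0"))  # → False
```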
Many new workstations and servers come with integrated gigabit network cards, but quite a few people soon discover that they can't transfer data much faster than they did with 100 Mb/s network cards. Multiple factors can affect your ability to transfer at higher speeds, and most of them revolve around operating system settings. In this article we will discuss the necessary steps to make your new gigabit-enabled server obtain close to gigabit speeds in Linux, FreeBSD, and Windows.

Hardware Considerations

First and foremost, we must realize that there are hardware limitations to consider. Just because someone throws a gigabit network card in a server doesn't mean the hardware can keep up. Network cards are normally connected to the PCI bus via a free PCI slot. In older workstation and non-server-class motherboards the PCI slots are normally 32-bit, 33 MHz, which means they can transfer at speeds of 133 MB/s. Since the bus is shared between many parts of the computer, it is realistically limited to around 80 MB/s in the best case.

Gigabit network cards provide speeds of 1000 Mb/s, or 125 MB/s. If the PCI bus is only capable of 80 MB/s, this is a major limiting factor for gigabit network cards. The math works out to 640 Mb/s, which is really quite a bit faster than most gigabit network card installations, but remember this is probably the best-case scenario. If there are other hungry, data-loving PCI cards in the server, you'll likely see much less throughput. The only solution for overcoming this bottleneck is to purchase a motherboard with a 66 MHz PCI slot, which can do 266 MB/s. Also, the newer 64-bit PCI slots are capable of 532 MB/s on a 66 MHz bus, and these are beginning to come standard on server-class motherboards.

Assuming we're using decent hardware that can keep up with the data rates necessary for gigabit, there is now another obstacle — the operating system.
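The bus-bandwidth arithmetic above is simple enough to sketch; the function name is illustrative, and the numbers reproduce the figures quoted in the text:

```python
# PCI bus bandwidth: bytes per transfer (bus width / 8) times the clock
# rate in MHz gives MB/s. Function name is illustrative.
def pci_throughput_mb_s(bus_bits: int, clock_mhz: float) -> float:
    return bus_bits / 8 * clock_mhz

base = pci_throughput_mb_s(32, 33.33)
print(round(base))        # 32-bit, 33 MHz PCI → 133 MB/s
print(round(base) * 2)    # 32-bit, 66 MHz → 266 MB/s
print(round(base) * 4)    # 64-bit, 66 MHz → 532 MB/s
print(80 * 8)             # realistic shared-bus 80 MB/s → 640 Mb/s
```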
For testing, we used two identical servers: Intel server motherboards, Pentium 4 3.0 GHz, 1 GB RAM, and integrated 10/100/1000 Intel network cards. One was running Gentoo Linux with a 2.6 SMP kernel, and the other FreeBSD 5.3 with an SMP kernel, to take advantage of the Pentium 4's HyperThreading capabilities. We were lucky to have a gigabit-capable switch, but the same results could be accomplished by connecting both servers directly to each other.

Software Considerations



For testing speeds between two servers, we don't want to use FTP or anything else that fetches data from disk. Memory-to-memory transfers are a much better test, and many tools exist to do this. In actuality, most people will see even worse performance out of the box. However, with a few minor setting changes, we quickly realized major speed improvements — more than a threefold improvement over the initial test.

Many people recommend setting the MTU of your network interface larger. This basically means telling the network card to send a larger Ethernet frame. While this may be useful when connecting two hosts directly together, it becomes less useful when connecting through a switch that doesn't support larger MTUs. At any rate, it isn't necessary: 900 Mb/s can be attained at the normal 1500-byte MTU setting.

For attaining maximum throughput, the most important options involve TCP window sizes. The TCP window controls the flow of data and is negotiated during the start of a TCP connection. Using too small a size will result in slowness, since TCP can only use the smaller of the two end systems' capabilities. It is quite a bit more complex than this, but here's the information you really need to know.

Configuring Linux and FreeBSD

For both Linux and FreeBSD we're using the sysctl utility. For all of the following options, entering the command 'sysctl variable=number' should do the trick; to view a current setting, use 'sysctl variable'.

Maximum window size:
FreeBSD: kern.ipc.maxsockbuf=262144
Linux: net.core.wmem_max=8388608

Default window size:
FreeBSD, sending and receiving:
net.inet.tcp.sendspace=65536
net.inet.tcp.recvspace=65536
Linux, sending and receiving:
net.core.wmem_default=65536
net.core.rmem_default=65536

RFC 1323: This enables the useful window scaling options defined in RFC 1323, which allow the windows to dynamically grow larger than the defaults specified above.
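Applications can also request larger per-socket buffers themselves, which bound the usable TCP window for that connection. A minimal sketch using the standard socket API (the OS may clamp or round the requested values; Linux, for instance, reports back double the requested size):

```python
import socket

# Request 64 KB send/receive buffers on a TCP socket. The kernel clamps
# these to the system-wide maximums (e.g. net.core.wmem_max on Linux).
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 65536)
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 65536)
rcv = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(rcv >= 65536)  # the kernel granted at least what we asked for
s.close()
```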
FreeBSD: net.inet.tcp.rfc1323=1
Linux: net.ipv4.tcp_window_scaling=1

Buffers: When sending large amounts of data, we can run the operating system out of buffers, so this option should be set before attempting to use the settings above. To increase the amount of "mbufs" available:
FreeBSD: kern.ipc.nmbclusters=32768
Linux: net.ipv4.tcp_mem="98304 131072 196608"

These quick changes will skyrocket TCP performance. Afterwards we were able to run ttcp and attain around 895 Mb/s every time — quite an impressive data rate. There are other options available for adjusting the UDP datagram sizes as well, but we're mainly focusing on TCP here.

Windows XP/2000 Server/Server 2003

The magical location for TCP settings in the registry editor is:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
We need to add a registry DWORD named TcpWindowSize and enter a sufficiently large size; 131400 (make sure you click on 'decimal') should be enough. Tcp1323Opts should be set to 3, which enables both RFC 1323 scaling and timestamps. And, similarly to Unix, we want to increase the TCP buffer sizes:
ForwardBufferMemory 80000
NumForwardPackets 60000

One last important note for Windows XP users: if you've installed Service Pack 2, there is another likely culprit for poor network performance. As explained in knowledge base article 842264 (http://support.microsoft.com/?kbid=842264), Microsoft says that disabling Internet Connection Sharing after an SP2 install should fix performance issues.

The above tweaks should enable your sufficiently fast server to attain much faster data rates over TCP. If your specific application makes significant use of UDP, then it will be worth looking into similar options relating to UDP datagram sizes. Remember, we obtained close to 900 Mb/s with a very fast Pentium 4 machine, server-class motherboard, and quality Intel network card.
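A memory-to-memory probe like ttcp can be sketched in a few lines over the loopback interface. This is illustrative only (it is not ttcp's actual protocol, and loopback throughput says nothing about your NIC), but it shows the idea: one thread drains a socket while the sender pushes a fixed amount of in-memory data:

```python
import socket
import threading
import time

# A minimal memory-to-memory throughput probe over loopback, in the
# spirit of ttcp. Sizes and structure are illustrative.
def run_probe(total_bytes=8 * 1024 * 1024, chunk=65536):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))          # any free port
    srv.listen(1)
    port = srv.getsockname()[1]
    received = []

    def reader():
        conn, _ = srv.accept()
        n = 0
        while True:
            data = conn.recv(chunk)
            if not data:
                break
            n += len(data)
        conn.close()
        received.append(n)

    t = threading.Thread(target=reader)
    t.start()
    cli = socket.create_connection(("127.0.0.1", port))
    buf = b"\0" * chunk                 # data straight from memory, no disk
    start = time.time()
    for _ in range(total_bytes // chunk):
        cli.sendall(buf)
    cli.close()
    t.join()
    srv.close()
    return received[0], time.time() - start

n_bytes, secs = run_probe()
print(n_bytes)  # all 8 MiB arrived, memory to memory
```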
Results may vary wildly, but adjusting the above settings is a necessary step toward realizing your server's capabilities.

USING A LARGE TCP WINDOW OVER TELNET AND INTERACTIVE SSH CONNECTIONS

On the subject of TCP flow control, we can't neglect to mention the Nagle algorithm. What would happen if you had a large TCP window over a telnet connection? You'd type a command, then wait and wait and wait for a response. This is a major problem for real-time applications. Furthermore, telnet can add to congestion, since a 1-byte packet will carry 40 bytes of headers. RFC 896 defines the Nagle algorithm in an attempt to abolish tiny packets: the idea is that we should give data a chance to pile up before sending, to be more efficient. It says that a connection may have only one unacknowledged small segment outstanding, and no further small data may be sent until that segment is ACKed. Telnet and interactive ssh connections turn this off with the TCP_NODELAY socket option, so that you get an immediate response when you press a key.
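Disabling Nagle is a one-line socket option. A minimal sketch using the standard socket API, the same call telnet and ssh clients make:

```python
import socket

# Interactive applications set TCP_NODELAY to disable the Nagle
# algorithm, so each keystroke is sent immediately instead of being
# held back waiting for an ACK of the previous small segment.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)  # → True
s.close()
```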
