As part of our ongoing discussion on fast file transfers, or “acceleration” (which we began with the article “Accelerating File Transfer”), we have already talked about how transfers can be “virtually” accelerated by minimizing the data set. This is done with compression or by sending file deltas. In short: less data = faster transfer.
An acceleration solution should also optimize your link. This takes us out of the realm of “virtual” speed boosts and into the realm of more data making it across the line in less time. One way to accomplish this is to use multiple streams.
As a TCP-based protocol, FTP is sequential in nature. For the easily distracted: the problem is that when this sequence (the cycle of acknowledging packets) is disrupted by lost packets or latency, TCP transfers are throttled. You can compensate for this bottleneck, at least in part, by sending more than one chunk of data at a time, be it separate files or separate pieces of the same file.
For the braver souls, here’s the techno-babble:
Latency and its effect on FTP
In order to provide reliable data transmission, TCP/IP requires the receiver to acknowledge each packet being sent, in sequential order. Each such communication is measured as round trip time (RTT), or the time it takes for a packet to be sent and acknowledged.
TCP responds to latency by adjusting the amount of unacknowledged data that can be on the link before waiting for a reply. The optimal amount of unacknowledged data en route equals the end-to-end bandwidth multiplied by the RTT, also called the bandwidth-delay product. TCP continually estimates this value, setting a “TCP window” to control how much data should be sent. TCP has limits on the size of this window, so when the bandwidth-delay product exceeds a certain threshold, the result is a lot of waiting, or “dead air”. Satellite connections can have RTTs in the hundreds or even thousands of milliseconds.
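To make the “dead air” concrete, here is a small back-of-the-envelope sketch (plain Python, with illustrative numbers of our own choosing, not FileCatalyst internals). It computes the bandwidth-delay product and the throughput ceiling imposed by a fixed TCP window:

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to keep the link full."""
    return bandwidth_bps / 8 * rtt_s

def max_throughput_bps(window_bytes: float, rtt_s: float) -> float:
    """With a fixed window, TCP can send at most one window's worth of data per round trip."""
    return window_bytes * 8 / rtt_s

# Example: a 10 Mbps satellite link with 600 ms RTT.
print(bdp_bytes(10e6, 0.6))            # 750000.0 -> 750 KB must be in flight
# ...but a classic 64 KB TCP window caps throughput far below the link rate:
print(max_throughput_bps(65535, 0.6))  # 873800.0 -> roughly 874 kbps on a 10 Mbps link
```

The gap between the 750 KB the link could carry and the 64 KB window is exactly the “dead air” the article describes: the sender spends most of each round trip waiting for acknowledgments.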
Packet loss and its effect on FTP
Network congestion typically causes buffer overflows in intermediate routers, causing packet loss. Since packets are sent sequentially, this can cause a hold-up in the cycle. Unacknowledged packets also cause the TCP window to shrink or even close completely for periods of time. Wireless and satellite transfers can have even higher packet loss due to sources of interference such as clouds or physical structures. Packet loss combined with high latency creates even worse performance for FTP transfers.
Acceleration with Multiple Streams
The idea behind multiple streams is to reduce the “dead air” created by the acknowledgment cycle. Instead of using one stream, multiple streams are opened, each sending a different piece of data. In theory this could be separate files, but that wouldn’t help transmission of a single large file. Instead, FileCatalyst will divide a file into separate chunks: in a 3-stream situation, for example, stream 1 might send chunks 1, 4, 7, 10, etc.; stream 2 would send 2, 5, 8, 11, etc.; stream 3 would send 3, 6, 9, 12, etc. This way, there is almost always SOME data going across. No “dead air”. And even if chunks arrive out of order, they are simply “slotted” into place accordingly. Sounds good, but it’s important to recognize that there ARE diminishing returns on opening too many streams (and your hardware I/O needs to be able to keep up in order to make it worthwhile), but for the most part having multiple streams will produce a reasonable amount of acceleration.
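The round-robin striping described above can be sketched in a few lines of Python. The helper names (`assign_chunks`, `reassemble`) are our own for illustration, not part of FileCatalyst; chunks are 0-indexed here, where the article counts from 1:

```python
def assign_chunks(num_chunks: int, num_streams: int) -> list[list[int]]:
    """Round-robin assignment: stream s sends chunks s, s+n, s+2n, ..."""
    return [list(range(s, num_chunks, num_streams)) for s in range(num_streams)]

def reassemble(received: dict[int, bytes], num_chunks: int) -> bytes:
    """Chunks may arrive out of order; slot each into place by its index."""
    return b"".join(received[i] for i in range(num_chunks))

# 12 chunks over 3 streams, mirroring the 3-stream example above:
print(assign_chunks(12, 3))
# [[0, 3, 6, 9], [1, 4, 7, 10], [2, 5, 8, 11]]
```

Because each stream runs its own acknowledgment cycle, a stall on one stream (a lost packet, a shrunken window) no longer idles the whole transfer; the others keep data moving while it recovers.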
Applied to TCP-based transfers such as FTP, multiple streams offer a clear benefit. Because of this, FileCatalyst acceleration technology includes multiple TCP streaming for situations in which UDP-based transfers are not possible. However, this approach still does not produce the same dramatic effect as ditching TCP in favour of alternatives like UDP. UDP-based transfers, which are at the heart of the FileCatalyst acceleration technology, will be the subject of our next article on acceleration.