Monday, September 7, 2009

Congestion Avoidance and Control

This paper describes changes to the BSD TCP stack made to improve its response to congestion in the network. The algorithms tries to maintain equilibrium, by which the authors mean a conservation of packets, or that no new packet is put into the network until an old one is taken out. Getting to this state without global knowledge is the focus of this paper.

There are a number of techniques used to achieve equilibrium. First is slow-start, in which a second "congestion window" (cwnd) is used to limit the number of packets are sent initially. When sending the min of the receivers advertised window and cwnd is used. cwnd starts at one and is increased by one due to each successful ACK. This will increase window size quickly, but avoids the problem where two hosts on fast networks have a slow link between them which will be overloaded by sending the size of the advertised window.

The second technique is estimating the variance in the round trip time of packets (gleaned from ACKs). This is necessary since as the network becomes congested variance will increase. If the sender responds to packets not getting ACKed fast enough by retransmitting it will only overload the network more. The authors develop a cheap method for estimating variance and the retransmit timer based on that method avoids unnecessary retransmits.

Finally the authors take the results from Increase/Decrease Algorithms paper and argue for additive increase and multiplicative decrease in window sizes. They solve the problem of delayed feedback by using dropped packets as an indicator of congestion rather than an explicit bit.

This was a very well argued paper, and it has the additional benefit of being an actual implementation. The reasons for the decisions made are clearly set forth and only the assumptions the authors make can really be argued with.

Nevertheless I still have some issues with this paper. The paper does not deal with misbehaving nodes which are always a problem in a heterogeneous network. (See here for example). Also, dropped packets are not always an indication of congestion. I think wireless was probably much less used when this paper was written that it is now, but I imagine there must be higher dropped packet rates on a wireless network due to corruption than on a wired network and as such the slow retransmit could mean that wireless networks are not used at their full capacity with this scheme. This paper also doesn't consider more ad-hoc networks in which each stage in a message transmission might be acked but never the whole path (i.e. in a sensor network). How well the RTT estimation would work in such a network is not clear.

Still, this seems a very fundamental paper to and should be kept on the reading list.

No comments:

Post a Comment