Syncookies break with Windows 8

Kevin Day kevin at your.org
Fri Feb 1 21:21:20 UTC 2013


We've got a large cluster of HTTP servers, each handling >10,000 req/sec. Occasionally, during periods of heavy load, we'd get complaints from some users that downloads were working but going EXTREMELY slowly. After a whole lot of debugging, we narrowed it down to only Windows 8 clients experiencing this problem. It turns out that FreeBSD's implementation of syncookies is likely violating RFC1323.

When syncookies kicks in, either because the syncache limit is reached or net.inet.tcp.syncookies_only is set, some shortcuts are taken with regard to TCP connections. Unlike some other syncookies implementations which (ab)use timestamps to store options, the FreeBSD implementation of syncookies discards TCP options such as window scaling. In itself this isn't a bad thing, but it becomes a bad thing because we then lie and pretend that we are supporting window scaling.

According to RFC1323, if you want to use TCP window scaling, the client says so on the initial SYN. If the server is also willing to use scaling, it says so on the SYN/ACK. If both parties included a scaling option on their respective SYN, you assume window scaling is working and proceed to use it. If one or both parties don't have a scaling option, you don't scale at all. The problem here is that with syncookies, we don't save the wscale parameter from the client's SYN, but offer to use window scaling anyway on our SYN/ACK, so the client thinks we successfully negotiated window scaling even though we haven't.
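To make the negotiation rule concrete, here is a minimal sketch in Python. The function name and the use of None for "no wscale option present" are my own illustration, not anything from the FreeBSD source; the cap of 14 is the maximum shift RFC1323 allows.

```python
MAX_WSCALE = 14  # largest shift count RFC1323 permits


def negotiated_shifts(client_wscale, server_wscale):
    """Return (shift for the client's windows, shift for the server's windows).

    Each argument is the wscale value carried on that side's SYN or SYN/ACK,
    or None if the option was absent. Scaling is only in effect when BOTH
    sides sent the option; otherwise neither direction is scaled.
    """
    if client_wscale is None or server_wscale is None:
        return (0, 0)
    return (min(client_wscale, MAX_WSCALE), min(server_wscale, MAX_WSCALE))


# Both sides offered scaling: each side's advertised windows get its own shift.
both = negotiated_shifts(4, 5)
# Only one side offered scaling: no scaling at all.
one_sided = negotiated_shifts(None, 5)
```

The syncookie problem is that the two ends evaluate this with different inputs: the client calls it with both values present, while the server has effectively forgotten the client's value.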


This is how a normal window scaled connection happens:

client > server: Flags [S], win 65535, options [mss 1460,nop,wscale 4,nop,nop,sackOK], length 0
(client is connecting, offering a window of 64K, but if scaling is negotiated wants to scale future window sizes by 4 bits)

server > client: Flags [S.], win 65535, options [mss 1460,nop,wscale 5,sackOK,eol], length 0
(server is ACKing the client's SYN, also offering an unscaled window of 64K, but wanting to shift by 5 going forward)

The server and client both offered window scaling, so they're now using it from this point on. All window sizes sent/received are shifted by the appropriate number of bits.


When syncookies kicks in on the server, and the client is anything BUT Windows 8, this happens:

client > server: Flags [S], win 65535, options [mss 1460,nop,wscale 4,nop,nop,sackOK], length 0
(However, syncookies cause the options to get lost. The client sent the "wscale 4" parameter, but we immediately forgot it.)

server > client: Flags [S.], win 65535, options [mss 1460,nop,wscale 5,sackOK,eol], length 0
(server is ACKing the client's SYN, also offering an unscaled window of 64K, but wanting to shift by 5 going forward)

The server sent a wscale back on its SYN/ACK, so the client thinks window scaling is now in effect. But it's not: the server didn't remember the client's wscale option, so it's not scaling any of the window sizes coming in from the client. This doesn't actually hurt much. The client thinks it's telling us it has a 1MB window open, but we only hear a 64K window, so that's all we ever use. It's "failing safe" here, and nothing actually breaks.
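The arithmetic behind "failing safe" can be sketched like this. It assumes the client keeps putting 65535 in the 16-bit window field after the handshake, which is my assumption for illustration; the helper name is hypothetical.

```python
def window_as_understood(raw_field, shift):
    """Window size in bytes that a host derives from the 16-bit window field."""
    return raw_field << shift


raw = 65535  # assumed value of the window field in the client's later segments

# The client believes its wscale of 4 was accepted.
client_means = window_as_understood(raw, 4)
# The server forgot the client's wscale, so it applies no shift.
server_hears = window_as_understood(raw, 0)

# The server underestimates the window, so it never sends more than the
# client can actually accept.
assert server_hears <= client_means
```

Here client_means works out to 1048560 bytes (about 1MB) and server_hears to 65535 bytes, so the only cost is throughput on high bandwidth-delay paths.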


Now throw Windows 8 into the mix. Windows 8's TCP auto tuning is much more aggressive than previous versions of Windows. I honestly can't tell if this is a bug or intentional design, but Windows will sometimes, intermittently, advertise a much much larger wscale option than it actually needs. This is a mild example of what happens:

client > server: Flags [S], win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
(client is connecting, offering an unscaled window of 8192 bytes, but wants to negotiate window scaling of 8 bits if the server will accept it)

server > client: Flags [S.], win 65535, options [mss 1460,nop,wscale 5,sackOK,eol], length 0
(server is ACKing the client's SYN, also offering an unscaled window of 64K, but wanting to shift by 5 going forward)

We're at the same point here as in the above example: the client now believes we've successfully negotiated window scaling, but on the server side we're treating all window sizes coming from the client as being shifted by 0. So the client sends its first ACK:

client > server: Flags [.], seq 1, ack 1, win 256, length 0

The client believes we're still scaling everything it says by 8 bits, but it only wants to give us a 64K window, so it's saying 256 here. (256<<8 = 65536). We don't remember that we agreed to shift everything by 8, so we treat that as just 256. The connection now proceeds, but we think we can only send 256 bytes at a time. It is extremely slow.

I have seen Windows 8 attempt to use wscale parameters of 8 all the way up to 10. While I've only caught a few cases of this happening in the wild, when it's using 10 we end up thinking we only have a 64 byte window and things get really silly really fast.
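A quick sketch of how bad this gets for each shift value I've seen. It assumes the client wants to offer a 64K (65536 byte) window in every case, as in the trace above; the function name is made up for the example.

```python
def raw_window_field(intended_bytes, shift):
    """Value the client puts in the 16-bit window field for a given shift."""
    return min(intended_bytes >> shift, 0xFFFF)


INTENDED = 65536  # the client wants to offer a 64K window

# Map each wscale to the number of bytes the server thinks it may send,
# given that the server applies a shift of 0 to whatever the client says.
server_view = {shift: raw_window_field(INTENDED, shift) for shift in (8, 9, 10)}

for shift, window in server_view.items():
    print(f"wscale {shift}: client means {INTENDED} bytes, server hears {window} bytes")
```

With a 1460 byte MSS, every one of those windows is smaller than a single full-sized segment, which is why the transfers crawl.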


I've been talking with someone on Microsoft's side of things about why Windows is choosing to do this. But my own view is that if syncookies are being used in their current state (we lose the client's wscale option), we can't advertise wscale on the SYN/ACK. My reading of RFC1323 is that putting a wscale option in our SYN/ACK means we agreed to use the wscale the client sent in its SYN, so I don't think what we're doing now is correct. If syncookies are being used, we should advertise MIN(sb_max, TCP_MAXWIN) with no scaling and stay within the RFC.
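Roughly what I have in mind, as a Python sketch rather than a kernel patch. The function and its dictionary return value are hypothetical; TCP_MAXWIN is 65535 as in netinet/tcp.h, and sb_max is passed in rather than assumed.

```python
TCP_MAXWIN = 65535  # largest window expressible without scaling


def synack_window_options(syncookie_in_use, sb_max, server_wscale):
    """Choose the window and wscale option to put on a SYN/ACK.

    When the reply is generated from a syncookie, the client's wscale will
    not be remembered, so the option is left off entirely and the client
    falls back to unscaled windows in both directions.
    """
    win = min(sb_max, TCP_MAXWIN)
    if syncookie_in_use:
        return {"win": win, "wscale": None}
    return {"win": win, "wscale": server_wscale}


cookie_reply = synack_window_options(True, 2 * 1024 * 1024, 5)
normal_reply = synack_window_options(False, 2 * 1024 * 1024, 5)
```

The cost is that syncookie connections are limited to 64K windows, but that is already what we effectively get in the non-Windows-8 case.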

This doesn't affect Linux because it uses timestamp options to stuff the client's wscale, so it gets re-learned on the ACK. OpenBSD and OS X don't have syncookies. NetBSD seems to have the same problem if its new syncookie implementation gets turned on.
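For anyone unfamiliar with the timestamp trick, the idea can be sketched as below. The 4-bit layout here is purely illustrative and is NOT Linux's actual encoding; the names are mine. The server hides the client's wscale in the low bits of the timestamp value on the SYN/ACK, and the client echoes that value back on its ACK.

```python
WSCALE_BITS = 4  # hypothetical layout: 4 bits is enough for shifts 0..14
WSCALE_MASK = (1 << WSCALE_BITS) - 1


def encode_tsval(now_ticks, client_wscale):
    """Build the 32-bit timestamp value for the SYN/ACK, with the client's
    wscale stored in the low bits in place of clock bits."""
    return ((now_ticks & ~WSCALE_MASK) | (client_wscale & WSCALE_MASK)) & 0xFFFFFFFF


def decode_tsecr(tsecr):
    """Recover the client's wscale from the timestamp echoed on the ACK."""
    return tsecr & WSCALE_MASK


tsval = encode_tsval(123456789, 8)
relearned = decode_tsecr(tsval)  # the client echoes tsval back unchanged
```

This only helps when the client sent a timestamp option on its SYN. The SYNs in the traces above carry only mss, wscale and sackOK, so for clients like those the wscale would still be lost.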

Any thoughts? Was there a reason why we're forcing the use of wscale on syncookie connections?

-- Kevin


