At 10AM -0800 on 1/12/12 Erik A Johnson wrote:
On November 29, 2012 at 2:39:51 PM PST, Timo Sirainen tss@iki.fi wrote:
Yes, that sounds like it would work better:
if (!proxy->client && net_getpeername(proxy->fd_ssl, NULL, NULL) < 0 && errno == ENOTCONN) {
With getpeername or net_getpeername, errno is set to EINVAL ("socket has been shut down"), so we could instead use:
if (!proxy->client && net_getpeername(proxy->fd_ssl, NULL, NULL) < 0 && errno == EINVAL) {
So it seems that we have the following options:
- net_geterror(proxy->fd_ssl) == EBADF
- read(proxy->fd_ssl, &err, 0) < 0 && errno == ENOTCONN
- net_getpeername(proxy->fd_ssl, NULL, NULL) < 0 && errno == EINVAL
Which is preferable?
I think the djb page I mentioned before comes down in favour of
if (!proxy->client && net_getpeername(...) < 0 &&
read(proxy->fd_ssl, &err, 1) < 0 && errno == ENOTCONN) {
where the read is of 1 byte rather than 0, since a zero-byte read apparently sometimes succeeds even if the socket isn't connected, and the getpeername is there to protect against actually reading from a connected socket. However, doing all that every time we try to read seems like a lot of wasted effort. If possible, it would be better to identify the circumstances in which this can happen: for instance, is it true that once we've done at least one successful SSL read on this socket, this error won't occur?
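To make the getpeername-then-read idea concrete, here is a rough standalone sketch using plain POSIX calls rather than Dovecot's net_getpeername wrapper. The name probe_peer and its return convention are made up for illustration, and accepting EINVAL alongside ENOTCONN is an assumption based only on the OS X behaviour reported earlier in this thread.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper for illustration; not a Dovecot function.
 * Returns 1 if fd has a peer, 0 if it has none (*err_r then holds the
 * errno from a 1-byte probe read, or 0 if that read did not fail),
 * and -1 if getpeername() failed for an unrelated reason. */
int probe_peer(int fd, int *err_r)
{
	struct sockaddr_storage ss;
	socklen_t len = sizeof(ss);
	char ch;

	*err_r = 0;
	if (getpeername(fd, (struct sockaddr *)&ss, &len) == 0)
		return 1; /* connected: skip the read so no payload is consumed */

	/* ENOTCONN is the usual "no peer" errno; EINVAL is what this thread
	 * reports on OS X after shutdown. Accepting both is an assumption. */
	if (errno != ENOTCONN && errno != EINVAL) {
		*err_r = errno;
		return -1;
	}

	/* getpeername() found no peer, so the 1-byte read is only meant to
	 * surface the socket's error, not to consume application data. */
	if (read(fd, &ch, 1) < 0)
		*err_r = errno;
	return 0;
}
```

This is only meant for stream sockets; on other socket types the probe read could block. It also still costs two system calls per check, which is the wasted effort mentioned above.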
Should the "#ifdef __APPLE__" remain, or would any of these tests be appropriate for other platforms as well?
I had a go at reproducing this on FreeBSD and failed, but I don't believe we've seen a packet trace yet so I wasn't entirely sure what might provoke it. There is definitely a bug in the OS here somewhere, unless the socket never gets as far as SYN-SYN/ACK-ACK, since ENOTCONN should only be returned *before* the socket has connected successfully. An ordinary disconnected socket should simply return EOF from read, and a socket that got RST should return ECONNRESET.
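For reference, the behaviour I'd expect from a sane stack can be written down as a small classifier over the result of read(). This is just a sketch of the expectations in the paragraph above; the enum and the function name classify_read are invented for illustration and don't correspond to anything in Dovecot.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical names, for illustration only. */
enum read_outcome {
	READ_DATA,     /* got payload */
	READ_EOF,      /* orderly disconnect by the peer */
	READ_RESET,    /* connection was reset (RST) */
	READ_NOTCONN,  /* socket never completed its connect */
	READ_RETRY,    /* transient: interrupted or would block */
	READ_ERROR     /* anything else */
};

/* ret is the return value of read() with a nonzero byte count;
 * err is errno as observed immediately after that call. */
enum read_outcome classify_read(ssize_t ret, int err)
{
	if (ret > 0)
		return READ_DATA;
	if (ret == 0)
		return READ_EOF;

	switch (err) {
	case ECONNRESET:
		return READ_RESET;
	case ENOTCONN:
		return READ_NOTCONN;
	case EINTR:
	case EAGAIN:
		return READ_RETRY;
	default:
		return READ_ERROR;
	}
}
```

A caller would do something like n = read(fd, buf, sizeof(buf)) and then pass n and errno to classify_read. The case being discussed here is one where READ_NOTCONN shows up on a socket that had apparently already connected, which is why it looks like an OS bug.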
Are you able to reproduce this and get a tcpdump packet trace (on the dovecot side of any firewalls)? Also, when this happens, does it happen straight away or is there a delay until the connection times out?
(I don't suppose you know whether the source for the OS X network stack is online anywhere? I'd be interested to see how different it is from FreeBSD's.)
Ben