Fix FIN_WAIT_2 accumulation by draining sockets before close #600
renaudallard wants to merge 2 commits into tinyproxy:master
After shutdown(SHUT_WR) sends a FIN to the remote, tinyproxy was calling close() without waiting for the remote FIN. The socket was orphaned in FIN_WAIT_2 state. On Linux this is masked by tcp_fin_timeout reaping orphaned sockets after 60s. On OpenBSD, without SO_KEEPALIVE, these sockets persist indefinitely and accumulate until the proxy stalls.

Add close_socket() that calls shutdown(SHUT_WR), then drains the socket with a 2-second receive timeout to allow the remote FIN to arrive before calling close(). Use it in conn_destroy_contents() for both client and server file descriptors, covering all exit paths from relay_connection() including idle timeout and poll error returns.

Also add the missing shutdown(server_fd, SHUT_WR) in relay_connection() after flushing remaining data to the upstream server, so the server receives a proper FIN rather than relying on the implicit close().
72c3e70 to 004da4b
src/conns.c (outdated)
 * from getting stuck in FIN_WAIT_2 on systems that do not aggressively
 * reap orphaned sockets (e.g. OpenBSD without SO_KEEPALIVE).
 */
static void close_socket (int fd)
this should probably go to sock.c
Agreed, moved to sock.c where socket creation and destruction already live.
src/conns.c (outdated)
shutdown (fd, SHUT_WR);

tv.tv_sec = 2;
why 2 secs ? this timeout is probably too short for long-range or wacky mobile connections.
Good point, bumped to 10 seconds. This is only for receiving the remote FIN after we sent ours, so it should not block normal operation.
        if (write_buffer (connptr->server_fd, connptr->cbuffer) < 0)
                break;
}
shutdown (connptr->server_fd, SHUT_WR);
why another shutdown() here ?
It mirrors the pre-existing shutdown(client_fd, SHUT_WR) above. Both initiate the close handshake early so that by the time close_socket() runs in conn_destroy_contents, the remote FIN may already have arrived and the drain loop returns immediately.
tbh this is the first time i hear that another read is necessary after shutdown. do you have a link to some documentation about this ? (optimally POSIX).

POSIX does not specify TCP state machine behavior, so there is no POSIX reference for this. The relevant sources are RFC 793 section 3.5 (TCP connection closing and the FIN_WAIT_2 state) and Stevens, "UNIX Network Programming" Vol. 1, section 6.6 (the shutdown function and half-close pattern).

The problem: after close(), the socket becomes orphaned in the kernel. If it is still in FIN_WAIT_2 (we sent our FIN, waiting for the remote's FIN), the kernel must handle the rest on its own. Linux reaps orphaned FIN_WAIT_2 sockets via net.ipv4.tcp_fin_timeout (default 60s), but OpenBSD has no equivalent, so they persist indefinitely.

The fix: shutdown(SHUT_WR) sends our FIN while keeping the fd open. The read loop then waits for the remote's FIN (read returns 0). Only then is close() called, at which point the socket has already completed the four-way handshake and transitions to TIME_WAIT normally.
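To make the sequence above concrete, here is a small loopback sketch of the half-close exchange. It is not code from this PR: half_close_demo() is a hypothetical name, and it assumes IPv4 loopback is available and that both sockets are blocking. It only shows the read()-returns-0 behaviour on each side; it does not measure how long an orphaned socket stays in FIN_WAIT_2 on any particular OS.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical demo, not tinyproxy code.
 * Returns 1 if both sides observed EOF in the expected order, 0 otherwise.
 */
int half_close_demo(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    char buf[16];
    int lfd, cfd = -1, sfd = -1, ok = 0;

    /* Listen on an ephemeral loopback port chosen by the kernel. */
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return 0;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;
    if (listen(lfd, 1) < 0)
        goto out;
    if (getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
        goto out;

    /* "cfd" plays the side that closes first; "sfd" plays the remote. */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0)
        goto out;
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0)
        goto out;

    /* 1. Send our FIN; the descriptor stays open for reading. */
    if (shutdown(cfd, SHUT_WR) < 0)
        goto out;

    /* 2. The remote sees end-of-file: read() returns 0. */
    if (read(sfd, buf, sizeof(buf)) != 0)
        goto out;

    /* 3. The remote closes, which sends its FIN. */
    close(sfd);
    sfd = -1;

    /* 4. Our read() now returns 0 as well: the remote FIN has arrived. */
    if (read(cfd, buf, sizeof(buf)) != 0)
        goto out;

    ok = 1;
out:
    if (sfd >= 0)
        close(sfd);
    if (cfd >= 0)
        close(cfd);
    close(lfd);
    return ok;
}
```

By step 4 both FINs have been exchanged while the descriptor is still open, so the close() that follows does not leave an orphan waiting for the remote FIN.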
Problem
On OpenBSD, proxied connections accumulate in FIN_WAIT_2 state and are never reaped. When enough build up, the proxy stalls.

After shutdown(client_fd, SHUT_WR) sends a FIN to the client, conn_destroy_contents() calls close() without waiting for the remote FIN. The socket is orphaned while still in FIN_WAIT_2.

On Linux this is masked by net.ipv4.tcp_fin_timeout (default 60 s), which aggressively reaps orphaned FIN_WAIT_2 sockets. OpenBSD has no equivalent aggressive timeout, so without SO_KEEPALIVE these sockets persist indefinitely.

The problem also affects the timeout and poll-error return paths in relay_connection(), which skip shutdown() entirely and go straight to close().

Fix
- Add a close_socket() helper in conns.c that performs a proper TCP close handshake: shutdown(SHUT_WR), drain with a 2-second SO_RCVTIMEO, then close(). This gives the remote side time to send its FIN before the socket is orphaned.
- Use it in conn_destroy_contents() for both client and server file descriptors, covering all exit paths from relay_connection().
- Add the missing shutdown(server_fd, SHUT_WR) in relay_connection() after flushing remaining data to the upstream server.
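
For readers who want the shape of the helper without opening the diff, a rough sketch follows. It is an approximation written for this description, not the code in the PR: the name close_socket_sketch(), the timeout parameter, the buffer size and the int return value are all assumptions, and it assumes fd is a connected, blocking stream socket.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical sketch of a drain-before-close helper; not the PR's code.
 * Returns whatever close() returns.
 */
int close_socket_sketch(int fd, int timeout_sec)
{
    struct timeval tv;
    char buf[256];

    assert(timeout_sec > 0);

    /* Send our FIN but keep the descriptor open for reading. */
    shutdown(fd, SHUT_WR);

    /* Bound each read so a silent peer cannot block us forever. */
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    /* Discard leftover data until EOF (remote FIN), an error, or a timeout. */
    while (read(fd, buf, sizeof(buf)) > 0)
        ;

    return close(fd);
}
```

Note that in this sketch the timeout applies to each read() separately, so a peer that keeps sending data could extend the wait. A production helper might also cap the total drain time.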
Testing
Tested on OpenBSD. After the fix, connections transition through
TIME_WAIT normally instead of accumulating in FIN_WAIT_2.