Background
PR #218 routes lori's _on_closed through the session state machine so peer-initiated TCP close is detected in every session state. That covers the cases where the OS tells us the connection is gone — FIN (graceful close) or RST (abrupt close).
It does not cover cases where nobody tells us. If the server host loses power, a VM is hard-killed, a NAT box drops the connection from its tracking table, or a network partition drops packets without generating a RST, the client-side OS has no way to know the connection is dead. lori's read never returns, _on_closed never fires, and an idle session sitting in _QueryReady with nothing to send hangs indefinitely.
Why this is a low-priority enhancement, not a bug
In practice, this situation is self-limiting:
- The next query attempt writes to the dead socket; eventually (once OS retransmits give up) lori reports the failure and _on_closed fires.
- If the application is shutting down, it doesn't matter — the actor is torn down anyway.
The only scenario where it bites is a long-idle session that the application wants to keep "warm" without issuing any traffic.
Possible approaches
- OS-level TCP keepalive on the socket (simplest — kernel does the probing)
- Application-level heartbeat (periodic no-op query or Sync message)
- Idle timeout via lori's idle_timeout() mechanism
- Some combination, configurable via ServerConnectInfo
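For reference, the OS-level keepalive option boils down to a handful of socket options. The following is a minimal sketch in Python against the standard BSD socket API, not lori's API; whether and how lori exposes these options is not covered here. The function name `enable_keepalive` and its default values are hypothetical and chosen only for illustration.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Ask the kernel to probe an idle TCP connection (hypothetical helper).

    idle_s:     seconds of silence before the first probe is sent
    interval_s: seconds between unanswered probes
    probes:     unanswered probes before the kernel declares the peer dead
    """
    # SO_KEEPALIVE is the portable on/off switch.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The tuning knobs are platform-specific, so set only those that exist.
    if hasattr(socket, "TCP_KEEPIDLE"):        # Linux and others
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    elif hasattr(socket, "TCP_KEEPALIVE"):     # macOS name for the idle time
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPALIVE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

With these example values the kernel would give up on a silent peer after roughly idle_s + interval_s * probes seconds, at which point a pending read fails and the close path can run as it does for an RST.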
Design should consider whether this is opt-in (default off, activated by passing a keepalive/heartbeat interval) to avoid changing behavior for existing users.
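One possible shape for the opt-in behavior is an optional interval that defaults to "off". The sketch below is in Python purely for illustration: `ConnectOptions` is a hypothetical stand-in for ServerConnectInfo, and the field name `heartbeat_interval_s` and the helper `heartbeat_due` are assumptions, not existing API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConnectOptions:
    """Hypothetical stand-in for ServerConnectInfo."""
    host: str
    port: int
    # None means the feature is disabled, so existing callers see no change.
    heartbeat_interval_s: Optional[float] = None


def heartbeat_due(opts: ConnectOptions, last_activity_s: float, now_s: float) -> bool:
    """Return True when an idle session should send a heartbeat."""
    if opts.heartbeat_interval_s is None:
        return False  # opt-in: never probe unless an interval was supplied
    return now_s - last_activity_s >= opts.heartbeat_interval_s


if __name__ == "__main__":
    legacy = ConnectOptions("db.example", 5432)
    warm = ConnectOptions("db.example", 5432, heartbeat_interval_s=30.0)
    print(heartbeat_due(legacy, 0.0, 3600.0))
    print(heartbeat_due(warm, 0.0, 45.0))
```

Callers that never pass an interval keep today's behavior; only sessions that supply one would start generating idle traffic.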