add backoff to heartbeats and detect unresponsive nodes #20

Open
miDeb wants to merge 8 commits into main from feat/node-offline-detection


@miDeb miDeb commented May 25, 2026

Member

Adds config options that enable exponential backoff for heartbeat requests and set a maximum number of unanswered heartbeats.

Moves the heartbeat logic to heartbeat.rs.
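
For readers skimming the description, here is a minimal sketch of what such options and the resulting backoff delay could look like. The struct, its field names, and `delay_after` are hypothetical and chosen for illustration only; the names actually used in heartbeat.rs may differ.

```rust
use std::time::Duration;

/// Hypothetical heartbeat settings. Field names are illustrative
/// assumptions, not taken from this pull request.
#[derive(Debug, Clone)]
pub struct HeartbeatConfig {
    /// Delay between heartbeats while the node keeps responding.
    pub base_interval: Duration,
    /// Multiplier applied once per consecutive unanswered heartbeat.
    pub backoff_factor: u32,
    /// Upper bound on the delay between two heartbeat requests.
    pub max_interval: Duration,
    /// Consecutive unanswered heartbeats after which a node counts as offline.
    pub max_unanswered: u32,
}

impl HeartbeatConfig {
    /// Delay before the next heartbeat request, given how many
    /// requests in a row went unanswered.
    pub fn delay_after(&self, unanswered: u32) -> Duration {
        // Saturate instead of overflowing when the exponent gets large.
        let factor = self
            .backoff_factor
            .checked_pow(unanswered)
            .unwrap_or(u32::MAX);
        self.base_interval
            .checked_mul(factor)
            .unwrap_or(self.max_interval)
            .min(self.max_interval)
    }
}
```

With a base interval of 1 s, a factor of 2, and a cap of 10 s, this sketch yields delays of 1 s, 2 s, 4 s, 8 s, and then 10 s for every further unanswered heartbeat.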

miDeb and others added 8 commits May 9, 2026 17:28
If nodes don't respond, we retry sending heartbeat requests with exponential backoff.
If they still do not respond after a set number of attempts, we remove them from the list of connected nodes,
and they must re-register.
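
The bookkeeping described in this commit message could be sketched roughly as follows. `NodeRegistry`, `MissOutcome`, and the string node IDs are hypothetical names made up for this example; the real types in the branch are not shown on this page.

```rust
use std::collections::HashMap;

/// Result of recording one unanswered heartbeat (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum MissOutcome {
    /// Keep retrying; carries the number of consecutive misses so far.
    Retry(u32),
    /// The limit was reached: the node was removed and must re-register.
    Removed,
    /// The node is not in the list of connected nodes.
    Unknown,
}

/// Hypothetical list of connected nodes with per-node miss counters.
pub struct NodeRegistry {
    max_unanswered: u32,
    unanswered: HashMap<String, u32>,
}

impl NodeRegistry {
    pub fn new(max_unanswered: u32) -> Self {
        Self { max_unanswered, unanswered: HashMap::new() }
    }

    /// Adds a node (or re-adds it after removal) with a fresh counter.
    pub fn register(&mut self, id: &str) {
        self.unanswered.insert(id.to_string(), 0);
    }

    /// A response resets the consecutive-miss counter.
    pub fn record_response(&mut self, id: &str) {
        if let Some(count) = self.unanswered.get_mut(id) {
            *count = 0;
        }
    }

    /// Counts one unanswered heartbeat and removes the node at the limit.
    pub fn record_miss(&mut self, id: &str) -> MissOutcome {
        let misses = match self.unanswered.get_mut(id) {
            Some(count) => {
                *count += 1;
                *count
            }
            None => return MissOutcome::Unknown,
        };
        if misses >= self.max_unanswered {
            self.unanswered.remove(id);
            MissOutcome::Removed
        } else {
            MissOutcome::Retry(misses)
        }
    }

    pub fn is_connected(&self, id: &str) -> bool {
        self.unanswered.contains_key(id)
    }
}
```

In this sketch the counter tracks consecutive misses only, so a single response resets it. Whether the actual change counts misses this way is an assumption.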
Previously, heartbeats were scheduled at a whole number of periods from a fixed reference point.
Therefore, if one heartbeat request was delayed but the next was not, the time between those two requests could be shorter than one period.
This made the backoff calculation wrong.
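
To make the timing problem concrete, the two scheduling strategies can be compared in a small sketch. Both function names are hypothetical, and times are modeled as `Duration` offsets from an arbitrary start so that the arithmetic is easy to follow; the actual code presumably works with a timer or `Instant` values.

```rust
use std::time::Duration;

/// Old behaviour as described above: heartbeat `n` is due at
/// `start + n * period`, no matter when earlier requests were really sent.
pub fn due_from_fixed_reference(start: Duration, period: Duration, n: u32) -> Duration {
    start + period * n
}

/// Alternative: the next heartbeat is due `delay` after the previous
/// request was actually sent, so the gap is never shorter than `delay`.
pub fn due_from_last_send(last_sent: Duration, delay: Duration) -> Duration {
    last_sent + delay
}
```

For example, with a 10 s period, suppose the first request was due at 10 s but only went out at 17 s. The fixed-reference schedule places the second request at 20 s, just 3 s later, whereas scheduling from the last send places it at 27 s and keeps the full 10 s gap.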
@raffael0 raffael0 linked an issue Jun 7, 2026 that may be closed by this pull request

Development

Successfully merging this pull request may close these issues.

Implement Heartbeat in Ferroflow