
Stale IN_PROGRESS causes source to be permanently skipped after unclean shutdown #654

@lujunsan

If the server process is killed ungracefully (OOMKill, SIGKILL, container crash), any sync that was running at the time is left in IN_PROGRESS with no ended_at. On restart, the sync record is preserved as-is, because startup initialization skips existing rows. ShouldSync returns early for any source in the Syncing state and applies no timeout or age check, so the source is silently skipped on every subsequent coordinator cycle and never retried.

The fix should be a background watchdog goroutine (independent of the coordinator loop) that periodically resets IN_PROGRESS rows older than a configurable threshold to FAILED. The threshold must be configurable to account for varying worst-case sync durations. This also covers the case where PerformSync hangs indefinitely in a live process and the coordinator loop is blocked.


Labels: bug
