Feat/auto-restart tunnels on stale handshake or ping failure 1036 EXPERIMENTAL #1176
Closed
naonak wants to merge 8 commits into wgtunnel:master
Introduces the data model for the auto-restart feature:
- MonitoringSettings entity/domain model with all configurable fields: isAutoRestartEnabled, restartCooldownSeconds, maxHandshakeRestartAttempts, startupGraceSeconds, isRecoveryNotificationEnabled, isPingMonitoringEnabled, pingFailuresBeforeRestart, isBackoffEnabled, backoffMaxAttempts, maxAttemptsAction
- MaxAttemptsAction enum: DO_NOTHING or STOP_TUNNEL when max attempts reached
- TunnelRestartProgress domain state for real-time UI feedback
- BackendMessage extended with restart-related events (restarting, recovered, max attempts reached)
- MonitoringSettingsMapper for entity ↔ domain conversion
- DatabaseConverters updated for new types
- AppDatabase bumped to v31 with auto-migrations (v29→30, v30→31)
- DB schema snapshots for v30 and v31

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
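As a rough illustration of the enum and its string conversion mentioned above, here is a minimal sketch in plain Kotlin. Only the enum name and its two constants come from the commit message. `MaxAttemptsActionConverter`, its function names, and the fallback to `DO_NOTHING` for unknown values are assumptions; the real DatabaseConverters would presumably use Room's `@TypeConverter` annotations, which are left out so the snippet has no Android dependencies.

```kotlin
// Enum as named in the commit message.
enum class MaxAttemptsAction { DO_NOTHING, STOP_TUNNEL }

// Hypothetical stand-in for the converter pair in DatabaseConverters.
object MaxAttemptsActionConverter {
    // Persist the enum by its constant name.
    fun toStored(action: MaxAttemptsAction): String = action.name

    // Read it back; unknown or null values fall back to DO_NOTHING (an assumption).
    fun fromStored(value: String?): MaxAttemptsAction =
        MaxAttemptsAction.values().firstOrNull { it.name == value }
            ?: MaxAttemptsAction.DO_NOTHING
}
```

Storing the constant name rather than the ordinal keeps existing rows valid if constants are later reordered.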
HandshakeRestartHandler runs a coroutine per active tunnel and handles all auto-restart logic. Key behaviours:

Restart triggers
- Stale handshake: restarts when last handshake exceeds the WireGuard threshold (always active when auto-restart is enabled)
- Ping failure streak: optionally restarts after N consecutive ping failures reported by the ping monitor (isPingMonitoringEnabled)

False-positive protection
- Startup grace period: skips restart checks for configurable seconds after the tunnel first starts, avoiding false triggers during the initial WireGuard handshake
- Post-restart grace period: waits after each restart before re-checking, preventing rapid-fire loops when the cooldown is shorter than the WireGuard re-keying time
- Pre-restart verification pings: when ping is enabled, performs a fresh ping series just before restarting; skips restart if any target is reachable (tunnel self-recovered)

Rate limiting & give-up
- Configurable cooldown between attempts (restartCooldownSeconds)
- Optional exponential backoff: doubles cooldown each attempt up to backoffMaxAttempts, then triggers maxAttemptsAction
- maxAttemptsAction: DO_NOTHING (keep monitoring) or STOP_TUNNEL

Observability
- Emits TunnelRestartProgress events consumed by the UI for real-time status display (countdown, attempt count, restart reason)
- Recovery notifications via NotificationMonitor when a tunnel that was restarting comes back healthy

Integration
- TunnelManager creates one HandshakeRestartHandler per tunnel start and cancels it on stop
- TunnelLifecycleManager and TunnelProvider updated to expose the required state flows

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
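The trigger rules listed above can be sketched as a pure decision function. This is only an illustration, not the handler's actual code: `TunnelSnapshot`, `RestartPolicy`, `RestartReason` and `restartReason` are hypothetical names, and the stale-handshake threshold is passed in as a parameter because its exact value is not stated in this commit message. Cooldowns, the post-restart grace period and the pre-restart verification pings are deliberately left out.

```kotlin
// Hypothetical view of one tunnel at one monitoring tick.
data class TunnelSnapshot(
    val nowMillis: Long,
    val tunnelStartedAtMillis: Long,
    val lastHandshakeAtMillis: Long,
    val consecutivePingFailures: Int,
)

// Subset of the settings relevant to the trigger decision.
// Field names follow the commit message except the threshold, which is assumed.
data class RestartPolicy(
    val startupGraceSeconds: Int,
    val staleHandshakeThresholdSeconds: Int,
    val isPingMonitoringEnabled: Boolean,
    val pingFailuresBeforeRestart: Int,
)

enum class RestartReason { STALE_HANDSHAKE, PING_FAILURE }

// Returns the reason to restart, or null when no restart should be triggered.
fun restartReason(s: TunnelSnapshot, p: RestartPolicy): RestartReason? {
    // Startup grace: skip every check while the tunnel is still coming up.
    val uptimeMillis = s.nowMillis - s.tunnelStartedAtMillis
    if (uptimeMillis < p.startupGraceSeconds * 1000L) return null

    // Stale handshake: always checked once the grace period has passed.
    val handshakeAgeMillis = s.nowMillis - s.lastHandshakeAtMillis
    if (handshakeAgeMillis > p.staleHandshakeThresholdSeconds * 1000L) {
        return RestartReason.STALE_HANDSHAKE
    }

    // Ping failure streak: only considered when ping monitoring is enabled.
    if (p.isPingMonitoringEnabled &&
        s.consecutivePingFailures >= p.pingFailuresBeforeRestart
    ) {
        return RestartReason.PING_FAILURE
    }
    return null
}
```

Keeping the decision free of coroutines and clocks makes it easy to unit-test with fixed timestamps, while the surrounding monitoring loop supplies the current time and tunnel statistics.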
New AutoRestartScreen accessible from Settings > Tunnel Monitoring:
- Enable/disable auto-restart toggle
- Restart cooldown dropdown (5s → 5min)
- Startup grace period dropdown (0 → 60s)
- Restart on ping failure toggle (gated on ping being enabled)
- Consecutive ping failures threshold (1–5)
- Exponential backoff toggle with give-up attempts dropdown; dropdown label shows estimated total wait time for quick tuning
- Max attempts action: do nothing or stop tunnel
- Recovery notifications toggle

Navigation: added AutoRestart route, entry in MainActivity nav graph, and navbar state mapping.
MonitoringViewModel exposes all settings as state with individual update intents.
LabelledNumberDropdown added as a new reusable component for numeric option lists.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
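The estimated total wait in the give-up dropdown follows from the backoff rule given in the PR description: cooldown × 2^(attempt-1), capped at attempt 31 to prevent Long overflow. The sketch below shows that arithmetic under stated assumptions. `computeCooldownSeconds` and `totalCooldownSeconds` are hypothetical names, since the signature of the PR's own `computeCooldown()` is not shown, and the total counts cooldowns only. Whether the app's label also includes grace periods or restart time is not something this sketch can confirm, so its figures may differ from what the UI displays.

```kotlin
// Cooldown before the given 1-based attempt: base × 2^(attempt-1) when backoff is on.
// The attempt is capped at 31 so the shift cannot overflow a Long for realistic bases.
fun computeCooldownSeconds(
    baseCooldownSeconds: Long,
    attempt: Int,
    backoffEnabled: Boolean = true,
): Long {
    require(baseCooldownSeconds >= 0) { "cooldown must not be negative" }
    require(attempt >= 1) { "attempt numbers start at 1" }
    if (!backoffEnabled) return baseCooldownSeconds
    val cappedAttempt = minOf(attempt, 31)
    // Shifting left by (n-1) bits multiplies by 2^(n-1).
    return baseCooldownSeconds shl (cappedAttempt - 1)
}

// Sum of the cooldowns for attempts 1..attempts (cooldowns only, an assumption).
fun totalCooldownSeconds(baseCooldownSeconds: Long, attempts: Int): Long =
    (1..attempts).sumOf { computeCooldownSeconds(baseCooldownSeconds, it) }
```

With a 15 s base, the per-attempt cooldowns are 15, 30, 60, 120 and 240 s, so `totalCooldownSeconds(15, 5)` is 465 s.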
Each tunnel card now displays live restart progress when HandshakeRestartHandler is active:
- "Restarting… (attempt N)" during a restart
- "Next restart in Xs" countdown during cooldown
- Restart reason: stale handshake or ping failure
- Ping target when the trigger is a ping failure
- "Max attempts reached" when give-up action fires
- Status clears automatically on tunnel recovery

SharedAppViewModel collects TunnelRestartProgress from TunnelManager and exposes it as a StateFlow. TunnelsUiState carries the progress map keyed by tunnel ID. SettingsViewModel passes it through to the tunnels screen.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This was referenced Feb 26, 2026
- AppDatabase: consolidate auto-migrations 31→32→33→34→35 into a single AutoMigration(31, 35); intermediate schema files were never committed so Room could not generate the migration code
- TunnelManager: remove `override val restartCounts` which was absent from the TunnelProvider interface (restartCounts is managed internally by HandshakeRestartHandler; attemptNumber in TunnelRestartProgress serves the same purpose externally)
- SharedAppViewModel: remove redundant restartCounts from the combine; use restartProgress only (which already contains attemptNumber)
- TunnelList: remove restartCount parameter from TunnelStatisticsRow call (parameter was removed from the composable signature)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add description under "Restart on ping failure" (requires ping monitoring)
- Add description under "Startup grace period"
- Tune defaults: grace 30→10s, cooldown 30→15s, ping failures before restart 1→2

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Leaving traffic silently failing for 10 seconds on WiFi→LTE transitions was too long. 3 seconds is sufficient to distinguish a real network switch from a brief drop, while limiting unnecessary downtime.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…creen

When auto-restart is active but battery optimization is NOT disabled, Android may restrict the monitoring process (especially on pre-Android-14 devices). A contextual banner now appears at the top of the screen with a direct link to the system battery optimization settings.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
new version #1182
Auto-restart tunnels on stale handshake or ping failure (EXPERIMENTAL)
WORK IN PROGRESS.
Summary
Adds an optional auto-restart mechanism that monitors the active WireGuard tunnel and automatically restarts it when a connection problem is detected. The feature is entirely opt-in and configurable through a new screen under Settings → Tunnel Monitoring → Auto-restart.
Problem
WireGuard tunnels can silently stop passing traffic when:
Without manual intervention the tunnel stays "Up" in the UI while being effectively dead.
What's new
Functional
Configuration
Technical design
HandshakeRestartHandler
The core of the feature. A monitoring coroutine is started when the tunnel comes up and cancelled when it goes down (via activeTunnelsStateFlow). A Mutex serialises job lifecycle.
Trigger logic (shouldTrigger)
isPingMonitoringEnabled
False-positive protection
lastPingAttemptMillis comparison
NetworkUtils.pingWithStats()
isTunnelStale() can still fire on stale stats before the new handshake completes
Rate limiting
ArrayDeque<Long>
cooldown × 2^(attempt-1), capped at attempt 31 to prevent Long overflow
Network change reactivity
- networkChangeFlow observes connectivity state changes (WiFi ↔ Cellular ↔ Ethernet) and wakes the monitoring loop immediately after a 3 s grace, avoiding the full ~3.5 min stale-handshake wait after a network switch
Give-up
- DO_NOTHING: suspends until the tunnel recovers or goes down, then resets timestamps and resumes monitoring
- STOP_TUNNEL: calls stopTunnel(id) and returns (job terminates)
Database
- MonitoringSettings entity extended with 9 new fields (all with sane defaults via auto-migration)
- MaxAttemptsAction stored as a string enum via DatabaseConverters
- TunnelRestartProgress is a pure domain state type: not persisted, lives only in memory
UI
- AutoRestartScreen exposes all settings through MonitoringViewModel (Orbit MVI pattern)
- AutoRestartScreen shows a warning banner when battery optimization is enabled, with a direct tap-to-disable shortcut; battery optimization can prevent auto-restart from firing reliably on some devices
- LabelledNumberDropdown added as a reusable component for numeric option lists
- 5 attempts (~4m35s)) computed from computeCooldown() so the user can reason about the effective timeout
- TunnelRestartProgress flows from HandshakeRestartHandler → TunnelManager → SharedAppViewModel → TunnelsUiState → TunnelList
Also included
fix: reduce network change grace period from 10 s to 3 s
HandshakeRestartHandler observes network transitions (WiFi ↔ LTE ↔ Ethernet) and wakes the restart loop early to avoid waiting the full ~3.5 min stale-handshake window. The previous grace period of 10 s was longer than necessary: 3 s is enough to distinguish a real network switch from a momentary drop, while still reacting quickly enough to restart the tunnel before the user notices the outage.
fix: show battery optimization warning in auto-restart screen
If Android battery optimization is active, the app process can be throttled or delayed in the background, preventing auto-restart from firing reliably (especially on devices running Android < 14 with aggressive OEM power management). A contextual warning banner is now shown at the top of the Auto-restart screen whenever battery optimization is not disabled, with a tap action that opens the system exemption prompt directly.
Test plan