fix(lume): handle guest-initiated VM shutdown via VZVirtualMachineDelegate #1254

Open
NikkeTryHard wants to merge 2 commits into trycua:main from NikkeTryHard:fix/guest-shutdown-cleanup

NikkeTryHard commented Apr 1, 2026

Problem

When macOS is shut down from inside the guest (Apple menu > Shut Down), the Lume host process does not exit. lume ls continues to show the VM as running, and lume stop hangs because the while true { Task.sleep } loop in VM.run() never breaks. The only recovery is to kill the host process.

The root cause is that Lume never sets a VZVirtualMachineDelegate on its VZVirtualMachine instance, so the guestDidStop callback -- which Virtualization.framework fires when the guest OS powers off -- is never received.

Fix

Conform BaseVirtualizationService to VZVirtualMachineDelegate and replace the infinite sleep loops with waitForGuestStop(), a continuation-based suspend that returns when guestDidStop or didStopWithError fires. On return, the existing cleanup path runs (release file lock, stop VNC, nil out service).

A pendingGuestStop buffer handles the edge case where the delegate fires between service.start() and waitForGuestStop().
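
The continuation-plus-buffer idea described above can be sketched roughly as follows. This is a minimal illustration rather than the PR's actual code: GuestStopSignal, signalStop(error:), hasPendingStop and pendingError are hypothetical names, the sketch assumes a single waiter, and it leaves out the Virtualization.framework wiring. In the real service the signalling method would be called from guestDidStop and virtualMachine(_:didStopWithError:).

```swift
import Foundation

/// Hypothetical helper, not the PR's BaseVirtualizationService.
/// Assumes at most one caller waits at a time.
final class GuestStopSignal: @unchecked Sendable {
    private let lock = NSLock()
    private var waiter: CheckedContinuation<Error?, Never>?
    private var hasPendingStop = false   // stands in for the pendingGuestStop buffer
    private var pendingError: Error?

    /// Would be called from the delegate callbacks:
    /// nil for a normal guest shutdown, an error for an unexpected stop.
    func signalStop(error: Error?) {
        lock.lock()
        let resumed = waiter
        waiter = nil
        if resumed == nil {
            // Nobody is waiting yet: buffer the stop for the next waiter.
            hasPendingStop = true
            pendingError = error
        }
        lock.unlock()
        resumed?.resume(returning: error)
    }

    /// Suspends until a stop is signalled, or returns at once if one is buffered.
    func waitForGuestStop() async -> Error? {
        await withCheckedContinuation { (continuation: CheckedContinuation<Error?, Never>) in
            lock.lock()
            if hasPendingStop {
                let error = pendingError
                hasPendingStop = false
                pendingError = nil
                lock.unlock()
                continuation.resume(returning: error)
            } else {
                waiter = continuation
                lock.unlock()
            }
        }
    }
}
```

The buffer matters because a delegate callback can arrive before anyone has called waitForGuestStop(); without it, that early stop would be lost and the caller would suspend forever.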

Changes

  • VMVirtualizationService.swift -- add NSObject + VZVirtualMachineDelegate conformance to BaseVirtualizationService, set delegate in init, implement guestDidStop / didStopWithError, add waitForGuestStop() to protocol and implementation
  • VM.swift -- replace while true loops in run() and runWithUSBStorage() with waitForGuestStop() + cleanup on return
  • MockVMVirtualizationService.swift -- add waitForGuestStop() stub

Test plan

  • lume run <vm> -- inside guest, Apple menu > Shut Down -- host process exits cleanly, lume ls shows stopped
  • lume run <vm> -- lume stop <vm> from another terminal -- still works as before
  • lume serve -- run VM via API, guest shutdown -- VM cleaned up, server stays alive
  • swift build introduces no new errors (verified on macOS 15.7.5; all remaining failures are pre-existing and involve ARM-only symbols)

Closes #1184

Summary by CodeRabbit

  • Improvements
    • VM shutdown detection now uses event-driven approach instead of time-based polling for faster response times.
    • Enhanced error tracking and logging when VMs stop unexpectedly or shut down normally.
    • Improved explicit cleanup procedures during VM termination.

vercel bot commented Apr 1, 2026

@NikkeTryHard is attempting to deploy a commit to the Cua Team on Vercel.

A member of the Team first needs to authorize it.

coderabbitai bot commented Apr 1, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 359ab31a-935a-4036-95f1-f67bc4483a7d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This change replaces polling-based guest shutdown detection with an event-driven approach using delegate callbacks. The virtualization service now awaits guest termination via a new waitForGuestStop() async API, while the main VM run method uses this signal to perform coordinated cleanup including service teardown and file lock release.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Virtualization Service API (libs/lume/src/Virtualization/VMVirtualizationService.swift) | Added new async method waitForGuestStop() -> Error? to protocol and implementation. BaseVirtualizationService now adopts VZVirtualMachineDelegate, tracks pending guest-stop state with continuation-based coordination, and implements guestDidStop and virtualMachine(_:didStopWithError:) delegate callbacks to signal guest termination. |
| VM Lifecycle Management (libs/lume/src/VM/VM.swift) | Replaced indefinite polling loops in run() and runWithUSBStorage() with blocking await on waitForGuestStop(). On successful guest shutdown, performs explicit cleanup: stops/closes clipboardWatcher, nullifies virtualizationService, stops vncService, releases file lock via flock(..., LOCK_UN), and closes handle. Main run path additionally calls unlockConfigFile() after lock release. |
| Mock Implementation (libs/lume/tests/Mocks/MockVMVirtualizationService.swift) | Added stubbed waitForGuestStop() -> Error? method returning nil to maintain mock interface compatibility. |
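
The lock-release step in the cleanup summary can be illustrated with a small standalone sketch. It assumes an exclusive, non-blocking advisory lock taken with flock; the summary only confirms the LOCK_UN release, so the acquisition mode and the helper names tryLock and releaseLockAndClose are assumptions for illustration.

```swift
import Foundation

/// Hypothetical helper: tries to take an exclusive, non-blocking advisory lock.
/// Returns true when the lock was acquired.
func tryLock(_ handle: FileHandle) -> Bool {
    flock(handle.fileDescriptor, LOCK_EX | LOCK_NB) == 0
}

/// Hypothetical helper: releases the advisory lock, then closes the handle,
/// in the order the cleanup summary lists them.
func releaseLockAndClose(_ handle: FileHandle) {
    _ = flock(handle.fileDescriptor, LOCK_UN)
    try? handle.close()
}
```

flock locks belong to the open file description, so a second handle opened on the same path is refused until the first is unlocked or closed. That would explain why a host process that never reaches this cleanup keeps the VM looking busy.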

Sequence Diagram(s)

sequenceDiagram
    participant VM as VM.run()
    participant Service as VMVirtualizationService
    participant Delegate as VZVirtualMachine Delegate
    participant Cleanup as Cleanup Handler
    
    VM->>Service: waitForGuestStop()
    activate Service
    Note over Service: Awaits guest stop event
    
    Delegate->>Service: guestDidStop() or didStopWithError()
    Service->>Service: Resolve continuation with error status
    deactivate Service
    
    Service-->>VM: Returns Error? (nil or error)
    
    VM->>Cleanup: Perform cleanup on return
    Cleanup->>Cleanup: Stop clipboardWatcher
    Cleanup->>Cleanup: Stop vncService
    Cleanup->>Cleanup: Release file lock (flock LOCK_UN)
    Cleanup->>Cleanup: Close file handle
    Cleanup->>Cleanup: Unlock config file

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

release:lume

Suggested reviewers

  • ddupont808

Poem

🐰 A rabbit hops with glee,
Guest shutdowns now flow so free!
No more polls that never end,
Events whisper "friend, please send"—
VM cleanup, swift and clean,
Best delegation ever seen! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main fix: handling guest-initiated VM shutdown via VZVirtualMachineDelegate, which directly addresses the core change in the PR. |
| Linked Issues check | ✅ Passed | The PR successfully implements all coding requirements from issue #1184: adds VZVirtualMachineDelegate support to detect guest shutdowns, replaces polling loops with event-driven waitForGuestStop(), and maintains existing host-initiated stop behavior. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to fixing guest-initiated VM shutdown detection: delegate implementation, event-driven wait mechanism, mock updates, and cleanup path integration. No unrelated modifications present. |

NikkeTryHard force-pushed the fix/guest-shutdown-cleanup branch from 2b5f297 to 4646b5b on April 1, 2026 21:55
coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
libs/lume/tests/Mocks/MockVMVirtualizationService.swift (1)

66-68: Keep the mock stop waiter stateful.

Returning nil immediately makes VM.run() finish as soon as start() succeeds, and currentState never transitions back to .stopped. That makes the new lifecycle hard to exercise in tests and leaves no way to cover guest-error handling. A test-controlled continuation/result here would keep the mock aligned with production behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/lume/tests/Mocks/MockVMVirtualizationService.swift` around lines 66 -
68, The mock waitForGuestStop currently returns nil immediately which prevents
VM.run() from observing a guest stop and blocks tests from exercising lifecycle
and error paths; update MockVMVirtualizationService to make waitForGuestStop()
await a test-controlled continuation/result (e.g., store a
CheckedContinuation<Error?, Never> or similar in the mock) and provide helper
methods on the mock to resume that continuation with nil (normal stop) or an
Error (guest failure), and ensure start() and any stop helpers update
currentState appropriately so tests can drive the state transitions and cover
guest-error handling.
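
One way the test-controlled waiter suggested above might look is sketched here. It is an illustrative stand-in, not the repository's mock: ControllableStopMock, simulateGuestStop(error:) and the two-case State enum are hypothetical, and an actor is used only to keep the sketch short and data-race free.

```swift
/// Hypothetical test double; names and shape are assumptions for illustration.
actor ControllableStopMock {
    enum State { case stopped, running }

    private(set) var currentState: State = .stopped
    private var waiter: CheckedContinuation<Error?, Never>?
    private var hasBufferedStop = false
    private var bufferedError: Error?

    func start() {
        currentState = .running
    }

    /// Suspends until the test calls simulateGuestStop(error:).
    func waitForGuestStop() async -> Error? {
        if hasBufferedStop {
            // The test signalled the stop before anyone waited.
            let error = bufferedError
            hasBufferedStop = false
            bufferedError = nil
            return error
        }
        return await withCheckedContinuation { (continuation: CheckedContinuation<Error?, Never>) in
            waiter = continuation
        }
    }

    /// Test hook: pass nil for a normal shutdown or an error for a guest failure.
    func simulateGuestStop(error: Error? = nil) {
        currentState = .stopped
        if let pending = waiter {
            waiter = nil
            pending.resume(returning: error)
        } else {
            hasBufferedStop = true
            bufferedError = error
        }
    }
}
```

A test can start the mock, launch the code under test in a Task, call simulateGuestStop(error:) with or without an error, and then assert on both the returned value and currentState.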
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libs/lume/src/VM/VM.swift`:
- Around line 328-347: The code logs guestError from service.waitForGuestStop()
but swallows it by returning normally; capture the error when waitForGuestStop()
returns a non-nil guestError (in the block surrounding
service.waitForGuestStop() in VM.swift), perform the existing shared cleanup
(stop clipboardWatcher, nil out virtualizationService, stop vncService, release
lock/close fileHandle, call unlockConfigFile()), and then rethrow the captured
guestError so callers see the unexpected stop; apply the same pattern to the
other occurrence mentioned (the block around lines 1071–1084) so both places
log, clean up, and then throw the original error instead of returning
successfully.
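
The log, clean up, then rethrow pattern requested in the inline comment can be sketched as below. The function name waitThenCleanUp and its closure parameters are hypothetical stand-ins for service.waitForGuestStop() and the shared cleanup in VM.swift; only the control flow is the point.

```swift
/// Hypothetical helper mirroring the suggested control flow.
/// `waitForGuestStop` stands in for service.waitForGuestStop();
/// `cleanup` stands in for the shared teardown steps.
func waitThenCleanUp(
    waitForGuestStop: () async -> Error?,
    cleanup: () -> Void
) async throws {
    // Capture the outcome first so cleanup runs on both paths.
    let guestError = await waitForGuestStop()

    // Shared cleanup runs whether the guest stopped normally or with an error.
    cleanup()

    // Surface an unexpected stop to the caller instead of returning normally.
    if let error = guestError {
        throw error
    }
}
```

Running cleanup before the throw means resources are released even when the caller ends up in a catch block.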

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f094a50a-a4c0-47a5-a7e0-8836da4af801

📥 Commits

Reviewing files that changed from the base of the PR and between 8e6fa30 and 2b5f297.

📒 Files selected for processing (3)
  • libs/lume/src/VM/VM.swift
  • libs/lume/src/Virtualization/VMVirtualizationService.swift
  • libs/lume/tests/Mocks/MockVMVirtualizationService.swift

Development

Successfully merging this pull request may close these issues.

Guest-initiated shutdown leaves VM process running

1 participant