Fix race condition in 04-persistent-data-volume.sh #4831
Merged
AkihiroSuda merged 1 commit into lima-vm:master on Apr 22, 2026
A race between growpart/e2fsck and udev device probing can delete the /dev/disk/by-label/data-volume symlink, causing mount failures and potential data loss on Alpine ramdisk VMs.

growpart triggers a kernel partition table re-read, which generates a udev re-probe for the data partition. When e2fsck runs concurrently, udevd's libblkid probe reads a partially modified ext4 superblock, fails with "incorrect ext4 checksum", and removes the symlink.

Core fixes:

1. Resolve the device via blkid output (which probes devices directly) instead of the udev symlink. This eliminates the dependency on udev state entirely. The sed pattern anchors LABEL= on whitespace so it does not match PARTLABEL= on util-linux blkid, and quits after the first match so a stray duplicate label never produces a newline-separated device list.

2. Add udevadm settle after growpart to prevent the concurrent probe from racing with e2fsck. Both settle calls (after growpart and after mkfs) tolerate failure with || true; the blkid resolution already removes the core dependency on udev state, so a missing or stuck udevadm must not crash the boot under set -e.

3. Fix the else-branch "disk in use" check to reject any disk that already has partitions or carries a filesystem signature, not just mounted devices in /proc/mounts. The old check could misidentify a partitioned-but-unmounted or raw-formatted data disk as unused and reformat it.

Derive DATA_DISK via lsblk --output pkname instead of stripping a single trailing digit, so the parent disk is resolved correctly for any partition naming scheme. Scope the /proc/mounts awk substitution to $1.

Ref: rancher-sandbox/rancher-desktop#10133

Signed-off-by: Jan Dubois <jan.dubois@suse.com>
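The label lookup and the parent-disk derivation described above can be illustrated with a small sketch. It is not the code from 04-persistent-data-volume.sh: the function names device_by_label and strip_one_digit are hypothetical, the sed script is only one way to get the described behavior (LABEL= anchored on whitespace, quit after the first match), and the commented usage lines assume a util-linux lsblk is available.

```shell
#!/bin/sh
# Illustrative sketch only; names are hypothetical and the sed script is
# not copied from 04-persistent-data-volume.sh.

# Print the first device whose filesystem LABEL equals $1.
# Reads blkid-style lines (/dev/xxx: KEY="value" ...) on stdin.
# Assumes the label contains no characters special to sed, which holds
# for "data-volume".
device_by_label() {
    sed -n "/[[:space:]]LABEL=\"$1\"/{
        s/:.*//
        p
        q
    }"
}

# The approach the commit replaces, as described: drop one trailing digit.
strip_one_digit() {
    printf '%s\n' "${1%[0-9]}"
}

# Intended use on a real system (assumed, not executed here):
#   DATA_DEV=$(blkid | device_by_label data-volume)
#   DATA_DISK=/dev/$(lsblk --noheadings --output pkname "$DATA_DEV")
#   udevadm settle || true   # tolerate a missing or stuck udevadm under set -e
```

Requiring whitespace before LABEL= means a line that only carries PARTLABEL="data-volume" is skipped, and the q command stops at the first matching device. The second function shows why digit stripping is fragile: strip_one_digit /dev/nvme0n1p1 prints /dev/nvme0n1p and strip_one_digit /dev/vda10 prints /dev/vda1, neither of which is the parent disk.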
Force-pushed from 17462f4 to 34a1f47.
Round 2 AI review: https://jandubois.github.io/lima/20260420-120300-pr-4831.html

I don't think any of the issues are worth addressing.
Closes: #4830