archive: contain extraction writes within the destination root #31585
Jarred-Sumner wants to merge 1 commit into
Updated 2:14 PM PT - May 29th, 2026
✅ @Jarred-Sumner, your commit 4dcffd279c2136bb7f691329718f2b7d5dad10d3 passed in
🧪 To try this PR locally: bunx bun-pr 31585
That installs a local version of the PR into your bun-31585 --bun
This PR may be a duplicate of:
🤖 Generated with Claude Code
No actionable comments were generated in the recent review. 🎉
Recent review info
Run configuration: Path: .coderabbit.yaml; Review profile: ASSERTIVE; Plan: Pro
Files selected for processing (3)
Walkthrough
This PR adds safe parent-directory resolution to archive extraction on Unix, preventing traversal through symlinks when creating entries. A new
Changes
Archive path safety via parent-directory resolution
Possibly related PRs
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
No bugs found, but this is security-sensitive path-traversal hardening in the archive extractor with new fd-walking logic and a behavior change (silently skipping entries under symlinked dirs), and it overlaps with #31481 — worth a human look.
Extended reasoning...
Overview
This PR hardens Bun.Archive.extract() against symlink-based directory escape by adding open_entry_parent() in src/libarchive/lib.rs, which walks each parent path component via openat(O_RDONLY|O_DIRECTORY|O_NOFOLLOW|O_CLOEXEC) (creating missing components with mkdirat) and returns an EntryParent holding the resolved parent dirfd plus the basename offset. A new ExtractOptions.nofollow_parents flag gates it; Bun.Archive.extract() (both the fast path in Archiver::extract_to_disk and the glob-filtered path in extract_to_disk_filtered) enables it, while bun install/bun create/standalone extraction keep it off. Two new unix-only tests verify a pre-existing pivot -> outside symlink is not written through.
Security risks
This is the security-sensitive surface: archive extraction path containment. The change is a hardening, but correctness here is subtle — fd lifecycle (EntryParent::Drop closes the held fd; intermediate fds are closed in the walk loop), TOCTOU between mkdirat and the re-openat, and the interaction with the existing ENOENT/EACCES fallback that still calls dir.make_path_u8(dirname) against the root dir (lib.rs around the mkdirat/symlinkat/openat error arms). When nofollow_parents is on, open_entry_parent should have already materialized the parent so those fallbacks shouldn't fire, but the fallback path itself is not nofollow-guarded. There is also a user-visible behavior change: entries under a pre-existing symlinked directory are now silently skipped rather than written.
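The fd-lifecycle concern above can be illustrated with a small RAII sketch. This is not the PR's EntryParent: ParentGuard, its fields, and its methods are hypothetical names, and it relies on std::fs::File closing its descriptor on drop instead of a hand-written Drop impl. It also opens the parent by path, whereas the PR resolves it with openat relative to the destination dirfd.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for a "resolved parent" handle: an open directory
/// plus the offset of the entry's basename within the entry path.
struct ParentGuard {
    /// Open handle to the parent directory. `File` closes the descriptor
    /// when the guard is dropped, so every early return releases it.
    dir: File,
    /// Byte offset where the final path component starts.
    basename_offset: usize,
}

impl ParentGuard {
    /// Opens `parent_dir` and records where the basename of `entry` begins.
    /// Assumption: `parent_dir` has already been resolved safely elsewhere.
    fn open(parent_dir: &Path, entry: &str) -> io::Result<ParentGuard> {
        let basename_offset = entry.rfind('/').map_or(0, |i| i + 1);
        let dir = File::open(parent_dir)?;
        Ok(ParentGuard { dir, basename_offset })
    }

    /// Returns the final component of `entry`. The caller must pass the
    /// same entry string that was given to `open`.
    fn basename<'a>(&self, entry: &'a str) -> &'a str {
        &entry[self.basename_offset..]
    }

    /// Confirms the held handle really refers to a directory.
    fn is_directory(&self) -> io::Result<bool> {
        Ok(self.dir.metadata()?.is_dir())
    }
}
```

Tying the descriptor to a value's lifetime means a skipped entry or an error arm cannot leak it, which is the property the review asks a human to double-check in the real walk loop.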
Level of scrutiny
High. This is production extraction logic that handles untrusted archive input and governs where bytes land on disk. It introduces new per-entry fd open/close churn, a new Drop impl, and rewires which dirfd + name each mkdirat/symlinkat/openat call uses across two files. Mistakes here are either fd leaks or containment bypasses.
Other factors
- github-actions flagged this as overlapping with #31481, which appears to address the same vulnerability with the same per-component O_NOFOLLOW approach; a maintainer should decide which lands.
- The bug-hunting pass found nothing; tests are added and the full archive suite reportedly passes. Windows is intentionally untouched (cfg-gated).
- Given the security sensitivity, the behavior change, and the duplicate-PR question, I'm deferring rather than approving.
Jarred-Sumner left a comment
This is way too slow. Can you come up with a faster approach?
What
Bun.Archive.extract() now resolves each entry's parent directory component-by-component relative to the extraction root before creating the entry, instead of handing the full relative path to a single openat/mkdirat/symlinkat call.
- src/libarchive/lib.rs: new open_entry_parent helper (unix) that walks an entry's parent components against the destination dirfd with O_RDONLY|O_DIRECTORY|O_NOFOLLOW|O_CLOEXEC, creating missing components with mkdirat (0o755) and re-verifying them. Entries whose parent cannot be resolved to a real directory inside the destination are skipped (not counted), the same way the extractor already skips other unresolvable entries (overlong paths, unsafe symlink targets, created-symlink traversal). Enabled via a new ExtractOptions.nofollow_parents flag.
- src/runtime/api/Archive.rs: the flag is set for Bun.Archive.extract()'s fast path, and the glob-filtered path uses the same helper directly.
- bun install, bun create, and standalone-executable extraction keep the flag off and are byte-for-byte unchanged.
This matches what node-tar does when an entry's directory resolves to something other than a regular directory inside the destination: the entry is skipped, the rest of the archive still extracts, and the pre-existing directory entry (e.g. a symlink the user placed there) is left untouched.
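The component-by-component resolution described above can be sketched with the Rust standard library alone. This is a simplified approximation, not the PR's open_entry_parent: resolve_entry_parent is a hypothetical name, and it checks each component by path with symlink_metadata (lstat) instead of openat with O_NOFOLLOW against a dirfd, so it stays open to races that the fd-based walk avoids. It only shows the control flow: walk, create what is missing, re-verify, and skip on anything that is not a real directory.

```rust
use std::fs;
use std::io;
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper. Returns `Ok(Some(parent))` when every parent
/// component of `entry` is a real directory under `root` (creating missing
/// ones), and `Ok(None)` when the entry should be skipped.
fn resolve_entry_parent(root: &Path, entry: &str) -> io::Result<Option<PathBuf>> {
    let parent = match Path::new(entry).parent() {
        Some(p) => p,
        None => return Ok(None), // empty entry name: nothing to create
    };

    let mut current = root.to_path_buf();
    for component in parent.components() {
        let name = match component {
            Component::Normal(name) => name,
            Component::CurDir => continue,
            // `..`, a root, or a prefix could leave the destination: skip.
            _ => return Ok(None),
        };
        current.push(name);

        // lstat-style check: a symlink is reported as a symlink, never as
        // the directory it points to.
        match fs::symlink_metadata(&current) {
            Ok(meta) if meta.file_type().is_dir() => {}
            Ok(_) => return Ok(None), // symlink or other non-directory
            Err(e) if e.kind() == io::ErrorKind::NotFound => {
                fs::create_dir(&current)?;
                // Re-verify what now exists at this path.
                if !fs::symlink_metadata(&current)?.file_type().is_dir() {
                    return Ok(None);
                }
            }
            Err(e) => return Err(e),
        }
    }
    Ok(Some(current))
}
```

A caller would create the entry under the returned directory only when the result is Some, and leave the entry out of the extracted count otherwise.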
Behavior note
Archive entries whose ancestor directory inside the destination is a pre-existing symlink (or any non-directory) are no longer written through that path; they are skipped and not included in the returned count. Entries with normal directory ancestors extract exactly as before.
Verification
- test/js/bun/archive.test.ts ("path safety" describe): extracting an archive containing pivot/file.txt into a destination that already has pivot -> <outside dir> leaves the outside directory empty, keeps the symlink intact, and still extracts the sibling entries, for both the fast path and { glob: "**" }. Both fail on the previous behavior and pass with this change. Compared against node-tar 6.x, which skips such entries the same way.
- test/js/bun/archive.test.ts suite passes (103 pass / 0 fail).
- bun install smoke test (tarball extraction path with the flag off) unchanged.
- bun run rust:check-all: 10/10 targets pass.