fix(server): guard web contract unpack against path traversal#4204
fix(server): guard web contract unpack against path traversal#4204c-tonneslan wants to merge 4 commits into
Conversation
WebApp::unpack called tar's unpack() straight on the archive, with no overwrite or path filtering. A contract author could publish an archive with a `../../etc/passwd` entry, or an absolute path, and escape the webapp_cache_dir() sandbox. Since freenet#3942 this is reachable from any subresource URL, not just top-level contract navigation. Unpack entries one at a time instead: reject any entry whose path is absolute or contains a `..` component, disable overwrite, and extract each remaining entry with unpack_in into the canonicalized destination. Closes freenet#3946 Signed-off-by: Charlie Tonneslan <cst0520@gmail.com>
|
Now I have all the information I need to complete the review. Rule Review: hardlink test coverage gapRules checked: git-workflow.md, code-style.md, testing.md Warnings
Info
Rule review against |
The traversal guard rejects both absolute paths and `..` components, but only the `..` branch had a test. Add unpack_rejects_absolute_path_entry so the absolute-path error path is exercised too. Signed-off-by: Charlie Tonneslan <cst0520@gmail.com>
|
Thanks for the review. Both addressed:
|
sanity
left a comment
There was a problem hiding this comment.
Comprehensive PR Review: #4204
Summary
- PR Title: fix(server): guard web contract unpack against path traversal
- Type: fix (security)
- CI Status: Most checks
action_required— first-time contributor; CLA and Snyk pass. CI has been approved on this review pass; full results pending. - Linked Issues: #3946
- Review tier: Full (security / state-authorization surface)
- Reviewers run: code-first, testing, skeptical, big-picture, plus Codex CLI external pass
- HEAD SHA reviewed:
e9a4c3ca4cfcc0a35b69e2aaed56aba98c3e783f
Code-First Analysis
Independent Understanding: WebApp::unpack is rewritten to iterate tar entries manually, canonicalize the destination, set set_overwrite(false) and set_preserve_mtime(false), and reject any entry whose entry.path() is absolute or contains Component::ParentDir. Three unit tests cover benign / .. / absolute paths.
Stated Intent: Close issue #3946 ("Tar unpack of web contract lacks path-traversal guards"), which lists four required protections: set_overwrite(false), absolute-path filter, .. filter, and a symlink/hardlink escape filter.
Alignment: Partial. The PR implements 3 of the 4 documented protections. The symlink/hardlink filter is silently omitted (not deferred in the PR description, just not done).
Gaps:
entry.link_name()is not inspected. A symlink entry with a benign archive path and a../../etc/passwdlinkname passes the new guard and is created verbatim bytar::Entry::unpack_ininside the cache dir.- Hardlinks are separately fine because the upstream
tarcrate already validates hardlink targets viavalidate_inside_dst(tar-0.4.46/src/entry.rs:541-547) — but the symlink path (entry.rs:560) writes the linkname unchecked.
Testing Assessment
Coverage Level: Adequate for the two branches actually implemented; misses the documented attack class.
| Test Type | Status | Notes |
|---|---|---|
| Unit | 3 tests, all pass (cargo test -p freenet --lib app_packaging::tests::). Covers .. and absolute. |
|
| Integration | ❌ | No integration test driving a malicious WebApp state through path_handlers::variable_content. |
| Simulation | N/A | |
| E2E | N/A |
Test helper soundness — verified. The helper writes the GNU header name field directly via header.as_gnu_mut().unwrap().name[..] and then set_cksum(), producing a valid GNU header. header.rs:1553 (truncate-at-NUL) and header.rs:1742 (bytes2path on Unix) confirm the absolute-path test entry decodes to Path::new("/etc/passwd") for which is_absolute() returns true. Both tests do exercise the intended guard branches.
Missing Tests (in priority order):
- Symlink with escaping
link_name— even if the guard rejected symlinks outright, this would pin the behavior. Right now anEntryType::Symlinkentry withlink_name = "../../etc/passwd"passes the guard. - Absolute-path test should also assert nothing leaked. The
..test correctly asserts!dir.path().join("escape.txt").exists()at line 239; the absolute-path test (3408-style) only asserts the error variant. Mirror the pattern using a writable absolute target (e.g.dir.path().parent().unwrap().join("escape.txt")). - Mixed-component path
foo/../bar— the check rejects it, but no test pins the behavior.
Skeptical Findings
Risk Level: High — confirmed exploitable end-to-end.
| Concern | Severity | Location | Details |
|---|---|---|---|
| Symlink linkname traversal bypasses the new guard | High | crates/core/src/server/app_packaging.rs:89-99 |
entry.path() is checked but entry.link_name() is not. tar::Entry::unpack_in creates the symlink with the verbatim target. See "Exploit walkthrough" below. |
variable_content serves through symlinks without canonicalizing |
High | crates/core/src/server/path_handlers.rs:350,361,378 |
base_path.join(relative_path) then tokio::fs::read_to_string (for .js) or tower_http::services::fs::ServeFile::new(&file_path). Both follow symlinks. No starts_with(canonical_base) check. |
| Threat model in PR description overstated | Informational | PR body | Tar already silently strips Component::RootDir (entry.rs:409) and skips Component::ParentDir (entry.rs:415), so the literal /etc/passwd and ../escape.txt entries described in the PR were converted-to-no-op, not direct escapes. The actual escape is the symlink case, which the PR does not cover. |
set_overwrite(false) is a silent behavior change for archives with duplicate paths |
Medium | app_packaging.rs:79 |
Previous Archive::unpack used the upstream default (overwrite). Duplicate entries within a single archive now hard-fail. The sole caller's remove_dir_all only covers cross-unpack state, not within-tarball duplicates. No test pins the new behavior. |
Manual entry loop drops Archive::unpack's deferred-directory-perms handling |
Medium | app_packaging.rs:81-100 vs. tar::Archive::unpack |
Tar normally applies directory entries last so restrictive dir permissions don't block descendant extraction. The new loop applies them in stream order. An archive with a mode 000 directory entry before its file children would now fail. Theoretical, but a real behavior change. |
| Partial-unpack state leak on rejection | Medium | app_packaging.rs:81-100 |
If a 100-entry archive's 101st entry is ..escape.txt, entries 1-100 stay written. The caller's remove_dir_all (path_handlers.rs:278) only runs on the next unpack. Not exploitable on its own (the *.hash file is written last so variable_content's freshness gate at :326 blocks serving), but worth a cleanup-on-error in the unpack itself. |
Path::is_absolute() is platform-dependent |
Low | app_packaging.rs:93 |
A C:\... entry would not be flagged on Unix. freenet is Unix-primary for this path, but worth a comment. |
Exploit walkthrough (the high-severity bug):
- Attacker publishes a web contract whose tarball contains a single symlink entry: archive path
app.js, linkname../../../etc/passwd. - Any user (or any page on freenet that embeds
<script src="/v1/contract/web/<key>/app.js">) triggersvariable_content→ensure_contract_cached→unpack_if_stale→WebApp::unpack. - The PR's guard inspects
entry.path() = "app.js". Not absolute, no..component → passes. entry.unpack_in(&dst)creates<cache>/<key>/app.js -> ../../../etc/passwd.variable_contentresolvesfile_path = base_path.join("app.js"), sees the.jsextension, callstokio::fs::read_to_string(&file_path)(line 361), which follows the symlink and returns/etc/passwdcontents to the browser.
The codebase already understands this attack pattern: sandbox_content_body (lines 545-559) does the right canonicalize+starts_with check with a TOCTOU-resistant File::open(&canonical_file) (line 562), and sandbox_content_rejects_symlink_escape (line 3408) is its regression test — with a comment that literally says: "The canonicalize + starts_with check must catch this even though the component-level ParentDir check would not." That comment applies verbatim to this PR's situation. variable_content is the unprotected sibling.
Big Picture Assessment
Goal Alignment: Partial — fixes 3 of 4 documented protections; the omitted one is the only literal escape vector still reachable through the serving path.
Anti-Patterns Detected: None in the diff. The header-byte-injection technique in the test helper is exactly right.
Removed Code Concerns: None — purely additive (86 / 5).
Scope Assessment: Focused.
Sibling code surveyed:
crates/core/src/bin/commands/update.rs:953already has avalidate_extract_pathhelper using canonicalize +starts_with. Different call path (binary self-update); already correctly hardened. Useful as a precedent for the style this PR should also adopt.crates/fdev/src/commands.rs:140andapp_packaging::decode_web()debug-log at:128are read-only; no risk.WebApp::unpackispub use'd inserver.rs:37and reachable as a library API for downstream embedders. New precondition (dstmust exist beforecanonicalize) is satisfied internally but worth a doc comment.
Documentation
- Code docs: incomplete — new guard has a single inline comment; the security-relevant intent (and the deliberately not-yet-addressed symlink case) is undocumented.
- PR description: claims the PR matches the issue's fix sketch but does not call out the symlink/hardlink bullet that is omitted.
- Architecture / threat-model docs: none exist; the issue itself was filed as the de facto record. No external docs to update.
Recommendations
Must Fix (Blocking)
- Close the symlink linkname bypass. Two acceptable approaches:
- (Preferred, smaller) Reject
EntryType::SymlinkandEntryType::Linkoutright inWebApp::unpack. Web contract archives have no legitimate use for these entry types.let entry_type = entry.header().entry_type(); if entry_type.is_symlink() || entry_type.is_hard_link() { return Err(WebContractError::UnpackingError(anyhow::anyhow!( "archive contains a {entry_type:?} entry; not supported for web contracts: {path:?}" ))); }
- (Alternative) Also harden
variable_contentto canonicalize+starts_with(canonical_base)matchingsandbox_content_body:545-562. This is the more general fix and is independently valuable, but is larger surface and adds TOCTOU concerns the unpack-side rejection avoids entirely. If you do this, please ALSO reject symlink entries — defense in depth.
- (Preferred, smaller) Reject
- Add a symlink regression test. Inject a tar entry with
EntryType::Symlinkand linkname../../etc/passwd; assert unpack rejects it and no symlink exists in or abovedst.
Should Fix (Important)
- Pin the duplicate-entry /
set_overwrite(false)behavior with a test OR keep upstream's overwrite default. Right now it's a silent behavior change with no coverage. If a real fdev / build pipeline ever emits duplicate entries, this fails legitimate webapps. - Mirror the "nothing leaked" assertion in
unpack_rejects_absolute_path_entry— use a writable absolute target and assert no file written. Today it only checks the error variant, so a future regression to a warn-only check could still pass on CI. - Clean up partial state on unpack error, OR document that the next
unpack_if_stalecall is responsible. The hash-file pattern atpath_handlers.rs:288-294shows the team already cares about consistency here. - Update the PR description to make explicit what is and isn't covered (drop the implication that absolute /
..paths were directly exploitable throughArchive::unpack— they weren't; tar already neutered them silently. The PR's contribution is converting silent skips into hard errors plus … well, it would also be closing the symlink case if recommendation 1 lands).
Consider (Suggestions)
- Add a comment at
app_packaging.rs:79linkingset_overwrite(false)to the caller's pre-unpackremove_dir_all. Prevents a future refactor ofpath_handlers.rs:278from silently introducing a regression. - Cap the helper's name length with
debug_assert!(name.len() <= 100)to avoid a future test panicking on slice OOB. - File a follow-up to consider cap-std-based sandboxing of contract unpacks (the tar crate's own docs recommend this for fully-untrusted archives).
What's Good
- Single-file, well-scoped diff. No anti-patterns. No removed tests.
- Test helper construction (writing the header name slot directly then re-cksumming) is exactly the right technique for the threat model.
cargo test -p freenet --lib app_packaging::tests::passes locally in <1s. No flake risk.- Implementation closely follows the issue's fix sketch for the parts it covers.
- First-time-contributor work shows good craftsmanship — the only structural critique is omission, not error.
Verdict
State: Needs Changes — Re-review Required After Fix
The PR is a genuine improvement and the existing code is sound, but it is incomplete: issue #3946 explicitly named the symlink/hardlink escape filter as a required defense, and that defense is the one case that is actually exploitable end-to-end through variable_content. Per ~/.claude/rules/finish-the-fix.md, this should be addressed in the same PR — the user-visible symptom ("malicious archive escapes the sandbox") is only closed when the symlink case is closed too.
The fix is mechanically tiny (~6 lines + a test). Recommend the author:
- Add symlink/hardlink entry-type rejection in
WebApp::unpack. - Add the symlink regression test.
- Optionally also harden
variable_contentto mirrorsandbox_content_body's canonicalize+starts_with check (separate concern, possibly separate PR).
After the fix lands, a re-review on the new HEAD will confirm the gap is closed.
[AI-assisted - Claude]
Per @sanity's review of freenet#4204, the path-traversal guards I added only look at entry.path() and not at entry.link_name(). A symlink entry with a benign archive path (e.g. "app.js") but an escaping linkname (e.g. "../../../etc/passwd") passes the absolute/parent-dir check and is then created verbatim by tar::Entry::unpack_in. variable_content's read_to_string then follows the symlink and serves the linked file to the browser — full path-traversal exploit through the serving path. Reject EntryType::Symlink and EntryType::HardLink outright in WebApp::unpack. Web contracts have no legitimate use for either, so the narrowest fix is to refuse them rather than try to validate linknames. Hardlinks were already validated by the upstream tar crate, but rejecting both keeps the policy uniform and the reasoning local. Added unpack_rejects_symlink_entry regression test that injects a symlink with linkname ../escape.txt and asserts (1) the unpack returns an UnpackingError, (2) the symlink was never created inside dst, and (3) the linkname's target above dst doesn't exist. The test helper writes the GNU header linkname slot directly, same technique the existing tests use for the path slot, since Builder::append_link validates targets. Signed-off-by: Charlie Tonneslan <cst0520@gmail.com>
|
Thanks for the thorough review. The symlink case is a real bypass and you're right it's the one that actually escapes through What changed:
Confirmed locally: On the rest of your "should fix" list — happy to address in this PR if you want, but flagging that each is a separable concern:
On the PR description: you're right that the original framing overstated the literal-escape case for absolute/ Let me know if you'd like any of the should-fix items folded in before re-review. |
Fixes #3946.
Problem
WebApp::unpackcallstar::Archive::unpack(dst)directly on attacker-controlled archive bytes. The dangerous case is symlink entries: an entry with a benign archive path (e.g."app.js") but an escaping linkname (e.g."../../../etc/passwd") passes any path-component check onentry.path()and gets created verbatim byunpack_in.path_handlers::variable_contentthen resolves the served path through the symlink and serves the linked file to the browser — a full sandbox escape over HTTP.Absolute and
..-bearing archive paths are not directly exploitable throughArchive::unpack(tar silently stripsRootDirand skipsParentDir), but converting those silent skips into hard errors is still the right policy: a contract that ships such an entry is broken or malicious, and surfacing it loudly is cheaper than letting it succeed-but-half.Solution
Rewrite
WebApp::unpackto iterate entries manually and apply four guards before eachunpack_in:set_overwrite(false)so duplicate paths within a single archive can't clobber earlier entriesset_preserve_mtime(false)(defense in depth against header-driven timestamp games)EntryType::SymlinkandEntryType::HardLinkentries outright — web contracts have no legitimate use for either, and refusing them is narrower than trying to validate linknames against the destinationentry.path()is absolute or containsComponent::ParentDirThe symlink rejection is the load-bearing one for actually closing the exploit; the others harden silent classes of malformed archives.
Testing
Three regression tests in
crates/core/src/server/app_packaging.rs::tests:unpack_writes_a_benign_archive— the happy path is preservedunpack_rejects_parent_dir_traversal—..-entry is rejected and nothing leaks todst.parent()unpack_rejects_absolute_path_entry—/etc/passwd-entry is rejectedunpack_rejects_symlink_entry— symlink with archive path"app.js"andlinkname = "../escape.txt"is rejected, the symlink is not created insidedst, and nothing leaks abovedstThe test helper writes GNU header fields directly (name slot, and for the symlink test also the linkname slot) and re-cksums, because
Builder::append_data/Builder::append_linkreject the malicious shapes the guards are defending against.cargo test -p freenet --lib app_packaging::tests::— all four pass.cargo fmt && cargo clippy -p freenet --lib -- -D warnings— clean.Fixes
Closes #3946.