fix(server): guard web contract unpack against path traversal#4204

Open
c-tonneslan wants to merge 4 commits into
freenet:mainfrom
c-tonneslan:fix/web-contract-path-traversal

Conversation

@c-tonneslan

@c-tonneslan c-tonneslan commented May 20, 2026

Fixes #3946.

Problem

WebApp::unpack calls tar::Archive::unpack(dst) directly on attacker-controlled archive bytes. The dangerous case is symlink entries: an entry with a benign archive path (e.g. "app.js") but an escaping linkname (e.g. "../../../etc/passwd") passes any path-component check on entry.path() and gets created verbatim by unpack_in. path_handlers::variable_content then resolves the served path through the symlink and serves the linked file to the browser — a full sandbox escape over HTTP.

Absolute and ..-bearing archive paths are not directly exploitable through Archive::unpack (tar silently strips RootDir and skips ParentDir), but converting those silent skips into hard errors is still the right policy: a contract that ships such an entry is broken or malicious, and surfacing it loudly is cheaper than letting the unpack half-succeed.
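For readers who have not looked at the lenient behavior described above, here is a minimal std-only model of it. The function name `lenient_target` is hypothetical and this is not the tar crate's code; it only mirrors the policy as the description and the later review state it (root components dropped, any `..` turns the entry into a skip).

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical model of the lenient policy: root and `.` components are
/// dropped, and any `..` makes the whole entry a silent skip (None).
fn lenient_target(dst: &Path, entry_path: &Path) -> Option<PathBuf> {
    let mut out = dst.to_path_buf();
    for part in entry_path.components() {
        match part {
            // A leading `/` (or a Windows prefix) is dropped without error.
            Component::Prefix(_) | Component::RootDir | Component::CurDir => continue,
            // A `..` anywhere means the entry is skipped, not rejected.
            Component::ParentDir => return None,
            Component::Normal(name) => out.push(name),
        }
    }
    Some(out)
}
```

Under this model `/etc/passwd` lands at `<dst>/etc/passwd` and `../escape.txt` is never written, which is why neither shape escapes on its own but both go unreported.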

Solution

Rewrite WebApp::unpack to iterate entries manually and apply four guards before each unpack_in:

  • set_overwrite(false) so duplicate paths within a single archive can't clobber earlier entries
  • set_preserve_mtime(false) (defense in depth against header-driven timestamp games)
  • Reject EntryType::Symlink and EntryType::Link (hard link) entries outright: web contracts have no legitimate use for either, and refusing them is narrower than trying to validate linknames against the destination
  • Reject entries whose entry.path() is absolute or contains Component::ParentDir

The symlink rejection is the load-bearing one for actually closing the exploit; the others harden silent classes of malformed archives.
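The per-entry decision can be sketched without the tar crate. The `EntryKind` enum and `check_entry` function below are hypothetical stand-ins for the entry type and the guard in WebApp::unpack, not the PR's actual code; the overwrite and mtime settings are left out because they are archive options rather than per-entry checks.

```rust
use std::path::{Component, Path};

/// Hypothetical stand-in for the archive library's entry type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum EntryKind {
    Regular,
    Directory,
    Symlink,
    HardLink,
}

/// Returns Err with a human-readable reason if the entry must be refused.
fn check_entry(kind: EntryKind, path: &Path) -> Result<(), String> {
    // Link entries are refused outright; their linkname is never inspected.
    if matches!(kind, EntryKind::Symlink | EntryKind::HardLink) {
        return Err(format!("{kind:?} entry not supported: {path:?}"));
    }
    // Absolute paths and any `..` component are hard errors, not silent skips.
    let escapes = path.is_absolute()
        || path.components().any(|c| matches!(c, Component::ParentDir));
    if escapes {
        return Err(format!("entry path escapes destination: {path:?}"));
    }
    Ok(())
}
```

Checking the entry type first means a symlink named "app.js" is refused before its benign-looking path is ever considered.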

Testing

Four regression tests in crates/core/src/server/app_packaging.rs::tests:

  • unpack_writes_a_benign_archive: the happy path is preserved
  • unpack_rejects_parent_dir_traversal: a ..-entry is rejected and nothing leaks to dst.parent()
  • unpack_rejects_absolute_path_entry: a /etc/passwd entry is rejected
  • unpack_rejects_symlink_entry: a symlink with archive path "app.js" and linkname = "../escape.txt" is rejected, the symlink is not created inside dst, and nothing leaks above dst

The test helper writes GNU header fields directly (name slot, and for the symlink test also the linkname slot) and re-cksums, because Builder::append_data / Builder::append_link reject the malicious shapes the guards are defending against.
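As a rough illustration of what "write the slots directly and re-cksum" involves, the sketch below builds one 512-byte header by hand using only std. The function names are hypothetical, the field offsets are those of the classic ustar/GNU layout, and this is not the PR's helper (which goes through the tar crate's header type); it has not been checked against the tar crate's parser.

```rust
/// Builds one 512-byte GNU-style tar header with the name and linkname
/// slots written verbatim, then recomputes the checksum.
fn raw_header(name: &[u8], typeflag: u8, linkname: &[u8]) -> [u8; 512] {
    assert!(name.len() <= 100 && linkname.len() <= 100);
    let mut h = [0u8; 512];
    h[..name.len()].copy_from_slice(name); // name slot: bytes 0..100
    h[100..108].copy_from_slice(b"0000644\0"); // mode
    h[124..136].copy_from_slice(b"00000000000\0"); // size: no data blocks
    h[156] = typeflag; // b'0' regular, b'1' hard link, b'2' symlink
    h[157..157 + linkname.len()].copy_from_slice(linkname); // linkname slot: 157..257
    h[257..265].copy_from_slice(b"ustar  \0"); // GNU magic + version
    let sum = header_checksum(&h);
    // Stored as six octal digits, a NUL, then a space.
    h[148..154].copy_from_slice(format!("{sum:06o}").as_bytes());
    h[154] = 0;
    h[155] = b' ';
    h
}

/// Sum of all header bytes with the checksum field (148..156) read as spaces.
fn header_checksum(h: &[u8; 512]) -> u32 {
    h.iter()
        .enumerate()
        .map(|(i, &b)| if (148..156).contains(&i) { 32 } else { u32::from(b) })
        .sum()
}
```

Because nothing validates `name` or `linkname` here, a call such as `raw_header(b"app.js", b'2', b"../escape.txt")` produces exactly the kind of entry a builder API would refuse to emit.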

cargo test -p freenet --lib app_packaging::tests:: — all four pass.
cargo fmt && cargo clippy -p freenet --lib -- -D warnings — clean.

Fixes

Closes #3946.

WebApp::unpack called tar's unpack() straight on the archive, with no
overwrite or path filtering. A contract author could publish an archive
with a `../../etc/passwd` entry, or an absolute path, and escape the
webapp_cache_dir() sandbox. Since freenet#3942 this is reachable from any
subresource URL, not just top-level contract navigation.

Unpack entries one at a time instead: reject any entry whose path is
absolute or contains a `..` component, disable overwrite, and extract
each remaining entry with unpack_in into the canonicalized destination.

Closes freenet#3946

Signed-off-by: Charlie Tonneslan <cst0520@gmail.com>
@CLAassistant

CLAassistant commented May 20, 2026

CLA assistant check
All committers have signed the CLA.

@github-actions

github-actions Bot commented May 20, 2026

Contributor


Rule Review: hardlink test coverage gap

Rules checked: git-workflow.md, code-style.md, testing.md
Files reviewed: 1 (crates/core/src/server/app_packaging.rs)

Warnings

  • crates/core/src/server/app_packaging.rs:93 — The production guard rejects both is_symlink() and is_hard_link(), and the commit title (fix(server): also reject symlink/hardlink entries in WebApp unpack) explicitly names both types. The test suite has unpack_rejects_symlink_entry but no corresponding unpack_rejects_hardlink_entry. If || entry_type.is_hard_link() were accidentally dropped from the condition, all four tests would still pass. Per testing.md: fix: PRs must cover each named bug; per the review instructions, an untested distinct error path in the same if arm is a WARNING. A minimal hardlink test (analogous to web_app_with_symlink but with tar::EntryType::Link and a linkname) is needed. (rule: testing.md — bug-fix regression coverage)

Info

  • .pr-review-tmp/commits.txt:3 — The oldest commit subject server: guard web contract unpack against path traversal uses server: as a conventional-commit type, which is not one of the valid types (feat, fix, docs, refactor, test, build). The PR title fix(server): is correct; only this individual commit message is non-conforming. (rule: git-workflow.md — conventional commit format)

  • crates/core/src/server/app_packaging.rs:254–262: unpack_rejects_absolute_path_entry asserts the error type but does not assert that dir.path() remains empty after the failed unpack (contrast with unpack_rejects_parent_dir_traversal, which checks !dir.path().join("escape.txt").exists()). Since the absolute-path check fires before entry.unpack_in(), the directory should stay empty; adding assert!(dir.path().read_dir().unwrap().next().is_none()) would make the invariant explicit. (rule: testing.md, test boundary conditions)


Rule review against .claude/rules/. WARNING findings block merge. ⚠️ 1 warning(s) — fix or add review-override label

The traversal guard rejects both absolute paths and `..` components,
but only the `..` branch had a test. Add unpack_rejects_absolute_path_entry
so the absolute-path error path is exercised too.

Signed-off-by: Charlie Tonneslan <cst0520@gmail.com>
@c-tonneslan c-tonneslan changed the title from "server: guard web contract unpack against path traversal" to "fix(server): guard web contract unpack against path traversal" May 20, 2026
@c-tonneslan

Author

Thanks for the review. Both addressed:

  • Added unpack_rejects_absolute_path_entry so the absolute-path branch of the guard is tested alongside the .. one. Same web_app helper, which writes the header name field directly so the absolute path gets past tar::Builder.
  • Retitled the PR to fix(server): ... so the squash-merge subject is a valid conventional-commit type.

@sanity sanity left a comment

Collaborator

Comprehensive PR Review: #4204

Summary

  • PR Title: fix(server): guard web contract unpack against path traversal
  • Type: fix (security)
  • CI Status: Most checks action_required — first-time contributor; CLA and Snyk pass. CI has been approved on this review pass; full results pending.
  • Linked Issues: #3946
  • Review tier: Full (security / state-authorization surface)
  • Reviewers run: code-first, testing, skeptical, big-picture, plus Codex CLI external pass
  • HEAD SHA reviewed: e9a4c3ca4cfcc0a35b69e2aaed56aba98c3e783f

Code-First Analysis

Independent Understanding: WebApp::unpack is rewritten to iterate tar entries manually, canonicalize the destination, set set_overwrite(false) and set_preserve_mtime(false), and reject any entry whose entry.path() is absolute or contains Component::ParentDir. Three unit tests cover benign / .. / absolute paths.

Stated Intent: Close issue #3946 ("Tar unpack of web contract lacks path-traversal guards"), which lists four required protections: set_overwrite(false), absolute-path filter, .. filter, and a symlink/hardlink escape filter.

Alignment: Partial. The PR implements 3 of the 4 documented protections. The symlink/hardlink filter is silently omitted (not deferred in the PR description, just not done).

Gaps:

  • entry.link_name() is not inspected. A symlink entry with a benign archive path and a ../../etc/passwd linkname passes the new guard and is created verbatim by tar::Entry::unpack_in inside the cache dir.
  • Hardlinks are separately fine because the upstream tar crate already validates hardlink targets via validate_inside_dst (tar-0.4.46/src/entry.rs:541-547) — but the symlink path (entry.rs:560) writes the linkname unchecked.

Testing Assessment

Coverage Level: Adequate for the two branches actually implemented; misses the documented attack class.

| Test Type | Status | Notes |
| --- | --- | --- |
| Unit | ⚠️ | 3 tests, all pass (cargo test -p freenet --lib app_packaging::tests::). Covers .. and absolute. |
| Integration | | No integration test driving a malicious WebApp state through path_handlers::variable_content. |
| Simulation | N/A | |
| E2E | N/A | |

Test helper soundness — verified. The helper writes the GNU header name field directly via header.as_gnu_mut().unwrap().name[..] and then set_cksum(), producing a valid GNU header. header.rs:1553 (truncate-at-NUL) and header.rs:1742 (bytes2path on Unix) confirm the absolute-path test entry decodes to Path::new("/etc/passwd") for which is_absolute() returns true. Both tests do exercise the intended guard branches.

Missing Tests (in priority order):

  1. Symlink with escaping link_name — even if the guard rejected symlinks outright, this would pin the behavior. Right now an EntryType::Symlink entry with link_name = "../../etc/passwd" passes the guard.
  2. Absolute-path test should also assert nothing leaked. The .. test correctly asserts !dir.path().join("escape.txt").exists() at line 239; the absolute-path test (3408-style) only asserts the error variant. Mirror the pattern using a writable absolute target (e.g. dir.path().parent().unwrap().join("escape.txt")).
  3. Mixed-component path foo/../bar — the check rejects it, but no test pins the behavior.

Skeptical Findings

Risk Level: High — confirmed exploitable end-to-end.

| Concern | Severity | Location | Details |
| --- | --- | --- | --- |
| Symlink linkname traversal bypasses the new guard | High | crates/core/src/server/app_packaging.rs:89-99 | entry.path() is checked but entry.link_name() is not. tar::Entry::unpack_in creates the symlink with the verbatim target. See "Exploit walkthrough" below. |
| variable_content serves through symlinks without canonicalizing | High | crates/core/src/server/path_handlers.rs:350,361,378 | base_path.join(relative_path) then tokio::fs::read_to_string (for .js) or tower_http::services::fs::ServeFile::new(&file_path). Both follow symlinks. No starts_with(canonical_base) check. |
| Threat model in PR description overstated | Informational | PR body | Tar already silently strips Component::RootDir (entry.rs:409) and skips Component::ParentDir (entry.rs:415), so the literal /etc/passwd and ../escape.txt entries described in the PR were converted to no-ops, not direct escapes. The actual escape is the symlink case, which the PR does not cover. |
| set_overwrite(false) is a silent behavior change for archives with duplicate paths | Medium | app_packaging.rs:79 | Previous Archive::unpack used the upstream default (overwrite). Duplicate entries within a single archive now hard-fail. The sole caller's remove_dir_all only covers cross-unpack state, not within-tarball duplicates. No test pins the new behavior. |
| Manual entry loop drops Archive::unpack's deferred-directory-perms handling | Medium | app_packaging.rs:81-100 vs. tar::Archive::unpack | Tar normally applies directory entries last so restrictive dir permissions don't block descendant extraction. The new loop applies them in stream order. An archive with a mode 000 directory entry before its file children would now fail. Theoretical, but a real behavior change. |
| Partial-unpack state leak on rejection | Medium | app_packaging.rs:81-100 | If the 101st entry of an archive is ../escape.txt, entries 1-100 stay written. The caller's remove_dir_all (path_handlers.rs:278) only runs on the next unpack. Not exploitable on its own (the *.hash file is written last so variable_content's freshness gate at :326 blocks serving), but worth a cleanup-on-error in the unpack itself. |
| Path::is_absolute() is platform-dependent | Low | app_packaging.rs:93 | A C:\... entry would not be flagged on Unix. freenet is Unix-primary for this path, but worth a comment. |
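On the last row: if a platform-independent check were ever wanted, one option is to test the raw entry name against both conventions instead of relying on Path::is_absolute(). The sketch below is an illustration only; `looks_absolute_anywhere` is a hypothetical name and nothing in the PR does this.

```rust
/// Returns true if `raw` looks absolute under Unix *or* Windows conventions,
/// regardless of the platform the check runs on.
fn looks_absolute_anywhere(raw: &str) -> bool {
    let b = raw.as_bytes();
    // Unix root, or Windows root-relative / UNC (`\foo`, `\\server\share`).
    if matches!(b.first(), Some(b'/') | Some(b'\\')) {
        return true;
    }
    // Windows drive prefix such as `C:` or `c:\dir`.
    b.len() >= 2 && b[0].is_ascii_alphabetic() && b[1] == b':'
}
```

The trade-off is that a Unix file legitimately named like `a:b` would also be refused, which is probably acceptable for web contract archives but is a policy decision rather than a given.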

Exploit walkthrough (the high-severity bug):

  1. Attacker publishes a web contract whose tarball contains a single symlink entry: archive path app.js, linkname ../../../etc/passwd.
  2. Any user (or any page on freenet that embeds <script src="/v1/contract/web/<key>/app.js">) triggers variable_content → ensure_contract_cached → unpack_if_stale → WebApp::unpack.
  3. The PR's guard inspects entry.path() = "app.js". Not absolute, no .. component → passes.
  4. entry.unpack_in(&dst) creates <cache>/<key>/app.js -> ../../../etc/passwd.
  5. variable_content resolves file_path = base_path.join("app.js"), sees the .js extension, calls tokio::fs::read_to_string(&file_path) (line 361), which follows the symlink and returns /etc/passwd contents to the browser.

The codebase already understands this attack pattern: sandbox_content_body (lines 545-559) does the right canonicalize+starts_with check with a TOCTOU-resistant File::open(&canonical_file) (line 562), and sandbox_content_rejects_symlink_escape (line 3408) is its regression test — with a comment that literally says: "The canonicalize + starts_with check must catch this even though the component-level ParentDir check would not." That comment applies verbatim to this PR's situation. variable_content is the unprotected sibling.
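The canonicalize + starts_with pattern referred to here can be sketched with std alone. This is not the code from sandbox_content_body; `resolve_inside` and `open_inside` are hypothetical names, and the synchronous std::fs calls stand in for whatever async I/O the real handler uses.

```rust
use std::fs::File;
use std::io;
use std::path::{Path, PathBuf};

/// Resolves `relative` under `base` and refuses anything that ends up
/// outside `base` once symlinks and `..` have been resolved.
fn resolve_inside(base: &Path, relative: &Path) -> io::Result<PathBuf> {
    let canonical_base = base.canonicalize()?;
    // canonicalize() follows symlinks, so a link pointing out of the
    // cache directory yields a path that fails the prefix check below.
    let canonical_file = base.join(relative).canonicalize()?;
    if !canonical_file.starts_with(&canonical_base) {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!("{relative:?} resolves outside the cache directory"),
        ));
    }
    Ok(canonical_file)
}

/// Opens the already-canonicalized path rather than the original one,
/// which narrows the window between check and use.
fn open_inside(base: &Path, relative: &Path) -> io::Result<File> {
    File::open(resolve_inside(base, relative)?)
}
```

A component-level check on `app.js` sees nothing wrong with the request path; only resolving the path on disk reveals where a symlink actually points, which is the point the quoted test comment makes.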


Big Picture Assessment

Goal Alignment: Partial — fixes 3 of 4 documented protections; the omitted one is the only literal escape vector still reachable through the serving path.

Anti-Patterns Detected: None in the diff. The header-byte-injection technique in the test helper is exactly right.

Removed Code Concerns: None — purely additive (+86 / -5).

Scope Assessment: Focused.

Sibling code surveyed:

  • crates/core/src/bin/commands/update.rs:953 already has a validate_extract_path helper using canonicalize + starts_with. Different call path (binary self-update); already correctly hardened. Useful as a precedent for the style this PR should also adopt.
  • crates/fdev/src/commands.rs:140 and app_packaging::decode_web() debug-log at :128 are read-only; no risk.
  • WebApp::unpack is pub use'd in server.rs:37 and reachable as a library API for downstream embedders. New precondition (dst must exist before canonicalize) is satisfied internally but worth a doc comment.

Documentation

  • Code docs: incomplete — new guard has a single inline comment; the security-relevant intent (and the deliberately not-yet-addressed symlink case) is undocumented.
  • PR description: claims the PR matches the issue's fix sketch but does not call out the symlink/hardlink bullet that is omitted.
  • Architecture / threat-model docs: none exist; the issue itself was filed as the de facto record. No external docs to update.

Recommendations

Must Fix (Blocking)

  1. Close the symlink linkname bypass. Two acceptable approaches:
    • (Preferred, smaller) Reject EntryType::Symlink and EntryType::Link outright in WebApp::unpack. Web contract archives have no legitimate use for these entry types.
      let entry_type = entry.header().entry_type();
      if entry_type.is_symlink() || entry_type.is_hard_link() {
          return Err(WebContractError::UnpackingError(anyhow::anyhow!(
              "archive contains a {entry_type:?} entry; not supported for web contracts: {path:?}"
          )));
      }
    • (Alternative) Also harden variable_content to canonicalize+starts_with(canonical_base) matching sandbox_content_body:545-562. This is the more general fix and is independently valuable, but is larger surface and adds TOCTOU concerns the unpack-side rejection avoids entirely. If you do this, please ALSO reject symlink entries — defense in depth.
  2. Add a symlink regression test. Inject a tar entry with EntryType::Symlink and linkname ../../etc/passwd; assert unpack rejects it and no symlink exists in or above dst.

Should Fix (Important)

  1. Pin the duplicate-entry / set_overwrite(false) behavior with a test OR keep upstream's overwrite default. Right now it's a silent behavior change with no coverage. If a real fdev / build pipeline ever emits duplicate entries, this fails legitimate webapps.
  2. Mirror the "nothing leaked" assertion in unpack_rejects_absolute_path_entry — use a writable absolute target and assert no file written. Today it only checks the error variant, so a future regression to a warn-only check could still pass on CI.
  3. Clean up partial state on unpack error, OR document that the next unpack_if_stale call is responsible. The hash-file pattern at path_handlers.rs:288-294 shows the team already cares about consistency here.
  4. Update the PR description to make explicit what is and isn't covered (drop the implication that absolute / .. paths were directly exploitable through Archive::unpack — they weren't; tar already neutered them silently. The PR's contribution is converting silent skips into hard errors plus … well, it would also be closing the symlink case if recommendation 1 lands).
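For item 3, a cleanup-on-error wrapper could look roughly like the following. It is a sketch under assumptions: `unpack_or_clean` is a hypothetical name, the closure stands in for the real entry loop, and it assumes `dst` is dedicated to this one contract (the review notes the caller already clears it before unpacking).

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Runs `unpack` against `dst`; if it fails, removes whatever was written
/// so a rejected archive leaves no partial state behind.
fn unpack_or_clean<F>(dst: &Path, unpack: F) -> io::Result<()>
where
    F: FnOnce(&Path) -> io::Result<()>,
{
    fs::create_dir_all(dst)?;
    let result = unpack(dst);
    if result.is_err() {
        // Best effort: the original unpack error is the one worth reporting.
        let _ = fs::remove_dir_all(dst);
    }
    result
}
```

Keeping the cleanup in a wrapper leaves the success path of the entry loop untouched, which may address the concern that the change is invasive.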

Consider (Suggestions)

  1. Add a comment at app_packaging.rs:79 linking set_overwrite(false) to the caller's pre-unpack remove_dir_all. Prevents a future refactor of path_handlers.rs:278 from silently introducing a regression.
  2. Cap the helper's name length with debug_assert!(name.len() <= 100) to avoid a future test panicking on slice OOB.
  3. File a follow-up to consider cap-std-based sandboxing of contract unpacks (the tar crate's own docs recommend this for fully-untrusted archives).

What's Good

  • Single-file, well-scoped diff. No anti-patterns. No removed tests.
  • Test helper construction (writing the header name slot directly then re-cksumming) is exactly the right technique for the threat model.
  • cargo test -p freenet --lib app_packaging::tests:: passes locally in <1s. No flake risk.
  • Implementation closely follows the issue's fix sketch for the parts it covers.
  • First-time-contributor work shows good craftsmanship — the only structural critique is omission, not error.

Verdict

State: Needs Changes — Re-review Required After Fix

The PR is a genuine improvement and the existing code is sound, but it is incomplete: issue #3946 explicitly named the symlink/hardlink escape filter as a required defense, and that defense is the one case that is actually exploitable end-to-end through variable_content. Per ~/.claude/rules/finish-the-fix.md, this should be addressed in the same PR — the user-visible symptom ("malicious archive escapes the sandbox") is only closed when the symlink case is closed too.

The fix is mechanically tiny (~6 lines + a test). Recommend the author:

  1. Add symlink/hardlink entry-type rejection in WebApp::unpack.
  2. Add the symlink regression test.
  3. Optionally also harden variable_content to mirror sandbox_content_body's canonicalize+starts_with check (separate concern, possibly separate PR).

After the fix lands, a re-review on the new HEAD will confirm the gap is closed.

[AI-assisted - Claude]

Per @sanity's review of freenet#4204, the path-traversal guards I added only
look at entry.path() and not at entry.link_name(). A symlink entry with
a benign archive path (e.g. "app.js") but an escaping linkname (e.g.
"../../../etc/passwd") passes the absolute/parent-dir check and is then
created verbatim by tar::Entry::unpack_in. variable_content's
read_to_string then follows the symlink and serves the linked file to
the browser — full path-traversal exploit through the serving path.

Reject EntryType::Symlink and EntryType::Link (hard links) outright in
WebApp::unpack. Web contracts have no legitimate use for either, so the
narrowest fix is to refuse them rather than try to validate linknames.
Hardlinks were already validated by the upstream tar crate, but
rejecting both keeps the policy uniform and the reasoning local.

Added unpack_rejects_symlink_entry regression test that injects a
symlink with linkname ../escape.txt and asserts (1) the unpack returns
an UnpackingError, (2) the symlink was never created inside dst, and
(3) the linkname's target above dst doesn't exist.

The test helper writes the GNU header linkname slot directly, same
technique the existing tests use for the path slot, since
Builder::append_link validates targets.

Signed-off-by: Charlie Tonneslan <cst0520@gmail.com>
@c-tonneslan

Author

Thanks for the thorough review. The symlink case is a real bypass and you're right it's the one that actually escapes through variable_content. Pushed 3b838fc.

What changed:

  • WebApp::unpack now rejects EntryType::Symlink and EntryType::Link (hard link) outright, before the path-component check. Web contracts have no legitimate use for either, so refusing them is narrower than trying to validate linknames against the destination.
  • Hardlinks were already validated by the upstream tar crate per your read, but rejecting both keeps the policy uniform and the reasoning local to one file.
  • Added unpack_rejects_symlink_entry: injects a symlink with linkname = "../escape.txt", asserts the unpack returns UnpackingError, the symlink was never created inside dst, and nothing leaked to dst.parent().

Confirmed locally: cargo test -p freenet --lib app_packaging::tests:: passes all four tests; cargo fmt && cargo clippy -p freenet --lib -- -D warnings is clean.

On the rest of your "should fix" list — happy to address in this PR if you want, but flagging that each is a separable concern:

  • Duplicate-entry behavior change from set_overwrite(false). Agreed it's silent. I'd lean toward keeping the false-by-default and adding a test that documents the new behavior. Let me know if you want that in this PR or a follow-up.
  • "Nothing leaked" assertion for the absolute-path test. Straightforward to add — would mirror the parent-dir test's assert!(!path.exists()) pattern. Will add if you want.
  • Cleanup on partial unpack error. This is more invasive (touches the success path), and as you noted the freshness gate at path_handlers.rs:326 blocks serving partial state. I'd prefer to defer this to a follow-up unless you'd rather have it here.
  • variable_content canonicalize + starts_with hardening. This is the defense-in-depth fix you mentioned. With symlinks now rejected at unpack time, the surface is significantly narrower, but the TOCTOU-resistant pattern from sandbox_content_body is genuinely better as a second line. I'd file it as a separate issue/PR since it touches a different module and a different caller pattern; let me know your preference.

On the PR description: you're right that the original framing overstated the literal-escape case for absolute/.. paths (tar already neuters those). The actual exploit was the symlink case, which is now closed. I'll update the description to reflect that more accurately.

Let me know if you'd like any of the should-fix items folded in before re-review.

Development

Successfully merging this pull request may close these issues.

Tar unpack of web contract lacks path-traversal guards

3 participants