Skip to content

Hardening: input validation and bounds tightening across 36 subsystems (round 4)#31339

Merged
Jarred-Sumner merged 96 commits into
mainfrom
claude/hardening-sweep-4
May 25, 2026
Merged

Hardening: input validation and bounds tightening across 36 subsystems (round 4)#31339
Jarred-Sumner merged 96 commits into
mainfrom
claude/hardening-sweep-4

Conversation

@Jarred-Sumner

Copy link
Copy Markdown
Collaborator

Continues the hardening series (rounds 1–3) with another pass of input validation, bounds checking, and lifetime tightening across the codebase. 73 changes, each in its own commit, plus regression tests covering 55 of them (each test verified to fail on the released bun and pass on this branch).

Areas touched: install / lockfile / tarball extraction / bunx / upgrade, sql (postgres, sqlite), tls / sockets / dns, http2 / http3 / http client / server, markdown, shell, node:fs / zlib / spawn / streams, crypto, vm, yaml / semver / valkey / undici / url / cookies, dev server, transpiler cache, string handling.

The general shape of the changes: reject malformed or out-of-range inputs earlier, bound buffer and recursion growth, keep borrowed buffers alive (or copy them) for the duration of the operations that read them, and validate names/paths/headers before they are used to address files, hosts, or memory.

@Jarred-Sumner

Copy link
Copy Markdown
Collaborator Author

Removed 405 of the 435 comment lines this branch added under src/, and reverted the handful of pre-existing comments it had reworded back to their original wording. The only added comments kept are the // SAFETY: lines sitting directly above unsafe blocks (the workspace denies clippy::undocumented_unsafe_blocks, so deleting those fails the clippy job), 2 pre-existing comments that just moved with relocated code, and the // "maxRedirects: number" label that matches the existing option-block labels in fetch_impl. Commit: 0d8dc03.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional findings (outside current diff — PR may have been updated during review):

  • 🟡 src/install/TarballStream.rs:1090-1096 — The same "use O_NOFOLLOW instead of a pre-emptive unlink" feedback applied to extract_tarball.rs in 802304e wasn't carried over to this parallel .bun-tag write — it still does unlinkat() then File::write_file() (which opens without O_NOFOLLOW). Consider switching to the same openat(... | O::NOFOLLOW) + write_all pattern for consistency and to drop the extra syscall; functionally both approaches are safe here.

    Extended reasoning...

    What the issue is

    Jarred's inline review on the equivalent .bun-tag write in extract_tarball.rs (comment 3296185424) said:

    Use O_TRUNC and O_NOFOLLOW instead and handle the error. do not add extra syscalls.

    That was applied in commit 802304e: extract_tarball.rs:420-426 now opens .bun-tag with sys::O::WRONLY | O::CREAT | O::TRUNC | O::NOFOLLOW and writes through the returned fd, so a planted symlink fails the open with ELOOP and falls into the existing error branch. The author's reply (3296231996) even noted this uses the "same flags open_output_file in TarballStream.rs uses for extracted entries".

    However, the parallel .bun-tag write in TarballStream.rs:1090-1101 (the streaming extractor's GitHub-tarball path) was not updated. It still does:

    let _ = bun_sys::unlinkat(self.dest.unwrap(), bun_core::zstr!(".bun-tag"));
    if bun_sys::File::write_file(
        self.dest.unwrap(),
        bun_core::zstr!(".bun-tag"),
        self.resolved_github_dirname,
    )
    .is_err()

    where File::write_file (src/sys/file.rs:432) opens with O::WRONLY | O::CREAT | O::TRUNC — no O::NOFOLLOW.

    Why this is not a correctness or security bug

    Both approaches are functionally safe. The threat being defended against is a malicious GitHub tarball that ships a .bun-tag entry as a symlink pointing outside the extraction directory. By the time this code runs, extraction is complete and the extraction temp directory is private to this single-threaded process — nothing can recreate the symlink between the unlinkat() and the subsequent open(), so there is no TOCTOU window. The unlink-then-open sequence reliably defeats the planted-symlink attack.

    Why it's worth flagging anyway

    This is a consistency miss against an explicit maintainer directive on the identical scenario, plus one extra syscall per GitHub-tarball install on the streaming path:

    1. The maintainer specifically asked "do not add extra syscalls" on the sibling code path, and that preference was applied to one of two parallel implementations but not the other.
    2. The two extractors now defend against the same threat (.bun-tag as a planted symlink) using different mechanisms, which is a maintenance hazard — a future reader has to reason about why they differ.
    3. The O_NOFOLLOW approach is also slightly more robust as a pattern (the protection is in the open itself, not in a separate prior step that could be reordered).

    Step-by-step proof

    1. PR adds two .bun-tag symlink protections: one in extract_tarball.rs (the libarchive-based extractor) and one in TarballStream.rs (the streaming extractor), both originally using unlinkat() + write.
    2. Jarred reviews the extract_tarball.rs hunk at inline 3296185424: "Use O_TRUNC and O_NOFOLLOW instead and handle the error. do not add extra syscalls."
    3. Commit 802304e ("install: open the package tag file directly during extraction") applies that to extract_tarball.rs only — git show 802304ed --stat touches src/install/extract_tarball.rs but not src/install/TarballStream.rs.
    4. At HEAD, TarballStream.rs:1093 still has the unlinkat and line 1094 still calls File::write_file, which per src/sys/file.rs:432 opens without O::NOFOLLOW.
    5. So the same review feedback applies here and was not carried over.

    How to fix

    Mirror the extract_tarball.rs change:

    if bun_sys::File::openat(
        self.dest.unwrap(),
        bun_core::zstr!(".bun-tag"),
        bun_sys::O::WRONLY | bun_sys::O::CREAT | bun_sys::O::TRUNC | bun_sys::O::NOFOLLOW,
        0o664,
    )
    .and_then(|f| f.write_all(self.resolved_github_dirname))
    .is_err()
    {
        let _ = bun_sys::unlinkat(self.dest.unwrap(), bun_core::zstr!(".bun-tag"));
    }

    Drop the pre-emptive unlinkat at line 1093. Net syscalls become open+write+close, matching the sibling path.

Comment thread test/bundler/transpiler/transpiler.test.js
@Jarred-Sumner

Copy link
Copy Markdown
Collaborator Author

We should always be using the fchmodat syscall instead of worrying about glibc versions.

@Jarred-Sumner

Copy link
Copy Markdown
Collaborator Author

Switched chmod_on_ok to a single sys::fchmodat(Fd::cwd(), abs_target, mode, AT_SYMLINK_NOFOLLOW) call (the wrapper in src/sys/lib.rs already issues fchmodat with an EINTR retry, same as the neighboring chmod) and dropped the lchmod + lstat fallback. Done in abae510. One caveat: glibc < 2.32 returns ENOTSUP for AT_SYMLINK_NOFOLLOW on every path including regular files, so on those systems this code path won't add the executable bit to bin targets published without it; the error is ignored as before.

@Jarred-Sumner

Copy link
Copy Markdown
Collaborator Author

One caveat: glibc < 2.32 returns ENOTSUP for AT_SYMLINK_NOFOLLOW on every path including regular files, so on those systems this code path won't add the executable bit to bin targets published without it; the error is ignored as bef

What about the syscall version?

@Jarred-Sumner

Copy link
Copy Markdown
Collaborator Author

The raw fchmodat syscall doesn't take a flags argument at all — that's why glibc has to emulate AT_SYMLINK_NOFOLLOW in userspace, and only does so from 2.32 on. The syscall that does take flags is fchmodat2 (nr 452, Linux 6.6+): the kernel handles AT_SYMLINK_NOFOLLOW itself, so a regular-file bin target gets its mode set on any libc, and a symlink target returns ENOTSUP (the skip we want). sys::lchmod now invokes fchmodat2 directly via libc::syscall and only falls back to the libc fchmodat wrapper on ENOSYS (kernel < 6.6) — on those, glibc >= 2.32 still works for regular files and older glibc is no worse than before. chmod_on_ok calls sys::lchmod. Done in 378b48f.

Comment thread src/install/TarballStream.rs Outdated
Comment thread test/js/node/tls/test-node-extra-ca-certs.test.ts
@Jarred-Sumner

Copy link
Copy Markdown
Collaborator Author

The Windows lanes lost .bun-tag because O::NOFOLLOW in the NT openat translation replaces the create-options with FILE_OPEN_REPARSE_POINT instead of OR-ing it in, dropping FILE_SYNCHRONOUS_IO_NONALERT — the handle comes back asynchronous, the following WriteFile (null OVERLAPPED) fails, and the error branch unlinks the freshly created tag. Fixed by applying O::NOFOLLOW only on non-Windows at both tag-write sites (the no-follow open is kept on POSIX); Linux tests pass locally, Windows verification is deferred to CI. Commit: 05f502a

Comment thread test/bundler/transpiler/transpiler.test.js
Comment thread test/cli/install/bun-install-registry.test.ts
@Jarred-Sumner Jarred-Sumner merged commit cd1ad59 into main May 25, 2026
79 of 81 checks passed
@Jarred-Sumner Jarred-Sumner deleted the claude/hardening-sweep-4 branch May 25, 2026 20:26
springmin pushed a commit to springmin/bun that referenced this pull request May 26, 2026
* oven/main (10 new commits):
  Optimize TextEncoder.encode: restore SIMD ASCII fast paths lost in the Rust port (oven-sh#31385)
  js_parser: sanitize auto-generated default export name for digit-named modules (oven-sh#31403)
  fetch: run checkServerIdentity before writing the request (oven-sh#31325)
  ffi: avoid copying the threadsafe callback wrapper on the calling thread (oven-sh#31332)
  install: gate the exit-callback cache teardown to the main thread (oven-sh#31376)
  fix(node:module): don't register native helpers as their own constructors (oven-sh#31393)
  css: escape custom pseudo-class/element names when printing (oven-sh#31404)
  Deepen the lots-of-for-loop fixture so the transpiler stack-overflow tests throw on Windows (oven-sh#31382)
  Hardening: input validation and bounds tightening across 36 subsystems (round 4) (oven-sh#31339)
  Speed up FormData multipart serialization (oven-sh#31379)

Auto-merged: src/install/PackageManager.rs, src/runtime/cli/upgrade_command.rs, src/runtime/webcore/Blob.rs, src/sys/lib.rs
Jarred-Sumner added a commit that referenced this pull request May 27, 2026
…s (round 6) (#31417)

Tightens input validation, bounds checking, and state handling across
the runtime, package manager, and CLI. Continuation of #31339; same
structure (small per-area commits, regression tests included).

### Package manager / CLI
- install: validate package folder names derived from dependency aliases
before extraction (non-git resolutions)
- install: require integrity for off-registry tarball URLs when
migrating `package-lock.json` (yarn.lock migration unchanged — its v1
workspace/github entries legitimately omit integrity)
- install: validate migrated git committish values as single path
components
- install: constrain transitive `file:` dependency targets to their
declaring package
- install: JSON-escape registry/git strings when saving `bun.lock`;
escape quoted scalars in the yarn.lock printer
- install: byte-compare string-pool hits in manifest parsing and bound
version writes
- install: key lifecycle-script trust on the declared dependency alias
and resolution type
- install: resolve bin link targets and skip entries that escape the
package directory
- install: bound cache-folder name formatting and tarball decompression
output
- bunx: re-validate cache directory ownership after creation
- run: resolve the package-script shell from the original PATH instead
of `node_modules/.bin`
- upgrade-related items intentionally not included

### HTTP / TLS / networking
- fetch: drop caller-supplied Transfer-Encoding for fixed-size bodies
(single framing header on the wire)
- fetch: exclude requests carrying custom TLS options from the
experimental HTTP/3 path
- node:http: validate `options.port` in ClientRequest
(ERR_SOCKET_BAD_PORT)
- node:http2: zero-fill outbound DATA frame padding
- node:dns: reject hostnames containing embedded NUL bytes (slightly
stricter than Node, which truncates at the NUL; never affects a valid
hostname)
- tls: re-establish per-loop BIO state before driving handshakes after
JS callbacks
- server: disarm request-body callbacks before handing the response to
the file-stream path
- server: restrict the development-mode `/bun:info` route to loopback
clients
- valkey: checkpoint partial-line scan progress; dev server: bound
error-report path normalization
- bun-vscode: bind the diagnostics socket to a fresh per-session path
and verify it before advertising

### Web / runtime APIs
- Blob/Body: copy content types into the Blob they describe; bound
deserialization lengths; serialize only a slice's own window; tolerate
odd-offset UTF-16 text decoding
- S3: validate the `type` option with the same check other Blob
constructors use
- FormData: handle duplicate index-like field names in serialization
- WebCrypto: refuse to serialize non-extractable CryptoKey material
through script-visible serializers
- node:crypto: fail closed on ECDH errors; stop after rejecting
unsupported RSA decryption padding
- node:fs: pin async write sources; bound recursive-readdir path joins
- SQL: guard column-identifier tags when building row objects
- escapeHTML: process lone surrogates without skipping the following
character

### Parsers / shell / misc
- shell: copy stdin redirect buffers up front; pin output redirect
buffers; treat interpolated `cd` arguments literally
- markdown: sanitize link/image/fence metadata in the ANSI renderer;
make emphasis resolution linear on adversarial input
- JSON5/JSONC: stack-checked AST conversion; YAML: bound merge-key
materialization
- CSS: clamp tokenizer advances at end of input; glob: derive entry
offsets from the joined path
- resolver: validate exports-map targets after wildcard substitution

### Tests
34 new regression tests across the touched areas, each verified to fail
on the released build and pass here. A few fixes have no standalone test
where the behavior difference is not observable from JS (noted
per-commit in the shard history); CI exercises the affected suites.

`cargo check` and cross-target checks pass on all 10 CI targets; the
touched test suites pass locally with no new failures versus main.

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants