Hardening: input validation and bounds tightening across 36 subsystems (round 4)#31339
Conversation
|
Removed 405 of the 435 comment lines this branch added under src/, and reverted the handful of pre-existing comments it had reworded back to their original wording. The only added comments kept are the |
There was a problem hiding this comment.
Additional findings (outside current diff — PR may have been updated during review):
-
🟡
src/install/TarballStream.rs:1090-1096— The same "use O_NOFOLLOW instead of a pre-emptive unlink" feedback applied toextract_tarball.rsin 802304e wasn't carried over to this parallel.bun-tagwrite — it still doesunlinkat()thenFile::write_file()(which opens withoutO_NOFOLLOW). Consider switching to the sameopenat(... | O::NOFOLLOW)+write_allpattern for consistency and to drop the extra syscall; functionally both approaches are safe here.Extended reasoning...
What the issue is
Jarred's inline review on the equivalent
.bun-tagwrite inextract_tarball.rs(comment 3296185424) said:Use O_TRUNC and O_NOFOLLOW instead and handle the error. do not add extra syscalls.
That was applied in commit 802304e:
extract_tarball.rs:420-426now opens.bun-tagwithsys::O::WRONLY | O::CREAT | O::TRUNC | O::NOFOLLOWand writes through the returned fd, so a planted symlink fails the open withELOOPand falls into the existing error branch. The author's reply (3296231996) even noted this uses the "same flagsopen_output_filein TarballStream.rs uses for extracted entries".However, the parallel
.bun-tagwrite inTarballStream.rs:1090-1101(the streaming extractor's GitHub-tarball path) was not updated. It still does:let _ = bun_sys::unlinkat(self.dest.unwrap(), bun_core::zstr!(".bun-tag")); if bun_sys::File::write_file( self.dest.unwrap(), bun_core::zstr!(".bun-tag"), self.resolved_github_dirname, ) .is_err()
where
File::write_file(src/sys/file.rs:432) opens withO::WRONLY | O::CREAT | O::TRUNC— noO::NOFOLLOW.Why this is not a correctness or security bug
Both approaches are functionally safe. The threat being defended against is a malicious GitHub tarball that ships a
.bun-tagentry as a symlink pointing outside the extraction directory. By the time this code runs, extraction is complete and the extraction temp directory is private to this single-threaded process — nothing can recreate the symlink between theunlinkat()and the subsequentopen(), so there is no TOCTOU window. The unlink-then-open sequence reliably defeats the planted-symlink attack.Why it's worth flagging anyway
This is a consistency miss against an explicit maintainer directive on the identical scenario, plus one extra syscall per GitHub-tarball install on the streaming path:
- The maintainer specifically asked "do not add extra syscalls" on the sibling code path, and that preference was applied to one of two parallel implementations but not the other.
- The two extractors now defend against the same threat (
.bun-tagas a planted symlink) using different mechanisms, which is a maintenance hazard — a future reader has to reason about why they differ. - The
O_NOFOLLOWapproach is also slightly more robust as a pattern (the protection is in the open itself, not in a separate prior step that could be reordered).
Step-by-step proof
- PR adds two
.bun-tagsymlink protections: one inextract_tarball.rs(the libarchive-based extractor) and one inTarballStream.rs(the streaming extractor), both originally usingunlinkat()+ write. - Jarred reviews the
extract_tarball.rshunk at inline 3296185424: "Use O_TRUNC and O_NOFOLLOW instead and handle the error. do not add extra syscalls." - Commit 802304e ("install: open the package tag file directly during extraction") applies that to
extract_tarball.rsonly —git show 802304ed --stattouchessrc/install/extract_tarball.rsbut notsrc/install/TarballStream.rs. - At HEAD,
TarballStream.rs:1093still has theunlinkatand line 1094 still callsFile::write_file, which persrc/sys/file.rs:432opens withoutO::NOFOLLOW. - So the same review feedback applies here and was not carried over.
How to fix
Mirror the
extract_tarball.rschange:if bun_sys::File::openat( self.dest.unwrap(), bun_core::zstr!(".bun-tag"), bun_sys::O::WRONLY | bun_sys::O::CREAT | bun_sys::O::TRUNC | bun_sys::O::NOFOLLOW, 0o664, ) .and_then(|f| f.write_all(self.resolved_github_dirname)) .is_err() { let _ = bun_sys::unlinkat(self.dest.unwrap(), bun_core::zstr!(".bun-tag")); }
Drop the pre-emptive
unlinkatat line 1093. Net syscalls become open+write+close, matching the sibling path.
|
We should always be using the fchmodat syscall instead of worrying about glibc versions. |
|
Switched |
What about the syscall version? |
|
The raw |
|
The Windows lanes lost |
* oven/main (10 new commits): Optimize TextEncoder.encode: restore SIMD ASCII fast paths lost in the Rust port (oven-sh#31385) js_parser: sanitize auto-generated default export name for digit-named modules (oven-sh#31403) fetch: run checkServerIdentity before writing the request (oven-sh#31325) ffi: avoid copying the threadsafe callback wrapper on the calling thread (oven-sh#31332) install: gate the exit-callback cache teardown to the main thread (oven-sh#31376) fix(node:module): don't register native helpers as their own constructors (oven-sh#31393) css: escape custom pseudo-class/element names when printing (oven-sh#31404) Deepen the lots-of-for-loop fixture so the transpiler stack-overflow tests throw on Windows (oven-sh#31382) Hardening: input validation and bounds tightening across 36 subsystems (round 4) (oven-sh#31339) Speed up FormData multipart serialization (oven-sh#31379) Auto-merged: src/install/PackageManager.rs, src/runtime/cli/upgrade_command.rs, src/runtime/webcore/Blob.rs, src/sys/lib.rs
…s (round 6) (#31417) Tightens input validation, bounds checking, and state handling across the runtime, package manager, and CLI. Continuation of #31339; same structure (small per-area commits, regression tests included). ### Package manager / CLI - install: validate package folder names derived from dependency aliases before extraction (non-git resolutions) - install: require integrity for off-registry tarball URLs when migrating `package-lock.json` (yarn.lock migration unchanged — its v1 workspace/github entries legitimately omit integrity) - install: validate migrated git committish values as single path components - install: constrain transitive `file:` dependency targets to their declaring package - install: JSON-escape registry/git strings when saving `bun.lock`; escape quoted scalars in the yarn.lock printer - install: byte-compare string-pool hits in manifest parsing and bound version writes - install: key lifecycle-script trust on the declared dependency alias and resolution type - install: resolve bin link targets and skip entries that escape the package directory - install: bound cache-folder name formatting and tarball decompression output - bunx: re-validate cache directory ownership after creation - run: resolve the package-script shell from the original PATH instead of `node_modules/.bin` - upgrade-related items intentionally not included ### HTTP / TLS / networking - fetch: drop caller-supplied Transfer-Encoding for fixed-size bodies (single framing header on the wire) - fetch: exclude requests carrying custom TLS options from the experimental HTTP/3 path - node:http: validate `options.port` in ClientRequest (ERR_SOCKET_BAD_PORT) - node:http2: zero-fill outbound DATA frame padding - node:dns: reject hostnames containing embedded NUL bytes (slightly stricter than Node, which truncates at the NUL; never affects a valid hostname) - tls: re-establish per-loop BIO state before driving handshakes after JS callbacks - server: disarm request-body callbacks before handing the response to the file-stream path - server: restrict the development-mode `/bun:info` route to loopback clients - valkey: checkpoint partial-line scan progress; dev server: bound error-report path normalization - bun-vscode: bind the diagnostics socket to a fresh per-session path and verify it before advertising ### Web / runtime APIs - Blob/Body: copy content types into the Blob they describe; bound deserialization lengths; serialize only a slice's own window; tolerate odd-offset UTF-16 text decoding - S3: validate the `type` option with the same check other Blob constructors use - FormData: handle duplicate index-like field names in serialization - WebCrypto: refuse to serialize non-extractable CryptoKey material through script-visible serializers - node:crypto: fail closed on ECDH errors; stop after rejecting unsupported RSA decryption padding - node:fs: pin async write sources; bound recursive-readdir path joins - SQL: guard column-identifier tags when building row objects - escapeHTML: process lone surrogates without skipping the following character ### Parsers / shell / misc - shell: copy stdin redirect buffers up front; pin output redirect buffers; treat interpolated `cd` arguments literally - markdown: sanitize link/image/fence metadata in the ANSI renderer; make emphasis resolution linear on adversarial input - JSON5/JSONC: stack-checked AST conversion; YAML: bound merge-key materialization - CSS: clamp tokenizer advances at end of input; glob: derive entry offsets from the joined path - resolver: validate exports-map targets after wildcard substitution ### Tests 34 new regression tests across the touched areas, each verified to fail on the released build and pass here. A few fixes have no standalone test where the behavior difference is not observable from JS (noted per-commit in the shard history); CI exercises the affected suites. `cargo check` and cross-target checks pass on all 10 CI targets; the touched test suites pass locally with no new failures versus main. --------- Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Continues the hardening series (rounds 1–3) with another pass of input validation, bounds checking, and lifetime tightening across the codebase. 73 changes, each in its own commit, plus regression tests covering 55 of them (each test verified to fail on the released bun and pass on this branch).
Areas touched: install / lockfile / tarball extraction / bunx / upgrade, sql (postgres, sqlite), tls / sockets / dns, http2 / http3 / http client / server, markdown, shell, node:fs / zlib / spawn / streams, crypto, vm, yaml / semver / valkey / undici / url / cookies, dev server, transpiler cache, string handling.
The general shape of the changes: reject malformed or out-of-range inputs earlier, bound buffer and recursion growth, keep borrowed buffers alive (or copy them) for the duration of the operations that read them, and validate names/paths/headers before they are used to address files, hosts, or memory.