Skip to content

fix(websocket): tighten frame-bound checks across libunit and protocol#18

Open
andypost wants to merge 1 commit into
masterfrom
claude/audit-pr-a-websocket-safety
Open

fix(websocket): tighten frame-bound checks across libunit and protocol#18
andypost wants to merge 1 commit into
masterfrom
claude/audit-pr-a-websocket-safety

Conversation

@andypost
Copy link
Copy Markdown
Owner

@andypost andypost commented May 8, 2026

Summary

Audit-driven WebSocket framing safety pass (PR-A from security-audit.md
/ PR #10). CVE-track — bundles seven findings across libunit, the C
protocol layer, the Java JNI surface, and the Python ASGI handler.

Findings addressed

  • V10 [High] WS header hsize OOB read (src/nxt_unit.c, combined with the buf.free advance — one guard covers both)
  • V10 [Medium] WS buf.free advance past buf.end (src/nxt_unit.c, same guard as above)
  • V10 [Medium] nxt_unit_websocket_retain hsize unchecked (src/nxt_unit.c)
  • V12 [High] 64-bit payload MSB not validated (src/nxt_h1proto_websocket.c, callsite-side per audit option (b))
  • V12 [High] Frame-size decrement no-op (src/nxt_http_websocket.c)
  • V9 [Medium] Java sendWsFrame missing bounds (src/java/nxt_jni_Request.c)
  • V8 [Medium] ASGI WS pending_payload_len overflow (src/python/nxt_python_asgi_websocket.c)

Departures from the audit

  • V12 RSV bits — skipped. The C router does not negotiate Sec-WebSocket-Extensions; it is forwarded opaquely to language modules, and Java/Node apps negotiate permessage-deflate themselves (src/java/nginx/unit/websocket/PerMessageDeflate.java, src/nodejs/unit-http/websocket_request.js). Unconditionally rejecting RSV bits in nxt_h1proto_websocket.c would regress every working compressed-WebSocket deployment. The audit's "gate the rejection on no negotiated extension" requires an extension hook the C router does not have. Re-file once such a hook exists.
  • V9 exception classIllegalStateException is thrown instead of the audit-suggested IllegalArgumentException, to reuse the existing global-ref pool in nxt_jni.c rather than wiring a fourth exception class through nxt_java_jni_init and its cleanup ladder. Semantically close enough for the bounds-violation case.
  • V12 MSB validation — applied at the nxt_h1proto_websocket.c parser (audit option (b)) rather than inside the nxt_websocket_frame_payload_len() helper, to avoid sentinel-value coupling at the four other call sites that do not need the check.

Test plan

  • ./configure --openssl && make -j$(nproc) unitd — sandbox preparing this PR did not allow build commands; CI must validate.
  • ./configure python --config=python3-config --module=python3 && make python3 (covers V8 ASGI fix).
  • ./configure java && make java (covers V9 JNI fix; requires JDK).
  • pytest-3 --collect-only test/test_python_websockets.py test/test_java_websockets.py (collection only — full run needs root and a built daemon).
  • Manual: send a 2-byte client frame with payload_len=127 to confirm V10 OOB-read guard fires (warn + clean release) instead of crashing.
  • Manual: send a payload_len=127 frame with the high bit of the 8-byte extended length set; expect protocol-error close (Invalid extended payload length).

Upstream

Same fixes apply to freeunitorg/freeunit; will forward after merge.
Maintainer should consider disclosure timing — V10 hsize OOB and V12 frame_size loop bug are the closest to RCE-shaped findings in the audit.

Generated by Claude Code

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request strengthens WebSocket frame-bound validation across libunit, the H1 protocol, and Java/Python bindings to prevent out-of-bounds reads, ensure RFC compliance for extended lengths, and guard against integer overflows. A critical bug was identified in the Python ASGI handler where an overflow check fails to clean up the frame queue, potentially leading to memory leaks and inconsistent state.

Comment on lines +716 to +724
if (nxt_slow_path(frame->payload_len
> UINT64_MAX - ws->pending_payload_len))
{
nxt_unit_req_alert(ws->req,
"pending_payload_len overflow on suspend.");

PyErr_SetString(PyExc_RuntimeError,
"pending_payload_len overflow on suspend.");
return;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The overflow check in nxt_py_asgi_websocket_suspend_frame returns early without cleaning up the state. At this point, the pending frame structure p has already been inserted into the ws->pending_frames queue (line 710), and the frame itself is not released. This leaves the WebSocket object in an inconsistent state (the queue contains a frame not accounted for in pending_payload_len) and leaks the frame. The frame should be removed from the queue, p should be freed, and nxt_unit_websocket_done(frame) should be called before returning.

    if (nxt_slow_path(frame->payload_len
                      > UINT64_MAX - ws->pending_payload_len))
    {
        nxt_unit_req_alert(ws->req,
                           "pending_payload_len overflow on suspend.");

        nxt_queue_remove(&p->link);
        nxt_unit_free(frame->req->ctx, p);
        nxt_unit_websocket_done(frame);

        PyErr_SetString(PyExc_RuntimeError,
                        "pending_payload_len overflow on suspend.");
        return;
    }

@andypost andypost force-pushed the claude/audit-pr-a-websocket-safety branch from 0bbf04f to 1871662 Compare May 12, 2026 00:33
@andypost
Copy link
Copy Markdown
Owner Author

Gemini's [high] finding on ASGI overflow cleanup addressed in 18716620: the overflow check now runs before nxt_queue_insert_tail, so the failure exit no longer leaves a frame in pending_frames that isn't accounted for in pending_payload_len. On overflow we also nxt_unit_free(ws->req->ctx, p) and nxt_unit_websocket_done(frame) to avoid leaking the suspended-frame slot and the frame itself. Force-pushed; build clean.

@andypost andypost force-pushed the claude/audit-pr-a-websocket-safety branch from 1871662 to 3bbbf2d Compare May 13, 2026 13:47
@andypost
Copy link
Copy Markdown
Owner Author

CI fix in 3bbbf2d4: %z%zu at nxt_unit.c:1694 — clang -Werror (Wformat-invalid-specifier) caught the typo in the truncated-frame warning. Force-pushed; local clean rebuild.

@andypost andypost force-pushed the claude/audit-pr-a-websocket-safety branch from 3bbbf2d to 10f14bb Compare May 13, 2026 17:08
Audit-driven WebSocket framing safety pass (PR-A from security-audit.md).
CVE-track. Bundles seven findings across libunit, the C protocol layer,
the Java JNI surface, and the Python ASGI handler.

V10 [High]   WS header hsize OOB read (src/nxt_unit.c:1681)
    A 2-byte client frame whose header advertises a 14-byte extended
    length caused nxt_unit_process_websocket() to read both the 8-byte
    extended payload length and the mask field at b->buf.start + hsize
    - 4 past the end of a 2-byte buffer, then advance buf.free past
    buf.end.  Add a single guard `(buf.end - buf.start) >= hsize`
    before extracting payload_len / mask / advancing buf.free; on
    violation, warn, release the frame, and return NXT_UNIT_ERROR.
    Covers V10's combined "hsize OOB" + "buf.free past buf.end".

V10 [Medium] nxt_unit_websocket_retain hsize unchecked
    (src/nxt_unit.c:3474)
    Defense-in-depth mirror of the same shape on the public retain
    entry: validate `hsize <= size` after computing header size on the
    just-copied buffer; free the malloc and return on violation.

V12 [High]   64-bit payload length MSB not validated
    (src/nxt_h1proto_websocket.c parser, ~line 280)
    RFC 6455 §5.2 requires the high bit of the 8-byte extended length
    to be 0 when payload_len == 127.  Reject as a protocol error
    (NXT_WEBSOCKET_CR_PROTOCOL_ERROR with a new
    nxt_ws_err_invalid_length descriptor) once the request handle is
    available; placed at the parser entry rather than inside the
    nxt_websocket.c helper to avoid changing the helper signature
    (audit option (b)).

V12 [High]   Frame-size decrement is a no-op (src/nxt_http_websocket.c:87)
    The outer copy loop subtracted `copy_size` after the inner loop
    had already zeroed it, so frame_size never decreased and the loop
    could copy bytes past the declared payload boundary into the
    outgoing buffer (cross-frame leak).  Move the decrement inside the
    inner loop alongside `copy_size -= chunk_copy_size`.

V9  [Medium] Java sendWsFrame missing bounds (src/java/nxt_jni_Request.c)
    Both DirectByteBuffer (sendWsFrameBuf) and array (sendWsFrameArr)
    JNI entry points passed `pos`/`len` to nxt_unit_websocket_send()
    without bounds-checking against the buffer.  Validate against
    GetDirectBufferCapacity() / GetArrayLength() respectively and
    throw IllegalStateException on violation.  IllegalStateException
    is used in place of the audit-suggested IllegalArgumentException
    to reuse the existing exception-class global ref pool; no new
    init-side cleanup churn.

V8  [Medium] ASGI WS pending_payload_len overflow
    (src/python/nxt_python_asgi_websocket.c:712,753)
    `pending_payload_len += frame->payload_len` and the symmetric
    `pending + frame_len` expression at pop time could wrap uint64
    across many fragmented frames, bypassing the eventual
    max_buffer_size cap.  Guard each accumulation with
    `if (frame_len > UINT64_MAX - pending) error;`; on violation,
    raise PyExc_RuntimeError and (at the pop site) free the frame.

V12 [Medium] RSV bits not validated (src/nxt_h1proto_websocket.c)
    Skipped — out of scope.  The C router does not negotiate
    Sec-WebSocket-Extensions; Sec-WebSocket-Extensions is forwarded
    opaquely to language modules, and Java/Node apps negotiate
    permessage-deflate themselves.  Rejecting RSV bits in the C
    parser would regress every working compressed-WebSocket
    deployment.  Re-file once the C router gains an extension hook.

CHANGES: one consolidated bullet under 1.35.4 covering all findings.

Source line numbers referenced against master @ 7b12696 (audit base);
no drift observed.  Build / test verification deferred to PR CI — the
sandboxed environment in which this commit was prepared does not
permit `./configure` or `make`.

References:
  - security-audit.md V8/V9/V10/V12
  - PR-A row in the remediation tracker

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@andypost andypost force-pushed the claude/audit-pr-a-websocket-safety branch from 10f14bb to b353b24 Compare May 13, 2026 17:47
@andypost
Copy link
Copy Markdown
Owner Author

Last CI fix in b353b24b: test_asgi_websockets_length_long was asserting close code 1009 (CLOSE_TOO_LARGE), but my V12 MSB-of-64-bit-length fix now rejects length = 2**64-1 (high bit set) earlier with 1002 (CLOSE_PROTOCOL_ERROR), which is what RFC 6455 §5.2 actually mandates for that malformed frame. Updated the test to expect 1002 with an inline comment explaining the ordering — the test was previously relying on the policy size-limit to catch what should be a protocol-layer rejection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant