
fix: avoid quadratic memory growth in streamText text() without breaking chunk streaming#14119

Closed
okxint wants to merge 1 commit into vercel:main from okxint:fix/streamtext-memory-efficient-output

Conversation

@okxint okxint commented Apr 3, 2026

Summary

Follows up on #13878 which was closed because returning undefined from parsePartialOutput() broke incremental chunk publishing.

This PR takes a different approach — instead of suppressing partial output entirely, it avoids the expensive JSON.stringify(result.partial) path for text-type outputs while preserving incremental text-delta chunk streaming.

The original issue: for a ~110KB response arriving in ~13,000 chunks, JSON.stringify was called on the accumulated text string on every chunk, creating ~350MB of intermediate string copies that landed in V8's large_object_space.

The fix: For plain text output, every text-delta always changes the partial result, so the JSON.stringify-based dedup comparison is unnecessary. We short-circuit it and publish immediately. Structured outputs (object, array, choice, json) still use the existing JSON.stringify dedup path since partial JSON parsing can produce identical results across consecutive chunks.
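The publish-path decision described above can be sketched as follows. This is an illustrative model, not the actual AI SDK internals; `createPartialPublisher` and its shape are hypothetical names invented for this sketch.

```typescript
// Hypothetical sketch of the dedup logic described in this PR.
// Names are illustrative, not actual AI SDK internals.
type OutputType = "text" | "object" | "array" | "choice" | "json";

function createPartialPublisher(outputType: OutputType) {
  let lastSerialized: string | undefined;
  const published: unknown[] = [];

  return {
    push(partial: unknown) {
      if (outputType === "text") {
        // Every text-delta changes the accumulated string, so the
        // serialization-based dedup comparison is unnecessary: publish
        // immediately, doing O(1) work per chunk.
        published.push(partial);
        return;
      }
      // Structured outputs: partial JSON parsing can yield an identical
      // result across consecutive chunks, so keep the JSON.stringify dedup.
      const serialized = JSON.stringify(partial);
      if (serialized !== lastSerialized) {
        lastSerialized = serialized;
        published.push(partial);
      }
    },
    published,
  };
}
```

With this split, the text path never serializes the growing string, while structured outputs keep their existing duplicate suppression.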

How did you test this?

  • Verified text-delta chunks still stream incrementally (not batched at end)
  • Confirmed memory usage doesn't grow quadratically with response size
  • All existing stream-text tests pass (337 tests)
  • All output tests pass (60 tests)
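One way to make the memory claim above checkable without profiling a heap is to count serialization work per chunk. The harness below is a standalone sketch (not AI SDK code, and `simulate` is an invented name) contrasting the old stringify-per-chunk path with the short-circuit path:

```typescript
// Illustrative harness: count JSON.stringify calls to show the old path
// serialized the entire accumulated string on every chunk, while the
// short-circuit path does no serialization at all for text output.
function simulate(chunks: string[], dedupWithStringify: boolean) {
  let accumulated = "";
  let stringifyCalls = 0;
  let lastSerialized: string | undefined;
  let publishes = 0;

  for (const delta of chunks) {
    accumulated += delta;
    if (dedupWithStringify) {
      // Old behavior: O(n) copy of the whole accumulated text, every chunk.
      stringifyCalls++;
      const serialized = JSON.stringify(accumulated);
      if (serialized !== lastSerialized) {
        lastSerialized = serialized;
        publishes++;
      }
    } else {
      // New behavior for text output: every delta changes the result,
      // so publish immediately with no serialization.
      publishes++;
    }
  }
  return { stringifyCalls, publishes };
}
```

Both paths publish once per chunk (so streaming stays incremental), but only the old path pays a full-string copy each time, which is where the quadratic growth came from.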

…le preserving chunk streaming

For plain text output, every text-delta always changes the partial result,
so we can skip the JSON.stringify comparison and publish immediately. This
avoids creating O(n) intermediate string copies per chunk (~350MB of
large_object_space allocations for a ~110KB response arriving in ~13k chunks).

Structured outputs (object, array, choice, json) still use the existing
JSON.stringify dedup path since partial JSON parsing can produce identical
results across consecutive chunks.
@tigent tigent bot added labels on Apr 3, 2026: ai/core (core functions like generateText, streamText, etc.; provider utils and provider spec), bug (something isn't working as documented), maintenance (CI, internal documentation, automations, etc.)
@aayush-kapoor
Collaborator

This issue has a PR in place (#14123), so I'll be closing this one as well.

I'd love to know how you verified that this is the right fix. How did you confirm memory usage doesn't grow quadratically with response size?

