fix: avoid quadratic memory growth in streamText text() without breaking chunk streaming #14119
Closed
okxint wants to merge 1 commit into vercel:main from
Conversation
…le preserving chunk streaming

For plain text output, every text-delta always changes the partial result, so we can skip the JSON.stringify comparison and publish immediately. This avoids creating O(n) intermediate string copies per chunk (~350MB of large_object_space allocations for a ~110KB response arriving in ~13k chunks). Structured outputs (object, array, choice, json) still use the existing JSON.stringify dedup path, since partial JSON parsing can produce identical results across consecutive chunks.
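The quadratic cost the commit message describes can be illustrated with a small standalone simulation (the chunk counts and sizes below are illustrative, not taken from the SDK; the loop mimics the old behavior of serializing the full accumulated text on every delta):

```typescript
// Sketch: stringifying the accumulated text on every chunk is O(n^2) in
// total bytes copied. Numbers here are small for illustration; the PR
// reports ~13k chunks for a ~110KB response.
const chunkSize = 8;      // bytes appended per text-delta (illustrative)
const numChunks = 1_000;  // number of deltas (illustrative)

let accumulated = "";
let bytesCopied = 0;
for (let i = 0; i < numChunks; i++) {
  accumulated += "x".repeat(chunkSize);
  // Old path: serialize the entire partial result on each chunk.
  const serialized = JSON.stringify(accumulated);
  bytesCopied += serialized.length; // each call copies the full string so far
}
// bytesCopied grows as ~chunkSize * numChunks^2 / 2, i.e. quadratically,
// while the useful output is only chunkSize * numChunks bytes.
```

With these numbers, roughly 4MB of intermediate strings are produced to stream only 8KB of text, which is the same shape of blow-up as the ~350MB reported for a ~110KB response.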
Collaborator
this issue has a PR in place #14123 - will be closing this as well. Would love to know how you verified this is the right fix? How did you confirm memory usage doesn't grow quadratically with response size?
Summary
Follows up on #13878, which was closed because returning `undefined` from `parsePartialOutput()` broke incremental chunk publishing. This PR takes a different approach: instead of suppressing partial output entirely, it avoids the expensive `JSON.stringify(result.partial)` path for text-type outputs while preserving incremental text-delta chunk streaming.

The original issue: for a ~110KB response arriving in ~13,000 chunks, `JSON.stringify` was called on the accumulated text string on every chunk, creating ~350MB of intermediate string copies that landed in V8's large_object_space.

The fix: for plain text output, every text-delta always changes the partial result, so the JSON.stringify-based dedup comparison is unnecessary. We short-circuit it and publish immediately. Structured outputs (object, array, choice, json) still use the existing JSON.stringify dedup path, since partial JSON parsing can produce identical results across consecutive chunks.

How did you test this?
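The short-circuit described in the summary might look like the following sketch. The names (`OutputType`, `shouldPublish`, `lastSerialized`) are hypothetical and do not reflect the actual AI SDK internals; the sketch only demonstrates the publish/dedup decision:

```typescript
// Hypothetical output kinds, mirroring the types named in this PR.
type OutputType = "text" | "object" | "array" | "choice" | "json";

// Decide whether a new partial result should be published downstream.
function shouldPublish(
  outputType: OutputType,
  partial: unknown,
  lastSerialized: { value: string | undefined },
): boolean {
  if (outputType === "text") {
    // Every text-delta changes the accumulated string, so the
    // JSON.stringify dedup comparison is redundant: publish immediately.
    return true;
  }
  // Structured outputs keep the dedup, since partial JSON parsing can
  // yield identical results across consecutive chunks.
  const serialized = JSON.stringify(partial);
  if (serialized === lastSerialized.value) return false;
  lastSerialized.value = serialized;
  return true;
}
```

The key design point is that the dedup state is only ever read or written on the structured-output path, so the text path does no serialization work at all.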