Force-pushed f532cd9 to d04b8f1
Force-pushed d04b8f1 to b31773f
dbutenhof
left a comment
OK, so this just does all of the multiprocessing operations through the get_context() object rather than using the globals, and shouldn't actually change things.
Does anyone call the GuideLLM entrypoints from a Python module? Because that expands the whole issue of __main__ restrictions to the caller... while for our CLI we only worry about our own. (One possibility I suppose would be for our ABI entrypoints to check whether the get_start_method() aligns with our expectations...)
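A minimal sketch of the two ideas above: routing everything through a `get_context()` object, and a hypothetical start-method check that an API entrypoint could run (the helper name and behavior are assumptions for illustration, not GuideLLM code):

```python
import multiprocessing


def assert_expected_start_method(expected: str = "forkserver") -> None:
    # Hypothetical guard an API entrypoint could run: fail fast if the
    # calling process has already fixed a different start method.
    actual = multiprocessing.get_start_method(allow_none=True)
    if actual is not None and actual != expected:
        raise RuntimeError(
            f"multiprocessing start method is {actual!r}, expected {expected!r}"
        )


# Doing all multiprocessing operations through a context object (rather
# than the module-level globals) keeps the choice explicit and local to
# the call site, without changing the interpreter-wide default:
ctx = multiprocessing.get_context("spawn")
work_queue = ctx.Queue()
```

The context object carries its own start method, so `ctx.get_start_method()` reports `"spawn"` here regardless of what the global default is.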
Force-pushed b31773f to 88eb29c
jaredoconnell
left a comment
It appears to work fine. I have one comment.
For extra context, here is how long the command takes on my Mac:
fork: 40.30 real 8.60 user 2.39 sys
forkserver: 48.44 real 7.56 user 1.39 sys
spawn: 51.70 real 50.97 user 23.85 sys
@dbutenhof I tried a version of main that dropped the submodule imports into the click run function (I had to strip out all argument validation and other CLI entrypoints) and the benchmark still ran fine. So I am not really sure the default Python behavior is actually helping us. In the future, if we "lazy load" every submodule in |
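The "lazy load" idea above amounts to deferring heavy imports into the entrypoint body, so a `spawn`/`forkserver` child that re-imports `__main__` only pays for the top-level names. A stdlib-only sketch of the pattern (with click, the import would move inside the command callback instead; `heavy_submodule` is a stand-in, not a real GuideLLM module):

```python
def main() -> int:
    # Imports deferred into the function body: a spawn/forkserver worker
    # re-imports __main__ but never calls main(), so it skips these.
    import json as heavy_submodule  # stand-in for an expensive import

    return len(heavy_submodule.dumps({"status": "ok"}))


if __name__ == "__main__":
    # The __main__ guard is what keeps child processes from re-running
    # the entrypoint when they import this module.
    print(main())
```

The guard alone is what spawn/forkserver require; moving imports inside `main()` is the extra step that removes the import cost from each child.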
Signed-off-by: Samuel Monson <smonson@redhat.com>
Force-pushed 88eb29c to 3acacc5
Summary
Fixes spawn and forkserver multi-process contexts.
Details
I was hoping that after #647 we could switch to `forkserver` by default. However, it turns out that `forkserver` and `spawn` will import the calling process's entrypoint (e.g. `__main__.py`), so we run into the same blocker as #641. That said, I was able to confirm that stripping every heavy import out of `__main__.py` solves the issue, so we should be good to switch in v0.7.0.

On my machine there is about a ~10s overhead for `forkserver` and slightly more for `spawn`, which is not the worst for a default. However, the overhead may be larger on other systems:

```
time guidellm benchmark run --profile poisson --rate 5 --data prompt_tokens=128,output_tokens=128 --max-seconds 30 --outputs json
time guidellm benchmark run --profile concurrent --rate 400 --data prompt_tokens=128,output_tokens=128 --max-seconds 30 --outputs json
time guidellm benchmark run --profile concurrent --rate 400 --data prompt_tokens=128,output_tokens=128 --max-seconds 120 --outputs json
```

Test Plan
Set `GUIDELLM__MP_CONTEXT_TYPE=forkserver` and confirm benchmarks run.

Use of AI

## WRITTEN BY AI ##