Force-pushed f532cd9 to d04b8f1
Force-pushed d04b8f1 to b31773f
dbutenhof
left a comment
OK, so this just does all of the multiprocessing operations through the get_context() object rather than using the globals, and shouldn't actually change things.
Does anyone call the GuideLLM entrypoints from a Python module? Because that expands the whole issue of __main__ restrictions to the caller... while for our CLI we only worry about our own. (One possibility I suppose would be for our ABI entrypoints to check whether the get_start_method() aligns with our expectations...)
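A minimal sketch of the two ideas above: routing everything through a `get_context()` object, and a hypothetical start-method check that an API entrypoint could run (the helper name and behavior are assumptions for illustration, not GuideLLM code):

```python
import multiprocessing


def assert_expected_start_method(expected: str = "forkserver") -> None:
    # Hypothetical guard an API entrypoint could run: fail fast if the
    # calling process has already fixed a different start method.
    actual = multiprocessing.get_start_method(allow_none=True)
    if actual is not None and actual != expected:
        raise RuntimeError(
            f"multiprocessing start method is {actual!r}, expected {expected!r}"
        )


# Doing all multiprocessing operations through a context object (rather
# than the module-level globals) keeps the choice explicit and local to
# the call site, without changing the interpreter-wide default:
ctx = multiprocessing.get_context("spawn")
work_queue = ctx.Queue()
```

The context object carries its own start method, so `ctx.get_start_method()` reports `"spawn"` here regardless of what the global default is.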
Force-pushed b31773f to 88eb29c
jaredoconnell
left a comment
It appears to work fine. I have one comment.
For extra context, here is how long the command takes on my Mac:
fork: 40.30 real 8.60 user 2.39 sys
forkserver: 48.44 real 7.56 user 1.39 sys
spawn: 51.70 real 50.97 user 23.85 sys
@dbutenhof I tried a version of main that dropped the submodule imports into the click run function (I had to strip out all argument validation and other CLI entrypoints) and the benchmark still ran fine. So I am not really sure the default Python behavior is actually helping us. In the future, if we "lazy load" every submodule in |
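The "lazy load" idea above amounts to deferring heavy imports into the entrypoint body, so a `spawn`/`forkserver` child that re-imports `__main__` only pays for the top-level names. A stdlib-only sketch of the pattern (with click, the import would move inside the command callback instead; `heavy_submodule` is a stand-in, not a real GuideLLM module):

```python
def main() -> int:
    # Imports deferred into the function body: a spawn/forkserver worker
    # re-imports __main__ but never calls main(), so it skips these.
    import json as heavy_submodule  # stand-in for an expensive import

    return len(heavy_submodule.dumps({"status": "ok"}))


if __name__ == "__main__":
    # The __main__ guard is what keeps child processes from re-running
    # the entrypoint when they import this module.
    print(main())
```

The guard alone is what spawn/forkserver require; moving imports inside `main()` is the extra step that removes the import cost from each child.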
Signed-off-by: Samuel Monson <smonson@redhat.com>
Force-pushed 88eb29c to 3acacc5
Summary
Fixes spawn and forkserver multi-process contexts.
Details
I was hoping that after #647 we could switch to `forkserver` by default. However, it turns out that `forkserver` and `spawn` will import the calling process's entrypoint (e.g. `__main__.py`), so we run into the same blocker as #641. That said, I was able to confirm that stripping every heavy import out of `__main__.py` solves the issue, so we should be good to switch in v0.7.0.

On my machine there is about a ~10s overhead for `forkserver` and slightly more for `spawn`, which is not the worst for a default. However, the overhead may be larger on other systems:

```
time guidellm benchmark run --profile poisson --rate 5 --data prompt_tokens=128,output_tokens=128 --max-seconds 30 --outputs json
time guidellm benchmark run --profile concurrent --rate 400 --data prompt_tokens=128,output_tokens=128 --max-seconds 30 --outputs json
time guidellm benchmark run --profile concurrent --rate 400 --data prompt_tokens=128,output_tokens=128 --max-seconds 120 --outputs json
```

Test Plan
Set `GUIDELLM__MP_CONTEXT_TYPE=forkserver` and confirm benchmarks run.

Use of AI

## WRITTEN BY AI ##