fix: robust worker_init detection for Parsl — support production pixi/conda/venv environments #22

Description

@maxscheurer

Summary

_default_worker_init() in mdfactory/orchestration/tui.py tries to auto-detect a pixi environment for Parsl worker initialization, but it relies on assumptions that hold only in a dev setup (a source checkout with .pixi/envs/default/ present). For production installs (pip install mdfactory[parsl]), it silently returns "".

Current Behavior

project_root = Path(mdfactory.__file__).parent.parent
pixi_env = project_root / ".pixi" / "envs" / "default"
if pixi_env.exists():
    return f'eval "$(pixi shell-hook --manifest-path {project_root} -e default)"'

Assumptions that break in production:

  1. mdfactory.__file__ is in a source checkout (not site-packages)
  2. The parent of mdfactory/ is the project root (breaks with editable installs via symlinks)
  3. .pixi/envs/default/ existing means pixi is the correct activation method
  4. The default environment is the right one (could be dev or custom)
  5. pixi is available on PATH on compute nodes

Desired Behavior

Robust worker environment detection that works for production mdfactory installations on HPC clusters. Should handle at least:

  • pixi: detect via PIXI_PROJECT_MANIFEST env var or pixi on PATH
  • conda/mamba: detect via CONDA_PREFIX / CONDA_DEFAULT_ENV
  • venv: detect via VIRTUAL_ENV
  • module system: user-provided module loads (no auto-detection possible)

The result is used as a pre-filled default in the interactive TUI wizard — the user can always edit it. So a wrong guess is not catastrophic, but an empty default on a cluster where we could detect the environment is a missed UX opportunity.
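A minimal sketch of what env-var-based detection could look like. The name detect_worker_init, the precedence order, and the exact activation commands are assumptions for illustration, not the current implementation. pixi is checked first because an activated pixi environment also sets CONDA_PREFIX. The conda branch assumes the conda shell hook is already initialized in the job's shell.

```python
import os
import shlex
from typing import Mapping, Optional


def detect_worker_init(env: Optional[Mapping[str, str]] = None) -> str:
    """Guess a shell snippet that activates the current Python environment.

    Hypothetical helper: returns "" when no environment manager is detected.
    """
    env = os.environ if env is None else env

    # pixi first: an activated pixi environment also exports CONDA_PREFIX.
    manifest = env.get("PIXI_PROJECT_MANIFEST")
    if manifest:
        name = env.get("PIXI_ENVIRONMENT_NAME", "default")
        return (
            f'eval "$(pixi shell-hook --manifest-path {shlex.quote(manifest)}'
            f' -e {shlex.quote(name)})"'
        )

    # conda/mamba: activate by prefix path (assumes `conda activate` is usable
    # in the batch shell, i.e. the conda shell hook has been sourced).
    prefix = env.get("CONDA_PREFIX")
    if prefix:
        return f"conda activate {shlex.quote(prefix)}"

    # Plain venv: source its activate script (POSIX layout assumed).
    venv = env.get("VIRTUAL_ENV")
    if venv:
        return f"source {shlex.quote(venv + '/bin/activate')}"

    # Module systems cannot be auto-detected; leave the default empty.
    return ""
```

Passing the environment as a mapping keeps the function easy to test and makes the login-node vs. compute-node difference explicit: the guess reflects whatever environment the TUI process itself was started in.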

Acceptance Criteria

  • Worker init detection works for pip install mdfactory[parsl] (non-dev install)
  • Detects pixi via environment variables rather than filesystem assumptions
  • Falls back gracefully (empty string) when no environment manager is detected
  • Works correctly whether mdfactory build --slurm tui is run from a compute node or a login node
  • Add a test that exercises the detection logic with mocked env vars
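One possible shape for the test in the last criterion, using only the standard library. The _default_worker_init defined here is a stand-in with assumed behavior so the example runs on its own; a real test would import the function from mdfactory.orchestration.tui instead.

```python
import os
import unittest
from unittest import mock


def _default_worker_init() -> str:
    """Stand-in for the real function (assumed behavior, for illustration only)."""
    manifest = os.environ.get("PIXI_PROJECT_MANIFEST")
    if manifest:
        return f'eval "$(pixi shell-hook --manifest-path {manifest})"'
    prefix = os.environ.get("CONDA_PREFIX")
    if prefix:
        return f"conda activate {prefix}"
    venv = os.environ.get("VIRTUAL_ENV")
    if venv:
        return f"source {venv}/bin/activate"
    return ""


class WorkerInitDetectionTest(unittest.TestCase):
    def detect(self, **env: str) -> str:
        # clear=True hides the developer's own environment during the call;
        # os.environ is restored when the context manager exits.
        with mock.patch.dict(os.environ, env, clear=True):
            return _default_worker_init()

    def test_empty_environment_falls_back_to_empty_string(self):
        self.assertEqual(self.detect(), "")

    def test_venv_is_detected(self):
        self.assertEqual(
            self.detect(VIRTUAL_ENV="/opt/venv"),
            "source /opt/venv/bin/activate",
        )

    def test_pixi_wins_over_conda_prefix(self):
        result = self.detect(
            PIXI_PROJECT_MANIFEST="/proj/pixi.toml",
            CONDA_PREFIX="/proj/.pixi/envs/default",
        )
        self.assertIn("pixi shell-hook", result)
```

This can be run with python -m unittest. Clearing the environment matters because a developer running the suite inside pixi or conda would otherwise leak their own variables into the test.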

Technical Notes

  • The function lives at line 23 of mdfactory/orchestration/tui.py
  • It's called from both _configure_with_cluster() and _configure_manual() as a questionary text prompt default
  • Related: Parsl's worker_init runs as a shell snippet inside each SLURM job before workers start
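For context, a rough sketch of where the detected string ends up on the Parsl side. The function name build_slurm_config, the partition name, the walltime, and the executor label are placeholders, not values from mdfactory. Parsl is imported lazily here on the assumption that it stays an optional extra.

```python
def build_slurm_config(worker_init: str, partition: str = "compute"):
    """Build a Parsl Config whose SLURM jobs run `worker_init` before workers start.

    Hypothetical helper; partition, walltime, and label are placeholder values.
    """
    # Imported lazily because Parsl is an optional extra (mdfactory[parsl]).
    from parsl.config import Config
    from parsl.executors import HighThroughputExecutor
    from parsl.providers import SlurmProvider

    provider = SlurmProvider(
        partition=partition,
        nodes_per_block=1,
        walltime="01:00:00",
        # Shell snippet written into each submitted job script ahead of the
        # worker launch command, e.g. "source /opt/venv/bin/activate".
        worker_init=worker_init,
    )
    return Config(
        executors=[HighThroughputExecutor(label="htex_slurm", provider=provider)]
    )
```

Because the snippet is executed on the compute node, anything it references (a pixi binary, a conda installation, a venv path) has to be reachable from there, not just from the node where the TUI ran.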
