[security] fix(personalization): stop replaying exported env var values#151

Merged
tjb-tech merged 1 commit into HKUDS:main from shaun0927:fix/personalization-secret-sanitization
Apr 17, 2026

@shaun0927
Contributor

Closes #149.

Summary

This PR narrows personalization env-var extraction so exported environment variables are stored by name only, not by raw NAME=value payload.

Concretely:

  • export OPENAI_API_KEY=... is now remembered as OPENAI_API_KEY
  • raw env-var values are no longer persisted into local_rules
  • later runtime system prompts no longer replay those raw secret values
  • regression tests cover both extraction and prompt assembly

Why

PR #65 introduced personalization so OpenHarness can remember local environment context such as hosts, paths, and endpoints. That feature direction makes sense.

The problem is that the current implementation also captures the full exported value. For a secret, that means the value is persisted and re-exposed in three places:

  • the value is written to facts.json
  • the value is rendered into rules.md
  • the value is replayed into future build_runtime_system_prompt(...) output

So a one-off shell snippet like export OPENAI_API_KEY=... can become a persistent prompt-side secret disclosure path.
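The path described above can be sketched with in-memory stand-ins. This is only an illustration: `persist_facts`, `render_rules`, and `assemble_prompt` are hypothetical helpers, not the actual OpenHarness functions, and the JSON layout is assumed.

```python
import json

# Hypothetical stand-ins for the personalization pipeline; the real
# OpenHarness functions and file layouts may differ.

def persist_facts(facts: list[str]) -> str:
    """Serialize extracted facts, roughly as a facts.json file might hold them."""
    return json.dumps({"env_vars": facts}, indent=2)

def render_rules(facts_json: str) -> str:
    """Render stored facts into a markdown rules document."""
    facts = json.loads(facts_json)["env_vars"]
    return "\n".join(f"- env: {fact}" for fact in facts)

def assemble_prompt(base: str, rules_md: str) -> str:
    """Append the rendered local rules to a base system prompt."""
    return f"{base}\n\n# Local rules\n{rules_md}"

BASE = "You are a coding assistant."

# Old behavior: the extractor hands over the full NAME=value payload.
leaky = assemble_prompt(
    BASE, render_rules(persist_facts(["OPENAI_API_KEY=sk-test-secret"]))
)

# New behavior: only the variable name is handed over.
safe = assemble_prompt(BASE, render_rules(persist_facts(["OPENAI_API_KEY"])))

print("sk-test-secret" in leaky)  # True
print("sk-test-secret" in safe)   # False
```

Nothing downstream of extraction filters the value, so in this sketch the only place to stop the leak is the extractor itself.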

Root cause

src/openharness/personalization/extractor.py currently captures the full NAME=value payload for export ... patterns.

That makes sense for general parsing, but it is too broad for a personalization feature whose output is persisted and later injected into prompts.
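A minimal sketch of the difference, assuming the extractor is regex-based. The two patterns and function names below are hypothetical and are not copied from `extractor.py`; they only show how moving the capture group changes what gets stored.

```python
import re

# Hypothetical patterns for illustration; not the actual regexes in extractor.py.
# Broad form: the capture group spans the whole NAME=value payload.
EXPORT_FULL = re.compile(r"\bexport\s+([A-Za-z_][A-Za-z0-9_]*=\S+)")
# Narrow form: the capture group stops before the "=" so only the name is kept.
EXPORT_NAME = re.compile(r"\bexport\s+([A-Za-z_][A-Za-z0-9_]*)=")

def extract_env_payloads(text: str) -> list[str]:
    """Return full NAME=value payloads (the behavior this PR removes)."""
    return EXPORT_FULL.findall(text)

def extract_env_names(text: str) -> list[str]:
    """Return only the exported variable names."""
    return EXPORT_NAME.findall(text)

snippet = (
    "export OPENAI_API_KEY=sk-test-secret\n"
    "export API_BASE=https://example.invalid/v1"
)

print(extract_env_payloads(snippet))
# ['OPENAI_API_KEY=sk-test-secret', 'API_BASE=https://example.invalid/v1']
print(extract_env_names(snippet))
# ['OPENAI_API_KEY', 'API_BASE']
```

Keeping the name for every variable, rather than trying to guess which values are sensitive, avoids needing a secret-detection heuristic in this first fix.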

Change

  • update env-var extraction to keep only the variable name
  • keep the rest of the personalization flow unchanged
  • add a regression test showing the extractor does not retain secret values
  • add a prompt-level regression test showing exported secret values are not replayed into the runtime system prompt
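The two regression tests might take roughly the following shape. This is a sketch under stated assumptions: `extract_env_names` and `build_prompt_with_rules` are hypothetical stubs standing in for the real extractor and for `build_runtime_system_prompt(...)`, whose actual signatures are not shown in this PR description.

```python
import re

SECRET = "sk-test-secret"
CONVERSATION = f"Run this first: export OPENAI_API_KEY={SECRET}"

def extract_env_names(text: str) -> list[str]:
    """Hypothetical stub: keep only exported variable names."""
    return re.findall(r"\bexport\s+([A-Za-z_][A-Za-z0-9_]*)=", text)

def build_prompt_with_rules(base: str, env_names: list[str]) -> str:
    """Hypothetical stub: inject remembered env-var names into a prompt."""
    rules = "\n".join(f"- env var available: {name}" for name in env_names)
    return f"{base}\n\n# Local rules\n{rules}"

def test_extractor_does_not_retain_secret_value() -> None:
    names = extract_env_names(CONVERSATION)
    assert names == ["OPENAI_API_KEY"]
    assert all(SECRET not in name for name in names)

def test_prompt_does_not_replay_secret_value() -> None:
    prompt = build_prompt_with_rules(
        "You are a coding assistant.", extract_env_names(CONVERSATION)
    )
    # The variable name is still visible; the value is not.
    assert "OPENAI_API_KEY" in prompt
    assert SECRET not in prompt

if __name__ == "__main__":
    test_extractor_does_not_retain_secret_value()
    test_prompt_does_not_replay_secret_value()
    print("ok")
```

Asserting at both levels is deliberate: the extractor test pins the root cause, while the prompt-level test would still catch a leak if some other code path reintroduced the value later in the pipeline.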

Before / After

Before

  • export OPENAI_API_KEY=sk-test-secret -> personalization stores OPENAI_API_KEY=sk-test-secret
  • later sessions replay sk-test-secret via local rules in the system prompt

After

  • export OPENAI_API_KEY=sk-test-secret -> personalization stores OPENAI_API_KEY
  • later sessions can still see that the variable exists, but not the secret value

Validation

  • PYTHONPATH=src pytest -q tests/test_personalization/test_extractor.py tests/test_prompts/test_claudemd.py
  • PYTHONPATH=src ruff check src tests
  • targeted regression coverage added for extraction + prompt assembly

Notes

  • I intentionally kept this PR narrow: it fixes the confirmed secret-persistence path without redesigning the whole personalization feature.
  • A full pytest -q run in my local environment is currently blocked by unrelated collection errors caused by a missing optional pyperclip dependency; I did not change any of the affected command/UI code paths.

Personalization currently captures full `export NAME=value` payloads and
re-injects them through local rules into future system prompts. Narrow the
environment-variable extraction to the variable name so the feature can still
remember environment hints without persisting secret material.

Constraint: Keep personalization's environment-hint workflow intact
Rejected: Remove env_var extraction entirely | larger behavior change than needed for a first fix
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Do not persist raw env var values without a separate sensitivity model and tests
Tested: PYTHONPATH=src pytest -q tests/test_personalization/test_extractor.py tests/test_prompts/test_claudemd.py
Tested: PYTHONPATH=src ruff check src tests
Not-tested: Full pytest suite in this environment (collection fails because optional pyperclip dependency is missing)
Related: HKUDS#149
shaun0927 force-pushed the fix/personalization-secret-sanitization branch from 27b1f49 to 334101e on April 16, 2026, 14:53
tjb-tech merged commit 3b39550 into HKUDS:main on Apr 17, 2026
arik08 pushed a commit to arik08/MyHarness that referenced this pull request Apr 26, 2026

Development

Successfully merging this pull request may close these issues.

[Bug]: personalization stores exported secret values in local_rules and re-injects them into future prompts
