-
Notifications
You must be signed in to change notification settings - Fork 10
fix: terminate call-graph alias fixpoint on oscillating rebinds (#1247) #1259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 4 commits
Commits
Show all changes
26 commits
Select commit
Hold shift + click to select a range
9feb7bc
fix: terminate call-graph alias fixpoint on oscillating rebinds
mldangelo 467e66b
Merge remote-tracking branch 'origin/main' into mdangelo/codex/reviewβ¦
mldangelo-oai cebd962
fix: preserve propagation when bounding alias cycles
mldangelo-oai d78c997
fix: bound assignment alias propagation work
mldangelo-oai 26655ad
fix: fail closed on cyclic alias propagation
mldangelo-oai cbbbd10
fix: converge stable alias rebind states
mldangelo-oai 21fcf1a
Merge remote-tracking branch 'origin/main' into mdangelo/codex/reviewβ¦
mldangelo-oai c617e8e
fix: preserve findings on incomplete call-graph analysis
mldangelo-oai 3572995
fix: preserve startup-hook findings on analysis limits
mldangelo-oai 42e7702
Merge remote-tracking branch 'origin/main' into mdangelo/codex/reviewβ¦
mldangelo-oai 3acc6a8
fix: fail closed on conditional alias rebinding
mldangelo-oai f670f86
fix: retain aliases and findings across analysis limits
mldangelo-oai dbcdffb
fix: preserve deterministic loop-else alias results
mldangelo-oai 6cd7de3
fix: track ambiguous alias reads before overwrites
mldangelo-oai 78eb56d
fix: fail closed during torch reference filtering
mldangelo-oai e37bce1
fix: preserve findings across alias ambiguity limits
mldangelo-oai 8f2c0b4
fix: propagate ambiguous aliases in installed packages
mldangelo-oai 183869a
fix: preserve deterministic alias findings across limits
mldangelo-oai 95642a4
fix: resolve deterministic alias alternatives safely
mldangelo-oai 2d00183
fix: retain deterministic aliases before epilogues
mldangelo-oai b054f0a
Merge remote-tracking branch 'origin/main' into mdangelo/codex/reviewβ¦
mldangelo-oai 8ca3b53
fix: track ambiguous alias calls before overwrites
mldangelo-oai 3bbeb4d
test: stabilize generic zip raw-scan fixture
mldangelo-oai 71925b1
fix: preserve same-line terminal alias ordering
mldangelo-oai 06b7a3c
fix: resolve deterministic terminal alias branches
mldangelo-oai 66049f2
fix: handle one-sided terminal alias paths
mldangelo-oai File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Some comments aren't visible on the classic Files Changed page.
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
190 changes: 190 additions & 0 deletions
190
packages/modelaudit-picklescan/tests/test_call_graph_assignment_alias_cycle.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,190 @@ | ||
| """Regression: assignment-alias fixpoint must terminate on oscillating binds. | ||
|
|
||
| A module that binds the same name in both branches of an ``if``/``else`` | ||
| (e.g. the ``imaplib``/``http.server``/``nntplib`` stdlib ``__main__`` blocks) | ||
| makes ``_collect_assignment_aliases`` oscillate the alias between two values | ||
| without ever growing the dict. The dict-size guard never trips, so the | ||
| fixpoint loop spins forever and the whole scan hangs (GitHub issue #1247). | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| import ast | ||
| import threading | ||
| from importlib.util import find_spec | ||
|
|
||
| import pytest | ||
|
|
||
| import modelaudit_picklescan.call_graph as call_graph | ||
|
github-code-quality[bot] marked this conversation as resolved.
Fixed
|
||
| from modelaudit_picklescan import PickleReport, Severity, scan_bytes | ||
| from modelaudit_picklescan.api import _RUST_EXTENSION_MODULE | ||
| from modelaudit_picklescan.call_graph import ( | ||
| _CallGraphAnalysisLimitError, | ||
| _collect_assignment_aliases, | ||
| _collect_local_defs, | ||
| _module_level_statements, | ||
| ) | ||
|
|
||
| _OSCILLATING_MODULE_SOURCE = """\ | ||
| class A: | ||
| pass | ||
|
|
||
|
|
||
| class B: | ||
| pass | ||
|
|
||
|
|
||
| if cond: | ||
| m = A() | ||
| else: | ||
| m = B() | ||
| """ | ||
|
|
||
| _DEPENDENT_CYCLE_SOURCE = """\ | ||
| class A: | ||
| pass | ||
|
|
||
|
|
||
| class B: | ||
| pass | ||
|
|
||
|
|
||
| a = A() | ||
| a = b | ||
| b = B() | ||
| b = a | ||
| c = a | ||
| """ | ||
|
|
||
|
|
||
| def _run_with_timeout(target: object, timeout: float = 10.0) -> None: | ||
| thread = threading.Thread(target=target) # type: ignore[arg-type] | ||
| thread.daemon = True | ||
| thread.start() | ||
| thread.join(timeout) | ||
| if thread.is_alive(): | ||
| pytest.fail(f"call-graph analysis did not terminate within {timeout}s") | ||
|
|
||
|
|
||
| def test_collect_assignment_aliases_terminates_on_branch_rebind() -> None: | ||
| tree = ast.parse(_OSCILLATING_MODULE_SOURCE) | ||
| statements = _module_level_statements(tree) | ||
| local_defs = _collect_local_defs(statements) | ||
| local_class_targets = {"testmod.A", "testmod.B"} | ||
|
|
||
| result: dict[str, dict[str, str]] = {} | ||
|
|
||
| def _collect() -> None: | ||
| result["aliases"] = _collect_assignment_aliases( | ||
| statements, | ||
| "testmod", | ||
| {}, | ||
| local_defs, | ||
| local_class_targets, | ||
| ) | ||
|
|
||
| _run_with_timeout(_collect) | ||
|
|
||
| aliases = result["aliases"] | ||
| # The rebinding name still resolves to one of the two local classes; the | ||
| # exact branch is order-dependent, but the loop must converge. | ||
| assert aliases.get("m") in {"testmod.A", "testmod.B"} | ||
|
|
||
|
|
||
| def test_collect_assignment_aliases_terminates_after_cyclic_dependency_propagation() -> None: | ||
| tree = ast.parse(_DEPENDENT_CYCLE_SOURCE) | ||
| statements = _module_level_statements(tree) | ||
| local_defs = _collect_local_defs(statements) | ||
| local_class_targets = {"testmod.A", "testmod.B"} | ||
|
|
||
| result: dict[str, dict[str, str]] = {} | ||
|
|
||
| def _collect() -> None: | ||
| result["aliases"] = _collect_assignment_aliases( | ||
| statements, | ||
| "testmod", | ||
| {}, | ||
| local_defs, | ||
| local_class_targets, | ||
| ) | ||
|
|
||
| _run_with_timeout(_collect) | ||
|
|
||
| assert result["aliases"].get("c") in local_class_targets | ||
|
|
||
|
|
||
| def test_collect_assignment_aliases_fails_closed_on_long_period_cycles() -> None: | ||
| periods = (7, 11, 13, 17, 19) | ||
| source_lines: list[str] = [] | ||
| local_class_targets: set[str] = set() | ||
| for ring_index, period in enumerate(periods): | ||
| for position in range(period): | ||
| class_name = f"Ring{ring_index}Class{position}" | ||
| variable_name = f"ring_{ring_index}_{position}" | ||
| source_lines.extend((f"class {class_name}:", " pass", "", f"{variable_name} = {class_name}()", "")) | ||
| local_class_targets.add(f"testmod.{class_name}") | ||
| for position in range(period): | ||
| next_position = (position + 1) % period | ||
| source_lines.append(f"ring_{ring_index}_{position} = ring_{ring_index}_{next_position}") | ||
|
|
||
| tree = ast.parse("\n".join(source_lines)) | ||
| statements = _module_level_statements(tree) | ||
| local_defs = _collect_local_defs(statements) | ||
| result: dict[str, bool] = {} | ||
|
|
||
| def _collect() -> None: | ||
| with pytest.raises(_CallGraphAnalysisLimitError): | ||
| _collect_assignment_aliases( | ||
| statements, | ||
| "testmod", | ||
| {}, | ||
| local_defs, | ||
| local_class_targets, | ||
| ) | ||
| result["limited"] = True | ||
|
|
||
| _run_with_timeout(_collect) | ||
|
|
||
| assert result == {"limited": True} | ||
|
|
||
|
|
||
| def test_assignment_alias_limit_is_not_hidden_by_safe_entrypoint_wrapper(monkeypatch: pytest.MonkeyPatch) -> None: | ||
| def _raise_limit(_function_name: str) -> tuple[str, ...]: | ||
| raise _CallGraphAnalysisLimitError("assignment alias limit") | ||
|
|
||
| monkeypatch.setattr(call_graph, "_call_graph_entrypoints", _raise_limit) | ||
| call_graph._safe_call_graph_entrypoints.cache_clear() | ||
|
|
||
| with pytest.raises(_CallGraphAnalysisLimitError, match="assignment alias limit"): | ||
| call_graph._safe_call_graph_entrypoints("long_period.module") | ||
|
|
||
|
|
||
| def _stack_global_payload(module: str, name: str) -> bytes: | ||
| def _operand(value: str) -> bytes: | ||
| data = value.encode() | ||
| return b"\x8c" + bytes([len(data)]) + data | ||
|
|
||
| return b"\x80\x04" + _operand(module) + _operand(name) + b"\x93" + _operand("proof_of_bypass") + b"\x85R." | ||
|
|
||
|
|
||
| @pytest.mark.skipif( | ||
| find_spec(_RUST_EXTENSION_MODULE) is None, | ||
| reason="Rust picklescan extension is not built", | ||
| ) | ||
| @pytest.mark.skipif( | ||
| find_spec("imaplib") is None, | ||
| reason="imaplib stdlib module is unavailable", | ||
| ) | ||
| def test_scan_imaplib_reference_terminates_and_flags() -> None: | ||
| """The issue #1247 proof-of-concept must finish and stay flagged.""" | ||
| payload = _stack_global_payload("imaplib", "test") | ||
| result: dict[str, PickleReport] = {} | ||
|
|
||
| def _scan() -> None: | ||
| result["report"] = scan_bytes(payload) | ||
|
|
||
| _run_with_timeout(_scan) | ||
|
|
||
| report = result["report"] | ||
| severities = {finding.severity for finding in report.findings} | ||
| assert Severity.CRITICAL in severities | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.