Optimize --changed-dependents=transitive when filter is specified #23224

Open
jasonwbarnett wants to merge 1 commit into pantsbuild:main from altana-ai:optimize-changed-filter-pr
@jasonwbarnett
Contributor

Summary

Optimizes --changed-since with --changed-dependents=transitive when a target filter (--tag, --filter-target-type) is also specified.

The current implementation builds the full reverse dependency graph for ALL targets in the repo via map_addresses_to_dependents(). This requires resolving dependencies (including Python import inference) for every single target, which is very slow in large repos (52K+ targets).

Approach

When specs_filter.is_specified and dependents == TRANSITIVE, we invert the approach:

  1. Filter AllUnexpandedTargets to find targets matching the filter (e.g., python_test targets without the integration tag)
  2. Forward BFS from those matched targets, resolving deps only for reachable targets
  3. Build a restricted reverse dependency map from just the visited subgraph
  4. BFS from the changed file owners through the reverse map to find which matched targets are transitively affected

This skips dependency resolution for targets unreachable from the filtered set (Docker, Helm, Shell targets, etc. when filtering for python_test).
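
The four steps above can be sketched in plain Python. This is a minimal illustration, not the code in this PR: `Address` is a string stand-in, and `affected_filtered_targets` and `resolve_deps` are hypothetical names that play the role of the engine's dependency resolution. The toy graph at the bottom is invented for the example.

```python
from collections import defaultdict, deque
from typing import Callable, Dict, Iterable, List, Set

Address = str  # hypothetical stand-in for a real target address


def affected_filtered_targets(
    matched: Iterable[Address],
    changed_owners: Iterable[Address],
    resolve_deps: Callable[[Address], Iterable[Address]],
) -> Set[Address]:
    """Return the matched targets that transitively depend on a changed owner."""
    matched_set = set(matched)  # step 1 is assumed done by the caller

    # Step 2: forward BFS from the matched targets. Dependencies are
    # resolved only for targets reachable from the filtered set.
    visited: Set[Address] = set(matched_set)
    queue = deque(matched_set)
    # Step 3: reverse map restricted to the visited subgraph.
    reverse: Dict[Address, Set[Address]] = defaultdict(set)
    while queue:
        addr = queue.popleft()
        for dep in resolve_deps(addr):
            reverse[dep].add(addr)
            if dep not in visited:
                visited.add(dep)
                queue.append(dep)

    # Step 4: BFS from the changed owners through the restricted reverse map.
    # Owners outside the visited subgraph cannot affect any matched target.
    seen: Set[Address] = {o for o in changed_owners if o in visited}
    queue = deque(seen)
    while queue:
        addr = queue.popleft()
        for dependent in reverse.get(addr, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen & matched_set


# Toy graph (invented): docker_img is unreachable from the test targets.
GRAPH: Dict[Address, List[Address]] = {
    "test_a": ["lib_a"],
    "test_b": ["lib_b"],
    "lib_a": ["util"],
    "lib_b": [],
    "util": [],
    "docker_img": ["lib_a"],
}
resolved: List[Address] = []


def toy_resolve_deps(addr: Address) -> List[Address]:
    resolved.append(addr)  # record which targets needed dep resolution
    return GRAPH[addr]


affected = affected_filtered_targets(["test_a", "test_b"], ["util"], toy_resolve_deps)
```

In the toy graph, `docker_img` is never passed to `toy_resolve_deps`, which mirrors the saving described above for targets unreachable from the filtered set.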

Results

Tested on a monorepo with ~53K targets and ~2,800 matching test targets:

| Metric | Before | After |
| --- | --- | --- |
| Wall time (from sources) | 3m39s | 2m42s |
| Targets requiring dep resolution | ~53K | ~24K |
| Improvement | | ~26% faster |

  • Output is identical to the unoptimized path (verified via sorted diff)
  • All 10 integration tests in changed_integration_test.py pass
  • Only activates when specs_filter.is_specified AND dependents == TRANSITIVE; all other code paths unchanged
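
As a quick check on the numbers, the ~26% figure matches the relative reduction in wall time between the two reported measurements. The helper name `to_seconds` is only for this illustration.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds wall time to seconds."""
    return minutes * 60 + seconds


before = to_seconds(3, 39)  # 3m39s
after = to_seconds(2, 42)   # 2m42s

# Fraction of wall time saved relative to the unoptimized run.
time_reduction = (before - after) / before

# Fraction of targets that no longer need dependency resolution,
# using the rounded ~53K and ~24K figures from the results.
target_reduction = (53_000 - 24_000) / 53_000

print(f"wall time reduced by {time_reduction:.1%}")
print(f"targets resolved reduced by {target_reduction:.1%}")
```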

Context

This addresses the same performance issue as the experimental branch ts/try-faster-filter-dependents, but with a different strategy:

  • That branch computed individual TransitiveTargetsRequest per matched target (N separate BFS walks)
  • This PR does a single combined forward BFS from all matched targets, builds a local reverse map, then walks from the owners through that map: one forward traversal instead of N
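
The difference between the two strategies can be illustrated by counting node visits on a toy graph. This sketch assumes nothing about how the Pants engine memoizes dependency resolution; the graph and the `bfs` helper are invented for illustration, and it only shows that per-target walks revisit shared dependencies while one combined walk visits each node once.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Graph = Dict[str, List[str]]


def bfs(graph: Graph, roots: Iterable[str]) -> Tuple[Set[str], int]:
    """Forward BFS; returns (reachable nodes, number of nodes dequeued)."""
    visited: Set[str] = set(roots)
    queue = deque(visited)
    visits = 0
    while queue:
        node = queue.popleft()
        visits += 1
        for dep in graph.get(node, []):
            if dep not in visited:
                visited.add(dep)
                queue.append(dep)
    return visited, visits


# Three test targets that share the same library chain (invented graph).
graph: Graph = {
    "test_a": ["lib"],
    "test_b": ["lib"],
    "test_c": ["lib"],
    "lib": ["util"],
    "util": [],
}
matched = ["test_a", "test_b", "test_c"]

# N separate walks: shared nodes (lib, util) are visited once per matched target.
per_target_results = [bfs(graph, [t]) for t in matched]
per_target_visits = sum(visits for _, visits in per_target_results)
per_target_union = set().union(*(closure for closure, _ in per_target_results))

# One combined walk: every reachable node is visited exactly once.
combined_closure, combined_visits = bfs(graph, matched)
```

Both strategies reach the same set of targets; only the amount of repeated traversal differs.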

Related: #15544

🤖 Generated with Claude Code

When using --changed-since with --changed-dependents=transitive AND a
target filter (--tag, --filter-target-type), the current implementation
builds the full reverse dependency graph for ALL targets in the repo via
map_addresses_to_dependents(). This requires resolving dependencies
(including Python import inference) for every single target, which is
very slow in large repos (52K+ targets).

This commit adds an optimized code path that inverts the approach: instead
of building the full reverse graph, it performs a forward BFS from only the
targets matching the filter, builds a restricted reverse dependency map from
just the visited subgraph, then walks from the changed owners to find which
filtered targets are affected.

For a monorepo with ~53K targets and ~2,800 matching test targets, this
reduces the number of targets requiring dependency resolution from 53K to
~24K (the forward closure of test targets), yielding a ~26% speedup in
wall-clock time.

The optimization only activates when specs_filter.is_specified AND
dependents == TRANSITIVE. All other code paths are unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>