Add persistent dependency inference cache for incremental --changed-dependents #23228

Open
jasonwbarnett wants to merge 6 commits into pantsbuild:main from altana-ai:persistent-dep-cache
@jasonwbarnett
Contributor

Summary

Adds an opt-in persistent disk cache for the dependency graph computed by map_addresses_to_dependents(). When enabled via --incremental-dependents-enabled, the forward dependency graph is serialized to ~/.cache/pants/incremental_dep_graph_v2.json after each run and loaded on the next run. Only targets whose source files have changed (by SHA-256 content hash) need their dependencies re-resolved.

This dramatically reduces wall time for --changed-dependents=transitive in large repos with many targets.

Motivation

In a monorepo with ~53K targets, pants --changed-since=HEAD~3 --changed-dependents=transitive filter takes ~3.5 minutes because map_addresses_to_dependents() calls resolve_dependencies() for every target — even when pantsd is warm. The rule engine's in-memory memoization is invalidated by any filesystem change, and the AllUnexpandedTargetsAddressToDependents cascade forces full recomputation each time.

The persistent cache breaks this cycle: even on a cold pantsd start (fresh CI agent), previously computed dependency edges are reused for unchanged targets.

Results

Tested on a monorepo with 52,927 targets:

| Run | Time | Dep graph | Targets resolved |
|---|---|---|---|
| Cold (no cache) | 3m22s | 149s | 52,927 fresh |
| Warm (from cache) | 43s | 1.6s | 0 fresh, 52,927 cached |

  • ~5x speedup on warm cache
  • 100% identical output vs uncached run
  • Cache file: 29MB on disk, 1.3MB compressed (portable via S3 for CI)
  • SHA-256 content hashing makes cache portable across machines (git clone mtimes don't matter)

Design

New subsystem: --incremental-dependents-enabled

Opt-in flag. When disabled (default), behavior is completely unchanged.

Cache format

JSON file at ~/.cache/pants/incremental_dep_graph_v2.json:

```json
{
  "version": 2,
  "buildroot": "/path/to/repo",
  "entries": {
    "src/python/foo/bar.py:lib": {
      "fingerprint": "<sha256>",
      "deps": ["src/python/baz/qux.py:lib", "3rdparty/python:requests"]
    }
  }
}
```
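
For readers who want to see how a file in this shape might be consumed, here is a minimal sketch. It is not the PR's implementation: `load_entries` is a hypothetical helper, and rejecting a cache whose `buildroot` differs is an assumption on my part. Any unreadable, malformed, or mismatched file is treated as an empty cache, so every target falls back to normal resolution.

```python
from __future__ import annotations

import json
from pathlib import Path

CACHE_VERSION = 2  # matches the "version" field in the format above


def load_entries(cache_path: Path, buildroot: str) -> dict[str, dict]:
    """Return cached entries, or {} if the file is missing, corrupt, or mismatched."""
    try:
        data = json.loads(cache_path.read_text())
    except (OSError, ValueError):
        # Missing file, unreadable file, or invalid JSON: behave like a cold cache.
        return {}
    if not isinstance(data, dict):
        return {}
    # Assumption: a cache written for another format version or buildroot is ignored.
    if data.get("version") != CACHE_VERSION or data.get("buildroot") != buildroot:
        return {}
    entries = data.get("entries")
    return entries if isinstance(entries, dict) else {}
```

Returning an empty mapping rather than raising keeps the cache strictly advisory: the worst case is a full cold run.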

Fingerprinting

Each target's cache key is SHA-256 of:

  • Its BUILD file content
  • Its source file content (for file-level generated targets)

This is ~1 second for 18K files and is fully portable across machines.
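
A content-based fingerprint along these lines could be sketched as follows. This is an illustration only: `sha256_file` and `target_fingerprint` are hypothetical names, and the way the two digests are combined (concatenated with a NUL separator, then hashed again) is an assumption, not necessarily what the PR does.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file's bytes in 1 MiB chunks so large files are not read at once."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def target_fingerprint(build_file: Path, source_file: Path | None = None) -> str:
    """Combine the BUILD file digest with the source file digest, if there is one."""
    h = hashlib.sha256()
    h.update(sha256_file(build_file).encode())
    if source_file is not None:
        h.update(b"\0")  # separator so the two digests cannot run together
        h.update(sha256_file(source_file).encode())
    return h.hexdigest()
```

Because only file bytes feed the hash, a fresh `git clone` with new mtimes produces the same fingerprints as the machine that wrote the cache.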

Safety

  • Cache misses are safe: target gets re-resolved via resolve_dependencies() as normal
  • Stale cache entries (deleted targets) are dropped during the merge step
  • Atomic writes prevent corruption (write to .tmp, then os.replace)
  • Version field enables cache format evolution
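
The merge step and the atomic write described above might look roughly like this. It is a sketch under stated assumptions, not the PR's code: `merge_entries`, `save_atomically`, and the `resolve` callback are hypothetical names, with `resolve` standing in for the real resolve_dependencies() call.

```python
from __future__ import annotations

import json
import os
from pathlib import Path
from typing import Callable, Iterable


def merge_entries(
    cached: dict[str, dict],
    fingerprints: dict[str, str],
    resolve: Callable[[str], Iterable[str]],
) -> dict[str, dict]:
    """Reuse cached deps when the fingerprint matches; otherwise re-resolve.

    Only targets present in `fingerprints` (the current repo state) are kept,
    so entries for deleted targets are dropped.
    """
    merged: dict[str, dict] = {}
    for spec, fingerprint in fingerprints.items():
        entry = cached.get(spec)
        if entry is not None and entry.get("fingerprint") == fingerprint:
            merged[spec] = entry  # cache hit
        else:
            merged[spec] = {"fingerprint": fingerprint, "deps": sorted(resolve(spec))}
    return merged


def save_atomically(cache_path: Path, buildroot: str, entries: dict[str, dict]) -> None:
    """Write to a sibling .tmp file, then swap it into place with os.replace."""
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    tmp_path = cache_path.with_name(cache_path.name + ".tmp")
    payload = {"version": 2, "buildroot": buildroot, "entries": entries}
    tmp_path.write_text(json.dumps(payload))
    os.replace(tmp_path, cache_path)  # readers see the old file or the new one
```

Keeping the temporary file in the same directory matters: `os.replace` is only atomic when source and destination are on the same filesystem.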

Files changed

  • src/python/pants/backend/project_info/dependents.py — Modified map_addresses_to_dependents() to use incremental mode when enabled
  • src/python/pants/backend/project_info/incremental_dependents.py (new): cache persistence, fingerprinting, IncrementalDependents subsystem

CI usage

The cache can be shared across ephemeral CI agents via S3:

```shell
# Download cache from S3 (100ms)
aws s3 cp s3://ci-cache/dep-graph/latest.json.gz ~/.cache/pants/incremental_dep_graph_v2.json.gz
gunzip ~/.cache/pants/incremental_dep_graph_v2.json.gz

# Run with cache (43s vs 3.5min)
pants --incremental-dependents-enabled \
    --changed-since=HEAD~3 --changed-dependents=transitive filter
```

🤖 Generated with Claude Code

Jason Barnett and others added 4 commits April 7, 2026 20:32
Implement IncrementalDependents subsystem that persists the forward
dependency graph to disk. When enabled via --incremental-dependents-enabled,
only targets whose BUILD files or source files have changed (based on
mtime+size fingerprinting) need their dependencies re-resolved. This
dramatically reduces wall time for --changed-dependents=transitive in
large monorepos by avoiding redundant dependency inference on unchanged
targets across pantsd restarts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…arse

Address.parse() fails on bare spec strings like "src/python/foo.py:bar"
because it expects "//" prefix. Instead, build a spec→Address lookup dict
from AllUnexpandedTargets for O(1) resolution of cached dep specs.

Also simplify CachedEntry to store deps as spec strings directly rather
than structured JSON tuples, and remove now-unused serialization helpers.

Results: 52927-target monorepo
- Cold cache: 3m12s (same as before, writes 29MB cache)
- Warm cache: 38s (dep graph in 1.6s, 52927 targets from cache)
- 5x speedup on warm cache, 100% identical output

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mtime-based fingerprinting fails across machines because git clone sets
all file mtimes to the checkout timestamp, making the cache useless on
CI agents. SHA-256 content hashing costs only ~5 seconds more for 18K
files but makes the cache fully portable.

Benchmark (52,927 targets):
- Cold cache: 3m22s (writes cache)
- Warm cache: 43s (sha256 fingerprints, 100% cache hits)
- Cross-machine: cache is portable via S3 (1.3MB compressed)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
These .claude/worktrees/ entries were accidentally staged by git add -A
and are not part of the persistent dep cache changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jasonwbarnett jasonwbarnett force-pushed the persistent-dep-cache branch from 159893c to 18fdfb0 Compare April 7, 2026 20:32
- Unit tests for CachedEntry, save/load roundtrip, JSON edge cases
- Unit tests for SHA-256 file hashing
- Unit tests for compute_source_fingerprint (BUILD changes, source changes, stability)
- Integration tests verifying incremental mode matches standard mode
  for direct deps, transitive deps, empty inputs, and special-cased deps
- Fix missing Address import in incremental_dependents.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cburroughs
Contributor

~53K targets

pants --changed-since=HEAD~3

Could you elaborate (here or in an issue) on your setup? Of the 53k targets, what is the rough breakdown by types?

I've most often seen --changed-since used with a filter to select on an "uncommon" type, such as "deploy all the helm stuff" or "publish all the docker images". From #23224 I take it you are filtering on a common type (like python_sources), is that correct?

I know you have looked at this from a few different angles, does performance get worse with:

  • The depth of --changed-since=HEAD~3 (is 4 slower than 3)
  • Length of total git history?
  • Some other https://github.com/github/git-sizer style issue?
  • Or only/primarily the number of targets?

If I wanted to make a case like yours -- or even more pathological! -- what would I need?

- Replace IncrementalDependents subsystem with PANTS_INCREMENTAL_DEPENDENTS
  env var to avoid "No such options scope" errors in tests that use
  dependents rules without registering the subsystem
- Add release notes entry to docs/notes/2.32.x.md
- Fix unused import (textwrap) and formatting issues caught by CI linters
- All tests pass: dependents_test, incremental_dependents_test,
  py_constraints_test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jasonwbarnett
Contributor Author

~53K targets

pants --changed-since=HEAD~3

Could you elaborate (here or in an issue) on your setup? Of the 53k targets, what is the rough breakdown by types?

I've most often seen --changed-since used with a filter to select on an "uncommon" type, such as "deploy all the helm stuff" or "publish all the docker images". From #23224 I take it you are filtering on a common type (like python_sources), is that correct?

I know you have looked at this from a few different angles, does performance get worse with:

* The depth of --changed-since=HEAD~3 (is 4 slower than 3)

* Length of total git history?

* Some other https://github.com/github/git-sizer style issue?

* Or only/primarily the number of targets?

If I wanted to make a case like yours -- or even more pathological! -- what would I need?

I'm going to write up a comprehensive explanation of how I arrived at the conclusion and all of the supporting evidence. Give me a couple of hours to get it done.

@jasonwbarnett
Contributor Author

jasonwbarnett commented Apr 8, 2026

Performance Investigation: --changed-dependents=transitive on a 53K-target Monorepo

Environment

  • Pants versions tested: 2.30.0 (pre-built binary), 2.32.0.dev7 (from source, upstream main)
  • Hardware: 8-CPU arm64 Linux, 30GB RAM
  • Repository: ~35K git commits, 53K Pants targets
  • All tests run with --no-pantsd to simulate CI (ephemeral agents, no warm daemon)

Target Breakdown

| Target Type | Count | % of Total |
|---|---|---|
| python_source | 20,850 | 39.4% |
| file | 15,422 | 29.1% |
| resource | 8,620 | 16.3% |
| python_test | 2,945 | 5.6% |
| python_sources (generators) | 1,767 | 3.3% |
| python_requirement | 1,263 | 2.4% |
| python_tests (generators) | 693 | 1.3% |
| shell_source | 303 | 0.6% |
| docker_image | 92 | 0.2% |
| Other (resources, distributions, etc.) | 972 | 1.8% |
| Total | 52,927 | 100% |

Benchmark Results

All times are wall-clock elapsed seconds. Pants version 2.32.0.dev7 from source unless noted.

Test 1: The Core Bottleneck — --changed-dependents vs not

| Command | Time | Output |
|---|---|---|
| filter (no dependents) | 37s | 0 targets |
| filter --changed-dependents=direct | 2m53s | 0 targets |
| filter --changed-dependents=transitive | 2m40s | 440 targets |

Key finding: Adding --changed-dependents=direct raises wall time from 37s to 2m53s, a 4.7x increase, even when the result is 0 additional targets. The entire cost is building the full reverse dependency graph via map_addresses_to_dependents().

Test 2: The --changed-since Depth Does NOT Matter

| Range | Changed Files | Time | Output |
|---|---|---|---|
| HEAD~1 | 9 files | 2m49s | 0 targets |
| HEAD~3 | 53 files | 2m40s | 440 targets |
| HEAD~10 | 79 files | 2m56s | 811 targets |

Times are within noise. The depth of --changed-since is irrelevant — the bottleneck is always map_addresses_to_dependents() which processes all 53K targets regardless of how many files changed.

Test 3: The Filter Type Does NOT Matter

| Filter | Time | Output |
|---|---|---|
| --filter-target-type=+python_test | 2m40s | 440 targets |
| --filter-target-type=+docker_image | 2m38s | 20 targets |

Same cost whether finding 440 test targets or 20 Docker targets. The filter is applied AFTER the full dependency graph is built.

Test 4: dependents Goal Shows the Same Bottleneck

| Command | Time | Output |
|---|---|---|
| dependents --transitive <single-file> | 2m50s | 1,601 dependents |
| dependencies <single-file> | 38s | 10 dependencies |
| list :: | 27s | 52,927 targets |

Computing forward dependencies for a single target: 38 seconds.
Computing reverse dependents for a single target: 2m50s (requires building the full reverse graph for ALL 53K targets).

Test 5: Warm pantsd Does NOT Help

| Run | Time |
|---|---|
| Cold pantsd | 2m44s |
| Warm pantsd (identical command) | 2m39s |
| Warm pantsd (different range) | 2m51s |

Warm pantsd provides essentially zero benefit for this operation. The map_addresses_to_dependents rule is recomputed on every invocation because it depends on AllUnexpandedTargets, which the Pants source describes as "relatively expensive to compute and frequently invalidated".

Test 6: Pre-built Binary vs From-Source

| Version | Time |
|---|---|
| Pants 2.30.0 (pre-built binary) | 2m56s |
| Pants 2.32.0.dev7 (from source) | 2m40s |

No meaningful difference. The bottleneck is the same in both versions.

Test 7: Work Unit Timing (from -linfo logs)

"Map all targets to their dependents" — reported as a long-running task at:
  60.2s elapsed
  90.1s elapsed
  120.0s elapsed

This single rule (map_addresses_to_dependents) accounts for ~120 seconds out of ~160 seconds of total execution (75% of wall time).

Root Cause Analysis

What map_addresses_to_dependents() Does

```python
@rule(desc="Map all targets to their dependents")
async def map_addresses_to_dependents(all_targets: AllUnexpandedTargets) -> AddressToDependents:
    dependencies_per_target = await concurrently(
        resolve_dependencies(
            DependenciesRequest(tgt.get(Dependencies), ...)
        )
        for tgt in all_targets  # ALL 52,927 targets
    )
    # Invert the forward deps to build the reverse map
    address_to_dependents = defaultdict(set)
    for tgt, dependencies in zip(all_targets, dependencies_per_target):
        for dependency in dependencies:
            address_to_dependents[dependency].add(tgt.address)
    return AddressToDependents(...)
```

This rule:

  1. Resolves AllUnexpandedTargets — every target in the repository (52,927)
  2. For each target, calls resolve_dependencies() which includes:
    • Parsing the target's BUILD file for explicit dependencies
    • Running dependency inference (Python import parsing, Docker COPY analysis, Shell source detection)
    • Resolving inferred module names to target addresses
  3. Inverts the forward dependency graph into a reverse mapping

Step 2 is the expensive part. Python import inference uses a Rust-based tree-sitter parser (fast per-file), but the per-target overhead of the rule engine — resolving imports to target addresses via the module mapper, handling ambiguity, validating results — adds up at 53K scale.

Why Warm pantsd Doesn't Help

map_addresses_to_dependents takes AllUnexpandedTargets as its sole input. AllUnexpandedTargets is a rule that scans the entire filesystem for BUILD files and resolves all targets. The Pants engine's InvalidationWatcher (inotify-based) detects any filesystem change and invalidates AllUnexpandedTargets, which cascades to invalidate AddressToDependents.

Even without actual file changes, the engine must re-verify that all BUILD files are unchanged, re-hash target definitions, and confirm the cached result is still valid. At 53K targets, this verification itself is non-trivial.

Why the Filter Doesn't Help

The filter (--filter-target-type=+python_test, --tag="-integration") is applied after map_addresses_to_dependents completes. The full reverse graph for all 53K targets is built first, then the result is filtered down.

This is a deliberate design choice (see pantsbuild/pants#15544): filtering before building the graph would cause missed dependents when a filtered-out target is an intermediate link in the dependency chain.
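
To make the intermediate-link problem concrete, here is a toy illustration in plain Python. It does not use any Pants APIs: the three-target graph, the type labels, and the helper names are all made up for the example.

```python
from __future__ import annotations

from collections import defaultdict, deque

# Hypothetical forward graph: target -> its direct dependencies.
FORWARD = {
    "tests/app_test.py": {"src/app.py"},
    "src/app.py": {"src/util.py"},
    "src/util.py": set(),
}
TYPES = {
    "tests/app_test.py": "python_test",
    "src/app.py": "python_source",
    "src/util.py": "python_source",
}


def invert(forward: dict[str, set[str]]) -> dict[str, set[str]]:
    """Turn target -> dependencies into dependency -> dependents."""
    reverse: dict[str, set[str]] = defaultdict(set)
    for target, deps in forward.items():
        for dep in deps:
            reverse[dep].add(target)
    return reverse


def transitive_dependents(reverse: dict[str, set[str]], roots: set[str]) -> set[str]:
    """Breadth-first walk over the reverse graph, starting from the changed targets."""
    seen: set[str] = set()
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for dependent in reverse.get(node, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


changed = {"src/util.py"}

# Filter AFTER building the full graph: the test is found via src/app.py.
full = transitive_dependents(invert(FORWARD), changed)
filter_after = {t for t in full if TYPES[t] == "python_test"}

# Filter BEFORE: src/app.py is excluded from the graph, so the chain is broken.
tests_only = {t: deps for t, deps in FORWARD.items() if TYPES[t] == "python_test"}
filter_before = transitive_dependents(invert(tests_only), changed)

print(filter_after)   # {'tests/app_test.py'}
print(filter_before)  # set()
```

The test depends on the changed file only through `src/app.py`, a `python_source` target. Dropping that target before inversion removes the only path from the change to the test.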

Conditions to Reproduce

To reproduce this performance characteristic, you need:

  1. Many targets (>30K, ideally >50K). The cost scales roughly linearly with target count.
  2. --changed-dependents=direct or --changed-dependents=transitive. Without this flag, the operation is fast (~30-40s) because it only finds owners of changed files, not their dependents.
  3. Any amount of changed files — even 0 changed files triggers the full graph build if --changed-dependents is set.

The target type distribution doesn't matter much. file and resource targets (which make up 45% of our targets) have trivial dependency inference, but they still contribute to the 53K targets that map_addresses_to_dependents must process.

Synthetic Reproduction

To create a synthetic test case:

```shell
# Create 50K targets in a fresh repo
mkdir big-repo && cd big-repo

pants init

# Add Python backend and interpreter constraints
cat > pants.toml <<'EOF'
[GLOBAL]
pants_version = "2.31.0"
backend_packages = ["pants.backend.python"]

[python]
interpreter_constraints = ["==3.11.*"]
EOF

# Ignore pants cache so git diff doesn't explode
echo '/.pants.*' > .gitignore

for i in $(seq 1 500); do
    mkdir -p "pkg${i}"
    for j in $(seq 1 100); do
        echo "x = $j" > "pkg${i}/file${j}.py"
    done
    echo 'python_sources()' > "pkg${i}/BUILD.pants"
done

git init && git add . && git commit -m "init"

# Modify an existing file
sed -i 's/x = 1/x = 999/' pkg1/file1.py
git add . && git commit -m "change"

time pants --changed-since=HEAD~ --changed-dependents=transitive list
```

@cburroughs

Summary

The performance issue is real, reproducible, and caused by map_addresses_to_dependents() resolving dependencies for ALL targets in the repo whenever --changed-dependents is used. The cost is O(N) where N = total target count, regardless of:

  • How many files changed
  • What target type is being filtered for
  • Whether pantsd is warm or cold
  • The depth of the git history

At 53K targets, this costs ~2m40s. The rule engine's in-memory caching doesn't help because AllUnexpandedTargets is invalidated on every invocation.

@cburroughs
Contributor

Thanks, this is very helpful. I'll do some analysis, but I'm out for the rest of the week and may not be able to post anything before then. A few clarifying questions:

  • https://github.com/altana-ai/pants-changed-dependents-perf-repro looks like all python_sources instead of the same percentage breakdown as in your repo, do I have that right?
  • The first thing I ran into was pants/pantsd needed 20 GiB of memory for something like pants --no-pantsd --changed-since=7722688 --changed-dependents=transitive. Are you also running into that problem for your real repo?

@jasonwbarnett
Contributor Author

jasonwbarnett commented Apr 8, 2026

Thanks, this is very helpful. I'll do some analysis, but I'm out for the rest of the week and may not be able to post anything before then. A few clarifying questions:

* https://github.com/altana-ai/pants-changed-dependents-perf-repro looks like all python_sources instead of the same percentage breakdown as in your repo, do I have that right?

* The first thing I ran into was pants/pantsd needed 20 GiB of memory for something like `pants --no-pantsd --changed-since=7722688 --changed-dependents=transitive`.  Are you also running into that problem for your real repo?
  • Yep, it's not the same breakdown. The all-python_sources repo was easier to build and demonstrates very similar behavior.
  • Yep, I find that it eats up a ton of memory. Apparently there is some memory inefficiency in the internals of Pants that both @benjyw and @sureshjoshi seem to be vaguely aware of.
