Summary
CodeGraph currently requires users to discover OS-level file descriptor / watcher exhaustion and manually mitigate it by killing daemons or adding --no-watch. That is too fragile for normal agent workflows where a developer may have many MCP clients and many project roots open at once.
This is a guardrail/defaults issue related to, but distinct from, the raw FD leak reports in #496 / #555 and the stale-process accumulation report in #579.
Environment where observed
- macOS 26.5 / Darwin 25.5.0, Apple Silicon arm64
- CodeGraph 0.9.8 plus older 0.9.7 daemons from prior sessions
- kern.maxfiles: 491520
- kern.maxfilesperproc: 245760
- Several MCP-capable agent clients used across multiple project roots
What happened
Even after moving away from a huge root-level workspace index to per-project indexes, CodeGraph still created enough watcher / open-file pressure that daemon logs showed system-level ENFILE: file table overflow errors.
Sanitized live snapshot:
before cleanup:
kern.num_files: 15771
kern.maxfiles: 491520
project daemon FD counts:
codegraph serve --mcp --path <large-go-project>: 2901 numeric FDs
codegraph serve --mcp --path <medium-flutter-project>: 599 numeric FDs
codegraph serve --mcp --path <small-swift-project>: 96 numeric FDs
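For anyone reproducing this, here is a minimal sketch of turning a snapshot like the one above into a usage percentage. It assumes the `name: value` line format that `sysctl kern.num_files kern.maxfiles` prints on macOS; the function names are made up for illustration and are not part of CodeGraph.

```python
def parse_sysctl(text: str) -> dict[str, int]:
    """Parse 'name: value' lines as printed by `sysctl <name> ...` on macOS."""
    values: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, raw = line.partition(":")
        # Skip lines without a colon or without a plain integer value.
        if sep and raw.strip().isdigit():
            values[name.strip()] = int(raw.strip())
    return values


def file_table_usage_percent(values: dict[str, int]) -> float:
    """Share of the system-wide file table in use at snapshot time."""
    return 100.0 * values["kern.num_files"] / values["kern.maxfiles"]


snapshot = """\
kern.num_files: 15771
kern.maxfiles: 491520
"""
usage = file_table_usage_percent(parse_sysctl(snapshot))
print(f"global file table in use: {usage:.1f}%")
```

A single snapshot only describes that instant. It says nothing about earlier peaks, so it is best read alongside the daemon logs rather than instead of them.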
The largest daemon was almost entirely regular-file descriptors:
REG 2887
PIPE 4
DIR 3
KQUEUE 3
CHR 2
unix 2
numeric_fds 2901
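A breakdown like this can be reproduced from `lsof -p <pid>` output. The sketch below assumes the default lsof column header (`COMMAND PID USER FD TYPE ...`) and counts only rows whose FD column starts with a digit, which is what "numeric FDs" means here. The helper name and the sample rows are illustrative, not real output from the affected machine.

```python
from collections import Counter


def tally_numeric_fds(lsof_text: str) -> Counter:
    """Count TYPE values for lsof rows whose FD column starts with a digit."""
    lines = lsof_text.splitlines()
    if not lines:
        return Counter()
    header = lines[0].split()
    fd_col, type_col = header.index("FD"), header.index("TYPE")
    counts: Counter = Counter()
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= max(fd_col, type_col):
            continue  # malformed or truncated row
        # Entries such as "21r" or "0u" are real descriptors;
        # "cwd" and "txt" are not.
        if fields[fd_col][:1].isdigit():
            counts[fields[type_col]] += 1
    return counts


# Made-up sample in lsof's default layout.
sample = """\
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
node 4242 dev cwd DIR 1,18 640 2 /work
node 4242 dev txt REG 1,18 100 3 /usr/local/bin/node
node 4242 dev 0u CHR 16,2 0t0 700 /dev/ttys002
node 4242 dev 21r REG 1,18 512 9001 /work/a.go
node 4242 dev 22r REG 1,18 512 9002 /work/b.go
node 4242 dev 23u KQUEUE count=0, state=0
"""
counts = tally_numeric_fds(sample)
for fd_type, n in counts.most_common():
    print(fd_type, n)
print("numeric_fds", sum(counts.values()))
```

Splitting on whitespace is only reliable for the leading columns, which is all this tally needs.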
The daemon log contained repeated errors like:
[CodeGraph] File watcher error {
error: "Error: ENFILE: file table overflow, realpath '<workspace>/internal/...'"
}
[CodeGraph] File watcher error {
error: "Error: ENFILE: file table overflow, lstat '<workspace>/internal/.../client_test.go'"
}
[CodeGraph] File watcher error {
error: "Error: ENFILE: file table overflow, scandir '<workspace>/internal/...'"
}
Killing stale/high-FD project daemons released pressure immediately:
after killing two stale project daemons:
kern.num_files: 12164
after killing the remaining project watcher:
kern.num_files: 12042
--no-watch was an effective local mitigation:
codegraph serve --mcp --no-watch
Why this needs product-level guardrails
This should not require manual OS debugging. On macOS, exhausting the global file table causes failures in unrelated processes. The user sees shells, browsers, IDEs, Docker, and background services behaving badly, not a clear CodeGraph error.
Also, powerful hardware does not make the failure mode acceptable. A machine with a high global FD limit still experienced CodeGraph-originated ENFILE logs. Developers using agents often have many repos and sessions active; CodeGraph should degrade gracefully under that workload.
Suggested behavior
I think CodeGraph should fail safe by default:
- Budget-aware watcher startup
  - Estimate files/directories to watch before enabling the watcher.
  - Compare against OS limits (kern.maxfiles, kern.maxfilesperproc, current kern.num_files on macOS; inotify limits on Linux).
  - If the projected watcher/indexer cost is risky, refuse watch mode or auto-fallback to --no-watch with a clear warning.
- FD telemetry in status/debug output
  - Add something like codegraph status --resources or codegraph debug resources.
  - Report current process FD count, watcher count if available, path root, client count, idle-timeout state, and OS budget percentage.
- Default MCP install should be conservative
  - For stdio MCP installs on macOS, consider installing args = ["serve", "--mcp", "--no-watch"] until watcher FD usage is bounded.
  - Or ask during codegraph install: "Enable live watcher? This can be expensive on large workspaces."
- Release resources when idle
  - If clients=0, close watchers and open file handles.
  - Rehydrate watcher state only when a client attaches or a query requires sync.
- Hard cap and warning
  - A daemon should never be allowed to hold tens of thousands of file descriptors without a loud warning.
  - A configurable cap such as CODEGRAPH_MAX_OPEN_FDS / --max-open-fds would be better than letting the OS global table fail.
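To make the budget check and hard cap concrete, here is a rough sketch of the decision I have in mind. Everything in it is hypothetical: the `FdBudget` type, the `choose_watch_mode` function, the default cap of 2048, and the 10% headroom share are illustrative values, not existing CodeGraph behavior or a tuned policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FdBudget:
    """OS limits and current usage; all values are supplied by the caller."""
    max_files: int            # e.g. kern.maxfiles
    max_files_per_proc: int   # e.g. kern.maxfilesperproc
    system_open_files: int    # e.g. kern.num_files
    process_open_fds: int     # this daemon's current FD count


def choose_watch_mode(
    estimated_watch_fds: int,
    budget: FdBudget,
    max_open_fds: int = 2048,                # hypothetical cap, cf. --max-open-fds
    system_headroom_fraction: float = 0.10,  # hypothetical policy value
) -> tuple[str, str]:
    """Return ("watch" or "no-watch", human-readable reason)."""
    projected = budget.process_open_fds + estimated_watch_fds
    per_proc_limit = min(max_open_fds, budget.max_files_per_proc)
    if projected > per_proc_limit:
        return "no-watch", (
            f"projected {projected} FDs exceeds per-process cap {per_proc_limit}"
        )
    # Let the watcher claim only a fraction of the free system-wide slots.
    free = budget.max_files - budget.system_open_files
    allowed = int(free * system_headroom_fraction)
    if estimated_watch_fds > allowed:
        return "no-watch", (
            f"watcher needs {estimated_watch_fds} FDs but only {allowed} "
            "fit in the system headroom share"
        )
    return "watch", f"projected {projected} FDs is within budget"


# Limits from this report, with the large daemon's 2887 regular-file FDs
# as the watcher estimate and its 14 other FDs as the baseline.
budget = FdBudget(
    max_files=491520,
    max_files_per_proc=245760,
    system_open_files=15771,
    process_open_fds=14,
)
mode, reason = choose_watch_mode(estimated_watch_fds=2887, budget=budget)
print(mode, "-", reason)
```

With these illustrative defaults the large Go project would fall back to no-watch with an explicit reason, while the small Swift project would still get a live watcher. The exact thresholds matter less than making the decision before the watcher starts and reporting why.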
Expected outcome
Users who work across many projects should be able to install CodeGraph once and not periodically debug global OS file-table exhaustion. If watch mode is too expensive for a workspace, CodeGraph should detect that, explain it, and keep the index/query path usable in no-watch/manual-sync mode.
Related: