MCP watch mode needs resource guardrails and safer defaults on macOS #628

## Summary

CodeGraph currently leaves it to users to notice OS-level file descriptor / watcher exhaustion and to mitigate it manually, either by killing daemons or by adding `--no-watch`. That is too fragile for normal agent workflows, where a developer may have many MCP clients and many project roots open at once.

This is a guardrail/defaults issue related to, but distinct from, the raw FD leak reports in #496 / #555 and the stale-process accumulation report in #579.

## Environment where observed

- macOS 26.5 / Darwin 25.5.0, Apple Silicon arm64
- CodeGraph `0.9.8` plus older `0.9.7` daemons from prior sessions
- `kern.maxfiles: 491520`
- `kern.maxfilesperproc: 245760`
- Several MCP-capable agent clients used across multiple project roots

## What happened

Even after moving away from a huge root-level workspace index to per-project indexes, CodeGraph still created enough watcher / open-file pressure that daemon logs showed system-level `ENFILE: file table overflow` errors.

Sanitized live snapshot:

```text
before cleanup:
  kern.num_files: 15771
  kern.maxfiles: 491520

project daemon FD counts:
  codegraph serve --mcp --path <large-go-project>:        2901 numeric FDs
  codegraph serve --mcp --path <medium-flutter-project>:   599 numeric FDs
  codegraph serve --mcp --path <small-swift-project>:       96 numeric FDs
```

The largest daemon was almost entirely regular-file descriptors:

```text
REG      2887
PIPE        4
DIR         3
KQUEUE      3
CHR         2
unix        2
numeric_fds 2901
```
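
For anyone who wants to reproduce a breakdown like this on their own daemons, here is a minimal sketch. It assumes the input is the field-format output of `lsof -p <pid> -F ft`, and the helper name `count_fd_types` is hypothetical. It is not necessarily how the numbers above were produced, and it is not part of CodeGraph.

```python
from collections import Counter


def count_fd_types(lsof_field_output: str) -> Counter:
    """Tally file types for numeric descriptors in `lsof -F ft` output.

    In field format each line starts with a one-letter tag:
    'p' = process id, 'f' = file descriptor, 't' = file type.
    Non-numeric descriptors such as 'cwd' and 'txt' are skipped.
    """
    counts: Counter = Counter()
    in_numeric_fd = False
    for line in lsof_field_output.splitlines():
        if not line:
            continue
        tag, value = line[0], line[1:]
        if tag == "p":
            # A new process section starts; forget the previous descriptor.
            in_numeric_fd = False
        elif tag == "f":
            in_numeric_fd = value.isdigit()
            if in_numeric_fd:
                counts["numeric_fds"] += 1
        elif tag == "t" and in_numeric_fd:
            counts[value] += 1
    return counts


# Hypothetical sample input, not taken from the daemons in this report.
SAMPLE = "\n".join([
    "p4242",
    "fcwd", "tDIR",
    "ftxt", "tREG",
    "f0", "tCHR",
    "f1", "tPIPE",
    "f12", "tREG",
    "f13", "tREG",
    "f14", "tKQUEUE",
])

sample_counts = count_fd_types(SAMPLE)
```

On the sample input, `sample_counts` holds two `REG` entries and one each of `CHR`, `PIPE`, and `KQUEUE`, for five numeric descriptors in total. The `cwd` and `txt` entries are ignored.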

The daemon log contained repeated errors like:

```text
[CodeGraph] File watcher error {
  error: "Error: ENFILE: file table overflow, realpath '<workspace>/internal/...'"
}
[CodeGraph] File watcher error {
  error: "Error: ENFILE: file table overflow, lstat '<workspace>/internal/.../client_test.go'"
}
[CodeGraph] File watcher error {
  error: "Error: ENFILE: file table overflow, scandir '<workspace>/internal/...'"
}
```

Killing stale/high-FD project daemons released pressure immediately:

```text
after killing two stale project daemons:
  kern.num_files: 12164

after killing the remaining project watcher:
  kern.num_files: 12042
```

`--no-watch` was an effective local mitigation:

```text
codegraph serve --mcp --no-watch
```

## Why this needs product-level guardrails

This should not require manual OS debugging. On macOS, exhausting the global file table causes failures in unrelated processes. What the user sees is shells, browsers, IDEs, Docker, and background services failing in ways that look unrelated to each other, not a clear CodeGraph error.

Powerful hardware does not make the failure mode acceptable either. The machine above has a high global FD limit (`kern.maxfiles: 491520`), and its CodeGraph daemons still logged `ENFILE` errors. Developers using agents often have many repos and sessions active; CodeGraph should degrade gracefully under that workload.

## Suggested behavior

I think CodeGraph should fail safe by default:

1. **Budget-aware watcher startup**
   - Estimate files/directories to watch before enabling the watcher.
   - Compare against OS limits (`kern.maxfiles`, `kern.maxfilesperproc`, current `kern.num_files` on macOS; inotify limits on Linux).
   - If the projected watcher/indexer cost is risky, refuse watch mode or auto-fallback to `--no-watch` with a clear warning.

2. **FD telemetry in status/debug output**
   - Add something like `codegraph status --resources` or `codegraph debug resources`.
   - Report current process FD count, watcher count if available, path root, client count, idle-timeout state, and OS budget percentage.

3. **Default MCP install should be conservative**
   - For stdio MCP installs on macOS, consider installing `args = ["serve", "--mcp", "--no-watch"]` until watcher FD usage is bounded.
   - Or ask during `codegraph install`: "Enable live watcher? This can be expensive on large workspaces."

4. **Release resources when idle**
   - If `clients=0`, close watchers and open file handles.
   - Rehydrate watcher state only when a client attaches or a query requires sync.

5. **Hard cap and warning**
   - A daemon should never be allowed to hold tens of thousands of file descriptors without a loud warning.
   - A configurable cap such as `CODEGRAPH_MAX_OPEN_FDS` / `--max-open-fds` would be better than letting the OS global table fail.
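
To make items 1 and 5 more concrete, here is a rough sketch of the kind of startup check I have in mind. It is written in Python only for brevity and is not CodeGraph code. Everything named here is hypothetical: `FdBudget`, `parse_sysctl`, `choose_watch_mode`, the `system_share` parameter, and its 5% default are illustrative, and `max_open_fds` stands in for the proposed (not yet existing) `CODEGRAPH_MAX_OPEN_FDS` / `--max-open-fds` setting. It also assumes the watcher costs roughly one descriptor per watched file, which matches the `REG`-heavy breakdown above but has not been verified against the watcher implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class FdBudget:
    """OS limits and current usage, e.g. as reported by sysctl on macOS."""
    num_files: int           # kern.num_files: open files system-wide
    max_files: int           # kern.maxfiles: system-wide limit
    max_files_per_proc: int  # kern.maxfilesperproc: per-process limit


def parse_sysctl(text: str) -> Dict[str, int]:
    """Parse 'name: value' lines such as 'kern.maxfiles: 491520'."""
    values: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            values[name.strip()] = int(value.strip())
    return values


def choose_watch_mode(
    estimated_fds: int,
    budget: FdBudget,
    max_open_fds: Optional[int] = None,
    system_share: float = 0.05,  # arbitrary illustrative threshold
) -> Tuple[bool, str]:
    """Return (enable_watch, reason) for a projected watcher FD cost."""
    # Hard cap first: an explicit user limit always wins.
    if max_open_fds is not None and estimated_fds > max_open_fds:
        return False, f"estimate {estimated_fds} exceeds cap {max_open_fds}"
    # The process could never open this many descriptors anyway.
    if estimated_fds >= budget.max_files_per_proc:
        return False, "estimate exceeds per-process limit"
    # Do not let one daemon claim more than a share of what is still free.
    headroom = budget.max_files - budget.num_files
    if estimated_fds > headroom * system_share:
        return False, "estimate exceeds share of remaining system headroom"
    return True, "within budget"


# Limits and usage taken from the snapshot in this report.
limits = parse_sysctl(
    "kern.num_files: 15771\n"
    "kern.maxfiles: 491520\n"
    "kern.maxfilesperproc: 245760\n"
)
budget = FdBudget(
    num_files=limits["kern.num_files"],
    max_files=limits["kern.maxfiles"],
    max_files_per_proc=limits["kern.maxfilesperproc"],
)
capped_decision = choose_watch_mode(2901, budget, max_open_fds=2048)
```

When the check returns `False`, the daemon would start in no-watch mode and print the reason, rather than starting the watcher and hitting `ENFILE` later. The same `FdBudget` values could feed the resource report suggested in item 2.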

## Expected outcome

Users who work across many projects should be able to install CodeGraph once and not periodically debug global OS file-table exhaustion. If watch mode is too expensive for a workspace, CodeGraph should detect that, explain it, and keep the index/query path usable in no-watch/manual-sync mode.

Related:

- #496
- #555
- #579

