Background
The current Copilot session history is compacted by plain truncation.
In spx-gui/src/components/copilot/copilot.ts, we currently trim the API message list by keeping only the most recent messages within a size limit:
- old messages are dropped entirely once the limit is exceeded
- this is simple, but it can lose important context from earlier turns
- the loss is abrupt and not semantically aware
This can reduce Copilot answer quality for longer conversations, especially when early user goals, constraints, or tool results are still relevant.
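For illustration, the current plain-truncation behavior is roughly equivalent to the sketch below. The type and function names (`Message`, `sizeOf`, `truncate`) are illustrative only, not the actual API in copilot.ts, and the real implementation may measure size in tokens rather than characters.

```typescript
// Illustrative message shape; the real copilot.ts types may differ.
type Message = { role: 'user' | 'assistant' | 'tool'; content: string }

// Rough size estimate; a real implementation would likely count tokens.
function sizeOf(m: Message): number {
  return m.content.length
}

// Keep the most recent messages whose total size fits within `limit`;
// everything older is dropped entirely, with no semantic awareness.
function truncate(history: Message[], limit: number): Message[] {
  const kept: Message[] = []
  let total = 0
  for (let i = history.length - 1; i >= 0; i--) {
    const size = sizeOf(history[i])
    if (total + size > limit) break
    kept.unshift(history[i])
    total += size
  }
  return kept
}
```

Note how the cut is purely positional: once the limit is hit, an early user goal is dropped just as readily as routine filler.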
What this issue is about
Improve how Copilot history is compacted before sending requests to the model.
The goal is to preserve more useful context for long-running conversations while still respecting context-window limits.
Possible directions
We do not have to limit ourselves to LLM-based summarization. Possible approaches include:
- history summarization with an LLM, replacing older turns with a compact semantic summary
- hybrid compaction, e.g. keeping recent turns in full while compressing older ones
- role-aware or message-type-aware compaction, e.g. preserving user goals and important tool results more aggressively than routine assistant text
- structure-aware filtering, e.g. deduplicating repeated context messages or large low-value content
- checkpoint-style memory, where the session periodically stores a compact state summary for reuse in later rounds
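To make the hybrid direction concrete, here is a minimal sketch: keep the last few turns verbatim and collapse everything older into a single summary message. All names (`Msg`, `summarizeOlder`, `compact`, `recentCount`) are hypothetical, and `summarizeOlder` is a stand-in for what would likely be an LLM call or a smarter role-aware reducer in practice.

```typescript
// Illustrative message shape; not the actual copilot.ts types.
type Msg = { role: 'user' | 'assistant' | 'tool'; content: string }

// Stand-in for an LLM-backed summarizer; here it just builds a short
// digest so the sketch stays self-contained and deterministic.
function summarizeOlder(older: Msg[]): Msg {
  const digest = older.map((m) => `${m.role}: ${m.content.slice(0, 40)}`).join('; ')
  return { role: 'assistant', content: `[Summary of earlier turns] ${digest}` }
}

// Hybrid compaction: keep the last `recentCount` messages verbatim and
// replace everything older with one compact summary message.
function compact(history: Msg[], recentCount: number): Msg[] {
  if (history.length <= recentCount) return history
  const older = history.slice(0, history.length - recentCount)
  const recent = history.slice(history.length - recentCount)
  return [summarizeOlder(older), ...recent]
}
```

Compared with plain truncation, the early context degrades into a summary instead of disappearing, and the summarizer could be made role-aware (e.g. always keeping user goals verbatim) without changing the overall request flow.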
Expected outcome
Compared with the current plain truncation approach, the new strategy should:
- preserve important conversation state more reliably
- degrade more gracefully as history grows
- remain predictable in cost and latency
- fit the existing Copilot request flow cleanly
Context
Current implementation reference:
- spx-gui/src/components/copilot/copilot.ts