112 changes: 112 additions & 0 deletions docs/design/fork-subagent/fork-subagent-design.md
@@ -0,0 +1,112 @@
# Fork Subagent Design

> An implicit fork subagent that inherits the parent's full conversation context and shares its prompt cache for cost-efficient parallel task execution.

## Overview

When the Agent tool is called without `subagent_type`, it triggers an implicit **fork** — a background subagent that inherits the parent's conversation history, system prompt, and tool definitions. The fork uses `CacheSafeParams` to ensure its API requests share the same prefix as the parent's, enabling DashScope prompt cache hits.

## Architecture

```
Parent conversation: [SystemPrompt | Tools | Msg1 | Msg2 | ... | MsgN (model)]
↑ identical prefix for all forks ↑

Fork A: [...MsgN | placeholder results | "Research A"] ← shared cache
Fork B: [...MsgN | placeholder results | "Modify B"] ← shared cache
Fork C: [...MsgN | placeholder results | "Test C"] ← shared cache
```
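
The diagram can be made concrete with a small sketch. It assumes a simplified `Message` type (hypothetical, standing in for the real `Content` type) and models each fork's request as the parent's messages plus one fork-specific directive; the placeholder results shown in the diagram are left out for brevity. `forkRequest` and `sharedPrefixLength` are illustrative names, not functions from the codebase, and provider-side cache matching is not modelled.

```typescript
// Hypothetical, simplified message shape; the real code uses Content from '@google/genai'.
interface Message {
  role: 'user' | 'model';
  text: string;
}

// Build a fork's message list: the parent's messages followed by the fork's own directive.
function forkRequest(parent: readonly Message[], directive: string): Message[] {
  return [...parent, { role: 'user', text: directive }];
}

// Count how many leading messages two requests have in common.
function sharedPrefixLength(a: readonly Message[], b: readonly Message[]): number {
  const limit = Math.min(a.length, b.length);
  for (let i = 0; i < limit; i++) {
    const x = a[i];
    const y = b[i];
    if (!x || !y || x.role !== y.role || x.text !== y.text) return i;
  }
  return limit;
}

const parentHistory: Message[] = [
  { role: 'user', text: 'Investigate modules A and B.' },
  { role: 'model', text: 'I will split this into two forks.' },
];

const forkA = forkRequest(parentHistory, 'Research A');
const forkB = forkRequest(parentHistory, 'Modify B');

// Both forks share the entire parent history and diverge only at the directive.
console.log(sharedPrefixLength(forkA, forkB)); // 2
```

The longer the parent conversation, the larger the share of each fork's request that falls inside the common prefix.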

## Key Components

### 1. FORK_AGENT (`forkSubagent.ts`)

A synthetic agent config that is not registered in `builtInAgents`. It defines a fallback `systemPrompt`, but in practice the fork uses the parent's rendered system prompt via `generationConfigOverride`.

### 2. CacheSafeParams Integration (`agent.ts` + `forkedQuery.ts`)

```
agent.ts (fork path)
├── getCacheSafeParams() ← parent's generationConfig snapshot
│ ├── generationConfig ← systemInstruction + tools + temp/topP
│ └── history ← (not used — we build extraHistory instead)
├── forkGenerationConfig ← passed as generationConfigOverride
└── forkToolsOverride ← FunctionDeclaration[] extracted from tools
AgentHeadless.execute(context, signal, {
extraHistory, ← parent conversation history
generationConfigOverride, ← parent's exact systemInstruction + tools
toolsOverride, ← parent's exact tool declarations
})
AgentCore.createChat(context, {
extraHistory,
generationConfigOverride, ← bypasses buildChatSystemPrompt()
}) AND skips getInitialChatHistory()
│ (extraHistory already has env context)
new GeminiChat(config, generationConfig, startHistory)
↑ byte-identical to parent's config
```
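
A simplified sketch of the override logic in the diagram, assuming hypothetical `ChatConfig` and `Message` types in place of the real `GenerateContentConfig` and `Content`. `resolveChatSetup` is an illustrative name, not a function in the codebase, and the agent's initial messages are omitted. It shows the two decisions made when an override is present: reuse the parent's config object as-is, and skip the environment history because `extraHistory` already contains it.

```typescript
// Hypothetical stand-ins for GenerateContentConfig and Content.
interface ChatConfig {
  systemInstruction?: string;
  temperature?: number;
  topP?: number;
}
interface Message {
  role: 'user' | 'model';
  text: string;
}

interface ChatOptions {
  extraHistory?: Message[];
  generationConfigOverride?: ChatConfig;
}

interface ChatSetup {
  config: ChatConfig;
  startHistory: Message[];
}

// Illustrative helper: decide the config and starting history for a new chat.
function resolveChatSetup(
  ownConfig: ChatConfig,
  envHistory: Message[],
  options?: ChatOptions,
): ChatSetup {
  const override = options?.generationConfigOverride;
  return {
    // Fork path: pass the parent's config through untouched so the request prefix matches.
    config: override ?? ownConfig,
    startHistory: [
      // Fork path: extraHistory already carries the env context, so do not add it again.
      ...(override ? [] : envHistory),
      ...(options?.extraHistory ?? []),
    ],
  };
}

const env: Message[] = [{ role: 'user', text: 'env: cwd=/repo' }];
const parentConfig: ChatConfig = { systemInstruction: 'parent prompt', temperature: 0.2 };
const parentHistory: Message[] = [...env, { role: 'model', text: 'Ready.' }];

const forkSetup = resolveChatSetup({ systemInstruction: 'fallback prompt' }, env, {
  extraHistory: parentHistory,
  generationConfigOverride: parentConfig,
});

console.log(forkSetup.config === parentConfig); // true: same object, not a rebuilt copy
console.log(forkSetup.startHistory.length); // 2: the env context appears only once
```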

### 3. History Construction (`agent.ts` + `forkSubagent.ts`)

The fork's `extraHistory` must end with a model message so that the Gemini API's user/model alternation is preserved when `agent-headless` sends the `task_prompt` as the next user message.

Three cases:

| Parent history ends with | extraHistory construction | task_prompt |
| ----------------------------- | ---------------------------------------------------------------------- | ------------------------------ |
| `model` (no function calls) | `[...rawHistory]` (unchanged) | `buildChildMessage(directive)` |
| `model` (with function calls) | `[...rawHistory, model(clone), user(responses+directive), model(ack)]` | `'Begin.'` |
| `user` (unusual) | `rawHistory.slice(0, -1)` (drop trailing user) | `buildChildMessage(directive)` |
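
The case selection in the table can be sketched as a small classifier. The types and the names `classifyParentHistory` and `taskPromptFor` are hypothetical and only mirror the first and last columns of the table; the `'empty'` case is an addition for completeness. The actual history construction lives in `agent.ts` and `buildForkedMessages()`.

```typescript
// Hypothetical, simplified shapes; the real code uses Content and Part from '@google/genai'.
interface Part {
  text?: string;
  functionCall?: { id?: string; name?: string };
}
interface Message {
  role: 'user' | 'model';
  parts?: Part[];
}

type ForkCase = 'model-text' | 'model-function-calls' | 'trailing-user' | 'empty';

// Illustrative helper: decide which row of the table applies to a parent history.
function classifyParentHistory(history: readonly Message[]): ForkCase {
  const last = history[history.length - 1];
  if (!last) return 'empty'; // not covered by the table; assumed here for safety
  if (last.role === 'user') return 'trailing-user';
  const hasCalls = (last.parts ?? []).some((p) => p.functionCall !== undefined);
  return hasCalls ? 'model-function-calls' : 'model-text';
}

// Per the table, only the function-call case sends the fixed 'Begin.' prompt,
// because the directive was already embedded alongside the function responses.
function taskPromptFor(kind: ForkCase, childMessage: string): string {
  return kind === 'model-function-calls' ? 'Begin.' : childMessage;
}

const endsWithCalls: Message[] = [
  { role: 'user', parts: [{ text: 'Check the tests.' }] },
  { role: 'model', parts: [{ functionCall: { id: '1', name: 'agent' } }] },
];

const kind = classifyParentHistory(endsWithCalls);
console.log(kind, taskPromptFor(kind, '<child message>')); // model-function-calls Begin.
```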

### 4. Recursive Fork Prevention (`forkSubagent.ts`)

`isInForkChild()` scans conversation history for the `<fork-boilerplate>` tag. If found, the fork attempt is rejected with an error message.
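
A minimal sketch of how such a guard might be applied, assuming simplified message types. The scan follows the behaviour described above (restricted to user messages, as in `forkSubagent.ts`); `hasForkBoilerplate`, `assertCanFork`, and the error text are hypothetical names chosen for illustration.

```typescript
// Hypothetical, simplified shapes; the real code uses Content from '@google/genai'.
interface Part {
  text?: string;
}
interface Message {
  role: 'user' | 'model';
  parts?: Part[];
}

const BOILERPLATE_OPEN_TAG = '<fork-boilerplate>';

// Returns true if any user message carries the fork boilerplate tag.
function hasForkBoilerplate(history: readonly Message[]): boolean {
  return history.some(
    (m) =>
      m.role === 'user' &&
      (m.parts ?? []).some((p) => p.text?.includes(BOILERPLATE_OPEN_TAG) === true),
  );
}

// Hypothetical guard: refuse to fork from inside a fork child.
function assertCanFork(history: readonly Message[]): void {
  if (hasForkBoilerplate(history)) {
    throw new Error('Fork children cannot fork; execute the task directly.');
  }
}

const parentSide: Message[] = [{ role: 'user', parts: [{ text: 'Refactor module A' }] }];
const childSide: Message[] = [
  ...parentSide,
  { role: 'user', parts: [{ text: `${BOILERPLATE_OPEN_TAG} You are a forked worker.` }] },
];

assertCanFork(parentSide); // passes silently
try {
  assertCanFork(childSide);
} catch (e) {
  console.log(e instanceof Error ? e.message : String(e));
}
```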

### 5. Background Execution (`agent.ts`)

Fork uses `void executeSubagent()` (fire-and-forget) and returns `FORK_PLACEHOLDER_RESULT` immediately to the parent. Errors in the background task are caught, logged, and reflected in the display state.
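
The fire-and-forget pattern can be sketched as follows. This is an illustration under stated assumptions, not the code in `agent.ts`: `startFork`, `DisplayState`, and the placeholder string are hypothetical, and the returned `done` promise exists only so the sketch can be observed; a real call site would discard it with `void`.

```typescript
type ForkStatus = 'running' | 'completed' | 'failed';

// Hypothetical display state that a UI could render.
interface DisplayState {
  status: ForkStatus;
  error?: string;
}

// Hypothetical placeholder text returned to the parent.
const PLACEHOLDER = 'Fork started';

// Starts the work without awaiting it and returns a placeholder at once.
function startFork(
  work: () => Promise<void>,
  display: DisplayState,
): { placeholder: string; done: Promise<void> } {
  display.status = 'running';
  const done = work()
    .then(() => {
      display.status = 'completed';
    })
    .catch((err: unknown) => {
      // Errors never reach the caller; they are only reflected in the display state.
      display.status = 'failed';
      display.error = err instanceof Error ? err.message : String(err);
    });
  return { placeholder: PLACEHOLDER, done };
}

const display: DisplayState = { status: 'running' };
const run = startFork(async () => {
  throw new Error('tool crashed');
}, display);

console.log(run.placeholder); // available before the background work settles
run.done.then(() => console.log(display.status, display.error));
```

Because the rejection is handled inside the chain, a failing fork cannot surface as an unhandled promise rejection in the parent.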

## Data Flow

```
1. Model calls Agent tool (no subagent_type)
2. agent.ts: import forkSubagent.js
3. agent.ts: getCacheSafeParams() → forkGenerationConfig + forkToolsOverride
4. agent.ts: build extraHistory from parent's getHistory(true)
5. agent.ts: build forkTaskPrompt (directive or 'Begin.')
6. agent.ts: createAgentHeadless(FORK_AGENT, ...)
7. agent.ts: void executeSubagent() — background
8. agent.ts: return FORK_PLACEHOLDER_RESULT to parent immediately
9. Background:
a. AgentHeadless.execute(context, signal, {extraHistory, generationConfigOverride, toolsOverride})
b. AgentCore.createChat() — uses parent's generationConfig (cache-shared)
c. runReasoningLoop() — uses parent's tool declarations
d. Fork executes tools, produces result
e. updateDisplay() with final status
```

## Graceful Degradation

If `getCacheSafeParams()` returns null (first turn, no history yet), the fork falls back to:

- `FORK_AGENT.systemPrompt` for system instruction
- `prepareTools()` for tool declarations

This ensures the fork always works, even without cache sharing.
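
The fallback can be sketched as a single selection step. All names here (`resolveForkSetup`, `ForkSetup`, the simplified `CacheSafeParams` shape with a flat `tools` array) are assumptions made for illustration; the real `generationConfig` and tool types are richer.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface ToolDecl {
  name: string;
}
interface CacheSafeParams {
  generationConfig: { systemInstruction?: string; tools?: ToolDecl[] };
}
interface ForkSetup {
  systemInstruction: string | undefined;
  tools: ToolDecl[];
  cacheShared: boolean;
}

// Use the parent's snapshot when available; otherwise fall back to the fork's own defaults.
function resolveForkSetup(
  params: CacheSafeParams | null,
  fallbackPrompt: string,
  prepareTools: () => ToolDecl[],
): ForkSetup {
  if (params === null) {
    // No snapshot yet (for example on the first turn): the fork still runs, without cache sharing.
    return { systemInstruction: fallbackPrompt, tools: prepareTools(), cacheShared: false };
  }
  return {
    systemInstruction: params.generationConfig.systemInstruction,
    tools: params.generationConfig.tools ?? [],
    cacheShared: true,
  };
}

const ownTools = (): ToolDecl[] => [{ name: 'read_file' }];

const degraded = resolveForkSetup(null, 'You are a forked worker process.', ownTools);
console.log(degraded.cacheShared, degraded.tools.length); // false 1

const shared = resolveForkSetup(
  { generationConfig: { systemInstruction: 'parent prompt', tools: [{ name: 'shell' }] } },
  'You are a forked worker process.',
  ownTools,
);
console.log(shared.cacheShared, shared.systemInstruction); // true parent prompt
```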

## Files

| File | Role |
| ---------------------------------------------------- | ------------------------------------------------------------------------------------- |
| `packages/core/src/agents/runtime/forkSubagent.ts` | FORK_AGENT config, buildForkedMessages(), isInForkChild(), buildChildMessage() |
| `packages/core/src/tools/agent.ts` | Fork path: CacheSafeParams retrieval, extraHistory construction, background execution |
| `packages/core/src/agents/runtime/agent-headless.ts` | execute() options: generationConfigOverride, toolsOverride |
| `packages/core/src/agents/runtime/agent-core.ts` | CreateChatOptions.generationConfigOverride |
| `packages/core/src/followup/forkedQuery.ts` | CacheSafeParams infrastructure (existing, no changes) |
38 changes: 37 additions & 1 deletion docs/users/features/sub-agents.md
@@ -12,18 +12,54 @@ Subagents are independent AI assistants that:
- **Work autonomously** - Once given a task, they work independently until completion or failure
- **Provide detailed feedback** - You can see their progress, tool usage, and execution statistics in real-time

## Fork Subagent (Implicit Fork)

In addition to named subagents, Qwen Code supports **implicit forking** — when the AI omits the `subagent_type` parameter, it triggers a fork that inherits the parent's full conversation context.

### How Fork Differs from Named Subagents

| | Named Subagent | Fork Subagent |
| ------------- | --------------------------------- | ----------------------------------------------------- |
| Context | Starts fresh, no parent history | Inherits parent's full conversation history |
| System prompt | Uses its own configured prompt | Uses parent's exact system prompt (for cache sharing) |
| Execution | Blocks the parent until done | Runs in background, parent continues immediately |
| Use case | Specialized tasks (testing, docs) | Parallel tasks that need the current context |

### When Fork is Used

The AI automatically uses fork when it needs to:

- Run multiple research tasks in parallel (e.g., "investigate module A, B, and C")
- Perform background work while continuing the main conversation
- Delegate tasks that require understanding of the current conversation context

### Prompt Cache Sharing

All forks share the parent's exact API request prefix (system prompt, tools, conversation history), enabling DashScope prompt cache hits. When 3 forks run in parallel, the shared prefix is cached once and reused, saving 80%+ of token costs compared to independent subagents.

### Recursive Fork Prevention

Fork children cannot create further forks. This is enforced at runtime — if a fork attempts to spawn another fork, it receives an error instructing it to execute tasks directly.

### Current Limitations

- **No result feedback**: Fork results are reflected in the UI progress display but are not automatically fed back into the main conversation. The parent AI sees a placeholder message and cannot act on the fork's output.
- **No worktree isolation**: Forks share the parent's working directory. Concurrent file modifications from multiple forks may conflict.

## Key Benefits

- **Task Specialization**: Create agents optimized for specific workflows (testing, documentation, refactoring, etc.)
- **Context Isolation**: Keep specialized work separate from your main conversation
- **Context Inheritance**: Fork subagents inherit the full conversation for context-heavy parallel tasks
- **Prompt Cache Sharing**: Fork subagents share the parent's cache prefix, reducing token costs
- **Reusability**: Save and reuse agent configurations across projects and sessions
- **Controlled Access**: Limit which tools each agent can use for security and focus
- **Progress Visibility**: Monitor agent execution with real-time progress updates

## How Subagents Work

1. **Configuration**: You create Subagent configurations that define their behavior, tools, and system prompts
-2. **Delegation**: The main AI can automatically delegate tasks to appropriate Subagents
+2. **Delegation**: The main AI can automatically delegate tasks to appropriate Subagents, or implicitly fork when no specific subagent type is needed
3. **Execution**: Subagents work independently, using their configured tools to complete tasks
4. **Results**: They return results and execution summaries back to the main conversation

40 changes: 31 additions & 9 deletions packages/core/src/agents/runtime/agent-core.ts
@@ -101,6 +101,15 @@ export interface CreateChatOptions {
* conversational context (e.g., from the main session that spawned it).
*/
extraHistory?: Content[];
/**
* When provided, replaces the auto-built generationConfig
* (systemInstruction, temperature, etc.) with this exact config.
* Used by fork subagents to share the parent conversation's cache
* prefix for DashScope prompt caching.
*/
generationConfigOverride?: GenerateContentConfig & {
systemInstruction?: string | Content;
};
}

/**
Expand Down Expand Up @@ -222,30 +231,43 @@ export class AgentCore {
);
}

-const envHistory = await getInitialChatHistory(this.runtimeContext);
+// When generationConfigOverride is provided (fork path), the extraHistory
+// already contains the parent's env context. Skip getInitialChatHistory
+// to avoid duplicating the env messages and breaking cache prefix match.
+const envHistory = options?.generationConfigOverride
+  ? []
+  : await getInitialChatHistory(this.runtimeContext);

const startHistory = [
...envHistory,
...(options?.extraHistory ?? []),
...(this.promptConfig.initialMessages ?? []),
];

-const systemInstruction = this.promptConfig.systemPrompt
-  ? this.buildChatSystemPrompt(context, options)
-  : undefined;
+// If an override is provided (fork path), use it directly for cache
+// sharing. Otherwise, build the config from this agent's promptConfig.
+// Note: buildChatSystemPrompt is called OUTSIDE the try/catch so template
+// errors propagate to the caller (not swallowed by reportError).
+let generationConfig: GenerateContentConfig & {
+  systemInstruction?: string | Content;
+};

-try {
-  const generationConfig: GenerateContentConfig & {
-    systemInstruction?: string | Content;
-  } = {
+if (options?.generationConfigOverride) {
+  generationConfig = options.generationConfigOverride;
+} else {
+  const systemInstruction = this.promptConfig.systemPrompt
+    ? this.buildChatSystemPrompt(context, options)
+    : undefined;
+  generationConfig = {
     temperature: this.modelConfig.temp,
     topP: this.modelConfig.top_p,
   };

   if (systemInstruction) {
     generationConfig.systemInstruction = systemInstruction;
   }
+}

+try {
return new GeminiChat(
this.runtimeContext,
generationConfig,
14 changes: 12 additions & 2 deletions packages/core/src/agents/runtime/agent-headless.ts
@@ -192,8 +192,18 @@ export class AgentHeadless {
async execute(
context: ContextState,
externalSignal?: AbortSignal,
options?: {
extraHistory?: Array<import('@google/genai').Content>;
/** Override generationConfig for cache sharing (fork subagent). */
generationConfigOverride?: import('@google/genai').GenerateContentConfig;
/** Override tool declarations for cache sharing (fork subagent). */
toolsOverride?: Array<import('@google/genai').FunctionDeclaration>;
},
): Promise<void> {
-const chat = await this.core.createChat(context);
+const chat = await this.core.createChat(context, {
+  extraHistory: options?.extraHistory,
+  generationConfigOverride: options?.generationConfigOverride,
+});

if (!chat) {
this.terminateMode = AgentTerminateMode.ERROR;
@@ -212,7 +222,7 @@ export class AgentHeadless {
abortController.abort();
}

-const toolsList = this.core.prepareTools();
+const toolsList = options?.toolsOverride ?? this.core.prepareTools();

const initialTaskText = String(
(context.get('task_prompt') as string) ?? 'Get Started!',
116 changes: 116 additions & 0 deletions packages/core/src/agents/runtime/forkSubagent.ts
@@ -0,0 +1,116 @@
import type { Content } from '@google/genai';

export const FORK_SUBAGENT_TYPE = 'fork';

export const FORK_BOILERPLATE_TAG = 'fork-boilerplate';
export const FORK_DIRECTIVE_PREFIX = 'Directive: ';

export const FORK_AGENT = {
name: FORK_SUBAGENT_TYPE,
description:
'Implicit fork — inherits full conversation context. Not selectable via subagent_type; triggered by omitting subagent_type.',
tools: ['*'],
systemPrompt:
'You are a forked worker process. Follow the directive in the conversation history. Execute tasks directly using available tools. Do not spawn sub-agents.',
level: 'session' as const,
};

export function isInForkChild(messages: Content[]): boolean {
return messages.some((m) => {
if (m.role !== 'user') return false;
return m.parts?.some(
(part) => part.text && part.text.includes(`<${FORK_BOILERPLATE_TAG}>`),
);
});
}

export const FORK_PLACEHOLDER_RESULT =
'Fork started — processing in background';

/**
* Build extra history messages for a forked subagent.
*
* When the last model message has function calls, we must include matching
* function responses in a user message (Gemini API requirement). The
* directive is embedded in this same user message to avoid consecutive
* user messages.
*
* When there are no function calls, we return [] — the parent history
* already ends with a model text message and the directive will be sent
* as the task_prompt by agent-headless (model → user alternation is OK).
*
* @param directive - The fork directive text (user's prompt)
* @param assistantMessage - The last model message from the parent history
* @returns Extra messages to append to history (may be empty)
*/
export function buildForkedMessages(
directive: string,
assistantMessage: Content,
): Content[] {
const toolUseParts =
assistantMessage.parts?.filter((part) => part.functionCall) || [];

if (toolUseParts.length === 0) {
// No function calls — no extra messages needed.
// The parent history already ends with this model message.
return [];
}

// Clone the assistant message to avoid mutating the original
const fullAssistantMessage: Content = {
role: assistantMessage.role,
parts: [...(assistantMessage.parts || [])],
};

// Build a functionResponse part for every functionCall, all with identical placeholder output.
// Include the directive text in the same user message to maintain
// proper user/model alternation.
const toolResultParts = toolUseParts.map((part) => ({
functionResponse: {
id: part.functionCall!.id,
name: part.functionCall!.name,
response: { output: FORK_PLACEHOLDER_RESULT },
},
}));

const toolResultMessage: Content = {
role: 'user',
parts: [
...toolResultParts,
{
text: buildChildMessage(directive),
},
],
};

return [fullAssistantMessage, toolResultMessage];
}

export function buildChildMessage(directive: string): string {
return `<${FORK_BOILERPLATE_TAG}>
STOP. READ THIS FIRST.

You are a forked worker process. You are NOT the main agent.

RULES (non-negotiable):
1. You ARE the fork. Do NOT spawn sub-agents; execute directly.
2. Do NOT converse, ask questions, or suggest next steps
3. Do NOT editorialize or add meta-commentary
4. USE your tools directly: Bash, Read, Write, etc.
5. If you modify files, commit your changes before reporting. Include the commit hash in your report.
6. Do NOT emit text between tool calls. Use tools silently, then report once at the end.
7. Stay strictly within your directive's scope. If you discover related systems outside your scope, mention them in one sentence at most — other workers cover those areas.
8. Keep your report under 500 words unless the directive specifies otherwise. Be factual and concise.
9. Your response MUST begin with "Scope:". No preamble, no thinking-out-loud.
10. REPORT structured facts, then stop

Output format (plain text labels, not markdown headers):
Scope: <echo back your assigned scope in one sentence>
Result: <the answer or key findings, limited to the scope above>
Key files: <relevant file paths — include for research tasks>
Files changed: <list with commit hash — include only if you modified files>
Issues: <list — include only if there are issues to flag>
</${FORK_BOILERPLATE_TAG}>

${FORK_DIRECTIVE_PREFIX}${directive}`;
}
1 change: 1 addition & 0 deletions packages/core/src/tools/agent.test.ts
@@ -398,6 +398,7 @@ describe('AgentTool', () => {
expect(mockAgent.execute).toHaveBeenCalledWith(
mockContextState,
undefined, // signal parameter (undefined when not provided)
{ extraHistory: undefined }, // options argument; extraHistory is undefined in this test
);

const llmText = partToString(result.llmContent);