Commit 736af4c

stephentoub and Copilot committed
Greatly expand E2E test coverage and harden tests across all 4 SDKs
Adds comprehensive end-to-end coverage of the SDK + Copilot CLI surface area across C#, Python, TypeScript, and Go, with the goal that every public API and every JSON-RPC method has at least one E2E test. Reorganizes the C# and Python test layouts to clearly separate unit and E2E tests, and adds `E2E` suffixes to test class/file names for clarity.

What's added
------------

E2E coverage (parity across all 4 SDKs):
- Session lifecycle: connect/disconnect, dispose semantics, multi-client scenarios, resume, force-stop, idle-then-suspend.
- Session config: model selection (vision-enabled/disabled transitions), agent selection, allowed/denied tool sets, working directory, environment variables, system prompts, MCP servers.
- RPC surface: agent (get/getCurrent/list/reload), session state (capabilities/getCurrent/getMetadata/list/reset), trust, model (get/getCurrent/list/setCurrent), permission (request/respond/list), hooks, plan, telemetry, command execute/elicit/respond.
- Streaming: assistant.message_delta + reasoning_delta ordering and matching message IDs across delta and final events.
- Suspend RPC: suspend during pending permission, suspend during pending external tool, suspend idle session, resume + continue conversation after suspend.
- Hooks: pre/post tool, pre/post session, deny verification (asserts target file is unchanged after a deny).
- Permissions: per-session auth tokens with auto-token opt-out for tests that explicitly verify the unauthenticated path.
- GitHub references, attachments, custom request headers, trace context propagation.
- Tool routing, command handlers, elicitation flow.

Unit coverage:
- Forward-compatibility for unknown discriminators and unknown event envelope types so unrecognized future events round-trip safely.

Snapshot harness:
- 100+ new YAML snapshots under `test/snapshots/` covering the new scenarios; existing snapshots normalized for portability.

Test layout:
- C# tests split into `dotnet/test/E2E/` and `dotnet/test/Unit/` folders; xUnit collection serializes E2E execution to avoid CLI process contention (with explanatory comment at the attribute site).
- Python E2E tests live under `python/e2e/` with a shared harness proxy/context. `E2E` suffix added to classes/methods.
- Go E2E tests use `_e2e_test.go` suffix and an `E2E` test-function suffix for clarity.
- TypeScript E2E tests use `.e2e.test.ts` suffix; `createSdkTestContext` registers Vitest hooks that auto-stop `CopilotClient` and `CapiProxy`.

What's fixed
------------

Snapshot drift root cause:
- `test/harness/replayingCapiProxy.ts`: `writeCapturesToDisk` was called from both `updateConfig` and `stop` regardless of CI mode. When a test exercised only a subset of a multi-conversation snapshot, the file was silently rewritten with that subset, breaking later runs. Both writes are now guarded by `process.env.GITHUB_ACTIONS !== "true"`.

`gh` CLI environment-dependent help text:
- Added a `normalizeGhAuthMessages` tool result normalizer so the two forms of "auth required" help text emitted by `gh` (GitHub Actions vs. local dev) both map to a stable `\` placeholder before snapshot match.

PerSessionAuth auto-token leak:
- `dotnet/test/Harness/E2ETestContext.cs`: added an `autoInjectGitHubToken` parameter so `CreateAuthTestClient` can opt out of the CI auto-token injection that was silently authenticating tests verifying the unauthenticated path.

Go SDK lifecycle and codegen:
- Polling-based `waitForCapability` helper in `go/session_test.go` to replace race-prone fixed sleeps.
- Telemetry marker constant aligned with snapshot.
- Codegen / lifecycle bugs surfaced while porting C# E2E coverage.

TypeScript cleanup hangs:
- Replaced fixed `setTimeout` waits in lifecycle/session tests with bounded polling helpers (`waitFor` / `getLastSessionId` / `getSessionMetadata` polls).
- Bumped the `should stop cleanly` test timeout to 60s to absorb slow-machine variability.

Python:
- `_wait_for` polling helper in `test_commands_and_elicitation.py` replaces flaky fixed sleeps.

Verification
------------

Five consecutive full-suite runs across all 4 SDKs, all clean (the final run was executed with all 4 suites in parallel to confirm the fixes hold under contention):
- C#: 319 passed / 0 failed / 4 skipped
- Python: 356 passed / 0 failed / 8 skipped
- TS: 328 passed / 0 failed / 9 skipped
- Go: all packages ok

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
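
Several of the fixes above replace fixed sleeps with bounded polling (`waitFor`, `waitForCapability`, `_wait_for`). The commit page does not show those helpers, so the following is only a minimal sketch of the general technique in Python. The name `wait_for`, its parameters, and its defaults are assumptions for illustration, not the repository's actual implementation.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(
    probe: Callable[[], Optional[T]],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> T:
    """Poll probe() until it returns a non-None value or the timeout elapses.

    Hypothetical helper: unlike a fixed sleep, it returns as soon as the
    condition holds and fails loudly once the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = probe()
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(interval)
```

A test would pass a probe such as `lambda: session_ids[-1] if session_ids else None`, where `session_ids` is whatever state the test observes. The deadline bounds the worst case, and the short interval keeps the common case fast.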
1 parent f8cf846 commit 736af4c

222 files changed

Lines changed: 19889 additions & 1526 deletions

dotnet/src/Client.cs

Lines changed: 1 addition & 0 deletions
@@ -1775,6 +1775,7 @@ internal record PermissionRequestResponseV2(
 [JsonSerializable(typeof(GetSessionMetadataResponse))]
 [JsonSerializable(typeof(ModelCapabilitiesOverride))]
 [JsonSerializable(typeof(PermissionRequestResult))]
+[JsonSerializable(typeof(PermissionRequestResultKind))]
 [JsonSerializable(typeof(PermissionRequestResponseV2))]
 [JsonSerializable(typeof(ProviderConfig))]
 [JsonSerializable(typeof(ResumeSessionRequest))]

dotnet/src/Generated/Rpc.cs

Lines changed: 5 additions & 0 deletions
Some generated files are not rendered by default.

dotnet/src/Types.cs

Lines changed: 1 addition & 0 deletions
@@ -2787,6 +2787,7 @@ public class SystemMessageTransformRpcResponse
 [JsonSerializable(typeof(ModelSupports))]
 [JsonSerializable(typeof(ModelVisionLimits))]
 [JsonSerializable(typeof(PermissionRequestResult))]
+[JsonSerializable(typeof(PermissionRequestResultKind))]
 [JsonSerializable(typeof(PingRequest))]
 [JsonSerializable(typeof(PingResponse))]
 [JsonSerializable(typeof(ProviderConfig))]

dotnet/test/AssemblyInfo.cs

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+/*---------------------------------------------------------------------------------------------
+ * Copyright (c) Microsoft Corporation. All rights reserved.
+ *--------------------------------------------------------------------------------------------*/
+
+using Xunit;
+
+// Each E2E test class fixture spins up its own Copilot CLI subprocess plus a CapiProxy
+// (replaying HTTP proxy) Node.js subprocess. With ~25 test classes, running them in parallel
+// would launch ~50 long-lived Node.js processes simultaneously and exhaust both file
+// descriptors and memory on developer machines and CI runners (especially Windows). Tests
+// within a class already run serially via xUnit's IClassFixture contract; this attribute
+// extends that to cross-class execution. Re-enable parallelization only after either
+// (a) sharing a single CLI subprocess across classes, or (b) gating concurrency with a
+// semaphore that limits concurrent fixtures to a small number (e.g. 2-3).
+[assembly: CollectionBehavior(DisableTestParallelization = true)]
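
The comment above names option (b), gating concurrency with a semaphore, as one way to re-enable parallel execution. The commit does not implement it, so the following is only a rough sketch of that idea in Python's `asyncio`. The functions `run_fixture` and `run_all`, the limit of 2, and the sleep that stands in for fixture work are all hypothetical.

```python
import asyncio
from typing import List, Tuple

MAX_CONCURRENT_FIXTURES = 2  # assumed limit, in the spirit of the "2-3" suggested above


async def run_fixture(name: str, gate: asyncio.Semaphore, active: List[int], peak: List[int]) -> str:
    """Run one (simulated) test-class fixture while holding a semaphore slot."""
    async with gate:
        active[0] += 1
        peak[0] = max(peak[0], active[0])
        # Stand-in for starting the CLI subprocess and proxy and running the class's tests.
        await asyncio.sleep(0.01)
        active[0] -= 1
    return name


async def run_all(names: List[str], limit: int = MAX_CONCURRENT_FIXTURES) -> Tuple[List[str], int]:
    """Schedule every fixture at once, but let at most `limit` run concurrently."""
    gate = asyncio.Semaphore(limit)
    active, peak = [0], [0]
    results = await asyncio.gather(*(run_fixture(n, gate, active, peak) for n in names))
    return list(results), peak[0]
```

Calling `asyncio.run(run_all(["ClassA", "ClassB", "ClassC"]))` returns the class names together with the peak number of fixtures that were live at once, which never exceeds the limit.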
Lines changed: 2 additions & 2 deletions
@@ -6,9 +6,9 @@
 using Xunit;
 using Xunit.Abstractions;
 
-namespace GitHub.Copilot.SDK.Test;
+namespace GitHub.Copilot.SDK.Test.E2E;
 
-public class AskUserTests(E2ETestFixture fixture, ITestOutputHelper output) : E2ETestBase(fixture, "ask_user", output)
+public class AskUserE2ETests(E2ETestFixture fixture, ITestOutputHelper output) : E2ETestBase(fixture, "ask_user", output)
 {
     [Fact]
     public async Task Should_Invoke_User_Input_Handler_When_Model_Uses_Ask_User_Tool()
Lines changed: 137 additions & 0 deletions
@@ -0,0 +1,137 @@
+/*---------------------------------------------------------------------------------------------
+ * Copyright (c) Microsoft Corporation. All rights reserved.
+ *--------------------------------------------------------------------------------------------*/
+
+using GitHub.Copilot.SDK.Test.Harness;
+using Xunit;
+using Xunit.Abstractions;
+
+namespace GitHub.Copilot.SDK.Test.E2E;
+
+/// <summary>
+/// Smoke coverage for the Copilot CLI built-in tools (bash, view, edit, create_file,
+/// grep, glob). Each test asks the model to use one tool and then verifies the model's
+/// final response reflects the tool's result. Mirrors
+/// <c>nodejs/test/e2e/builtin_tools.e2e.test.ts</c>.
+/// </summary>
+public class BuiltinToolsE2ETests(E2ETestFixture fixture, ITestOutputHelper output)
+    : E2ETestBase(fixture, "builtin_tools", output)
+{
+    [Fact]
+    public async Task Should_Capture_Exit_Code_In_Output()
+    {
+        var session = await CreateSessionAsync();
+        var msg = await session.SendAndWaitAsync(new MessageOptions
+        {
+            Prompt = "Run 'echo hello && echo world'. Tell me the exact output.",
+        });
+        var content = msg?.Data.Content ?? string.Empty;
+        Assert.Contains("hello", content);
+        Assert.Contains("world", content);
+    }
+
+    [Fact]
+    public async Task Should_Capture_Stderr_Output()
+    {
+        // The Copilot CLI runs commands through a shell tool that resolves to bash on
+        // Linux/macOS and PowerShell on Windows. The TS prompt only works on bash, so
+        // skip this test on Windows to mirror the TS `it.skipIf(process.platform === "win32")`.
+        if (OperatingSystem.IsWindows())
+        {
+            return;
+        }
+
+        var session = await CreateSessionAsync();
+        var msg = await session.SendAndWaitAsync(new MessageOptions
+        {
+            Prompt = "Run 'echo error_msg >&2; echo ok' and tell me what stderr said. Reply with just the stderr content.",
+        });
+        Assert.Contains("error_msg", msg?.Data.Content ?? string.Empty);
+    }
+
+    [Fact]
+    public async Task Should_Read_File_With_Line_Range()
+    {
+        await File.WriteAllTextAsync(Path.Join(Ctx.WorkDir, "lines.txt"), "line1\nline2\nline3\nline4\nline5\n");
+        var session = await CreateSessionAsync();
+        var msg = await session.SendAndWaitAsync(new MessageOptions
+        {
+            Prompt = "Read lines 2 through 4 of the file 'lines.txt' in this directory. Tell me what those lines contain.",
+        });
+        var content = msg?.Data.Content ?? string.Empty;
+        Assert.Contains("line2", content);
+        Assert.Contains("line4", content);
+    }
+
+    [Fact]
+    public async Task Should_Handle_Nonexistent_File_Gracefully()
+    {
+        var session = await CreateSessionAsync();
+        var msg = await session.SendAndWaitAsync(new MessageOptions
+        {
+            Prompt = "Try to read the file 'does_not_exist.txt'. If it doesn't exist, say 'FILE_NOT_FOUND'.",
+        });
+        var content = (msg?.Data.Content ?? string.Empty).ToUpperInvariant();
+        // Match any of the common phrasings for a missing-file response.
+        Assert.True(
+            content.Contains("NOT FOUND")
+            || content.Contains("NOT EXIST")
+            || content.Contains("NO SUCH")
+            || content.Contains("FILE_NOT_FOUND")
+            || content.Contains("DOES NOT EXIST")
+            || content.Contains("ERROR"),
+            $"Expected a 'not found'-style response, got: {msg?.Data.Content}");
+    }
+
+    [Fact]
+    public async Task Should_Edit_A_File_Successfully()
+    {
+        await File.WriteAllTextAsync(Path.Join(Ctx.WorkDir, "edit_me.txt"), "Hello World\nGoodbye World\n");
+        var session = await CreateSessionAsync();
+        var msg = await session.SendAndWaitAsync(new MessageOptions
+        {
+            Prompt = "Edit the file 'edit_me.txt': replace 'Hello World' with 'Hi Universe'. Then read it back and tell me its contents.",
+        });
+        Assert.Contains("Hi Universe", msg?.Data.Content ?? string.Empty);
+    }
+
+    [Fact]
+    public async Task Should_Create_A_New_File()
+    {
+        var session = await CreateSessionAsync();
+        var msg = await session.SendAndWaitAsync(new MessageOptions
+        {
+            Prompt = "Create a file called 'new_file.txt' with the content 'Created by test'. Then read it back to confirm.",
+        });
+        Assert.Contains("Created by test", msg?.Data.Content ?? string.Empty);
+    }
+
+    [Fact]
+    public async Task Should_Search_For_Patterns_In_Files()
+    {
+        await File.WriteAllTextAsync(Path.Join(Ctx.WorkDir, "data.txt"), "apple\nbanana\napricot\ncherry\n");
+        var session = await CreateSessionAsync();
+        var msg = await session.SendAndWaitAsync(new MessageOptions
+        {
+            Prompt = "Search for lines starting with 'ap' in the file 'data.txt'. Tell me which lines matched.",
+        });
+        var content = msg?.Data.Content ?? string.Empty;
+        Assert.Contains("apple", content);
+        Assert.Contains("apricot", content);
+    }
+
+    [Fact]
+    public async Task Should_Find_Files_By_Pattern()
+    {
+        Directory.CreateDirectory(Path.Join(Ctx.WorkDir, "src"));
+        await File.WriteAllTextAsync(Path.Join(Ctx.WorkDir, "src", "index.ts"), "export const index = 1;");
+        await File.WriteAllTextAsync(Path.Join(Ctx.WorkDir, "README.md"), "# Readme");
+
+        var session = await CreateSessionAsync();
+        var msg = await session.SendAndWaitAsync(new MessageOptions
+        {
+            Prompt = "Find all .ts files in this directory (recursively). List the filenames you found.",
+        });
+        Assert.Contains("index.ts", msg?.Data.Content ?? string.Empty);
+    }
+}
Lines changed: 2 additions & 89 deletions
@@ -4,11 +4,11 @@
 
 using Xunit;
 
-namespace GitHub.Copilot.SDK.Test;
+namespace GitHub.Copilot.SDK.Test.E2E;
 
 // These tests bypass E2ETestBase because they are about how the CLI subprocess is started
 // Other test classes should instead inherit from E2ETestBase
-public class ClientTests
+public class ClientE2ETests
 {
     [Fact]
     public async Task Should_Start_And_Connect_To_Server_Using_Stdio()
@@ -148,93 +148,6 @@ public async Task Should_List_Models_When_Authenticated()
         }
     }
 
-    [Fact]
-    public void Should_Accept_GitHubToken_Option()
-    {
-        var options = new CopilotClientOptions
-        {
-            GitHubToken = "gho_test_token"
-        };
-
-        Assert.Equal("gho_test_token", options.GitHubToken);
-    }
-
-    [Fact]
-    public void Should_Default_UseLoggedInUser_To_Null()
-    {
-        var options = new CopilotClientOptions();
-
-        Assert.Null(options.UseLoggedInUser);
-    }
-
-    [Fact]
-    public void Should_Allow_Explicit_UseLoggedInUser_False()
-    {
-        var options = new CopilotClientOptions
-        {
-            UseLoggedInUser = false
-        };
-
-        Assert.False(options.UseLoggedInUser);
-    }
-
-    [Fact]
-    public void Should_Allow_Explicit_UseLoggedInUser_True_With_GitHubToken()
-    {
-        var options = new CopilotClientOptions
-        {
-            GitHubToken = "gho_test_token",
-            UseLoggedInUser = true
-        };
-
-        Assert.True(options.UseLoggedInUser);
-    }
-
-    [Fact]
-    public void Should_Throw_When_GitHubToken_Used_With_CliUrl()
-    {
-        Assert.Throws<ArgumentException>(() =>
-        {
-            _ = new CopilotClient(new CopilotClientOptions
-            {
-                CliUrl = "localhost:8080",
-                GitHubToken = "gho_test_token"
-            });
-        });
-    }
-
-    [Fact]
-    public void Should_Throw_When_UseLoggedInUser_Used_With_CliUrl()
-    {
-        Assert.Throws<ArgumentException>(() =>
-        {
-            _ = new CopilotClient(new CopilotClientOptions
-            {
-                CliUrl = "localhost:8080",
-                UseLoggedInUser = false
-            });
-        });
-    }
-
-    [Fact]
-    public void Should_Default_SessionIdleTimeoutSeconds_To_Null()
-    {
-        var options = new CopilotClientOptions();
-
-        Assert.Null(options.SessionIdleTimeoutSeconds);
-    }
-
-    [Fact]
-    public void Should_Accept_SessionIdleTimeoutSeconds_Option()
-    {
-        var options = new CopilotClientOptions
-        {
-            SessionIdleTimeoutSeconds = 600
-        };
-
-        Assert.Equal(600, options.SessionIdleTimeoutSeconds);
-    }
-
     [Fact]
     public async Task Should_Not_Throw_When_Disposing_Session_After_Stopping_Client()
     {
Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
+/*---------------------------------------------------------------------------------------------
+ * Copyright (c) Microsoft Corporation. All rights reserved.
+ *--------------------------------------------------------------------------------------------*/
+
+using Xunit;
+using Xunit.Abstractions;
+
+namespace GitHub.Copilot.SDK.Test.E2E;
+
+public class ClientLifecycleE2ETests(E2ETestFixture fixture, ITestOutputHelper output)
+    : E2ETestBase(fixture, "client_lifecycle", output)
+{
+    [Fact]
+    public async Task Should_Receive_Session_Created_Lifecycle_Event()
+    {
+        var created = new TaskCompletionSource<SessionLifecycleEvent>(TaskCreationOptions.RunContinuationsAsynchronously);
+        using var subscription = Client.On(evt =>
+        {
+            if (evt.Type == SessionLifecycleEventTypes.Created)
+            {
+                created.TrySetResult(evt);
+            }
+        });
+
+        var session = await CreateSessionAsync();
+        var evt = await created.Task.WaitAsync(TimeSpan.FromSeconds(10));
+
+        Assert.Equal(SessionLifecycleEventTypes.Created, evt.Type);
+        Assert.Equal(session.SessionId, evt.SessionId);
+    }
+
+    [Fact]
+    public async Task Should_Filter_Session_Lifecycle_Events_By_Type()
+    {
+        var created = new TaskCompletionSource<SessionLifecycleEvent>(TaskCreationOptions.RunContinuationsAsynchronously);
+        using var subscription = Client.On(SessionLifecycleEventTypes.Created, evt => created.TrySetResult(evt));
+
+        var session = await CreateSessionAsync();
+        var evt = await created.Task.WaitAsync(TimeSpan.FromSeconds(10));
+
+        Assert.Equal(SessionLifecycleEventTypes.Created, evt.Type);
+        Assert.Equal(session.SessionId, evt.SessionId);
+    }
+
+    [Fact]
+    public async Task Disposing_Lifecycle_Subscription_Stops_Receiving_Events()
+    {
+        var count = 0;
+        var created = new TaskCompletionSource<SessionLifecycleEvent>(TaskCreationOptions.RunContinuationsAsynchronously);
+        var subscription = Client.On(_ => Interlocked.Increment(ref count));
+        subscription.Dispose();
+        using var activeSubscription = Client.On(SessionLifecycleEventTypes.Created, evt => created.TrySetResult(evt));
+
+        var session = await CreateSessionAsync();
+        var evt = await created.Task.WaitAsync(TimeSpan.FromSeconds(10));
+
+        Assert.Equal(session.SessionId, evt.SessionId);
+        Assert.Equal(0, Interlocked.CompareExchange(ref count, 0, 0));
+    }
+
+    [Theory]
+    [InlineData(true)] // async dispose path (DisposeAsync)
+    [InlineData(false)] // sync dispose path (Dispose)
+    public async Task Dispose_Disconnects_Client_And_Disposes_Rpc_Surface(bool useAsyncDispose)
+    {
+        var client = Ctx.CreateClient();
+        await client.StartAsync();
+
+        Assert.Equal(ConnectionState.Connected, client.State);
+
+        if (useAsyncDispose)
+        {
+            await client.DisposeAsync();
+        }
+        else
+        {
+            client.Dispose();
+        }
+
+        Assert.Equal(ConnectionState.Disconnected, client.State);
+        Assert.Throws<ObjectDisposedException>(() => client.Rpc);
+    }
+}
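
The `Disposing_Lifecycle_Subscription_Stops_Receiving_Events` test above relies on a subscribe-then-dispose contract: once a subscription is disposed, its handler must not see later events, while other subscriptions keep working. The sketch below illustrates that contract in Python with an in-memory emitter. `LifecycleEmitter`, its `on`/`emit` methods, and the event dictionary shape are invented for illustration and are not part of any SDK.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class LifecycleEmitter:
    """Hypothetical stand-in for a client that publishes session lifecycle events."""

    def __init__(self) -> None:
        self._handlers: List[Handler] = []

    def on(self, handler: Handler) -> Callable[[], None]:
        """Register a handler and return a callable that removes it again."""
        self._handlers.append(handler)

        def unsubscribe() -> None:
            # Safe to call more than once, like a well-behaved Dispose().
            if handler in self._handlers:
                self._handlers.remove(handler)

        return unsubscribe

    def emit(self, event: Event) -> None:
        # Iterate over a copy so handlers may unsubscribe while being notified.
        for handler in list(self._handlers):
            handler(event)
```

As in the C# test, a check against this sketch would register one handler, unsubscribe it immediately, register a second handler, emit a `created` event, and then assert that only the second handler observed it.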
