Stabilize ToolTaskThatTimeoutAndRetry test #13489
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Make the retry test assert the timeout scenario explicitly, log per-attempt diagnostics, and keep the follow-up executions deterministic across retries. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR stabilizes the ToolTaskThatTimeoutAndRetry unit test in Utilities.UnitTests by removing timing-sensitive shell indirection and making the timeout/retry expectations explicit and easier to diagnose when failures occur in CI.
Changes:
- Replace the shell-based sleep invocation with direct OS tools (sleep on Unix, ping.exe on Windows) and document the timeout.exe limitation under ToolTask.
- Refactor the theory inputs and assertions to explicitly model “first attempt times out, subsequent attempts succeed”.
- Add per-attempt diagnostic output (including exit code) and clear the mock engine log between attempts to keep logs readable.
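The first bullet above can be illustrated with a minimal sketch of picking a direct sleep tool per OS. The helper name `SleepCommand.Build`, the `ping.exe` arguments (`-n` with a loopback address), and the rounding to whole seconds are assumptions for illustration; the overview only states that `sleep` is used on Unix and `ping.exe` on Windows, not how they are invoked.

```csharp
using System;
using System.Globalization;

public static class SleepCommand
{
    // Hypothetical helper (not from the PR): returns the tool and arguments
    // for a sleep that does not go through cmd.exe or PowerShell.
    public static (string Tool, string Arguments) Build(bool isWindows, int delayMilliseconds)
    {
        if (isWindows)
        {
            // Assumption: ping waits roughly one second between echo requests,
            // so N + 1 requests to the loopback address take about N seconds.
            int seconds = Math.Max(1, (int)Math.Ceiling(delayMilliseconds / 1000.0));
            return ("ping.exe", $"-n {seconds + 1} 127.0.0.1");
        }

        // Unix sleep accepts fractional seconds on common implementations.
        // Format with the invariant culture so the separator is always '.'.
        string secondsText = (delayMilliseconds / 1000.0)
            .ToString("0.###", CultureInfo.InvariantCulture);
        return ("sleep", secondsText);
    }
}
```

Passing the platform as a parameter keeps the helper testable on any OS. Formatting with the invariant culture is a design choice of this sketch: it avoids a decimal comma on locales that use one.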
| private sealed class ToolTaskThatSleeps : ToolTask | ||
| { | ||
| // Windows prompt command to sleep: | ||
| private readonly string _windowsSleep = "/c start /wait timeout {0}"; |
This start /wait was (at least part of!) the reason running tests in our repo popped up terminal windows over and over. Not sure if that was bugging anybody else but it drove me up the wall!
| /// These tests execute the same task instance multiple times using a direct OS sleep process rather than | ||
| /// shelling through PowerShell or cmd.exe. When configured to time out, the first execution is expected to | ||
| /// fail and all following executions must still succeed. |
| /// These tests execute the same task instance multiple times using a direct OS sleep process rather than | |
| /// shelling through PowerShell or cmd.exe. When configured to time out, the first execution is expected to | |
| /// fail and all following executions must still succeed. | |
| /// These tests execute the same task instance multiple times, which will in turn run a command to sleep for a | |
| /// predefined amount of time. The first execution may time out, but all following ones won't. It is expected | |
| /// that all following executions return success. | |
| int fastDelayMilliseconds = NativeMethodsShared.IsWindows ? 1_000 : 100; | ||
| int slowDelayMilliseconds = 5_000; | ||
| int timeoutMilliseconds = NativeMethodsShared.IsWindows ? 2_000 : 1_000; |
@copilot does this need to vary? Shouldn't it be irrelevant for the fast delay, and so could be 2s in both OS cases?
No, it doesn't need to vary. Removed the OS conditional — timeoutMilliseconds is now a flat 2_000 on all platforms (ac443a9). fastDelayMilliseconds is left as-is per the follow-up instruction.
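To see why the timeout does not need to vary by OS, here is a small sketch of the expectation the test models. `RetryExpectation.ExpectedToSucceed` is a hypothetical helper, and the 1-based attempt numbering is an assumption; the values in the usage note are the ones quoted in this thread (5_000 ms slow delay, 100 ms fast delay, 2_000 ms timeout).

```csharp
using System;

public static class RetryExpectation
{
    // Hypothetical model of the test's expectation (not code from the PR):
    // the first attempt uses the initial delay, later attempts use the
    // follow-up delay, and an attempt succeeds only if its delay is
    // shorter than the timeout.
    public static bool ExpectedToSucceed(
        int attempt, int initialDelayMs, int followupDelayMs, int timeoutMs)
    {
        if (attempt < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(attempt));
        }

        int delayMs = attempt == 1 ? initialDelayMs : followupDelayMs;
        return delayMs < timeoutMs;
    }
}
```

With an initial delay of 5_000, a follow-up delay of 100, and a timeout of 2_000, attempt 1 is expected to fail and every later attempt to succeed. When both delays are fast, every attempt clears the timeout, so the timeout value has no effect on the expected result.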
| protected override string GenerateCommandLineCommands() => | ||
| NativeMethodsShared.IsUnixLike ? | ||
| string.Format(_unixSleep, RepeatCount < 2 ? InitialDelay / 1000.0 : FollowupDelay / 1000.0) : | ||
| string.Format(_windowsSleep, RepeatCount < 2 ? InitialDelay / 1000.0 : FollowupDelay / 1000.0); |
Hilariously, before this change we tried to timeout 0.001 and it worked only by
- returning immediately
- but we ignored the error code.
…onds Agent-Logs-Url: https://github.com/dotnet/msbuild/sessions/766be90d-7dd3-492f-bb8d-4bf35765da5b Co-authored-by: rainersigwald <3347530+rainersigwald@users.noreply.github.com>
Agent-Logs-Url: https://github.com/dotnet/msbuild/sessions/81558234-b82a-4e91-8cbe-75dc53221865 Co-authored-by: rainersigwald <3347530+rainersigwald@users.noreply.github.com>
fastDelayMilliseconds from NativeMethodsShared.IsWindows ? 1_000 : 100 to a flat 100