[vs18.5] Fix race condition in LoggingService.ProcessLoggingEvent after shutdo…#13497
[vs18.5] Fix race condition in LoggingService.ProcessLoggingEvent after shutdo…#13497
Conversation
…wn (#13450) ## Summary Fix a race condition crash (access violation 0xC0000005 / NullReferenceException) in `LoggingService.ProcessLoggingEvent` when the method is called after the logging service has been shut down. ## Problem `ProcessLoggingEvent` directly accesses `_eventQueue`, `_dequeueEvent`, and `_enqueueEvent` without any shutdown guard. During `ShutdownComponent()`, `CleanLoggingEventProcessing()` disposes and nullifies all of these fields. If an external callback (e.g., `Process.Exited` from a `ProjectCachePlugin`) attempts to log a message after shutdown has completed, the null field access causes an unrecoverable crash. ### Reproduction Scenario 1. A build runs with a `ProjectCachePlugin` 2. The build completes and `LoggingService.ShutdownComponent()` is called 3. The cache plugin's external process exits, triggering `Process.OnExited` on a ThreadPool thread 4. The callback logs "Project cache service process exited" via `LoggingServiceToPluginLoggerAdapter.LogMessage` 5. This calls `ProcessLoggingEvent()` which accesses `_eventQueue.Count` → **null reference crash** ### Debugger Evidence ``` _serviceState = Shutdown _eventQueue = null _dequeueEvent = null _enqueueEvent = null _emptyQueueEvent = null _loggingEventProcessingCancellation = null buildEvent.Message = "Project cache service process exited" Call: System.Diagnostics.Process.OnExited -> ProjectCachePlugin.dll -> LoggingServiceToPluginLoggerAdapter.LogMessage ``` ## Root Cause `ProcessLoggingEvent` lacked two protections that already exist elsewhere in the codebase: 1. **Shutdown state guard** - Other public methods (`RegisterLogger`, `RegisterDistributedLogger`, `InitializeNodeLoggers`, `InitializeComponent`) all check `_serviceState != LoggingServiceState.Shutdown`. `ProcessLoggingEvent` did not. 2. **Local field capture** - The `LoggingEventProc` queue pump thread already captures `_eventQueue`, `_dequeueEvent`, `_emptyQueueEvent`, and `_enqueueEvent` into local variables to prevent races with `CleanLoggingEventProcessing()`. `ProcessLoggingEvent` accessed the fields directly. ## Fix Three layers of defense, all following patterns already established in the same file: 1. **Early return** when `_serviceState == LoggingServiceState.Shutdown` - silently drops the event, since the service is no longer operational. 2. **Capture fields to locals** before use - prevents null dereference if `CleanLoggingEventProcessing()` runs concurrently between the state check and the field reads. 3. **Null-check the captured locals** - second defense against the TOCTOU race between step 1 and step 2. ## Risk **Low.** The fix only adds early-exit guards - no behavioral change for the normal (non-shutdown) code path. The patterns used are already established in the same file.
There was a problem hiding this comment.
Pull request overview
This PR hardens LoggingService.ProcessLoggingEvent against a shutdown-time race where internal queue fields are disposed/nullified, preventing crashes from late-arriving log events (for example from plugin callbacks running on thread pool threads after the build completes).
Changes:
- Add a shutdown guard and local-field capture in
ProcessLoggingEventto avoid null dereferences and handle disposed wait handles safely. - Add unit tests covering post-shutdown calls and concurrent shutdown/logging scenarios.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/Build/BackEnd/Components/Logging/LoggingService.cs | Adds shutdown/race defenses to async event enqueueing to prevent post-shutdown crashes. |
| src/Build.UnitTests/BackEnd/LoggingService_Tests.cs | Adds regression tests for logging after shutdown and during concurrent shutdown. |
| LoggingService loggingService = (LoggingService)LoggingService.CreateLoggingService(LoggerMode.Asynchronous, 1); | ||
| ((IBuildComponent)loggingService).InitializeComponent(mockHost); | ||
| loggingService.RegisterLogger(new ConsoleLogger()); | ||
|
|
There was a problem hiding this comment.
The new tests register ConsoleLogger, which writes to stdout and can introduce noisy/flaky test output. Prefer using MockLogger (optionally constructed with ITestOutputHelper) so assertions can be made against captured events without depending on console IO.
| startSignal.WaitOne(); | ||
| for (int i = 0; i < 100; i++) | ||
| { | ||
| try | ||
| { | ||
| BuildMessageEventArgs msg = new BuildMessageEventArgs($"Message {i}", null, null, MessageImportance.Low); | ||
| loggingService.ProcessLoggingEvent(msg); | ||
| } | ||
| catch (Exception ex) | ||
| { | ||
| Interlocked.CompareExchange(ref caughtException, ex, null); | ||
| break; | ||
| } | ||
| } | ||
| }); |
There was a problem hiding this comment.
logThread is created as a foreground Thread and waits on startSignal.WaitOne() with no timeout. If the test fails before startSignal.Set() (or if the thread blocks unexpectedly), it can leave a live foreground thread and hang the test host process. Set IsBackground = true and/or use a timed wait with an assertion so the thread can’t outlive the test.
| startSignal.WaitOne(); | |
| for (int i = 0; i < 100; i++) | |
| { | |
| try | |
| { | |
| BuildMessageEventArgs msg = new BuildMessageEventArgs($"Message {i}", null, null, MessageImportance.Low); | |
| loggingService.ProcessLoggingEvent(msg); | |
| } | |
| catch (Exception ex) | |
| { | |
| Interlocked.CompareExchange(ref caughtException, ex, null); | |
| break; | |
| } | |
| } | |
| }); | |
| try | |
| { | |
| bool signaled = startSignal.WaitOne(TimeSpan.FromSeconds(10)); | |
| signaled.ShouldBeTrue("Logging thread did not receive the start signal within the allotted time."); | |
| for (int i = 0; i < 100; i++) | |
| { | |
| try | |
| { | |
| BuildMessageEventArgs msg = new BuildMessageEventArgs($"Message {i}", null, null, MessageImportance.Low); | |
| loggingService.ProcessLoggingEvent(msg); | |
| } | |
| catch (Exception ex) | |
| { | |
| Interlocked.CompareExchange(ref caughtException, ex, null); | |
| break; | |
| } | |
| } | |
| } | |
| catch (Exception ex) | |
| { | |
| Interlocked.CompareExchange(ref caughtException, ex, null); | |
| } | |
| }) | |
| { | |
| IsBackground = true | |
| }; |
| #region ProcessLoggingEvent After Shutdown Tests | ||
|
|
||
| /// <summary> | ||
| /// Verify that ProcessLoggingEvent does not crash when called after the | ||
| /// LoggingService has been shut down (asynchronous mode). This reproduces | ||
| /// the race condition where an external callback (e.g., Process.Exited) | ||
| /// tries to log after ShutdownComponent has nullified internal state. | ||
| /// </summary> | ||
| [Fact] | ||
| public void ProcessLoggingEventAfterShutdown_Asynchronous_DoesNotThrow() |
There was a problem hiding this comment.
These added tests don't use TestEnvironment / ITestOutputHelper patterns that are used broadly in this test suite to ensure deterministic cleanup and capture diagnostic output in CI. Consider updating the test class to accept ITestOutputHelper and creating a TestEnvironment in the new tests so failures include useful logs and any test state is reliably reverted.
…wn (#13450)
Summary
Fix a race condition crash (access violation 0xC0000005 / NullReferenceException) in
LoggingService.ProcessLoggingEventwhen the method is called after the logging service has been shut down.Problem
ProcessLoggingEventdirectly accesses_eventQueue,_dequeueEvent, and_enqueueEventwithout any shutdown guard. DuringShutdownComponent(),CleanLoggingEventProcessing()disposes and nullifies all of these fields. If an external callback (e.g.,Process.Exitedfrom aProjectCachePlugin) attempts to log a message after shutdown has completed, the null field access causes an unrecoverable crash.Reproduction Scenario
ProjectCachePluginLoggingService.ShutdownComponent()is calledProcess.OnExitedon a ThreadPool threadLoggingServiceToPluginLoggerAdapter.LogMessageProcessLoggingEvent()which accesses_eventQueue.Count→ null reference crashDebugger Evidence
Root Cause
ProcessLoggingEventlacked two protections that already exist elsewhere in the codebase:Shutdown state guard - Other public methods (
RegisterLogger,RegisterDistributedLogger,InitializeNodeLoggers,InitializeComponent) all check_serviceState != LoggingServiceState.Shutdown.ProcessLoggingEventdid not.Local field capture - The
LoggingEventProcqueue pump thread already captures_eventQueue,_dequeueEvent,_emptyQueueEvent, and_enqueueEventinto local variables to prevent races withCleanLoggingEventProcessing().ProcessLoggingEventaccessed the fields directly.Fix
Three layers of defense, all following patterns already established in the same file:
_serviceState == LoggingServiceState.ShutdownCleanLoggingEventProcessing()runs concurrently between the state check and the field reads.Risk
Low. The fix only adds early-exit guards - no behavioral change for the normal (non-shutdown) code path. The patterns used are already established in the same file.
Fixes #
Context
Changes Made
Testing
Notes