fix: eliminate flaky OnetimeConfigUpdateUnittest via injectable clock #2551
…ess error Agent-Logs-Url: https://github.com/alibaba/loongcollector/sessions/75a7db26-9448-439d-a9ea-33e319a7e2ce Co-authored-by: yyuuttaaoo <1827594+yyuuttaaoo@users.noreply.github.com>
Agent-Logs-Url: https://github.com/alibaba/loongcollector/sessions/75a7db26-9448-439d-a9ea-33e319a7e2ce Co-authored-by: yyuuttaaoo <1827594+yyuuttaaoo@users.noreply.github.com>
Agent-Logs-Url: https://github.com/alibaba/loongcollector/sessions/dbe89a33-7fcb-4c52-a407-7db4dd54b7db Co-authored-by: yyuuttaaoo <1827594+yyuuttaaoo@users.noreply.github.com>
… non-test impls with system_clock Agent-Logs-Url: https://github.com/alibaba/loongcollector/sessions/c9045f85-6c3a-4a59-9cb4-21453adba656 Co-authored-by: yyuuttaaoo <1827594+yyuuttaaoo@users.noreply.github.com>
…point, update test call sites Agent-Logs-Url: https://github.com/alibaba/loongcollector/sessions/c9205b39-a601-47e8-9b61-7580709691c6 Co-authored-by: yyuuttaaoo <1827594+yyuuttaaoo@users.noreply.github.com>
Removed duplicate include of 'common/TimeUtil.h'.
Pull request overview
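This PR aims to eliminate intermittent CI failures in the onetime-config unit tests that occur when a second boundary is crossed. It provides a controllable "current time" in UT builds and makes the expiry-time calculation and the assertions use the same time source, so the assertions become deterministic.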
Changes:
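- In `common/TimeUtil`, add a UT-injectable clock (`gCurrentTimeNs`) and an RAII override (`ScopedClockOverride`), and route `GetCurrentTimeIn*()` through this injection point in UT builds.
- `PipelineConfig::GetExpireTimeIfOneTime()` now uses `GetCurrentTimeInSeconds()` instead of `time(nullptr)`, so production code and tests no longer read the time separately and end up off by 1.
- Update some assertions in `OnetimeConfigUpdateUnittest`/`PipelineConfigUnittest`: pin "now" while calling the update logic, and compute the expected expiry time from that fixed value.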
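The injection point and RAII override described above can be sketched roughly as follows. This is not the PR's actual code: the names `gCurrentTimeNs`, `ScopedClockOverride`, and `GetCurrentTimeInSeconds` come from the PR text, while `RealTimeNs`, the function name `GetCurrentTimeInMicroSeconds`, the `sketch` namespace, and all function bodies are assumptions made for illustration. The real code is also guarded by `APSARA_UNIT_TEST_MAIN`, which is omitted here.

```cpp
// Illustrative sketch only; bodies are assumptions, not the PR's code.
#include <chrono>
#include <cstdint>
#include <functional>

namespace sketch {

// Hypothetical helper: real wall-clock time in nanoseconds since the epoch.
uint64_t RealTimeNs() {
    return static_cast<uint64_t>(std::chrono::duration_cast<std::chrono::nanoseconds>(
                                     std::chrono::system_clock::now().time_since_epoch())
                                     .count());
}

// Per-thread injection point. Every thread starts with the real clock.
thread_local std::function<uint64_t()> gCurrentTimeNs = RealTimeNs;

// Getters derive from the injection point by dividing to the target precision.
uint64_t GetCurrentTimeInMicroSeconds() { return gCurrentTimeNs() / 1000ULL; }
uint64_t GetCurrentTimeInSeconds() { return gCurrentTimeNs() / 1000000000ULL; }

// RAII guard: pins the calling thread's clock, restores it on destruction.
class ScopedClockOverride {
public:
    explicit ScopedClockOverride(std::chrono::system_clock::time_point fixed)
        : mPrevious(gCurrentTimeNs) {
        const uint64_t ns = static_cast<uint64_t>(
            std::chrono::duration_cast<std::chrono::nanoseconds>(fixed.time_since_epoch()).count());
        gCurrentTimeNs = [ns] { return ns; };
    }
    ~ScopedClockOverride() { gCurrentTimeNs = mPrevious; }

    ScopedClockOverride(const ScopedClockOverride&) = delete;
    ScopedClockOverride& operator=(const ScopedClockOverride&) = delete;

private:
    std::function<uint64_t()> mPrevious; // clock that was active before the override
};

} // namespace sketch
```

Inside the guard's scope every getter on the calling thread returns a value derived from the same fixed instant; once the guard is destroyed the real clock is back in effect.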
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| core/common/TimeUtil.h | Declares the injectable clock and ScopedClockOverride in UT builds |
| core/common/TimeUtil.cpp | Implements gCurrentTimeNs in UT builds and makes GetCurrentTimeIn*() overridable |
| core/config/PipelineConfig.cpp | Onetime expiry-time calculation now uses GetCurrentTimeInSeconds() |
| core/config/PipelineConfig.h | Adds a friend declaration for OnetimeConfigUpdateUnittest |
| core/unittest/config/OnetimeConfigUpdateUnittest.cpp | Pins the update instant with ScopedClockOverride; assertions use the fixed now |
| core/unittest/config/PipelineConfigUnittest.cpp | Pins now with ScopedClockOverride in some scenarios; assertions use the fixed now |
@copilot apply changes based on the comments in this thread
… friend, fix remaining flaky assertion Agent-Logs-Url: https://github.com/alibaba/loongcollector/sessions/4d1f4f26-225f-43cd-a0e1-199437031b90 Co-authored-by: yyuuttaaoo <1827594+yyuuttaaoo@users.noreply.github.com>
Applied the following changes in c32a7a4:
return std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::system_clock::now().time_since_epoch())
    .count();
#ifdef APSARA_UNIT_TEST_MAIN
return gCurrentTimeNs() / 1000ULL;
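Why does the UT need a special branch?
Because the UT has to use an overridable time function, which may not be system_clock::now(); non-UT builds are not allowed to override the time function.
This fully addresses the following:
- UT failures where two system_clock::now() calls are tens of milliseconds apart, so the computed difference comes out one second too large or too small
- Other UTs that depend on sleep can be reworked with this util so that they no longer sleep, which reduces UT run time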
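The off-by-one failure mode can be worked through with plain arithmetic. The sketch below is only an illustration: `ToEpochSeconds`, `ExpireTime`, `ExpectedMinusActual`, the `sketch` namespace, and the 600-second TTL used in the usage note are hypothetical and not taken from the PR. It truncates two clock reads to whole seconds, as `time(nullptr)` does, and reports how far the test's expectation drifts from the value production computed.

```cpp
// Illustrative arithmetic only; all names here are hypothetical.
#include <chrono>
#include <cstdint>

namespace sketch {

using TimePoint = std::chrono::system_clock::time_point;

// Truncate a time point to whole seconds since the epoch.
int64_t ToEpochSeconds(TimePoint tp) {
    return std::chrono::duration_cast<std::chrono::seconds>(tp.time_since_epoch()).count();
}

// Stand-in for an expiry calculation: "now" in seconds plus a TTL.
int64_t ExpireTime(TimePoint now, int64_t ttlSeconds) {
    return ToEpochSeconds(now) + ttlSeconds;
}

// Production reads the clock at productionRead; the test reads it `gap` later.
// Returns (test's expected expiry) - (production's actual expiry).
int64_t ExpectedMinusActual(TimePoint productionRead, std::chrono::milliseconds gap, int64_t ttlSeconds) {
    return ExpireTime(productionRead + gap, ttlSeconds) - ExpireTime(productionRead, ttlSeconds);
}

} // namespace sketch
```

With a production read 990 ms into a second and a test read 30 ms later, `ExpectedMinusActual` returns 1, which is the intermittent failure. With both reads inside the same second, or with a single shared read (a gap of 0), it returns 0.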
`OnetimeConfigUpdateUnittest::OnCollectionConfigUpdate` was intermittently off by 1 second in CI because the production code and the test assertion each called `time(nullptr)` independently, and a second boundary could tick between them.

Changes

- `common/TimeUtil.h`: Under `APSARA_UNIT_TEST_MAIN`, adds:
  - `thread_local std::function<uint64_t()> gCurrentTimeNs`: injectable clock returning nanoseconds since epoch. It is `thread_local` so background threads spawned by pipelines always use the real clock while the test thread can safely override without data races.
  - `ScopedClockOverride`: RAII guard that accepts a `std::chrono::system_clock::time_point`, pins the test thread's clock to a fixed nanosecond value for the duration of a test, and restores it on destruction; copy/assign deleted.
- `common/TimeUtil.cpp`: Defines the `thread_local gCurrentTimeNs` default (real `system_clock::now()`). All four `GetCurrentTimeIn*()` functions derive from it in UT builds by dividing to the appropriate precision.
- `PipelineConfig.cpp`: Both `time(nullptr)` calls in `GetExpireTimeIfOneTime()` replaced with `GetCurrentTimeInSeconds()`, routing through the injectable clock in test builds.
- `PipelineConfig.h`: Retains only `friend class PipelineConfigUnittest` (removed unnecessary `friend class OnetimeConfigUpdateUnittest`, which never accesses protected/private members).
- `OnetimeConfigUpdateUnittest.cpp`: Captures `system_clock::now()` before each `UpdatePipelines()` call, wraps the call in `ScopedClockOverride`, and asserts using `duration_cast<seconds>` of the captured value, making the expected value identical to what the production code computes.
- `PipelineConfigUnittest.cpp`: Same pattern applied to all `mOnetimeExpireTime` assertions (including the previously missed "new config" scenario) that formerly compared against an independent `time(nullptr)` call.

This pull request was created from Copilot chat.
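The `thread_local` choice can be illustrated with a small experiment: pin the clock on the calling thread, compute an expiry there, and read the clock from a freshly spawned thread. This is a sketch under assumptions, not the PR's code. `RealTimeNs`, `ComputeExpireTime`, `Observation`, `ObserveWithPinnedClock`, and the `sketch` namespace are hypothetical, and the override is written inline rather than through `ScopedClockOverride` to keep the block short.

```cpp
// Illustrative sketch only; shows per-thread isolation of an overridden clock.
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

namespace sketch {

// Hypothetical helper: real wall-clock time in nanoseconds since the epoch.
uint64_t RealTimeNs() {
    return static_cast<uint64_t>(std::chrono::duration_cast<std::chrono::nanoseconds>(
                                     std::chrono::system_clock::now().time_since_epoch())
                                     .count());
}

// Each thread gets its own copy, initialized to the real clock.
thread_local std::function<uint64_t()> gCurrentTimeNs = RealTimeNs;

uint64_t GetCurrentTimeInSeconds() { return gCurrentTimeNs() / 1000000000ULL; }

// Hypothetical stand-in for the production expiry calculation.
uint64_t ComputeExpireTime(uint64_t ttlSeconds) { return GetCurrentTimeInSeconds() + ttlSeconds; }

struct Observation {
    uint64_t expireOnTestThread; // computed while this thread's clock is pinned
    uint64_t nowOnWorkerThread;  // read from a thread that never overrode its clock
};

Observation ObserveWithPinnedClock(std::chrono::system_clock::time_point fixed, uint64_t ttlSeconds) {
    const std::function<uint64_t()> saved = gCurrentTimeNs;
    const uint64_t ns = static_cast<uint64_t>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(fixed.time_since_epoch()).count());
    gCurrentTimeNs = [ns] { return ns; }; // affects only the calling thread

    Observation out{};
    out.expireOnTestThread = ComputeExpireTime(ttlSeconds);
    std::thread worker([&out] { out.nowOnWorkerThread = GetCurrentTimeInSeconds(); });
    worker.join(); // join makes the worker's write visible here

    gCurrentTimeNs = saved; // restore the previous clock for this thread
    return out;
}

} // namespace sketch
```

Pinning the calling thread to a past instant gives an expiry of exactly that instant plus the TTL, while the worker thread still reports the real current time. With a plain global instead of `thread_local`, the worker would observe the pinned value and the assignment would race with any concurrent reader.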