✅ delay /empty.css response to fix WebKit early-timings flake #4631

Merged
thomas-lebeau merged 1 commit into main from thomas.lebeau/fix-webkit-empty-css-flake on May 20, 2026


@thomas-lebeau thomas-lebeau commented May 15, 2026

Motivation

Fix the top 3 flaky E2E tests of the last 7 days: rum resources › retrieve early requests timings (async / npm / bundle), with 28 / 24 / 21 occurrences across many branches.

Root cause (verified on CI)

WebKit clamps PerformanceResourceTiming timestamps and performance.now() to 1ms resolution (privacy/Spectre mitigation). For the zero-byte /empty.css served over fast Linux CI loopback, the entire request occasionally completes inside a single 1ms tick. When that happens, every timestamp on the entry (startTime, requestStart, responseStart, responseEnd, …) clamps to the same value, and the SDK truthfully emits duration: 0 and download.start: 0. The existing Safari fallback in computeResourceEntryDuration (packages/rum-core/src/domain/resource/resourceUtils.ts:76) only fires when startTime < responseEnd, so it cannot help when they're equal.
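The collapse described above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not SDK code: `TimingSample`, `clampTimings`, and the sample values are hypothetical, and it assumes the clamping behaves like rounding down to a 1 ms tick.

```typescript
// Hypothetical illustration: names and sample values are not taken from the SDK.
interface TimingSample {
  startTime: number
  requestStart: number
  responseStart: number
  responseEnd: number
}

// Round every timestamp down to a multiple of `resolutionMs`, mimicking a coarse clock.
// Assumption: the browser clamps by flooring; the real rounding mode may differ.
function clampTimings(sample: TimingSample, resolutionMs: number): TimingSample {
  const clamp = (t: number): number => Math.floor(t / resolutionMs) * resolutionMs
  return {
    startTime: clamp(sample.startTime),
    requestStart: clamp(sample.requestStart),
    responseStart: clamp(sample.responseStart),
    responseEnd: clamp(sample.responseEnd),
  }
}

// A request that takes 0.8 ms in total and happens to fit inside one 1 ms tick.
const fastRequest: TimingSample = {
  startTime: 6.1,
  requestStart: 6.3,
  responseStart: 6.7,
  responseEnd: 6.9,
}

const clamped = clampTimings(fastRequest, 1)
const duration = clamped.responseEnd - clamped.startTime

// A guard of the form `startTime < responseEnd` is false once both values are equal,
// so a fallback placed behind such a guard never runs for this entry.
const fallbackApplies = clamped.startTime < clamped.responseEnd

console.log(clamped, duration, fallbackApplies)
```

With these inputs every field lands on the same tick, which matches the shape of the failing entry captured on CI.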

Repro evidence

Verified on CI via PR #4630 (instrumentation + --repeat-each=30 --retries=0, webkit-pinned + chromium):

| | Without fix | With fix |
| --- | --- | --- |
| webkit-pinned | 17 / 90 failed (~19%) | 0 / 90 failed |
| chromium | 0 / 90 failed | 0 / 90 failed ✅ |

Sample failing entry from the without-fix run:

raw = {
  startTime: 6, fetchStart: 6, …, requestStart: 6,
  responseStart: 6, responseEnd: 6, duration: 0,
  nowSamples: [567, 567, 567, 567]
}
sdk = { duration: 0, download: {start: 0}, first_byte: {start: 0} }

Compare with Chromium (100 μs resolution — actual sub-ms digits):

startTime: 8.7, requestStart: 9.3, responseStart: 9.8, responseEnd: 10.1, duration: 1.4

Approach: server-side delay on /empty.css

Add a 50 ms setTimeout before the response in the mock server. With a 50 ms gap between startTime and responseEnd, WebKit's 1 ms clamping cannot collapse them — responseStart and responseEnd are guaranteed to fall in different ticks, so the SDK reports real non-zero duration and download.start.
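As a rough sketch of the idea, the delay amounts to deferring the response inside a timer. The real handler lives in the mock server and is not reproduced here; `MinimalResponse`, `serveEmptyCssWithDelay`, `EMPTY_CSS_DELAY_MS`, and the `Content-Type` header are assumptions made for illustration only.

```typescript
// Hypothetical minimal response shape; the real mock server's types are not shown in this PR.
interface MinimalResponse {
  setHeader(name: string, value: string): void
  end(body?: string): void
}

// 50 ms is the delay named in the PR description.
const EMPTY_CSS_DELAY_MS = 50

// Send an empty stylesheet only after `delayMs`, so the request spans many 1 ms ticks.
function serveEmptyCssWithDelay(res: MinimalResponse, delayMs: number = EMPTY_CSS_DELAY_MS): void {
  setTimeout(() => {
    res.setHeader('Content-Type', 'text/css')
    res.end('')
  }, delayMs)
}

// Usage with a fake response object that records what was called.
const calls: string[] = []
const fakeResponse: MinimalResponse = {
  setHeader: (name, value) => {
    calls.push(`setHeader ${name}: ${value}`)
  },
  end: () => {
    calls.push('end')
  },
}

serveEmptyCssWithDelay(fakeResponse)
// Nothing has been sent yet at this point; the response is written once the timer fires.
```

Keeping the delay on the server means neither the SDK nor the test assertions need to know about browser clock resolution.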

Alternatives considered:

  • Relax the test to accept duration >= 0: would silently mask a real bug if the SDK started reporting duration: 0 on Chromium/Firefox (raised by automated review on the earlier iteration of this PR).
  • Drop entries when timestamps collapse: silent data loss for legit cached resources in production.
  • Floor duration to a minimum in the SDK: fabricates data for customer dashboards; doesn't fix download.start: 0 either.

The server delay keeps expectToHaveValidTimings strict on every browser, so any future regression that zeroes a duration would still be caught.
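To make the value of a strict check concrete, here is a small sketch of what "strict" means in this context. It is not the body of `expectToHaveValidTimings`; `ReportedTimings`, `findTimingViolations`, and the numeric values are hypothetical.

```typescript
// Hypothetical shape; field names are illustrative, not the SDK's event schema.
interface ReportedTimings {
  duration: number
  downloadStart: number
}

// Return a human-readable list of violations; an empty list means the timings pass.
function findTimingViolations(timings: ReportedTimings): string[] {
  const violations: string[] = []
  if (!(timings.duration > 0)) {
    violations.push('duration must be > 0')
  }
  if (!(timings.downloadStart > 0)) {
    violations.push('download.start must be > 0')
  }
  return violations
}

// Collapsed entry, as in the failing WebKit run (duration: 0, download.start: 0).
const collapsed = findTimingViolations({ duration: 0, downloadStart: 0 })

// Entry with a comfortable gap between start and end (illustrative values).
const delayed = findTimingViolations({ duration: 50, downloadStart: 50 })

console.log(collapsed, delayed)
```

A relaxed `>= 0` variant would accept the collapsed entry on every browser, which is the masking risk listed among the alternatives.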

Changes

  • test/e2e/lib/framework/serverApps/mock.ts: wrap the /empty.css response in a 50 ms setTimeout. Comment explains the WebKit clamping and links the failure mode.

Test instructions

Checklist

  • Tested locally
  • Tested on staging
  • Added unit tests for this change.
  • Added e2e/integration tests for this change.
  • Updated documentation and/or relevant AGENTS.md file

@thomas-lebeau thomas-lebeau marked this pull request as ready for review May 15, 2026 07:43
@thomas-lebeau thomas-lebeau requested a review from a team as a code owner May 15, 2026 07:43

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1dedbfc870

Comment thread test/e2e/scenario/rum/resources.scenario.ts Outdated
WebKit clamps `PerformanceResourceTiming` timestamps and `performance.now()`
to 1ms (privacy/Spectre mitigation). For the zero-byte `/empty.css` served
over fast (Linux CI) loopback, the entire request occasionally completes in
a single 1ms tick, collapsing every timestamp to the same value. The SDK
then truthfully emits `duration: 0` and `download.start: 0`, and the strict
`> 0` assertion in `expectToHaveValidTimings` fails — top flaky E2E this
week (28/24/21 occurrences for the async/npm/bundle variants).

Adding a 50ms server-side delay guarantees `startTime < responseEnd` even
under 1ms clamping. Keeps `expectToHaveValidTimings` strict, so a real
regression that zeroed duration would still be caught on every browser.
@thomas-lebeau thomas-lebeau force-pushed the thomas.lebeau/fix-webkit-empty-css-flake branch from 1dedbfc to 34b1150 Compare May 15, 2026 09:13
@thomas-lebeau thomas-lebeau changed the title from "✅ stop asserting duration > 0 on WebKit early-timings test" to "✅ delay /empty.css response to fix WebKit early-timings flake" May 15, 2026

datadog-prod-us1-4 Bot commented May 15, 2026

Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 76.96% (+0.00%)

🔗 Commit SHA: 34b1150

@cit-pr-commenter-54b7da

Bundles Sizes Evolution

| 📦 Bundle Name | Base Size | Local Size | 𝚫 | 𝚫% | Status |
| --- | --- | --- | --- | --- | --- |
| Rum | 169.51 KiB | 169.51 KiB | 0 B | 0.00% | |
| Rum Profiler | 5.97 KiB | 5.97 KiB | 0 B | 0.00% | |
| Rum Recorder | 21.23 KiB | 21.23 KiB | 0 B | 0.00% | |
| Logs | 54.70 KiB | 54.70 KiB | 0 B | 0.00% | |
| Rum Slim | 127.85 KiB | 127.85 KiB | 0 B | 0.00% | |
| Worker | 22.99 KiB | 22.99 KiB | 0 B | 0.00% | |

@thomas-lebeau thomas-lebeau merged commit d58b21b into main May 20, 2026
23 checks passed
@thomas-lebeau thomas-lebeau deleted the thomas.lebeau/fix-webkit-empty-css-flake branch May 20, 2026 13:35
@github-actions github-actions Bot locked and limited conversation to collaborators May 20, 2026

3 participants