fix(cloudflare): correct browser-rendering getCrawl response schema by k3dom · Pull Request #331 · alchemy-run/distilled

k3dom · 2026-06-10T13:19:56Z

Disclaimer

This PR was fully generated using Claude Code. However, I have been running these exact schema fixes as a pnpm patch against @distilled.cloud/cloudflare for my own crawls and can confirm they match the live API behavior.

Problem

The upstream cloudflare-typescript SDK types for browser-rendering's getCrawl operation don't match what the API actually returns, so decoding real responses fails:

records[].metadata is typed as required, but Cloudflare omits it for records that have not completed yet (queued/skipped/cancelled/...). Polling an in-progress crawl job therefore fails to decode with a ParseError until every record has finished.
cursor is typed as string, but the API returns a number (the next record index for pagination). It is absent on the last page.

Example of a real response that the current schema rejects:

{
  "result": {
    "id": "...",
    "browserSecondsUsed": 12,
    "finished": 3,
    "records": [
      { "status": "queued", "url": "https://example.com/page" }
    ],
    "skipped": 0,
    "status": "running",
    "total": 10,
    "cursor": 50
  }
}
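
To illustrate why the numeric cursor matters to a consumer, here is a minimal sketch of a pagination loop. The `CrawlPage` shape and the `FetchPage` signature are assumptions made for this example, not part of the SDK; a real fetcher would call the getCrawl operation.

```typescript
// Hypothetical page shape: only the fields relevant to pagination.
interface CrawlPage {
  records: { url: string; status: string }[];
  // Next record index; absent (or null) on the last page.
  cursor?: number | null;
}

// Hypothetical fetcher signature, assumed for this sketch.
type FetchPage = (cursor?: number) => Promise<CrawlPage>;

// Collect all records by following the numeric cursor until it is missing.
async function collectRecords(
  fetchPage: FetchPage,
): Promise<CrawlPage["records"]> {
  const all: CrawlPage["records"] = [];
  let cursor: number | undefined = undefined;
  do {
    const page: CrawlPage = await fetchPage(cursor);
    all.push(...page.records);
    // Compare against null/undefined rather than truthiness,
    // so that a numeric cursor of 0 would not end the loop early.
    cursor = page.cursor ?? undefined;
  } while (cursor !== undefined);
  return all;
}
```

The explicit `undefined` check is the point of the sketch: once the cursor is a number, a truthiness test is no longer a safe end-of-pages condition.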

Fix

Adds patches/browser-rendering/getCrawl.json using the existing patch mechanism:

{
  "response": {
    "properties": {
      "records[].metadata": { "optional": true },
      "cursor": { "type": "number", "nullable": true }
    }
  }
}

and regenerates the service (bun run generate). Only the getCrawl schema changes; the emitted specs/cloudflare/browser-rendering.openapi.yml is updated to match. Unrelated drift the generator produced in other emitted specs (containers/r2/secrets-store/workers info titles) was intentionally left out of this PR.
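
In TypeScript terms, the patched response shape corresponds roughly to the sketch below. This is a hand-written, dependency-free stand-in for illustration only: the generated code uses Effect Schema, and the `CrawlRecord`, `CrawlResult` and `decodeCrawl` names are assumptions, not identifiers from the repository. Only a subset of the response fields is modelled.

```typescript
// Illustrative types only; the real schema is generated.
interface CrawlRecord {
  url: string;
  status: string;
  // Optional: omitted for records that have not completed yet.
  metadata?: Record<string, unknown>;
}

interface CrawlResult {
  id: string;
  status: string;
  records: CrawlRecord[];
  // Number (next record index), null, or absent on the last page.
  cursor?: number | null;
}

function isObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Decode the `result` object using the patched rules.
// Keys not listed here are ignored rather than rejected.
function decodeCrawl(input: unknown): CrawlResult {
  if (!isObject(input)) throw new Error("result: expected object");
  const { id, status, records, cursor } = input;
  if (typeof id !== "string") throw new Error("id: expected string");
  if (typeof status !== "string") throw new Error("status: expected string");
  if (!Array.isArray(records)) throw new Error("records: expected array");

  const decoded = records.map((r: unknown, i: number): CrawlRecord => {
    if (!isObject(r)) throw new Error(`records[${i}]: expected object`);
    const { url, status: recordStatus, metadata } = r;
    if (typeof url !== "string") throw new Error(`records[${i}].url: expected string`);
    if (typeof recordStatus !== "string") throw new Error(`records[${i}].status: expected string`);
    // metadata may be missing; if present it must be an object.
    if (metadata === undefined) return { url, status: recordStatus };
    if (!isObject(metadata)) throw new Error(`records[${i}].metadata: expected object`);
    return { url, status: recordStatus, metadata };
  });

  let next: number | null | undefined;
  if (cursor === undefined || cursor === null) next = cursor;
  else if (typeof cursor === "number") next = cursor;
  else throw new Error("cursor: expected number");

  return { id, status, records: decoded, cursor: next };
}
```

Passing the `result` object from the example response in the Problem section to `decodeCrawl` succeeds under these rules, whereas a string-typed `cursor` or a required `metadata` would reject it.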

bun run check (tsgo + oxlint + oxfmt) passes.

Non-issues

Two other discrepancies between the live API and the SDK types were verified to not need patching:

Extra metadata keys (og:*, lastModified, ...) — the API returns more metadata fields than the SDK declares, but effect Schema's default decode ignores unknown keys, so they pass through without error.
Undocumented job statuses (running, cancelled_by_user) — the top-level job status is a plain Schema.String (not a literal union), so these decode fine as-is.

The upstream Cloudflare TypeScript SDK types for the getCrawl operation do not match what the API actually returns:

- `records[].metadata` is typed as required, but Cloudflare omits it for records that have not completed (queued/skipped/cancelled/...), causing decode failures while polling an in-progress crawl.
- `cursor` is typed as a string, but the API returns a number (the next record index). It is absent on the last page.

Adds a patch file for the operation and regenerates the service.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

k3dom marked this pull request as ready for review June 10, 2026 13:29

k3dom wants to merge 1 commit into alchemy-run:main from k3dom:fix/browser-rendering-getcrawl-schema
