
fix!: prevent path traversal in FilenameSchema #25

Merged
shahar-brd merged 6 commits into brightdata:dev from karaposu:dev on Jun 7, 2026


@karaposu

Contributor

Take the basename of each filename before stripping reserved characters, so that "../" segments and absolute paths can't escape the current working directory via saveResults / SnapshotAPI.download.

BREAKING CHANGE: filenames passed to saveResults and snapshot download are now reduced to their final path segment. Callers relying on nested subdirectory paths (e.g. "output/data.json") will now write to the basename only ("data.json") in the working directory.
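
The basename-then-strip order described above can be sketched roughly as follows. This is a minimal illustration, not the actual FilenameSchema code: the `sanitizeFilename` name, the reserved-character set, and the decision to throw on an empty result are all assumptions made for the example.

```typescript
// Hypothetical helper for illustration; the real FilenameSchema may differ.
// Assumed reserved set: characters commonly rejected in Windows filenames,
// plus ASCII control characters.
const RESERVED_CHARS = /[<>:"|?*\x00-\x1f]/g;

function sanitizeFilename(input: string): string {
  // Step 1: reduce to the final path segment, treating both "/" and "\"
  // as separators so POSIX and Windows style paths are handled alike.
  const segments = input.split(/[\\/]+/).filter((s) => s.length > 0);
  const base = segments.pop() ?? "";

  // Step 2: only now strip reserved characters from that single segment.
  const cleaned = base.replace(RESERVED_CHARS, "");

  // Reject results that are empty or still name a directory reference.
  if (cleaned === "" || cleaned === "." || cleaned === "..") {
    throw new Error(`Invalid filename: ${JSON.stringify(input)}`);
  }
  return cleaned;
}

sanitizeFilename("../../etc/passwd"); // "passwd"
sanitizeFilename("output/data.json"); // "data.json" (the breaking change)
```

Under these assumptions, a caller that previously passed "output/data.json" would need to create and change into the target directory itself, or join the directory onto the sanitized name outside the schema.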

karaposu added 3 commits June 6, 2026 10:20
Basename filenames before stripping reserved chars so "../" and absolute
paths can't escape cwd via saveResults / SnapshotAPI.download.

BREAKING CHANGE: filenames passed to saveResults and snapshot download
are now reduced to their final path segment. Callers relying on nested
subdirectory paths (e.g. "output/data.json") will now write to the
basename only ("data.json") in the working directory.

Adds the Bright Data Crawl API as a top-level service on bdclient,
mirroring brightdata.crawler from the Python SDK. Backed by the same
/datasets/v3/{scrape,trigger,progress,snapshot} endpoints already used
by the platform scrapers — dataset_id gd_m6gjtfmeh43we6cqc.

API surface:
- client.crawler.crawl(urls)             — sync /scrape → CrawlResult
- client.crawler.trigger(urls)           — async /trigger → ScrapeJob (CrawlJob alias)
- client.crawler.status(snapshotId)      — GET /progress → status string
- client.crawler.download(snapshotId)    — poll + fetch → CrawlResult
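
As a rough sketch of how the four methods listed above might fit together in an asynchronous flow: the interface below is reconstructed from this commit message only, and every field other than `pageCount` and `snapshotId` (plus the `"ready"` status value and the in-memory `fakeCrawler`) is an assumption made so the example runs without the SDK.

```typescript
// Hypothetical shapes inferred from the commit message; not the SDK's types.
interface CrawlResult {
  success: boolean;          // assumed BaseResult-style field
  data: unknown[];           // assumed payload field
  pageCount: number;         // named in the PR
  snapshotId: string | null; // named in the PR
}
interface CrawlJob { snapshotId: string } // assumed minimal job shape

interface Crawler {
  crawl(urls: string[]): Promise<CrawlResult>;        // sync /scrape
  trigger(urls: string[]): Promise<CrawlJob>;         // async /trigger
  status(snapshotId: string): Promise<string>;        // GET /progress
  download(snapshotId: string): Promise<CrawlResult>; // poll + fetch
}

// Trigger a crawl, check progress once, then let download() poll and fetch.
async function crawlInBackground(crawler: Crawler, urls: string[]) {
  const job = await crawler.trigger(urls);
  const state = await crawler.status(job.snapshotId);
  const result = await crawler.download(job.snapshotId);
  return { state, result };
}

// In-memory stand-in so the sketch can run without network access.
const fakeCrawler: Crawler = {
  crawl: async (urls) => ({
    success: true, data: urls.map((url) => ({ url })),
    pageCount: urls.length, snapshotId: null,
  }),
  trigger: async () => ({ snapshotId: "snap_1" }),
  status: async () => "ready", // status string values are assumed
  download: async (snapshotId) => ({
    success: true, data: [{ url: "https://example.com" }],
    pageCount: 1, snapshotId,
  }),
};
```

With the real client, `fakeCrawler` would presumably be replaced by `client.crawler`; the stand-in exists only to keep the example self-contained.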

Design choices:
- CrawlResult is a new BaseResult subclass with pageCount + snapshotId,
  matching the per-service Result pattern used by ScrapeResult,
  SearchResult, and DiscoverResult.
- CrawlJob is a type alias for ScrapeJob — the snapshot-job wrapper is
  already generic over SnapshotOperations, no fork needed.
- crawl() and download() never throw on HTTP/network errors, matching
  the never-throws-on-orchestrated convention used by toResult().
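
The never-throws convention mentioned above could look something like the following. This is a sketch under stated assumptions: the `toCrawlResult` helper, the `success` / `data` / `error` fields, and the choice to report `pageCount: 0` on failure are hypothetical; only `pageCount` and `snapshotId` are named in the PR.

```typescript
// Hypothetical result shape; only pageCount and snapshotId come from the PR.
interface CrawlResultSketch {
  success: boolean;
  data: unknown[] | null;
  error: string | null;
  pageCount: number;
  snapshotId: string | null;
}

// Run an operation that may reject (HTTP or network failure) and always
// resolve with a result object instead of propagating the exception.
async function toCrawlResult(
  fetchPages: () => Promise<unknown[]>,
  snapshotId: string | null,
): Promise<CrawlResultSketch> {
  try {
    const pages = await fetchPages();
    return { success: true, data: pages, error: null, pageCount: pages.length, snapshotId };
  } catch (err) {
    // The failure is captured in the result; the caller checks `success`.
    const message = err instanceof Error ? err.message : String(err);
    return { success: false, data: null, error: message, pageCount: 0, snapshotId };
  }
}
```

The trade-off in this style is that callers must check `success` explicitly, since a failed crawl no longer surfaces as a rejected promise.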

Tests: 33 unit + 3 gated integration.
@shahar-brd shahar-brd merged commit 0c5d693 into brightdata:dev Jun 7, 2026
3 checks passed