Add Scrape Autopilot components #21238

Open
scrappilot wants to merge 4 commits into PipedreamHQ:master from scrappilot:add-scrape-autopilot

Conversation

scrappilot commented Jun 24, 2026

Summary

  • Add Scrape Autopilot as a new Pipedream app with shared API key auth.
  • Add actions for scraping one URL, scraping multiple URLs, and checking credit balance.
  • Use Scrape Autopilot's REST API via @pipedream/platform axios; no official Node.js SDK is currently available.
  • Pagination is not applicable to these request/response endpoints.
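
For readers unfamiliar with the pattern, here is a minimal sketch of how a shared request helper might assemble the config passed to the platform `axios`. The base URL, the `/scrape` path, the `Bearer` header scheme, and the `buildRequestConfig` name are all assumptions for illustration; none of them is stated in this PR, and the real app module may differ.

```javascript
// Hypothetical base URL: the PR text does not state the real one.
const BASE_URL = "https://api.example.com/v1";

// Hypothetical helper: builds the config object that would be handed to axios($, config).
function buildRequestConfig(auth, { path, method = "GET", data } = {}) {
  const config = {
    method,
    url: `${BASE_URL}${path}`,
    headers: {
      // Assumed header scheme; the actual API may expect a different header.
      Authorization: `Bearer ${auth.api_key}`,
    },
  };
  // Only attach a body when the caller supplies one.
  if (data !== undefined) config.data = data;
  return config;
}

// Inside the app module this would be used roughly as:
//   import { axios } from "@pipedream/platform";
//   return axios($, buildRequestConfig(this.$auth, { path: "/scrape", method: "POST", data }));
const example = buildRequestConfig(
  { api_key: "sk_test" },
  { path: "/scrape", method: "POST", data: { url: "https://example.com", format: "markdown" } },
);
```

Keeping the config construction in one pure function makes the auth header easy to change in a single place once the real scheme is confirmed.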

Tests

  • node scripts/findBadKeys.js components/scrape_autopilot/actions/scrape-url/scrape-url.mjs,components/scrape_autopilot/actions/scrape-urls/scrape-urls.mjs,components/scrape_autopilot/actions/get-balance/get-balance.mjs
  • node scripts/findDuplicateKeys.js
  • node scripts/generate-package-report.js --package=scrape_autopilot --verbose
  • pnpm exec eslint components/scrape_autopilot

Checklist

Please check the following items before your PR can be reviewed:

Versioning

  • All components updated in this PR had their version updated (0.0.1 for new ones)
  • The app updated in this PR had its package.json's version updated

New app

App integration request submitted: #21239

  • The app updated in this PR is already integrated

CodeRabbit review

  • I have addressed or acknowledged all of CodeRabbit's review comments

Summary by CodeRabbit

  • New Features

    • Added Scrape Autopilot integration for authenticated scraping of public URLs.
    • Supports scraping a single URL or multiple URLs, with output options for Markdown, HTML, or plain text.
    • Added a balance check to view remaining credits.
  • Documentation

    • Added a setup guide and example use cases for common scraping and research workflows.

vercel Bot commented Jun 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| pipedream-docs-redirect-do-not-edit | Ignored | Ignored | Jun 30, 2026 1:57pm |

@pipedream-component-development

Collaborator

Thank you so much for submitting this! We've added it to our backlog to review, and our team has been notified.

@pipedream-component-development

Collaborator

Thanks for submitting this PR! When we review PRs, we follow the Pipedream component guidelines. If you're not familiar, here's a quick checklist:

coderabbitai Bot commented Jun 24, 2026

Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: c53a8b17-8e0f-4898-ac2c-df05d9cb0699

📥 Commits

Reviewing files that changed from the base of the PR and between 6f0a633 and 3bc17c0.

📒 Files selected for processing (3)
  • components/scrape_autopilot/actions/get-balance/get-balance.mjs
  • components/scrape_autopilot/actions/scrape-url/scrape-url.mjs
  • components/scrape_autopilot/actions/scrape-urls/scrape-urls.mjs

📝 Walkthrough

Adds a new scrape_autopilot Pipedream component integrating the ScrapePilot API, with an app module, three actions, a package manifest, and README documentation.

Changes

Scrape Autopilot Integration

| Layer / File(s) | Summary |
| --- | --- |
| App module, package config, and README: `components/scrape_autopilot/scrape_autopilot.app.mjs`, `components/scrape_autopilot/package.json`, `components/scrape_autopilot/README.md` | Defines the Pipedream app object with format prop definitions, auth headers from api_key, shared request helpers, and ScrapePilot API methods for single scrape, batch scrape, and balance retrieval. Adds the package manifest and README content. |
| Get Balance, Scrape URL, and Scrape URLs actions: `components/scrape_autopilot/actions/get-balance/get-balance.mjs`, `components/scrape_autopilot/actions/scrape-url/scrape-url.mjs`, `components/scrape_autopilot/actions/scrape-urls/scrape-urls.mjs` | Adds three actions: get-balance returns the credit balance, scrape-url posts a single URL with format and js, and scrape-urls normalizes and validates a batch of URLs before posting up to 10 entries with the same options. |
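
To make the batch behavior concrete, the following is a rough sketch of what "normalizes and validates a batch of URLs" could look like before any API call. The helper name `normalizeUrls`, the error messages, and the exact normalization rules are assumptions, not the PR's code. A local stand-in replaces `ConfigurationError`, which in a real component would be imported from `@pipedream/platform`, so that the snippet runs on its own.

```javascript
// Stand-in for ConfigurationError from @pipedream/platform, so the sketch is self-contained.
class ConfigurationError extends Error {
  constructor(message) {
    super(message);
    this.name = "ConfigurationError";
  }
}

const MAX_URLS = 10; // batch limit mentioned in the walkthrough

// Hypothetical helper: trims entries, drops empty ones, then validates what remains.
function normalizeUrls(urls) {
  const list = (Array.isArray(urls) ? urls : [urls])
    .filter((u) => typeof u === "string")
    .map((u) => u.trim())
    .filter((u) => u.length > 0);

  if (list.length === 0) {
    throw new ConfigurationError("Provide at least one URL.");
  }
  if (list.length > MAX_URLS) {
    throw new ConfigurationError(`Provide at most ${MAX_URLS} URLs.`);
  }
  for (const u of list) {
    let parsed;
    try {
      parsed = new URL(u); // throws a TypeError on malformed input
    } catch {
      throw new ConfigurationError(`Invalid URL: ${u}`);
    }
    if (!["http:", "https:"].includes(parsed.protocol)) {
      throw new ConfigurationError(`Unsupported protocol in URL: ${u}`);
    }
  }
  return list;
}
```

Running every check before the request means a bad input never consumes credits, and raising a configuration-style error points the user at their props instead of at the API.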

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

  • [APP] Scrape Autopilot #21239 — The PR implements the Scrape Autopilot app and the requested actions for single-URL scraping, batch scraping, and balance lookup.

Suggested labels

prioritized, HIGH PRIORITY

Suggested reviewers

  • michelle0927
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title is concise and accurately reflects the new Scrape Autopilot app components added in this PR. |
| Description check | ✅ Passed | The description follows the template with a Summary, Tests, and checklist items, and it covers the new app and versioning status. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@components/scrape_autopilot/actions/get-balance/get-balance.mjs`:
- Line 5: The component description for getBalance (and the other two new action
definitions) is missing the required documentation link suffix. Update the
description text in each action’s metadata so it ends with a “See the
documentation” markdown link pointing to the appropriate ScrapePilot API
reference page, keeping the existing summary text and appending the link at the
end.

In `@components/scrape_autopilot/actions/scrape-url/scrape-url.mjs`:
- Line 20: The scrape-url action’s description is missing the required
documentation link. Update the description in scrape-url.mjs so it still
describes the action, but ends with the exact “See the documentation” link
format requested; use the description field in the scrape-url action object and
append the docs URL to it.

In `@components/scrape_autopilot/actions/scrape-urls/scrape-urls.mjs`:
- Line 21: The action description in scrapeUrlsAction is missing the required
documentation link at the end. Update the description field so it still
describes the component’s purpose and ends with the exact "[See the
documentation](https://...)" link format required by the review; make this
change in the scrape-urls action definition where the description string is
defined.
- Around line 56-62: The pre-call validation in the scrape-urls action currently
throws generic Error instances for the empty urls list and the MAX_URLS limit
check. Update the validation logic in the scrape-urls flow to use
ConfigurationError from `@pipedream/platform` for both cases so these user-input
issues are surfaced as configuration mistakes; keep the existing messages, and
make sure the checks around urls.length and MAX_URLS still run before any API
call.

In `@components/scrape_autopilot/scrape_autopilot.app.mjs`:
- Around line 6-13: Move the shared `format` and `js` props out of the
individual scrape components and into `scrapeAutopilot`’s `propDefinitions`,
since they are duplicated in both `scrape-url.mjs` and `scrape-urls.mjs`. Add
them as shared definitions on the app object, then update the components to
reference them via `propDefinition` while keeping any component-specific
overrides in place. Also consolidate the duplicated `FORMATS` constant so it is
defined once alongside the shared prop definitions and reused by both
components.
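
As an illustration of the last finding, shared props might be consolidated along these lines. This is only a sketch: the option values in `FORMATS`, the labels and descriptions, and the meaning given to `js` are assumptions. In the real files each object would be the `export default` of its own module, with the action importing the app.

```javascript
// Assumed option values; the PR only says Markdown, HTML, or plain text are supported.
const FORMATS = ["markdown", "html", "text"];

// Sketch of the app object (scrape_autopilot.app.mjs) holding the shared definitions.
const scrapeAutopilot = {
  type: "app",
  app: "scrape_autopilot",
  propDefinitions: {
    format: {
      type: "string",
      label: "Format",
      description: "Output format for the scraped content",
      options: FORMATS,
      optional: true,
    },
    js: {
      type: "boolean",
      label: "Render JavaScript",
      description: "Whether to render JavaScript before scraping (assumed meaning)",
      optional: true,
    },
  },
};

// Sketch of an action (scrape-url.mjs) referencing the shared definitions.
const scrapeUrl = {
  key: "scrape_autopilot-scrape-url",
  name: "Scrape URL",
  version: "0.0.1",
  type: "action",
  props: {
    scrapeAutopilot,
    format: { propDefinition: [scrapeAutopilot, "format"] },
    js: { propDefinition: [scrapeAutopilot, "js"] },
  },
};
```

With this layout `FORMATS` lives in one place, and a component can still add its own overrides next to the `propDefinition` reference.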
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: cadc2987-92bb-4bc9-b6f8-65a6e8f1e79d

📥 Commits

Reviewing files that changed from the base of the PR and between 4b54e04 and 984d09f.

📒 Files selected for processing (6)
  • components/scrape_autopilot/README.md
  • components/scrape_autopilot/actions/get-balance/get-balance.mjs
  • components/scrape_autopilot/actions/scrape-url/scrape-url.mjs
  • components/scrape_autopilot/actions/scrape-urls/scrape-urls.mjs
  • components/scrape_autopilot/package.json
  • components/scrape_autopilot/scrape_autopilot.app.mjs

Comment thread components/scrape_autopilot/actions/get-balance/get-balance.mjs Outdated
Comment thread components/scrape_autopilot/actions/scrape-url/scrape-url.mjs Outdated
Comment thread components/scrape_autopilot/actions/scrape-urls/scrape-urls.mjs Outdated
Comment thread components/scrape_autopilot/actions/scrape-urls/scrape-urls.mjs
Comment thread components/scrape_autopilot/scrape_autopilot.app.mjs

@coderabbitai coderabbitai Bot left a comment

Contributor

♻️ Duplicate comments (1)
components/scrape_autopilot/actions/scrape-urls/scrape-urls.mjs (1)

8-8: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Component description must end with a documentation link.

Append [See the documentation](https://...) pointing to the ScrapePilot scrape endpoint reference.

As per path instructions: "Component description must end with a documentation link [See the documentation](https://...)".
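
A small sketch of the expected shape, with a check that a description ends in the link. The `endsWithDocsLink` helper and the sample description text are hypothetical, and the URL is a placeholder because the actual ScrapePilot reference URL is not given in this thread.

```javascript
// Matches text whose final characters are "[See the documentation](<http or https url>)".
const DOCS_LINK_SUFFIX = /\[See the documentation\]\(https?:\/\/[^\s)]+\)$/;

// Hypothetical helper: returns true when the description ends with the required link.
function endsWithDocsLink(description) {
  return DOCS_LINK_SUFFIX.test(description.trim());
}

// Placeholder URL, used only to show the format.
const description =
  "Scrape multiple public URLs in one request. [See the documentation](https://example.com/docs)";

const ok = endsWithDocsLink(description);
const missing = endsWithDocsLink("Scrape multiple public URLs in one request.");
```

Here `ok` is `true` and `missing` is `false`; a check like this could run in a lint step, although the thread does not say how the rule is enforced.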

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/scrape_autopilot/actions/scrape-urls/scrape-urls.mjs` at line 8,
The component description in the scrape-urls action must end with the required
documentation link. Update the description field in scrape-urls.mjs so it
appends “[See the documentation](https://...)” pointing to the ScrapePilot
scrape endpoint reference, and ensure the final text ends with that link
exactly.

Source: Path instructions

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 94b597b2-efeb-42ab-9120-30918143f69b

📥 Commits

Reviewing files that changed from the base of the PR and between 984d09f and 6f0a633.

📒 Files selected for processing (4)
  • components/scrape_autopilot/actions/get-balance/get-balance.mjs
  • components/scrape_autopilot/actions/scrape-url/scrape-url.mjs
  • components/scrape_autopilot/actions/scrape-urls/scrape-urls.mjs
  • components/scrape_autopilot/scrape_autopilot.app.mjs

@ashwins01 ashwins01 left a comment

Contributor

Hi @scrappilot, thank you for your contribution! I see you've already raised a request for a base integration to be set up in Pipedream for Scrape Autopilot. We can review and test these changes once that base integration is in place.

cc: @s0s0physm, @sergio-eliot-rodriguez

@ashwins01 ashwins01 moved this from Ready for PR Review to Blocked in Component (Source and Action) Backlog Jun 30, 2026
@scrappilot

Author

Thanks! Sounds good. I'll wait until the base integration (#21239) is in place.
Please let me know if you need anything else from me.

Labels

User submitted

5 participants