Skip to content

Enhance vendor assessment agent with tools and guardrails#842

Draft
gearnode wants to merge 28 commits intomainfrom
gearnode/vendor-assessment-agent
Draft

Enhance vendor assessment agent with tools and guardrails#842
gearnode wants to merge 28 commits intomainfrom
gearnode/vendor-assessment-agent

Conversation

@gearnode
Copy link
Copy Markdown
Contributor

@gearnode gearnode commented Mar 14, 2026

Summary

Comprehensive improvement to the vendor assessment agent framework, adding 14 commits covering security tools, safety mechanisms, data enrichment, and test coverage.

Security Tools: Added SPF email authentication checker and HTTP-to-HTTPS redirect detection. Extended headers check with Cross-Origin-*-Policy headers. Strengthened all 5 security tool tests.

Safety & Performance: Implemented 30-second timeouts on browser tools, fixed context cancellation propagation to Chrome tabs, and increased document text extraction from 8K to 32K characters.

Agent Configuration: Added bug_bounty_url and data_locations to VendorInfo for richer vendor profiling. Expanded all 6 system prompts with deeper criteria, risk scoring framework, and category reference examples.

Quality: 335 lines of new unit tests covering pure functions in security tools (parseDMARCTag, parseCSPDirectives, parseSPFPolicy, protocolName, splitTrimmed).


Summary by cubic

Replaces the single‑agent vendor assessment with a multi‑agent vetting system that uses browser, security, and search tools to produce a sourced markdown report with risk ratings and safely update vendor records. Also returns the report and subprocessors via GraphQL and adds a vendor assess CLI.

  • New Features

    • Expanded multi‑agent assessor orchestrated by vetting (crawler, security, document analyzer, compliance, market presence, data processing, incident response, financial stability, professional standing, regulatory, vendor comparison, web research) to generate a full markdown assessment with an overall risk rating and per‑category summary.
    • Browser tools: navigate, extract page text (32K cap), extract/find links, click/select elements, download and parse PDFs, fetch robots.txt and sitemaps; per‑tool 30s timeouts and public‑URL validation.
    • Search tools: web search, government databases, Wayback snapshots, and text diffing for document comparisons.
    • Security tools: SSL/TLS, security headers (incl. COOP/COEP/CORP), DMARC, SPF, DNSSEC, CSP analysis, CORS check, breach lookup, WHOIS, and DNS records; security checks run in parallel.
    • Progress: reusable agent progress types with nested steps and randomized messages; step_failed events included.
    • Service/API: integrate vetting.Assessor to persist non‑empty vendor updates and emit a vendor‑updated webhook; GraphQL mutation now returns report and parsed subprocessors; new vendor assess CLI command.
    • Vendor info now includes bug_bounty_url and data_locations.
    • LLM support: optional WithThinking budget for reasoning models, wired into anthropic and OpenAI providers.
  • Bug Fixes

    • Propagate context cancellation to Chrome tabs and enforce per‑tool 30s timeouts to prevent hangs.
    • Detect HTTP→HTTPS redirect in the security headers check.
    • Reliability and safety: support wss:// Chrome address; validate URL schemes; bind HTTP requests to context; SPF parsing fixes (case sensitivity, de‑dupe, mixed‑case all); DNSSEC TCP fallback on truncation; URL‑encode HIBP params; reject empty link patterns; remove hard cap on extracted links; disable CORS redirects and guard reflected origin; guard empty vendor category; verify certificate chain/hostname after InsecureSkipVerify to correctly flag wrong‑host/untrusted certs while still reporting expired certs.

Written for commit accabd5. Summary will update on new commits.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

23 issues found across 37 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="pkg/agent/tools/security/csp.go">

<violation number="1" location="pkg/agent/tools/security/csp.go:81">
P2: Bind the CSP fetch to `ctx` so canceled assessments do not keep the HTTP request running until the client timeout.</violation>
</file>

<file name="pkg/agent/tools/browser/extract_text.go">

<violation number="1" location="pkg/agent/tools/browser/extract_text.go:57">
P2: This truncates by bytes rather than characters, so non-ASCII pages can be cut early or return malformed UTF-8.</violation>
</file>

<file name="pkg/agent/tools/security/hibp.go">

<violation number="1" location="pkg/agent/tools/security/hibp.go:61">
P2: Encode the `domain` query parameter instead of concatenating raw input into the HIBP URL.</violation>
</file>

<file name="pkg/agent/tools/browser/browser.go">

<violation number="1" location="pkg/agent/tools/browser/browser.go:35">
P1: Handle `wss://` endpoints when normalizing the browser address; the current check corrupts secure WebSocket URLs.</violation>
</file>

<file name="pkg/agent/tools/security/dnssec.go">

<violation number="1" location="pkg/agent/tools/security/dnssec.go:67">
P1: Retry truncated DNSKEY responses over TCP before treating the answer as complete.</violation>
</file>

<file name="pkg/agent/tools/browser/find_links.go">

<violation number="1" location="pkg/agent/tools/browser/find_links.go:51">
P2: Reject empty patterns before filtering; `includes("")` makes this match every link on the page.</violation>
</file>

<file name="pkg/agent/tools/security/headers.go">

<violation number="1" location="pkg/agent/tools/security/headers.go:92">
P2: Attach the tool context to both HTTP requests so cancellation stops in-flight header checks.</violation>
</file>

<file name="pkg/agent/tools/security/spf.go">

<violation number="1" location="pkg/agent/tools/security/spf.go:88">
P2: Detect duplicate SPF TXT records instead of returning the first one. Multiple `v=spf1` records are an SPF error, so this can incorrectly report an invalid domain as configured.</violation>

<violation number="2" location="pkg/agent/tools/security/spf.go:106">
P2: Match the SPF version tag case-insensitively. As written, valid records like `V=SPF1 ...` are treated as missing.</violation>
</file>

<file name="pkg/agents/vetting/extraction_prompt.txt">

<violation number="1" location="pkg/agents/vetting/extraction_prompt.txt:29">
P2: The PRODUCT_AND_DESIGN examples don't match this repository's vendor taxonomy, so the extractor can learn the wrong category for common vendors like Notion or Jira.</violation>
</file>

<file name="pkg/agents/vetting/orchestrator_prompt.txt">

<violation number="1" location="pkg/agents/vetting/orchestrator_prompt.txt:15">
P2: Run `assess_compliance` on each discovered trust/compliance page, not just a single URL.</violation>

<violation number="2" location="pkg/agents/vetting/orchestrator_prompt.txt:62">
P2: Add CSP and CORS rows to the Risk Summary so the summary stays consistent with the required security checks.</violation>
</file>

<file name="pkg/agent/tools/security/security.go">

<violation number="1" location="pkg/agent/tools/security/security.go:19">
P2: Don't hard-code `8.8.8.8` for all DNS checks; make the resolver configurable or use the environment's resolver so assessments don't fail or misreport in restricted/split-horizon networks.</violation>
</file>

<file name="pkg/agents/vetting/assessment.go">

<violation number="1" location="pkg/agents/vetting/assessment.go:76">
P2: Returning here leaves `extract_vendor_info` stuck in `step_started` state because no `step_failed` event is emitted.</violation>
</file>

<file name="pkg/agent/tools/security/ssl.go">

<violation number="1" location="pkg/agent/tools/security/ssl.go:65">
P2: Propagate the tool context into the TLS dial so cancellations stop the handshake immediately.</violation>

<violation number="2" location="pkg/agent/tools/security/ssl.go:70">
P1: Handshake verification here prevents the tool from reporting details for expired or otherwise invalid certificates.</violation>
</file>

<file name="pkg/probo/vendor_service.go">

<violation number="1" location="pkg/probo/vendor_service.go:760">
P1: Do not overwrite the vendor category when the assessor returns no category.</violation>

<violation number="2" location="pkg/probo/vendor_service.go:780">
P2: Emit the vendor-updated webhook after persisting assessment changes.</violation>
</file>

<file name="pkg/agent/tools/browser/extract_links.go">

<violation number="1" location="pkg/agent/tools/browser/extract_links.go:50">
P1: Validate the URL scheme before calling `chromedp.Navigate`; this currently allows non-HTTP(S) targets such as `file://`.</violation>

<violation number="2" location="pkg/agent/tools/browser/extract_links.go:52">
P2: Cap the extracted link list; returning every `<a>` element makes this tool's output unbounded on large pages.</violation>
</file>

<file name="pkg/agents/vetting/security_prompt.txt">

<violation number="1" location="pkg/agents/vetting/security_prompt.txt:8">
P1: Don't require `check_breaches` here: the current HIBP tool calls a v3 endpoint without the required API key, so this check will fail on every assessment.</violation>
</file>

<file name="pkg/agent/tools/security/cors.go">

<violation number="1" location="pkg/agent/tools/security/cors.go:68">
P2: Disable automatic redirects so the tool inspects the actual preflight response.</violation>

<violation number="2" location="pkg/agent/tools/security/cors.go:105">
P2: Guard `ReflectsOrigin` against an empty Origin to avoid false positives.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 17 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="pkg/agent/tools/security/ssl.go">

<violation number="1" location="pkg/agent/tools/security/ssl.go:68">
P1: Re-enabling `InsecureSkipVerify` makes the SSL checker report untrusted or wrong-host certificates as valid.</violation>
</file>

<file name="pkg/agent/tools/browser/extract_links.go">

<violation number="1" location="pkg/agent/tools/browser/extract_links.go:62">
P2: Limiting the query to 500 anchors breaks the tool's "extract all links" contract by silently dropping links on larger pages.</violation>
</file>

<file name="pkg/agent/tools/security/spf.go">

<violation number="1" location="pkg/agent/tools/security/spf.go:107">
P2: Normalize the record before parsing the `all` qualifier; mixed-case SPF records now return an empty `policy`.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="pkg/agent/tools/browser/extract_links.go">

<violation number="1" location="pkg/agent/tools/browser/extract_links.go:62">
P1: Removing the link cap makes this tool return unbounded page data. Large pages can now produce oversized Chrome payloads and agent results, which hurts latency and can exceed downstream limits.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@gearnode gearnode marked this pull request as draft March 15, 2026 23:31
@gearnode gearnode force-pushed the gearnode/vendor-assessment-agent branch from 1efe656 to f0e8b3a Compare March 16, 2026 15:41
gearnode added 24 commits March 27, 2026 21:53
Browser tools provide page navigation, text extraction, and link
discovery via Chrome DevTools Protocol. Security tools check SSL
certificates, headers, DMARC, DNSSEC, CSP, CORS, and known
breaches.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
ProgressEvent is a JSON-serializable struct with type, step,
parent_step, and message fields designed for pushing over
WebSocket or SSE. ProgressReporter is a callback function that
any agent-based feature can accept.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
The old assessment used a single LLM call with a large prompt.
This introduces a dedicated orchestrator that coordinates four
specialized sub-agents: website crawler, security assessor,
document analyzer, and compliance assessor. Each sub-agent uses
purpose-built tools (browser, security checks) for deeper
analysis.

Progress reporting hooks emit structured events with parent-step
nesting so the UI can show real-time sub-details for each phase.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Create the Assessor in probod and pass it through to
NewService and TenantService. The Assess endpoint now loads
the vendor inside a transaction, applies only non-empty fields
from the assessment result, and persists the update.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Each tool now has multiple message variants that rotate on
successive calls, so repeated steps like page navigation
don't show the same text every time. Start events carry
the varied message while end events only carry the step
name, letting the UI pair them by key.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Shorter, idiomatic Go name with no underscores.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
The llm package was accidentally removed from service.go while
it is still referenced by the Service struct and NewService
signature. Also sort the agents/vetting import alphabetically
in both service.go and probod.go.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
The security agent has 7 tools but the prompt only listed 5,
making analyze_csp and check_cors invisible to the LLM.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Privacy policies and DPAs commonly exceed 8000 characters,
causing important provisions to be silently truncated.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
NewTab ignored the incoming context, so tool-level timeouts
and context deadlines never reached the browser tab.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
DMARC alone is incomplete without SPF. This adds a check_spf
tool that queries the TXT record for v=spf1 and extracts the
policy qualifier, completing the SPF/DKIM/DMARC triad.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
These fields capture commonly assessed vendor attributes that
were missing from the structured output: bug bounty program
URL and data processing jurisdictions.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Add bug bounty, SLA, GDPR/CCPA paths and keywords. Instruct
the crawler to check footers and avoid revisiting URLs.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Replace rigid sequential workflow with guidance for parallel
execution. Add overall risk rating, per-category pass/warn/fail
ratings, and a risk summary table for easier comparison across
vendor assessments.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Add strategy for handling truncated pages, discovering linked
sub-documents, and noting missing sections explicitly. Add
indemnification and governing law to the extraction checklist.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Add encryption, BC/DR, StateRAMP, and certification status
tracking. Instruct the agent to follow linked sub-pages.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
These headers (COOP, COEP, CORP) are important for site
isolation and commonly assessed in vendor security reviews.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Including examples for each category helps the LLM pick the
right one, especially for ambiguous vendors.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Test parseDMARCTag, parseCSPDirectives, parseSPFPolicy,
protocolName, and splitTrimmed.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
A slow or unresponsive page could hang browser tools
indefinitely. Each tool now applies a 30-second context
timeout before opening a Chrome tab.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
The tool now first probes the HTTP URL without following
redirects to detect whether the site redirects to HTTPS,
then checks security headers on the HTTPS response.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Address 23 review comments: handle wss:// in browser addr,
retry truncated DNSKEY over TCP, use InsecureSkipVerify for
SSL reporting, guard empty vendor category, validate URL
schemes, bind HTTP requests to context, fix SPF case
sensitivity and duplicate detection, encode HIBP domain
param, reject empty link patterns, cap extracted links,
disable CORS redirects, guard ReflectsOrigin, emit
step_failed event, emit vendor-updated webhook, add CSP/CORS
to risk summary, make DNS resolver configurable, and fix
prompt examples.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Manually verify the certificate chain after connecting with
InsecureSkipVerify so that untrusted or wrong-host certs are
correctly reported as invalid while still returning details
for expired certificates.

Lowercase SPF records before parsing the all qualifier so
mixed-case records like ~All are detected.

Remove the JS-side .slice(0, 500) cap on extract_links to
honor the tool's "extract all links" contract.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Introduce a new assess_market_presence tool that identifies notable
customers, case studies, and company size signals to evaluate vendor
credibility during assessments.

Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Signed-off-by: Bryan Frimin <bryan@getprobo.com>
@gearnode gearnode force-pushed the gearnode/vendor-assessment-agent branch from f0e8b3a to accabd5 Compare April 2, 2026 08:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants