Enhance vendor assessment agent with tools and guardrails#842
Draft
Enhance vendor assessment agent with tools and guardrails#842
Conversation
There was a problem hiding this comment.
23 issues found across 37 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="pkg/agent/tools/security/csp.go">
<violation number="1" location="pkg/agent/tools/security/csp.go:81">
P2: Bind the CSP fetch to `ctx` so canceled assessments do not keep the HTTP request running until the client timeout.</violation>
</file>
<file name="pkg/agent/tools/browser/extract_text.go">
<violation number="1" location="pkg/agent/tools/browser/extract_text.go:57">
P2: This truncates by bytes rather than characters, so non-ASCII pages can be cut early or return malformed UTF-8.</violation>
</file>
<file name="pkg/agent/tools/security/hibp.go">
<violation number="1" location="pkg/agent/tools/security/hibp.go:61">
P2: Encode the `domain` query parameter instead of concatenating raw input into the HIBP URL.</violation>
</file>
<file name="pkg/agent/tools/browser/browser.go">
<violation number="1" location="pkg/agent/tools/browser/browser.go:35">
P1: Handle `wss://` endpoints when normalizing the browser address; the current check corrupts secure WebSocket URLs.</violation>
</file>
<file name="pkg/agent/tools/security/dnssec.go">
<violation number="1" location="pkg/agent/tools/security/dnssec.go:67">
P1: Retry truncated DNSKEY responses over TCP before treating the answer as complete.</violation>
</file>
<file name="pkg/agent/tools/browser/find_links.go">
<violation number="1" location="pkg/agent/tools/browser/find_links.go:51">
P2: Reject empty patterns before filtering; `includes("")` makes this match every link on the page.</violation>
</file>
<file name="pkg/agent/tools/security/headers.go">
<violation number="1" location="pkg/agent/tools/security/headers.go:92">
P2: Attach the tool context to both HTTP requests so cancellation stops in-flight header checks.</violation>
</file>
<file name="pkg/agent/tools/security/spf.go">
<violation number="1" location="pkg/agent/tools/security/spf.go:88">
P2: Detect duplicate SPF TXT records instead of returning the first one. Multiple `v=spf1` records are an SPF error, so this can incorrectly report an invalid domain as configured.</violation>
<violation number="2" location="pkg/agent/tools/security/spf.go:106">
P2: Match the SPF version tag case-insensitively. As written, valid records like `V=SPF1 ...` are treated as missing.</violation>
</file>
<file name="pkg/agents/vetting/extraction_prompt.txt">
<violation number="1" location="pkg/agents/vetting/extraction_prompt.txt:29">
P2: The PRODUCT_AND_DESIGN examples don't match this repository's vendor taxonomy, so the extractor can learn the wrong category for common vendors like Notion or Jira.</violation>
</file>
<file name="pkg/agents/vetting/orchestrator_prompt.txt">
<violation number="1" location="pkg/agents/vetting/orchestrator_prompt.txt:15">
P2: Run `assess_compliance` on each discovered trust/compliance page, not just a single URL.</violation>
<violation number="2" location="pkg/agents/vetting/orchestrator_prompt.txt:62">
P2: Add CSP and CORS rows to the Risk Summary so the summary stays consistent with the required security checks.</violation>
</file>
<file name="pkg/agent/tools/security/security.go">
<violation number="1" location="pkg/agent/tools/security/security.go:19">
P2: Don't hard-code `8.8.8.8` for all DNS checks; make the resolver configurable or use the environment's resolver so assessments don't fail or misreport in restricted/split-horizon networks.</violation>
</file>
<file name="pkg/agents/vetting/assessment.go">
<violation number="1" location="pkg/agents/vetting/assessment.go:76">
P2: Returning here leaves `extract_vendor_info` stuck in `step_started` state because no `step_failed` event is emitted.</violation>
</file>
<file name="pkg/agent/tools/security/ssl.go">
<violation number="1" location="pkg/agent/tools/security/ssl.go:65">
P2: Propagate the tool context into the TLS dial so cancellations stop the handshake immediately.</violation>
<violation number="2" location="pkg/agent/tools/security/ssl.go:70">
P1: Handshake verification here prevents the tool from reporting details for expired or otherwise invalid certificates.</violation>
</file>
<file name="pkg/probo/vendor_service.go">
<violation number="1" location="pkg/probo/vendor_service.go:760">
P1: Do not overwrite the vendor category when the assessor returns no category.</violation>
<violation number="2" location="pkg/probo/vendor_service.go:780">
P2: Emit the vendor-updated webhook after persisting assessment changes.</violation>
</file>
<file name="pkg/agent/tools/browser/extract_links.go">
<violation number="1" location="pkg/agent/tools/browser/extract_links.go:50">
P1: Validate the URL scheme before calling `chromedp.Navigate`; this currently allows non-HTTP(S) targets such as `file://`.</violation>
<violation number="2" location="pkg/agent/tools/browser/extract_links.go:52">
P2: Cap the extracted link list; returning every `<a>` element makes this tool's output unbounded on large pages.</violation>
</file>
<file name="pkg/agents/vetting/security_prompt.txt">
<violation number="1" location="pkg/agents/vetting/security_prompt.txt:8">
P1: Don't require `check_breaches` here: the current HIBP tool calls a v3 endpoint without the required API key, so this check will fail on every assessment.</violation>
</file>
<file name="pkg/agent/tools/security/cors.go">
<violation number="1" location="pkg/agent/tools/security/cors.go:68">
P2: Disable automatic redirects so the tool inspects the actual preflight response.</violation>
<violation number="2" location="pkg/agent/tools/security/cors.go:105">
P2: Guard `ReflectsOrigin` against an empty Origin to avoid false positives.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
There was a problem hiding this comment.
3 issues found across 17 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="pkg/agent/tools/security/ssl.go">
<violation number="1" location="pkg/agent/tools/security/ssl.go:68">
P1: Re-enabling `InsecureSkipVerify` makes the SSL checker report untrusted or wrong-host certificates as valid.</violation>
</file>
<file name="pkg/agent/tools/browser/extract_links.go">
<violation number="1" location="pkg/agent/tools/browser/extract_links.go:62">
P2: Limiting the query to 500 anchors breaks the tool's "extract all links" contract by silently dropping links on larger pages.</violation>
</file>
<file name="pkg/agent/tools/security/spf.go">
<violation number="1" location="pkg/agent/tools/security/spf.go:107">
P2: Normalize the record before parsing the `all` qualifier; mixed-case SPF records now return an empty `policy`.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
There was a problem hiding this comment.
1 issue found across 3 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="pkg/agent/tools/browser/extract_links.go">
<violation number="1" location="pkg/agent/tools/browser/extract_links.go:62">
P1: Removing the link cap makes this tool return unbounded page data. Large pages can now produce oversized Chrome payloads and agent results, which hurts latency and can exceed downstream limits.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
1efe656 to
f0e8b3a
Compare
Browser tools provide page navigation, text extraction, and link discovery via Chrome DevTools Protocol. Security tools check SSL certificates, headers, DMARC, DNSSEC, CSP, CORS, and known breaches. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
ProgressEvent is a JSON-serializable struct with type, step, parent_step, and message fields designed for pushing over WebSocket or SSE. ProgressReporter is a callback function that any agent-based feature can accept. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
The old assessment used a single LLM call with a large prompt. This introduces a dedicated orchestrator that coordinates four specialized sub-agents: website crawler, security assessor, document analyzer, and compliance assessor. Each sub-agent uses purpose-built tools (browser, security checks) for deeper analysis. Progress reporting hooks emit structured events with parent-step nesting so the UI can show real-time sub-details for each phase. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Create the Assessor in probod and pass it through to NewService and TenantService. The Assess endpoint now loads the vendor inside a transaction, applies only non-empty fields from the assessment result, and persists the update. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Each tool now has multiple message variants that rotate on successive calls, so repeated steps like page navigation don't show the same text every time. Start events carry the varied message while end events only carry the step name, letting the UI pair them by key. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Shorter, idiomatic Go name with no underscores. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
The llm package was accidentally removed from service.go while it is still referenced by the Service struct and NewService signature. Also sort the agents/vetting import alphabetically in both service.go and probod.go. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
The security agent has 7 tools but the prompt only listed 5, making analyze_csp and check_cors invisible to the LLM. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Privacy policies and DPAs commonly exceed 8000 characters, causing important provisions to be silently truncated. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
NewTab ignored the incoming context, so tool-level timeouts and context deadlines never reached the browser tab. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
DMARC alone is incomplete without SPF. This adds a check_spf tool that queries the TXT record for v=spf1 and extracts the policy qualifier, completing the SPF/DKIM/DMARC triad. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
These fields capture commonly assessed vendor attributes that were missing from the structured output: bug bounty program URL and data processing jurisdictions. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Add bug bounty, SLA, GDPR/CCPA paths and keywords. Instruct the crawler to check footers and avoid revisiting URLs. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Replace rigid sequential workflow with guidance for parallel execution. Add overall risk rating, per-category pass/warn/fail ratings, and a risk summary table for easier comparison across vendor assessments. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Add strategy for handling truncated pages, discovering linked sub-documents, and noting missing sections explicitly. Add indemnification and governing law to the extraction checklist. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Add encryption, BC/DR, StateRAMP, and certification status tracking. Instruct the agent to follow linked sub-pages. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
These headers (COOP, COEP, CORP) are important for site isolation and commonly assessed in vendor security reviews. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Including examples for each category helps the LLM pick the right one, especially for ambiguous vendors. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Test parseDMARCTag, parseCSPDirectives, parseSPFPolicy, protocolName, and splitTrimmed. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
A slow or unresponsive page could hang browser tools indefinitely. Each tool now applies a 30-second context timeout before opening a Chrome tab. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
The tool now first probes the HTTP URL without following redirects to detect whether the site redirects to HTTPS, then checks security headers on the HTTPS response. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Address 23 review comments: handle wss:// in browser addr, retry truncated DNSKEY over TCP, use InsecureSkipVerify for SSL reporting, guard empty vendor category, validate URL schemes, bind HTTP requests to context, fix SPF case sensitivity and duplicate detection, encode HIBP domain param, reject empty link patterns, cap extracted links, disable CORS redirects, guard ReflectsOrigin, emit step_failed event, emit vendor-updated webhook, add CSP/CORS to risk summary, make DNS resolver configurable, and fix prompt examples. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Manually verify the certificate chain after connecting with InsecureSkipVerify so that untrusted or wrong-host certs are correctly reported as invalid while still returning details for expired certificates. Lowercase SPF records before parsing the all qualifier so mixed-case records like ~All are detected. Remove the JS-side .slice(0, 500) cap on extract_links to honor the tool's "extract all links" contract. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
Introduce a new assess_market_presence tool that identifies notable customers, case studies, and company size signals to evaluate vendor credibility during assessments. Signed-off-by: Bryan Frimin <bryan@getprobo.com>
f0e8b3a to
accabd5
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Comprehensive improvement to the vendor assessment agent framework, adding 14 commits covering security tools, safety mechanisms, data enrichment, and test coverage.
Security Tools: Added SPF email authentication checker and HTTP-to-HTTPS redirect detection. Extended headers check with Cross-Origin-*-Policy headers. Strengthened all 5 security tool tests.
Safety & Performance: Implemented 30-second timeouts on browser tools, fixed context cancellation propagation to Chrome tabs, and increased document text extraction from 8K to 32K characters.
Agent Configuration: Added bug_bounty_url and data_locations to VendorInfo for richer vendor profiling. Expanded all 6 system prompts with deeper criteria, risk scoring framework, and category reference examples.
Quality: 335 lines of new unit tests covering pure functions in security tools (parseDMARCTag, parseCSPDirectives, parseSPFPolicy, protocolName, splitTrimmed).
Summary by cubic
Replaces the single‑agent vendor assessment with a multi‑agent vetting system that uses browser, security, and search tools to produce a sourced markdown report with risk ratings and safely update vendor records. Also returns the report and subprocessors via GraphQL and adds a
vendor assessCLI.New Features
vetting(crawler, security, document analyzer, compliance, market presence, data processing, incident response, financial stability, professional standing, regulatory, vendor comparison, web research) to generate a full markdown assessment with an overall risk rating and per‑category summary.agentprogress types with nested steps and randomized messages;step_failedevents included.vetting.Assessorto persist non‑empty vendor updates and emit a vendor‑updated webhook; GraphQL mutation now returnsreportand parsedsubprocessors; newvendor assessCLI command.bug_bounty_urlanddata_locations.WithThinkingbudget for reasoning models, wired intoanthropicand OpenAI providers.Bug Fixes
wss://Chrome address; validate URL schemes; bind HTTP requests to context; SPF parsing fixes (case sensitivity, de‑dupe, mixed‑caseall); DNSSEC TCP fallback on truncation; URL‑encode HIBP params; reject empty link patterns; remove hard cap on extracted links; disable CORS redirects and guard reflected origin; guard empty vendor category; verify certificate chain/hostname afterInsecureSkipVerifyto correctly flag wrong‑host/untrusted certs while still reporting expired certs.Written for commit accabd5. Summary will update on new commits.