-
Notifications
You must be signed in to change notification settings - Fork 2k
feat(review): enhance /review with deterministic analysis, autofix, and security hardening #2932
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
wenshao
wants to merge
100
commits into
main
Choose a base branch
from
feat/review-skill-improvements
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 79 commits
Commits
Show all changes
100 commits
Select commit
Hold shift + click to select a range
d75d176
feat(review): enhance /review with deterministic analysis, build/test…
wenshao 1cbffa4
fix(review): address review feedback on /review skill
wenshao 9a583c9
fix(review): address Copilot review comments
wenshao 54f80a4
fix(review): address second round of Copilot review comments
wenshao 7dd3466
fix(review): add edit tool to allowedTools for autofix
wenshao 034557a
fix(review): address third round of Copilot review comments
wenshao 2525791
fix(review): address fourth round of Copilot review comments
wenshao cb8062f
feat(review): add --comment tip after PR review verdict
wenshao 710d512
fix(review): address fifth round of Copilot review comments
wenshao ab0259c
fix(review): restore environment on early exit paths
wenshao 48340c3
fix(review): preserve highest severity during dedup merge
wenshao fe3d596
fix(review): include both staged and unstaged files in local review
wenshao fab4dc5
feat(review): add model attribution to PR review summary
wenshao 3ac5458
fix(review): address Copilot comments and fix CI build failure
wenshao 71733b5
fix(review): fix misleading test name and comment
wenshao a3b3cbe
fix(review): exclude deleted files from linter target list
wenshao 687d035
fix(review): restrict low-confidence findings to terminal only
wenshao d03a7aa
fix(review): clarify PR comment prefix template format
wenshao e4b8d19
fix(review): use actual PR number in --comment tip
wenshao 1d185a1
fix(review): avoid re-running review when posting PR comments
wenshao 0b290e7
fix(review): fix comment template, build precedence, and timeout unit
wenshao 5ff22dc
feat(review): add commit tip for local reviews with Approve verdict
wenshao fe83b82
feat(review): add "fix these issues" follow-up tip
wenshao ac179b0
docs: add /review user documentation
wenshao 95a62da
docs: fix review doc accuracy and remove non-existent /simplify
wenshao 50d2573
feat(review): add reverse audit step to find coverage gaps
wenshao 757bd98
feat(review): model-aware incremental cache for cross-model review
wenshao ba8a3a7
fix(review): fix section numbering, nav entry, and step hierarchy
wenshao 5effbb6
feat(review): read existing PR comments to avoid duplicate feedback
wenshao e502dee
fix(review): add ms values to Step 1.5 timeout specification
wenshao dd2de17
feat(review): use ephemeral worktree for PR reviews
wenshao a5cc2c3
fix(review): fix 5 worktree issues found in audit
wenshao 08a797c
fix(review): address 4 Copilot comments
wenshao e65e5bd
perf(review): replace N verification agents with single batch verific…
wenshao 9b9bccd
docs: update user doc with token efficiency, fix follow-up table
wenshao 295e907
fix(review): address 4 Copilot comments on worktree and verification
wenshao 1db1dec
docs: replace step list with Mermaid flowchart in How It Works
wenshao ce47f64
docs: replace Mermaid with plain-text pipeline diagram
wenshao 6b28920
docs: fix autofix section redundancy and add pre-fix verdict note
wenshao d6b9b35
fix(review): handle interrupted review cleanup
wenshao 6d7afaf
perf(review): skip verification for reverse audit findings
wenshao ccaab17
docs: add /review design document with architecture rationale
wenshao 09c95c4
fix(review): address 5 Copilot comments
wenshao fd3d2a8
docs: add Fork Subagent as future optimization in DESIGN.md
wenshao e5127e6
fix(review): shell safety for no-findings path, fix DESIGN.md table
wenshao b4ae04e
fix(review): capture headRefOid before autofix to prevent line drift
wenshao 01c105b
refactor(review): renumber steps from sub-steps to sequential 1-11
wenshao af4ae2c
fix: update stale Step 1.5 reference to Step 3 in DESIGN.md
wenshao afbede1
fix(review): detect and reject cross-repo PR URLs
wenshao 0e78ced
fix(review): support cross-repo PR URLs in lightweight mode
wenshao 559f2ef
fix(review): fix cross-repo mode and add documentation
wenshao 8759445
fix(review): explicitly require parallel agent launch in single batch
wenshao 6596362
fix(review): 4 issues found in self-audit
wenshao 720796e
fix(review): safe branch cleanup + cross-repo agent count
wenshao 3825962
fix(review): Step 9 owner/repo must respect cross-repo mode
wenshao b2ec7b0
fix(review): check all git remotes for cross-repo detection
wenshao cbf2aa0
feat(review): add Java and C/C++ support for deterministic analysis
wenshao f357517
fix(review): temp file cleanup in cross-repo + base-branch remote
wenshao cb359dc
fix(review): default remote for PR numbers + fork-aware autofix push
wenshao 2595567
feat(review): auto-discover lint/build/test from CI config
wenshao 8f72357
docs: add CI config auto-discovery to user doc and design doc
wenshao dcb5854
feat(review): model attribution on inline comments + minimal summary
wenshao 57b7353
fix(review): skip redundant review summary when inline comments suffice
wenshao 44241bc
fix(review): 3 issues from self-audit
wenshao 226c9e2
fix(review): CI config discovery runs for ALL projects, not just unma…
wenshao b9d3602
fix(review): 3 Copilot comments — conditional cleanup, italic format,…
wenshao 88ff2ac
fix(review): prefer Maven/Gradle wrappers (./mvnw, ./gradlew)
wenshao e96de40
fix(review): handle partial inline comment failures + accept *italic*
wenshao 446d309
fix(review): pragmatic parallel agent launch instruction
wenshao 205eefe
Merge remote-tracking branch 'origin/main' into feat/review-skill-imp…
wenshao 608550f
fix(review): restore strict parallel agent launch instruction
wenshao c826c24
fix(review): skip Agent 5 in cross-repo mode, update token counts
wenshao 776c9fa
fix(review): enforce short agent prompts to enable parallel launch
wenshao 0c5aeb9
docs: record parallel execution lessons in DESIGN.md
wenshao e671cd3
fix(review): make Maven/Gradle wrapper selection explicit
wenshao 62ca195
fix(review): inline comment footer should include full attribution
wenshao 96fa7ac
fix(review): enforce two-phase posting to prevent duplicate reviews
wenshao 13bac8b
fix(review): make skip-summary rule more prominent with warning emoji
wenshao 60eba65
fix(review): 3 logic holes in Step 9 comment posting flow
wenshao 4091210
fix(review): move incremental check before worktree creation
wenshao a9038a1
fix(review): 5 issues — CI security, incremental+comment, doc accuracy
wenshao 009ac9a
fix(review): enforce exact footer template to prevent LLM rewriting
wenshao 6b54aff
fix(review): add reportPath to cache for --comment on unchanged PR
wenshao 3e5582a
fix(review): simplify --comment on unchanged PR — just re-run review
wenshao 2717aa1
feat(review): add "post comments" tip for zero-findings PR review
wenshao 820191d
docs: add zero-findings PR tip to follow-up table
wenshao 297d81e
fix(review): enforce one-line summary body for Approve/Request changes
wenshao 16640e9
Merge remote-tracking branch 'origin/main' into feat/review-skill-imp…
wenshao ea2f8e1
feat(review): match review language to PR language
wenshao 2824bf6
fix(review): avoid #N notation in PR comments (GitHub auto-links)
wenshao 3982df5
fix(review): PR summary should not repeat inline comments or show stats
wenshao 5a41e17
fix(review): clarify Critical vs Suggestion severity boundary
wenshao 3f91c1d
feat(review): use GitHub suggestion blocks for one-click fixes
wenshao 1a93078
fix(review): suggestion blocks supported by GitLab/Gitea too, not jus…
wenshao 664bc34
fix(review): fix broken nested code fence in comment template
wenshao ab72b5e
fix(review): simplify comment template — example-first, rules after
wenshao bb85768
fix(review): detect old review comments + require line numbers
wenshao 97aacfa
fix(review): enforce correct verdict flag + exclude Nice to have from PR
wenshao d4b96c8
feat(review): use Create Review API for single-call review submission
wenshao a83877e
fix(review): findings MUST go in comments array, NOT in body
wenshao File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,277 @@ | ||
| # Code Review | ||
|
|
||
| > Review code changes for correctness, security, performance, and code quality using `/review`. | ||
|
|
||
| ## Quick Start | ||
|
|
||
| ```bash | ||
| # Review local uncommitted changes | ||
| /review | ||
|
|
||
| # Review a pull request (by number or URL) | ||
| /review 123 | ||
| /review https://github.com/org/repo/pull/123 | ||
|
|
||
| # Review and post inline comments on the PR | ||
| /review 123 --comment | ||
|
|
||
| # Review a specific file | ||
| /review src/utils/auth.ts | ||
| ``` | ||
wenshao marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| If there are no uncommitted changes, `/review` will let you know and stop — no agents are launched. | ||
|
|
||
| ## How It Works | ||
|
|
||
| The `/review` command runs a multi-stage pipeline: | ||
|
|
||
| ``` | ||
| Step 1: Determine scope (local diff / PR worktree / file) | ||
| Step 2: Load project review rules | ||
| Step 3: Run deterministic analysis (linter, typecheck) [zero LLM cost] | ||
| Step 4: 5 parallel review agents [5 LLM calls] | ||
| |-- Agent 1: Correctness & Security | ||
| |-- Agent 2: Code Quality | ||
| |-- Agent 3: Performance & Efficiency | ||
| |-- Agent 4: Undirected Audit | ||
| '-- Agent 5: Build & Test (runs shell commands) | ||
| Step 5: Deduplicate --> Batch verify --> Aggregate [1 LLM call] | ||
| Step 6: Reverse audit (find coverage gaps) [1 LLM call] | ||
| Step 7: Present findings + verdict | ||
| Step 8: Autofix (user-confirmed, optional) | ||
| Step 9: Post PR inline comments (if requested) | ||
| Step 10: Save report + incremental cache | ||
| Step 11: Clean up (remove worktree + temp files) | ||
| ``` | ||
|
|
||
| ### Review Agents | ||
|
|
||
| | Agent | Focus | | ||
| | --------------------------------- | ------------------------------------------------------------------ | | ||
| | Agent 1: Correctness & Security | Logic errors, null handling, race conditions, injection, XSS, SSRF | | ||
| | Agent 2: Code Quality | Style consistency, naming, duplication, dead code | | ||
| | Agent 3: Performance & Efficiency | N+1 queries, memory leaks, unnecessary re-renders, bundle size | | ||
| | Agent 4: Undirected Audit | Business logic, boundary interactions, hidden coupling | | ||
| | Agent 5: Build & Test | Runs build and test commands, reports failures | | ||
|
|
||
| All agents run in parallel. All findings are then verified in a **single batch verification pass** (one agent reviews all findings at once, keeping LLM calls fixed regardless of finding count). After verification, a **reverse audit agent** re-reads the entire diff with knowledge of all confirmed findings to catch issues that every other agent missed. Reverse audit findings skip verification (the agent already has full context) and are included directly as high-confidence results. | ||
wenshao marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
|
||
| ## Deterministic Analysis | ||
|
|
||
| Before the LLM agents run, `/review` automatically runs your project's existing linters and type checkers: | ||
|
|
||
| | Language | Tools detected | | ||
| | --------------------- | ---------------------------------------------------------------- | | ||
| | TypeScript/JavaScript | `tsc --noEmit`, `npm run lint`, `eslint` | | ||
| | Python | `ruff`, `mypy`, `flake8` | | ||
| | Rust | `cargo clippy` | | ||
| | Go | `go vet`, `golangci-lint` | | ||
| | Java | `mvn compile`, `checkstyle`, `spotbugs`, `pmd` | | ||
| | C/C++ | `clang-tidy` (if `compile_commands.json` available) | | ||
| | Other | Auto-discovered from CI config (`.github/workflows/*.yml`, etc.) | | ||
|
|
||
| For projects that don't match standard patterns (e.g., OpenJDK), `/review` reads CI configuration files to discover what lint/check commands the project uses. No user configuration needed. | ||
|
|
||
| Deterministic findings are tagged with `[linter]` or `[typecheck]` and skip LLM verification — they are ground truth. | ||
|
|
||
| - **Errors** → Critical severity | ||
| - **Warnings** → Nice to have (terminal only, not posted as PR comments) | ||
|
|
||
| If a tool is not installed or times out, it is skipped with an informational note. | ||
|
|
||
| ## Severity Levels | ||
|
|
||
| | Severity | Meaning | Posted as PR comment? | | ||
| | ---------------- | ------------------------------------------------------------------- | -------------------------- | | ||
| | **Critical** | Must fix before merging (bugs, security, data loss, build failures) | Yes (high-confidence only) | | ||
| | **Suggestion** | Recommended improvement | Yes (high-confidence only) | | ||
| | **Nice to have** | Optional optimization | No (terminal only) | | ||
|
|
||
| Low-confidence findings appear in a separate "Needs Human Review" section in the terminal and are never posted as PR comments. | ||
|
|
||
| ## Autofix | ||
|
|
||
| After presenting findings, `/review` offers to auto-apply fixes for Critical and Suggestion findings that have clear solutions: | ||
|
|
||
| ``` | ||
| Found 3 issues with auto-fixable suggestions. Apply auto-fixes? (y/n) | ||
| ``` | ||
|
|
||
| - Fixes are applied using the `edit` tool (targeted replacements, not full-file rewrites) | ||
| - Per-file linter checks run after fixes to verify they don't introduce new issues | ||
| - For PR reviews, fixes are committed and pushed from the worktree automatically — your working tree stays clean | ||
| - Nice to have and low-confidence findings are never auto-fixed | ||
| - PR review submission always uses the **pre-fix verdict** (e.g., "Request changes") since the remote PR hasn't been updated until the autofix push completes | ||
|
|
||
| ## Worktree Isolation | ||
|
|
||
| When reviewing a PR, `/review` creates a temporary git worktree (`.qwen/tmp/review-pr-<number>`) instead of switching your current branch. This means: | ||
|
|
||
| - Your working tree, staged changes, and current branch are **never touched** | ||
| - Dependencies are installed in the worktree (`npm ci`, etc.) so linting and build/test work | ||
| - Build and test commands run in isolation without polluting your local build cache | ||
| - If anything goes wrong, your environment is unaffected — just delete the worktree | ||
| - The worktree is automatically cleaned up after the review completes | ||
| - If a review is interrupted (Ctrl+C, crash), the next `/review` of the same PR automatically cleans up the stale worktree before starting fresh | ||
| - Review reports and cache are saved to the main project directory (not the worktree) | ||
|
|
||
| ## Cross-repo PR Review | ||
|
|
||
| You can review PRs from other repositories by passing the full URL: | ||
|
|
||
| ```bash | ||
| /review https://github.com/other-org/other-repo/pull/456 | ||
| ``` | ||
|
|
||
| This runs in **lightweight mode** — no worktree, no linter, no build/test, no autofix. The review is based on the diff text only (fetched via GitHub API). PR comments can still be posted if you have write access. | ||
|
|
||
| | Capability | Same-repo | Cross-repo | | ||
| | ------------------------------------------------ | --------- | ----------------------------- | | ||
| | LLM review (Agents 1-4 + verify + reverse audit) | ✅ | ✅ | | ||
| | Agent 5: Build & test | ✅ | ❌ (no local codebase) | | ||
| | Deterministic analysis (linter/typecheck) | ✅ | ❌ | | ||
| | Cross-file impact analysis | ✅ | ❌ | | ||
| | Autofix | ✅ | ❌ | | ||
| | PR inline comments | ✅ | ✅ (if you have write access) | | ||
| | Incremental review cache | ✅ | ❌ | | ||
|
|
||
| ## PR Inline Comments | ||
|
|
||
| Use `--comment` to post findings directly on the PR: | ||
|
|
||
| ```bash | ||
| /review 123 --comment | ||
| ``` | ||
|
|
||
| Or, after running `/review 123`, type `post comments` to publish findings without re-running the review. | ||
|
|
||
| **What gets posted:** | ||
|
|
||
| - High-confidence Critical and Suggestion findings as inline comments on specific lines | ||
| - A review summary with verdict (Approve / Request changes / Comment) | ||
wenshao marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| - Model attribution footer (e.g., _Reviewed by qwen3-coder via Qwen Code /review_) | ||
|
|
||
| **What stays terminal-only:** | ||
|
|
||
| - Nice to have findings (including linter warnings) | ||
| - Low-confidence findings | ||
|
|
||
| ## Follow-up Actions | ||
|
|
||
| After the review, context-aware tips appear as ghost text. Press Tab to accept: | ||
|
|
||
| | State after review | Tip | What happens | | ||
| | ---------------------------------- | ------------------ | --------------------------------------- | | ||
| | Local review with unfixed findings | `fix these issues` | LLM interactively fixes each finding | | ||
| | PR review with findings | `post comments` | Posts PR inline comments (no re-review) | | ||
| | Local review, all clear | `commit` | Commits your changes | | ||
|
|
||
| Note: `fix these issues` is only available for local reviews. For PR reviews, use Autofix (Step 8) — the worktree is cleaned up after the review, so post-review interactive fixing is not possible. | ||
|
|
||
| ## Project Review Rules | ||
|
|
||
| You can customize review criteria per project. `/review` reads rules from these files (in order): | ||
|
|
||
| 1. `.qwen/review-rules.md` (Qwen Code native) | ||
| 2. `.github/copilot-instructions.md` (preferred) or `copilot-instructions.md` (fallback — only one is loaded, not both) | ||
| 3. `AGENTS.md` — `## Code Review` section | ||
| 4. `QWEN.md` — `## Code Review` section | ||
|
|
||
| Rules are injected into the LLM review agents (1-4) as additional criteria. For PR reviews, rules are read from the **base branch** to prevent a malicious PR from injecting bypass rules. | ||
|
|
||
| Example `.qwen/review-rules.md`: | ||
|
|
||
| ```markdown | ||
| # Review Rules | ||
|
|
||
| - All API endpoints must validate authentication | ||
| - Database queries must use parameterized statements | ||
| - React components must not use inline styles | ||
| - Error messages must not expose internal paths | ||
| ``` | ||
|
|
||
| ## Incremental Review | ||
|
|
||
| When reviewing a PR that was previously reviewed, `/review` only examines changes since the last review: | ||
|
|
||
| ```bash | ||
| # First review — full review, cache created | ||
| /review 123 | ||
|
|
||
| # PR updated with new commits — only new changes reviewed | ||
| /review 123 | ||
| ``` | ||
|
|
||
| ### Cross-model review | ||
|
|
||
| If you switch models (via `/model`) and re-review the same PR, `/review` detects the model change and runs a full review instead of skipping: | ||
|
|
||
| ```bash | ||
| # Review with model A | ||
| /review 123 | ||
|
|
||
| # Switch model | ||
| /model | ||
|
|
||
| # Review again — full review with model B (not skipped) | ||
| /review 123 | ||
| # → "Previous review used qwen3-coder. Running full review with gpt-4o for a second opinion." | ||
| ``` | ||
|
|
||
| Cache is stored in `.qwen/review-cache/` and tracks both the commit SHA and model ID. Make sure this directory is in your `.gitignore` (a broader rule like `.qwen/*` also works). If the cached commit was rebased away, it falls back to a full review. | ||
|
|
||
| ## Review Reports | ||
|
|
||
| Every review is saved as a Markdown file in your project's `.qwen/reviews/` directory: | ||
|
|
||
wenshao marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| ``` | ||
| .qwen/reviews/2026-04-06-143022-pr-123.md | ||
| .qwen/reviews/2026-04-06-150510-local.md | ||
| ``` | ||
|
|
||
| Reports include: timestamp, diff stats, deterministic analysis results, all findings with verification status, and the verdict. | ||
|
|
||
| ## Cross-file Impact Analysis | ||
|
|
||
| When code changes modify exported functions, classes, or interfaces, the review agents automatically search for all callers and check compatibility: | ||
|
|
||
| - Parameter count/type changes | ||
| - Return type changes | ||
| - Removed or renamed public methods | ||
| - Breaking API changes | ||
|
|
||
| For large diffs (>10 modified symbols), analysis prioritizes functions with signature changes. | ||
|
|
||
| ## Token Efficiency | ||
|
|
||
| The review pipeline uses a fixed number of LLM calls regardless of how many findings are produced: | ||
|
|
||
| | Stage | LLM calls | Notes | | ||
| | ------------------------------- | ---------- | --------------------------------------------------- | | ||
| | Deterministic analysis (Step 3) | 0 | Shell commands only | | ||
| | Review agents (Step 4) | 5 (or 4) | Run in parallel; Agent 5 skipped in cross-repo mode | | ||
| | Batch verification (Step 5) | 1 | Single agent verifies all findings at once | | ||
| | Reverse audit (Step 6) | 1 | Finds coverage gaps; findings skip verification | | ||
| | **Total** | **7 or 6** | Same-repo: 7; cross-repo: 6 (no Agent 5) | | ||
|
|
||
| ## What's NOT Flagged | ||
|
|
||
| The review intentionally excludes: | ||
|
|
||
| - Pre-existing issues in unchanged code (focus on the diff only) | ||
| - Style/formatting/naming that matches your codebase conventions | ||
| - Issues a linter or type checker would catch (handled by deterministic analysis) | ||
| - Subjective "consider doing X" suggestions without a real problem | ||
| - Minor refactoring that doesn't fix a bug or risk | ||
| - Missing documentation unless the logic is genuinely confusing | ||
| - Issues already discussed in existing PR comments (avoids duplicating human feedback) | ||
|
|
||
| ## Design Philosophy | ||
|
|
||
| > **Silence is better than noise.** Every comment should be worth the reader's time. | ||
|
|
||
| - If unsure whether something is a problem → don't report it | ||
| - Linter/typecheck issues are handled by tools, not LLM guesses | ||
| - Same pattern across N files → aggregated into one finding | ||
| - PR comments are high-confidence only | ||
| - Style/formatting issues matching codebase conventions are excluded | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.