Skip to content

ci: pin actions to SHA and run full test suite on PRs#131

Open
zeevmoney wants to merge 4 commits into
mainfrom
per-15306/ci-pin-actions-tests-on-pr
Open

ci: pin actions to SHA and run full test suite on PRs#131
zeevmoney wants to merge 4 commits into
mainfrom
per-15306/ci-pin-actions-tests-on-pr

Conversation

@zeevmoney

Copy link
Copy Markdown
Contributor
  • What kind of change does this PR introduce?

    CI / build hardening (chore). No SDK source changes. PR#1 of a 3-PR stack (next: AVA→Vitest migration with event-based waits; then comprehensive unit + e2e coverage).

    Linear: PER-15306

  • What is the current behavior?

    • GitHub Actions float on mutable major tags (@v4).
    • The backend-dependent tests (integration + e2e) only run for real in the release/publish workflow. The PR/push CI runs yarn test with a dummy PDP_API_KEY=test and no backend, so they can't actually pass on PRs.
    • CI node matrix is 18/20 (18 is EOL).
  • What is the new behavior?

    • Actions pinned to full commit SHAs at latest stable, with version comments, in both workflows:
      • actions/checkout@9c091bb…0e0 # v7.0.0
      • actions/setup-node@48b55a0…41e # v6.4.0
      • persist-credentials: false on every checkout (verified safe in the publish workflow — it only makes local commits, nothing is pushed).
    • Full test suite runs on PRs/pushes, not only on release. The test job branches:
      • Same-repo PRs / pushes: provision a throwaway Permit env via the PROJECT_API_KEY secret, run a dockerized permitio/pdp-v2 (with -e PDP_API_KEY/-e PERMIT_API_KEY and a /healthy readiness wait), run test:ci:full (unit + integration + module-imports + e2e rbac/rebac/local_facts) against API_TIER=prod, and delete the env on always().
      • Fork PRs / secret-less runs (e.g. Dependabot): fall back to test:ci:unit (unit + module-imports — genuinely no backend needed).
    • A same-repo run that is missing PROJECT_API_KEY now fails fast instead of silently downgrading to unit-only and reporting a misleading green.
    • Concurrency (cancel-in-progress per ref), least-privilege permissions: contents: read, and a lint-and-build → matrix test (20/22) job split.
    • New scripts: test:ci:unit, test:ci:full.
  • Other information:

    How it was tested

    • actionlint clean on both workflows (also fixed a pre-existing unquoted $GITHUB_ENV in the publish workflow).
    • SHAs re-resolved live against the GitHub API; yarn build + package.json JSON validity confirmed.
    • The new test job's full path can only be exercised once merged to a branch with the secret — review the workflow logic accordingly.

    Known risk (resolved by the next stacked PR)

    • The e2e suites still use hardcoded sleep(10s) waits for cloud→PDP propagation. This PR is the first to run them in CI (against a cold, freshly-provisioned env), so they may be intermittently flaky until PR#2 replaces the timers with event/poll-based waits. AVA's failFast: true means one flake aborts the suite — expect possible re-runs until PR#2 lands.

    Deliberately deferred to PR#2 (the test-quality / Vitest migration PR)

    • Two orphaned specs run in no path today: src/tests/bulk_test.spec.ts and src/tests/e2e/lists.e2e.spec.ts. They're kept (valuable bulk-API and list-pagination coverage), not deleted. Vitest's *.spec.ts auto-discovery will pick them up, and lists.e2e's absolute-count assertions (which collide in a shared env) will be reworked there.
    • Quoting the ** globs in test:integration / test:module-imports (moot after the Vitest migration).

    Deferred follow-ups (noted, not in this PR)

    • Factor the temp-env provisioning into a shared composite action used by both ci.yaml and the publish workflow.
    • Add -e PDP_API_KEY/-e PERMIT_API_KEY to the publish workflow's PDP (it never runs e2e today, so it isn't broken).
    • A scheduled sweeper to delete any leaked node-sdk-ci-* envs from cancelled runs.

Generated with Claude Code

Pin actions/checkout (v7.0.0) and actions/setup-node (v6.4.0) to full
commit SHAs in both workflows, set persist-credentials: false on all
checkouts, and bump the CI node matrix from 18/20 to 20/22 (18 is EOL).

Run the full suite on PRs/pushes, not only on release. Same-repo events
provision a throwaway Permit env via PROJECT_API_KEY, run a dockerized
PDP (now with -e PDP_API_KEY/PERMIT_API_KEY and a /healthy readiness
wait), execute test:ci:full, and delete the env on always(). Fork and
secret-less runs fall back to the no-backend test:ci:unit suite. Add the
two supporting scripts and quote $GITHUB_ENV in the publish workflow.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@linear-code

linear-code Bot commented Jun 28, 2026

Copy link
Copy Markdown

PER-15306

zeevmoney and others added 3 commits June 28, 2026 23:40
The e2e suites are AVA with fixed sleep(10s) waits and fail fast on the
first error. Against a freshly started PDP they hit a momentary
ECONNREFUSED window right after the write burst (OPA reload), which kills
the whole run even though the env, key, and policy sync are all healthy
(/healthy passes). Scope the PR backend run to the suite that reliably
passes — unit + integration + module-imports (what `yarn test` runs, the
same set the publish workflow runs). The event-based, error-tolerant e2e
lands in the stacked test-migration PR, which re-includes e2e in CI.

Also add a PDP diagnostics step (docker logs + container state +
/healthy) on backend-run failure so PDP connection errors, which surface
with no HTTP response, are debuggable.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Stacked PRs target feature branches, so a pull_request filter of
branches:[main] meant they never ran CI. Drop the base-branch filter so
every PR is tested regardless of base.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Node resolves `localhost` to ::1 (IPv6) first, but the GitHub runner's
Docker IPv6 port publish refuses connections, so e2e permit.check() calls
hit ECONNREFUSED even though the PDP is healthy on IPv4 (curl /healthy
returns 200). Set PDP_URL to http://127.0.0.1:7766 so the SDK uses the
working IPv4 path, and pin the readiness probe to 127.0.0.1 too.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens CI for the Node SDK by pinning GitHub Actions to immutable SHAs and expanding PR/push validation to run the full backend-dependent test suite when secrets are available, while falling back to unit-only checks for forked PRs/secret-less runs.

Changes:

  • Add CI-focused scripts (test:ci:unit, test:ci:full) to support running either unit-only or full (backend + e2e) suites.
  • Update the main CI workflow to split lint-and-build from a Node matrix test job, provision a temporary Permit environment + local PDP for same-repo runs, and enforce least-privilege permissions + concurrency cancellation.
  • Pin Actions used in CI and publish workflows to full commit SHAs and harden checkout settings (persist-credentials: false), plus fix quoting of $GITHUB_ENV writes in publish.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
package.json Adds CI-specific test scripts used by the updated workflow.
.github/workflows/ci.yaml Runs full backend test suite for same-repo PRs/pushes (with temp env + PDP), unit-only for forks; adds concurrency, permissions, and pinned actions.
.github/workflows/node_sdk_publish.yaml Pins actions to SHAs, disables persisted credentials on checkout, and fixes $GITHUB_ENV quoting.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread .github/workflows/ci.yaml
Comment on lines +118 to +121
docker run -d --name pdp -p 7766:7000 \
-e PDP_API_KEY="$ENV_API_KEY" \
-e PERMIT_API_KEY="$ENV_API_KEY" \
permitio/pdp-v2:latest
Comment thread .github/workflows/ci.yaml
Comment on lines +167 to +168
echo "::group::curl -sv http://localhost:7766/healthy"
curl -sv http://localhost:7766/healthy || true
Comment thread package.json
"test:e2e:rebac": "run-s build && ava --verbose build/tests/e2e/rebac.e2e.spec.js",
"test:e2e:local_facts": "run-s build && ava --verbose build/tests/e2e/local_facts.e2e.spec.js",
"test:ci:unit": "run-s test:unit test:module-imports",
"test:ci:full": "run-s test",

@dshoen619 dshoen619 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review summary

Solid hardening overall — SHA pins verified correct (checkout@9c091bb… = v7.0.0, setup-node@48b55a0… = v6.4.0), secret masking is ordered correctly (::add-mask:: runs before the $GITHUB_ENV write and before the PDP starts), fork-PR safety is right (pull_request, not pull_request_target; the gating expression semantics check out), and the least-privilege permissions, per-ref concurrency, the lint-and-build → test split, dropping EOL Node 18, and the $GITHUB_ENV quoting fix are all good.

One blocking issue (inline on package.json): test:ci:fullrun-s test:* does not match the test:e2e:* scripts — npm-run-all treats : as a path separator and a single * doesn't cross it — so the e2e suites silently never run, which is this PR's headline goal. Verified empirically against this repo's npm-run-all@4.1.5. As a side effect, the PDP machinery in ci.yaml is currently unexercised by the suites that actually run.

Inline comments cover: the e2e wiring (blocking), the Dependabot gate (latent), the :latest PDP tag, the misleading IPv4 diagnostic, swallowed curl errors, and the env-leak window.

Pre-existing items in node_sdk_publish.yaml (out of scope for this PR, noted for a follow-up):

  • Lines 99/106 — github.event.release.tag_name is interpolated straight into the run: shell; a crafted tag name is code-exec (requires release-create access). Pass it through an env: var.
  • Lines 74–79 — the publish PDP gets no -e PDP_API_KEY/-e PERMIT_API_KEY and no readiness wait. Harmless today (as the PR notes — yarn test there doesn't exercise the PDP either), but fix when e2e actually runs.
  • Line 31 — npm install -g npm@latest is unpinned.

Reasoning cross-checked with the javascript-typescript and backend-development review agents. Leaving as comments (not a formal block); happy to convert any of these to issues or apply the fixes if useful.

Comment thread package.json
"test:e2e:rebac": "run-s build && ava --verbose build/tests/e2e/rebac.e2e.spec.js",
"test:e2e:local_facts": "run-s build && ava --verbose build/tests/e2e/local_facts.e2e.spec.js",
"test:ci:unit": "run-s test:unit test:module-imports",
"test:ci:full": "run-s test",

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Blocking — test:ci:full never runs the e2e suites (the PR's headline goal).

run-s testrun-s test:*, and npm-run-all treats : as a path separator: a single * does not cross it. Verified empirically with this repo's npm-run-all@4.1.5:

run-s test:*  →  test:unit, test:integration, test:module-imports     # test:e2e:* NOT matched

So test:ci:full runs unit + integration + module-imports only — test:e2e:rbac/rebac/local_facts silently never execute, and the "flaky e2e sleeps" risk is moot because they don't run. Enumerate explicitly:

Suggested change
"test:ci:full": "run-s test",
"test:ci:full": "run-s test:unit test:integration test:module-imports test:e2e:rbac test:e2e:rebac test:e2e:local_facts",

⚠️ Do not "fix" this with run-s test:**** does cross :, so it would match test:ci:full/test:ci:unit and infinite-loop.

Two adjacent notes (lines 34–35, outside this hunk): test:integration and test:module-imports pass their ** globs unquoted, unlike test:unit — works only because those dirs are flat today; add a subdirectory and the shell pre-expands the glob and silently drops the top-level spec. Quote them like test:unit. Also, each test:* re-runs build, so test:ci:full rebuilds 3×.

Comment thread .github/workflows/ci.yaml
# Backend suite runs only for same-repo events AND when the provisioning
# secret is available. Fork PRs and secret-less runs (e.g. Dependabot)
# fall back to the no-backend suite.
RUN_BACKEND: ${{ (github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository) && secrets.PROJECT_API_KEY != '' }}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 This gate will block Dependabot (latent today).

Dependabot PRs are same-repo (head.repo.full_name == github.repository → true) but get no access to regular secrets, so RUN_BACKEND evaluates to false and the "Require backend secret" step (line 67) hits exit 1. There's no .github/dependabot.yml yet, so it's unreachable today — but it breaks the moment Dependabot (or any same-repo bot without the secret) is enabled. Exempt the actor:

Suggested change
RUN_BACKEND: ${{ (github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository) && secrets.PROJECT_API_KEY != '' }}
RUN_BACKEND: ${{ (github.event_name == 'push' || (github.event.pull_request.head.repo.full_name == github.repository && github.actor != 'dependabot[bot]')) && secrets.PROJECT_API_KEY != '' }}

Apply the same && github.actor != 'dependabot[bot]' to the gate if: on line 67.

Comment thread .github/workflows/ci.yaml
-d "{\"key\":\"${ENV_KEY}\",\"name\":\"${ENV_KEY}\"}")
env_id=$(printf '%s' "$response" | jq -r '.id // empty')
if [ -z "$env_id" ]; then
echo "Failed to create env (key=${ENV_KEY})." >&2

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — error responses are swallowed. curl -sS exits 0 on HTTP 4xx/5xx and only .id presence is checked, so a 401/429 surfaces as a bare "Failed to create env" with no context. Echo the response (safe — PROJECT_API_KEY is in the request header, not the body):

Suggested change
echo "Failed to create env (key=${ENV_KEY})." >&2
echo "Failed to create env (key=${ENV_KEY}). Response: ${response}" >&2

Same applies to the "Fetch env API key" step (line 106).

Comment thread .github/workflows/ci.yaml
docker run -d --name pdp -p 7766:7000 \
-e PDP_API_KEY="$ENV_API_KEY" \
-e PERMIT_API_KEY="$ENV_API_KEY" \
permitio/pdp-v2:latest

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 permitio/pdp-v2:latest is a mutable tag — inconsistent with the SHA-pinning discipline this PR otherwise applies. A silent PDP image update changes CI behavior with no code change, and a compromised image would receive ENV_API_KEY at runtime. Pin to a digest (permitio/pdp-v2@sha256:<digest>) and bump it intentionally. (The publish workflow uses the same mutable tag.)

Comment thread .github/workflows/ci.yaml
# Docker IPv6 publish refuses connections, so PDP checks would hit
# ECONNREFUSED. 127.0.0.1 pins the working IPv4 path.
PDP_URL: http://127.0.0.1:7766
run: yarn test:ci:full

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tied to the package.json comment: because test:ci:full doesn't include the e2e suites, this step runs unit + integration + module-imports only. The integration/endpoints suite (src/tests/endpoints/test-environments.spec.ts) makes control-plane API calls only and never calls permit.check() — so the dockerized PDP, the /healthy wait, and the IPv4 PDP_URL pin above are provisioned but never actually queried by anything that runs here. Once the e2e wiring is fixed in package.json, this all becomes load-bearing.

Comment thread .github/workflows/ci.yaml
docker logs pdp || true
echo "::endgroup::"
echo "::group::curl -sv http://localhost:7766/healthy"
curl -sv http://localhost:7766/healthy || true

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Diagnostic uses localhost, defeating the IPv4 workaround. The readiness loop deliberately uses 127.0.0.1 (the whole point of the PDP_URL pin), but this on-failure diagnostic uses localhost → resolves to ::1 → reports connection-refused even when the PDP is healthy on IPv4, i.e. it misleads exactly when you need it.

Suggested change
curl -sv http://localhost:7766/healthy || true
curl -sv http://127.0.0.1:7766/healthy || true

Comment thread .github/workflows/ci.yaml

# ---------- Cleanup (always, even on failure/cancel) ----------
- name: Delete temp Permit env
if: ${{ always() && steps.provision.outputs.env_id != '' }}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note (acknowledged in the PR): cleanup keys off steps.provision.outputs.env_id, so if cancel-in-progress kills the provision step after the env is created server-side but before env_id is written to $GITHUB_OUTPUT, the env leaks (up to 2/run with the matrix). The deferred sweeper is the right fix — worth prioritizing since the leak rate scales with PR/push volume.

uses: actions/checkout@v4
uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with:
persist-credentials: false

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

persist-credentials: false ✅ — confirmed safe: this workflow makes only local commits and never git pushes. One pre-existing corollary (out of scope, just noting): because nothing is pushed, the git commit -m "update tsdoc" on line 43 has always been discarded at job end — a dead write. Either add a push or drop the commit step.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants