Skip to content

fix: redact credentials in logs at the model layer#921

Open
Zivxx wants to merge 13 commits into
masterfrom
fix/redact-credentials-in-logs
Open

fix: redact credentials in logs at the model layer#921
Zivxx wants to merge 13 commits into
masterfrom
fix/redact-credentials-in-logs

Conversation

@Zivxx

@Zivxx Zivxx commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Context

Follow-up to #902. A customer reported credentials still leaking into logs after that fix — this time via opal_client/data/updater.py: logger.exception(...) serialized the full DataSourceEntry, whose config carries auth headers and data the payload.

Rather than patch one line and wait for the next path, this addresses the root cause at the model layer and then sweeps the codebase (with multi-agent adversarial review) for every related vector.

1. Central fix — model-layer redaction

A small RedactedReprMixin (opal_common/logging_utils/redaction.py) overrides __repr_args__ so repr()/str() — and therefore loguru's serialize path — mask secret-bearing fields. The redaction set is unioned across the MRO, so a subclass can only ever add protection, never silently drop it.

Model Redacted fields
FetcherConfig / HttpFetcherConfig headers, data
FastApiRpcFetchConfig rpc_arguments
DataSourceEntry config, data
FetchEvent config

This protects every current and future log site at once, and propagates transitively to container models (DataUpdate, DataSourceConfig, DataUpdateReport).

Why it's safe (no functional impact)

__repr_args__ only feeds repr()/str(). Every functional path uses something else and is unaffected — verified empirically against pydantic 1.10 and 2.13:

  • Transport: update.dict(by_alias=True) / .json(); event reconstruction via .dict().
  • Hashing / dedup keys: calc_hash uses config.json(); repos_last_fetched keys on source id/url.
  • Equality / copy / schema / pickle / jsonable_encoder: none call __repr_args__.

2. URL credential redaction

DataSourceEntry.url and git repo URLs can embed credentials (https://user:token@host). Added http_utils.redact_url() / redact_url_in_text() and applied them to every URL log/exception in data/updater.py, data_update_publisher.py, repo_cloner.py, and git_fetcher.py. The functional clone_repository() call still gets the real URL.

3. Traceback leak — LOG_DIAGNOSE default flipped to False

loguru's diagnose=True renders raw local-variable values in tracebacks (e.g. a bare headers dict, or opa_client._get_auth_headers()'s Bearer token), which bypasses model-level redaction entirely. Default flipped TrueFalse (matches loguru's own production guidance); operators can opt back in for local debugging.

4. Verbose git tracing gated

git_utils/env.py forced GIT_TRACE=1 + GIT_CURL_VERBOSE=1 on every SSH clone, making git dump HTTP Authorization headers to captured stderr. Now only enabled when LOG_DIAGNOSE is on.

Spot fixes (belt-and-suspenders)

Sites that logged whole credential-bearing objects now log only safe, redacted fields: data/updater.py (×3), data_update_publisher.py, data/api.py (dropped the config model), fastapi_rpc_fetch_provider.py (stopped logging rpc_arguments).

⚠️ Reviewer note — one intentional visible change

The publish-data-update CLI (cli/commands.py:188) echoes str(update), which now renders config='<redacted>' data='<redacted>'. The published payload is unchanged (uses update.json()). Redacting the stdout echo is intentional — CLI output lands in shell history / CI logs.

Tests

opal-common/.../tests/redaction_test.py covers: repr/str redaction (incl. f-strings & nested DataUpdate), the RPC subclass, wire-payload integrity, loguru serialize=True and diagnose=True, the MRO-union footgun, and redact_url/redact_url_in_text. Verified locally against pydantic 1.10 + loguru: 58 tests pass (new + #902 reporter regression + opal-common + git_utils suites).

Review

Reviewed by a 4-agent adversarial panel (security/leak-completeness, pydantic-internals, behavioral-regression, code-quality); all findings above were surfaced and addressed.

🤖 Generated with Claude Code

Follow-up to #902. A customer reported credentials still leaking into logs,
this time via opal_client/data/updater.py (logger.exception serializing the
full DataSourceEntry, whose `config` carries auth headers and `data` the
payload). A full codebase scan found several more sites with the same pattern.

Root cause: log statements interpolate credential-bearing pydantic models
(`{entry}`, `{config}`, `model=...`) and loguru's serialize=True sink dumps
record["extra"] as JSON, falling back to str() for non-JSON objects.

Central fix: a RedactedReprMixin that overrides __repr_args__ on the
credential-bearing models so repr()/str() (and thus the loguru serialize path)
mask secret fields:
  - FetcherConfig / HttpFetcherConfig -> headers, data
  - DataSourceEntry                   -> config, data
  - FetchEvent                        -> config
This protects every current and future log site at once. Wire serialization
(.json()/.dict()), hashing (calc_hash uses config.json()), dedup keys, and
equality are untouched, so transport and behavior are unaffected.

Belt-and-suspenders spot fixes (log only safe fields):
  - opal-client data/updater.py (x3): log entry.url instead of the entry
  - opal-server data_update_publisher.py: log entry.url instead of the entry
  - opal-server data/api.py: drop the data_sources_config model from the error
  - opal-common fastapi_rpc_fetch_provider.py: stop logging rpc_arguments

Adds opal-common tests/redaction_test.py covering repr/str redaction,
wire-payload integrity, and the loguru serialize=True leak scenario.

Note for reviewers: the publish-data-update CLI (cli/commands.py:188) echoes
str(update), which now shows config/data as <redacted>. The published payload
itself (update.json(), line 180) is unaffected; redacting the stdout echo is
intentional since it can land in shell history / CI logs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@netlify

netlify Bot commented Jun 18, 2026

Copy link
Copy Markdown

Deploy Preview for opal-docs ready!

Name Link
🔨 Latest commit 5c54135
🔍 Latest deploy log https://app.netlify.com/projects/opal-docs/deploys/6a439fff61c03700085928df
😎 Deploy Preview https://deploy-preview-921--opal-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

Zivxx and others added 4 commits June 18, 2026 19:08
… userinfo)

Addresses findings from a multi-agent adversarial review of the redaction fix:

- RedactedReprMixin now unions `_redacted_repr_fields` across the whole MRO, so
  a subclass can only ever ADD fields to redact - it can never silently drop a
  parent's protection by redeclaring/omitting the set. Prevents the exact
  failure mode below.
- FastApiRpcFetchConfig: redact `rpc_arguments` (a secret-bearing FetcherConfig
  subclass that was missed - a live leak of the same class as the original bug).
- Add http_utils.redact_url() to strip embedded `user:password@` credentials
  from URLs, and apply it to every `entry.url` log/exception in data/updater.py
  and data_update_publisher.py (DataSourceEntry.url is a free string and may be
  a basic-auth URL).
- Tests: add nested-container (DataUpdate), RPC-config, diagnose=True, and
  redact_url cases.
- Nits: fix pre-existing "uniqueue" typo, drop unused imports in schemas/data.py.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two leak vectors outside the model layer, found in the expert review:

1. LOG_DIAGNOSE defaulted to True. loguru's diagnose=True renders raw local
   variable VALUES in tracebacks (e.g. a bare headers dict, or the Bearer token
   returned by opa_client._get_auth_headers), bypassing the model-level repr
   redaction entirely. Default flipped to False (matches loguru's own
   production guidance); operators can still opt in for local debugging.

2. Git subsystem leaked credentials independently of the data models:
   - repo_cloner.py logged the raw clone URL (https://token@github.com/...) and
     the GitCommandError text (which embeds the failing command incl. the URL).
   - git_fetcher.py (scopes) logged source/remote URLs in many places.
   - git_utils/env.py forced GIT_TRACE=1 + GIT_CURL_VERBOSE=1, making git dump
     HTTP Authorization headers to captured stderr.
   Fix: add http_utils.redact_url_in_text(); redact every git URL log and the
   git error text; gate GIT_TRACE/GIT_CURL_VERBOSE behind LOG_DIAGNOSE. The
   functional clone_repository() call still receives the real URL.

Adds redact_url_in_text tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Second adversarial review pass found residual leaks on the ACTUAL fetch path
that the first sweep missed, plus correctness gaps in redact_url:

- Fetch-path URL logs now redacted: http_fetch_provider._fetch_ (logged
  self._url raw), opal_client/data/fetcher.handle_url (3 sites), and
  sources/git_policy_source remote_urls.
- redact_url hardened:
  * fix IPv6 host (brackets were dropped, producing an invalid URL)
  * also mask sensitive query-string params (?token=, ?access_token=, ...),
    not just user:password@ userinfo
  * username-only userinfo is stripped; non-sensitive URLs returned unchanged
- redact_url_in_text is now regex-based (scrubs any scheme://userinfo@ in the
  text), so it catches credentials even when git normalizes/encodes the URL
  differently from the value we pass; still also replaces the known URL.
- LOG_DIAGNOSE description now notes it also gates git protocol tracing.
- Tests: IPv6, query-param, username-only, byte-identity, regex-scrub, and
  env-gating (GIT_TRACE/GIT_CURL_VERBOSE) cases.
- Nits: drop unused FetcherConfig import; List[str] over list[str]; black/isort.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The docformatter pre-commit hook (v1.7.5) reformats docstrings: capitalizes
the summary line and splits summary from body. Pure formatting, no logic change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens credential redaction across OPAL by centralizing model-layer repr()/str() redaction for secret-bearing pydantic models, and by systematically redacting credentialed URLs in log messages/exceptions (plus reducing loguru/git diagnostic leakage vectors).

Changes:

  • Introduces RedactedReprMixin to mask sensitive model fields in repr()/str() across the MRO (protecting log interpolation + serialize=True sinks).
  • Adds URL redaction utilities (redact_url, redact_url_in_text) and applies them across git/data fetcher logging paths.
  • Flips LOG_DIAGNOSE default to False and gates verbose git tracing env vars behind diagnosis mode; adds regression tests for the above.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
packages/opal-server/opal_server/git_fetcher.py Redacts repo URLs in debug/info/error logs to prevent credentialed URL leakage.
packages/opal-server/opal_server/data/data_update_publisher.py Avoids logging full entries; logs redacted URL instead.
packages/opal-server/opal_server/data/api.py Stops logging full configuration model on validation failure; logs safer message.
packages/opal-common/opal_common/tests/redaction_test.py Adds regression tests for repr/str redaction, URL redaction, loguru serialize/diagnose, and git env gating.
packages/opal-common/opal_common/sources/git_policy_source.py Redacts remote URLs when reporting repo mismatch / existing remotes.
packages/opal-common/opal_common/schemas/data.py Applies RedactedReprMixin to DataSourceEntry and redacts config/data in repr/str.
packages/opal-common/opal_common/logging_utils/redaction.py Adds the core RedactedReprMixin implementation.
packages/opal-common/opal_common/http_utils.py Adds redact_url() and redact_url_in_text() utilities + sensitive query param handling.
packages/opal-common/opal_common/git_utils/repo_cloner.py Redacts URLs and scrubs credentialed URLs from git error text before logging.
packages/opal-common/opal_common/git_utils/env.py Gates GIT_TRACE/GIT_CURL_VERBOSE behind LOG_DIAGNOSE to avoid auth header leakage.
packages/opal-common/opal_common/fetcher/providers/http_fetch_provider.py Redacts fetch URL in debug logs; marks headers/data as redacted fields on config model.
packages/opal-common/opal_common/fetcher/providers/fastapi_rpc_fetch_provider.py Redacts RPC arguments in repr/str and stops logging them directly.
packages/opal-common/opal_common/fetcher/events.py Applies RedactedReprMixin to FetcherConfig/FetchEvent; redacts config field in event repr/str.
packages/opal-common/opal_common/config.py Changes default LOG_DIAGNOSE to False and clarifies credential-leak rationale.
packages/opal-client/opal_client/data/updater.py Redacts URLs in logs/errors and avoids logging full DataSourceEntry objects.
packages/opal-client/opal_client/data/fetcher.py Redacts URLs in fetcher logs/exceptions.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread packages/opal-common/opal_common/sources/git_policy_source.py
Comment thread packages/opal-client/opal_client/data/updater.py
@zeevmoney

Copy link
Copy Markdown
Contributor

Review: credential-leak sweep is incomplete

The model-layer redaction (RedactedReprMixin) and redact_url helpers are sound and verified working (pydantic 1.10.26 + loguru, serialize/diagnose). Inline comments cover issues on changed lines. This comment lists credential-leak sites that remain outside this PR's diff — relevant because the PR's stated goal is to sweep every related vector. None of these can be posted as inline comments since they sit on unchanged lines.

HIGH — still leak credentials:

  • opal_common/fetcher/engine/fetch_worker.py:42 and fetcher/fetch_provider.py:59logger.exception(...) runs while an aiohttp ClientResponseError is active. Its str() includes request_info.real_url, a yarl.URL that renders https://user:password@host/path?token=... in full. This fires under default config (backtrace=True, even with diagnose=False) and is the most direct leak for any failed authenticated data fetch. The model-layer strategy cannot catch it — the leaking object is a third-party exception, not an OPAL model. The robust fix is a loguru sink-level patcher that runs redact_url_in_text over the formatted record (including record["exception"]); that is the only thing that makes the "every vector" claim true for 3rd-party exceptions.

  • opal_common/sources/api_policy_source.py:190-196 (and :89, :124) — the HTTP-bundle analogue of git_policy_source.py (which this PR did redact). Logs/raises the full bundle URL raw (full_url = f"{url}/{path}", url = self.remote_source_url) and renders the active aiohttp exception. Symmetric site, missed entirely. Wrap with redact_url(...).

  • opal_server/scopes/service.py:132-134logger.debug(f"Sync scope: ... (remote: {source.url}, ...)") logs source.url raw on the scope-sync hot path (same GitPolicyScopeSource.url that git_fetcher.py redacts). Use redact_url(source.url).

  • opal_server/scopes/loader.py:22-25logger.info("Adding default scope from env: {url}", url=opal_server_config.POLICY_REPO_URL) logs the policy repo URL raw at startup; POLICY_REPO_URL commonly embeds a token for HTTPS auth. Use redact_url(...).

  • opal_client/data/updater.py:552, 561-564, 570-571 (unchanged lines in a file this PR edits) — store_transaction._update_remote_status(url=entry.url, ...) passes entry.url raw three times. That URL flows into the transaction-log JSON dump (opa_client.py, json.dumps(transaction, default=str)) and the OPA-transaction-failed error log. The PR redacted the adjacent log lines in this same function but missed these. Pass url=redact_url(entry.url) (changes only the logged/persisted status URL, not any functional fetch).

MEDIUM:

  • opal_client/callbacks/reporter.py:65 (and :70, :84) — callback urls/url logged raw. Callback URLs are user-supplied and can embed creds; fix: stop logging FetcherConfig in callback reporter #902 hardened this file but left the URLs themselves unredacted. Use redact_url(...).
  • opal_client/data/updater.py:450logger.exception(f"Failed to calculate hash for data {data}: {e}") logs the full inline data payload (a credential vector per this PR's own threat model; here it is the raw dict, so the mixin does not protect it).
  • opal_client/data/updater.py (~line 230)logger.info("Getting data-sources configuration from '{source}'", source=url) logs the data-sources config URL raw.

Operational note: flipping LOG_DIAGNOSE default to False means operators upgrading silently lose diagnose tracebacks and verbose git tracing — worth a release-note line (opt back in via OPAL_LOG_DIAGNOSE=true).

@zeevmoney zeevmoney left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments.

Comment thread packages/opal-common/opal_common/http_utils.py Outdated
Comment thread packages/opal-common/opal_common/http_utils.py
Comment thread packages/opal-common/opal_common/schemas/data.py
Comment thread packages/opal-common/opal_common/fetcher/events.py
Comment thread packages/opal-common/opal_common/tests/redaction_test.py Outdated
Comment thread packages/opal-common/opal_common/git_utils/env.py Outdated
Comment thread packages/opal-common/opal_common/http_utils.py Outdated
Zivxx and others added 6 commits June 23, 2026 13:22
…ordering, and fragment leaks

- redact_url no longer raises from a log/except path: accessing
  parts.username/.password/.port is what actually validates a URL (urlsplit
  is lazy), so an out-of-range port raised ValueError straight out of an
  except block. Guard the credential-detection block.
- redact_url_in_text now does the known-URL replacement *before* the userinfo
  regex; previously the regex rewrote user:pw@ -> ***@ first, destroying the
  verbatim URL so its query token was never masked.
- redact_url now masks sensitive params in the URL *fragment* too, and masks
  query/fragment values in-place so untouched params keep their exact original
  encoding (no urlencode %20 -> + normalization).

Addresses review comments:
- #921 (comment) (@zeevmoney)
- #921 (comment) (@zeevmoney)
- #921 (comment) (@zeevmoney)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
DataSourceEntry.url and FetchEvent.url can embed credentials
(user:token@host or ?token=...), but RedactedReprMixin only masked
config/data - so whole-model logging (repr/str, loguru serialize) still
leaked the URL. Add a generic _redacted_url_fields set to the mixin that runs
the value through redact_url (keeps host/path visible, unlike blanket
masking), unioned across the MRO like the existing set, and declare
{"url"} on both models. Wire serialization (.dict()/.json()) is untouched.

Addresses review comments:
- #921 (comment) (@zeevmoney)
- #921 (comment) (@zeevmoney)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ond)

Wrap raw URL log/exception sites with redact_url / redact_url_in_text so a
credentialed data-source, policy-repo or callback URL (user:token@host or
?token=...) never reaches the logs:

In-diff inline review:
- fastapi_rpc_fetch_provider: redact self._url in the fetch log line.
- updater._fetch_data: drop the raw {exc} binding (logger.exception already
  records the traceback; str(e) on a 3rd-party error can carry a credentialed
  URL and would serialize verbatim under LOG_SERIALIZE) and redact the
  re-raised message.
- git_policy_source: drop a stray ")" in the "found existing repo" log format.

Out-of-diff sites (PR-level review):
- api_policy_source: redact the 404 bundle URL (log + HTTPException detail)
  and the "Fetching changes from remote" log.
- scopes/service + scopes/loader: redact source.url / POLICY_REPO_URL.
- updater: redact entry.url passed to _update_remote_status (it is persisted
  into the transaction log) and the error strings; stop logging the raw inline
  data payload in calc_hash; redact the data-sources config URL.
- callbacks/reporter: redact the user-supplied callback URLs.

Addresses review comments:
- #921 (comment) (@copilot-pull-request-reviewer)
- #921 (comment) (@copilot-pull-request-reviewer)
- #921 (comment) (@copilot-pull-request-reviewer)
- #921 (comment) (@zeevmoney)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The diagnose test raised a bare ValueError("boom") that never referenced the
model, so loguru's diagnose never rendered it - the assertion held even for an
unprotected model (a tautology). Raise ValueError(entry) so the whole model is
rendered in the traceback. Deliberately do NOT reach into entry.config: loguru
diagnose evaluates that sub-expression and renders the raw dict, the one vector
the mixin documents it cannot cover (verified - the reviewer's suggested
entry.config[...] variant leaks).

Addresses review comments:
- #921 (comment) (@zeevmoney)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
provide_git_ssh_environment early-returns for non-SSH URLs, so it only ever
runs for SSH clones - which use key auth, not HTTP Authorization headers. The
comment (and the LOG_DIAGNOSE docstring) claimed gating GIT_TRACE/
GIT_CURL_VERBOSE here closes an HTTP Authorization-header leak; it does not
(that path is HTTPS-only and is not handled here). Correct the comment, the
config docstring and the test docstring to state the real value: reducing
verbose SSH protocol/host disclosure in logs.

Addresses review comments:
- #921 (comment) (@zeevmoney)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Third-party exceptions (most notably aiohttp ClientResponseError) render the
full request URL - https://user:pw@host/path?token=... - inside the traceback
loguru formats. The model-layer redaction can't reach those (the leaking
object isn't an OPAL model), and this fires under default config
(backtrace=True). Close the vector at the sink layer:

- CredentialScrubbingStream wraps the stdout/stderr sink so the final
  formatted record (message + exception traceback, and the JSON blob under
  serialize=True) is run through redact_url_in_text before write.
- A loguru patcher additionally scrubs record["message"] for every record, so
  the optional rotating file sink (which is path-based and can't be wrapped)
  still gets message-level scrubbing without losing rotation/retention.
- Document that OPAL_LOG_DIAGNOSE now defaults to False (operators upgrading
  opt back in via OPAL_LOG_DIAGNOSE=true).

Addresses review comments:
- #921 (comment) (@zeevmoney)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Zivxx

Zivxx commented Jun 23, 2026

Copy link
Copy Markdown
Contributor Author

@zeevmoney thanks for the out-of-diff sweep — all of it is now addressed.

Redacted at the missed call sites (7cedb07):

  • api_policy_source.pyfull_url in the 404 log + HTTPException detail, and the "Fetching changes from remote" log.
  • opal_server/scopes/service.py and scopes/loader.pysource.url / POLICY_REPO_URL.
  • opal_client/data/updater.pyentry.url in all three _update_remote_status calls (it is persisted into the transaction log) plus the error=str(e) strings; calc_hash no longer logs the raw inline data payload; the data-sources config URL is redacted.
  • opal_client/callbacks/reporter.py — the user-supplied callback URLs (:65/:70/:84).

Third-party exception leak (9c73269): rather than redacting each logger.exception site individually, I closed the vector at the sink layer, as you suggested. CredentialScrubbingStream wraps the stdout/stderr sink and runs redact_url_in_text over the fully-formatted record (message and exception traceback, including the JSON blob under serialize=True), so an aiohttp ClientResponseError rendering https://user:pw@host/... in its traceback is scrubbed regardless of which logger.exception("...") raised it. A loguru patcher additionally scrubs record["message"] for every record, so the optional rotating file sink (path-based, can't be stream-wrapped) still gets message-level scrubbing without losing rotation/retention. Covered by test_scrubbing_stream_redacts_thirdparty_exception_traceback.

LOG_DIAGNOSE default flip: documented in configuration.mdx (now Default: False, with a "default changed" note that operators opt back in via OPAL_LOG_DIAGNOSE=true).

One caveat worth flagging: the sink wrapper scrubs scheme://user:pw@ userinfo anywhere in a traceback, but a bare ?token=... query param that appears only inside a third-party exception string (with no known URL to match against) is not generically detectable — those remain best-effort via the per-call-site redact_url paths.

CI fixes:
- pre-commit / docformatter: reflow the CredentialScrubbingStream summary
  line so it conforms to docformatter (new file was missed locally because
  it was untracked when the formatters ran).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Zivxx Zivxx requested a review from zeevmoney June 28, 2026 14:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants