Feature/957 more auth mechanisms #1453

Open: ghjklw wants to merge 7 commits into databricks:main from ghjklw:feature/957-more-auth-mechanisms

ghjklw commented May 12, 2026

What this PR does

Closes databricks/dbt-databricks#957.

Problem: The adapter's authentication layer had two classes of issues:

  1. Many valid Databricks SDK authentication methods (azure-cli, azure-msi, databricks-cli, metadata-service, etc.) were silently blocked — validate_creds() required either a token or the legacy auth_type: oauth alias, and any other auth_type value was ignored rather than forwarded to the SDK.
  2. Two profile fields — oauth_scopes and oauth_redirect_url — behaved incorrectly (see OAuth bugs below).

Solution: Auth dispatch is simplified to two branches: the existing heuristic (preserved for backward compatibility but now deprecated with a runtime warning) and a pass-through to the Databricks SDK Config for everything else. Users still on the heuristic path will see a deprecation warning directing them to set auth_type explicitly. This makes every auth method the SDK supports available to users without requiring changes to the adapter for each new method.
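The two-branch decision can be sketched as follows. This is a minimal illustration, not the adapter's code: the function names are hypothetical, and the condition is taken from the commit message for the refactor (client_secret present, no auth_type, no azure_client_secret).

```python
from typing import Optional


def uses_legacy_heuristic(
    auth_type: Optional[str],
    client_secret: Optional[str],
    azure_client_secret: Optional[str],
) -> bool:
    """Return True when the deprecated heuristic branch would apply.

    Hypothetical helper: the real adapter may structure this check differently.
    """
    return bool(client_secret) and not auth_type and not azure_client_secret


def choose_branch(profile: dict) -> str:
    """Name the dispatch branch a profile dict would take (illustrative only)."""
    if uses_legacy_heuristic(
        profile.get("auth_type"),
        profile.get("client_secret"),
        profile.get("azure_client_secret"),
    ):
        return "legacy-heuristic"
    # Everything else is handed to the SDK Config as keyword arguments.
    return "sdk-passthrough"
```

Under these assumptions, a profile with only client_id and client_secret takes the heuristic branch, while adding auth_type: oauth-m2m to the same profile moves it to the pass-through branch.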


Authentication type reference

The table below covers every profile configuration. Rows marked unchanged behave identically to the previous release; rows marked fixed or new indicate a behavioral change.

| Auth scenario | Profile fields | auth_type | Status |
| --- | --- | --- | --- |
| Personal Access Token (PAT) | token | (not required) | Unchanged |
| Azure Service Principal (dedicated fields) | azure_client_id + azure_client_secret | azure-client-secret (inferred) | Fixed: azure_tenant_id now forwarded (was silently dropped) |
| OAuth M2M (Databricks) | client_id + client_secret starting with dose | (not required) | Unchanged; heuristic tries oauth-m2m first. Deprecated: set auth_type: oauth-m2m explicitly |
| Azure Service Principal (legacy fields) | client_id + client_secret (non-dose) | (not required) | Unchanged; heuristic tries azure-client-secret first. Deprecated: set auth_type: azure-client-secret explicitly |
| OAuth U2M (browser) | (none) | oauth or external-browser | Functionally unchanged; empty client_secret="" no longer leaked into SDK Config |
| OAuth U2M with custom app | client_id | oauth / external-browser | Functionally unchanged; empty client_secret="" no longer leaked |
| OAuth U2M inferred from client_id alone | client_id (no client_secret) | external-browser (inferred) | New: client_id without client_secret now selects browser-based OAuth |
| Azure CLI | (none) | azure-cli | New; previously blocked by validate_creds() and silently ignored |
| Azure Managed Identity (system-assigned) | (none) | azure-msi | New |
| Azure Managed Identity (user-assigned) | azure_client_id | azure-msi | New |
| Databricks CLI / config file | databricks_cli_profile, config_file (optional) | databricks-cli | New |
| Google credentials | google_credentials or google_service_account | google-credentials | New |
| Instance metadata service | metadata_service_url (optional) | metadata-service | New |
| OIDC token | oidc_token_env or oidc_token_filepath | (SDK-inferred) | New |
| Basic auth | username + password | (SDK-inferred) | New |
| Any future SDK auth type | (varies) | (any SDK value) | New; forwarded automatically, no adapter changes needed |
| SDK escape hatch | databricks_sdk_parameters: {key: value} | (any) | New; arbitrary SDK Config kwargs, merged last, override everything above |
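The inference rules in the auth_type column can be summarised in a short sketch. The function name is hypothetical and only the rows with an explicit or inferred auth_type are modelled; PAT, OIDC, basic auth, and the no-credential fallback are deliberately left out.

```python
from typing import Optional


def infer_auth_type(profile: dict) -> Optional[str]:
    """Return the auth_type a profile would resolve to, or None if not modelled.

    Illustrative only: mirrors the table rows, not the adapter's actual code.
    """
    explicit = profile.get("auth_type")
    if explicit:
        # "oauth" is the legacy alias for browser-based U2M OAuth;
        # any other explicit value is passed through unchanged.
        return "external-browser" if explicit == "oauth" else explicit
    if profile.get("azure_client_id") and profile.get("azure_client_secret"):
        return "azure-client-secret"
    if profile.get("client_id") and not profile.get("client_secret"):
        return "external-browser"
    return None  # left to other rules or to the SDK's own inference
```

Checking the explicit value first is what lets an explicit auth_type win over inference from the Azure fields.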

Additional field-forwarding fixes (previously silently dropped)

| Profile field | Previous behaviour | Now |
| --- | --- | --- |
| auth_type when token is also set | auth_type ignored; only PAT used | Both forwarded; SDK decides precedence |
| azure_tenant_id with azure_client_id/azure_client_secret | Not forwarded to Config | Forwarded |
| databricks_sdk_parameters | Accepted in profile but discarded | Merged into Config kwargs for all auth paths |
| Explicit auth_type with azure_client_id/azure_client_secret | auth_type ignored; forced to azure-client-secret | Explicit auth_type wins |
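A minimal sketch of the forwarding and merge order described in these rows, assuming a plain dict profile. The function name and the forwarded field list are hypothetical, and http_timeout_seconds appears in the usage below only as an illustrative extra key.

```python
FORWARDED_FIELDS = (
    "host",
    "token",
    "auth_type",
    "azure_client_id",
    "azure_client_secret",
    "azure_tenant_id",
)


def build_config_kwargs(profile: dict) -> dict:
    """Build keyword arguments for the SDK Config from a profile (sketch)."""
    kwargs = {}
    for field in FORWARDED_FIELDS:
        value = profile.get(field)
        if value:  # unset and empty values are skipped, never forwarded
            kwargs[field] = value
    # databricks_sdk_parameters is merged last, so it overrides the above.
    kwargs.update(profile.get("databricks_sdk_parameters") or {})
    return kwargs
```

For example, a profile with auth_type: azure-cli and databricks_sdk_parameters: {http_timeout_seconds: 60} would yield both keys, and a conflicting auth_type inside databricks_sdk_parameters would replace the profile-level one.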

Removed field

| Field | Previous behaviour | Now |
| --- | --- | --- |
| oauth_redirect_url | Accepted in profiles but had no effect | Removed: the Databricks SDK hardcodes http://localhost:8020 internally and never reads this value from Config; accepting it created a false impression of control |

OAuth scopes/redirect_url bugs

Two fields were silently broken in the previous implementation:

oauth_scopes was never forwarded to the SDK

DatabricksCredentialManager was initialised with the oauth_scopes value, falling back to a default of ["all-apis", "offline_access"]. However, none of the Config(...) calls (authenticate_with_external_browser(), authenticate_with_oauth_m2m(), or any other path) passed scopes=.... The field was accepted, stored, and then discarded, so a profile that set oauth_scopes to restrict or extend token scopes had no effect.

Fix: oauth_scopes is now passed as scopes to Config for all SDK-delegated auth paths.
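The fix amounts to copying the profile value into the Config kwargs under the key scopes. A minimal sketch, with a hypothetical helper name and no default applied when the field is unset (whether the adapter applies a default is not stated here):

```python
from typing import List, Optional


def with_scopes(kwargs: dict, oauth_scopes: Optional[List[str]]) -> dict:
    """Return a copy of kwargs with oauth_scopes forwarded as 'scopes' (sketch)."""
    result = dict(kwargs)  # copy so the caller's dict is left untouched
    if oauth_scopes:
        result["scopes"] = list(oauth_scopes)
    return result
```

With this in place, with_scopes({"auth_type": "external-browser"}, ["sql"]) carries the requested scope through instead of dropping it.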

oauth_redirect_url had no effect

REDIRECT_URL = "http://localhost:8020" was defined as a module constant and used as the default for DatabricksCredentialManager.oauth_redirect_url. The Databricks SDK's Config class does not accept a redirect URL parameter; it hardcodes http://localhost:8020 internally. A profile that set oauth_redirect_url to a different port or host was silently ignored, and the browser OAuth flow continued to use localhost:8020 regardless.

Fix: The field is removed entirely to avoid the false impression that it is configurable.


Test strategy

This PR was developed test-first to make the behavioral delta between old and new explicit and reviewable:

Step 1 — Baseline documentation (a31eec8): Before any code change, a parametrized TestAuthDispatch.test_config_kwargs suite was written to document the exact Config(...) kwargs produced by every legacy dispatch path (7 cases: PAT, Azure SP with dedicated fields, dose-prefix heuristic, non-dose heuristic, oauth alias, oauth with custom client_id, and no credentials). A TestValidateCreds suite (6 cases) pinned validation rules. All 13 cases passed against origin/main, confirming the baseline was an accurate snapshot of existing behavior.
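The baseline technique, pinning the exact kwargs passed to Config for each dispatch path, can be sketched with a table of cases and a mocked Config class. Everything here is a stand-in: authenticate() is a hypothetical, simplified dispatch, and the real suite uses pytest parametrization against the adapter's own code.

```python
from unittest import mock


def authenticate(profile: dict, config_cls):
    """Hypothetical stand-in for the adapter's dispatch; forwards set fields."""
    kwargs = {
        key: profile[key]
        for key in ("host", "token", "client_id", "client_secret")
        if profile.get(key)
    }
    return config_cls(**kwargs)


# (case id, profile, exact kwargs expected to reach Config)
CASES = [
    ("pat", {"host": "h", "token": "t"}, {"host": "h", "token": "t"}),
    (
        "client_credentials",
        {"host": "h", "client_id": "c", "client_secret": "s"},
        {"host": "h", "client_id": "c", "client_secret": "s"},
    ),
]


def run_cases() -> list:
    """Run every case and return the ids whose Config kwargs did not match."""
    failures = []
    for case_id, profile, expected in CASES:
        config_cls = mock.MagicMock(name="Config")
        authenticate(profile, config_cls)
        try:
            config_cls.assert_called_once_with(**expected)
        except AssertionError:
            failures.append(case_id)
    return failures
```

Because each case asserts the full kwargs rather than a subset, any extra or missing argument introduced by a refactor shows up as a named failing case.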

Step 2 — Refactor applied (e57d441): The dispatch was simplified. Exactly 2 of the 13 baseline tests began failing, making the behavioral delta explicit: both failing cases were the external-browser paths where an empty client_secret="" was previously leaked into Config. This made the scope of the change fully visible before any test was updated.

Step 3 — Tests updated and new coverage added (daff8f9): The 2 failing cases were updated to reflect the corrected behavior (no empty client_secret). 15 new parametrized cases were added covering: all new auth_type passthroughs (azure-cli, azure-msi, databricks-cli, google-credentials, metadata-service), field-forwarding fixes (azure_tenant_id, databricks_sdk_parameters, token+auth_type combined, explicit auth_type overriding field inference), and new profile fields. 5 new TestValidateCreds cases confirm that previously-blocked SDK auth_type values now pass validation.

Step 4 — oauth_scopes coverage (eda9a78): 2 additional cases assert that oauth_scopes is forwarded as scopes for both external-browser (U2M) and oauth-m2m (M2M) flows.

Step 5 — Deprecation warning and README (b5cf6b6): The legacy heuristic now emits a logger.warning at runtime directing users to set auth_type explicitly. _resolve_client_secret_heuristic is documented as deprecated and notes that extra fields (e.g. azure_tenant_id, oauth_scopes) are intentionally not forwarded in the heuristic path to preserve identical legacy behaviour. README.md is updated: priority table order corrected (heuristic rows last, flagged as deprecated), client_id-only browser OAuth inference documented, and a migration callout block added.
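A rough sketch of the deprecated heuristic with its warning, under stated assumptions: the function name and message text are hypothetical, and the fallback to the second auth type is inferred from the phrase "tries ... first" rather than confirmed by the description.

```python
import logging
from typing import List

logger = logging.getLogger(__name__)


def heuristic_auth_order(client_secret: str) -> List[str]:
    """Return the auth types the legacy heuristic would try, in order (sketch)."""
    logger.warning(
        "client_secret was provided without auth_type; this inference is "
        "deprecated. Set auth_type to oauth-m2m or azure-client-secret."
    )
    if client_secret.startswith("dose"):
        # Databricks OAuth secrets carry the 'dose' prefix: try M2M first.
        return ["oauth-m2m", "azure-client-secret"]
    return ["azure-client-secret", "oauth-m2m"]
```

Setting auth_type explicitly in the profile avoids this path, and with it the warning.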

Final state: 24 parametrized dispatch cases + 11 validation cases, all passing. No functional tests required changes — the connector-level behavior (how connections are opened against Databricks) is unchanged; only the Config kwargs constructed from profile fields change.


Checklist

  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-databricks next" section.

ghjklw and others added 6 commits May 10, 2026 11:37
TestAuthDispatch.test_config_kwargs: 7 parametrized cases covering every
dispatch path in the legacy code, asserting the exact kwargs passed to
databricks.sdk.core.Config. The 3 external-browser cases (oauth alias,
oauth with custom client_id, no credentials) will fail after the refactor —
that failure is the signal showing the behavioral delta.

TestValidateCreds: 6 cases covering validation rules on main, all
unchanged by the refactor (host/http_path required, token and oauth
accepted, client_id required when client_secret present, azure fields
must be paired).

All 13 cases pass against origin/main.

https://claude.ai/code/session_0164JzVFw4g7yH2Z6UHC5i9p
Replace DatabricksCredentialManager's 4-branch dispatch (token → Azure SP
→ no-client-secret external-browser → heuristic) with a two-branch model:
- Legacy heuristic (client_secret present, no auth_type, no azure_client_secret)
  preserved unchanged for backward compatibility
- Everything else routed through to_sdk_config_kwargs(), which handles
  auth_type passthrough, field translation, and databricks_sdk_parameters

Fixes gaps in the original dispatch:
- Explicit auth_type values (azure-cli, azure-msi, databricks-cli, etc.) now
  forwarded to the SDK instead of silently falling back to external-browser
- auth_type="oauth" alias cleanly maps to "external-browser" without leaking
  an empty client_secret into Config
- client_id alone infers external-browser without requiring explicit auth_type
- No-credential path preserves legacy external-browser fallback (CLIENT_ID used)
- token + auth_type both forwarded (auth_type was previously dropped when token present)
- explicit auth_type wins over azure_client_id/secret field inference
- azure_tenant_id forwarded in the Azure SP path
- databricks_sdk_parameters merged for all auth paths
- validate_creds relaxed to accept any SDK auth_type, not just token or "oauth"

2 baseline tests now fail, marking exactly what changed in the external-browser path.

https://claude.ai/code/session_0164JzVFw4g7yH2Z6UHC5i9p
Fix the 2 baseline tests that exposed behavioral changes in the refactor:
- oauth_alias / oauth_alias_with_custom_client_id: remove client_secret=""
  that legacy leaked into Config for external-browser auth
- no_credentials: external-browser fallback preserved; update expected value
  to drop the empty client_secret="" that legacy passed unnecessarily

Add 15 new cases for behaviors enabled by the refactor:
- client_id alone now infers external-browser (no explicit auth_type needed)
- all SDK auth_type values forwarded: azure-cli, azure-msi (+ user-assigned
  identity and tenant variants), databricks-cli (+ profile), google-credentials,
  metadata-service
- explicit auth_type="oauth-m2m" bypasses the legacy heuristic
- token + auth_type both forwarded (auth_type was previously dropped)
- explicit auth_type wins over azure_client_id/secret field inference
- azure_tenant_id now forwarded in the Azure SP path
- databricks_sdk_parameters merged for all paths (azure-cli and PAT examples)

Add 5 TestValidateCreds cases confirming that SDK auth_types no longer
require a token or auth_type='oauth' to pass validation.

https://claude.ai/code/session_0164JzVFw4g7yH2Z6UHC5i9p
oauth_scopes is now passed as 'scopes' through to_sdk_config_kwargs(),
so it reaches the Databricks SDK Config for all auth paths that go
through that method (external-browser, explicit auth_type passthrough,
PAT, etc.).

oauth_redirect_url is removed: the Databricks SDK hardcodes the redirect
URL to http://localhost:8020 and never reads it from Config, so the field
had no effect. REDIRECT_URL and SCOPES module-level constants are also
removed as they are no longer referenced.

Test coverage: two new TestAuthDispatch cases assert that oauth_scopes
is forwarded as 'scopes' for external-browser (U2M) and oauth-m2m (M2M).

https://claude.ai/code/session_0164JzVFw4g7yH2Z6UHC5i9p
…ty table

- Emit a deprecation warning when client_secret is used without auth_type,
  directing users to set auth_type: oauth-m2m or azure-client-secret explicitly
- Docstring notes that extra fields (azure_tenant_id, oauth_scopes, etc.) are
  intentionally not forwarded in the heuristic path to preserve legacy behaviour
- README: fix priority table order (heuristic row last, correctly flagged as
  deprecated); document client_id-only → external-browser inference; add a
  deprecated callout block with migration examples

https://claude.ai/code/session_0164JzVFw4g7yH2Z6UHC5i9p

Development

Successfully merging this pull request may close these issues.

Support for more authentication mechanisms
