Add SetDefaultHostMetadataResolverFactory#1636
Merged
simonfaltum merged 5 commits intomainfrom Apr 20, 2026
Merged
Conversation
## Changes PR #1572 added `Config.HostMetadataResolver` so callers could override the SDK's `/.well-known/databricks-config` fetch on a per-Config basis. That covers the "I have one Config and I want to wrap it" case, but a program that constructs many Configs (e.g. the Databricks CLI across its command surface) ends up calling the same assignment at every construction site and needs a lint-style guardrail to keep it from drifting. This adds a package-level default consulted when a Config has no explicit resolver set. Callers assign it once during startup (typically from an init() block in a package that the main binary blank-imports); every subsequent Config picks up the same resolver without per-site wiring. Config-level resolvers still take precedence, so PR #1572's contract is unchanged. Exposed as a plain variable (not a setter function), matching the existing `Config.HostMetadataResolver` field and stdlib precedents like `http.DefaultClient` and `http.DefaultTransport`. Not safe for concurrent modification after initialization — documented as such. **What's new:** - `config.DefaultHostMetadataResolverFactory` — experimental; set to nil to clear. **What stays the same:** - Default behavior (no factory set) is unchanged: the SDK performs its own HTTP fetch. - `Config.HostMetadataResolver` takes precedence over the factory. - A factory returning nil falls through to the default HTTP fetch. ## Tests Three new tests in `config/config_test.go`: - Factory is invoked when Config has no resolver; backfill works. - Config-level resolver wins when both are set (factory not called). - Factory returning nil falls through to the SDK's HTTP fetch. Each test clears the factory via `t.Cleanup(func() { DefaultHostMetadataResolverFactory = nil })` so package-level state does not leak between tests. `make fmt test lint` clean, `-race` clean. Signed-off-by: simon <simon.faltum@databricks.com>
ac18d2c to
9f7e4b4
Compare
Codex review flagged the exposed-variable approach: a plain global function value is a Go data race under concurrent read/write, even if documented as init-only. Wraps the storage in an unexported var guarded by a sync.RWMutex, exposing SetDefaultHostMetadataResolverFactory. Also keeps the representation private so future iteration (atomic.Pointer, validation, observer hooks) stays non-breaking while the API is marked experimental. Tests use a withDefaultHostMetadataResolverFactory helper that captures and restores the prior registration instead of resetting to nil, so tests never clobber each other via the package-level default. Signed-off-by: simon <simon.faltum@databricks.com>
Signed-off-by: simon <simon.faltum@databricks.com>
Comment on lines
+42
to
+45
| var ( | ||
| defaultHostMetadataResolverFactoryMu sync.RWMutex | ||
| defaultHostMetadataResolverFactory func(*Config) HostMetadataResolver | ||
| ) |
Contributor
There was a problem hiding this comment.
[optional] If users are expected to set this at init time, then you don't need the mutex. See http and log as references from the standard library.
var DefaultHostMetadataResolverFactory func(*Config) HostMetadataResolver = nil
Users who need to use different resolvers within their code should use Config.HostMetadataResolver instead.
Member
Author
There was a problem hiding this comment.
Switched to a plain var DefaultHostMetadataResolverFactory, dropped the mutex and the sync import. Matches the stdlib precedent you called out (http.DefaultClient, http.DefaultTransport, log.Default).
renaudhartert-db
approved these changes
Apr 20, 2026
Co-authored-by: Renaud Hartert <renaud.hartert@databricks.com>
Renaud noted that for an init-time default the stdlib pattern is a plain public variable, not a setter/getter pair guarded by a mutex (see http.DefaultClient, http.DefaultTransport, log.Default). Callers who need dynamic or per-Config resolvers use Config.HostMetadataResolver instead, so there's no real concurrent-write scenario to guard against. Simplifies the API surface and drops the sync import. Signed-off-by: simon <simon.faltum@databricks.com>
|
If integration tests don't run automatically, an authorized user can run them manually by following the instructions below: Trigger: Inputs:
Checks will be approved automatically on success. |
simonfaltum
added a commit
to databricks/cli
that referenced
this pull request
Apr 21, 2026
SDK v0.128.0 (databricks/databricks-sdk-go#1636) adds config.DefaultHostMetadataResolverFactory so a package can install a single hook that every Config picks up on EnsureResolved, without per-site wiring. Replaces ten hostmetadata.Attach(cfg) call sites across seven files and the injection guardrail test with two pieces: - libs/hostmetadata/resolver.go: init() sets config.DefaultHostMetadataResolverFactory to wrap cfg.DefaultHostMetadataResolver() in the caching resolver. - main.go: blank import of libs/hostmetadata triggers that init() at startup so every *config.Config the CLI constructs picks up the cached lookup automatically. Co-authored-by: Isaac
6 tasks
andrewnester
pushed a commit
to bernardo-rodriguez/b-cli
that referenced
this pull request
Apr 22, 2026
) ## Why Every CLI command (`databricks auth profiles`, `bundle validate`, every workspace or account call) goes through `Config.EnsureResolved`, which triggers an unauthenticated GET to `{host}/.well-known/databricks-config` to populate host metadata. That round trip is ~700ms against production and gets paid on every invocation, doubling the latency of otherwise single-request commands. ## Changes **Before:** every CLI invocation hits the well-known endpoint once (or more when multiple configs get constructed). **Now:** the first invocation populates a local disk cache under `~/.cache/databricks/<version>/host-metadata/`; subsequent invocations read from it. Failures are negatively cached for 60s (except for `context.Canceled` / `context.DeadlineExceeded`, which are transient and never cached). The integration hooks into SDK `v0.128.0`'s `config.DefaultHostMetadataResolverFactory` (added in databricks/databricks-sdk-go#1636) via two pieces: - `libs/hostmetadata/resolver.go`: `init()` registers a factory that wraps `cfg.DefaultHostMetadataResolver()` in the caching resolver. `NewResolver(fetch)` remains the unit-testable primary API. - `main.go`: a blank import of `libs/hostmetadata` triggers that `init()` at startup, so every `*config.Config` the CLI constructs now and in the future picks up the cached lookup automatically. No per-site wiring, no guardrail test. Positive cache wraps the miss path, so the hit path is a single disk read; negative cache is only consulted when positive misses. `internal/testutil/env.go` pins `DATABRICKS_CACHE_DIR` to a temp dir in test cleanup so tests don't leak cache files into `HOME`. ### Collateral cleanups - `libs/cache/file_cache.go`: drop the `failed to stat cache file` debug log when the file is simply missing (`fs.ErrNotExist`). It was pure noise (the next line, `cache miss, computing`, conveys the same info) and its OS-specific error text diverged between Unix (`no such file or directory`) and Windows (`The system cannot find the file specified.`), breaking cross-platform acceptance goldens. Genuine stat failures (permission, corruption) still log. - `libs/testdiff/replacement.go`: `devVersionRegex` now accepts either `+SHA` or `-SHA` after `0.0.0-dev`. `build.GetSanitizedVersion()` swaps `+` to `-` for filesystem safety when the version is used in cache paths, and the old regex only covered the `+` form. ## Test plan - [x] `make checks` clean - [x] `make lint` clean (0 issues) - [x] `go test ./libs/hostmetadata/... -race` clean (factory-installed assertion + cache hit + fetch error + cancellation-not-cached + host isolation + end-to-end integration) - [x] `go test ./cmd/root/... ./bundle/config/... ./cmd/auth/... ./libs/auth/... -race` clean - [x] End-to-end acceptance test `acceptance/auth/host-metadata-cache/` asserts exactly ONE `/.well-known/databricks-config` GET across two `auth profiles` invocations sharing a `DATABRICKS_CACHE_DIR` - [x] Existing acceptance tests regenerated: fewer well-known GETs in `out.requests.txt` (caching works), new `[Local Cache]` debug lines in cache/telemetry tests, two `Warn: Failed to resolve host metadata` lines removed (intentional: the resolver returns `(nil, nil)` on fetch errors, which is how the SDK interprets "no metadata available"), stat-not-found lines removed (see Collateral cleanups) ### Live validation against dogfood (from previous push) Built locally and ran `databricks -p e2-dogfood current-user me` with and without a warm cache: | Scenario | Elapsed well-known time | Cache log output | |---|---|---| | Cold cache (fresh `DATABRICKS_CACHE_DIR`) | ~713ms fetch | `cache miss, computing` -> `GET /.well-known/databricks-config` -> `computed and stored result` | | Warm cache (second invocation) | ~1ms | single `[Local Cache] cache hit` line | Net per-command savings: ~700ms, matching the Why.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Changes
PR #1572 added
Config.HostMetadataResolverso callers could override the SDK's/.well-known/databricks-configfetch on a per-Config basis. That covers "I have one Config and I want to wrap it."The gap: programs that construct many Configs across their command surface (e.g. the Databricks CLI) end up copying the same
cfg.HostMetadataResolver = ...assignment at every construction site, in the CLI roughly 10 sites across 7 files plus a guardrail test to catch drift.This PR adds a package-level default consulted when a Config has no explicit resolver set. Callers set a factory once during startup; every subsequent Config gets the same resolver without per-site wiring. The Config-level field still takes precedence, so PR #1572's contract is unchanged.
API
Plain public variable, set once at init. Matches the stdlib pattern for single-default hooks:
http.DefaultClient,http.DefaultTransport,log.Default. Callers needing per-Config or dynamic behaviour should useConfig.HostMetadataResolverinstead.Resolution order inside
Config.EnsureResolvedConfig.HostMetadataResolveris set, use it.DefaultHostMetadataResolverFactoryis non-nil, invoke it with the resolving Config and use its return value. If it returns nil, fall through.How the Databricks CLI will use this
The canonical Go idiom for "library A registers itself with library B" is a blank import that triggers an
init()in A. This is howdatabase/sqldrivers (_ "github.com/lib/pq"), image codecs (_ "image/png"), and encoding formats register themselves.After this PR lands and is bumped into the CLI, CLI PR #5011 will collapse from ~10 wired-in
hostmetadata.Attach(cfg)calls + a guardrail test down to two small pieces:repos/cli/libs/hostmetadata/resolver.go— set the caching factory at package init:repos/cli/cmd/databricks/main.go— one blank import to pull the package in at startup:That's the full integration. Every Config the CLI creates, now and in the future from any new command a developer adds, automatically gets caching. No per-site
Attachcall to remember, no guardrail test to maintain, no new developer ever has to learn this mechanism exists to benefit from it.Experimental
Marked experimental to match the existing
HostMetadataResolverfield. No default behavior change for callers that never setDefaultHostMetadataResolverFactory.Tests
Three new tests in
config/config_test.go, each using a smallwithDefaultHostMetadataResolverFactory(t, factory)helper that captures and restores the prior value, so tests never clobber each other via the package-level default:Factory is invoked when Config has no resolver; back-fill works end-to-end.
Config-level resolver takes precedence (factory not consulted).
Factory returning nil falls through to the SDK's HTTP fetch.
make fmt test lintcleango test ./config/... -count=1 -racecleanSigned-off-by: simon simon.faltum@databricks.com