dbt-materialize: Support connection overrides#36127
dbt-materialize: Support connection overrides#36127SangJunBak merged 4 commits intoMaterializeInc:mainfrom
Conversation
7ea9bf0 to
88e356d
Compare
88e356d to
c2586ce
Compare
|
@bobbyiliev Unsure how we actually deploy an update to our dbt adapter! Also I decided to go with the approach of allowing an
|
c2586ce to
0514015
Compare
0514015 to
a5eaee4
Compare
@SangJunBak This gets auto released after bumping the version bump (version.py + setup.py). What we usually do is to split the version bump itself out into a follow-up release PR. We usually merge the feature + CHANGELOG entry first, then do a separate PR to cut the release to PyPI. Would you mind reverting the verions bump and adding a changelog entry here? Then we can have a follow-up PR for the release itself. That way if there are other pending changes they all get released together. |
Did you by any chance test this with a full dbt run / dbt build, not just dbt debug? The verification shows the options making it into the connection string, but I'd feel better knowing a real query round-trip works with an override that actually changes behavior (e.g. setting auto_route_catalog_queries: off). Happy to try this on my end if you hit any blockers setting this all up! |
| options: | ||
| hint: 'overrides the PostgreSQL `options` connection parameter' |
There was a problem hiding this comment.
I totally hear you that this is tricky to unit-test, but do you think we could add an integration test in tests/adapter/ e.g. set a non-default option like welcome_message: on via a dbt_profile_target override and assert the server honored it? The options_dict.update(credentials.options) merge is the whole point of the feature. Happy to help sketch something out if useful!
| @classmethod | ||
| def open(cls, connection): | ||
| connection = super().open(connection) | ||
| # Much of the `open` method setup is copied from the `PostgresConnectionManager.open` method | ||
| # https://github.com/dbt-labs/dbt-adapters/blob/v1.17.3/dbt-postgres/src/dbt/adapters/postgres/connections.py#L102, | ||
| # except we allow users to override options. |
There was a problem hiding this comment.
This new open method copies a chunk of PostgresConnectionManager.open from dbt-postgres. Did you consider overriding a smaller surface instead (e.g. just the options-string construction and then super().open())? My worry is that when dbt-postgres adds new kwargs or changes retry logic, we'll silently fall behind. Totally fine if you already looked at this and it wasn't workable, just curious.
There was a problem hiding this comment.
My worry is that when dbt-postgres adds new kwargs or changes retry logic, we'll silently fall behind
I think this is a valid concern :( I tried to but because kwargs is defined internally and there's no way to override options through the credentials object, we can't use super().open(). The best way I could think of is extending off the previous approach of monkey patching psycopg, then doing some weird python thread blocking such that the monkey patch can accept input from our adapter's .open, but that seems worse than copying their implementation. FWIW prior art to this is something like Clickhouse https://github.com/ClickHouse/dbt-clickhouse/blob/main/dbt/adapters/clickhouse/connections.py
There was a problem hiding this comment.
Maybe the correct implementation here however is inheriting off SQLConnectionManager instead of PostgresqlConnectionManager?
|
|
||
| # If you bump this version, bump it in setup.py too. | ||
| version = "1.9.7" | ||
| version = "1.9.8" |
There was a problem hiding this comment.
Mentioned this in the other comment but could we split the version bump (version.py + setup.py) out of this PR? Normally I prefer to merge the feature + document it under "Unreleased" in the CHANGELOG first, then do a separate release PR that bumps the version and publishes to PyPI. Keeps the feature history and release history cleanly separate.
I think the verification also tries to connect, so the way I actually tested it was toggling welcome_message to on and off and seeing the output! I can write a unit test for this however! |
|
Created an integration test in the latest commit. Can test via |
|
Followup PR to bump the versions #36213 |
|
@claude review |
|
Claude finished @bobbyiliev's task in 1m 32s —— View job Code Review
Overall this is a clean, well-scoped implementation. A few issues worth addressing before merge:
|
|
Regarding claude's output, so I did notice that spaces had no negative effect in the connection. I.e. setting the cluster to "weewoo quickstart" would still work, even with other arguments. This might be a postgres specific thing that Materialize already handles? |
| if search_path is not None and search_path != "": | ||
| options_parts.append(("search_path", search_path)) |
There was a problem hiding this comment.
Just doing a second read and I think I noticed something here, sorry I missed it the first time. The parent has search_path.replace(" ", "\\ ") and I had to look up why, looks like Postgres treats spaces in the options string as argument separators unless you backslash-escape them (docs). I think with the new join we'd lose that, so a search_path with a space in it might silently truncate at the first space?
I think the same thing might apply to user-provided values too. If someone wrote something like options: { welcome_message: "hello world" }, I think we'd end up sending --welcome_message=hello world and the server would parse world as a stray positional. Not 100% sure though, would you mind double-checking? Happy to double check this myself if you wish!
If it is an issue, maybe escape spaces in values, or call it out in a docstring that values shouldn't contain spaces? Wdyt?
There was a problem hiding this comment.
I think we'd end up sending --welcome_message=hello world and the server would parse world as a stray positional.
Ah so I tested this out by setting the default option for cluster as weewoo quickstart and it worked in that case, even with other option overrides. This might be because Materialize is able to handle this without escapes but not postgres? However I think if we get the correct/same behavior with the escaping, then it makes sense for us to escape as well to maintain compatibility. I'll check!
| import pytest | ||
|
|
||
|
|
||
| class TestConnectionOptionsOverride: |
There was a problem hiding this comment.
One more thing, would it be worth adding a second test case (or just another assertion) that checks the defaults still apply when the user doesn't pass an options block? Right now I think we'd catch a regression on the override path, but if someone refactored and accidentally dropped the dict(DEFAULT_SESSION_PARAMETERS) line we might not notice. Probably just asserting welcome_message=off, auto_route_catalog_queries=on, and current_object_missing_warnings=off in a profile with no options would do it. Up to you though, happy either way.
Purpose is to allow OIDC auth via the oidc_auth_enabled connection option variable. - Removes the initial monkey patch - Copies much of the implementation in PGConnectionManager
9c20e55 to
cb786a9
Compare
9b272e3 to
254d44e
Compare
bobbyiliev
left a comment
There was a problem hiding this comment.
I think that this is in a good shape now! Thanks for making all of the changes @SangJunBak 🎉
|
Thanks for the thorough review! |
Motivation
Unblocks OIDC auth because our dbt adapter doesn't allow inputting arbitrary options: https://materializeinc.slack.com/archives/C0A23PK183Z/p1776360806044789
Verification
Wasn't sure the best way to automate a unit test for this however.