fix(get_url_parameter): only match complete parameter names#1080
Open
toseekandfind wants to merge 1 commit into
Open
fix(get_url_parameter): only match complete parameter names#1080toseekandfind wants to merge 1 commit into
toseekandfind wants to merge 1 commit into
Conversation
Previously, `get_url_parameter` split the URL on `<param>=`, which also matched when the search token appeared as a *suffix* of a longer parameter name. For example, calling `get_url_parameter(url, 'g')` against `http://example.com/?msg=myvalue` incorrectly returned `myvalue` because the `msg=` substring contains `g=`. The fix normalizes the URL so every parameter is preceded by `&` (using `dbt.replace` to turn `?` into `&`, and prepending `&` with `dbt.concat` so the first parameter is also delimited), then splits on `&<param>=`. Requiring the leading `&` in the search token means only complete parameter names match. Resolves dbt-labs#980 (also resolves the duplicate report in dbt-labs#1014).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
resolves #980
(also resolves the duplicate report in #1014)
Problem
dbt_utils.get_url_parametersilently returns the wrong value when therequested parameter name is a suffix of a different parameter name that
actually exists in the URL.
The macro splits the URL on
<url_parameter>=, so any parameter whose nameends with the search token matches. Two examples from the linked issues:
get_url_parameter('url', 'g')againsthttp://example.com/?msg=myvaluereturns
myvalue(becausemsg=containsg=).get_url_parameter('url', 'ku')againsthttp://example.com/?sku=EXAMPLE_SKU&u=...returnsEXAMPLE_SKU.The expected behaviour is to return
NULLwhenever the URL does not containa parameter whose full name matches the search token.
Solution
Normalize the URL so every parameter is preceded by
&, then split on&<url_parameter>=. Requiring the leading&in the search token meansonly complete parameter names match.
The normalization uses cross-database primitives that
dbt-utilsalreadyrelies on elsewhere (
dbt.replace,dbt.concat,dbt.split_part), so thechange stays dispatch-friendly and adapter-neutral:
Alternatives considered:
regexp_substr/regexp_match): cleanestsemantically, but each warehouse uses a different regex function and
argument order, so it would require per-adapter dispatch. Sticking with
split_part/replace/concatkeeps the macro single-implementation.&<param>=) without alsoprepending to
field: fails when the requested parameter is the firstone in the query string after normalization, because there is no leading
&in the URL itself.Tests
Added a singular test at
integration_tests/tests/web/test_get_url_parameter_substring.sqlthatcovers both regression cases and the legitimate short-name cases:
gagainst?msg=...,kuagainst?sku=...) must returnNULL.?m=...,?s=...) must still resolvecorrectly.
utm_medium,utm_source) is unchanged.The test compares
actualandexpectedwith NULL-safe equality soNULL = NULLis treated as a pass rather than "unknown" (which wouldotherwise let regressions slip through the existing
!=comparison).Verified locally against
dbt-postgres(make setup-db+dbt build):mainwith 2 failing rows (confirming itexercises the bug).
test_urls,test_url_host, andtest_url_pathtests allstill pass.
Checklist