[SPARK-56520][SQL] Persist SQL PATH in views and SQL functions, expose in DESCRIBE#55383
[SPARK-56520][SQL] Persist SQL PATH in views and SQL functions, expose in DESCRIBE#55383srielau wants to merge 1 commit intoapache:masterfrom
Conversation
f802878 to
ba25924
Compare
dtenedor
left a comment
There was a problem hiding this comment.
Thanks for working on this!
a994a5c to
bd3ec19
Compare
cloud-fan
left a comment
There was a problem hiding this comment.
Nice refactor since the first round — the extracted SqlPathFormat, DescribeFunctionCommandUtils, and shared CatalogManager.pathEntriesForPersistence / serializePathEntries helpers address the earlier dedup, narrow-catch, and file-health feedback cleanly.
Remaining items below. The main themes are:
- The scaladoc for both
VIEW_RESOLUTION_PATHandFUNCTION_RESOLUTION_PATHdescribes the stored value as comma-separated, but it's actually a JSON array-of-arrays produced byCatalogManager.serializePathEntries. - View DESCRIBE and function DESCRIBE each have their own display formatter for the same stored JSON shape — these should share one renderer.
- The
DESCRIBEoutput is gated on the current session'sSQLConf.get.pathEnabled, which hides the persisted property whenever a user later disables the flag. - Test coverage for
DESCRIBE FUNCTION EXTENDEDon SQL UDFs is missing — the wholeDescribeFunctionCommandUtilscodepath is currently untested end-to-end, as is the temp-vs-persistentsystem.sessionstripping rule.
Additional notes that didn't fit inline:
- Please add a test that creates a SQL UDF (both persistent and
TEMPORARY) with an explicitSET PATH, runsDESCRIBE FUNCTION EXTENDED, and asserts theSQL Path: …row. That coversDescribeFunctionCommandUtils(persistent catalog branch and temp UDF usage-blob fallback), which has no test right now. - Tests for the persistence semantics themselves (persistent view/function strips
system.session, temp view keeps it,SET PATHround-trip,PATH_ENABLEDtoggle) would probably sit more naturally inSetPathSuitealongside the rest of the PATH-feature tests than inDescribeTableSuite.
cb3169f to
106c994
Compare
srielau
left a comment
There was a problem hiding this comment.
Overall the PR is well-structured and the serialization/display separation is clean. One correctness issue and three doc/access-modifier nits below.
106c994 to
8ccfd95
Compare
cloud-fan
left a comment
There was a problem hiding this comment.
Round 3 re-review. Status: 9 addressed, 2 remaining, 1 new.
- Addressed: the
VIEW_RESOLUTION_PATH/FUNCTION_RESOLUTION_PATHscaladoc now describes the JSON array format;formatStoredPathdelegates toSqlPathFormat;removeResolutionPathhelper mirrors the otherremove*helpers; empty-parts.quotedguard added; non-JArrayentries inSqlPathFormat.toDescribeJsonnow drop instead of pass through; the exactList(SqlPathEntry(...), ...)assertion is pinned; andgenerateViewPropertiesnow takescaptureNewPath, soAlterViewSchemaBindingCommandpreserves the frozen path instead of overwriting it. - Remaining:
DESCRIBE FUNCTION EXTENDEDon SQL UDFs still has no test for theDescribeFunctionCommandUtilscode path (persistent catalog branch and temp UDF usage-blob fallback). And the PATH persistence tests (persistent view/function stripssystem.session, temp view keeps it,SET PATHround-trip) are still underDescribeTableSuiterather than alongside the other PATH-feature tests inSetPathSuite. On thepathEnabledDESCRIBE gate I am accepting your "kill switch" rationale as a deliberate design choice; not re-raising. - New: the
catchonstoredResolutionPathStringwas widened fromNoSuchFunctionExceptionall the way toAnalysisException. SinceUserDefinedFunction.fromCatalogFunctionwraps any parse failure intoAnalysisException("CORRUPTED_CATALOG_FUNCTION"), this silently masks real catalog corruption onDESCRIBE FUNCTION EXTENDED. My earlier ask was specifically to includeNoSuchDatabaseException; please narrow back toNoSuchFunctionException | NoSuchDatabaseExceptionsoCORRUPTED_CATALOG_FUNCTIONand any other genuine analysis error propagates to the user. Inline suggestion below.
Note: the re-review diff could not sub-classify new findings (newly introduced vs. late catch) because history between the prior review commit and the current HEAD was rewritten by a force-push.
…e in DESCRIBE When PATH is enabled, persist the effective resolution path at creation time for views and SQL functions: - Views: store frozen path in view.resolutionPath property; strip system.session for persisted views, keep for temp views. - SQL functions: store frozen path in function.resolutionPath property; same stripping rules as views. - DESCRIBE EXTENDED/JSON for views shows "SQL Path" when PATH is enabled and a stored path exists. - DESCRIBE FUNCTION EXTENDED for SQL UDFs shows "SQL Path" when PATH is enabled and a stored path exists. - CapturesConfig denies spark.sql.session.path from the sqlConfig blob (path is stored separately in the dedicated property). - Old views/functions without a stored path use the default resolution path (resolutionSearchPath with session function resolution order). The stored path is not yet used during analysis (frozen path resolution comes in a follow-up). This PR only persists and displays the metadata. Part of SPARK-54810.
8ccfd95 to
95da8aa
Compare
cloud-fan
left a comment
There was a problem hiding this comment.
Round 4 re-review. Status: 1 addressed, 2 remaining, 0 new.
- Addressed: the
catchonstoredResolutionPathStringis narrowed back toNoSuchFunctionException | NoSuchDatabaseException.CORRUPTED_CATALOG_FUNCTIONand otherAnalysisExceptions now propagate to the user instead of being silently masked as "no path stored." Thanks. - Remaining (still valid, not re-posting inline): (1)
DESCRIBE FUNCTION EXTENDEDon SQL UDFs still has no automated coverage for theDescribeFunctionCommandUtilscode path (persistent catalog branch and temp-UDF usage-blob fallback). (2) The PATH persistence tests (persistent view/function stripssystem.session, temp view keeps it,SET PATHround-trip) still live underDescribeTableSuiterather than alongside the other PATH-feature tests inSetPathSuite. - New: none. The only diff since the prior review is the
catchnarrowing.
Note: history between the prior review commit and the current HEAD was rewritten by force-push, so new-vs-late-catch sub-classification is not applicable this round (no new findings anyway).
DESCRIBE for SQL UDF (whcih is where PATH matters) is no really implemented. Thee is no formatted output of teh function. Adding proper DESRIBE (and QA for it) shodul be a separate PR. |
cloud-fan
left a comment
There was a problem hiding this comment.
Round 5 re-review. Status: 0 addressed, 0 remaining, 0 new.
- No code changes since the Round 4 HEAD (
95da8aa); the only new input is your pushback on DESCRIBE FUNCTION test coverage. - Remaining finding #1 (DESCRIBE FUNCTION EXTENDED SQL UDF test coverage): your scope argument is accepted. Confirmed by reading
functions.scala—DescribeFunctionCommand.run()for SQL UDFs only emits Function/Class/Usage/Extended Usage plus the newSQL Pathrow; there is no fleshed-out DESCRIBE FUNCTION output, so bundling QA for it with a follow-up PR that implements proper DESCRIBE for SQL UDFs is reasonable. - Remaining finding #2 (persistence tests under
DescribeTableSuitevs.SetPathSuite): accepting your implicit decision. The added test is genuinely DESCRIBE-focused (it asserts the JSON shape and the backtick-quoted display string), soDescribeTableSuiteis a defensible home. Dedicated persistence-semantics tests (temp keepssystem.session, persistent strips it, round-trip) don't exist yet, but that's a coverage gap to track separately — not something to keep re-raising here.
No further AI concerns. Closing the re-review loop.
|
thanks, merging to master! |
What changes were proposed in this pull request?
When
spark.sql.path.enabledis true, persist the effective resolution path at creation time for views and SQL functions, and expose it in DESCRIBE output.View persistence (
views.scala):view.resolutionPathproperty at CREATE VIEW time.system.sessionfrom the stored path since persistent views cannot reference temporary objects (session scope). Temporary views keep it because temp objects can reference other temp objects.SQL function persistence (
CreateSQLFunctionCommand.scala,SQLFunction.scala):function.resolutionPathproperty at CREATE FUNCTION time.Storage format:
[["spark_catalog","default"],["system","builtin"]][["catalog","ns1","ns2"]].DESCRIBE output for views (
interface.scala):DESCRIBE EXTENDED/DESCRIBE FORMATTEDshowsSQL Pathwith backtick-quoted entries (e.g.`spark_catalog`.`default`, `system`.`builtin`).DESCRIBE ... AS JSONexposessql_pathas an array of objects, consistent withdescribeIdentifier:DESCRIBE FUNCTION EXTENDED for SQL UDFs (
functions.scala):SQL Pathline for SQL UDFs when PATH is enabled and a stored path exists.Backward compatibility:
resolutionPathproperty. They fall back to the default resolution path (resolutionSearchPathwith session function resolution order).The stored path is not yet used during analysis -- frozen path resolution comes in a follow-up PR. This PR only persists and displays the metadata.
Why are the changes needed?
Views and SQL functions need to capture the resolution path at creation time so that their body can later resolve with the same path, independent of the caller's session path. This is the SQL-standard behavior for view and routine body resolution.
Part of SPARK-54810. Depends on SPARK-56501 (SET PATH syntax).
Does this PR introduce any user-facing change?
Yes.
DESCRIBE EXTENDEDon views showsSQL Pathwhen PATH is enabled.DESCRIBE FUNCTION EXTENDEDon SQL UDFs showsSQL Pathwhen PATH is enabled.DESCRIBE ... AS JSONincludessql_pathfield for views when PATH is enabled.How was this patch tested?
DescribeTableSuite: new test verifyingDESCRIBE EXTENDED view AS JSONincludessql_pathwith correct structure, and regularDESCRIBE EXTENDEDshowsSQL Path(both V1 catalog V1 and V2 command variants).DescribeTableSuiteBase: addedSqlPathEntrycase class andsql_pathfield toDescribeTableJsonfor JSON deserialization.SetPathSuite: all 22 existing tests pass (no regressions).Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.6