feat: Replace information schema with describe calls #1432

Merged: tejassp-db merged 22 commits into 1.12.latest from replace-information-schema-with-describe-calls on May 14, 2026

Conversation

@tejassp-db
Contributor

Description

Replace expensive information_schema queries with a single DESCRIBE TABLE EXTENDED ... AS JSON call for metadata fetching in _describe_relation methods. The new path is gated behind DBRCapability.DESCRIBE_TABLE_EXTENDED_AS_JSON (DBR 17.3+) and falls back to the existing information_schema queries on older runtimes.

Changes:

  • IncrementalTableAPI._describe_relation: replaces 4 info_schema queries (PK, FK, non-null constraints, column masks) with AS JSON parsing
  • MaterializedViewAPI._describe_relation: replaces get_view_description (info_schema.views) with AS JSON parsing
  • ViewAPI._describe_relation: same as MV
  • New DatabricksDescribeJsonMetadata parser class with composite PK/FK support
  • New is_describe_as_json_supported() gating method (checks HMS, foreign table, capability)
  • New is_foreign_table property on DatabricksRelation
  • New describe_table_extended_as_json Jinja macro
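
The gating in the list above can be pictured roughly as follows. This is a minimal sketch, not the adapter's code: `RelationInfo`, `can_use_describe_json`, and the version tuple are hypothetical stand-ins, and it assumes that Hive Metastore relations and foreign tables are the cases that must take the fallback path. It also does not model SQL warehouses, which do not report a DBR version in the same way.

```python
from dataclasses import dataclass

# From the description: the capability is available on DBR 17.3 and later.
MIN_DBR_FOR_DESCRIBE_JSON = (17, 3)


@dataclass(frozen=True)
class RelationInfo:
    """Hypothetical stand-in for the fields the gate inspects."""

    is_hive_metastore: bool
    is_foreign_table: bool


def can_use_describe_json(relation: RelationInfo, dbr_version: tuple[int, int]) -> bool:
    """Return True only when every gate passes.

    When this returns False, callers are expected to fall back to the
    existing information_schema queries.
    """
    if relation.is_hive_metastore:
        return False  # assumption: HMS relations use the legacy path
    if relation.is_foreign_table:
        return False  # assumption: foreign tables use the legacy path
    # Tuple comparison handles the 17.2 vs 17.3 boundary.
    return dbr_version >= MIN_DBR_FOR_DESCRIBE_JSON


uc_table = RelationInfo(is_hive_metastore=False, is_foreign_table=False)
print(can_use_describe_json(uc_table, (17, 3)))  # True
print(can_use_describe_json(uc_table, (17, 2)))  # False
```

Keeping the gate as a pure function of the relation and the runtime version makes the boundary cases straightforward to unit test.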

Testing:

  • 56 unit tests: parser, processor roundtrip, diff validation, edge cases
  • Capability boundary tests (17.2 vs 17.3)
  • is_describe_as_json_supported unit tests (UC, HMS, foreign table, no capability)
  • Jinja macro SQL generation test
  • Functional tests validated on both code paths:
    • SQL warehouse (DBR 17.3+): all 14 constraint/mask tests passed via DESCRIBE AS JSON path
    • UC cluster (DBR 16.4 LTS): all 14 tests passed via information_schema fallback path

Checklist

  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-databricks next" section.

PECOBLR-2546

Add alternate code path for metadata fetching in _describe_relation
methods using DESCRIBE TABLE EXTENDED AS JSON (DBR 17.3+).

Replaces 4 information_schema queries in IncrementalTableAPI and
get_view_description in MaterializedViewAPI/ViewAPI. Falls back
to legacy queries when capability is unavailable.

PECOBLR-2546
Wrap long docstrings and expressions to satisfy E501, and add the
missing return annotation on DatabricksRelation.is_foreign_table for
mypy.
@github-actions

github-actions Bot commented Apr 29, 2026

Coverage report

Lines missing coverage, by file (dbt/adapters/databricks):

  • dbr_capabilities.py: none listed
  • impl.py: 444-445, 1226-1236, 1280-1286, 1336-1337, 1384, 1414, 1492, 1535, 1563, 1603-1606, 1628, 1756, 1800
  • relation.py: 151

This report was generated by python-coverage-comment-action

tejassp-db added 2 commits May 4, 2026 10:59
Fix primary key and foreign key constraint parser to handle non-delimited
identifiers within backticks.
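
As an illustration of the kind of parsing this fix concerns, the sketch below extracts column names from a parenthesized column list that mixes backtick-quoted and bare identifiers. It is an assumption-laden example, not the parser from this PR: the function name `split_columns` is hypothetical, and the actual shape of the constraint text returned by DESCRIBE TABLE EXTENDED ... AS JSON is not shown in this conversation.

```python
import re

# Matches either a backtick-quoted identifier (where `` is an escaped backtick)
# or a bare identifier made of word characters.
_IDENT = re.compile(r"`((?:[^`]|``)*)`|(\w+)")


def split_columns(column_list: str) -> list[str]:
    """Extract column names from text such as "(`id`, `order id`, region)".

    Only the column list should be passed in; keywords such as PRIMARY KEY
    would otherwise be picked up as bare identifiers.
    """
    names = []
    for quoted, bare in _IDENT.findall(column_list):
        # Exactly one of the two groups is populated for each match.
        names.append(quoted.replace("``", "`") if quoted else bare)
    return names


print(split_columns("(`id`, `order id`, region)"))  # ['id', 'order id', 'region']
```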
@sd-db
Collaborator

sd-db commented May 9, 2026

/integration-test

@github-actions

github-actions Bot commented May 9, 2026

Integration tests dispatched for PR #1432 by @sd-db. Track progress in the Actions tab.

@github-actions

github-actions Bot commented May 9, 2026

Integration results for PR #1432 — UC cluster ✅ success · SQL warehouse ❌ failure · All-purpose cluster ✅ success · Shard coverage ✅ success

Run details.

@sd-db
Collaborator

sd-db commented May 9, 2026

On the failure,

FAILED tests/functional/adapter/python_model/test_python_model.py::TestSpecifyingHttpPath::test_singular_tests - AssertionError: dbt exit state did not match expected

This is a known flaky test, so it can be ignored (I will look to fix it separately).

Collaborator

@sd-db sd-db left a comment

PR Review 1/x. Added a few comments on structure

Comment thread dbt/adapters/databricks/impl.py Outdated
f"Current connection does not meet this requirement."
)

def is_describe_as_json_supported(self, relation: DatabricksRelation) -> bool:
Collaborator

Again, I feel we should rename this. This check is specifically for using DESCRIBE TABLE EXTENDED AS JSON to get relation metadata, and we should mark it as such, since there can be other places where we use DESCRIBE ... AS JSON that don't go through this check. Some suggestions: is_describe_as_json_relation_metadata_supported() or maybe can_use_describe_json_for_relation_metadata().

Contributor Author

This method specifically checks whether "DESCRIBE TABLE EXTENDED AS JSON" is supported. It should not take responsibility for how the call's response is consumed.

Collaborator

The reason for raising this was actually the behaviour flag check, bool(self.behavior.use_describe_as_json_for_relation_metadata.no_warn). Since the method depends on that flag, it is a more specific flavour of the check; we can either move the flag check out or rename the method.

Comment thread dbt/adapters/databricks/impl.py Outdated
Comment thread dbt/adapters/databricks/impl.py
)

@classmethod
def parse_column_masks(cls, json_metadata: dict[str, Any]) -> "Table":
Collaborator

All the parse methods rely heavily on the structure of the response from the DESCRIBE ... AS JSON call. While this works, I feel it would have been much better to define our own class/data model and load the results of the DESCRIBE ... AS JSON call into it, basically a DAO layer. As written, we are at the mercy of the server and of whatever validations we might or might not add for correctness, and there can still be edge cases.
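
To make the suggestion concrete, a data model along these lines might look like the sketch below. It is illustrative only: `ColumnInfo`, `RelationMetadata`, and `from_json` are hypothetical names, and the `columns`, `name`, and `nullable` keys are assumptions about the payload rather than a confirmed schema.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ColumnInfo:
    name: str
    nullable: bool


@dataclass(frozen=True)
class RelationMetadata:
    """Typed view over the parsed DESCRIBE ... AS JSON payload (assumed shape)."""

    columns: tuple[ColumnInfo, ...]

    @classmethod
    def from_json(cls, payload: dict[str, Any]) -> "RelationMetadata":
        # Validate the structure once, at the boundary, so the parse
        # methods downstream can rely on typed fields.
        raw_columns = payload.get("columns", [])
        if not isinstance(raw_columns, list):
            raise ValueError("'columns' must be a list")
        columns = []
        for raw in raw_columns:
            if not isinstance(raw, dict) or "name" not in raw:
                raise ValueError(f"malformed column entry: {raw!r}")
            columns.append(
                ColumnInfo(
                    name=str(raw["name"]),
                    # Assumption: a missing 'nullable' key means nullable.
                    nullable=bool(raw.get("nullable", True)),
                )
            )
        return cls(columns=tuple(columns))


metadata = RelationMetadata.from_json(
    {"columns": [{"name": "id", "nullable": False}, {"name": "note"}]}
)
non_null = [c.name for c in metadata.columns if not c.nullable]
print(non_null)  # ['id']
```

With this shape, unexpected server responses fail in one place with a clear error instead of surfacing as key errors inside individual parse methods.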

Contributor Author

Future.

Comment thread dbt/adapters/databricks/impl.py
Collaborator

@sd-db sd-db left a comment

Overall looks good, made some minor comments

@tejassp-db
Contributor Author

/integration-test

@github-actions

Integration tests dispatched for PR #1432 by @tejassp-db. Track progress in the Actions tab.

@github-actions

Integration results for PR #1432 — UC cluster ✅ success · SQL warehouse ✅ success · All-purpose cluster ✅ success · Shard coverage ✅ success

Run details.

@tejassp-db
Contributor Author

/integration-test

@github-actions

Integration tests dispatched for PR #1432 by @tejassp-db. Track progress in the Actions tab.

@tejassp-db
Contributor Author

/integration-test

@github-actions

Integration tests dispatched for PR #1432 by @tejassp-db. Track progress in the Actions tab.

@github-actions

Integration results for PR #1432 — UC cluster ❌ cancelled · SQL warehouse ❌ failure · All-purpose cluster ✅ success · Shard coverage ❌ failure

Run details.

@github-actions

Integration results for PR #1432 — UC cluster ✅ success · SQL warehouse ❌ failure · All-purpose cluster ✅ success · Shard coverage ✅ success

Run details.

@tejassp-db
Contributor Author

Integration results for PR #1432 — UC cluster ✅ success · SQL warehouse ❌ failure · All-purpose cluster ✅ success · Shard coverage ✅ success

Run details.

This is a flaky integration test; this change does not touch those tests or their code path.

@tejassp-db tejassp-db merged commit 3caad33 into 1.12.latest May 14, 2026
7 checks passed
@tejassp-db tejassp-db deleted the replace-information-schema-with-describe-calls branch May 14, 2026 08:22

2 participants