
feat: Hive / Kyuubi / Spark Thrift Server connector #30

Merged
shirshanka merged 2 commits into main from feat/hive-connector
May 7, 2026

Conversation

@shirshanka
Contributor

Summary

Implements the Hive engine support requested in #13, following the isolated-connector architecture introduced in #24.

  • New analytics-agent-connector-hive package (connectors/hive/) — runs as an MCP subprocess, keeping heavy Thrift/SASL deps out of the base install. Uses PyPI-stable releases (pyhive[hive], pure-sasl, thrift-sasl) instead of git-pinned dependencies.
  • Supports all common auth modes: NONE, NOSASL, LDAP, PLAIN, KERBEROS
  • Hive registered in _CONNECTOR_MAP — the UI "Install connector" flow and env-var wiring work automatically, same as Snowflake/BigQuery
  • connect_args passthrough in SQLAlchemyQueryEngine — a generic improvement taken from #13 ("Supports hive engine for connecting to hiveserver2/kyuubi/spark thrift server") that benefits all SQLAlchemy-based connections
  • Frontend plugin — Hive/Kyuubi/Spark appears in the data source picker

What changed from #13

| #13 approach | This PR |
| --- | --- |
| Deps added to base wheel (pyhive @ git+kyuubi, kerberos, thrift) | Isolated connector package, zero impact on base install |
| Git-pinned, non-reproducible deps | PyPI releases only |
| sql_allow_limit exposed as LLM tool param | Removed — _apply_row_limit already skips if LIMIT present |
| Inline hive SQLAlchemyQueryEngine | hive runs as an MCP subprocess (same pattern as Snowflake) |

Test plan

  • uv tool install analytics-agent-connector-hive installs cleanly
  • Add Hive data source in UI — "Install connector" flow works
  • NONE auth connects to a local HiveServer2 / Kyuubi instance
  • LDAP auth (username + password) works
  • execute_sql, list_tables, get_schema, preview_table return correct results

@wForget — this PR addresses your request from #13. Could you give it a try against your Kyuubi setup and report back? The connector package will need to be installed separately (uv tool install analytics-agent-connector-hive) until it lands on PyPI — for now you can install from the repo:

uv tool install ./connectors/hive

Any feedback on the auth config or field names would be welcome.
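For reference, a connection config for a Kerberized Kyuubi endpoint might look like the following, using the field keys from the UI plugin. The exact shape and env-var wiring are the app's, so treat this as illustrative only:

```json
{
  "host": "kyuubi-host",
  "port": 10000,
  "database": "default",
  "auth": "KERBEROS",
  "kerberos_service_name": "hive"
}
```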

🤖 Generated with Claude Code

Adds support for HiveServer2-compatible engines (Apache Hive, Kyuubi,
Spark Thrift Server) following the existing isolated-connector architecture.

- connectors/hive/ — new analytics-agent-connector-hive package:
  uses PyPI-stable pyhive[hive], pure-sasl, thrift-sasl (no git deps);
  supports NONE, NOSASL, LDAP, PLAIN, and KERBEROS auth modes
- factory.py — registers "hive" in _CONNECTOR_MAP so the UI install
  flow and env-var wiring work automatically
- sqlalchemy/engine.py — passes connect_args from connection config to
  create_engine(), enabling dialect-specific driver options for all
  SQLAlchemy-based connections (contributed by @wForget in #13)
- frontend — adds Hive/Kyuubi/Spark plugin to the data source picker

Closes #13

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
{ key: "host", label: "Host", type: "mono", placeholder: "kyuubi-host or localhost", required: true },
{ key: "port", label: "Port", type: "mono", placeholder: "10000" },
{ key: "database", label: "Database", type: "mono", placeholder: "default" },
{ key: "auth", label: "Auth", type: "mono", placeholder: "NONE (or NOSASL, LDAP, KERBEROS)" },
Contributor


Missing kerberos_service_name

    { key: "kerberos_service_name",     label: "Kerberos Service Name",     type: "mono", placeholder: "hive" },

Comment thread connectors/hive/pyproject.toml Outdated
requires-python = ">=3.10"
dependencies = [
"mcp>=1.0.0",
"pyhive[hive]>=0.6.5",
Contributor


For Python 3.11+, we might need pyhive[hive_pure_sasl], see: https://github.com/apache/kyuubi/tree/master/python#requirements

    "pyhive[hive_pure_sasl]>=0.7.0",

Note: the 'pyhive[hive]' extra uses sasl, which doesn't support Python 3.11 (see cloudera/python-sasl#30). Hence PyHive also supports pure-sasl via the additional extra 'pyhive[hive_pure_sasl]', which does support Python 3.11.

Furthermore, it seems that the kerberos package is required in a Kerberos environment.

    "kerberos>=1.3.0",

@wForget
Contributor

wForget commented May 6, 2026

Thanks @shirshanka, after making the two changes above, I was able to successfully submit queries to our kyuubi server.


…os_service_name UI field

- Switch from pyhive[hive]>=0.6.5 to pyhive[hive_pure_sasl]>=0.7.0 — the sasl
  extra relies on the `sasl` C library which doesn't build on Python 3.11+
- Add kerberos_service_name field to the Hive connection UI so Kerberos auth
  can be configured without manually editing env vars

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
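Taken together with the context shown earlier in the thread, the dependency block in connectors/hive/pyproject.toml after this commit would look roughly like the following (reconstructed from the review suggestions; the file's exact contents may differ):

```toml
[project]
requires-python = ">=3.10"
dependencies = [
    "mcp>=1.0.0",
    "pyhive[hive_pure_sasl]>=0.7.0",  # pure-sasl builds on Python 3.11+
    "kerberos>=1.3.0",                # needed for the KERBEROS auth mode
]
```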
@shirshanka
Contributor Author

Thanks @wForget — both fixes are now in: switched to pyhive[hive_pure_sasl]>=0.7.0 for Python 3.11+ compatibility and added the kerberos_service_name field to the UI. Great catch on both, and glad it's working on Kyuubi!

Contributor

@wForget left a comment


Thanks @shirshanka

@shirshanka shirshanka merged commit 7aaa12a into main May 7, 2026
5 checks passed