Add Doris connector #29120

Open
ningsh7 wants to merge 4 commits into trinodb:master from ningsh7:doris-flight-sql-connector

ningsh7 commented Apr 15, 2026

Description

Introduce the initial Doris connector implementation.

The connector uses three Doris interfaces:

  • Doris FE HTTP for query planning and split generation
  • Doris FE JDBC for metadata access
  • Apache Arrow Flight SQL for data reads

It includes support for metadata access, type mapping, split generation, predicate and limit pushdown, and reading Doris tables from Trino.

This change also adds the required configuration, unit tests, integration tests, and product tests to validate the implementation and provide a baseline for future improvements.

Additional context and related issues

Compared with using the MySQL connector against Doris, this connector adds Doris-specific split planning, Flight SQL-based reads, and Doris-specific type handling.

Related discussions:

Testing includes:

  • unit tests for configuration, metadata, split planning, and type mapping
  • connector tests following the BaseConnectorTest pattern
  • smoke and integration tests against a real Doris instance
  • product tests with a Docker-based Doris environment

Follow-up changes may further improve performance, high availability, caching, and pushdown coverage.

Release notes

(x) Release notes are required, with the following suggested text:

## Doris connector
* Add initial support for querying Doris tables from Trino.

cla-bot Bot commented Apr 15, 2026

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@github-actions github-actions Bot added the docs label Apr 15, 2026

ningsh7 commented Apr 15, 2026

I submitted my signed CLA to cla@trino.io on April 1, 2026, but the CLA check is still failing. Please let me know if there is anything else I should do on my side.

@ningsh7 ningsh7 force-pushed the doris-flight-sql-connector branch from 01c7cf2 to 95566bb Compare April 16, 2026 03:30
@ningsh7 ningsh7 force-pushed the doris-flight-sql-connector branch from b59a6b4 to 34fc8af Compare April 16, 2026 07:51

ningsh7 commented Apr 16, 2026

Compared with using the generic MySQL connector:

1. Broader data type support (including boolean and largeint)
As discussed in issue #17329 (which applies to both Doris and StarRocks), using the MySQL connector to query Doris tables results in type mapping issues.

This new Doris connector maps these types properly. As demonstrated in the test queries below, types such as boolean, decimal(38,10), and datetime at various scales are now supported natively and formatted correctly without casting errors.

For example, the MySQL connector maps Doris boolean to tinyint.

[Screenshots: test query results]

2. Resolving the mixed-case identifier issue
Doris supports case-sensitive identifiers (e.g., it distinguishes a database named DB_A from one named db_a).
With the MySQL connector, Trino normalizes identifiers to lowercase, so a database named DB_A in Doris becomes invisible or is conflated with db_a.
The native Doris connector addresses this case-sensitivity mismatch. Both DB_A and db_a are resolved and managed by the connector's schema handling, so no tables or databases are hidden from Trino because of case folding.

[Screenshot]
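
The conflation described above follows from lowercasing alone. A minimal sketch, using a hypothetical helper that is not part of the connector, groups remote schema names by the lowercase form Trino would see. Any group with more than one entry cannot be told apart after normalization, so a connector has to keep the original remote names somewhere:

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; the PR's actual resolution logic may differ.
final class IdentifierCaseCollisions
{
    private IdentifierCaseCollisions() {}

    // Groups remote names by the lowercase identifier that Trino would use for them
    static Map<String, List<String>> groupByLowercase(List<String> remoteNames)
    {
        return remoteNames.stream()
                .collect(Collectors.groupingBy(
                        name -> name.toLowerCase(Locale.ENGLISH),
                        TreeMap::new,
                        Collectors.toList()));
    }

    public static void main(String[] args)
    {
        Map<String, List<String>> grouped = groupByLowercase(List.of("DB_A", "db_a", "sales"));
        // A key that maps to more than one remote name is ambiguous after lowercasing
        grouped.forEach((lower, names) ->
                System.out.println(lower + " -> " + names + (names.size() > 1 ? " (ambiguous)" : "")));
    }
}
```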


ningsh7 commented Apr 16, 2026

Performance improved significantly with the inclusion of pushdown support.
Although not included in this specific PR, the results below were obtained using a pushdown-enabled build.

For the performance evaluation, we used the TPC-H benchmark at the 100 GB scale.
The table sizes are as follows:

| Table | Rows |
| --- | --- |
| nation | 25 |
| region | 5 |
| customer | 15,000,000 |
| part | 20,000,000 |
| supplier | 1,000,000 |
| partsupp | 80,000,000 |
| orders | 150,000,000 |
| lineitem | 600,037,902 |

For data generation, table creation on the Doris side, and the query statements used in the benchmark, we followed the official Doris TPC-H test cases:

https://github.com/apache/doris/tree/master/tools/tpch-tools

Test environment:

  • Trino version: 480
  • Doris version: 2.1.1

Below are the comparison results.

We prepared two tables:

  1. A summary table showing the average execution time of each query on the two connectors. Difference is the MySQL average minus the Doris average, and Speedup is the MySQL average divided by the Doris average.

| Query ID | Doris Avg (s) | MySQL Avg (s) | Difference (s) | Speedup (x) |
| --- | --- | --- | --- | --- |
| Q1 | 53.54 | 1577.87 | 1524.33 | 29.47 |
| Q2 | 22.2 | 122.54 | 100.35 | 5.52 |
| Q3 | 18.58 | 486.24 | 467.66 | 26.17 |
| Q4 | 23.7 | 948.31 | 924.61 | 40.02 |
| Q5 | 36.32 | 1098.33 | 1062.02 | 30.24 |
| Q6 | 2.31 | 14.84 | 12.54 | 6.43 |
| Q7 | 31.59 | 429 | 397.41 | 13.58 |
| Q8 | 40.15 | 1269.03 | 1228.88 | 31.6 |
| Q9 | 48.82 | 1666.75 | 1617.93 | 34.14 |
| Q10 | 15.39 | 299.22 | 283.83 | 19.44 |
| Q11 | 15.97 | 135.53 | 119.56 | 8.49 |
| Q12 | 8.16 | 190.81 | 182.65 | 23.39 |
| Q13 | 12.16 | 221.55 | 209.39 | 18.22 |
| Q14 | 3.92 | 19.82 | 15.9 | 5.06 |
| Q15 | 4.74 | 0.84 | -3.9 | 0.18 |
| Q16 | 7.25 | 72.94 | 65.69 | 10.05 |
| Q17 | 32.42 | 822.28 | 789.86 | 25.36 |
| Q18 | 48.59 | 617.77 | 569.18 | 12.71 |
| Q19 | 5.83 | 66.56 | 60.73 | 11.42 |
| Q20 | 14.06 | 162.14 | 148.08 | 11.53 |
| Q21 | 73.03 | 1043.47 | 970.44 | 14.29 |
| Q22 | 5.1 | 25.57 | 20.47 | 5.01 |

  2. A detailed table showing the results of three runs for each query on each connector, together with the average value.

| Query ID | Connector Name | Run 1 (ms) | Run 2 (ms) | Run 3 (ms) | Average (ms) |
| --- | --- | --- | --- | --- | --- |
| Q1 | Doris | 53760 | 53746 | 53108 | 53538 |
| Q1 | MySQL | 1459383 | 1600915 | 1673318 | 1577872 |
| Q2 | Doris | 22021 | 21547 | 23018 | 22195.33 |
| Q2 | MySQL | 124783 | 119777 | 123061 | 122540.33 |
| Q3 | Doris | 19166 | 19503 | 17066 | 18578.33 |
| Q3 | MySQL | 477098 | 493870 | 487745 | 486237.67 |
| Q4 | Doris | 24197 | 24057 | 22842 | 23698.67 |
| Q4 | MySQL | 990317 | 844949 | 1009670 | 948312 |
| Q5 | Doris | 37241 | 37181 | 34527 | 36316.33 |
| Q5 | MySQL | 1138956 | 1126418 | 1029628 | 1098334 |
| Q6 | Doris | 2629 | 2466 | 1825 | 2306.67 |
| Q6 | MySQL | 13531 | 15573 | 15423 | 14842.33 |
| Q7 | Doris | 33986 | 30271 | 30505 | 31587.33 |
| Q7 | MySQL | 428067 | 417250 | 441683 | 429000 |
| Q8 | Doris | 40502 | 40500 | 39458 | 40153.33 |
| Q8 | MySQL | 1245228 | 1213269 | 1348607 | 1269034.67 |
| Q9 | Doris | 48249 | 49651 | 48554 | 48818 |
| Q9 | MySQL | 1681189 | 1678228 | 1640839 | 1666752 |
| Q10 | Doris | 15851 | 15570 | 14754 | 15391.67 |
| Q10 | MySQL | 301863 | 299843 | 295945 | 299217 |
| Q11 | Doris | 15948 | 15801 | 16148 | 15965.67 |
| Q11 | MySQL | 134509 | 132464 | 139612 | 135528.33 |
| Q12 | Doris | 8408 | 8269 | 7801 | 8159.33 |
| Q12 | MySQL | 193085 | 189366 | 189988 | 190813 |
| Q13 | Doris | 11688 | 12359 | 12440 | 12162.33 |
| Q13 | MySQL | 218487 | 229379 | 216786 | 221550.67 |
| Q14 | Doris | 4256 | 3735 | 3758 | 3916.33 |
| Q14 | MySQL | 19769 | 19537 | 20141 | 19815.67 |
| Q15 | Doris | 4713 | 5294 | 4227 | 4744.67 |
| Q15 | MySQL | 754 | 1056 | 718 | 842.67 |
| Q16 | Doris | 7885 | 6891 | 6989 | 7255 |
| Q16 | MySQL | 73401 | 75994 | 69438 | 72944.33 |
| Q17 | Doris | 33693 | 30704 | 32863 | 32420 |
| Q17 | MySQL | 847493 | 859640 | 759699 | 822277.33 |
| Q18 | Doris | 48210 | 46061 | 51504 | 48591.67 |
| Q18 | MySQL | 603703 | 627725 | 621893 | 617773.67 |
| Q19 | Doris | 5338 | 6564 | 5591 | 5831 |
| Q19 | MySQL | 69071 | 64660 | 65958 | 66563 |
| Q20 | Doris | 14522 | 13593 | 14067 | 14060.67 |
| Q20 | MySQL | 131603 | 141046 | 213769 | 162139.33 |
| Q21 | Doris | 76828 | 72275 | 69998 | 73033.67 |
| Q21 | MySQL | 987613 | 1037356 | 1105455 | 1043474.67 |
| Q22 | Doris | 5304 | 5095 | 4909 | 5102.67 |
| Q22 | MySQL | 26703 | 24837 | 25175 | 25571.67 |
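
The Average and Speedup figures can be reproduced from the raw run times. The following is a minimal sketch using the Q1 runs from the detailed table; the class and method names are illustrative and not part of the PR, and it assumes that speedup means the MySQL average divided by the Doris average:

```java
import java.util.stream.LongStream;

// Illustrative helper, not part of the PR
final class BenchmarkSummary
{
    private BenchmarkSummary() {}

    // Arithmetic mean of the individual run times, in milliseconds
    static double averageMillis(long... runsMillis)
    {
        return LongStream.of(runsMillis).average().orElseThrow();
    }

    // How many times faster the candidate is than the baseline
    static double speedup(double baselineMillis, double candidateMillis)
    {
        return baselineMillis / candidateMillis;
    }

    public static void main(String[] args)
    {
        double doris = averageMillis(53760, 53746, 53108);
        double mysql = averageMillis(1459383, 1600915, 1673318);
        System.out.printf("Doris avg (s): %.2f%n", doris / 1000);
        System.out.printf("MySQL avg (s): %.2f%n", mysql / 1000);
        System.out.printf("Difference (s): %.2f%n", (mysql - doris) / 1000);
        System.out.printf("Speedup (x): %.2f%n", speedup(mysql, doris));
    }
}
```

A speedup below 1, as for Q15, means the MySQL connector was faster for that query.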

Overall, based on the TPC-H 100 GB tests we ran, the Doris connector showed significantly better query performance than the MySQL connector in these cases. That said, these results are only intended as a preliminary reference under this specific benchmark setup, and we understand that actual performance may vary depending on workload, schema design, cluster configuration, and query patterns.

In addition, when the benchmark execution reached Q15, we found that we had overlooked view handling. We have already added support for views in the second commit.

We also referred to the implementation in the Doris Flink connector:

https://github.com/apache/doris-flink-connector


ningsh7 commented Apr 16, 2026

@ebyhr Hi! This PR introduces the initial Doris connector. Could you please help approve the workflow execution and take a look at the implementation when you have time?

@ningsh7 ningsh7 force-pushed the doris-flight-sql-connector branch from ea8a08e to 4c26ca9 Compare April 17, 2026 07:38
@ningsh7 ningsh7 force-pushed the doris-flight-sql-connector branch from 4c26ca9 to db17789 Compare April 17, 2026 08:33
@ningsh7 ningsh7 force-pushed the doris-flight-sql-connector branch from 219d2ac to b6a67c3 Compare April 21, 2026 08:56

Introduce the initial Doris connector, including metadata access, query planning, type mapping, and read paths needed to query Doris tables from Trino.

Add the required documentation, unit tests, and product tests.
@ningsh7 ningsh7 force-pushed the doris-flight-sql-connector branch from 6f343cc to 372d052 Compare April 22, 2026 08:31

ningsh7 commented Apr 22, 2026

Following the community’s suggestion, we narrowed the scope of this PR for the initial version of the connector.

In this revision, we removed pushdown support to keep the PR minimal.

Even without pushdown, the Doris connector is faster than the MySQL connector in 19 of the 22 queries in our benchmark (all except Q6, Q14, and Q15), thanks to the new connection approach and related improvements.

Below is a comparison of the average execution time of all 22 TPC-H SF100 queries between:

  1. the Doris connector without pushdown, and
  2. the MySQL connector.

| Query ID | Doris Avg (s), Not Pushed Down | MySQL Avg (s) | Difference (s) | Speedup (x) |
| --- | --- | --- | --- | --- |
| Q1 | 56.97 | 1488.73 | 1431.77 | 26.13 |
| Q2 | 23.88 | 117.52 | 93.64 | 4.92 |
| Q3 | 31.51 | 496.78 | 465.27 | 15.76 |
| Q4 | 19.29 | 907.54 | 888.25 | 47.04 |
| Q5 | 40.96 | 1115.43 | 1074.47 | 27.23 |
| Q6 | 26.13 | 11.86 | -14.26 | 0.45 |
| Q7 | 47.54 | 433.25 | 385.71 | 9.11 |
| Q8 | 41.43 | 1360.42 | 1319 | 32.84 |
| Q9 | 48.34 | 1661.85 | 1613.52 | 34.38 |
| Q10 | 36.99 | 290.81 | 253.82 | 7.86 |
| Q11 | 17.12 | 136.97 | 119.85 | 8 |
| Q12 | 28.26 | 196.4 | 168.14 | 6.95 |
| Q13 | 12.73 | 229.89 | 217.17 | 18.06 |
| Q14 | 21.47 | 20.96 | -0.51 | 0.98 |
| Q15 | 4.96 | 0.67 | -4.29 | 0.13 |
| Q16 | 8.08 | 76.45 | 68.37 | 9.46 |
| Q17 | 48.31 | 885.85 | 837.53 | 18.34 |
| Q18 | 52.2 | 602.46 | 550.26 | 11.54 |
| Q19 | 47.19 | 67.1 | 19.91 | 1.42 |
| Q20 | 28.33 | 190.44 | 162.11 | 6.72 |
| Q21 | 76.18 | 1105.2 | 1029.02 | 14.51 |
| Q22 | 6.96 | 27.1 | 20.14 | 3.89 |

We also include a performance comparison between:

  1. the Doris connector with pushdown, and
  2. the Doris connector without pushdown.

| Query ID | Doris Avg (s), Not Pushed Down | Doris Avg (s), Pushed Down | MySQL Avg (s) | Speedup (x), Not Pushed Down | Speedup (x), Pushed Down | Speedup Difference |
| --- | --- | --- | --- | --- | --- | --- |
| Q1 | 56.97 | 53.87 | 1538.40 | 27.00 | 28.56 | 1.55 |
| Q2 | 23.88 | 22.25 | 127.51 | 5.34 | 5.73 | 0.39 |
| Q3 | 31.51 | 19.11 | 485.12 | 15.40 | 25.39 | 9.99 |
| Q4 | 19.29 | 18.59 | 912.30 | 47.29 | 49.07 | 1.78 |
| Q5 | 40.96 | 40.77 | 1117.42 | 27.28 | 27.41 | 0.13 |
| Q6 | 26.13 | 1.65 | 12.19 | 0.47 | 7.39 | 6.92 |
| Q7 | 47.54 | 31.59 | 412.63 | 8.68 | 13.06 | 4.38 |
| Q8 | 41.43 | 38.83 | 1374.33 | 33.17 | 35.39 | 2.22 |
| Q9 | 48.34 | 45.86 | 1649.46 | 34.12 | 35.97 | 1.85 |
| Q10 | 36.99 | 15.25 | 297.47 | 8.04 | 19.51 | 11.46 |
| Q11 | 17.12 | 15.95 | 147.58 | 8.62 | 9.25 | 0.63 |
| Q12 | 28.26 | 7.79 | 201.15 | 7.12 | 25.82 | 18.70 |
| Q13 | 12.73 | 12.00 | 247.48 | 19.44 | 20.62 | 1.18 |
| Q14 | 21.47 | 3.93 | 21.31 | 0.99 | 5.42 | 4.43 |
| Q15 | 4.96 | 5.35 | 0.61 | 0.12 | 0.11 | -0.01 |
| Q16 | 8.08 | 6.53 | 83.15 | 10.29 | 12.73 | 2.44 |
| Q17 | 48.31 | 33.01 | 876.94 | 18.15 | 26.57 | 8.41 |
| Q18 | 52.20 | 44.06 | 614.44 | 11.77 | 13.95 | 2.17 |
| Q19 | 47.19 | 5.15 | 71.01 | 1.50 | 13.79 | 12.28 |
| Q20 | 28.33 | 13.61 | 211.56 | 7.47 | 15.54 | 8.08 |
| Q21 | 76.18 | 73.18 | 1132.71 | 14.87 | 15.48 | 0.61 |
| Q22 | 6.96 | 5.39 | 26.30 | 3.78 | 4.88 | 1.10 |


ningsh7 commented Apr 22, 2026

Hi @ebyhr. Could you please take a look? I’d really appreciate your review.

</artifact>
</artifactSet>

<artifactSet to="plugin/doris">
Member

Could you update ci.yml as well?

Author

Done.
I added plugin/trino-doris to the standard CI test matrix and excluded :trino-doris from the fallback test-other-modules list.

Comment thread docs/src/main/sphinx/connector/doris.md Outdated
Comment on lines +80 to +83
* - `doris.largeint-mapping`
  - No
  - Mapping for Doris `LARGEINT`. Valid values are `VARCHAR` and `DECIMAL`.
    The default is `VARCHAR`.
Member

Can we remove this config property and map to NUMBER type instead?

Also, is this required for the initial PR? I want to handle it as a follow-up.

Author

Done.
I removed the doris.largeint-mapping config, enum, docs, and config tests.
LARGEINT is now mapped directly to Trino NUMBER, and the Arrow conversion path handles NUMBER values. Additionally, I have verified in a real test environment that -170141183460469231731687303715884105 and 170141183460469231731687303715884105 can be selected successfully.
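
For context on why LARGEINT needs its own mapping: assuming it is a signed 128-bit integer with an upper bound of 2^127 − 1 (an assumption based on the Doris type documentation, not stated in this PR), its largest value has 39 digits. That exceeds Java's long and also a 38-digit decimal. The sketch below checks this with BigInteger; the class name is illustrative:

```java
import java.math.BigInteger;

// Illustrative only; assumes LARGEINT is a signed 128-bit integer
final class LargeintRange
{
    private LargeintRange() {}

    // Assumed upper bound: 2^127 - 1
    static final BigInteger MAX = BigInteger.TWO.pow(127).subtract(BigInteger.ONE);

    // True when the value can be represented as a Java long
    static boolean fitsInLong(BigInteger value)
    {
        return value.bitLength() <= 63;
    }

    // Number of decimal digits, ignoring the sign
    static int decimalDigits(BigInteger value)
    {
        return value.abs().toString().length();
    }

    public static void main(String[] args)
    {
        // The positive value quoted in the comment above
        BigInteger tested = new BigInteger("170141183460469231731687303715884105");
        System.out.println("max digits: " + decimalDigits(MAX));
        System.out.println("max fits in long: " + fitsInLong(MAX));
        System.out.println("tested value digits: " + decimalDigits(tested));
        System.out.println("tested value within range: " + (tested.compareTo(MAX) <= 0));
    }
}
```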

Comment on lines +52 to +55
public DorisMetadata(DorisMetadataClient metadataClient, DorisTypeMapper typeMapper)
{
    this(metadataClient, typeMapper, Set.of());
}
Member

This constructor can be removed. It is called only from tests.

Author

Done.
I removed the test-only DorisMetadata constructor and updated the tests to use the production constructor explicitly.

Comment on lines +116 to +118
@SuppressWarnings("deprecation")
@Override
public Map<SchemaTableName, List<ColumnMetadata>> listTableColumns(ConnectorSession session, SchemaTablePrefix prefix)
Member

Please implement the streamRelationColumns method instead.

Author

Done.
I replaced the deprecated listTableColumns implementation with streamRelationColumns and return RelationColumnsMetadata for both tables and views.


import io.trino.spi.connector.ConnectorSession;

public interface DorisQueryEventListener
Member

Is this interface really needed?

Author

Addressed.
The listener was only a placeholder for future extension, so I removed the interface, beginQuery/cleanupQuery hooks, bindings, and related tests.

throw t;
}
}
}
Member

Could you add a main method?

Author

Done.
I added a main method to DorisQueryRunner following the style used by other connector query runners.

Comment on lines +103 to +111
public void clearLastRequest()
{
    lastRequest.set(null);
}

public Optional<Request> getLastRequest()
{
    return Optional.ofNullable(lastRequest.get());
}
Member

Remove unused methods.

Author

Done.
Unused methods have been removed.

import java.util.Optional;
import java.util.OptionalLong;

public interface DorisMetadataClient
Member

Let's avoid having two methods, one with and one without the ConnectorSession parameter.

Author

Done.
I simplified DorisMetadataClient and updated the JDBC implementation and tests accordingly.

import static java.util.Objects.requireNonNull;

@Execution(ExecutionMode.SAME_THREAD)
final class TestDorisTypeMapping
Member

Type mapping tests should cover more values, e.g., null, min, max, non-ASCII characters, the Julian-to-Gregorian switch, DST, etc.

Author

Done.
I expanded TestDorisTypeMapping to cover nulls, min/max values, non-ASCII strings, Julian/Gregorian date boundaries, DST-like timestamp values, decimal boundaries, and char/varchar empty and trailing-space cases.
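
To illustrate why two of these edge cases matter, here is a minimal sketch using only java.time. It is not taken from TestDorisTypeMapping; the class name, dates, and time zone are chosen purely for illustration. It shows that java.time treats the days skipped by the 1582 calendar switch as ordinary dates, and that a local time inside a daylight saving gap is shifted forward when it is resolved in a zone:

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;

// Illustrative only; not the test class from the PR
final class TemporalEdgeCases
{
    private TemporalEdgeCases() {}

    // java.time uses the proleptic Gregorian calendar, so 1582-10-05 through
    // 1582-10-14 exist here even though the historical switch skipped them.
    static long daysAcrossCalendarSwitch()
    {
        return ChronoUnit.DAYS.between(LocalDate.of(1582, 10, 4), LocalDate.of(1582, 10, 15));
    }

    // 02:30 on 2021-03-14 falls in the spring-forward gap in America/New_York;
    // ZonedDateTime.of resolves such a time by moving it forward by the gap length.
    static ZonedDateTime resolveInDstGap()
    {
        return ZonedDateTime.of(LocalDateTime.of(2021, 3, 14, 2, 30), ZoneId.of("America/New_York"));
    }

    public static void main(String[] args)
    {
        System.out.println("days between 1582-10-04 and 1582-10-15: " + daysAcrossCalendarSwitch());
        System.out.println("02:30 in the DST gap resolves to: " + resolveInDstGap());
    }
}
```

A source that stores dates in a hybrid Julian/Gregorian calendar, or timestamps in a zone with DST, can disagree with these values, which is why such inputs are worth testing explicitly.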

Comment on lines +103 to +106
try (Connection connection = openConnection(session);
        PreparedStatement statement = connection.prepareStatement(LIST_SCHEMAS_SQL.formatted(VISIBLE_SCHEMAS_PREDICATE));
        ResultSet resultSet = statement.executeQuery()) {
    List<String> schemas = new ArrayList<>();
Member

Why do we need to build this query in the connector? Does the JDBC driver's DatabaseMetaData not work?

Author

We cannot replace the Doris metadata queries with DatabaseMetaData wholesale. The connector needs Doris-specific INFORMATION_SCHEMA fields such as ENGINE, TABLE_ROWS, and raw COLUMN_TYPE. JDBC DatabaseMetaData exposes standard table/column metadata but does not preserve these fields with the fidelity needed for Doris table filtering, row-count estimates, and Doris-specific type mapping.
Case-insensitive resolution is also handled explicitly because Trino normalizes identifiers to lowercase.

Author

However, listSchemaNames could perhaps be changed to use JDBC DatabaseMetaData.getCatalogs(), but getTables / getTable / loadColumns / getTableRowCount cannot be switched wholesale, because the standard DatabaseMetaData API does not expose the Doris-specific fields required here.

Add the Doris module to the CI test matrix and exclude it from the fallback module list. Remove the LARGEINT mapping configuration and map LARGEINT to NUMBER directly.

Replace deprecated metadata column listing with streamRelationColumns. Simplify metadata APIs by keeping only ConnectorSession-aware methods. Remove unused query listener hooks and test request state.

Add the QueryRunner main entry point and expand type mapping tests for nulls, boundary values, non-ASCII strings, date/time edge cases, and decimal limits.

ningsh7 commented Apr 28, 2026

Thanks for your review, @ebyhr. We have addressed the comments and pushed a new round of fixes. Could you please take another look when you have time?

@ningsh7 ningsh7 requested a review from ebyhr April 28, 2026 02:53
Expose the root cause from Doris Flight SQL failures in connector error messages, so users can see actionable Doris or network errors without debugging the wrapped ADBC exception.

Document the FE and BE Flight SQL endpoint requirements, including BE arrow_flight_sql_port setup and public_host/proxy configuration for deployments where Doris returns internal BE addresses.
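
One way to surface a root cause as described above is to walk the exception's cause chain and put the innermost message into the user-facing error. The following is a minimal sketch under that assumption; the class, method names, and message prefix are hypothetical, and the PR's actual implementation may differ:

```java
// Hypothetical helper for illustration; not the connector's actual code
final class RootCauseMessages
{
    private RootCauseMessages() {}

    // Follows getCause() to the innermost exception, with a depth limit
    // as a guard against cyclic cause chains
    static Throwable rootCause(Throwable throwable)
    {
        Throwable current = throwable;
        int depth = 0;
        while (current.getCause() != null && current.getCause() != current && depth < 100) {
            current = current.getCause();
            depth++;
        }
        return current;
    }

    // Builds a message that leads with context and ends with the root cause detail
    static String describe(String prefix, Throwable throwable)
    {
        Throwable root = rootCause(throwable);
        String detail = root.getMessage() != null ? root.getMessage() : root.getClass().getSimpleName();
        return prefix + ": " + detail;
    }

    public static void main(String[] args)
    {
        Exception wrapped = new RuntimeException("wrapper", new java.io.IOException("Connection refused"));
        System.out.println(describe("Doris Flight SQL read failed", wrapped));
    }
}
```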
@cla-bot cla-bot Bot added the cla-signed label Apr 28, 2026
@cla-bot cla-bot Bot removed the cla-signed label Apr 28, 2026
@ningsh7 ningsh7 force-pushed the doris-flight-sql-connector branch from a3f0594 to 2299d73 Compare April 28, 2026 14:30
@cla-bot cla-bot Bot added the cla-signed label Apr 28, 2026