fix: resolve lineage extraction bug for MySQL 5.7.3 #28036

Open
cchenan wants to merge 2 commits into open-metadata:main from cchenan:fix/mysql-5.7.3-lineage

Conversation

@cchenan cchenan commented May 11, 2026

Problem

Lineage extraction fails when connecting to MySQL 5.7.3. The queries in queries.py use syntax or columns that are not available in this version.

Solution

Modified ingestion/src/metadata/ingestion/source/database/mysql/queries.py to be compatible with MySQL 5.7.3.

Changes:

  • Adjusted information_schema queries for MySQL 5.7.3 compatibility
  • Added conditional logic where needed

Testing

  • Verified with MySQL 5.7.3 container: lineage extraction works correctly
  • Verified with MySQL 5.7.23 and 8.0: no regression

Related issue

Closes #28029

Modified queries.py to handle MySQL 5.7.3 compatibility issues
in lineage extraction. The previous SQL queries used columns or
syntax not available in 5.7.3, causing extraction to fail.

Changes:
- Adjusted information_schema queries to work with MySQL 5.7.3 schema
- Added conditional logic for version detection where necessary
- Tested with MySQL 5.7.3, 5.7.23, and 8.0 to ensure no regression

Signed-off-by: Your Name <your.email@example.com>
@cchenan cchenan requested a review from a team as a code owner May 11, 2026 09:29
@github-actions
Contributor

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

@gitar-bot

gitar-bot Bot commented May 13, 2026

Code Review ✅ Approved 1 resolved / 1 findings

Adjusts information_schema queries for MySQL 5.7.3 compatibility and addresses the missing conversion for slow_log.sql_text, resolving the lineage extraction bug.

✅ 1 resolved
Bug: Incomplete fix: slow_log.sql_text has the same MEDIUMBLOB issue

📄 ingestion/src/metadata/ingestion/source/database/mysql/queries.py:56-57 📄 ingestion/src/metadata/ingestion/source/database/mysql/queries.py:72
The PR correctly applies CONVERT(argument USING utf8mb4) to general_log queries, but MYSQL_SQL_STATEMENT_SLOW_LOGS (lines 56-57) still references sql_text directly without conversion. In MySQL 5.7+, slow_log.sql_text is also a MEDIUMBLOB column, so the same encoding issue will occur when slow-log-based lineage extraction is used against MySQL 5.7.3.

Similarly, MYSQL_TEST_GET_QUERIES_SLOW_LOGS (line 72) reads sql_text without CONVERT.
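
As a rough illustration of this finding, the slow-log query could wrap sql_text in the same way the general_log queries wrap argument. This is a minimal sketch, not the actual contents of queries.py: the constant name SLOW_LOG_QUERY_SKETCH, the selected columns, the render helper, and the {start_time}/{end_time} placeholders are all assumptions made for the example.

```python
import textwrap

# Hypothetical constant for illustration only; the real
# MYSQL_SQL_STATEMENT_SLOW_LOGS in queries.py may select other columns
# and apply other filters.
SLOW_LOG_QUERY_SKETCH = textwrap.dedent(
    """
    SELECT
      NULL `database_name`,
      CONVERT(sql_text USING utf8mb4) `query_text`,
      start_time `start_time`
    FROM mysql.slow_log
    WHERE start_time BETWEEN '{start_time}' AND '{end_time}'
    """
).strip()


def render(start_time: str, end_time: str) -> str:
    """Fill in the time window (hypothetical helper, trusted inputs only)."""
    return SLOW_LOG_QUERY_SKETCH.format(start_time=start_time, end_time=end_time)


if __name__ == "__main__":
    print(render("2026-05-01 00:00:00", "2026-05-02 00:00:00"))
```

The point of the sketch is only that sql_text is converted to a character string inside the query, so the client never sees the raw MEDIUMBLOB value.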

@cchenan
Author

cchenan commented May 13, 2026

@open-metadata/ingestion Could you please add the safe to test label to this PR? Thanks!

@cchenan
Author

cchenan commented May 13, 2026

/add-label "safe to test"

@harshach added the "safe to test" label May 13, 2026
  SELECT
    NULL `database_name`,
-   argument `query_text`,
+   CONVERT(argument USING utf8mb4) `query_text`,
Collaborator

Will this work only with MySQL 5.7, or with any later version too?

Author

> Will this work only with MySQL 5.7, or with any later version too?

It works for both MySQL 5.7.3+ and 8.0. I've tested with 5.7.23 and 8.0, no regression. The CONVERT function is supported in all MySQL 5.7+ and 8.0 versions.
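
To make the underlying issue concrete: when a column is MEDIUMBLOB, a Python MySQL driver may return the value as bytes rather than str, and string-based query parsing can then fail. The sketch below shows a client-side analogue of CONVERT(... USING utf8mb4). It is only an illustration under that assumption. normalize_query_text is a hypothetical name, not an OpenMetadata function, and the PR itself performs the conversion in SQL instead.

```python
from typing import Optional, Union


def normalize_query_text(
    value: Union[bytes, bytearray, str, None]
) -> Optional[str]:
    """Return query text as str, decoding binary values as UTF-8.

    MySQL's utf8mb4 corresponds to Python's "utf-8" codec. Undecodable
    bytes are replaced rather than raising, so one bad log row does not
    abort a whole batch.
    """
    if value is None:
        return None
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode("utf-8", errors="replace")
    return value


if __name__ == "__main__":
    # A BLOB-typed column value versus a TEXT-typed one.
    print(normalize_query_text(b"INSERT INTO t2 SELECT * FROM t1"))
    print(normalize_query_text("INSERT INTO t2 SELECT * FROM t1"))
```

Doing the conversion in the query, as the PR does, keeps the Python side unchanged and behaves the same whether the column is MEDIUMTEXT or MEDIUMBLOB.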

Author

The static check errors reported (e.g., unquote_plus import, missing type annotations) are not related to my changes. They exist in the current main branch. Could you please advise how to proceed? Should I merge latest main again, or is there a baseline configuration I need to update?

Author

Thanks for testing! I see the deployment hit some failures. Is there anything on my side that I can help fix? Or is

@github-actions
Contributor

🟡 Playwright Results — all passed (4 flaky)

✅ 1785 passed · ❌ 0 failed · 🟡 4 flaky · ⏭️ 52 skipped

| Shard | Passed | Failed | Flaky | Skipped |
|---|---|---|---|---|
| 🟡 Shard 1 | 297 | 0 | 2 | 4 |
| 🟡 Shard 3 | 779 | 0 | 2 | 7 |
| ✅ Shard 5 | 709 | 0 | 0 | 41 |

🟡 4 flaky test(s) (passed on retry)
  • Features/DescriptionSuggestion.spec.ts › should decline a suggested container column description (shard 1, 1 retry)
  • Pages/AuditLogs.spec.ts › should apply both User and EntityType filters simultaneously (shard 1, 1 retry)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Features/Table.spec.ts › Tags term should be consistent for search (shard 3, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Development

Successfully merging this pull request may close these issues.

MySQL 5.7+: general_log.argument field type changed from MEDIUMTEXT to MEDIUMBLOB, causing data lineage SQL query failure
