[BugFix] Resolve inconsistent global dictionary generation in flat JSON#72953

Merged: kangkaisen merged 7 commits into main from global_dict_inconsistent on May 13, 2026.

@stdpain stdpain commented May 8, 2026

Why I'm doing:

The heterogeneous JSON detection logic did not correctly handle two cases:

  • a load batch did not generate a dictionary
  • a JSON path was extracted as a non-string type (such as int)

As a result, later dictionary collection could incorrectly ignore these values, causing dictionary loss and inconsistency across loads.
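
The failure mode described above can be illustrated with a minimal sketch. It is not the code changed in this PR: `ExtractedType`, `BatchPathInfo`, and `collect_global_dict` are hypothetical names invented for illustration, and the real logic in the BE may be structured quite differently. The sketch assumes each load batch reports, per flattened JSON path, the extracted type and an optional dictionary. Rather than skipping batches that have no dictionary or a non-string type, it reports the path as heterogeneous so that no partial dictionary is produced.

```cpp
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical per-batch summary of one flattened JSON path.
// None of these names come from the StarRocks code base.
enum class ExtractedType { kString, kInt, kOther };

struct BatchPathInfo {
    ExtractedType type;
    // Empty optional: the batch produced no dictionary for this path.
    std::optional<std::set<std::string>> dict;
};

// Returns the merged dictionary, or std::nullopt when the path is
// heterogeneous and no consistent global dictionary can be built.
std::optional<std::set<std::string>> collect_global_dict(
        const std::vector<BatchPathInfo>& batches) {
    std::set<std::string> merged;
    for (const auto& b : batches) {
        // Silently skipping such a batch would drop its values from the
        // dictionary, so the whole path is reported as heterogeneous.
        if (b.type != ExtractedType::kString || !b.dict.has_value()) {
            return std::nullopt;
        }
        merged.insert(b.dict->begin(), b.dict->end());
    }
    return merged;
}
```

Under this assumption, two string batches with dictionaries {"a"} and {"b"} merge into a two-entry dictionary, while adding one int-typed batch makes the function return `std::nullopt` instead of a dictionary that omits that batch's values.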

What I'm doing:

Fixes #issue

Signed-off-by: stdpain <drfeng08@gmail.com>
@stdpain stdpain requested review from a team as code owners May 8, 2026 03:16
@github-actions github-actions Bot added the 4.1 label May 8, 2026
@mergify mergify Bot assigned stdpain May 8, 2026
@CelerData-Reviewer

@codex review

@github-actions github-actions Bot requested a review from trueeyu May 8, 2026 03:18

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4de1df92bc

Comment thread test/sql/test_semi/R/test_flat_json_dict Outdated
Comment thread test/sql/test_semi/T/test_flat_json_dict Outdated
stdpain added 2 commits May 8, 2026 11:52
Signed-off-by: stdpain <34912776+stdpain@users.noreply.github.com>
Signed-off-by: stdpain <34912776+stdpain@users.noreply.github.com>
wyb previously approved these changes May 8, 2026
Signed-off-by: stdpain <34912776+stdpain@users.noreply.github.com>
Signed-off-by: stdpain <34912776+stdpain@users.noreply.github.com>
trueeyu previously approved these changes May 8, 2026
Signed-off-by: stdpain <34912776+stdpain@users.noreply.github.com>
Signed-off-by: stdpain <34912776+stdpain@users.noreply.github.com>
@github-actions

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

@github-actions

[FE Incremental Coverage Report]

pass : 0 / 0 (0%)

@github-actions

[BE Incremental Coverage Report]

pass : 3 / 3 (100.00%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| 🔵 be/src/storage/rowset/column_iterator.h | 1 | 1 | 100.00% | [] |
| 🔵 be/src/storage/meta_reader.cpp | 1 | 1 | 100.00% | [] |
| 🔵 be/src/storage/rowset/default_value_column_iterator.h | 1 | 1 | 100.00% | [] |

@stdpain stdpain requested a review from a team May 13, 2026 01:56
@stdpain stdpain requested a review from a team May 13, 2026 02:42
@kangkaisen kangkaisen merged commit 4c8f9c6 into main May 13, 2026
60 of 62 checks passed
@kangkaisen kangkaisen deleted the global_dict_inconsistent branch May 13, 2026 02:43
@github-actions

@Mergifyio backport branch-4.1

@github-actions github-actions Bot removed the 4.1 label May 13, 2026
mergify Bot commented May 13, 2026

backport branch-4.1

✅ Backports have been created

Cherry-pick of 4c8f9c6 has failed:

On branch mergify/bp/branch-4.1/pr-72953
Your branch is up to date with 'origin/branch-4.1'.

You are currently cherry-picking commit 4c8f9c6128.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   be/src/storage/rowset/column_iterator.h
	modified:   be/src/storage/rowset/default_value_column_iterator.h
	modified:   test/sql/test_semi/R/test_flat_json_dict

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   be/src/storage/meta_reader.cpp

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

wanpengfei-git pushed a commit that referenced this pull request May 13, 2026
…ON (backport #72953) (#73188)

Signed-off-by: stdpain <34912776+stdpain@users.noreply.github.com>
Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>