Skip to content

fix(entity): null-safe updateColumns for entities without columns#28047

Open
sonika-shah wants to merge 9 commits into
mainfrom
fix/file-patch-null-columns-npe
Open

fix(entity): null-safe updateColumns for entities without columns#28047
sonika-shah wants to merge 9 commits into
mainfrom
fix/file-patch-null-columns-npe

Conversation

@sonika-shah
Copy link
Copy Markdown
Collaborator

@sonika-shah sonika-shah commented May 12, 2026

Summary

PATCH on a File entity without columns (PDF / image / any non-tabular file) returns 500 with NullPointerException: Cannot invoke "java.util.List.iterator()" because "updatedColumns" is null. Any edit — tags, glossary terms, description, owner — triggers it, because FileRepository.entitySpecificUpdate unconditionally calls updateColumns(..., original.getColumns(), updated.getColumns(), ...).

The schema (file.json) lists columns as optional (only id, name, service are required), so updated.getColumns() is legitimately null for non-tabular files.

ColumnEntityUpdater.updateColumns already relies on recordListChange null-coalescing its inputs, but later iterates for (Column updated : updatedColumns) directly — that's the NPE site. This PR null-coalesces both origColumns and updatedColumns at the top of the method, fixing the bug for File and protecting any future ColumnEntityUpdater subclass with optional columns.

Repro

PATCH /api/v1/drives/files/{id}   (any PATCH on a column-less File)
→ 500 "Cannot invoke java.util.List.iterator() because updatedColumns is null"

Stack:

ColumnEntityUpdater.updateColumns(EntityRepository.java)
FileRepository$FileUpdater.entitySpecificUpdate(FileRepository.java:512)
EntityRepository$EntityUpdater.updateInternal(...)

Test plan

  • mvn spotless:check passes for openmetadata-service + openmetadata-integration-tests
  • openmetadata-service compiles
  • openmetadata-integration-tests test-compiles
  • New regression test FileResourceIT#test_patchFileWithoutColumns_doesNotNpe exercises the bug path: create a PDF file with no columns, PATCH the description, expect 200
  • Run FileResourceIT in CI to confirm green

🤖 Generated with Claude Code


Summary by Gitar

  • Fixed NPE in ColumnEntityUpdater:
    • Added null-coalescing for origColumns and updatedColumns in ColumnEntityUpdater to safely handle entities like File that lack column definitions.
  • Expanded integration testing:
    • Added robust coverage for FileResourceIT, DirectoryResourceIT, and SpreadsheetResourceIT including create, delete, and field-based retrieval.
    • Added specific regression tests to ensure PATCH operations on non-tabular entities remain functional.

This will update automatically on new commits.

PATCH on a File without columns (PDF/image/etc.) NPEs in
ColumnEntityUpdater.updateColumns because FileRepository.entitySpecificUpdate
unconditionally invokes the columns updater. recordListChange already
null-coalesces internally, but the subsequent `for (Column updated :
updatedColumns)` iteration does not — any updater calling updateColumns
with a null list hits the same path.

Null-coalesce origColumns and updatedColumns at the top of
ColumnEntityUpdater.updateColumns so any optional-columns entity is safe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 12, 2026 04:48
@github-actions github-actions Bot added backend safe to test Add this label to run secure Github workflows on PRs labels May 12, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a backend NPE when patch-updating entities (notably File) that legitimately have columns == null (e.g., PDFs/images), by making ColumnEntityUpdater.updateColumns(...) null-safe and adding an integration test to prevent regressions.

Changes:

  • Coalesce origColumns and updatedColumns to empty lists at the start of ColumnEntityUpdater.updateColumns(...) to avoid iterating a null list.
  • Add a regression integration test that creates a PDF File without columns and PATCH-updates its description, asserting a successful response.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java Makes column update logic null-safe by normalizing null column lists to empty lists before processing/iteration.
openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/FileResourceIT.java Adds regression test ensuring PATCH/update on column-less File entities does not crash and preserves columns == null.

Add two generic tests to BaseEntityIT that any entity inheriting from it
must pass:

- patch_addClassificationTag_200_OK
- patch_addGlossaryTerm_200_OK

Both create a minimal entity, set a single tag/glossary-term label,
PATCH, and assert the label is present. Gated on supportsTags &&
supportsPatch, so an entity that opts out keeps doing so.

This gives every new EntityRepository subclass (e.g. the next entity
type someone adds) automatic defense against the class of bug where
the tag/glossary PATCH path NPEs on an unrelated optional field
(updateColumns with null columns being the original instance).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Migrate FileResourceIT, DirectoryResourceIT, and SpreadsheetResourceIT
onto BaseEntityIT<T, K> so they automatically inherit the ~60 generic
entity tests (CRUD, owners, tags PATCH, glossary PATCH, soft-delete,
versions, custom extensions, etc.). Add a brand-new FolderResourceIT
covering the previously untested folder entity type.

The previous standalone harnesses were a migration gap from the bulk
"Faster tests" PR (#24948) — only WorksheetResourceIT got the full
BaseEntityIT plumbing; the rest of the drive family did not. This meant
bugs like the updateColumns NPE fixed in this PR did not surface in
the IT suite for File and friends.

Each entity sets feature flags conservatively:
- supportsFollowers/Domains/DataProducts/CustomExtension/BulkAPI/
  DataContract = false (matches the existing minimal surface area)
- Folder additionally sets supportsVersionHistory/GetByVersion = false
  since the FolderResource doesn't expose /versions

Entity-specific tests (column handling, directory hierarchy, root
filter, FQN structure, etc.) are preserved. Existing redundant smoke
tests (createMinimal, deleteById, getByName, ID/name not-found) are
removed since BaseEntityIT covers them.

Also add openmetadata-sdk/.../drives/FolderService.java so Folder has
SDK access (basePath /v1/drive/folders).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 12, 2026

🔴 Playwright Results — 8 failure(s), 13 flaky

✅ 4058 passed · ❌ 8 failed · 🟡 13 flaky · ⏭️ 92 skipped

Shard Passed Failed Flaky Skipped
✅ Shard 1 299 0 0 4
🔴 Shard 2 747 8 7 14
🟡 Shard 3 779 0 2 7
🟡 Shard 4 789 0 1 18
🟡 Shard 5 707 0 2 41
🟡 Shard 6 737 0 1 8

Genuine Failures (failed on all attempts)

Features/DataQuality/IncidentManagerDateFilter.spec.ts › should show "Created At" as the default sort field label (shard 2)
Error: �[2mexpect(�[22m�[31mreceived�[39m�[2m).�[22mtoBe�[2m(�[22m�[32mexpected�[39m�[2m) // Object.is equality�[22m

Expected: �[32m200�[39m
Received: �[31m400�[39m
Features/DataQuality/IncidentManagerDateFilter.spec.ts › should open sort field dropdown on click (shard 2)
Error: �[2mexpect(�[22m�[31mreceived�[39m�[2m).�[22mtoBe�[2m(�[22m�[32mexpected�[39m�[2m) // Object.is equality�[22m

Expected: �[32m200�[39m
Received: �[31m400�[39m
Features/DataQuality/IncidentManagerDateFilter.spec.ts › should switch to "Updated At" and call API with dateField=updatedAt (shard 2)
Error: �[2mexpect(�[22m�[31mreceived�[39m�[2m).�[22mtoBe�[2m(�[22m�[32mexpected�[39m�[2m) // Object.is equality�[22m

Expected: �[32m200�[39m
Received: �[31m400�[39m
Features/DataQuality/IncidentManagerDateFilter.spec.ts › should switch back to "Created at" and call API with dateField=timestamp (shard 2)
Error: �[2mexpect(�[22m�[31mreceived�[39m�[2m).�[22mtoBe�[2m(�[22m�[32mexpected�[39m�[2m) // Object.is equality�[22m

Expected: �[32m200�[39m
Received: �[31m400�[39m
Features/DataQuality/IncidentManagerDateFilter.spec.ts › should close sort dropdown after selecting an option (shard 2)
Error: �[2mexpect(�[22m�[31mreceived�[39m�[2m).�[22mtoBe�[2m(�[22m�[32mexpected�[39m�[2m) // Object.is equality�[22m

Expected: �[32m200�[39m
Received: �[31m400�[39m
Features/DataQuality/IncidentManagerDateFilter.spec.ts › Date picker shows placeholder by default on Incident Manager page (shard 2)
Error: �[2mexpect(�[22m�[31mreceived�[39m�[2m).�[22mtoBe�[2m(�[22m�[32mexpected�[39m�[2m) // Object.is equality�[22m

Expected: �[32m200�[39m
Received: �[31m400�[39m
Features/DataQuality/IncidentManagerDateFilter.spec.ts › Select and clear date range on Incident Manager page (shard 2)
Error: �[2mexpect(�[22m�[31mreceived�[39m�[2m).�[22mtoBe�[2m(�[22m�[32mexpected�[39m�[2m) // Object.is equality�[22m

Expected: �[32m200�[39m
Received: �[31m400�[39m
Features/IncidentManager.spec.ts › Complete Incident lifecycle with table owner (shard 2)
Error: Wait for incident pw_test_case_358a0805 to appear in Incident Manager

�[2mexpect(�[22m�[31mreceived�[39m�[2m).�[22mtoBe�[2m(�[22m�[32mexpected�[39m�[2m) // Object.is equality�[22m

Expected: �[32mtrue�[39m
Received: �[31mfalse�[39m

Call Log:
- Timeout 60000ms exceeded while waiting on the predicate
🟡 13 flaky test(s) (passed on retry)
  • Features/ActivityAPI.spec.ts › Activity event shows the actor who made the change (shard 2, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/BulkImport.spec.ts › Keyboard Delete selection (shard 2, 2 retries)
  • Features/KnowledgeCenter.spec.ts › Article mentions in description should working for Knowledge Center (shard 2, 1 retry)
  • Features/KnowledgeCenterTextEditor.spec.ts › Rich Text Editor - Text Formatting (shard 2, 1 retry)
  • Features/KnowledgeCenterTextEditor.spec.ts › Rich Text Editor - Text Formatting (shard 2, 1 retry)
  • Features/KnowledgeCenterTextEditor.spec.ts › Rich Text Editor - Text Formatting (shard 2, 1 retry)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Features/Workflows/WorkflowOssRestrictions.spec.ts › save-node-configuration-button closes sidebar (local state update) (shard 3, 1 retry)
  • Pages/DomainUIInteractions.spec.ts › Search assets within domain (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Tier Add, Update and Remove (shard 5, 1 retry)
  • Pages/ExplorePageRightPanel_KnowledgeCenter.spec.ts › Should remove user owner for knowledgeCenter (shard 5, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Three real issues surfaced by the inherited generic tests after the
drive-IT migration:

1. FolderResource.list was missing @min(0) / @max(1000000) on the limit
   query param, so negative or excessive values were silently accepted.
   Add validation to match DirectoryResource/WorksheetResource. This
   fixes BaseEntityIT.get_entityListWithInvalidLimit_4xx.

2. FolderResource hard-delete is asynchronous — it kicks off
   deleteByIdAsync and returns 200 before the row is gone. The generic
   delete_entityAsAdmin_hardDelete_200 test asserts the entity is no
   longer fetchable immediately. Override hardDeleteEntity in
   FolderResourceIT to poll on include=deleted until the async delete
   completes.

3. BaseEntityIT.get_deletedEntityVersion_200 calls getVersion(...) but
   only gated on supportsSoftDelete/supportsPatch, missing the
   supportsGetByVersion gate. For entities like Folder that don't expose
   /versions, the test threw UnsupportedOperationException from the
   subclass override. Add the missing gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 12, 2026 11:07
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.

sonika-shah and others added 2 commits May 12, 2026 16:45
- SpreadsheetResourceIT.test_listSpreadsheetsWithRootParameter was
  vacuously passing if the list came back empty (the for-loop body
  silently runs zero times). Add an explicit
  assertFalse(getData().isEmpty()) so a regression in the ?root=true
  filter actually fails the test.

- FolderResource.list parameter description said "1 to 1000000" but
  the annotation is @min(0) — 0 is a valid limit (returns empty page).
  Align the description with the annotation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tyIT migration

The previous migration commit was too aggressive in deleting tests it
assumed BaseEntityIT covers. Restore the entity-specific tests:

SpreadsheetResourceIT (8 restored):
- test_createSpreadsheetWithOptionalFields (displayName + description)
- test_updateSpreadsheet (Spreadsheet-specific path/size fields)
- test_spreadsheetWithWorksheets (@disabled — worksheet relationship)
- test_listSpreadsheetsByService (?service filter)
- test_spreadsheetFQNPatterns (nested directory FQN construction)
- test_spreadsheetsWithAndWithoutDirectory (directory-presence variations)
- test_listSpreadsheetsWithRootParameterAndPagination (root + pagination)
- test_listSpreadsheetsWithRootParameterEmptyResult
- test_listSpreadsheetsWithRootParameterAcrossMultipleServices (@disabled)

FileResourceIT (2 restored):
- test_createFileWithDisplayName
- test_fileWithAllOptionalFields

DirectoryResourceIT (1 restored):
- test_createDirectoryWithAllFields

BaseEntityIT still covers the genuinely redundant tests that stayed
deleted (CRUD smoke tests, get-by-id, get-by-name with fields, delete,
non-existent ID/FQN, fluent-SDK find variants).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 12, 2026 11:23
FolderResource:
- Grammar: "Limit the number folders" -> "Limit the number of folders"

FolderResourceIT:
- Javadoc said Folder "supports ... followers" but supportsFollowers=false.
  Align doc with actual capability flags.
- Awaitility hard-delete poll was catching `Exception` and treating any
  failure as success (would mask transient 500s). Narrow to ApiException
  with statusCode==404; re-throw everything else so Awaitility surfaces
  real errors.

FileResourceIT:
- Pass String IDs to service.update for consistency with the rest of the
  IT suite (createdFile.getId().toString()).
- Compare service references via getFullyQualifiedName() instead of
  getName(); the latter only matches for top-level services and would
  silently break if the reference schema changes.

SpreadsheetResourceIT:
- Import @disabled and use the short annotation form instead of
  @org.junit.jupiter.api.Disabled (project standards prohibit FQNs in
  annotations).
- Strengthen test_listSpreadsheetsWithRootParameter: assert the root
  spreadsheet we created appears in ?root=true results AND the child
  spreadsheet does NOT, so a broken root filter actually fails the test
  instead of passing on an empty list.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Per reviewer request, restore the rest of the tests that the migration
commit removed under the (incorrect) assumption they were fully
redundant with BaseEntityIT.

FileResourceIT (+8): test_createFileMinimal, test_createFileWithDescription,
  test_deleteFile, test_findFileById, test_findFileByNameWithFields,
  test_getFileByNameWithFields, test_getFileWithNonExistentId_shouldFail,
  test_getFileByNameWithNonExistentFQN_shouldFail.

DirectoryResourceIT (+10): test_createDirectoryMinimalRequest, test_getByName,
  test_getByNameWithFields, test_deleteDirectory, test_findDirectoryById,
  test_findDirectoryByName, test_findDirectoryWithFields,
  test_createMultipleDirectories, test_getNonExistentDirectory_fails,
  test_getByNameNonExistent_fails.

SpreadsheetResourceIT (+10): test_createSpreadsheet, test_createSpreadsheetMinimal,
  test_getSpreadsheetById, test_getSpreadsheetByName, test_deleteSpreadsheet,
  test_finderWithFields, test_finderByNameWithFields, test_getByNameWithFields,
  test_createMultipleSpreadsheetsUnderSameService, test_patchSpreadsheetAttributes.

BaseEntityIT generic coverage stays intact — the subclass tests and the
inherited tests now coexist (deliberate overlap, opted in by reviewer).

Counts vs ab53590 (original): File 14→18 (+4 column tests added),
Directory 15→15, Spreadsheet 27→27.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…luded)

Prior commits restored test methods by name but trimmed bodies — e.g.,
dropped `assertNotNull(created)`, `assertNotNull(driveService)`, and other
redundant-but-original assertions. Reviewer asked for the tests "as is",
so this commit replays each restored method body byte-for-byte from
ab53590 (the introducing commit, PR #24948).

Preserved on top of the verbatim bodies:
- FileResourceIT: getFullyQualifiedName() comparison instead of getName()
  in test_createAndGetFile / test_fileWithAllOptionalFields (gitar-bot
  review fix).
- SpreadsheetResourceIT: @disabled (short form) instead of
  @org.junit.jupiter.api.Disabled (gitar-bot review fix). The skip state
  and reasons on test_spreadsheetWithWorksheets and
  test_listSpreadsheetsWithRootParameterAcrossMultipleServices match
  the original — they were disabled in PR #24948 due to backend gaps,
  not by this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 12, 2026 12:57
@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented May 12, 2026

Code Review ✅ Approved 4 resolved / 4 findings

Implements null-safe column updates in ColumnEntityUpdater to prevent NPEs on non-tabular entities, while restoring and expanding integration test coverage. All identified issues regarding test assertions, annotations, and service references were resolved.

✅ 4 resolved
Bug: Vacuous assertion in test_listSpreadsheetsWithRootParameter

📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/SpreadsheetResourceIT.java:315-322
After the refactoring, test_listSpreadsheetsWithRootParameter no longer asserts that the returned list is non-empty. The for-loop over rootSpreadsheets.getData() will pass silently if the list is empty, meaning the test no longer verifies that root-level spreadsheets are actually returned. The old code had assertTrue(rootSpreadsheets.getData().size() >= 2).

Quality: @min(0) contradicts description saying range starts at 1

📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/drive/FolderResource.java:102-106
The @Parameter description says "(1 to 1000000, default = 10)" but @Min(value = 0) allows 0. Either the description should say "0 to 1000000" or the @Min should use value = 1 to match the documented range. A limit of 0 is technically valid (returns empty results) so aligning the description to the annotation is the simpler fix.

Quality: Fully qualified @disabled instead of import

📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/SpreadsheetResourceIT.java:373 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/SpreadsheetResourceIT.java:616
Lines 373 and 616 in SpreadsheetResourceIT.java use @org.junit.jupiter.api.Disabled(...) instead of importing org.junit.jupiter.api.Disabled and using @Disabled. The project's coding standards prohibit fully qualified class names in annotations/code.

Bug: Comparing service FQN against EntityReference.getName()

📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/FileResourceIT.java:385
At line 385, the assertion assertEquals(driveService.getFullyQualifiedName(), createdFile.getService().getName()) compares the service's FQN with EntityReference.getName(). For top-level services these happen to be equal, but semantically the correct comparison is against getFullyQualifiedName() on the reference. If the service were ever nested (or the reference schema changes), this assertion would silently pass for the wrong reason or break unexpectedly.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Comment on lines +8749 to 8753
origColumns = listOrEmpty(origColumns);
updatedColumns = listOrEmpty(updatedColumns);
List<Column> deletedColumns = new ArrayList<>();
List<Column> addedColumns = new ArrayList<>();
HashMap<String, String> originalUpdatedColumnFqns = new HashMap<>();
@sonarqubecloud
Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants