fix(entity): null-safe updateColumns for entities without columns #28047
sonika-shah wants to merge 9 commits into
PATCH on a File without columns (PDF/image/etc.) NPEs in ColumnEntityUpdater.updateColumns because FileRepository.entitySpecificUpdate unconditionally invokes the columns updater. recordListChange already null-coalesces internally, but the subsequent `for (Column updated : updatedColumns)` iteration does not, so any updater calling updateColumns with a null list hits the same path.

Null-coalesce origColumns and updatedColumns at the top of ColumnEntityUpdater.updateColumns so any optional-columns entity is safe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
Fixes a backend NPE when patch-updating entities (notably File) that legitimately have columns == null (e.g., PDFs/images), by making ColumnEntityUpdater.updateColumns(...) null-safe and adding an integration test to prevent regressions.
Changes:
- Coalesce `origColumns` and `updatedColumns` to empty lists at the start of `ColumnEntityUpdater.updateColumns(...)` to avoid iterating a null list.
- Add a regression integration test that creates a PDF `File` without columns and PATCH-updates its description, asserting a successful response.
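The null-coalescing change described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the project's code: the `listOrEmpty` helper here is a hypothetical stand-in for the helper of the same name used in the diff, and columns are reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.List;

class NullSafeColumnsSketch {
  // Hypothetical stand-in for the project's listOrEmpty helper (assumption).
  static <T> List<T> listOrEmpty(List<T> list) {
    return list == null ? new ArrayList<>() : list;
  }

  // Returns names present in updatedColumns but not in origColumns.
  // Either argument may be null, as for a File entity without columns.
  static List<String> addedColumnNames(List<String> origColumns, List<String> updatedColumns) {
    origColumns = listOrEmpty(origColumns);
    updatedColumns = listOrEmpty(updatedColumns);
    List<String> added = new ArrayList<>();
    // Without the coalescing above, this for-each would throw a
    // NullPointerException when updatedColumns is null.
    for (String updated : updatedColumns) {
      if (!origColumns.contains(updated)) {
        added.add(updated);
      }
    }
    return added;
  }

  public static void main(String[] args) {
    System.out.println(addedColumnNames(null, null)); // prints []
    System.out.println(addedColumnNames(null, List.of("id", "name"))); // prints [id, name]
  }
}
```

Normalizing at the top of the method keeps every later loop and lookup free of per-site null checks.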
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java | Makes column update logic null-safe by normalizing null column lists to empty lists before processing/iteration. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/FileResourceIT.java | Adds regression test ensuring PATCH/update on column-less File entities does not crash and preserves columns == null. |
Add two generic tests to BaseEntityIT that any entity inheriting from it must pass:
- patch_addClassificationTag_200_OK
- patch_addGlossaryTerm_200_OK

Both create a minimal entity, set a single tag/glossary-term label, PATCH, and assert the label is present. Gated on supportsTags && supportsPatch, so an entity that opts out keeps doing so.

This gives every new EntityRepository subclass (e.g. the next entity type someone adds) automatic defense against the class of bug where the tag/glossary PATCH path NPEs on an unrelated optional field (updateColumns with null columns being the original instance).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
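The gating idea in this commit can be illustrated with a small plain-Java sketch. It assumes nothing about the real BaseEntityIT beyond the two flag names mentioned above; the class names, the `patchAndReadBackTag` hook, and the string results are all hypothetical, and a real test would use JUnit assumptions rather than return values.

```java
class GatedGenericTestSketch {
  // Hypothetical base class: subclasses inherit the generic check and may opt out via flags.
  abstract static class BaseEntityChecks {
    boolean supportsTags = true;
    boolean supportsPatch = true;

    // Subclass hook (assumption): PATCH the tag onto a minimal entity and return the label read back.
    abstract String patchAndReadBackTag(String tagFqn);

    // Generic check: skipped when the entity opts out, otherwise verifies the label round-trips.
    String patchAddClassificationTag() {
      if (!supportsTags || !supportsPatch) {
        return "skipped";
      }
      String tag = "PII.Sensitive";
      return tag.equals(patchAndReadBackTag(tag)) ? "passed" : "failed";
    }
  }

  static class FileChecks extends BaseEntityChecks {
    @Override
    String patchAndReadBackTag(String tagFqn) {
      return tagFqn; // pretend the PATCH succeeded and the label is present
    }
  }

  static class OptedOutChecks extends BaseEntityChecks {
    OptedOutChecks() {
      supportsTags = false;
    }

    @Override
    String patchAndReadBackTag(String tagFqn) {
      throw new UnsupportedOperationException("tags not supported");
    }
  }

  public static void main(String[] args) {
    System.out.println(new FileChecks().patchAddClassificationTag()); // prints passed
    System.out.println(new OptedOutChecks().patchAddClassificationTag()); // prints skipped
  }
}
```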
Migrate FileResourceIT, DirectoryResourceIT, and SpreadsheetResourceIT onto BaseEntityIT<T, K> so they automatically inherit the ~60 generic entity tests (CRUD, owners, tags PATCH, glossary PATCH, soft-delete, versions, custom extensions, etc.). Add a brand-new FolderResourceIT covering the previously untested folder entity type.

The previous standalone harnesses were a migration gap from the bulk "Faster tests" PR (#24948): only WorksheetResourceIT got the full BaseEntityIT plumbing; the rest of the drive family did not. This meant bugs like the updateColumns NPE fixed in this PR did not surface in the IT suite for File and friends.

Each entity sets feature flags conservatively:
- supportsFollowers/Domains/DataProducts/CustomExtension/BulkAPI/DataContract = false (matches the existing minimal surface area)
- Folder additionally sets supportsVersionHistory/GetByVersion = false since the FolderResource doesn't expose /versions

Entity-specific tests (column handling, directory hierarchy, root filter, FQN structure, etc.) are preserved. Existing redundant smoke tests (createMinimal, deleteById, getByName, ID/name not-found) are removed since BaseEntityIT covers them.

Also add openmetadata-sdk/.../drives/FolderService.java so Folder has SDK access (basePath /v1/drive/folders).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🔴 Playwright Results: 8 failure(s), 13 flaky
✅ 4058 passed · ❌ 8 failed · 🟡 13 flaky · ⏭️ 92 skipped
Genuine Failures (failed on all attempts) ❌
Three real issues surfaced by the inherited generic tests after the drive-IT migration:

1. FolderResource.list was missing @Min(0) / @Max(1000000) on the limit query param, so negative or excessive values were silently accepted. Add validation to match DirectoryResource/WorksheetResource. This fixes BaseEntityIT.get_entityListWithInvalidLimit_4xx.
2. FolderResource hard-delete is asynchronous: it kicks off deleteByIdAsync and returns 200 before the row is gone. The generic delete_entityAsAdmin_hardDelete_200 test asserts the entity is no longer fetchable immediately. Override hardDeleteEntity in FolderResourceIT to poll on include=deleted until the async delete completes.
3. BaseEntityIT.get_deletedEntityVersion_200 calls getVersion(...) but was only gated on supportsSoftDelete/supportsPatch, missing the supportsGetByVersion gate. For entities like Folder that don't expose /versions, the test threw UnsupportedOperationException from the subclass override. Add the missing gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- SpreadsheetResourceIT.test_listSpreadsheetsWithRootParameter was vacuously passing if the list came back empty (the for-loop body silently runs zero times). Add an explicit assertFalse(getData().isEmpty()) so a regression in the ?root=true filter actually fails the test.
- FolderResource.list parameter description said "1 to 1000000" but the annotation is @Min(0), so 0 is a valid limit (returns empty page). Align the description with the annotation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
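The vacuous-pass problem from the first bullet is easy to reproduce in isolation. The sketch below is illustrative only: the method names are hypothetical and the list of booleans stands in for "is this listed spreadsheet a root spreadsheet".

```java
import java.util.List;

class VacuousLoopSketch {
  // Weak check: returns true for an empty list because the loop body never runs.
  static boolean allRootsWeak(List<Boolean> isRootFlags) {
    for (boolean isRoot : isRootFlags) {
      if (!isRoot) {
        return false;
      }
    }
    return true;
  }

  // Stronger check: an empty result is treated as a failure before the loop runs.
  static boolean allRootsStrict(List<Boolean> isRootFlags) {
    if (isRootFlags.isEmpty()) {
      return false;
    }
    return allRootsWeak(isRootFlags);
  }

  public static void main(String[] args) {
    List<Boolean> emptyResult = List.of();
    System.out.println(allRootsWeak(emptyResult)); // prints true (vacuous pass)
    System.out.println(allRootsStrict(emptyResult)); // prints false
  }
}
```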
…tyIT migration

The previous migration commit was too aggressive in deleting tests it assumed BaseEntityIT covers. Restore the entity-specific tests:

SpreadsheetResourceIT (8 restored):
- test_createSpreadsheetWithOptionalFields (displayName + description)
- test_updateSpreadsheet (Spreadsheet-specific path/size fields)
- test_spreadsheetWithWorksheets (@Disabled, worksheet relationship)
- test_listSpreadsheetsByService (?service filter)
- test_spreadsheetFQNPatterns (nested directory FQN construction)
- test_spreadsheetsWithAndWithoutDirectory (directory-presence variations)
- test_listSpreadsheetsWithRootParameterAndPagination (root + pagination)
- test_listSpreadsheetsWithRootParameterEmptyResult
- test_listSpreadsheetsWithRootParameterAcrossMultipleServices (@Disabled)

FileResourceIT (2 restored):
- test_createFileWithDisplayName
- test_fileWithAllOptionalFields

DirectoryResourceIT (1 restored):
- test_createDirectoryWithAllFields

BaseEntityIT still covers the genuinely redundant tests that stayed deleted (CRUD smoke tests, get-by-id, get-by-name with fields, delete, non-existent ID/FQN, fluent-SDK find variants).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
FolderResource:
- Grammar: "Limit the number folders" -> "Limit the number of folders"

FolderResourceIT:
- Javadoc said Folder "supports ... followers" but supportsFollowers=false. Align doc with actual capability flags.
- Awaitility hard-delete poll was catching `Exception` and treating any failure as success (would mask transient 500s). Narrow to ApiException with statusCode==404; re-throw everything else so Awaitility surfaces real errors.

FileResourceIT:
- Pass String IDs to service.update for consistency with the rest of the IT suite (createdFile.getId().toString()).
- Compare service references via getFullyQualifiedName() instead of getName(); the latter only matches for top-level services and would silently break if the reference schema changes.

SpreadsheetResourceIT:
- Import @Disabled and use the short annotation form instead of @org.junit.jupiter.api.Disabled (project standards prohibit FQNs in annotations).
- Strengthen test_listSpreadsheetsWithRootParameter: assert the root spreadsheet we created appears in ?root=true results AND the child spreadsheet does NOT, so a broken root filter actually fails the test instead of passing on an empty list.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
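The narrowed hard-delete poll can be sketched as below. This is a simplified illustration under stated assumptions: a plain retry loop stands in for Awaitility, and `ApiException`, `Fetch`, `isGone`, and `waitUntilGone` are hypothetical names defined here, not the SDK's actual types.

```java
import java.util.concurrent.atomic.AtomicInteger;

class HardDeletePollSketch {
  // Hypothetical stand-in for the SDK exception; only the status code matters here.
  static class ApiException extends RuntimeException {
    final int statusCode;

    ApiException(int statusCode) {
      super("HTTP " + statusCode);
      this.statusCode = statusCode;
    }
  }

  // Hypothetical fetch call: returns normally while the entity exists, throws ApiException otherwise.
  interface Fetch {
    void run();
  }

  // True only when the fetch fails with 404; any other failure is re-thrown, not swallowed.
  static boolean isGone(Fetch fetch) {
    try {
      fetch.run();
      return false; // still fetchable, async delete not finished
    } catch (ApiException e) {
      if (e.statusCode == 404) {
        return true;
      }
      throw e; // e.g. a transient 500 should surface as a real error
    }
  }

  // Plain polling loop standing in for Awaitility (assumption).
  static boolean waitUntilGone(Fetch fetch, int maxAttempts, long sleepMillis) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      if (isGone(fetch)) {
        return true;
      }
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    // Simulated backend: the entity disappears on the third fetch.
    Fetch fetch = () -> {
      if (calls.incrementAndGet() >= 3) {
        throw new ApiException(404);
      }
    };
    System.out.println(waitUntilGone(fetch, 10, 1)); // prints true
  }
}
```

Catching the broad `Exception` type in `isGone` would make a 500 look like a successful delete, which is the masking problem the commit removes.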
Per reviewer request, restore the rest of the tests that the migration commit removed under the (incorrect) assumption they were fully redundant with BaseEntityIT.

FileResourceIT (+8): test_createFileMinimal, test_createFileWithDescription, test_deleteFile, test_findFileById, test_findFileByNameWithFields, test_getFileByNameWithFields, test_getFileWithNonExistentId_shouldFail, test_getFileByNameWithNonExistentFQN_shouldFail.

DirectoryResourceIT (+10): test_createDirectoryMinimalRequest, test_getByName, test_getByNameWithFields, test_deleteDirectory, test_findDirectoryById, test_findDirectoryByName, test_findDirectoryWithFields, test_createMultipleDirectories, test_getNonExistentDirectory_fails, test_getByNameNonExistent_fails.

SpreadsheetResourceIT (+10): test_createSpreadsheet, test_createSpreadsheetMinimal, test_getSpreadsheetById, test_getSpreadsheetByName, test_deleteSpreadsheet, test_finderWithFields, test_finderByNameWithFields, test_getByNameWithFields, test_createMultipleSpreadsheetsUnderSameService, test_patchSpreadsheetAttributes.

BaseEntityIT generic coverage stays intact; the subclass tests and the inherited tests now coexist (deliberate overlap, opted in by reviewer).

Counts vs ab53590 (original): File 14→18 (+4 column tests added), Directory 15→15, Spreadsheet 27→27.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…luded)

Prior commits restored test methods by name but trimmed bodies, e.g., dropped `assertNotNull(created)`, `assertNotNull(driveService)`, and other redundant-but-original assertions. Reviewer asked for the tests "as is", so this commit replays each restored method body byte-for-byte from ab53590 (the introducing commit, PR #24948).

Preserved on top of the verbatim bodies:
- FileResourceIT: getFullyQualifiedName() comparison instead of getName() in test_createAndGetFile / test_fileWithAllOptionalFields (gitar-bot review fix).
- SpreadsheetResourceIT: @Disabled (short form) instead of @org.junit.jupiter.api.Disabled (gitar-bot review fix).

The skip state and reasons on test_spreadsheetWithWorksheets and test_listSpreadsheetsWithRootParameterAcrossMultipleServices match the original: they were disabled in PR #24948 due to backend gaps, not by this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code Review ✅ Approved · 4 resolved / 4 findings
Implements null-safe column updates in ColumnEntityUpdater to prevent NPEs on non-tabular entities, while restoring and expanding integration test coverage. All identified issues regarding test assertions, annotations, and service references were resolved.
✅ 4 resolved
✅ Bug: Vacuous assertion in test_listSpreadsheetsWithRootParameter
✅ Quality: @Min(0) contradicts description saying range starts at 1
✅ Quality: Fully qualified @Disabled instead of import
✅ Bug: Comparing service FQN against EntityReference.getName()
origColumns = listOrEmpty(origColumns);
updatedColumns = listOrEmpty(updatedColumns);
List<Column> deletedColumns = new ArrayList<>();
List<Column> addedColumns = new ArrayList<>();
HashMap<String, String> originalUpdatedColumnFqns = new HashMap<>();
Summary
PATCH on a File entity without columns (PDF / image / any non-tabular file) returns 500 with `NullPointerException: Cannot invoke "java.util.List.iterator()" because "updatedColumns" is null`. Any edit (tags, glossary terms, description, owner) triggers it, because `FileRepository.entitySpecificUpdate` unconditionally calls `updateColumns(..., original.getColumns(), updated.getColumns(), ...)`.

The schema (`file.json`) lists `columns` as optional (only `id`, `name`, `service` are required), so `updated.getColumns()` is legitimately `null` for non-tabular files. `ColumnEntityUpdater.updateColumns` already relies on `recordListChange` null-coalescing its inputs, but later iterates `for (Column updated : updatedColumns)` directly; that's the NPE site. This PR null-coalesces both `origColumns` and `updatedColumns` at the top of the method, fixing the bug for File and protecting any future `ColumnEntityUpdater` subclass with optional columns.

Repro
Stack:
Test plan
- `mvn spotless:check` passes for openmetadata-service + openmetadata-integration-tests
- `FileResourceIT#test_patchFileWithoutColumns_doesNotNpe` exercises the bug path: create a PDF file with no columns, PATCH the description, expect 200
- `FileResourceIT` in CI to confirm green

🤖 Generated with Claude Code
Summary by Gitar
- `ColumnEntityUpdater`: null-safe handling of `origColumns` and `updatedColumns` in `ColumnEntityUpdater` to safely handle entities like `File` that lack column definitions.
- Integration tests for `FileResourceIT`, `DirectoryResourceIT`, and `SpreadsheetResourceIT` including create, delete, and field-based retrieval.