feat: Enhance Keycloak SCIM Provider with Group Filtering, Mapping Existing Users and Groups#164
feat: Enhance Keycloak SCIM Provider with Group Filtering, Mapping Existing Users and Groups#164dsvh wants to merge 60 commits intomitodl:mainfrom
Conversation
…485-keycloak-groups-filter
- Handle 405 Method Not Allowed errors by falling back to PATCH for groups - Handle 404 Not Found errors by creating the resource instead of updating - Properly update existing mappings when recreating deleted resources
- Handle 405 Method Not Allowed errors by falling back to PATCH for groups - Handle 404 Not Found errors by creating resources instead of updating - Properly update existing mappings when recreating deleted resources - Fix compilation issues with return statements
- Enhanced error handling in ScimClient.java to properly handle 400/404 responses for missing resources, with fallback from PUT to PATCH for groups and resource creation when needed - Updated UI configuration labels from 'Use patchOp for groups/users' to 'Use PATCH for groups/users' for better clarity - Improved help text to explicitly explain PATCH vs PUT operations
- Changed use-email-as-username boolean config to username-source string config with options 'username' or 'email' - Updated UserAdapter to use email as userName when configured, falling back to username - Made getModel() method protected in Adapter class for access in subclasses
- Add getResourceInfo() method to provide detailed logging for users (username/email) and groups (name/id) - Update refreshResources() to log specific resources being processed - Update importResources() to log detailed information about each resource being imported - Include email information in sync logs for better debugging
… groups - Add ScimSynchronizationResult class that extends SynchronizationResult with lists for added/updated/removed/failed users and groups - Update ScimClient sync methods to track detailed information instead of just counts - Maintain backward compatibility by falling back to standard counters when ScimSynchronizationResult is not used - Enhanced logging shows specific users/groups being processed with usernames/emails and group names
…equest for Databricks compatibility
- Simplified the 405 fallback logic to use single PATCH operation for members - Removed complex multi-PATCH logic that was causing compilation issues - Members are still updated correctly, addressing the main sync issue - DisplayName and externalId updates can be handled via full PUT on creation
- Updated GroupAdapter.toSCIM() to use Databricks user ID from mapping as member value - Updated GroupAdapter.toPatchBuilder() to use Databricks user ID for PATCH operations - Updated GroupAdapter.apply() to map Databricks user IDs back to Keycloak user IDs - This fixes the 'Error finding group member' issue in Databricks SCIM sync
- Add UI config flags for map-existing-users and map-existing-groups - Implement 409 error handling in ScimClient.create() with SCIM list queries - Add client-side filtering fallback when server-side filtering fails - Enhance ScimSynchronizationResult to track mapped resources - Add debug logging for config values and list request behavior
- Add pagination support when fetching all users for client-side email matching - Prevent missing existing users on later pages during map-existing-users sync - Add necessary imports for ArrayList and List
- Changed mapping logic to trigger on any 4xx status code when map-existing-users is enabled - Previously only attempted mapping on 409 (conflict), but Databricks may return other 4xx codes for duplicate users - This ensures existing users are mapped even if create fails for reasons other than explicit conflict
- Check for existing users before creation when map-existing-users is enabled - Implement pagination to fetch all users for complete matching - Prevent duplicate user creation and ensure proper linking
…void filter issues
… for fetch failures
…Refresh resources', adapter type logs, and HTTP status codes
feat: Enhance Keycloak SCIM Provider with Group Filtering, Mapping Existing Users and Groups
Summary of ChangesHello @dsvh, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the Keycloak SCIM provider by introducing advanced group filtering capabilities and more robust user and group synchronization logic. It improves compatibility with external SCIM providers like Databricks by addressing specific API behaviors and introduces new configuration options for mapping existing resources and controlling username sources. The changes aim to make the SCIM provider more flexible, reliable, and easier to troubleshoot, especially in enterprise environments with complex synchronization requirements. Highlights
Ignored Files
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces significant enhancements to the Keycloak SCIM provider, including group filtering, mapping of existing resources, and more robust error handling. The changes are extensive and well-structured, improving both functionality and reliability. My review focuses on a few key areas for improvement. I've identified a critical issue regarding the lack of pagination handling when fetching resources from the SCIM server, which could lead to incomplete data synchronization with larger datasets. Additionally, I've provided suggestions to improve logging practices and optimize resource fetching logic for better performance. Overall, these are excellent enhancements, and addressing the feedback will make the provider even more robust.
- Add fallback logic to create groups if PATCH update fails with 404/400 - Ensures groups are created automatically when they don't exist in Databricks - Fixes issue where groups like databricks-admins-stage were not being created
BR-485: Fix group creation when missing in Databricks during sync
Enhance Keycloak SCIM Provider with Group Filtering
Overview
This pull request introduces significant enhancements to the Keycloak SCIM (System for Cross-domain Identity Management) provider, focusing on improved group filtering capabilities and robust integration with Databricks SCIM API. The changes address multiple issues related to user and group synchronization, error handling, and performance.
Key Changes
🚀 New Features
username-sourceconfig option to control how SCIMuserNamefields are populated (email, username, or custom attribute)🐛 Bug Fixes
🔧 Technical Improvements
📝 Configuration Options
group-filter: Regex pattern for filtering groups during syncusername-source: Source for SCIM userName field (email/username/custom)map-existing-users: Enable mapping to existing users on sync conflictsforce-overwrite: Force overwrite of existing resourcesFiles Modified
Adapter.java,GroupAdapter.java,ScimClient.java,UserAdapter.javaScimStorageProviderFactory.java,ScimSynchronizationResult.javabuild.gradle, Gradle wrapper, GitHub workflowsscim-resource-changelog.xmlTesting
Version
Bumped version to 1.4 for this release.
Related Issues
This PR significantly improves the reliability and flexibility of SCIM-based identity synchronization between Keycloak and Databricks, making it production-ready for enterprise deployments.