feat(rdf): SPARQL playground, MCP knowledge-graph tools, insights endpoints#28042
harshach wants to merge 1 commit into
✅ TypeScript Types Auto-Updated: The generated TypeScript types have been automatically updated based on JSON schema changes in this PR.
The Python checkstyle failed. Please run You can install the pre-commit hooks with |
Pull request overview
This PR expands OpenMetadata’s RDF/knowledge-graph capabilities end-to-end: UI support for running SPARQL, spec-level ontology/context updates, and service/MCP tooling for querying, validation (SHACL), federation-guarding, inference rules, and “insights” SPARQL query builders.
Changes:
- Adds SPARQL playground client support + routing, plus supporting UI enums/interfaces and locale keys.
- Introduces/extends RDF specs (ontology changelog, JSON-LD contexts, RDF configuration schema) and adds inference-rule + federation configuration shapes.
- Adds service-side RDF utilities (ontology document serving, SHACL validator, usage/quality/activity mappers, insights query builders) and MCP tools that expose read-only SPARQL + graph utilities.
Reviewed changes
Copilot reviewed 97 out of 100 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| openmetadata-ui/src/main/resources/ui/src/rest/rdfAPI.ts | Adds SPARQL playground REST helper (runSparqlQuery) and result typing. |
| openmetadata-ui/src/main/resources/ui/src/pages/SparqlPlayground/SparqlPlayground.interface.ts | Adds local types/constants for saved queries and sample queries. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/en-us.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/zh-tw.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/zh-cn.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/tr-tr.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/th-th.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/ru-ru.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/pt-pt.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/pt-br.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/pr-pr.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/nl-nl.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/mr-in.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/ko-kr.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/ja-jp.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/he-he.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/gl-es.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/fr-fr.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/es-es.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/de-de.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/ar-sa.json | Adds SPARQL playground-related labels/messages; removes one label key. |
| openmetadata-ui/src/main/resources/ui/src/enums/codemirror.enum.ts | Adds CodeMirror mode value for SPARQL. |
| openmetadata-ui/src/main/resources/ui/src/constants/constants.ts | Adds route constant for SPARQL playground. |
| openmetadata-ui/src/main/resources/ui/src/components/KnowledgeGraph/KnowledgeGraph.tsx | Refactors metadata-mode body rendering; removes a header label. |
| openmetadata-ui/src/main/resources/ui/src/components/AppRouter/AuthenticatedAppRouter.tsx | Registers lazy-loaded SPARQL playground route. |
| openmetadata-spec/src/main/resources/rdf/ontology/openmetadata-prov.ttl | Adds missing RDF prefix in PROV extension TTL. |
| openmetadata-spec/src/main/resources/rdf/ontology/CHANGELOG.md | Introduces ontology changelog documenting fidelity changes. |
| openmetadata-spec/src/main/resources/rdf/contexts/lineage.jsonld | Adjusts lineage JSON-LD mappings to avoid predicate collisions. |
| openmetadata-spec/src/main/resources/rdf/contexts/governance.jsonld | Aligns governance mappings with SKOS predicates and adds new fields. |
| openmetadata-spec/src/main/resources/rdf/contexts/automation.jsonld | Adds JSON-LD context for automation/workflow entities. |
| openmetadata-spec/src/main/resources/rdf/contexts/ai.jsonld | Adds JSON-LD context for AI/MCP-related entities. |
| openmetadata-spec/src/main/resources/json/schema/api/configuration/rdfConfiguration.json | Adds federation config block to RDF configuration schema. |
| openmetadata-spec/src/main/resources/json/schema/api/configuration/rdf/inferenceRule.json | Adds schema for inference rule objects (CONSTRUCT/RDFS placeholder). |
| openmetadata-spec/src/main/resources/json/schema/api/configuration/rdf/customOntology.json | Adds schema for custom ontology extension definitions. |
| openmetadata-service/src/test/java/org/openmetadata/service/resources/rdf/RdfShaclValidatorTest.java | Adds SHACL validation tests for key shape constraints. |
| openmetadata-service/src/test/java/org/openmetadata/service/resources/rdf/OntologyDocumentTest.java | Adds tests for ontology serving endpoint serialization. |
| openmetadata-service/src/test/java/org/openmetadata/service/rdf/translator/RdfUsageMapperTest.java | Adds tests for RDF usage summary mapping. |
| openmetadata-service/src/test/java/org/openmetadata/service/rdf/insights/RecommendationsQueryBuilderTest.java | Adds tests for recommendations SPARQL builder validation/shape. |
| openmetadata-service/src/test/java/org/openmetadata/service/rdf/insights/CoOccurrenceQueryBuilderTest.java | Adds tests for co-occurrence/popularity/reach SPARQL builders. |
| openmetadata-service/src/main/resources/rdf/inference-rules/transitive-lineage-closure.json | Adds starter inference rule JSON. |
| openmetadata-service/src/main/resources/rdf/inference-rules/schema-tag-inheritance.json | Adds starter inference rule JSON. |
| openmetadata-service/src/main/resources/rdf/inference-rules/pii-propagation-via-lineage.json | Adds starter inference rule JSON. |
| openmetadata-service/src/main/resources/rdf/inference-rules/domain-membership-inheritance.json | Adds starter inference rule JSON. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/rdf/RdfShaclValidator.java | Adds SHACL shapes loader + validator helper. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/rdf/OntologyDocument.java | Adds merged ontology document loader + serializer for multiple formats. |
| openmetadata-service/src/main/java/org/openmetadata/service/rdf/translator/RdfUsageMapper.java | Adds mapper emitting usage-summary triples. |
| openmetadata-service/src/main/java/org/openmetadata/service/rdf/translator/RdfQualityMapper.java | Adds mapper emitting DQV quality measurement resources. |
| openmetadata-service/src/main/java/org/openmetadata/service/rdf/translator/RdfActivityMapper.java | Adds PROV activity mapping for pipeline runs. |
| openmetadata-service/src/main/java/org/openmetadata/service/rdf/translator/JsonLdTranslator.java | Loads new JSON-LD contexts and assigns column IDs for named column resources. |
| openmetadata-service/src/main/java/org/openmetadata/service/rdf/RdfUtils.java | Adds RDF types for new entities and a helper to mint column URIs. |
| openmetadata-service/src/main/java/org/openmetadata/service/rdf/insights/RecommendationsQueryBuilder.java | Adds recommendations SPARQL builder. |
| openmetadata-service/src/main/java/org/openmetadata/service/rdf/insights/ImportanceQueryBuilder.java | Adds importance ranking SPARQL builder. |
| openmetadata-service/src/main/java/org/openmetadata/service/rdf/insights/CoOccurrenceQueryBuilder.java | Adds co-occurrence/popularity/reach SPARQL builders. |
| openmetadata-service/src/main/java/org/openmetadata/service/rdf/inference/InferenceRuleValidator.java | Adds strict validation for inference rule bodies. |
| openmetadata-service/src/main/java/org/openmetadata/service/rdf/inference/InferenceRuleRegistry.java | Adds in-memory inference rule registry + starter pack loader. |
| openmetadata-service/src/main/java/org/openmetadata/service/rdf/federation/SparqlFederationGuard.java | Adds SERVICE allowlist enforcement for federated SPARQL. |
| openmetadata-service/src/main/java/org/openmetadata/service/rdf/extension/CustomOntologyRegistry.java | Adds in-memory registry for custom ontology extensions. |
| openmetadata-mcp/src/test/java/org/openmetadata/mcp/tools/OntologyDescribeToolTest.java | Adds tests for ontology describe MCP tool. |
| openmetadata-mcp/src/test/java/org/openmetadata/mcp/tools/FindByTagToolTest.java | Adds tests for find-by-tag MCP tool. |
| openmetadata-mcp/src/test/java/org/openmetadata/mcp/tools/EntityNeighborhoodToolTest.java | Adds tests for entity neighborhood MCP tool. |
| openmetadata-mcp/src/main/resources/json/data/mcp/tools.json | Registers new MCP tool schemas for KG/SPARQL features. |
| openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/SparqlQueryTool.java | Adds read-only SPARQL MCP tool with federation guard + truncation. |
| openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/ShaclValidateTool.java | Adds SHACL validation MCP tool (entity-scoped or full graph). |
| openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/OntologyDescribeTool.java | Adds ontology describe MCP tool (full ontology or DESCRIBE). |
| openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/FindByTagTool.java | Adds MCP tool to find entities by tag/glossary FQN. |
| openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/DefaultToolContext.java | Wires new MCP tools into tool dispatcher. |
| ingestion/src/metadata/workflow/profiler.py | Adds optional ontology emission step to profiler workflow. |
| docker/development/docker-compose-postgres-fuseki.yml | Adds RDF endpoint env var for Fuseki-based dev stack. |
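The `SparqlFederationGuard` and `SparqlQueryTool` rows above mention SERVICE allowlist enforcement. A minimal sketch of that idea, assuming a regex scan rather than a full SPARQL parse; the class name `FederationGuardSketch`, the method `violations`, and the example endpoint are hypothetical and not taken from the PR. The sketch fails closed: any SERVICE target that is not a literal `<iri>` on the allowlist (a variable, a prefixed name, or a comment trick) is reported.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the real SparqlFederationGuard may work differently.
final class FederationGuardSketch {
  // SERVICE, optional SILENT, then either a literal <iri> (group 2) or any other token.
  private static final Pattern SERVICE =
      Pattern.compile(
          "\\bSERVICE\\b\\s*(?:SILENT\\b\\s*)?(<([^>]*)>|\\S+)", Pattern.CASE_INSENSITIVE);

  // Returns every SERVICE target that is not a literal IRI on the allowlist.
  static List<String> violations(String sparql, Set<String> allowedEndpoints) {
    List<String> rejected = new ArrayList<>();
    Matcher m = SERVICE.matcher(sparql);
    while (m.find()) {
      String iri = m.group(2); // null when the target is not written as <iri>
      if (iri == null || !allowedEndpoints.contains(iri)) {
        rejected.add(m.group(1));
      }
    }
    return rejected;
  }

  public static void main(String[] args) {
    Set<String> allowed = Set.of("https://query.wikidata.org/sparql");
    String query = "SELECT * WHERE { SERVICE <http://evil.example/sparql> { ?s ?p ?o } }";
    System.out.println(violations(query, allowed));
  }
}
```

Because the scan is textual, the word SERVICE inside a string literal or comment is also flagged; that is a false positive rather than a bypass, which seems the safer trade-off for a guard.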
Comments suppressed due to low confidence (1)
openmetadata-service/src/main/java/org/openmetadata/service/rdf/RdfUtils.java:20
- New RDF types were added for `workflowinstance`, `agentexecution`, `mcpexecution`, `automation`, etc., but `PROV_ACTIVITY_TYPES` was not updated. Since `JsonLdTranslator.toRdf()` relies on `getProvType()` to add `prov:Activity` typing (when the primary rdf:type is not already a PROV class), these new execution-like entities will miss `rdf:type prov:Activity`, reducing PROV-O compatibility and potentially breaking queries that filter by `prov:Activity`. Please extend the PROV type sets to cover the new entity types.

      private static final Set<String> PROV_ACTIVITY_TYPES =
          Set.of(
              "pipeline",
              "ingestionpipeline",
              "storedprocedure",
              "dbtpipeline",
              "workflow",
              "pipelinerun");
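One possible shape for the requested change, as a sketch under assumptions: the class `ProvTypesSketch` and the helper `isProvActivity` are hypothetical, and the choice of which new types count as activities (`workflowinstance`, `agentexecution`, `mcpexecution` here, with `automation` left out) is a guess that the PR author would need to confirm.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch, not the actual RdfUtils code.
final class ProvTypesSketch {
  // The six existing entries plus the execution-like types named in the
  // review comment (assumed, not confirmed, to be PROV activities).
  static final Set<String> PROV_ACTIVITY_TYPES =
      Set.of(
          "pipeline",
          "ingestionpipeline",
          "storedprocedure",
          "dbtpipeline",
          "workflow",
          "pipelinerun",
          "workflowinstance",
          "agentexecution",
          "mcpexecution");

  // True when the entity type should also carry rdf:type prov:Activity.
  static boolean isProvActivity(String entityType) {
    return entityType != null
        && PROV_ACTIVITY_TYPES.contains(entityType.toLowerCase(Locale.ROOT));
  }
}
```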
…ntic correctness

Addresses the open review comments on PR #28042. Each fix is independent.

P0: correctness / security
- ingestion/profiler.py: drop residual OntologyEmitter import (module removed in the R2RML pivot). Every pytest that imports profiler.py was failing with ModuleNotFoundError; reverts file to match main.
- RdfResource.validateGraph + ShaclValidateTool: replace `entityUri.replace(">", "")` with strict absolute-http(s)-IRI validation. Reject control chars, whitespace, quotes, and angle brackets up front. Closes the SPARQL-injection vector via newlines / # comments in the DESCRIBE template.
- EntityNeighborhoodTool.buildConstructQuery: rewrite to emit each path edge in its own UNION arm with the correct subject. The previous BIND(<entity> AS ?s) applied across all arms collapsed every multi-hop edge onto the start node.

P1: semantic / robustness
- CommunityComputation.parseGraph: stop emitting both directions; the FILTER in the SPARQL canonicalises pairs and Louvain.addAllEdges symmetrises the adjacency internally. Double symmetrisation was doubling every edge weight.
- RdfActivityMapper: emit `prov:wasInformedBy` (Activity → Activity) instead of `prov:wasGeneratedBy` (Entity → Activity) for pipeline-run → pipeline. Closes the PROV-O domain/range inversion.
- RdfQualityMapper.measurementUri: deterministic URI built from subject + metric + timestamp. Random UUID was leaking orphan QualityMeasurement nodes on every re-emit because deletion only follows subject/object on the entity itself.
- SparqlQueryTool: byte-aware truncation. Previous substring-by-char cap could exceed maxBytes for multi-byte UTF-8 and never enforced a real byte limit.
- ShaclValidateTool: require explicit `fullGraph=true` to validate the entire triplestore; reject otherwise. Prevents accidental OOM on multi-GB graphs.
- RdfResource.getFederationGuard: synchronise the lazy-init path; the volatile field was double-checked without a lock.

P2: code quality
- RdfResource.validateGraph: replace 14 inline FQN class names with proper imports (Model / ModelFactory / Lang / RDFDataMgr / RDFFormat / ValidationReport).
- RdfShaclValidator + OntologyDocument: catch RuntimeException (covers RiotException) in the static-init shapes/ontology load. A corrupt classpath resource now degrades to an empty model instead of failing class init and taking down /rdf/* and MCP describe.
- i18n: restore `label.view-mode` (still referenced by OntologyExplorer/FilterToolbar.tsx) and route the SparqlPlayground sample-query names through translation keys (`label.sparql-sample-*`). All 17 locales synced via `yarn i18n`.

Tests updated where the fix changes observable behaviour:
- EntityNeighborhoodToolTest: assert the new multi-hop subject preservation and the new depth-3 chain variable name.
- CommunityComputationTest: assert the directed-single-edge behaviour (Louvain symmetrises internally).
- RdfPropertyMapperTest: assert `prov:wasInformedBy` instead of `prov:wasGeneratedBy` for the pipeline activity.
- SparqlPlayground.test.tsx: drive the sample button by `nameKey`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
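The strict IRI check described in the P0 items can be sketched roughly as follows. This is an illustration under assumptions, not the PR's validator: the class `IriValidationSketch` and both method names are hypothetical, and the rejected-character set simply follows the list in the commit message (control chars, whitespace, quotes, angle brackets). `java.net.URI` additionally rejects characters such as braces, pipes, and backslashes.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch of validate-then-interpolate for a DESCRIBE template.
final class IriValidationSketch {
  // Accepts only absolute http(s) IRIs free of characters that could
  // terminate the <...> token or start a new line in the SPARQL text.
  static boolean isSafeHttpIri(String candidate) {
    if (candidate == null || candidate.isEmpty()) {
      return false;
    }
    for (int i = 0; i < candidate.length(); i++) {
      char c = candidate.charAt(i);
      if (Character.isISOControl(c)
          || Character.isWhitespace(c)
          || Character.isSpaceChar(c)
          || c == '<'
          || c == '>'
          || c == '"'
          || c == '\'') {
        return false;
      }
    }
    try {
      URI uri = new URI(candidate);
      String scheme = uri.getScheme();
      return uri.isAbsolute()
          && uri.getRawAuthority() != null
          && ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme));
    } catch (URISyntaxException e) {
      return false;
    }
  }

  // Interpolates the IRI only after it has passed validation.
  static String describeQuery(String iri) {
    if (!isSafeHttpIri(iri)) {
      throw new IllegalArgumentException("Not a safe absolute http(s) IRI");
    }
    return "DESCRIBE <" + iri + ">";
  }
}
```

Rejecting bad input outright, rather than stripping characters as the old `replace(">", "")` did, means a newline followed by `#` never reaches the template at all.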
d64a780 to 843894e
…points

Closes the RDF fidelity gaps in the knowledge-graph stack and removes the experimental R2RML row-materialization feature (schema-level concept-graph is the strategy going forward; row triples don't scale).

Adds
- SPARQL playground page (read-only SELECT/ASK/CONSTRUCT/DESCRIBE), multiple result formats, inject-prefixes helper, save/reload queries, sample queries driven by i18n keys.
- Five MCP knowledge-graph tools: SparqlQueryTool (UTF-8 safe byte truncation), EntityNeighborhoodTool (per-arm correct subjects for multi-hop paths), FindByTagTool (matches Tag.tagFQN + GlossaryTerm.fullyQualifiedName), OntologyDescribeTool (full format→MIME mapping + IRI-validated resource), ShaclValidateTool (entity-scoped or explicit fullGraph=true).
- Insights endpoints under /v1/rdf/insights/: importance, communities, shortest-path, recommendations, centrality, co-occurrence.
- Inference-rule registry + starter pack, federated-SPARQL allowlist guard, expanded SHACL shapes + REST validation, custom-ontology upload/extension.
- PROV-O activity mapper (prov:wasInformedBy for run→pipeline; agent IRIs under the entity namespace, never the ontology prefix), DQV quality measurements with deterministic per-(subject, metric, timestamp) URIs, usage mapper, JSON-LD contexts for AI/Automation/Governance, full ontology TTL.

Deletes
- R2RML mapping schema/validator/registry/applier and REST endpoints.
- KnowledgeGraph LinkedData UI and rdfAPI.ts R2RML helpers.
- emitOntologyTriples flag on the profiler pipeline schema.

Security
- New shared RdfIriValidator: every SPARQL DESCRIBE path validates user-supplied IRIs as absolute http(s) and rejects control chars, newlines, # comments, quotes, and angle brackets before template interpolation.
- SparqlFederationGuard lazy-init synchronised.
- RdfShaclValidator + OntologyDocument catch RuntimeException on TTL load.

Tests
- New unit tests for every component; expanded RdfPropertyMapperTest and CommunityComputationTest. SparqlPlayground.test.tsx covers the UI.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
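The "UTF-8 safe byte truncation" noted for SparqlQueryTool could look something like the sketch below. It is an assumption about the approach, not the PR's code: the class `Utf8TruncationSketch` and the method `truncateUtf8` are hypothetical names. The idea is to cut on the encoded bytes and then back up past any continuation bytes so a multi-byte code point is never split.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a byte-limit cap that respects UTF-8 boundaries.
final class Utf8TruncationSketch {
  // Returns the longest prefix of text whose UTF-8 encoding fits in maxBytes.
  static String truncateUtf8(String text, int maxBytes) {
    if (maxBytes < 0) {
      throw new IllegalArgumentException("maxBytes must be non-negative");
    }
    byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
    if (bytes.length <= maxBytes) {
      return text;
    }
    int end = maxBytes;
    // bytes[end] is the first excluded byte. If it is a continuation byte
    // (bit pattern 10xxxxxx), the cut falls inside a code point, so move back
    // until the first excluded byte is a lead byte.
    while (end > 0 && (bytes[end] & 0xC0) == 0x80) {
      end--;
    }
    return new String(bytes, 0, end, StandardCharsets.UTF_8);
  }
}
```

A character-count cap such as `substring(0, n)` bounds UTF-16 units, not bytes, which is why it can overshoot a byte budget on multi-byte text.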
843894e to e932e1c
Describe your changes
This PR closes the RDF fidelity gaps in the knowledge-graph stack and removes the experimental R2RML row-materialization feature (the schema-level concept-graph is the strategy going forward; row triples don't scale and were the wrong target). It adds: a SPARQL playground UI, five MCP knowledge-graph tools (SparqlQueryTool, EntityNeighborhoodTool, FindByTagTool, OntologyDescribeTool, ShaclValidateTool), insights endpoints (PageRank-based importance, Louvain communities, shortest-path explain-lineage, tag/glossary co-occurrence, recommendations), an inference-rule registry with a starter pack (transitive lineage, schema-tag/domain inheritance, PII-propagation-via-lineage), federated-SPARQL allowlist, expanded SHACL shapes + REST validator, custom-ontology upload/extension, PROV-O activity mapper, DQV quality mapper, usage mapper, and JSON-LD context coverage for AI/Automation/Governance entities. It deletes the R2RML mapping schema, validator, registry, applier, materializer workflow, OntologyEmitter sink, Linked-Data UI mode, and `emitOntologyTriples` profiler flag. It also adds the OpenMetadata ontology TTL coverage (skos, dcat, prov, dprod, foaf) for the new entity types and a CHANGELOG.
Type of change
Tests
`SparqlQueryToolTest`, `EntityNeighborhoodToolTest`, `FindByTagToolTest`, `OntologyDescribeToolTest`, `ShaclValidateToolTest`, `CustomOntologyValidatorTest`, `SparqlFederationGuardTest`, `InferenceRuleValidatorTest`, `CentralityComputationTest`, `CommunityComputationTest`, `LouvainTest`, `PageRankTest`, `LineagePathBuilderTest`, `LineagePathFinderTest`, `ImportanceQueryBuilderTest`, `CoOccurrenceQueryBuilderTest`, `RecommendationsQueryBuilderTest`, `RdfShaclValidatorTest`, `OntologyDocumentTest`, `RdfUsageMapperTest`, `RdfJsonLdContextTest`, expanded `RdfPropertyMapperTest`, and `RdfResourceIT`. SPARQL playground covered by `SparqlPlayground.test.tsx`.
UI screen recording / screenshots
SPARQL playground + KnowledgeGraph view: screenshot/recording to be attached on PR.
🤖 Generated with Claude Code
Summary by Gitar
- `SearchIndexExecutor` and `IndexingPipeline` for improved scalability.
- `RedisJobNotifier` for cluster-wide reindex orchestration and added adaptive backoff logic.
- `RdfIriValidator`, `SparqlFederationGuard`, and inference rule registry (`InferenceRuleValidator`) for compliant RDF querying.
- `CentralityComputation`, `CommunityComputation`, and lineage path finding.
- `SparqlQueryTool` and `EntityNeighborhoodTool`.
- `CustomOntologyRegistry` and `OntologyDocument` for custom-ontology extension and JSON-LD context coverage.
- `DbTune` module and related diagnostic utilities.

This will update automatically on new commits.