Poor Alignment with Existing Ontologies and Standards #6

@dragon-ai-agent

Problem

While BioStride mentions ontology integration, the schema relies mostly on custom enums and free-text fields where community ontologies would give greater semantic precision. It has no explicit mappings to OBI, UBERON, or other standards.

Current State

  • Empty mappings: Schema classes show empty "Mappings" sections or just self-references
  • Unmapped enumerations: TechniqueEnum, PreparationTypeEnum have no ontology CURIEs
  • Missing OBI alignment: ExperimentRun, SamplePreparation could map to OBI assay/process terms
  • No technique mappings: SAXS, cryo-EM could reference EDAM or OBI technique terms
  • Undefined quality metrics: Resolution, Rg, completeness lack standardized definitions

Examples of missing mappings:

  • ExperimentRun → could map to OBI "data acquisition"
  • SamplePreparation → could map to OBI sample preparation processes
  • Instrument → could map to an OBI device term (BFO itself offers only upper-level classes such as "material entity")
  • sample_type enum values → could map to OBI or Sample Ontology terms
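The gaps listed above could be found mechanically. The following is a minimal sketch, assuming the schema has already been loaded into plain dictionaries; the `schema` literal, its descriptions, and the `EXAMPLE:` CURIEs are hypothetical stand-ins, not the real BioStride schema. Only the slot names (`mappings`, `exact_mappings`, `class_uri`, `meaning`, and so on) come from LinkML.

```python
# LinkML slots that can carry an ontology mapping on a class.
MAPPING_SLOTS = (
    "class_uri", "mappings", "exact_mappings", "close_mappings",
    "related_mappings", "narrow_mappings", "broad_mappings",
)

# Hypothetical stand-in for a parsed schema; CURIEs are placeholders.
schema = {
    "classes": {
        "ExperimentRun": {"description": "One data collection run"},
        "Instrument": {"exact_mappings": ["EXAMPLE:0000001"]},
    },
    "enums": {
        "TechniqueEnum": {
            "permissible_values": {
                "cryo_em": None,  # a bare YAML key parses to None
                "saxs": {"meaning": "EXAMPLE:0000002"},
            }
        }
    },
}


def unmapped_classes(schema):
    """Return names of classes that have no mapping slot filled in."""
    return sorted(
        name
        for name, cls in schema.get("classes", {}).items()
        if not any((cls or {}).get(slot) for slot in MAPPING_SLOTS)
    )


def unmapped_enum_values(schema):
    """Return (enum, value) pairs whose permissible value has no meaning."""
    missing = []
    for enum_name, enum in schema.get("enums", {}).items():
        for value, body in (enum or {}).get("permissible_values", {}).items():
            if not (body or {}).get("meaning"):
                missing.append((enum_name, value))
    return sorted(missing)


print(unmapped_classes(schema))
print(unmapped_enum_values(schema))
```

A report like this could run in CI so that new classes or enum values without mappings are flagged as they are added.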

Impact

  • Poor interoperability: Cannot easily integrate with other ontology-annotated datasets
  • Semantic isolation: Creates a mini-ontology instead of reusing community standards
  • Query limitations: Cannot leverage semantic web tools for BioStride data
  • Definition gaps: Some classes lack descriptions that could come from ontologies

Suggested Solutions

1. Add ontology mappings to schema classes

Use the LinkML mappings slot, or a more specific variant such as exact_mappings or close_mappings:

  • ExperimentRun → OBI:0001911 (if relevant for data acquisition)
  • SamplePreparation → appropriate OBI process term
  • Instrument → OBI device term (BFO supplies only the upper-level parent, e.g. "material entity")
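To illustrate what a class-level mapping buys, here is a minimal sketch that expands a mapped CURIE into a resolvable IRI. The dictionaries mirror the shape of a LinkML schema but are hypothetical; OBI:0001911 is the identifier quoted in this issue and has not been verified here, and the choice of `exact_mappings` over a weaker slot is an assumption.

```python
# Prefix map as it would appear in the schema's "prefixes" section.
prefixes = {"OBI": "http://purl.obolibrary.org/obo/OBI_"}

# Hypothetical class fragment; the CURIE is quoted from the issue, unverified.
classes = {
    "ExperimentRun": {"exact_mappings": ["OBI:0001911"]},
}


def expand_curie(curie, prefixes):
    """Expand 'PREFIX:local' into a full IRI using the prefix map."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"cannot expand CURIE: {curie!r}")
    return prefixes[prefix] + local


for name, cls in classes.items():
    for curie in cls.get("exact_mappings", []):
        print(name, "->", expand_curie(curie, prefixes))
```

Once every class carries such a mapping, downstream tools can join BioStride records with other OBI-annotated datasets on the expanded IRI.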

2. Embed ontology identifiers in enumerations

Add a meaning field to each permissible value:

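  • TechniqueEnum.cryo_em → EDAM:operation_364 "cryo EM" (ID to be verified; EDAM identifiers normally have four digits)
  • TechniqueEnum.saxs → EDAM:operation_3450 (or similar)
  • The generated docs currently show "None" as the meaning for these values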
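As a rough sketch of how a meaning would be consumed, the lookup below resolves a stored enum value to its ontology CURIE. The dictionary mirrors LinkML's permissible_values layout, but the descriptions, the `other` value, and the `EXAMPLE:` CURIEs are placeholders, since the correct EDAM or OBI identifiers still need to be confirmed.

```python
# Hypothetical enum definition; CURIEs are placeholders, not real terms.
TECHNIQUE_ENUM = {
    "cryo_em": {
        "description": "Cryogenic electron microscopy",
        "meaning": "EXAMPLE:0000001",
    },
    "saxs": {
        "description": "Small-angle X-ray scattering",
        "meaning": "EXAMPLE:0000002",
    },
    "other": {"description": "Technique without an ontology term yet"},
}


def meaning_of(enum, value):
    """Return the CURIE attached to an enum value, or None if unmapped."""
    if value not in enum:
        raise KeyError(f"not a permissible value: {value!r}")
    return (enum[value] or {}).get("meaning")


print(meaning_of(TECHNIQUE_ENUM, "saxs"))
print(meaning_of(TECHNIQUE_ENUM, "other"))
```

The None returned for an unmapped value corresponds to the "None" currently shown in the generated documentation.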

3. Expand OntologyTerm usage

Replace hardcoded enums with OntologyTerm references:

  • sample_type → OntologyTerm (referencing CLO, Sample Ontology, etc.)
  • experiment_type → OntologyTerm for experiment classification
  • Follow the existing ImageFeature.terms pattern
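Switching from enums to OntologyTerm references trades a closed value list for open CURIEs, so some per-slot constraint is still useful. The sketch below assumes an OntologyTerm with `id` and `label` fields (the actual BioStride class may differ) and a hypothetical table of allowed prefixes per slot; the term identifiers are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyTerm:
    """Assumed shape of an ontology reference: a CURIE plus optional label."""
    id: str
    label: str = ""


# Hypothetical policy: which ontology prefixes each slot may reference.
ALLOWED_PREFIXES = {
    "sample_type": {"CLO", "OBI"},
    "experiment_type": {"OBI", "EDAM"},
}


def is_allowed(slot, term):
    """Check that the term id is a CURIE from an ontology allowed for the slot."""
    prefix, sep, local = term.id.partition(":")
    return bool(sep and local) and prefix in ALLOWED_PREFIXES.get(slot, set())


# Placeholder identifiers used only to exercise the check.
print(is_allowed("sample_type", OntologyTerm("CLO:0000000", "placeholder")))
print(is_allowed("sample_type", OntologyTerm("EDAM:topic_0000")))
```

In LinkML itself a similar restriction can be expressed declaratively, for example with dynamic enums, so a hand-written check like this would mainly serve as a stopgap.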

4. Link quality metrics to standards

Map QualityMetrics fields (resolution, completeness) to:

  • mmCIF/PDBx metadata standards
  • Small Angle Scattering community terms
  • PDBe metadata definitions
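One possible shape for such a mapping is a lookup keyed by metric and technique, because "resolution" means different things for diffraction and for cryo-EM. The sketch below is illustrative only: the field and technique keys are hypothetical, and the pairing of BioStride fields with these mmCIF/PDBx items is an assumption that would need review by domain experts. Rg is deliberately left unmapped rather than guessing at a term.

```python
# Assumed correspondence between (metric, technique) and mmCIF/PDBx items.
# Keys are hypothetical BioStride names; values are PDBx dictionary items.
MMCIF_ITEMS = {
    ("resolution", "xray_crystallography"): "_reflns.d_resolution_high",
    ("completeness", "xray_crystallography"): "_reflns.percent_possible_obs",
    ("resolution", "cryo_em"): "_em_3d_reconstruction.resolution",
}


def mmcif_item(metric, technique):
    """Return the mapped mmCIF item name, or None if no mapping is recorded."""
    return MMCIF_ITEMS.get((metric, technique))


print(mmcif_item("resolution", "cryo_em"))
print(mmcif_item("rg", "saxs"))  # unmapped on purpose
```

Keeping the table explicit makes the remaining gaps visible: any metric that returns None still lacks a standardized definition.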

Target Ontologies

  • OBI (Ontology for Biomedical Investigations): 2,500+ terms for experiments, assays, devices
  • EDAM (bioinformatics operations, topics, data types, and formats): Technique and data type terms
  • UBERON: Anatomy terms (already mentioned for biological context)
  • BFO (Basic Formal Ontology): Upper-level classes (material entity, process) under which OBI device and process terms sit
  • mmCIF/PDBx: Structural biology metadata standards

Benefits

  • Improved interoperability with existing datasets
  • Semantic web compatibility for advanced queries
  • Reuse of vetted definitions instead of reinventing terms
  • Future-proofing through standards alignment

Priority

High - Critical for community adoption and semantic interoperability.
