[FEATURE] Dedicated SKOS Vocabulary Module (Using Existing Components) #319

@KaifAhmad1

Description

Manage controlled vocabularies, taxonomies, and term relationships on top of Semantica’s existing ontology and triplet‑store stack.

No new Python package is added; we extend the current ontology and RDF infrastructure.


Code and Test Layout

  • Main implementation goes in existing packages:
    • semantica/ontology/ (for public APIs and namespace handling).
    • semantica/triplet_store/ (for RDF storage and SPARQL).
  • Tests for SKOS features should be added under:
    • tests/ontology/ (ontology‑level helpers).
    • tests/triplet_store/ (triple store behavior).
  • Do not create new top-level folders; reuse semantica/ and tests/ only.

Feature Description

  • Stores SKOS concept schemes, concepts, and collections in the existing RDF store.
  • Provides search, browse, and selection for SKOS concepts.
  • Uses SKOS vocabularies to drive suggestions, autocomplete, and validation.
  • Integrates with ontology and graph workflows as a first‑class semantic asset.

Why This Is Important

  • Improves semantic search accuracy and term matching.
  • Centralizes enterprise vocabularies and taxonomies.
  • Reduces ambiguity and drift in labels and domain terminology.
  • Moves Semantica toward being a complete semantic layer platform, not just a graph toolkit.

Existing Building Blocks

These are existing modules and files in Semantica that SKOS support will build on:

  • Ontology and namespaces
    • semantica/ontology/namespace_manager.py
      • Manages prefixes and base URIs (OWL, RDF, etc.).
    • semantica/ontology/module_manager.py
      • Handles modular ontologies and imports.
    • semantica/ontology/engine.py
      • High‑level ontology API (generation, validation, export).
  • Triplet store and SPARQL
    • semantica/triplet_store/triplet_store.py
      • Generic RDF triple storage operations.
    • semantica/triplet_store/query_engine.py
      • SPARQL execution and optimization.
    • semantica/triplet_store/config.py
      • Configuration for triplet store backends.
  • Documentation and tests
    • docs/reference/ontology.md
    • docs/reference/triplet_store.md
    • tests/ontology/test_ontology_comprehensive.py
    • tests/triplet_store/test_triplet_store.py

New Work on Top of Existing Code

All work should extend the files above; no new top‑level modules are required:

  • Add SKOS namespace support
    • In semantica/ontology/namespace_manager.py:
      • Ensure the SKOS prefix (skos:) and its base URI (http://www.w3.org/2004/02/skos/core#) are defined.
      • Optionally add helpers for building SKOS URIs.
  • Add lightweight SKOS helpers on top of the triplet store
    • In semantica/triplet_store/triplet_store.py:
      • Optional helper methods for adding and retrieving SKOS concepts using existing triple APIs.
  • Add ontology‑level vocabulary APIs
    • In semantica/ontology/engine.py (or a small helper in the same package):
      • New public methods:
        • list_vocabularies()
        • list_concepts(scheme_uri: str)
        • search_concepts(query: str, scheme_uri: Optional[str] = None)
      • These should delegate to QueryEngine from semantica/triplet_store/query_engine.py.
  • Add docs and tests
    • docs/reference/ontology.md:
      • New “SKOS Vocabulary Management” section with usage examples.
    • tests/ontology/test_ontology_comprehensive.py:
      • Tests for ontology‑level SKOS helpers.
    • tests/triplet_store/test_triplet_store.py:
      • Tests that SKOS triples can be stored and retrieved correctly.
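
As a rough illustration of the namespace and triplet-store helpers listed above, the sketch below expands one SKOS concept into plain triples. Everything here is an assumption for illustration: InMemoryStore and its add_triple method stand in for the real triplet store (whose actual API is not shown in this issue), and skos() and the signature of add_skos_concept() are hypothetical. Only the SKOS and RDF namespace URIs are the standard W3C values. For brevity the sketch does not distinguish literals from URIs.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Standard W3C namespace URIs.
SKOS_BASE = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def skos(local_name: str) -> str:
    """Hypothetical URI helper: skos("prefLabel") -> full SKOS URI."""
    return SKOS_BASE + local_name


class InMemoryStore:
    """Stand-in for the real triplet store; NOT Semantica's actual API."""

    def __init__(self) -> None:
        self.triples: List[Triple] = []

    def add_triple(self, subject: str, predicate: str, obj: str) -> None:
        self.triples.append((subject, predicate, obj))


def add_skos_concept(
    store: InMemoryStore,
    concept_uri: str,
    scheme_uri: str,
    pref_label: str,
    alt_labels: Optional[List[str]] = None,
) -> None:
    """Hypothetical helper: only calls the generic add_triple API."""
    store.add_triple(concept_uri, RDF_TYPE, skos("Concept"))
    store.add_triple(concept_uri, skos("inScheme"), scheme_uri)
    store.add_triple(concept_uri, skos("prefLabel"), pref_label)
    for label in alt_labels or []:
        store.add_triple(concept_uri, skos("altLabel"), label)


store = InMemoryStore()
add_skos_concept(
    store,
    "http://example.org/c/heart-attack",
    "http://example.org/scheme/medical",
    "Myocardial infarction",
    alt_labels=["Heart attack"],
)
```

Because the helper only emits ordinary triples, it should not require backend changes, which is the property the plan asks contributors to confirm.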

Implementation Plan

  1. SKOS namespace support

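    • In ontology/namespace_manager.py:
      • Ensure the SKOS prefix and base URI are defined.
      • Optionally add helpers for building SKOS URIs.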
  2. RDF storage of SKOS vocabularies

    • In triplet_store/triplet_store.py:
      • Confirm that storing SKOS triples requires no backend changes.
      • If needed, add small helper methods such as:
        • add_skos_concept(...)
        • get_skos_concepts(...)
      • Internally these should just call the generic triple APIs.
  3. Search and retrieval helpers

    • In ontology/engine.py (or a small helper within the ontology package):
      • Implement public methods:
        • list_vocabularies()
        • list_concepts(scheme_uri: str)
        • search_concepts(query: str, scheme_uri: Optional[str] = None)
      • Use QueryEngine (semantica/triplet_store/query_engine.py) to issue SPARQL SELECT queries over the RDF store, matching skos:prefLabel, skos:altLabel, and skos:notation values.
  4. Integration points

    • Keep this feature opt‑in and non‑breaking:
      • When SKOS vocabularies exist, they can be used by other modules (e.g. semantic extraction, KG, mapping) via public ontology APIs.
    • Do not introduce new modules; feed everything through existing ontology and triplet_store entrypoints.
  5. Documentation

    • In docs/reference/ontology.md:
      • Add a “SKOS Vocabulary Management” section with examples:
        • Importing a SKOS vocabulary via existing RDF import tools.
        • Listing and searching concepts through OntologyEngine.
    • Optionally cross‑link from docs/reference/triplet_store.md.
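
One possible shape for the SPARQL behind search_concepts() from step 3 is sketched below. It only builds the query string; how that string is passed to QueryEngine is left out because the engine's call signature is not given in this issue. The function names build_search_query and escape_sparql_literal are hypothetical, and the case-insensitive substring match is one assumption about what "matching" should mean here.

```python
from typing import Optional

SKOS_PREFIX = "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>"


def escape_sparql_literal(text: str) -> str:
    """Escape characters that would break a double-quoted SPARQL literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def build_search_query(query: str, scheme_uri: Optional[str] = None) -> str:
    """Hypothetical builder for a concept search over labels and notations."""
    scheme_clause = ""
    if scheme_uri is not None:
        # Reject characters that are not allowed inside <...> IRI references.
        if any(ch in scheme_uri for ch in '<>"{}|^`\\ \n\r\t'):
            raise ValueError(f"invalid scheme URI: {scheme_uri!r}")
        scheme_clause = f"  ?concept skos:inScheme <{scheme_uri}> .\n"
    needle = escape_sparql_literal(query)
    return (
        f"{SKOS_PREFIX}\n"
        "SELECT DISTINCT ?concept ?match WHERE {\n"
        "  ?concept a skos:Concept ;\n"
        # Property-path alternative: any of the three predicates may match.
        "           skos:prefLabel|skos:altLabel|skos:notation ?match .\n"
        f"{scheme_clause}"
        f'  FILTER(CONTAINS(LCASE(STR(?match)), LCASE("{needle}")))\n'
        "}"
    )


print(build_search_query("heart", "http://example.org/scheme/medical"))
```

Escaping the user-supplied query and validating the scheme URI keeps the generated SPARQL well-formed even for input containing quotes. A FILTER over every label can be slow on large vocabularies, so a backend-specific text index may be preferable where one is available.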

When This Feature Is Considered Complete

  • SKOS vocabularies (schemes, concepts, collections) can be:
    • Imported into the existing RDF store.
    • Listed and searched using ontology APIs that wrap QueryEngine.
  • No new top‑level Python package is created.
  • Tests cover:
    • Basic SKOS vocabulary ingestion and retrieval.
    • At least one search scenario (e.g. prefLabel + altLabel matching).
  • Documentation explains how contributors and users can work with SKOS vocabularies in Semantica.
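
The search scenario named in the test criteria could look roughly like the pytest-style sketch below. It runs against a plain list of triples and a local search_concepts() function, both stand-ins written for this example; real tests would go through the ontology API and the configured triplet store instead. The example.org URIs and labels are made-up sample data.

```python
from typing import List, Optional, Tuple

SKOS = "http://www.w3.org/2004/02/skos/core#"
MEDICAL = "http://example.org/scheme/medical"
SECURITY = "http://example.org/scheme/security"

# Sample data (invented for illustration).
TRIPLES: List[Tuple[str, str, str]] = [
    ("http://example.org/c/heart-attack", SKOS + "inScheme", MEDICAL),
    ("http://example.org/c/heart-attack", SKOS + "prefLabel", "Myocardial infarction"),
    ("http://example.org/c/heart-attack", SKOS + "altLabel", "Heart attack"),
    ("http://example.org/c/heart-rate", SKOS + "inScheme", MEDICAL),
    ("http://example.org/c/heart-rate", SKOS + "prefLabel", "Heart rate"),
    ("http://example.org/c/attack-vector", SKOS + "inScheme", SECURITY),
    ("http://example.org/c/attack-vector", SKOS + "prefLabel", "Attack vector"),
]


def search_concepts(
    triples: List[Tuple[str, str, str]],
    query: str,
    scheme_uri: Optional[str] = None,
) -> List[str]:
    """Stand-in search: case-insensitive substring match on pref/alt labels."""
    needle = query.lower()
    label_predicates = {SKOS + "prefLabel", SKOS + "altLabel"}
    hits = {s for s, p, o in triples if p in label_predicates and needle in o.lower()}
    if scheme_uri is not None:
        in_scheme = {
            s for s, p, o in triples if p == SKOS + "inScheme" and o == scheme_uri
        }
        hits &= in_scheme
    return sorted(hits)


def test_pref_label_match() -> None:
    assert search_concepts(TRIPLES, "infarction") == [
        "http://example.org/c/heart-attack"
    ]


def test_alt_label_match() -> None:
    # "Heart attack" is only an altLabel, so this exercises altLabel matching.
    assert "http://example.org/c/heart-attack" in search_concepts(TRIPLES, "heart")


def test_scheme_filter() -> None:
    assert search_concepts(TRIPLES, "attack", MEDICAL) == [
        "http://example.org/c/heart-attack"
    ]
```

Keeping one concept whose match comes only from an altLabel, plus a second scheme with an overlapping term, covers both the prefLabel + altLabel case and the optional scheme_uri filter in a small fixture.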

Metadata

Assignees

No one assigned

    Labels

    core (Core Semantica logic and abstractions), enhancement (New feature or request), help wanted (Extra attention is needed)

    Projects

    Status

    Todo

    Milestone

    No milestone
