Sync: align ncbo/ontologies_linked_data with the AgroPortal codebase #266

Draft
mdorf wants to merge 76 commits into develop from chore/ontoportal-lirmm-goo-compat

mdorf commented Mar 16, 2026

This PR introduces a large set of refactors and features that bring ncbo/ontologies_linked_data (OLD) significantly closer to the development branch maintained by the AgroPortal team.

This work is part of a broader effort to align several BioPortal components with the implementations maintained by the AgroPortal team. As part of this effort, other repositories in the stack have been updated to adopt their AgroPortal counterparts. Because the BioPortal and AgroPortal versions of ontologies_linked_data have diverged significantly over time, a full replacement was not feasible here. Instead, this PR performs a substantial synchronization effort that incorporates key improvements from AgroPortal while preserving BioPortal-specific functionality.

The primary goals of this effort are:

  • Align OLD with the AgroPortal architecture and feature set
  • Modernize the codebase and infrastructure
  • Ensure compatibility with the AgroPortal-based versions of goo and sparql-client adopted in related PRs

Prerequisites

This PR depends on the AgroPortal-based replacements of the following repositories:

  • goo
  • sparql-client

These PRs should be merged prior to this one, as ontologies_linked_data relies on the updated GOO indexing model and SPARQL client behavior introduced there.

Major changes

Alignment with AgroPortal implementation

This PR incorporates a significant amount of refactoring and functionality developed in the AgroPortal version of OLD. The goal is to bring the two implementations back into closer alignment, while retaining BioPortal-specific features where full convergence is not desirable.

Key areas affected include:

  • ontology submission processing
  • indexing and search infrastructure
  • backend triple store support
  • authentication and authorization
  • testing and infrastructure

Search and indexing improvements

Schemaless Solr support

This PR updates OLD to work with the schemaless Solr architecture enabled by the new GOO indexing model.

Key changes include:

  • dynamic Solr schema generation defined directly in model classes
  • refactoring of search and indexing logic to support schemaless Solr
  • improvements to indexing configuration and model-level indexing support

Models can now enable indexing with a simple declaration such as:

enable_indexing(:ontology_metadata)

where the symbol corresponds to the Solr core name.
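
As a rough illustration of this declaration style, the sketch below mimics a class-level macro that records the Solr core name on the model. It is not goo's implementation: `IndexableSketch`, `index_core`, `indexing_enabled?`, `OntologyMetadataDoc`, and `PlainDoc` are hypothetical names used only to show the pattern.

```ruby
# Hypothetical stand-in for the class-level macro; the real one is provided by goo.
module IndexableSketch
  # Records which Solr core this model should be indexed into.
  def enable_indexing(core_name)
    @index_core = core_name.to_sym
  end

  # Returns the recorded core name, or nil if indexing was never enabled.
  def index_core
    @index_core
  end

  def indexing_enabled?
    !index_core.nil?
  end
end

# Hypothetical model that opts in to indexing.
class OntologyMetadataDoc
  extend IndexableSketch
  enable_indexing(:ontology_metadata)
end

# Hypothetical model that does not opt in.
class PlainDoc
  extend IndexableSketch
end
```

In this sketch, `OntologyMetadataDoc.index_core` returns `:ontology_metadata`, while `PlainDoc.indexing_enabled?` is false because the macro was never called.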

Expanded indexing coverage

In addition to ontology terms, the updated system now supports indexing of ontology metadata.

Multi triple-store backend support

Historically, OLD supported only:

  • 4store
  • AllegroGraph

The AgroPortal implementation introduces support for additional triple stores, and this PR brings that capability into BioPortal. Supported triple stores now also include:

  • Virtuoso
  • GraphDB

Supporting changes were made across backend integration layers and mappings logic.

Ontology submission processing refactor

A major internal refactor restructures the ontology submission processing workflow.

Previously, much of the logic was implemented through large concerns and tightly coupled service logic. This PR introduces a modular processing pipeline composed of discrete operations responsible for tasks such as:

  • metadata extraction
  • missing label generation
  • obsolete class detection
  • RDF generation
  • indexing operations

This architecture improves maintainability and aligns the processing pipeline with the AgroPortal implementation.
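
The general shape of such a pipeline can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the class names (`ExtractMetadataSketch`, `FlagObsoleteSketch`, `SubmissionPipelineSketch`) and the hash-based context are hypothetical.

```ruby
# Hypothetical operation: each one performs a single task and returns an updated context.
class ExtractMetadataSketch
  def call(context)
    context.merge(metadata: { title: context.fetch(:title, "untitled") })
  end
end

# Hypothetical operation: collects the IDs of classes flagged as deprecated.
class FlagObsoleteSketch
  def call(context)
    obsolete = context.fetch(:classes, {}).select { |_id, attrs| attrs[:deprecated] }.keys
    context.merge(obsolete: obsolete)
  end
end

# Runs the configured operations in order, threading the context through each one.
class SubmissionPipelineSketch
  def initialize(operations)
    @operations = operations
  end

  def run(context)
    @operations.reduce(context) { |ctx, operation| operation.call(ctx) }
  end
end

pipeline = SubmissionPipelineSketch.new([ExtractMetadataSketch.new, FlagObsoleteSketch.new])
result = pipeline.run({
  title: "Demo ontology",
  classes: { "c1" => { deprecated: false }, "c2" => { deprecated: true } }
})
```

Because every operation exposes the same `call(context)` interface, steps can be added, removed, or reordered without touching the others, which is the maintainability benefit a modular design aims for.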

Authentication and authorization updates

This PR includes improvements to the authentication and authorization subsystems:

  • introduction of OAuth user authentication support
  • improvements to authorization handling and permission checks
  • configuration updates supporting multiple authentication providers

Infrastructure and dependency modernization

Several updates modernize the project and align it with the current stack used across the OntoPortal ecosystem:

  • upgrade from Minitest 4 to Minitest 6
  • compatibility updates for the latest ActiveSupport
  • migration to Ruby 3.2
  • dependency updates and general build environment improvements

Testing infrastructure updates

This PR integrates the OntoPortal testkit, which is being adopted across the BioPortal/OntoPortal stack.

Changes include:

  • integration of the shared testkit framework
  • updates to test configuration and dependencies
  • improvements to CI and containerized test execution

BioPortal-specific improvements

In addition to the AgroPortal-derived work, this PR includes several improvements authored specifically for the BioPortal implementation.

Improved label generation logic

Several commits improve the logic used to generate labels for ontology classes when explicit labels are not present. These changes enhance the reliability and consistency of label generation during ontology processing and indexing.
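
One common fallback strategy, shown here only as a hedged sketch, is to derive a label from the last segment of the class IRI when no explicit label exists. The helper name `fallback_label_sketch` is hypothetical, and the actual rules used in this PR may differ.

```ruby
# Hypothetical fallback: prefer the explicit label, otherwise use the last IRI segment.
def fallback_label_sketch(iri, explicit_label = nil)
  return explicit_label unless explicit_label.nil? || explicit_label.strip.empty?

  # Split on '#' and '/' and take the final non-empty piece.
  iri.split(/[#\/]/).last.to_s
end
```

For example, `fallback_label_sketch("http://example.org/onto#HeartValve")` yields `"HeartValve"`, while passing an explicit label returns that label unchanged.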

Improved search and indexing

The AgroPortal schemaless Solr implementation did not carry over some of the finer-grained features that distinguish NCBO search from AgroPortal's, such as search on OBO IDs and short IDs. These features have now been re-implemented on top of schemaless Solr.
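
To make the two ID forms concrete, the sketch below derives a short ID and an OBO-style ID from a class IRI. It is illustrative only: `short_id_sketch` and `obo_id_sketch` are hypothetical helpers, and the pattern (letters, underscore, digits) is a simplifying assumption rather than the matching logic used in OLD.

```ruby
# Hypothetical helper: the short ID is taken to be the last segment of the IRI.
def short_id_sketch(iri)
  iri.split(/[#\/]/).last.to_s
end

# Hypothetical helper: turns a short ID like "GO_0008150" into "GO:0008150".
# Returns nil when the short ID does not look like PREFIX_digits.
def obo_id_sketch(iri)
  match = short_id_sketch(iri).match(/\A([A-Za-z]+)_(\d+)\z/)
  match ? "#{match[1]}:#{match[2]}" : nil
end
```

With these helpers, `http://purl.obolibrary.org/obo/GO_0008150` maps to the short ID `GO_0008150` and the OBO ID `GO:0008150`, so a query in either form could be matched against an indexed field.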

Summary

This PR represents a substantial update to ontologies_linked_data, bringing the BioPortal implementation closer to the AgroPortal codebase while preserving BioPortal-specific functionality.

Key outcomes include:

  • alignment with the AgroPortal architecture
  • support for schemaless Solr
  • expanded triple-store backend compatibility
  • refactored ontology submission processing
  • OAuth authentication support
  • modernized infrastructure and testing
  • integration of the OntoPortal testkit

While not a full codebase replacement like the goo and sparql-client migrations, this PR performs a significant synchronization effort and prepares OLD to operate correctly with the updated stack.

mdorf added 30 commits August 6, 2025 10:31
…copy the portal language label into the generic one
…copy the portal language label into the generic one
alexskr and others added 23 commits February 12, 2026 10:52
…264)

CI: refactor docker-based test runner and add linux container tests

see ncbo/goo#173
…ntologies_linked_data into chore/ontoportal-lirmm-goo-compat
…e raptor_free_world is bound with zero args, but its finalizer calls it with one pointer arg
…TestSearch#test_search_ontology_data [test/models/test_search.rb:135]
…ntologySubmissionArchiver instead of LinkedData::Models::OntologySubmission
…ntologies_linked_data into chore/ontoportal-lirmm-goo-compat