This repository is archived. The unified version of this tool is available at: Ortholog-Finder-Tool
A web-based bioinformatics tool for querying and analyzing orthologous proteins across five eukaryotic model organisms. The tool integrates ortholog relationships from multiple databases with pathway annotations and genome-wide cell-size screen data to identify evolutionarily conserved regulators of cell size.
This is the original draft/prototype version (2014–2017) that preceded the published Ortholog Finder Tool v1.1. It served as the development platform for the multi-database ortholog query engine (Lekeres class) and the flat-file data integration pipeline.
| Abbreviation | Species | Common name |
|---|---|---|
| AT | Arabidopsis thaliana | Thale cress |
| DM | Drosophila melanogaster | Fruit fly |
| HS | Homo sapiens | Human |
| SC | Saccharomyces cerevisiae | Budding yeast |
| SP | Schizosaccharomyces pombe | Fission yeast |
- Multi-database ortholog resolution: proteins are queried across six ortholog databases (HomoloGene, orthoMCL v5, InParanoid v8, eggNOG v4, COG/KOG, PomBase), using UniProtKB accessions as the common identifier.
- Pathway annotation: KEGG and Reactome pathway memberships are retrieved for each protein, so that shared pathway context can be identified across orthologous groups.
- Screen data integration: published genome-wide cell-size regulatory gene lists (Jorgensen, Moretto, Hayles, Björklund, Neumann) are cross-referenced with the ortholog results.
- Query levels: progressive filtering from all orthologs (orth) through pathway-annotated (path) and shared-pathway (same) orthologs to screen-hit subsets (size_mut1–6).
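The query levels can be read as a chain of successively narrower filters. The following is a minimal Python sketch of that idea, not the tool's PHP `Lekeres` implementation. The `Ortholog` record, the `filter_level` function, the placeholder accessions (`ACC_1` ...) and pathway IDs (`pw_a`, `pw_b`) are all hypothetical. It also assumes that each level is a strict subset of the previous one, and it collapses the six `size_mut` levels into a single `size` level for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Ortholog:
    accession: str                               # placeholder, not a real UniProtKB accession
    species: str                                 # two-letter species code, e.g. "SC"
    pathways: set = field(default_factory=set)   # pathway IDs annotated to this protein
    screen_hit: bool = False                     # True if listed in a cell-size screen


# Hypothetical level names; "size" stands in for size_mut1..6.
LEVELS = ("orth", "path", "same", "size")


def filter_level(query_pathways, orthologs, level):
    """Return the orthologs that pass the given query level."""
    if level not in LEVELS:
        raise ValueError(f"unknown query level: {level}")
    result = list(orthologs)
    if level == "orth":                          # all orthologs
        return result
    result = [o for o in result if o.pathways]   # keep pathway-annotated orthologs
    if level == "path":
        return result
    # keep orthologs sharing at least one pathway with the query protein
    result = [o for o in result if o.pathways & query_pathways]
    if level == "same":
        return result
    return [o for o in result if o.screen_hit]   # keep screen hits only


query_pathways = {"pw_a"}
orthologs = [
    Ortholog("ACC_1", "SC", {"pw_a"}, screen_hit=True),
    Ortholog("ACC_2", "SP", {"pw_a"}),
    Ortholog("ACC_3", "DM", {"pw_b"}),
    Ortholog("ACC_4", "AT"),
]
counts = {level: len(filter_level(query_pathways, orthologs, level)) for level in LEVELS}
print(counts)
```

With this toy data the result list shrinks at every level, which is the behavior the word "progressive" suggests; whether the real tool nests the levels in exactly this way is an assumption.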
| Database | Version | Date |
|---|---|---|
| HomoloGene | — | 2014 |
| orthoMCL | 5 | 2013 |
| InParanoid | 8.0 | December 2013 |
| eggNOG | 4.0 | December 2013 |
| COG/KOG | — | 2014 |
| PomBase | V2.19 | 2014 |
| Database | Purpose |
|---|---|
| KEGG | Metabolic and signaling pathway annotations |
| Reactome | Curated biological pathway annotations |
| Database | Version | Purpose |
|---|---|---|
| UniProt | 2014_04 | Common identifier (KB accession) and gene name mapping |
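Because every source is mapped onto UniProtKB accessions, ortholog pairs reported by different databases can be merged by simple key comparison. The sketch below illustrates this in Python and is not taken from the tool's PHP code: the function name `merge_ortholog_pairs`, the input layout, and the placeholder accessions are assumptions made for the example.

```python
from collections import defaultdict


def merge_ortholog_pairs(pairs_by_db):
    """Map each unordered accession pair to the set of databases reporting it."""
    support = defaultdict(set)
    for db_name, pairs in pairs_by_db.items():
        for acc_a, acc_b in pairs:
            # Sort the pair so that (A, B) and (B, A) are treated as the same relationship.
            key = tuple(sorted((acc_a, acc_b)))
            support[key].add(db_name)
    return dict(support)


# Placeholder accessions; real input would use UniProtKB accessions.
pairs_by_db = {
    "InParanoid": [("ACC_1", "ACC_2"), ("ACC_1", "ACC_3")],
    "eggNOG": [("ACC_2", "ACC_1")],
}
merged = merge_ortholog_pairs(pairs_by_db)
for pair, dbs in sorted(merged.items()):
    print(pair, sorted(dbs))
```

Keeping the set of supporting databases per pair, rather than only the pair itself, makes it possible to rank or filter relationships by how many sources agree on them.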
.
├── index.php / main.php # Entry points and routing
├── _includes/ # PHP core logic
│ ├── mysql.php # Database connection
│ ├── functions.php # FajlBeolvas + Lekeres classes (query engine)
│ ├── mylog.php # Visitor logging (CSV + GeoIP)
│ └── page_*.php # Page templates
├── _dataset/ # Ortholog and pathway data (CSV/TSV, ~40 MB)
│ ├── ALL_ortholog_dbs_merged.csv # 225K ortholog pairs (6 DBs)
│ ├── kegg_pathways_uniprot.tsv # KEGG annotations
│ ├── reactome_pathways_uniprot.tsv # Reactome annotations
│ └── *_interact_deg2_exp.csv # Cell-size screen data per species
├── _media/ # Frontend assets (JS, CSS, images)
├── _query/ # Pre-computed JSON for DataTables
├── _download/ # Downloadable data files
├── tests/ # PHPUnit tests (32 tests)
└── work/ # Development resources
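The pathway annotations in `_dataset/` are stored as flat TSV files. As a rough illustration of how such a file could be loaded into a lookup table, here is a Python sketch. It is not the tool's `FajlBeolvas` class, and the column layout (accession in the first column, pathway ID in the second) is an assumption, since the actual layout is not documented here. The function name `load_pathways` and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def load_pathways(handle):
    """Read tab-separated rows and return {accession: set of pathway IDs}."""
    pathways = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank, short, or comment rows instead of failing on them.
        if len(row) < 2 or row[0].startswith("#"):
            continue
        accession, pathway_id = row[0].strip(), row[1].strip()
        pathways[accession].add(pathway_id)
    return dict(pathways)


# In-memory stand-in for a file; the assumed layout is accession<TAB>pathway_id.
sample = io.StringIO("# comment line\nACC_1\tpw_a\nACC_1\tpw_b\n\nACC_2\tpw_a\n")
by_accession = load_pathways(sample)
print(by_accession)
```

To read a real file, the same function would take an open file handle, for example `open(path, newline="", encoding="utf-8")`, after adjusting the column indices to the actual layout.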
- Backend: PHP 5.x (updated for PHP 8.2 compatibility in 2025)
- Frontend: jQuery 1.11.2, jQuery DataTables 1.10.5
- Database: MySQL 5.7+ (for GeoIP logging)
- Testing: PHPUnit (32 tests: CSV parsing, config integrity, routing)
# Install dev dependencies
php composer.phar install
# Run all tests
php vendor/bin/phpunit
# Run with verbose output
php vendor/bin/phpunit --testdox
32 tests covering:
- FajlBeolvasTest — CSV/TSV file parsing (faj, path, db, reg types)
- IncludeValuesTest — Configuration values and static data integrity
- PageRoutingTest — URL routing logic and include file mapping
This tool was developed by Zoltán Dul as part of his PhD research at King's College London (2013–2018):
"A system level approach to identify novel cell size regulators"
The thesis describes a systems biology strategy combining ortholog analysis, Gene Ontology annotation, protein-protein interaction networks, and high-throughput cell size screening data to identify novel regulators of cell size across eukaryotes.
- Ashburner M, Ball CA, Blake JA, et al. (2000). Gene Ontology: tool for the unification of biology. Nature Genetics, 25(1):25–29. doi:10.1038/75556
- Powell S, Forslund K, Szklarczyk D, et al. (2014). eggNOG v4.0: nested orthology inference across 3686 organisms. Nucleic Acids Research, 42(D1):D231–D239. PubMed: 24297252
- The UniProt Consortium (2015). UniProt: a hub for protein information. Nucleic Acids Research, 43(D1):D99–D106. PubMed: 25348405
- Li L, Stoeckert CJ Jr, Roos DS (2003). OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Research, 13(9):2178–2189. PubMed: 12952885
Originally developed in 2014 at King's College London & Fondazione Edmund Mach as the first prototype of the ortholog query engine. Updated for PHP 8.2 compatibility in 2025. Superseded by the unified Ortholog Finder Tool v1.1.
Zoltán Dul, King's College London, Randall Centre for Cell and Molecular Biophysics
Copyright (C) 2014–2018 Zoltán Dul
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License v2 as published by the Free Software Foundation.