This repository is archived. The unified version of this tool is available at: Ortholog-Finder-Tool
A web-based bioinformatics tool for querying and analyzing orthologous proteins across five eukaryotic model organisms. The tool integrates ortholog relationships from multiple databases with pathway annotations and genome-wide cell-size screen data to identify evolutionarily conserved regulators.
This is the original draft/prototype version (2014–2017) that preceded the published Ortholog Finder Tool v1.1. It served as the development platform for the multi-database ortholog query engine (Lekeres class) and the flat-file data integration pipeline.
| Abbreviation | Species | Common name |
|---|---|---|
| AT | Arabidopsis thaliana | Thale cress |
| DM | Drosophila melanogaster | Fruit fly |
| HS | Homo sapiens | Human |
| SC | Saccharomyces cerevisiae | Budding yeast |
| SP | Schizosaccharomyces pombe | Fission yeast |
- Multi-database ortholog resolution: Proteins are queried across 6 ortholog databases (HomoloGene, orthoMCL v5, InParanoid v8, eggNOG v4, COG/KOG, PomBase) using UniProt KB accessions as the common identifier.
- Pathway annotation: KEGG and Reactome pathway memberships are retrieved for each protein, enabling identification of shared pathway context across orthologous groups.
- Screen data integration: Published genome-wide cell-size regulatory gene lists (Jorgensen, Moretto, Hayles, Björklund, Neumann) are cross-referenced with ortholog results.
- Query levels: Progressive filtering from all orthologs (`orth`) through pathway-annotated (`path`) and shared-pathway (`same`) to screen-hit subsets (`size_mut1` to `size_mut6`).
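The progressive query levels can be pictured as nested filters. The following is a minimal Python sketch, not the tool's PHP implementation: the `Hit` record, its field names, and the `filter_hits` function are hypothetical, and the six `size_mut` levels are collapsed into a single `size_mut` level because their individual definitions are not given here. It also assumes each level is a subset of the previous one.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    """One ortholog of the query protein (hypothetical record layout)."""
    accession: str                              # UniProt KB accession
    pathways: set = field(default_factory=set)  # KEGG/Reactome pathway IDs
    screens: set = field(default_factory=set)   # screen lists containing it


def filter_hits(hits, level, query_pathways=frozenset()):
    """Return the hits that pass the given query level."""
    if level == "orth":        # every ortholog
        return list(hits)
    annotated = [h for h in hits if h.pathways]
    if level == "path":        # has at least one pathway annotation
        return annotated
    shared = [h for h in annotated if h.pathways & query_pathways]
    if level == "same":        # shares a pathway with the query protein
        return shared
    if level == "size_mut":    # additionally appears in a cell-size screen
        return [h for h in shared if h.screens]
    raise ValueError(f"unknown level: {level}")


# Placeholder accessions and pathway IDs, for illustration only.
hits = [
    Hit("ACC_1"),
    Hit("ACC_2", {"pathway_a"}),
    Hit("ACC_3", {"pathway_b"}),
    Hit("ACC_4", {"pathway_b"}, {"screen_x"}),
]
for level in ("orth", "path", "same", "size_mut"):
    selected = filter_hits(hits, level, query_pathways={"pathway_b"})
    print(level, [h.accession for h in selected])
```

Each stricter level reuses the result of the previous one, so the result sets can only shrink as the level increases.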
| Database | Version | Date |
|---|---|---|
| HomoloGene | – | 2014 |
| orthoMCL | 5 | 2013 |
| InParanoid | 8.0 | December 2013 |
| eggNOG | 4.0 | December 2013 |
| COG/KOG | – | 2014 |
| PomBase | V2.19 | 2014 |
| Database | Purpose |
|---|---|
| KEGG | Metabolic and signaling pathway annotations |
| Reactome | Curated biological pathway annotations |
| Database | Version | Purpose |
|---|---|---|
| UniProt | 2014_04 | Common identifier (KB accession) and gene name mapping |
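Using one accession as the common identifier makes it straightforward to combine ortholog pairs reported by different databases. The sketch below is an illustration in Python rather than the tool's own PHP code; the row layout `(accession_a, accession_b, database)`, the function names, and the placeholder accessions are all assumptions.

```python
from collections import defaultdict


def merge_pairs(rows):
    """Collapse (accession_a, accession_b, database) rows into one entry
    per unordered protein pair, recording which databases report it."""
    support = defaultdict(set)
    for acc_a, acc_b, database in rows:
        pair = tuple(sorted((acc_a, acc_b)))  # A-B and B-A are the same pair
        support[pair].add(database)
    return dict(support)


def orthologs_of(accession, support):
    """Map each partner of `accession` to the sorted list of supporting DBs."""
    partners = {}
    for (a, b), databases in support.items():
        if accession in (a, b):
            partner = b if a == accession else a
            partners[partner] = sorted(databases)
    return partners


# Placeholder accessions, not real UniProt entries.
rows = [
    ("ACC_HS_1", "ACC_SP_1", "InParanoid"),
    ("ACC_SP_1", "ACC_HS_1", "eggNOG"),
    ("ACC_HS_1", "ACC_SC_1", "orthoMCL"),
]
support = merge_pairs(rows)
print(orthologs_of("ACC_HS_1", support))
```

Keeping the set of supporting databases per pair, instead of a plain yes/no flag, lets a caller rank orthologs by how many sources agree.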
.
├── index.php / main.php                 # Entry points and routing
├── _includes/                           # PHP core logic
│   ├── mysql.php                        # Database connection
│   ├── functions.php                    # FajlBeolvas + Lekeres classes (query engine)
│   ├── mylog.php                        # Visitor logging (CSV + GeoIP)
│   └── page_*.php                       # Page templates
├── _dataset/                            # Ortholog and pathway data (CSV/TSV, ~40 MB)
│   ├── ALL_ortholog_dbs_merged.csv      # 225K ortholog pairs (6 DBs)
│   ├── kegg_pathways_uniprot.tsv        # KEGG annotations
│   ├── reactome_pathways_uniprot.tsv    # Reactome annotations
│   └── *_interact_deg2_exp.csv          # Cell-size screen data per species
├── _media/                              # Frontend assets (JS, CSS, images)
├── _query/                              # Pre-computed JSON for DataTables
├── _download/                           # Downloadable data files
├── tests/                               # PHPUnit tests (32 tests)
└── work/                                # Development resources
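The pathway annotation files in `_dataset/` are flat TSV files keyed by UniProt accession. Their exact column layout is not documented here, so the Python sketch below assumes a two-column `accession<TAB>pathway_id` format; the function name `load_pathways` and the sample values are hypothetical, and the real files may carry additional columns or a header row.

```python
import csv
import io
from collections import defaultdict


def load_pathways(handle, delimiter="\t"):
    """Read 'accession<TAB>pathway_id' rows into {accession: set_of_ids}.

    The two-column layout is an assumption, not the documented format.
    """
    pathways = defaultdict(set)
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) < 2 or row[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        pathways[row[0].strip()].add(row[1].strip())
    return dict(pathways)


# In-memory stand-in for a file such as kegg_pathways_uniprot.tsv.
sample = io.StringIO(
    "ACC_1\tpath:0001\n"
    "ACC_1\tpath:0002\n"
    "\n"
    "ACC_2\tpath:0001\n"
)
table = load_pathways(sample)
print(table)
```

For a file on disk, the same function can be called with `open(path, newline="", encoding="utf-8")` in place of the `StringIO` object.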
- Backend: PHP 5.x (updated for PHP 8.2 compatibility in 2025)
- Frontend: jQuery 1.11.2, jQuery DataTables 1.10.5
- Database: MySQL 5.7+ (for GeoIP logging)
- Testing: PHPUnit (32 tests: CSV parsing, config integrity, routing)
# Install dev dependencies
php composer.phar install
# Run all tests
php vendor/bin/phpunit
# Run with verbose output
php vendor/bin/phpunit --testdox
32 tests covering:
- FajlBeolvasTest: CSV/TSV file parsing (faj, path, db, reg types)
- IncludeValuesTest: Configuration values and static data integrity
- PageRoutingTest: URL routing logic and include file mapping
This tool was developed by Zoltán Dul as part of his PhD research at King's College London (2013–2018):
"A system level approach to identify novel cell size regulators"
The thesis describes a systems biology strategy combining ortholog analysis, Gene Ontology annotation, protein-protein interaction networks, and high-throughput cell size screening data to identify novel regulators of cell size across eukaryotes.
- Ashburner M, Ball CA, Blake JA, et al. (2000). Gene Ontology: tool for the unification of biology. Nature Genetics, 25(1):25–29. doi:10.1038/75556
- Powell S, Forslund K, Szklarczyk D, et al. (2014). eggNOG v4.0: nested orthology inference across 3686 organisms. Nucleic Acids Research, 42(D1):D231–D239. PubMed: 24297252
- The UniProt Consortium (2015). UniProt: a hub for protein information. Nucleic Acids Research, 43(D1):D99–D106. PubMed: 25348405
- Li L, Stoeckert CJ Jr, Roos DS (2003). OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Research, 13(9):2178–2189. PubMed: 12952885
Originally developed in 2014 at King's College London & Fondazione Edmund Mach as the first prototype of the ortholog query engine. Updated for PHP 8.2 compatibility in 2025. Superseded by the unified Ortholog Finder Tool v1.1.
Zoltán Dul, King's College London, Randall Centre for Cell and Molecular Biophysics
Copyright (C) 2014–2018 Zoltán Dul
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License v2 as published by the Free Software Foundation.