A unified bioinformatics web tool for exploring evolutionarily conserved proteins across model organisms. Combines multi-database ortholog search with Gene Ontology annotation extension via ortholog-based Venn diagram analysis.
PhD Thesis: Dul, Z. (2019). A system-level approach to identify novel cell size regulators. King's College London. Download from the KCL Pure website.
- Overview
- Two Modes
- Model Organisms
- Live Versions
- Installation
- Configuration
- Project Structure
- Data Sources
- Academic References
- History
- Author
- License
The Ortholog Finder Tool was developed as part of a PhD project at King's College London (2013–2018) to systematically identify conserved regulators of cell size across eukaryotic model organisms. It integrates orthologous protein relationships from multiple databases and cross-references them with pathway annotations and genome-wide functional screen data.
Version 1.6 unifies two previously separate tools into a single application:
| Previously | Now |
|---|---|
| Ortholog Finder Tool v1.5 | Mode A: Ortholog Search |
| GeneOntology Extension Tool v1.0 | Mode B: GO Extension |
Multi-database ortholog lookup with pathway annotations and cell-size screen hit data.
- 5 species: AT, DM, HS, SC, SP
- 6 ortholog databases: HomoloGene, orthoMCL v5, InParanoid v8, eggNOG v4, COG/KOG, PomBase
- Pathway data: KEGG and Reactome annotations per protein
- Screen data: Published cell-size regulatory gene lists (Jorgensen, Moretto, Hayles, Björklund, Neumann)
- Query levels: `orth` (all orthologs), `path` (with pathways), `same` (shared pathways), `size_mut1–6` (screen hit filters)
- Output: Interactive sortable/searchable DataTable + CSV download
- Data: 225,000 ortholog pairs across 40 MB of pre-processed flat files
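The multi-database lookup can be pictured as a filter over a table of ortholog pairs, where each pair is reported together with the databases that support it. The following is a minimal Python sketch of that idea, not the tool's PHP implementation. The column names (`source_db`, `species_a`, `protein_a`, `species_b`, `protein_b`), the protein identifiers, and the sample rows are all hypothetical, because the layout of the merged data file is not described here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real column layout of the merged file may differ.
SAMPLE_CSV = """source_db,species_a,protein_a,species_b,protein_b
eggNOG,SP,sp_prot1,HS,hs_prot1
InParanoid,SP,sp_prot1,HS,hs_prot1
HomoloGene,SP,sp_prot1,SC,sc_prot1
eggNOG,SP,sp_prot2,DM,dm_prot1
"""


def find_orthologs(rows, species, protein):
    """Return {(target species, target protein): sorted list of supporting databases}."""
    support = defaultdict(set)
    for row in rows:
        if row["species_a"] == species and row["protein_a"] == protein:
            support[(row["species_b"], row["protein_b"])].add(row["source_db"])
    return {pair: sorted(dbs) for pair, dbs in support.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
hits = find_orthologs(rows, "SP", "sp_prot1")
for (target_species, target_protein), dbs in sorted(hits.items()):
    print(target_species, target_protein, "supported by", ", ".join(dbs))
```

Grouping by target pair rather than by database makes it easy to see how many independent sources agree on a given ortholog relationship.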
Gene Ontology annotation extension via eggNOG ortholog groups, visualized with Venn diagrams.
- 7 species: AT, CE, DM, DR, HS, SC, SP
- 85 GO Slim Generic terms (from GO Consortium)
- Venn diagram types: Classic Venn (2–7 sets) and Edwards-Venn (2–7 sets)
- Configurable threshold: Minimum species co-occurrence (2–7)
- 4 result tables: Filtered hits, expanded list, conserved core, novel annotation suggestions
- Data: 75 eggNOG export CSVs + 12 SVG diagram templates
- Backend: MySQL queries against `orthology_databases` and `geneontology` tables
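To illustrate how Venn regions and the co-occurrence threshold relate, here is a small Python sketch under stated assumptions. It is not the tool's `VennDiagram` code. The group identifiers (`G1`, `G2`, ...) are hypothetical, and the function names are introduced only for this example. Each item is assigned to the one region made up of exactly the species in which it occurs, and the threshold then keeps regions that span at least the requested number of species.

```python
def venn_regions(sets_by_species):
    """Map each exclusive region (frozenset of species codes) to the items found in exactly those species."""
    membership = {}
    for species, items in sets_by_species.items():
        for item in items:
            membership.setdefault(item, set()).add(species)
    regions = {}
    for item, species in membership.items():
        regions.setdefault(frozenset(species), set()).add(item)
    return regions


def filter_by_threshold(regions, min_species):
    """Keep only regions shared by at least `min_species` species."""
    return {region: items for region, items in regions.items() if len(region) >= min_species}


# Hypothetical ortholog groups carrying one GO term, per species.
annotated_groups = {
    "HS": {"G1", "G2", "G3"},
    "SC": {"G1", "G2"},
    "SP": {"G1", "G4"},
}

regions = venn_regions(annotated_groups)
shared = filter_by_threshold(regions, min_species=2)
for region, items in sorted(shared.items(), key=lambda kv: -len(kv[0])):
    print(sorted(region), sorted(items))
```

With seven species there are up to 127 non-empty regions, so storing only the regions that actually contain items keeps the result compact.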
The GO Extension mode identifies potential gaps in Gene Ontology annotations by computing an H/M ratio for each GO term within each orthologous group:
- Ortholog group resolution: Proteins are mapped to COG/KOG orthologous groups via the eggNOG database (v4). UniProt KB accessions serve as the common identifier across all 7 species.
- GO annotation lookup: For each orthologous group, the tool retrieves GO Slim annotations for all member proteins from the Gene Ontology database.
- H/M score calculation: For each GO term within a group, the tool computes the ratio of the number of species carrying the annotation (H, Homology) to the total number of species with members in the group (M, Membership). A high H/M ratio combined with a missing annotation in one species suggests a candidate for annotation extension.
- Visualization: Results are presented in interactive DataTables and Edwards-Venn diagrams (2–7 species) showing the overlap and gaps of annotations across the selected species.
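The H/M calculation described above can be sketched in a few lines of Python. This is an illustration of the ratio as defined in the text, not the `QueryGO` implementation. The protein identifiers, the sample annotations, and the `min_ratio` cut-off of 0.75 are assumptions made for the example; the GO identifiers are used only as labels.

```python
def hm_scores(group_members, annotations):
    """
    group_members: {species code: set of proteins} for one orthologous group.
    annotations:   {protein: set of GO terms}.
    Returns {GO term: {"H": ..., "M": ..., "ratio": ..., "missing": [...]}}.
    """
    # M counts the species that have at least one member in the group.
    present = {species for species, proteins in group_members.items() if proteins}
    m = len(present)
    carriers = {}
    for species in present:
        for protein in group_members[species]:
            for term in annotations.get(protein, ()):
                carriers.setdefault(term, set()).add(species)
    return {
        term: {
            "H": len(species_set),  # species carrying the annotation
            "M": m,
            "ratio": len(species_set) / m,
            "missing": sorted(present - species_set),
        }
        for term, species_set in carriers.items()
    }


def find_candidates(scores, min_ratio):
    """Terms with a high H/M ratio that are still missing in at least one species."""
    return {
        term: s["missing"]
        for term, s in scores.items()
        if s["ratio"] >= min_ratio and s["missing"]
    }


# Hypothetical group with members in four species.
group = {"HS": {"hs_p1"}, "SC": {"sc_p1"}, "SP": {"sp_p1"}, "DM": {"dm_p1"}}
go = {
    "hs_p1": {"GO:0007049"},
    "sc_p1": {"GO:0007049"},
    "sp_p1": {"GO:0007049", "GO:0006412"},
}

scores = hm_scores(group, go)
print(find_candidates(scores, min_ratio=0.75))
```

In this example three of the four species carry `GO:0007049`, so H/M is 3/4 and the unannotated species (DM) is flagged as a candidate. The second term is carried by only one species and falls below the cut-off.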
| Organism | Code | Taxonomy ID | Mode A | Mode B |
|---|---|---|---|---|
| Arabidopsis thaliana | AT | 3702 | Yes | Yes |
| Caenorhabditis elegans | CE | 6239 | No | Yes |
| Drosophila melanogaster | DM | 7227 | Yes | Yes |
| Danio rerio | DR | 7955 | No | Yes |
| Homo sapiens | HS | 9606 | Yes | Yes |
| Saccharomyces cerevisiae | SC | 559292 | Yes | Yes |
| Schizosaccharomyces pombe | SP | 4896 | Yes | Yes |
| Tool | URL | Status |
|---|---|---|
| Ortholog Search & GO Tool | orthologfindertool.com | Online |
- PHP 5.6 or higher
- MySQL 5.7 or higher
- Apache or compatible web server
1. Clone the repository:

   ```bash
   git clone https://github.com/ZoliQua/Ortholog-Finder-Tool.git
   cd Ortholog-Finder-Tool
   ```

2. Create a `.env` file from the template:

   ```bash
   cp .env.example .env
   ```

3. Edit `.env` with your database credentials:

   ```
   DB_HOST=localhost
   DB_USER=root
   DB_PASS=your_password
   DB_NAME=ortholog
   DB_SOCKET=
   DB_PORT=3306
   ```

4. Set up the MySQL database:

   For Mode A (Ortholog Search): No MySQL tables are required for the core query; all data is file-based (CSV/TSV). The MySQL connection is only used for GeoIP visitor logging (`ip2c` table).

   For Mode B (GO Extension): Requires the `orthology_databases` and `geneontology` tables in the `orthology` database. These contain the ortholog mappings and GO annotation data used by the `QueryGO` class.

5. Point your web server to the project root and open `index.php` in a browser.
| Variable | Description | Default |
|---|---|---|
| `DB_HOST` | MySQL hostname | `localhost` |
| `DB_USER` | MySQL username | `root` |
| `DB_PASS` | MySQL password | (empty) |
| `DB_NAME` | MySQL database name | `ortholog` |
| `DB_SOCKET` | Unix socket path (optional, for MAMP/XAMPP) | (empty) |
| `DB_PORT` | MySQL port (0 = default) | `3306` |
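Before deploying, it can help to sanity-check a `.env` file against the variables listed above. The following Python sketch is a standalone helper written for illustration and is not part of the project. It assumes plain `KEY=VALUE` lines and treats lines starting with `#` as comments; whether the tool's own PHP loader handles comments or quoting the same way is not known. The specific checks in `check_settings` are likewise assumptions.

```python
# Defaults taken from the configuration table.
DEFAULTS = {
    "DB_HOST": "localhost",
    "DB_USER": "root",
    "DB_PASS": "",
    "DB_NAME": "ortholog",
    "DB_SOCKET": "",
    "DB_PORT": "3306",
}


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and lines starting with '#'."""
    settings = dict(DEFAULTS)
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def check_settings(settings):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if not settings["DB_HOST"] and not settings["DB_SOCKET"]:
        problems.append("set DB_HOST or DB_SOCKET")
    if not settings["DB_NAME"]:
        problems.append("DB_NAME is empty")
    if not settings["DB_PORT"].isdigit():
        problems.append("DB_PORT must be a non-negative integer")
    return problems


example = """# local development
DB_HOST=localhost
DB_USER=root
DB_PASS=your_password
DB_NAME=ortholog
DB_SOCKET=
DB_PORT=3306
"""

settings = parse_env(example)
print(check_settings(settings) or "no problems found")
```

To check a real file, read it with `open(".env").read()` and pass the text to `parse_env`.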
```
orthologfindertool-v1.1/
│
├── index.php                                   # Redirect to main.php
├── main.php                                    # Unified router (mode switching)
├── download.php                                # CSV export handler
├── dumper.php                                  # GO batch export (all 85 terms)
│
├── includes/
│   ├── mysql.php                               # Database connection (.env based)
│   ├── mylog.php                               # Visitor logging (CSV + GeoIP)
│   ├── ip2country.php*                         # GeoIP country detection
│   │
│   ├── functions.php                           # FajlBeolvas + Lekeres classes (Mode A core)
│   ├── inc_analyzer.php                        # QueryGO class (Mode B core)
│   ├── inc_analyzer_dumper.php                 # QueryGOExport (batch CSV)
│   ├── inc_functions.php                       # VennDiagram + SVG_File + helpers
│   ├── inc_variables.php                       # Species, GO terms, POST handling
│   │
│   ├── page_landing.php                        # Mode selector (landing page)
│   ├── page_ortholog_form.php                  # Mode A: query form
│   ├── page_ortholog_results.php               # Mode A: results + DataTable
│   ├── page_2_analysis_{species}.php           # Mode A: per-species table templates
│   ├── page_go_analyzer.php                    # Mode B: GO form + results + 4 DataTables
│   ├── page_sources.php                        # References page
│   └── page_aboutus.php                        # About page
│
├── _dataset/                                   # Mode A data files (40 MB)
│   ├── ALL_ortholog_dbs_merged.csv             # 225K ortholog pairs (6 DBs)
│   ├── kegg_pathways_uniprot.tsv               # KEGG pathway annotations
│   ├── reactome_pathways_uniprot.tsv           # Reactome pathway annotations
│   ├── regular_names.txt                       # UniProt ID → gene name mapping
│   └── {AT,DM,HS,SC,SP}_interact_deg2_exp.csv  # Cell-size screen data
│
├── source/                                     # Mode B source data (15 MB)
│   ├── eggNOG-export-*-7.csv (×75)             # eggNOG ortholog group exports
│   └── ortholog_*_sample.svg (×12)             # Venn diagram SVG templates
│
├── output/                                     # Mode B generated output (15 MB)
│   ├── eggNOG-export-*-7.csv (×75)             # Processed results
│   └── GO*-Venn-Diagram-*.svg (×49)            # Generated Venn diagrams
│
├── _query/                                     # Mode A pre-computed results (23 MB)
│   └── jsonquery_{species}.txt                 # DataTables JSON (per species)
│
├── media/
│   ├── css/unified.css                         # Merged stylesheet
│   ├── js/                                     # jQuery 1.11.2 + DataTables 1.10.5
│   └── images/                                 # Logos, icons, organism photos
│
├── _log/                                       # Visitor logs (not tracked)
├── .env.example                                # Environment template
├── .gitignore
└── LICENSE.md                                  # GNU GPL v2
```
| URL | Page |
|---|---|
| `main.php` | Landing page: mode selector |
| `main.php?mode=ortholog` | Ortholog Search form |
| `main.php?mode=go` | GO Extension form |
| `main.php?page=source` | References |
| `main.php?page=about` | About Us |
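The URL scheme in the table can be read as a small lookup on the `mode` and `page` query parameters. The Python sketch below mirrors that table for illustration only; it is not the routing code in `main.php`. Two behaviours are assumptions: that `mode` takes precedence over `page` when both are present, and that unknown values fall back to the landing page.

```python
from urllib.parse import parse_qs, urlparse

LANDING = "Landing page: mode selector"
PAGES_BY_MODE = {"ortholog": "Ortholog Search form", "go": "GO Extension form"}
PAGES_BY_NAME = {"source": "References", "about": "About Us"}


def resolve(url):
    """Return the page that a main.php URL maps to, per the table above."""
    query = parse_qs(urlparse(url).query)
    mode = query.get("mode", [None])[0]
    page = query.get("page", [None])[0]
    if mode in PAGES_BY_MODE:  # assumption: mode wins over page
        return PAGES_BY_MODE[mode]
    if page in PAGES_BY_NAME:
        return PAGES_BY_NAME[page]
    return LANDING  # assumption: unknown values fall back to the landing page


for url in ("main.php", "main.php?mode=go", "main.php?page=about"):
    print(url, "->", resolve(url))
```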
| Database | Version | Date |
|---|---|---|
| BioGRID | 3.2.112 | April 2014 |
| eggNOG | 4.0 | December 2013 |
| InParanoid | 8.0 | December 2013 |
| orthoMCL | 5 | 2013 |
| HomoloGene | - | 2014 |
| COG/KOG | - | 2014 |
| Database | URL |
|---|---|
| KEGG | genome.jp/kegg |
| Reactome | reactome.org |
| Database | Version | Purpose |
|---|---|---|
| UniProt | 2014_04 | ID mapping & protein names |
| PomBase | V2.19 | Pombe–Human/Cerevisiae curated orthologs |
| IntAct | 2013-11-20 | Protein interaction data |
| MitoCheck | ens73 | H. sapiens phenotype data |
| Source | Details |
|---|---|
| Gene Ontology | 85 GO Slim Generic terms |
| eggNOG v4 | Ortholog group → protein mapping |
Dul, Z. (2019). A system-level approach to identify novel cell size regulators. PhD thesis, King's College London. KCL Pure
| Species | Publication | Journal | Year |
|---|---|---|---|
| S. cerevisiae | Jorgensen P et al.: Systematic identification of pathways that couple cell growth and division in yeast | Science | 2002 |
| S. cerevisiae | Moretto F et al.: A pharmaco-epistasis strategy reveals a new cell size controlling pathway in yeast | Mol Syst Biol | 2013 |
| S. pombe | Hayles J et al.: A genome-wide resource of cell cycle and cell shape genes of fission yeast | Open Biology | 2013 |
| S. pombe | Graml V et al.: A genomic multiprocess survey of machineries that control and link cell shape, microtubule organization, and cell-cycle progression | Dev Cell | 2014 |
| D. melanogaster | Björklund M et al.: Identification of pathways regulating cell size and cell-cycle progression by RNAi | Nature | 2006 |
| H. sapiens | Neumann B et al.: Phenotypic profiling of the human genome by time-lapse microscopy reveals cell division genes | Nature | 2010 |
Ashburner M et al. (2000). Gene Ontology: tool for the unification of biology. Nature Genetics, 25(1), 25–29. doi:10.1038/75556
Powell S, Forslund K, Szklarczyk D, et al. (2014). eggNOG v4.0: nested orthology inference across 3686 organisms. Nucleic Acids Research, 42(D1), D231–D239. PubMed: 24297252
The UniProt Consortium (2015). UniProt: a hub for protein information. Nucleic Acids Research, 43(D1), D204–D212. PubMed: 25348405
| Library | Version | License |
|---|---|---|
| jQuery | 1.11.2 | MIT |
| jQuery DataTables | 1.10.5 | MIT |
| ip2country by Blagoj Janevski | - | GNU GPL v2 |
This tool was originally published as two separate web applications during 2015–2018:
- Ortholog Finder Tool (orthologfindertool.com): Multi-database ortholog search with pathway and screen data integration. Archived source: ZoliQua/Ortholog-Finder-Tool-Draft
- GeneOntology Extension Tool (go.orthologfindertool.com): GO annotation extension via eggNOG ortholog groups with Venn diagram visualization. Archived source: ZoliQua/Ortholog-Finder-Tool-GO
Version 1.6 (November 2025) unifies both tools into a single application with a shared navigation and mode-selection interface. No data or backend logic was changed; only the UI was consolidated.
The git history of this repository preserves the full commit histories of both original repositories.
Dr. Zoltán Dul: Dentist, PhD graduate (2014–2019), King's College London
- Email: zoltan.dul@gmail.com
- Prof. N. Shaun B. Thomas: Cell Cycle & Epigenetics Team, Division of Cancer Studies, King's College London
- Dr. Attila Csikász-Nagy: Csikász-Nagy Group, Randall Division of Cell and Molecular Biophysics, King's College London
- Dr. Azeddine Si Ammour: Genomics and Biology of Fruit Crop, Fondazione Edmund Mach, San Michele all'Adige
This project is licensed under the GNU General Public License v2.0.
Built at King's College London & Fondazione Edmund Mach