A curated collection of open datasets and ontologies from the TALOS AI4SSH project (University of Crete), supporting research in Digital Humanities, Computational Philology, Greek NLP, Semantic Web, and cultural heritage data.

TALOS-AI4SSH/Open-Data

TALOS Open Data Repository

Available languages: English | Ελληνικά


This repository provides a curated collection of open datasets and ontologies developed within the TALOS AI4SSH project (University of Crete, ERA Chair).
These resources support Digital Humanities, Computational Philology, and Semantic Web research, with a focus on modeling, annotating, and analyzing ancient and modern Greek textual, historical, and cultural data.


📂 Contents

The repository currently hosts 10 open datasets/ontologies:

  1. Open Dataset 1 – Modeling an Archaeological Site (Göbekli Tepe, Turkey)
    Event-centric ontology for archaeological site representation.

  2. Open Dataset 2 – Archaic Lyric Poetry Ontology (ALyrA)
    Ontoterminology for archaic lyric poets (799–430 BC).

  3. Open Dataset 3 – Modeling Events (Classical Period)
    Event ontologies for the Classical period.

  4. Open Dataset 4 – Modeling Events (Hellenistic Period)
    Event ontologies for the Hellenistic period.

  5. Open Dataset 5 – Ancient Greek and Chinese Philosophers Ontology
    Cross-cultural ontology modeling philosophers, their works, and intellectual traditions.

  6. Open Dataset 6 – Collection of Datasets on Greek NLU (OYXOY)
    Benchmarking suite for Greek Natural Language Understanding.

  7. Open Dataset 7 – Greek Dialect Corpus
    A large-scale corpus of Greek dialects (Cypriot, Cretan, Pontic, Northern Greek).

  8. Open Dataset 8 – Modern Greek Literature
    Digitized corpus of interwar poetry and prose.

  9. Open Dataset 9 – Ancient Oratory Ontology
    Ontology of legal proceedings in Athenian courts (419–323 BC).

  10. Open Dataset 10 – Ontology of Legal Bodies in Classical Athens
    Ontology defining legal bodies presiding over/judging cases in Classical Athens.


Usage

Each dataset includes:

  • Ontology files (OWL/RDF) for use in semantic web applications.
  • Ontodictionaries (HTML) with terms and proper names.
  • Visualization options (WebVOWL, Protégé).
  • SPARQL queries and competency questions.
  • Zenodo DOI links for citation and reproducibility.

To explore a dataset:

  1. Navigate to its corresponding folder.
  2. Follow the instructions in the dataset-specific README.
  3. Cite using the provided Zenodo DOI.
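
Once an ontology file has been downloaded, its contents can also be inspected programmatically. The following is a minimal sketch, assuming the file is serialized as RDF/XML and using only the Python standard library. The embedded `SAMPLE_OWL` string, the `http://example.org/demo#` IRIs, and the `list_classes` helper are hypothetical illustrations and are not taken from any dataset in this repository.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# Standard namespace URIs for RDF, RDFS and OWL.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical stand-in for a dataset's OWL file (assumption, not real data).
SAMPLE_OWL = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/demo#Poet">
    <rdfs:label xml:lang="en">Poet</rdfs:label>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/demo#Poem"/>
</rdf:RDF>
"""


def list_classes(rdf_xml: str) -> List[Tuple[str, Optional[str]]]:
    """Return (IRI, label) pairs for every named owl:Class in an RDF/XML string."""
    root = ET.fromstring(rdf_xml)
    classes = []
    for node in root.iter(f"{{{OWL}}}Class"):
        iri = node.get(f"{{{RDF}}}about")
        if iri is None:
            continue  # skip anonymous class expressions
        # findtext returns None when the class has no rdfs:label child
        label = node.findtext(f"{{{RDFS}}}label")
        classes.append((iri, label))
    return classes


if __name__ == "__main__":
    for iri, label in list_classes(SAMPLE_OWL):
        print(iri, "-", label or "(no label)")
```

To apply this to a real file, read its contents and pass them to `list_classes`. Files in other serializations (for example Turtle) would need a dedicated RDF library, which is also the more robust choice for running the SPARQL queries shipped with each dataset.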

Citation

If you use resources from this repository, please cite the relevant dataset(s) individually (see each README).

For general reference to the collection:

TALOS AI4SSH Project (2024). TALOS Open Data Repository. University of Crete.


License

All datasets are distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.
You are free to share and redistribute the material, provided you meet the following conditions:

  • BY: Credit must be given to the creator(s).
  • NC: Only non-commercial uses are permitted.
  • ND: No derivatives or adaptations are allowed.

Contributions

Contributions, feedback, and collaborations are warmly welcomed.
Please contact the corresponding authors listed in each dataset’s README or open an issue in this repository.


More Information

  • TALOS Project Website: talos-ai4ssh.uoc.gr
  • Publications and supplementary material are available via the Zenodo DOIs linked in each dataset.
