Corrections to LatinISE:

extract_quadruples_LatinISE.py

correct_lemmas_pos_LatinISE.py

Prepare corpus data for SemEval task 1:

split_sentences_LatinISE.py

Calculate inter-annotator agreement for SemEval task 1 annotated data:

interannotator_LatinISE.py

Prepare annotated data to be processed by Dominik's script that clusters senses:

process_annotations_LatinISE.py

Prepare subcorpora for SemEval:

prepare_LatinISE_for_SemEval.py

Run checks on SemEval subcorpora:

semeval_check.py

NB: this script has a known error. It looks for the left contexts of annotated sentences in the subcorpora, but the subcorpora are lemmatized, so the contexts are never found.
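The mismatch described in the note above can be reproduced on toy data. The sketch below is illustrative only: the tokens, the `FORM_TO_LEMMA` mapping and the helper names are hypothetical and are not taken from `semeval_check.py` or from LatinISE. It shows why a context made of inflected forms is not found in a lemmatized token list, and one possible workaround (lemmatizing the context before searching).

```python
# Hypothetical illustration: all data below is made up, not taken from LatinISE.

def find_context(context_tokens, corpus_tokens):
    """Return the index where context_tokens occurs in corpus_tokens, or -1."""
    n = len(context_tokens)
    for i in range(len(corpus_tokens) - n + 1):
        if corpus_tokens[i:i + n] == context_tokens:
            return i
    return -1

# Toy form-to-lemma mapping (an assumption; the real corpus carries its own lemmas).
FORM_TO_LEMMA = {"arma": "arma", "virumque": "vir", "cano": "cano"}

def lemmatize(tokens, mapping=FORM_TO_LEMMA):
    """Map each word form to its lemma, leaving unknown forms unchanged."""
    return [mapping.get(t, t) for t in tokens]

lemmatized_corpus = ["arma", "vir", "cano", "Troia", "qui"]
left_context = ["arma", "virumque"]  # inflected forms from an annotated sentence

# Searching with the raw forms fails because "virumque" never occurs as a lemma.
raw_hit = find_context(left_context, lemmatized_corpus)

# Searching with the lemmatized context succeeds.
lemma_hit = find_context(lemmatize(left_context), lemmatized_corpus)
```

Whether lemmatizing the contexts is the right fix for the actual script depends on how the annotated sentences were exported, which this README does not specify.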

Prepare final files:

defaults write com.apple.Finder AppleShowAllFiles true

killall Finder

cd "/Users/bmcgillivray/Documents/OneDrive/The Alan Turing Institute/OneDrive - The Alan Turing Institute/Research/2019/Latin corpus/LatinISE 4/for Codalab/semeval2020_ulscd_lat/corpus1/lemma"

gzip LatinISE1.gz

cd "/Users/bmcgillivray/Documents/OneDrive/The Alan Turing Institute/OneDrive - The Alan Turing Institute/Research/2019/Latin corpus/LatinISE 4/for Codalab/semeval2020_ulscd_lat/corpus2/lemma"

gzip LatinISE2.gz
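The compression step above can also be done from Python, which avoids changing directories by hand. This is a minimal sketch using only the standard library; the function name `gzip_file` is hypothetical, and the name of the uncompressed corpus file is not given in this README, so it is left as a parameter.

```python
import gzip
import shutil
from pathlib import Path

def gzip_file(src):
    """Compress src to src + '.gz' and return the new path; src is left in place."""
    src = Path(src)
    dest = src.with_name(src.name + ".gz")
    # Stream the file in chunks so large corpus files are not loaded into memory.
    with open(src, "rb") as f_in, gzip.open(dest, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dest
```

Unlike the command-line `gzip`, this sketch keeps the original file next to the compressed copy.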

Prepare annotated data for SemEval resource paper:

prepare_annotations_for_SemEval.py

Folder Nexus WG4 UC4.2.1:

This folder contains the code for the work done for the Nexus Linguarum COST action, Working Group 4, use case 4.2.1 on humanities.

process_LatinISE_for_Nexus.py by Barbara McGillivray

This Python 3 script takes as input the .xml files from the LatinISE corpus, and returns a series of files that can be used to train diachronic word embeddings according to the following format:

file name: <language_code><file_identifier>, where
- <language_code> (see Dublin Core Language and ISO 639.2), e.g. lat for Latin
- the date component follows ISO 8601 year numbering, see https://en.wikipedia.org/wiki/ISO_8601: 1BCE=+0000, 2BCE=-0001, 1CE=+0001, etc.
- <file_identifier> = specific to each dataset: general metadata, according to Dublin Core terms and depending on the availability of this type of information in the corpus (title, author, publisher, type, etc.), or alternatively links to the original files with metadata (if available)
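The date examples in the naming scheme above follow astronomical year numbering, in which 1 BCE is year zero. A minimal sketch of that conversion is given below; the function name `iso8601_year` and its `era` parameter are hypothetical and are not taken from `process_LatinISE_for_Nexus.py`.

```python
def iso8601_year(year, era):
    """Convert a BCE/CE year to a signed four-digit ISO 8601 style year string.

    ISO 8601 uses astronomical year numbering: 1 BCE is year 0000,
    2 BCE is -0001, and CE years keep their number.
    """
    if year < 1:
        raise ValueError("year must be a positive integer")
    if era == "CE":
        astronomical = year
    elif era == "BCE":
        astronomical = 1 - year
    else:
        raise ValueError("era must be 'BCE' or 'CE'")
    sign = "-" if astronomical < 0 else "+"
    return f"{sign}{abs(astronomical):04d}"
```

For example, `iso8601_year(2, "BCE")` returns `"-0001"` and `iso8601_year(1, "CE")` returns `"+0001"`.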

Final_STSM_medical_latin.ipynb by Paola Marongiu (adapted from previous script by Barbara McGillivray)

This notebook analyses Latin medical lexicon in LatinISE with word embeddings.

Embedding_analysis.ipynb by Barbara McGillivray and Paola Marongiu

This notebook is based on the previous one and generalises the analysis to a set of terms related to "revolution". Code inspired by https://github.com/BarbaraMcG/latinise/blob/master/lvlt22/Semantic_change.ipynb.

Folder lvlt22:

Scripts for the paper presented at the "Latin vulgaire latin tardif" conference in September 2022: https://www.lvlt14.ugent.be/programme/

See readme file inside this folder for instructions.

Folder Christianity_semantic_change:

Code for Valentina Lunardi's project during her visit to King's College London.

