This repository contains the research conducted by Team Vulcan during the International Semantic Web Research Summer School (ISWS 2025), based on the research problem proposed by Prof. Harald Sack.
Our study explores the potential of Large Language Models (LLMs) for Named Entity Recognition (NER) and Entity Linking (EL) in historical, multilingual, and culturally complex texts, focusing on Giorgio Vasari’s *Le vite de’ più eccellenti pittori, scultori e architettori* (1568).
We investigated prompt-based techniques for semantic information extraction without fine-tuning, and examined how integrating the CIDOC-CRM ontology can scaffold cultural heritage entity recognition and knowledge graph construction.
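As a rough illustration of what ontology-scaffolded prompting can look like, the sketch below builds a prompt that restricts entity labels to a few CIDOC-CRM classes and then filters a model's JSON answer. It is not the team's actual pipeline: the class selection, the prompt wording, the helper names `build_prompt` and `parse_entities`, the example passage, and the hand-written mock response are all assumptions made for this example, and no LLM is called.

```python
import json

# Illustrative subset of CIDOC-CRM classes (the selection is an assumption).
CRM_CLASSES = {
    "E21_Person": "a real individual human being",
    "E53_Place": "a geographic location",
    "E22_Human-Made_Object": "a physical artwork or artefact",
}


def build_prompt(passage: str) -> str:
    """Build a zero-shot NER prompt constrained to the classes above."""
    class_lines = "\n".join(
        f"- {name}: {desc}" for name, desc in CRM_CLASSES.items()
    )
    return (
        "Extract named entities from the passage below.\n"
        "Label each entity with exactly one of these CIDOC-CRM classes:\n"
        f"{class_lines}\n"
        'Answer with a JSON list of objects with keys "mention" and "class".\n\n'
        f"Passage: {passage}"
    )


def parse_entities(raw: str, passage: str) -> list[dict]:
    """Keep well-formed entities with a known class whose mention occurs in the passage."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    kept = []
    for item in items:
        if not isinstance(item, dict):
            continue
        mention = item.get("mention")
        cls = item.get("class")
        if (
            isinstance(mention, str)
            and isinstance(cls, str)
            and cls in CRM_CLASSES
            and mention in passage
        ):
            kept.append({"mention": mention, "class": cls})
    return kept


# Hypothetical passage (not a quotation from Vasari) and a hand-written
# stand-in for a model response; "Rome" simulates a hallucinated entity.
PASSAGE = "Giotto was born near Florence."
MOCK_RESPONSE = json.dumps([
    {"mention": "Giotto", "class": "E21_Person"},
    {"mention": "Florence", "class": "E53_Place"},
    {"mention": "Rome", "class": "E53_Place"},
])

entities = parse_entities(MOCK_RESPONSE, PASSAGE)
```

Checking that each mention literally occurs in the passage is one simple guard against hallucinated entities. Entity Linking would need an additional lookup step against a knowledge base, which this sketch leaves out.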
- RQ1: How accurately and consistently can LLMs perform NER and EL on multilingual, historical documents?
- RQ2: Can annotators create reliable ground truths without being fluent in the source language?
- RQ3: Can ontologies enhance the evidence-finding process and the reliability of LLM-extracted data?
- Gauri Bhagwat
- Kristian Noullet
- Balázs Mosolygó
- Ruben Peeters
- Lucrezia Pograri