Fact Annotation is a lightweight Streamlit-based tool for semi-structured text annotation, designed for tasks that go beyond traditional Named Entity Recognition. Instead of defining entities based on their mentions in text, this tool allows annotators to highlight mentions of factual information by associating aliases with abstract "facts" present in the text. It is especially useful for building datasets where the goal is to identify and label information without enforcing strict mention-entity ties, for example disambiguated facts or reference points used in downstream reasoning or in evaluating information extraction systems.
| Tool | Version |
|---|---|
| Python | 3.11 (confirmed) – 3.12 should also work |
| pip | 23 or later |
| Other dependencies | Installed automatically via requirements.txt |
Tip: Replace `python3.11` with `python` or `python3` if those point to 3.11+ on your system.
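
To see which interpreter a given command resolves to before creating the virtual environment, a small check like the one below may help. It is only a sketch: the `MIN_VERSION` constant and the `is_supported` helper are hypothetical names, not part of this repository, and the 3.11 minimum is taken from the requirements table.

```python
import sys

# Assumption: 3.11 is the minimum version, as listed in the requirements table.
MIN_VERSION = (3, 11)


def is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {current} should be fine for the setup steps.")
    else:
        print(f"Python {current} is older than 3.11; pick a newer interpreter.")
```

Saving this under any file name and running it with `python`, `python3`, and `python3.11` in turn shows which of those commands points to a suitable interpreter.
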
Clone the repository

    git clone https://github.com/ViktorooReps/fact-annotation
    cd fact-annotation

Create & activate a virtual environment

    python3.11 -m venv .venv
    source .venv/bin/activate # Windows: .venv\Scripts\activate

Install dependencies

    pip install -r requirements.txt

Launch the app

    streamlit run app.py

Create a Service Account in the Google Cloud console:
- enable the Google Sheets API
- generate a JSON key file.
Share every Sheet you plan to use with the account’s client_email (Editor permission) or make the sheet public.
Add the credentials to secrets.toml in the .streamlit directory at the project root:
[gcp_service_account]
type = "service_account"
project_id = "(...)"
private_key_id = "(...)"
private_key = "-----BEGIN PRIVATE KEY-----(...)-----END PRIVATE KEY-----\n"
client_email = "(...)@(...).iam.gserviceaccount.com"
client_id = "(...)"
auth_uri = "https://accounts.google.com/o/oauth2/auth"
token_uri = "https://oauth2.googleapis.com/token"
auth_provider_x509_cert_url = "https://www.googleapis.com/oauth2/v1/certs"
client_x509_cert_url = "https://www.googleapis.com/robot/v1/metadata/(...)/(...)%40(...).iam.gserviceaccount.com"
universe_domain = "googleapis.com"