Adeoye Sunday edited this page Nov 27, 2024 · 24 revisions

Semi-Automatic Text Annotation System

Overview

STAS is a modular framework for iterative annotation in machine learning tasks such as text classification and sequence labeling. Machine learning models propose labels for the data, and human annotators then review and validate those labels. The architecture covers sample selection, annotation, stopping conditions, and iterative fine-tuning of models.

A Streamlit-based graphical user interface (GUI) lets users manage the annotation process, validate annotations, and upload datasets through a web interface.

Key Features

  • Iterative Annotation: Alternates between fine-tuning models and annotating data, so the model can improve with each round of validated labels.
  • Configurable Stopping Conditions: Allows you to define custom conditions to stop the annotation process based on metrics such as the acceptance rate or other criteria.
  • Flexible Annotation Model: Supports both text classification and sequence labeling annotation tasks.
  • Sample Selection: Automatically selects samples for annotation using different selection strategies (e.g., random selection).
  • Streamlit UI: Provides a simple and interactive web interface for annotators to log in, manage annotations, validate sample labels, and upload datasets.
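
To make the interaction between sample selection, human validation, fine-tuning, and stopping conditions concrete, here is a minimal Python sketch of one possible annotation loop. It is not the STAS API: every name in it (`RoundResult`, `select_random`, `acceptance_rate_reached`, `annotate_iteratively`) is hypothetical, and the model, reviewer, and fine-tuning step are passed in as plain callables.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class RoundResult:
    """Outcome of one annotation round (hypothetical structure, not part of STAS)."""
    proposed: int  # labels proposed by the model in this round
    accepted: int  # proposals the human reviewer left unchanged

    @property
    def acceptance_rate(self) -> float:
        return self.accepted / self.proposed if self.proposed else 0.0


def select_random(pool: Sequence[str], k: int,
                  rng: random.Random) -> Tuple[List[str], List[str]]:
    """Random selection strategy: split the pool into (batch, remaining samples)."""
    chosen = set(rng.sample(range(len(pool)), min(k, len(pool))))
    batch = [s for i, s in enumerate(pool) if i in chosen]
    rest = [s for i, s in enumerate(pool) if i not in chosen]
    return batch, rest


def acceptance_rate_reached(threshold: float) -> Callable[[RoundResult], bool]:
    """Build a stopping condition that fires once the rate is at or above threshold."""
    def condition(result: RoundResult) -> bool:
        return result.acceptance_rate >= threshold
    return condition


def annotate_iteratively(
    pool: Sequence[str],
    predict: Callable[[str], str],               # model proposes a label
    review: Callable[[str, str], str],           # human returns the final label
    fine_tune: Callable[[List[Tuple[str, str]]], None],
    should_stop: Callable[[RoundResult], bool],
    batch_size: int = 2,
    seed: int = 0,
) -> Tuple[List[Tuple[str, str]], List[RoundResult]]:
    """Run select -> predict -> review -> fine-tune rounds until told to stop."""
    rng = random.Random(seed)
    remaining = list(pool)
    labeled: List[Tuple[str, str]] = []
    history: List[RoundResult] = []
    while remaining:
        batch, remaining = select_random(remaining, batch_size, rng)
        accepted = 0
        for text in batch:
            proposal = predict(text)
            final = review(text, proposal)
            accepted += int(final == proposal)
            labeled.append((text, final))
        fine_tune(labeled)  # assumed to update the model behind predict
        result = RoundResult(proposed=len(batch), accepted=accepted)
        history.append(result)
        if should_stop(result):
            break
    return labeled, history


if __name__ == "__main__":
    texts = ["a good one", "bad", "good too", "meh"]
    labeled, history = annotate_iteratively(
        texts,
        predict=lambda t: "pos" if "good" in t else "neg",
        review=lambda t, label: label,   # toy reviewer that accepts every proposal
        fine_tune=lambda data: None,     # placeholder: no real training here
        should_stop=acceptance_rate_reached(0.9),
    )
    print(len(labeled), [r.acceptance_rate for r in history])
```

Passing the stopping condition as a callable keeps it configurable: a different criterion, such as a maximum number of rounds, can be swapped in without changing the loop itself.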
