Project Structure
The STAS project is organized into a modular directory structure, allowing for scalability, maintainability, and separation of concerns. Below is a detailed overview of each module and its responsibilities.
src/
│
├── annotation/ # Annotation types
├── api/ # Exposes the application interface (controller and UI)
├── dao/ # Manages database operations (Data Access Objects)
├── data/ # Contains datasets and data-related files
├── i_entities/ # Defines core interfaces and base entities
├── metric/ # Implements evaluation metrics for model performance
├── model/ # Contains machine learning models and related utilities
├── sample/ # Sample creation
├── selector/ # Implements logic for sample selection
├── stopping_conditions/ # Handles stopping criteria for iterative processes
├── utils/ # Utility scripts for common operations
└── config.yaml # Configuration file for project parameters
annotation/ handles the logic for annotating data and supports both classification and sequence-based annotations.
- classification_annotation.py: Implements functionality for text classification annotation.
- sequence_annotation.py: Handles sequence-based annotation tasks (e.g., Named Entity Recognition).
api/ exposes the interfaces for interacting with the system: a controller for backend logic and a UI module for user interaction.
- controller.py: Manages request handling, orchestrating between modules.
- ui.py: Implements a basic user interface for interacting with the system.
dao/ manages data storage and retrieval, including database-specific implementations.
- mongo_dao.py: Provides CRUD operations for a MongoDB backend.
i_entities/ defines the interfaces and base entities used throughout the project, which keeps the other modules modular and extensible.
- annotation_interface.py: Abstracts the annotation process.
- dao_interface.py: Standardizes DAO implementations.
- model_interface.py: Provides a blueprint for machine learning models.
- sample_interface.py: Defines the structure and behavior of data samples.
- stop_condition_interface.py: Interface for stopping condition implementations.
- Additional files: Base classes for metrics, experiments, iteration, logging, etc.
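As a rough illustration of how interfaces like these keep modules interchangeable, the sketch below defines a stopping-condition interface with Python's standard `abc` module. The names `StopConditionInterface`, `should_stop`, and `MaxIterations` are hypothetical and are not taken from the STAS source.

```python
from abc import ABC, abstractmethod


class StopConditionInterface(ABC):
    """Hypothetical interface; the real one lives in stop_condition_interface.py."""

    @abstractmethod
    def should_stop(self, iteration: int) -> bool:
        """Return True when the iterative process should end."""


class MaxIterations(StopConditionInterface):
    """Illustrative implementation: stop after a fixed number of iterations."""

    def __init__(self, limit: int) -> None:
        self.limit = limit

    def should_stop(self, iteration: int) -> bool:
        return iteration >= self.limit
```

Code that depends only on the abstract class can accept any concrete stopping condition, and Python refuses to instantiate a subclass that leaves `should_stop` unimplemented.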
metric/ implements evaluation metrics for assessing model performance.
- metric_factory.py: Factory pattern for creating metric instances.
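The factory pattern recurs across metric/, model/, sample/, and selector/. A minimal sketch of the idea follows, assuming a metric is a plain function of true and predicted labels and that metrics are looked up by name. `MetricFactory`, `create`, and `accuracy` are illustrative names, not the actual contents of metric_factory.py.

```python
from typing import Callable, Dict, Sequence

# Assumption: a metric maps (true labels, predicted labels) to a score.
Metric = Callable[[Sequence[str], Sequence[str]], float]


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of positions where the prediction matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        return 0.0
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


class MetricFactory:
    """Hypothetical factory that resolves a metric by its configured name."""

    _registry: Dict[str, Metric] = {"accuracy": accuracy}

    @classmethod
    def create(cls, name: str) -> Metric:
        try:
            return cls._registry[name]
        except KeyError:
            raise ValueError(f"Unknown metric: {name!r}") from None
```

With this shape, callers write `MetricFactory.create("accuracy")` and never import a concrete metric directly, so a new metric only needs a registry entry.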
model/ contains machine learning models and related utilities.
- ner_model.py: Implements a Named Entity Recognition (NER) model.
- model_factory.py: Factory for creating and managing models.
sample/ manages the creation and manipulation of data samples.
- sample_factory.py: Factory for creating data samples.
- TextClassificationSample.py: Manages text classification sample creation.
- sequence_to_Sequence_sample.py: Manages sequence-to-sequence samples.
selector/ implements the logic for selecting samples from datasets for annotation or training.
- selector_factory.py: Factory for creating sample selectors.
- random_selector.py: Randomly selects samples.
- __init__.py: Module initializer.
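For orientation, a random selector can be as small as the sketch below. It assumes the selector receives a pool of unlabeled samples and a batch size; the class name `RandomSelector`, the `select` method, and the `seed` parameter are hypothetical and may differ from random_selector.py.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


class RandomSelector:
    """Illustrative selector: draws samples uniformly without replacement."""

    def __init__(self, seed: Optional[int] = None) -> None:
        # A private generator keeps selection reproducible for a given seed.
        self._rng = random.Random(seed)

    def select(self, pool: Sequence[T], k: int) -> List[T]:
        # Never request more samples than the pool contains.
        k = min(k, len(pool))
        return self._rng.sample(list(pool), k)
```

Random selection is a common baseline against which more informed selection strategies are compared.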
stopping_conditions/ implements stopping criteria for iterative processes such as training or annotation.
- acceptance_rate.py: Stopping condition based on acceptance rate.
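The document does not define the acceptance rate. The sketch below assumes it is the fraction of suggested annotations that the annotator accepts unchanged, and that iteration stops once this fraction reaches a threshold. `AcceptanceRateCondition`, `record`, and `threshold` are hypothetical names for illustration only.

```python
class AcceptanceRateCondition:
    """Illustrative stopping condition based on an assumed acceptance rate."""

    def __init__(self, threshold: float = 0.9) -> None:
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        self.threshold = threshold
        self.accepted = 0
        self.total = 0

    def record(self, accepted: bool) -> None:
        """Register one reviewed suggestion."""
        self.total += 1
        self.accepted += int(accepted)

    @property
    def rate(self) -> float:
        # With no reviews yet, report 0.0 rather than dividing by zero.
        return self.accepted / self.total if self.total else 0.0

    def should_stop(self) -> bool:
        # Require at least one review so an empty history never triggers a stop.
        return self.total > 0 and self.rate >= self.threshold
```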
utils/ contains utility scripts and helper functions used across modules.
- config_loader.py: Parses and loads configuration settings.
- loader.py: Handles data loading operations.
- config.yaml: Centralized configuration file for project parameters, located at the top level of src/ rather than inside utils/.
To find your way around the codebase:
- Start with api/controller.py to understand how the iterative annotation process works.
- Explore annotation/ and sample/ for data processing workflows.
- Look into model/ and metric/ for model training and evaluation.
- Use the helpers in utils/ for configuration and data loading.
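Before reading controller.py, it may help to picture the general shape of an iterative annotation loop: select a batch, let a model propose labels, let an annotator review them, and stop once enough proposals are accepted. The sketch below is one plausible reading of that flow and not the STAS controller. The function `run_annotation_loop` and all of its parameters are hypothetical.

```python
import random
from typing import Callable, List, Tuple


def run_annotation_loop(
    pool: List[str],
    propose: Callable[[str], str],
    review: Callable[[str, str], str],
    batch_size: int = 2,
    threshold: float = 0.9,
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Annotate batches until one batch's acceptance rate reaches `threshold`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    rng = random.Random(seed)
    remaining = list(pool)
    annotated: List[Tuple[str, str]] = []
    while remaining:
        # Selection step: random sampling stands in for any selector.
        batch = rng.sample(remaining, min(batch_size, len(remaining)))
        accepted = 0
        for text in batch:
            suggestion = propose(text)        # model proposes a label
            label = review(text, suggestion)  # annotator accepts or corrects it
            accepted += int(label == suggestion)
            annotated.append((text, label))
            remaining.remove(text)
        # A real system would retrain the model on `annotated` at this point.
        if accepted / len(batch) >= threshold:
            break
    return annotated
```

Passing `propose` and `review` as callables keeps the loop independent of any particular model or user interface, which mirrors the separation between model/, annotation/, and api/ described above.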