Human Evaluation Tool

A web-based tool for conducting human evaluation of machine translation outputs. This tool allows evaluators to assess and compare translations from different systems, mark errors, and provide detailed feedback.

Features

  • User authentication and authorization
  • Support for multiple language pairs
  • Error marking and categorization
  • Severity level assessment
  • Side-by-side comparison of translations
  • Progress tracking
  • Results aggregation and export

Demo

Below is a quick video showing how the Human Evaluation Tool looks and works:

Demo.mov

Project Structure

The project consists of three main components:

  • backend/: Flask-based REST API server
  • frontend/: React-based web application
  • public/: Static assets and built files

Prerequisites

  • Python 3.10 or later
  • Node.js 18 or later
  • PostgreSQL 13 or later
  • Poetry (Python package manager)
  • npm (Node.js package manager)

Installation and Setup

Option 1: Using Docker (Recommended)

  1. Build the Docker image:
docker build -t yaraku/human-evaluation-tool .

Note: The Docker build installs a Poetry version that satisfies >=1.5,<1.7 by default. You can override the constraint with --build-arg POETRY_VERSION_CONSTRAINT="==1.6.1" if you need to pin an exact release.
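
For example, to pin the exact release mentioned above:

docker build --build-arg POETRY_VERSION_CONSTRAINT="==1.6.1" -t yaraku/human-evaluation-tool .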

  2. Run the container:
docker run --rm -it -p 8000:8000 yaraku/human-evaluation-tool
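
If the container should talk to a PostgreSQL instance rather than the SQLite fallback described under Usage, one option is to pass the same connection variables used by the manual .env setup below. This is a sketch and assumes the image reads these settings from the environment:

# host.docker.internal resolves to the host on Docker Desktop; on Linux you may
# need --add-host=host.docker.internal:host-gateway or the host's IP instead
docker run --rm -it -p 8000:8000 \
  -e DB_HOST=host.docker.internal -e DB_PORT=5432 \
  -e DB_NAME=he_tool -e DB_USER=postgres -e DB_PASSWORD=postgres \
  -e JWT_SECRET_KEY=change-me \
  yaraku/human-evaluation-tool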

Option 2: Manual Setup

  1. Install prerequisites:

    • Python 3.10 or later
    • Node.js 18 or later
    • PostgreSQL 13 or later
    • Poetry (Python package manager)
    • npm (Node.js package manager)
  2. Set up PostgreSQL:

# Start PostgreSQL service
sudo service postgresql start

# Create database and set password
sudo -u postgres createdb he_tool
sudo -u postgres psql -c "ALTER USER postgres WITH PASSWORD 'postgres';"
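
To confirm the database accepts connections with these credentials, an optional sanity check (assuming the defaults above) is:

# Should print connection details for the he_tool database (password: postgres)
psql -h localhost -p 5432 -U postgres -d he_tool -c "\conninfo"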
  3. Set up the backend:
cd backend

# Install dependencies (Poetry 1.5.x - 1.6.x)
poetry install

# Create and configure .env file
cat > .env << EOL
FLASK_APP=human_evaluation_tool:app
FLASK_ENV=development
DB_HOST=localhost
DB_PORT=5432
DB_NAME=he_tool
DB_USER=postgres
DB_PASSWORD=postgres
JWT_SECRET_KEY=development-secret-key
EOL

# Initialize and run migrations
poetry run flask db init
poetry run flask db migrate
poetry run flask db upgrade

# Start the backend server (development)
poetry run python main.py

# Or launch with Gunicorn (production-style)
poetry run gunicorn --bind 0.0.0.0:8000 human_evaluation_tool:app
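
Once the server is running, you can verify that the API responds. This assumes the development server listens on Flask's default port 5000 (the same port the frontend's VITE_API_URL points at below); the Gunicorn command binds to 8000 instead:

# Any HTTP response (even a 404 for the root path) confirms the server is up
curl -i http://localhost:5000/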
  4. Set up the frontend (in a new terminal):
cd frontend

# Install dependencies
npm install

# Create and configure .env file
echo "VITE_API_URL=http://localhost:5000" > .env

# Start the development server
npm run dev
  5. Access the application in your browser at http://localhost:5173 (see Usage below).

Usage

When you run the backend without PostgreSQL credentials, it falls back to a local SQLite database. That database is pre-populated with a demo user (yaraku@yaraku.com / yaraku) and a "Sample Evaluation" so you can explore the workflow immediately.

  1. Access the application at http://localhost:5173
  2. Log in with the demo credentials above or register a new account
  3. Open the "Sample Evaluation" to try the annotation UI, or create a new evaluation project
  4. Upload documents and system outputs when running your own studies
  5. Start evaluating translations

Development

Backend Development

The backend is built with Flask and uses:

  • SQLAlchemy for database ORM
  • Flask-JWT-Extended for authentication
  • Flask-Migrate for database migrations

Key commands:

cd backend
poetry run flask db migrate  # Create new migrations
poetry run flask db upgrade  # Apply migrations
poetry run python main.py  # Run development server
poetry run gunicorn --bind 0.0.0.0:8000 human_evaluation_tool:app  # Run with Gunicorn
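
As a concrete example, a typical Flask-Migrate cycle after editing a SQLAlchemy model looks like this; the -m message is only a label for the generated revision:

cd backend
poetry run flask db migrate -m "describe your schema change"  # generate a revision from model changes
poetry run flask db upgrade                                   # apply it to the database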

Frontend Development

The frontend is built with React and uses:

  • Vite for build tooling
  • TailwindCSS for styling
  • React Query for data fetching

Key commands:

cd frontend
npm run dev  # Start development server
npm run build  # Build for production
npm run preview  # Preview production build

Database Schema

The application uses a PostgreSQL database with the following main entities:

  • Users: Evaluators and administrators
  • Documents: Source texts for evaluation
  • Systems: MT systems being evaluated
  • Evaluations: Evaluation projects
  • Annotations: User annotations and feedback
  • Markings: Error markings and categorizations

For a detailed ER diagram, see backend/README.md.
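
Once the migrations have been applied, you can inspect the resulting tables directly. This assumes the PostgreSQL defaults from the setup section; the migrations and backend/README.md remain the authoritative description of the schema:

# List the tables created by the migrations
psql -h localhost -U postgres -d he_tool -c "\dt"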

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. For backend changes, install dependencies with poetry install --with dev and run:
    • poetry run black --check src tests
    • poetry run isort --check-only src tests
    • poetry run flake8 src tests
    • poetry run mypy src tests
    • poetry run pytest
  4. Commit your changes
  5. Push to the branch
  6. Create a Pull Request and ensure the Backend CI workflow passes when touching backend code.

License

This project is licensed under the GPL-3.0 License - see the LICENSE file for details.
