Forced Alignment
Forced Alignment is the process of adding timestamps to a text transcript, based on the audio that the transcript corresponds to.
Inputs
- Audio (mp3, wav, possibly other formats)
- Corresponding transcript in a text format with no time codes.
Output Formats
- Gentle Transcript (json) - Transcript with time codes in the JSON format delivered by Gentle.
- AMP Transcript Aligned (json) - Aligned transcript in the AMP JSON format.
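The relationship between the two output formats can be illustrated with a minimal sketch. It assumes that Gentle's JSON contains a `words` list whose entries carry `word`, `case`, `start`, and `end` fields; the target structure (`text`, `start`, `end`) is a simplified, hypothetical stand-in and not the actual AMP JSON schema. The function name `gentle_to_aligned_words` is also hypothetical.

```python
import json


def gentle_to_aligned_words(gentle_json: str) -> list[dict]:
    """Convert Gentle-style word entries into a simplified aligned word list.

    The output shape is a hypothetical simplification, not the AMP schema.
    """
    data = json.loads(gentle_json)
    aligned = []
    for entry in data.get("words", []):
        # Assumption: only entries whose "case" is "success" carry time codes.
        if entry.get("case") == "success":
            start, end = entry["start"], entry["end"]
        else:
            start, end = None, None
        aligned.append({"text": entry["word"], "start": start, "end": end})
    return aligned


# Hand-written sample in the assumed Gentle shape (not real tool output).
sample = json.dumps({
    "transcript": "hello big world",
    "words": [
        {"word": "hello", "case": "success", "start": 0.12, "end": 0.48},
        {"word": "big", "case": "not-found-in-audio"},
        {"word": "world", "case": "success", "start": 0.95, "end": 1.40},
    ],
})

if __name__ == "__main__":
    for word in gentle_to_aligned_words(sample):
        print(word)
```

Words that the aligner could not place keep `None` time codes in this sketch, so a downstream step can decide how to handle them.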
MGMs in AMP
The Gentle Forced Alignment MGM takes an AMP transcript and the audio file of the related item as input, and generates an AMP transcript with updated time codes as output.
Parameters:
- Audio (mp3, wav, possibly other formats) and transcript (plain text).
- In AMP, this tool was created to correct the time codes of a transcript that has gone through the Human MGM for correction. The BBC transcript editor used in that correction process produces corrected speech with wrong time codes.
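Because the aligner works from plain text, a time-coded transcript has to be reduced to its text before alignment. The following is a minimal sketch of that step, not the MGM's actual implementation. The word structure (`type`, `text`, `start`, `end`) and the function name `transcript_to_plain_text` are assumptions made for illustration.

```python
def transcript_to_plain_text(words: list[dict]) -> str:
    """Join word texts into plain text, discarding existing time codes.

    Assumes a hypothetical word list where punctuation entries have
    type "punctuation" and should attach to the preceding word.
    """
    parts: list[str] = []
    for word in words:
        text = word.get("text", "").strip()
        if not text:
            continue  # skip empty entries
        if word.get("type") == "punctuation" and parts:
            parts[-1] += text  # attach punctuation without a leading space
        else:
            parts.append(text)
    return " ".join(parts)


# Hand-written sample in the assumed structure.
corrected_words = [
    {"type": "pronunciation", "text": "Hello", "start": 0.1, "end": 0.4},
    {"type": "punctuation", "text": ","},
    {"type": "pronunciation", "text": "world", "start": 0.5, "end": 0.9},
    {"type": "punctuation", "text": "."},
]

if __name__ == "__main__":
    print(transcript_to_plain_text(corrected_words))
```

The resulting plain text, together with the audio file, would then be passed to the aligner.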
Realigning a transcript with misaligned time codes
An item's transcript was corrected by a human using the BBC transcript editor. During this correction process, the person editing had to add several chunks of speech, which the BBC editor did not align with time codes. The collection manager (CM) wants the resulting transcript to go through Forced Alignment to correct the problems.
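To decide whether a corrected transcript needs realignment, one could first locate the chunks of speech that lack time codes. The sketch below is only an illustration under stated assumptions: it represents each word as a dict with `text`, `start`, and `end`, and models a missing time code as `None`. How the BBC transcript editor actually encodes unaligned words is not specified here, and `find_unaligned_spans` is a hypothetical helper.

```python
def find_unaligned_spans(words: list[dict]) -> list[tuple[int, int]]:
    """Return (first_index, last_index) pairs for runs of words without time codes."""
    spans: list[tuple[int, int]] = []
    run_start = None
    for i, word in enumerate(words):
        missing = word.get("start") is None or word.get("end") is None
        if missing and run_start is None:
            run_start = i  # a new unaligned run begins here
        elif not missing and run_start is not None:
            spans.append((run_start, i - 1))  # the run ended at the previous word
            run_start = None
    if run_start is not None:
        spans.append((run_start, len(words) - 1))  # run extends to the last word
    return spans


# Hand-written sample: "was", "really", and "today" were added during correction.
edited_words = [
    {"text": "it", "start": 0.0, "end": 0.5},
    {"text": "was", "start": None, "end": None},
    {"text": "really", "start": None, "end": None},
    {"text": "cold", "start": 1.5, "end": 2.0},
    {"text": "today", "start": None, "end": None},
]

if __name__ == "__main__":
    print(find_unaligned_spans(edited_words))
```

A non-empty result suggests the transcript is a candidate for the Forced Alignment workflow.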

Schema
Sample Output
Attachments: Forced alignment workflow.png (image/png)
Document generated by Confluence on Feb 25, 2025 10:39