Overview
The Timecode Indexing Module (TIM) is a browser-based tool for creating, editing and formatting metadata for multimedia documents, specifically text-and-timecode-based “indexes” for oral histories and other long-form audio or video (A/V). TIM features a flexible workspace for editing text and timecode assets in the production of indexes and synchronized transcripts. Users can apply markdown language to populate timecode-level metadata for the OHMS.xml or WebVTT formats. A conceptual diagram of TIM’s general capabilities is given below.

Conceptual diagram of the Timecode Indexing Module (TIM).
TIM includes many features ideal for indexing oral history interviews and works with the indexing architecture for OHMS (the Oral History Metadata Synchronizer). Users parse timecodes and text in the TIM Editor into key OHMS metadata fields (i.e., timecodes, titles, synopses, keywords, and notes/partial transcript). Using OHMS’ XML file format (based on the OHMS.xsd metadata schema), data can be moved in and out of one or more oral history timecode index environments, most commonly the OHMS Viewer.
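To make the target structure concrete, the sketch below suggests roughly what a single index point looks like inside an OHMS.xml file. The element names follow the OHMS.xsd schema as best understood here, the values are invented, and record-level metadata is omitted, so treat it as an illustration rather than a complete, valid export.

```xml
<index>
  <point>
    <!-- Timecode for the segment, in seconds from the start of the recording -->
    <time>935</time>
    <!-- Short segment title shown in the index -->
    <title>Moving to Buffalo and first factory job</title>
    <!-- Free-form summary of the segment -->
    <synopsis>The narrator describes arriving in Buffalo in 1952 and finding work at the plant.</synopsis>
    <!-- Keywords, typically semicolon-separated -->
    <keywords>migration; factory work; 1950s</keywords>
    <!-- Optional notes or partial transcript for the segment -->
    <partial_transcript>"We got off the train and my uncle was waiting..."</partial_transcript>
  </point>
</index>
```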
An OHMS.xml index can also be displayed in Aviary, an A/V content management system (CMS) that uses the same data structure as OHMS. See example indexes here in OHMS (select the “Play Interview” tab), and in Aviary. An index developed in TIM can also be viewed in a spreadsheet environment, via a .csv file.
TIM can also be used to modify, refine, and export automatic speech recognition (ASR) transcripts, most formally through the WebVTT format. Since TIM is essentially a text editor with in-text timecode functionality, data can also be moved in and out via simple copy/paste. TIM therefore supports unpublished, backend-oriented, and free-form workflows (e.g., A/V note-taking), and it can be used to generate and proof timecodes in documents whether or not those documents are destined for electronically linked environments.
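For a rough sense of the WebVTT side of this workflow, a minimal WebVTT file simply pairs timecoded cues with text; the timestamps and wording below are invented for illustration.

```
WEBVTT

00:00:05.000 --> 00:01:40.000
Introductions; the narrator describes growing up on the family farm.

00:01:40.000 --> 00:04:12.000
Leaving home for the city; first impressions of factory work.
```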
The diagram below presents some different use case concepts for TIM and the major TIM functions pertaining to each. The “OHMS.xml indexers” use case is the primary scenario for which TIM was built, and the “ASR transcript editors” use case has been pursued by some users. The remaining two use cases are experimental scenarios for which informal examples exist. Deeper descriptions of these uses and additional use cases are discussed here.
The TIM project is spearheaded by Douglas Lambert, currently a Research Scientist at the University at Buffalo, who previously worked with oral historian Michael Frisch developing indexing practices through a consulting firm called The Randforce Associates. Zack Ellis of TheirStory partnered on the original development of TIM along with the Centre for Contemporary and Digital History (see Versions).
The theory and practice that led to the creation of TIM evolved primarily from the field of oral history. Oral history depends on recorded media, and while source recordings have been considered valuable and essential since the 1960s, word-for-word transcripts were considered the only way to work with recorded content throughout the 20th century (see, e.g., Larson 2019). The ability to consolidate and connect media on computers by the 2000s led to the digitization of oral histories, then the synchronization of text and media through timecodes, both between transcriptions and their associated media, and also via more free-form indexes. Multiple index techniques, processes, platforms, and display interfaces have emerged since the late 1990s, with early pioneers including The Shoah Foundation and Densho. Systems like OHMS popularized the concept and the process, and also made indexing tools available to oral historians and others. Indexing offered a faster, simpler way to map out content by linking summary text directly to media timecodes in digital environments.
An index is an asset for its creators or public users in many ways:
- An index provides summarizing text, more readable than a transcript
- Indexing at the timecode level can offer a visual, browseable overview of long media files
- Indexing allows for improved access to specific themes across interviews/collections
- An index favors the use of practical, meaningful labeling in natural language, rather than relying only on language that is strictly literal, as is found in transcripts
- Indexing can make more interviews publicly accessible with fewer resources compared to traditional transcript approaches
- Indexing is a high-impact process for interview analysis in pedagogical applications
The Oral History Association is the primary professional organization for oral history theory and practice. Indexing was documented and discussed in a collection of essays called Oral History in the Digital Age in 2012. The website is a resource for knowledge on oral history practice examined through the lens of rapidly evolving technology. For more information, use the search term “indexing” to find essays by leaders in the field including Michael Frisch, Doug Boyd, Doug Lambert, Janneken Smucker, and Brooke Bryan.
Indexing is a metaphor that invokes back-of-the-book indexing, where segments and excerpts of content can be highlighted, and where meaningful access to that content is facilitated. Depending on the context, including the size of the collection, an A/V “index” (oral history or not) may look more like a table of contents, and may or may not cross-reference interviews in many different ways. The most elaborate indexes for collections may make use of controlled vocabularies (e.g., The National WWII Museum). Building a vocabulary for a collection involves establishing a meta layer of organized thematic meaning that can be applied across interview segments, yielding a powerful, theme-based retrieval mechanism. A collection of recordings that includes taxonomies, thesauri, or other custom-formatted content maps allows users to approach the collection through a thoughtfully composed conceptual framework, as opposed to browsing a list of interviewees unknown to them or running shot-in-the-dark searches of transcriptions.
Historically, indexing in oral history was an alternative to word-for-word transcription. Synchronized transcripts (or “timecoded transcripts”) evolved in parallel, and the resulting interfaces for index and transcript data have always been distinct. Improved ASR has added another element and has begun to blur the distinctions. Both approaches are about making long-form A/V content more accessible to users, and new approaches that hybridize the two forms are emerging.
Douglas Lambert provides a review of the 20+ year ascent of Oral History Indexing as of 2023. Free ePrints are available here.
Working with one A/V media file at a time, an indexer in TIM establishes periodic markers corresponding to themes or highlights in a recording, systematically creating better access to long-form recordings. Some key features include:
- Workspace is based on a free-form text-editing environment
- Videos/media can be loaded from a local computer or accessed via URL
- Indexing timecodes are in-text, actively linked to the media, and hand-editable
- Timecodes and their associated title, synopsis and keyword fields can be defined in TIM and later viewed as an index in the OHMS or Aviary environments.
- Using markdown, described below, users define the index fields in TIM and transfer them to OHMS/Aviary via an XML file (or via CSV or VTT).
- Users can add notes or transcripts, automated or professional, and when available, the system will recognize any timecodes in those texts.
- Users can be non-technical or professional archivists, and the resulting indexes can be intended for private use or public display.
- In-progress work can be saved and retrieved in various formats (i.e., text, CSV, VTT), or as a project-level JSON file.
No login or account is needed to use TIM. It is strictly browser-based, meaning it also has no central data server to save projects. Projects must be saved manually using the “Project JSON” option in the “Export” menu or by using the “Save project as JSON” icon. An active TIM project saved in JSON format retains the location of the active media file, any transcript loaded in the transcript area, and the contents of the notes field, including its markdown code.
Index data can be sent to OHMS or Aviary as OHMS-formatted .xml files. Data can also be formulated as closed captioning or subtitle formats (.vtt files) or in generic formats (.txt and .csv) for a variety of uses.
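As a hypothetical illustration of the generic formats, a CSV rendering of index data might look like the sketch below; the actual column names and ordering depend on TIM’s export, so treat this only as a conceptual layout with invented values.

```
timecode,title,synopsis,keywords
00:00:05,Introductions,"Narrator describes growing up on the family farm","farming; childhood"
00:15:35,Leaving home,"Moving to the city and a first factory job","migration; factory work"
```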
Zack Ellis of TheirStory demonstrates TIM in a video he made, hosted by TheirStory on the Aviary platform. The video includes a timecode index (developed in TIM) for those who wish to browse the video non-linearly: An introduction to TIM.
- Workspace Components
- The TIM Editor Workspace For Affiliating Text and Timecodes
- Keyboard Shortcuts for Media and Timecodes
- Timecodes
- The Timeline
- Transcript Resources
- The OHMS Metadata Model
- Markdown for OHMS.xml Fields
- The Preview Area
- Exporting to OHMS or Aviary via OHMS.xml, CSV, or VTT
- Uploading OHMS.xml or CSV into "OHMS in Aviary"
- WebVTT as alternate to OHMS.xml
- WebVTT syntax
- Why WebVTT?
- How to make a WebVTT in TIM
- Uploading OHMS.xml or WebVTT into Aviary
- Coming soon
