Preliminary plans for the Vin Scelsa Archive Project, developed for (hopeful) future work in collaboration with Fordham University Libraries and WFUV.
- The goal of the overall project is to reformat Vin's collection of audio interview recordings and make them as accessible as copyright and storage will allow.
- The goal of this GitHub repository is to (eventually) capture metadata and transcriptions for the interviews as well as provide status and workflow documentation.
.
├── interviews/ # folder per interview
│ ├── yyyy-mm-dd-guest-name/ # template folder, showing naming format
│ │ ├── _metadata.csv # metadata for individual interview and digitized audio file
│ │ ├── _transcription_raw.md # automatic transcription text of individual interview, generated by WhisperAI
│ │ ├── _transcription_edited.md # edited transcription text of individual interview, reviewed by a human
├── metadata/
│ ├── vs-metadata-template.csv # metadata template for the project
│ ├── compiled/ # directory for compiled metadata for the project
│ │ ├── vs-metadata-master.csv # compiled master metadata file
├── workflow/ # workflow documentation
│ ├── vs-project-board.tsv # exported data from Project Board status tracker
├── .github/workflows/
│ ├── create-interview-folder.yml # Action for _metadata.csv file upload to trigger new folder creation in interviews/ directory
├── LICENSE # Opted for MIT
└── README.md # what you're currently reading 🙂
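The tree above lists a compiled master file, but this README does not say how it is produced. One possible approach is sketched below: it gathers every per-interview `_metadata.csv` file and writes the rows to `metadata/compiled/vs-metadata-master.csv`. The function name `compile_metadata`, the decision to skip the template folder, and the column handling (a union of whatever headers are found) are assumptions for illustration, not the project's actual process.

```python
import csv
from pathlib import Path

# Assumption: the template folder should not contribute rows to the master file.
TEMPLATE_FOLDER = "yyyy-mm-dd-guest-name"


def compile_metadata(repo_root: Path) -> Path:
    """Merge all interviews/*/*_metadata.csv files into one master CSV."""
    rows = []
    fieldnames = []
    pattern = "*/*_metadata.csv"
    for csv_path in sorted((repo_root / "interviews").glob(pattern)):
        if csv_path.parent.name == TEMPLATE_FOLDER:
            continue
        with csv_path.open(newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            # Keep columns in first-seen order across all files.
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            rows.extend(reader)

    out_path = repo_root / "metadata" / "compiled" / "vs-metadata-master.csv"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="", encoding="utf-8") as handle:
        # Columns missing from a given file are written as empty strings.
        writer = csv.DictWriter(
            handle, fieldnames=fieldnames, restval="", extrasaction="ignore"
        )
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

Run from a local clone, `compile_metadata(Path("."))` would rewrite the master file in place. Because files are read in sorted folder order, rows come out in date order as long as folders follow the naming format.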
- From the /metadata directory, download the template file vs-metadata-template.csv.
- Open the template, add any known metadata for the interview, and save the updated file using the interview date and guest name in the following naming format: yyyy-mm-dd-guest-name_metadata.csv.
- Upload the _metadata.csv file to vs-archive-project/interviews. This will trigger the Action create-interview-folder.yml, which creates a new folder for the corresponding interview containing the _metadata.csv file.
- Check vs-archive-project/interviews to confirm that the new folder was created, and use this folder for any future file uploads related to this interview.
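As a small aid for the naming step above, the following sketch builds a file name in the yyyy-mm-dd-guest-name_metadata.csv format and derives the matching folder name. The helper names (`slugify_guest`, `metadata_filename`, `folder_name`) are hypothetical, and lowercasing the guest name and replacing punctuation with hyphens is an assumption based on the template; it is not taken from create-interview-folder.yml.

```python
import re
from datetime import date

SUFFIX = "_metadata.csv"


def slugify_guest(name: str) -> str:
    """Lowercase the name and join its ASCII letter/digit runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))


def metadata_filename(interview_date: date, guest: str) -> str:
    """Return a name like yyyy-mm-dd-guest-name_metadata.csv."""
    slug = slugify_guest(guest)
    if not slug:
        raise ValueError("guest name must contain at least one letter or digit")
    return f"{interview_date.isoformat()}-{slug}{SUFFIX}"


def folder_name(filename: str) -> str:
    """Return the interview folder name implied by a metadata file name."""
    if not filename.endswith(SUFFIX):
        raise ValueError(f"expected a name ending in {SUFFIX}")
    return filename[: -len(SUFFIX)]


# The date below is made up for illustration only.
example = metadata_filename(date(1999, 5, 1), "Tom Waits")
print(example)               # 1999-05-01-tom-waits_metadata.csv
print(folder_name(example))  # 1999-05-01-tom-waits
```

Note that this sketch drops any character outside a-z and 0-9, so accented letters in a guest name would be removed rather than transliterated.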
Currently, a very basic metadata schema is in place for capturing essential identifying details of these interview recordings. Before the project progresses, further evaluation should take place to close the following Issues:
- Expand the metadata schema and align it with PBCore. To refine the fields used, refer to the PBCore 2.1 GitHub repository, and remember to update any previous versions of metadata files and templates.
- Establish identification (ID) number schema and plan. The physical items need ID numbers and barcodes, and each corresponding digital item should be assigned an instantiation number under that main ID number.
vs-archive-project-board is used as this project's central task list. Contributors to this project should add Issues and then use the Project Board to assign roles and select from the following status levels:
- To Do
- In Progress
- Review
- Done
@Contributors: Please continue to update assignments and status levels as Issues are added and tasks progress.
- Tom Waits interview information source: Tom Waits Library.info
- Lou Reed interview information source: WFUV.org
- ChatGPT was used to help troubleshoot errors encountered while developing the Action script for create-interview-folder.yml
- Primary GitHub information (and help) source: Chris Diaz