AritraOfficial/MediaFlow

A Python project that transcribes audio and video files into text using OpenAI Whisper and generates voice/video outputs.
Audio & Video Transcription and Generation Project

📌 Project Description

This project enables users to:

  • Convert audio and video files into transcribed text using OpenAI Whisper.
  • Generate synthetic speech using gTTS (Google Text-to-Speech) with male and female voices.
  • Create videos with generated audio overlaid on an image background.

🚀 Features

  • Automatic Speech Recognition (ASR): Converts spoken content from audio/video into text.
  • Text-to-Speech (TTS): Generates male and female voices from text.
  • Video Generation: Combines generated speech with an image to create a video. (work in progress)

📂 Folder Structure

project-folder/
│── data_file/                # Folder for input audio/video files
│── transcript_file/          # Folder for storing transcriptions and generated files
│── project_file.py           # Main script for processing audio/video files
│── audio.py                  # Script for generating audio from text
│── requirements.txt          # Dependencies list
│── README.md                 # Project documentation

📊 Data Sources

This project utilizes diverse audio and video data from multiple sources:

  • Kaggle Dataset: Pre-existing audio datasets from Kaggle.
  • Self-Recorded Videos: Custom video recordings for personalized content.
  • YouTube Videos: Extracted audio and video for transcription and analysis.
  • AI-Generated Videos: Videos generated using artificial intelligence tools.
  • Text-to-Speech (TTS) Audio: Synthetic audio generated via TTS for experimentation.

🛠️ Installation Guide

1️⃣ Clone the Repository

git clone https://github.com/AritraOfficial/Media-Transcriber.git
cd Media-Transcriber

2️⃣ Set Up a Virtual Environment (Optional but Recommended)

python -m venv venv
source venv/bin/activate  # On macOS/Linux
venv\Scripts\activate    # On Windows

3️⃣ Install Dependencies

pip install -r requirements.txt

4️⃣ Install FFmpeg

FFmpeg is required for processing audio/video. Install it from:

  • Windows: Download a build from https://ffmpeg.org/download.html and extract it.
  • Linux/macOS:
    sudo apt install ffmpeg  # Ubuntu/Debian
    brew install ffmpeg      # macOS (Homebrew)

Ensure FFmpeg is added to your system PATH:

ffmpeg -version  # Check if FFmpeg is correctly installed
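Since the project's scripts run FFmpeg from Python, it can also be convenient to verify availability from Python itself. The helper below is a small stdlib-only sketch (not part of the repo's code) using the same `-version` check as above:

```python
import shutil
import subprocess

def ffmpeg_available():
    """Return True if an ffmpeg executable is reachable on PATH and runs."""
    exe = shutil.which("ffmpeg")   # resolves the binary the way the shell would
    if exe is None:
        return False
    # Double-check the binary actually executes
    return subprocess.run([exe, "-version"], capture_output=True).returncode == 0

# Example: print(ffmpeg_available())
```

If this prints `False` while `ffmpeg -version` works in your shell, the Python process was likely launched with a different PATH (see the VS Code note in Troubleshooting below).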

🏃‍♂️ Usage

1️⃣ Generate Speech from Text (Male & Female Voice)

Modify the script to choose between male or female voice.

python audio.py
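audio.py itself is not reproduced here; the following is a minimal sketch of what a gTTS-based version might look like. Note that gTTS exposes only one voice per language, so the "male"/"female" distinction is approximated here by accent (`tld`), which is an assumption about this repo's approach, not a gTTS feature for gender. The helper names and output paths are illustrative:

```python
from pathlib import Path

def build_tts(text, voice="female"):
    """Return a configured gTTS object for the given text.
    The tld-based male/female mapping below is an assumption."""
    from gtts import gTTS  # imported lazily; only needed when synthesizing
    tld = "co.uk" if voice == "male" else "com"
    return gTTS(text=text, lang="en", tld=tld)

def output_path(voice, folder="transcript_file"):
    """Build an output mp3 path, e.g. transcript_file/female_voice.mp3."""
    return (Path(folder) / f"{voice}_voice.mp3").as_posix()

# Example:
# build_tts("Hello from MediaFlow", voice="female").save(output_path("female"))
```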

2️⃣ Generate Video with Background Image and Voice

python video.py

This will:

  • Overlay generated speech onto background.jpg.
  • Create a video file with the generated audio.
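video.py is not listed in the folder structure above, so the sketch below only illustrates the idea using MoviePy (acknowledged later in this README), targeting the MoviePy 1.x API. File names and the fps choice are assumptions:

```python
def make_video(image_path="background.jpg",
               audio_path="transcript_file/female_voice.mp3",
               out_path="transcript_file/output.mp4"):
    """Overlay an audio track on a still image (MoviePy 1.x API assumed)."""
    from moviepy.editor import ImageClip, AudioFileClip  # lazy import
    audio = AudioFileClip(audio_path)
    # Show the image for exactly as long as the audio lasts
    clip = ImageClip(image_path).set_duration(audio.duration).set_audio(audio)
    clip.write_videofile(out_path, fps=24)

# Example: make_video()
```

A static image needs an explicit `set_duration`, since an `ImageClip` has no natural length of its own.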

3️⃣ Run the Transcription Script

python project_file.py

This will:

  • Scan data_file/ for audio/video files.
  • Use OpenAI Whisper to transcribe them.
  • Save transcriptions in the transcript_file/ folder.
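The three steps above can be sketched as follows. The extension list, helper names, and model size are assumptions for illustration, not the repo's actual code:

```python
from pathlib import Path

# Assumed set of media extensions to scan for
MEDIA_EXTS = {".mp3", ".wav", ".m4a", ".mp4", ".mkv", ".avi"}

def find_media(folder="data_file"):
    """Collect audio/video files in the input folder, sorted by name."""
    return sorted(p for p in Path(folder).glob("*")
                  if p.suffix.lower() in MEDIA_EXTS)

def transcript_path(media_path, out_dir="transcript_file"):
    """Map an input file to its transcript, e.g. data_file/talk.mp4 -> transcript_file/talk.txt."""
    return Path(out_dir) / (Path(media_path).stem + ".txt")

def transcribe_all(model_name="base"):
    import whisper  # lazy import: only needed when actually transcribing
    model = whisper.load_model(model_name)
    for media in find_media():
        result = model.transcribe(str(media))
        out = transcript_path(media)
        out.parent.mkdir(exist_ok=True)
        out.write_text(result["text"], encoding="utf-8")
```

Smaller Whisper models ("tiny", "base") trade accuracy for speed; larger ones ("medium", "large") need more VRAM.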

🔧 Troubleshooting

1️⃣ FFmpeg Not Found

🔍 Issue: FFmpeg is not recognized as a command

Solution: Add FFmpeg to the system PATH (Windows)
  1. Press Win + R, type sysdm.cpl, and hit Enter.
  2. Go to the Advanced tab → Click Environment Variables.
  3. In System Variables, scroll down to Path → Click Edit.
  4. Click New, then add the bin folder of your FFmpeg installation, e.g.:
     C:\ffmpeg_location\bin
  5. Click OK and close all windows.
  6. Restart your terminal so the updated PATH takes effect.

🔍 Issue: FFmpeg is installed and working in Command Prompt (cmd), but not in the VS Code terminal.

Solution: Add FFmpeg Path to VS Code Environment
  • Manually Add FFmpeg Path in VS Code:
  1. Open VS Code.
  2. Open Settings with Ctrl + , (Ctrl + Shift + P opens the Command Palette, where you can also run "Preferences: Open Settings").
  3. Search for terminal.integrated.env.windows in the search bar.
  4. Click on Edit in settings.json to modify the configuration.
  5. Add the following lines inside the JSON file:
{
    "terminal.integrated.env.windows": {
        "Path": "C:\\ffmpeg\\bin;${env:Path}"
    }
}

2️⃣ Missing Dependencies

Run:

pip install -r requirements.txt

📜 License

This project is open-source and available under the MIT License.


🙌 Acknowledgments

  • OpenAI Whisper for speech recognition.
  • gTTS for text-to-speech conversion.
  • MoviePy for video creation.

📞 Contact

For issues or contributions, open a GitHub issue or reach out by email.
