A web-based application that enables real-time, multilingual translation between patients and healthcare providers. This application converts spoken input into text, provides a live transcript, and offers a translated version with audio playback.
- Voice-to-Text: Uses browser-based Web Speech API for transcription, with optional Google Speech-to-Text API enhancement
- Medical Terminology Enhancement: Uses Google Gemini to improve medical term accuracy
- Real-Time Translation: Translates text between languages using Google Cloud Translation API (with Gemini as fallback)
- Audio Playback: "Speak" button for audio playback of translated text
- Mobile-First Design: Responsive and optimized for both mobile and desktop use
- Dual Transcript Display: Shows both original and translated transcripts in real-time
- Language Selection: Allows users to choose input and output languages
- Backend: FastAPI
- Frontend: React with Next.js
- Speech Recognition: Browser-based Web Speech API (with optional Google Speech-to-Text API)
- Translation: Google Cloud Translation API (with Gemini as fallback)
- AI Enhancement: Google Gemini API
- Deployment: Google Cloud, Docker (with options for Google Cloud Run and other platforms)
The application is designed to work with minimal API requirements:
- Required: Google Gemini API (for medical term enhancement and fallback translation)
- Optional: Google Cloud Translation API (for better translation quality)
- Optional: Google Cloud Speech-to-Text API (for better transcription quality)
If the optional APIs are not configured, the application will fall back to browser-based speech recognition and Gemini-based translation.
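As a sketch, the startup check implied by these requirements might look like the following (the env var names come from the setup instructions below; the function name and return shape are illustrative, not the app's actual code):

```python
import os

def detect_backends() -> dict:
    """Sketch of the fallback logic described above: Gemini is required;
    Google Cloud credentials unlock the higher-quality services."""
    if not os.environ.get("GEMINI_API_KEY"):
        raise RuntimeError("GEMINI_API_KEY is required")
    has_google = bool(os.environ.get("GOOGLE_APPLICATION_CREDENTIALS"))
    return {
        "translation": "google-cloud-translation" if has_google else "gemini",
        "speech": "google-speech-to-text" if has_google else "web-speech-api",
    }
```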
- Create a Google Cloud Project:
  - Go to the Google Cloud Console
  - Create a new project or select an existing one
  - Note your Project ID
- Enable the Translation API:
  - In the Google Cloud Console, go to "APIs & Services" > "Library"
  - Search for "Cloud Translation API"
  - Click on it and press "Enable"
- Create Service Account Credentials:
  - Go to "APIs & Services" > "Credentials"
  - Click "Create Credentials" > "Service Account"
  - Fill in the service account details and grant it the "Cloud Translation API User" role
  - Create a key for this service account (JSON format)
  - Download the JSON key file and rename it to `api.json`
- Place the `api.json` file: The application will look for your credentials file in these locations (in order):
  - The path specified in the `GOOGLE_APPLICATION_CREDENTIALS` environment variable
  - Current directory (`api.json`)
  - Backend directory (`backend/api.json`)
  - Backend app directory (`backend/app/api.json`)
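That lookup order can be sketched in a few lines (a simplified stand-in for whatever the backend actually does, not its real code):

```python
import os
from typing import Optional

def find_credentials() -> Optional[str]:
    """Check the candidate key-file locations in the order listed above
    and return the first one that exists."""
    candidates = [
        os.environ.get("GOOGLE_APPLICATION_CREDENTIALS"),
        "api.json",
        "backend/api.json",
        "backend/app/api.json",
    ]
    for path in candidates:
        if path and os.path.isfile(path):
            return path
    return None
```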
- Using the setup script (recommended): We've provided a setup script that will help you place the credentials file in the correct location:

  ```bash
  ./setup_google_credentials.sh
  ```

  This script will:
  - Ask for the location of your `api.json` file if not found in the current directory
  - Copy it to the backend directory
  - Set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable in your shell configuration
- Verify Your Credentials: Run the test script to verify your credentials are working:

  ```bash
  cd backend
  python -m app.test_credentials
  ```
- Select Translation Provider in the UI: In the application, use the "Translation Quality" dropdown to select:
  - "Google Translate API" for the best translation quality
  - "Gemini API" for enhanced context in medical translations
  - "Auto" to use Google if available, otherwise Gemini
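The "Auto" rule amounts to a one-line preference; a hypothetical mapping from the dropdown values to a backend (the real UI's internal values may differ):

```python
def resolve_provider(choice: str, google_available: bool) -> str:
    """Map the "Translation Quality" dropdown to a translation backend
    (illustrative value names, not the app's actual identifiers)."""
    if choice == "google":
        return "google"
    if choice == "gemini":
        return "gemini"
    # "auto": prefer Google Translate when configured, else fall back to Gemini
    return "google" if google_available else "gemini"
```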
- Enable the Speech-to-Text API:
  - In the Google Cloud Console, go to "APIs & Services" > "Library"
  - Search for "Speech-to-Text API"
  - Click on it and press "Enable"
- Use the same service account credentials as for the Translation API
- Python 3.9+
- Node.js 16+
- Google Gemini API key (required)
- Google Cloud account with Translation API enabled (optional but recommended)
- Google Cloud account with Speech-to-Text API enabled (optional)
- Navigate to the backend directory:

  ```bash
  cd backend
  ```
- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Create a `.env` file with your API keys:

  ```bash
  # Required
  GEMINI_API_KEY=your_gemini_api_key

  # Optional (for Google Cloud APIs)
  GOOGLE_APPLICATION_CREDENTIALS=path/to/your/credentials.json
  ```
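The `.env` file is conventionally loaded with a library like python-dotenv; a minimal stdlib stand-in shows what that amounts to (a sketch, not the app's actual loading code):

```python
import os

def load_env(path: str = ".env") -> None:
    """Tiny .env loader: read KEY=VALUE lines, skip comments and blanks,
    and export the values without overriding variables already set."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```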
- Run the server:

  ```bash
  uvicorn app.main:app --reload --port 6000
  ```
- Navigate to the frontend directory:

  ```bash
  cd frontend
  ```
- Install dependencies:

  ```bash
  npm install
  ```
- Create a `.env.local` file with your API endpoint:

  ```bash
  NEXT_PUBLIC_API_URL=http://localhost:8000
  ```
- Run the development server:

  ```bash
  npm run dev
  ```
- Open http://localhost:3000 in your browser
The application is designed to work even without Google Cloud APIs:
- Speech Recognition: Uses the browser's built-in Web Speech API
- Translation: Uses Google Gemini as a fallback translator
- Language Support: Provides a default set of common languages
While the application works without Google Cloud APIs, using them will provide:
- Better transcription accuracy, especially for medical terms
- More reliable translation
- Support for more languages
The application is configured for deployment on Vercel and Google Cloud. You can deploy both the frontend and backend to Google Cloud Run using the provided scripts.
This script will guide you through deploying:
- Backend only
- Frontend only
- Both backend and frontend
If you prefer to deploy manually:
- Navigate to the backend directory:

  ```bash
  cd backend
  ```
- Deploy to Vercel:

  ```bash
  vercel --prod
  ```
- Set up environment variables in the Vercel project settings:
  - `GEMINI_API_KEY`: Your Google Gemini API key
  - `GOOGLE_CREDENTIALS_JSON`: The JSON string generated by the `prepare_vercel_credentials.py` script
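Serverless hosts can't ship the key file itself, which is why the JSON is passed through an environment variable. A common pattern is to write it back out to a temp file at startup so the Google client libraries can find it; a hedged sketch (the function name is illustrative, and the app's or script's actual handling may differ):

```python
import json
import os
import tempfile
from typing import Optional

def materialize_credentials() -> Optional[str]:
    """Write the JSON from GOOGLE_CREDENTIALS_JSON to a temp file and
    point GOOGLE_APPLICATION_CREDENTIALS at it."""
    raw = os.environ.get("GOOGLE_CREDENTIALS_JSON")
    if not raw:
        return None
    info = json.loads(raw)  # fail early on malformed JSON
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as fh:
        json.dump(info, fh)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = path
    return path
```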
For more reliable and consistent deployment, we've added Docker support to the application. This allows you to containerize both the frontend and backend, making deployment more predictable across different environments.
- Make sure you have Docker and Docker Compose installed on your machine.
- For local development with Docker:

  ```bash
  # Start both frontend and backend
  docker-compose up
  ```
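For reference, a compose file along these lines would start both services; this is only a sketch, and the service names, ports, and env-file paths are assumptions rather than the repo's actual `docker-compose.yml`:

```yaml
# Illustrative docker-compose.yml (assumed layout: ./backend and ./frontend)
services:
  backend:
    build: ./backend
    ports:
      - "6000:6000"       # matches the uvicorn port used above
    env_file:
      - ./backend/.env    # GEMINI_API_KEY, optional Google credentials
  frontend:
    build: ./frontend
    ports:
      - "3000:3000"
```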
MIT