Real-time multilingual translation platform powered by AI and Computer Vision
Features • Architecture • Quick Start • Tech Stack • Team
Transearly is an AI-powered translation platform that combines computer vision with natural language processing to provide real-time translation. It supports four translation modes: text, voice, image, and document.
- 📸 Image Translation with OCR - Detect and translate text in images using Google Cloud Vision API
- 🎤 Voice Translation - Real-time speech-to-text translation
- 📄 Document Translation - Support for PDF, DOCX, XLSX, PPTX, CSV formats
- 💬 Text Translation - Fast and accurate text translation across 8+ languages
- 📱 Mobile-First Design - Beautiful React Native mobile application
- ⚡ Real-time Processing - WebSocket integration for live translation updates
- 🎨 Interactive UI - Bounding box detection with tap-to-translate popups
- Smart OCR Detection - Uses Google Cloud Vision API for accurate text detection
- Interactive Bounding Boxes - Tap on detected text regions to view translations
- Multi-language Support - Automatically detects and translates text in images
- Paragraph Grouping - Intelligently groups text into meaningful segments
- Real-time Recording - Record and translate voice input
- Audio Playback - Listen to original recordings
- Multiple Languages - Support for 8+ languages
- File Format Support - PDF, DOCX, XLSX, PPTX, CSV
- Batch Processing - Queue-based translation for large documents
- Progress Tracking - Real-time translation progress via WebSocket
- Format Preservation - Maintains original document formatting
- Instant Translation - Fast text-to-text translation
- Context-Aware - Preserves semantic meaning
- Copy to Clipboard - Easy result sharing
┌─────────────────────────────────────────────────────────────┐
│                     TRANSEARLY PLATFORM                     │
└─────────────────────────────────────────────────────────────┘
┌──────────────────────┐           ┌──────────────────────────┐
│      Mobile App      │           │       Backend API        │
│    (React Native)    │◄─────────►│         (NestJS)         │
│                      │   REST    │                          │
│ • Camera Screen      │ WebSocket │ • Translation Service    │
│ • Voice Recording    │           │ • Queue Processing       │
│ • Text Input         │           │ • Google Vision OCR      │
│ • File Upload        │           │ • AI Translation         │
└──────────────────────┘           └────────────┬─────────────┘
                                                │
                          ┌─────────────────────┼─────────────────┐
                          │                     │                 │
                    ┌─────▼──────┐      ┌───────▼──────┐   ┌──────▼─────┐
                    │   Google   │      │  OpenRouter  │   │   Redis    │
                    │   Cloud    │      │   (Gemini)   │   │   Queue    │
                    │   Vision   │      │              │   │            │
                    └────────────┘      └──────────────┘   └────────────┘
transearly/
├── transearly-api/          # Backend NestJS API Server
│   ├── src/
│   │   ├── modules/
│   │   │   └── translator/  # Translation services & controllers
│   │   ├── main.ts
│   │   └── app.module.ts
│   └── package.json
│
└── mobile-app/              # React Native Mobile Application
    ├── src/
    │   ├── screens/         # App screens
    │   ├── components/      # Reusable components
    │   ├── services/        # API integration
    │   └── navigation/      # Navigation setup
    └── package.json
- Node.js 18+
- npm or yarn
- Expo CLI (for mobile app)
- Google Cloud Vision API credentials
- OpenRouter API key
git clone https://github.com/your-org/transearly.git
cd transearly
cd transearly-api
npm install
# Create .env file
cp .env.example .env
# Add your credentials to .env
GOOGLE_APPLICATION_CREDENTIALS=./google-cloud-key.json
OPENROUTER_API_KEY=your_openrouter_key
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1/chat/completions
OPENROUTER_MODEL=google/gemini-2.0-flash-exp:free
# Start the server
npm run start:dev
The API will run on http://localhost:5010
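If the server fails at startup because a credential is missing, a small check like the one below can surface the problem early. This is a minimal sketch, not the project's actual code: the variable names come from the .env listing above, while `loadConfig`, `ApiConfig`, and `Env` are hypothetical names introduced for illustration.

```typescript
// Hypothetical config loader. Only the variable names are taken from the
// README's .env listing; the validation logic is an assumption.
interface ApiConfig {
  googleCredentialsPath: string;
  openRouterApiKey: string;
  openRouterBaseUrl: string;
  openRouterModel: string;
}

type Env = Record<string, string | undefined>;

const REQUIRED_VARS = [
  "GOOGLE_APPLICATION_CREDENTIALS",
  "OPENROUTER_API_KEY",
  "OPENROUTER_BASE_URL",
  "OPENROUTER_MODEL",
] as const;

function loadConfig(env: Env): ApiConfig {
  // Collect every missing or blank variable so the error names them all at once.
  const missing = REQUIRED_VARS.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    googleCredentialsPath: env["GOOGLE_APPLICATION_CREDENTIALS"] as string,
    openRouterApiKey: env["OPENROUTER_API_KEY"] as string,
    openRouterBaseUrl: env["OPENROUTER_BASE_URL"] as string,
    openRouterModel: env["OPENROUTER_MODEL"] as string,
  };
}
```

In a Node.js process this could be called as `loadConfig(process.env)` before the app starts listening.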
cd mobile-app
npm install
# Create .env file
cp .env.example .env
# Update API_URL in src/config/api.config.js
# For local development: http://YOUR_LOCAL_IP:5010
# Start the app
npm start
# or use tunnel mode
npm run tunnel
- Go to Google Cloud Console
- Create a new project or select existing one
- Enable Cloud Vision API
- Create a Service Account and download JSON key
- Save the key as google-cloud-key.json in transearly-api/
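A quick sanity check of the downloaded key can catch a wrong file before the API is started. The sketch below assumes the standard service account key layout (a JSON object with `type`, `project_id`, `private_key`, and `client_email`); `checkServiceAccountKey` is a hypothetical helper, not part of this project or of any Google library.

```typescript
// Returns a list of problems found in the key file contents; empty means OK.
function checkServiceAccountKey(json: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(json);
  } catch {
    return ["file is not valid JSON"];
  }
  if (typeof parsed !== "object" || parsed === null) {
    return ["top-level value is not a JSON object"];
  }
  const key = parsed as Record<string, unknown>;
  const problems: string[] = [];
  // Service account keys identify themselves with this exact type string.
  if (key["type"] !== "service_account") {
    problems.push('"type" should be "service_account"');
  }
  for (const field of ["project_id", "private_key", "client_email"]) {
    const value = key[field];
    if (typeof value !== "string" || value.length === 0) {
      problems.push(`missing "${field}"`);
    }
  }
  return problems;
}
```

The file contents can be read with Node's `fs.readFileSync("./google-cloud-key.json", "utf8")` and passed in as the `json` argument.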
- Framework: NestJS (Node.js)
- Queue: Bull + Redis
- OCR: Google Cloud Vision API
- AI Translation: OpenRouter (Gemini Flash)
- WebSocket: Socket.io
- Document Processing: pdf-lib, mammoth, exceljs, pptxgenjs
- Framework: React Native + Expo
- UI Components: React Native core components
- Navigation: React Navigation
- HTTP Client: Axios
- Real-time: Socket.io Client
- Media: Expo Camera, Expo Audio
- Google Cloud Vision API - Text detection and OCR
- OpenRouter - AI translation (Gemini Flash model)
- Redis - Queue management
🇻🇳 Vietnamese | 🇬🇧 English | 🇪🇸 Spanish | 🇫🇷 French | 🇩🇪 German | 🇯🇵 Japanese | 🇰🇷 Korean | 🇨🇳 Chinese
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ [<] Image Translation [๐ป๐ณ] โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ โโโโโโโโโโโโโโโโโโโโ โ
โ โ ็ฅ้ๅฎ็่้ข... โโโ Tap โ
โ โโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ Translation โ โ
โ โ Original: ็ฅ้ๅฎ็... โ โ
โ โ Translated: Chรบc bแบกn... โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ [๐ท Translate] โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
POST /translator/text # Translate text
POST /translator/image # Translate image with OCR
POST /translator/upload # Upload document for translation
GET /translator/status/:jobId # Check translation job status
GET /translator/download/:fileName # Download translated document
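Document translation is queue-based, so a client uploads a file, waits for the job to finish, and then downloads the result. Live progress is delivered over WebSocket, but the status endpoint can also be polled. The sketch below covers polling only. The `JobStatus` shape (state, progress, fileName) is an assumption, since the response format of /translator/status/:jobId is not documented here, and `statusUrl`, `downloadUrl`, and `waitForJob` are hypothetical helper names.

```typescript
// Assumed response shape for GET /translator/status/:jobId; the real API may differ.
type JobState = "waiting" | "active" | "completed" | "failed";

interface JobStatus {
  state: JobState;
  progress?: number; // assumed: 0-100
  fileName?: string; // assumed: name to pass to /translator/download/:fileName
}

const statusUrl = (baseUrl: string, jobId: string): string =>
  `${baseUrl}/translator/status/${encodeURIComponent(jobId)}`;

const downloadUrl = (baseUrl: string, fileName: string): string =>
  `${baseUrl}/translator/download/${encodeURIComponent(fileName)}`;

// The HTTP call is injected so the loop works with fetch, Axios, or a test stub.
type StatusFetcher = (url: string) => Promise<JobStatus>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function waitForJob(
  baseUrl: string,
  jobId: string,
  fetchStatus: StatusFetcher,
  intervalMs = 1000,
  maxAttempts = 60,
): Promise<JobStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus(statusUrl(baseUrl, jobId));
    // Stop as soon as the job reaches a terminal state.
    if (status.state === "completed" || status.state === "failed") {
      return status;
    }
    await sleep(intervalMs);
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} checks`);
}
```

With Axios, the fetcher could be as small as `(url) => axios.get(url).then((res) => res.data)`. Once the job reports `completed`, `downloadUrl(baseUrl, status.fileName)` gives the address of the translated file under the same assumptions.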
{
"success": true,
"targetLanguage": "Vietnamese",
"segments": [
{
"position": {
"x": 15.5,
"y": 20.3,
"width": 30.2,
"height": 5.1
},
"original": "็ฅ ้ ๅฎ ็ ่ ้ข ไน ๆ
ๅ
ๆปก ๅฟซ ไน",
"translated": "Chรบc chuyแบฟn ฤi ฤแบฟn hแปc viแปn Ngแปc Bแบฃo Tแปฅy ฤแบงy niแปm vui"
}
]
}|
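On the mobile side, each segment has to be drawn as a tappable box over the photo. The sketch below types the response shown above and converts a segment's position into pixels. It assumes the `position` values are percentages (0-100) of the image width and height, which the sample numbers suggest but the API does not state; `toPixelRect` and `segmentAt` are hypothetical helpers.

```typescript
interface SegmentPosition {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface Segment {
  position: SegmentPosition;
  original: string;
  translated: string;
}

interface ImageTranslationResponse {
  success: boolean;
  targetLanguage: string;
  segments: Segment[];
}

interface PixelRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Assumption: position values are percentages of the displayed image size.
function toPixelRect(pos: SegmentPosition, imageWidth: number, imageHeight: number): PixelRect {
  return {
    left: Math.round((pos.x / 100) * imageWidth),
    top: Math.round((pos.y / 100) * imageHeight),
    width: Math.round((pos.width / 100) * imageWidth),
    height: Math.round((pos.height / 100) * imageHeight),
  };
}

// Returns the first segment whose box contains the tapped point, if any.
function segmentAt(
  response: ImageTranslationResponse,
  tapX: number,
  tapY: number,
  imageWidth: number,
  imageHeight: number,
): Segment | undefined {
  for (const segment of response.segments) {
    const r = toPixelRect(segment.position, imageWidth, imageHeight);
    const insideX = tapX >= r.left && tapX <= r.left + r.width;
    const insideY = tapY >= r.top && tapY <= r.top + r.height;
    if (insideX && insideY) {
      return segment;
    }
  }
  return undefined;
}
```

Keeping positions relative to the image size would let the same response be drawn at any display resolution, which is one reason to prefer percentages over raw pixels.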
- Trương Nguyễn Tiến Đạt - Full-stack Developer
- Nguyễn Minh Thắng - Backend Developer
- Nguyễn Bá Trung Nguyên - Mobile Developer
- Nguyễn Hữu Anh Tuấn - AI/ML Engineer
- ✅ Text Translation
- ✅ Image Translation with OCR
- ✅ Voice Recording & Translation
- ✅ Document Translation (PDF, DOCX, XLSX, PPTX, CSV)
- ✅ Real-time WebSocket Updates
- ✅ Interactive Bounding Box UI
- 🚧 Multi-language Audio Output
- 🚧 Offline Translation Mode
- 🚧 Translation History & Favorites
We welcome contributions! Please follow these steps:
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Google Cloud Vision API for powerful OCR capabilities
- OpenRouter for AI translation services
- NestJS for the robust backend framework
- Expo for streamlined React Native development
For questions or support, please open an issue or contact the team.