A Go daemon that syncs MongoDB to Elasticsearch and Milvus in real time. It is designed for building hybrid search systems that combine traditional text search with vector similarity search.
Based on Monstache by Ryan Wynn, with added support for Milvus vector database.
- Dual-Engine Sync: Simultaneously sync MongoDB to both Elasticsearch and Milvus
- Real-time Streaming: Uses MongoDB Change Streams for instant data updates
- Vector Search Ready: Native support for Milvus/Zilliz Cloud vector database
- High Availability: Cluster mode with automatic failover
- Direct Reads: Bulk load existing data with parallel processing
- Flexible Mapping: Custom field mappings and transformations via JavaScript or Go plugins
- Data Relations: Support for document relationships across collections
- GridFS Support: Index file content from MongoDB GridFS
- Time Machine: Historical data indexing with timestamps
- Alerting: Built-in alert system (Feishu/Lark integration, customizable)
- Monitoring: HTTP endpoints for health checks and statistics
- Resume from last position (timestamp or token-based)
- Configurable batch sizes and concurrency
- Comprehensive logging and metrics
- Docker and Kubernetes support
- Automatic reconnection and error handling
- Why Monstache-Milvus?
- Architecture
- Quick Start
- Installation
- Configuration
- Usage Examples
- MongoDB Setup
- Development
- FAQ
- Contributing
- License
Perfect for building:
- Hybrid Search Systems: Combine keyword search (Elasticsearch) with semantic search (Milvus)
- AI/ML Applications: Sync embeddings from MongoDB to Milvus for similarity search
- Real-time Analytics: Keep your search and vector databases in sync with MongoDB
- Data Migration: Migrate large MongoDB datasets to Elasticsearch and Milvus efficiently
| Feature | Original Monstache | Monstache-Milvus |
|---|---|---|
| Elasticsearch Sync | Yes | Yes |
| Milvus/Zilliz Sync | No | Yes |
| Dual Engine Writes | No | Yes |
graph TB
MongoDB[(MongoDB)] -->|Change Streams| Monstache[Monstache-Milvus]
MongoDB -->|Direct Read| Monstache
Monstache -->|Text Data| Elasticsearch[(Elasticsearch)]
Monstache -->|Vector Data| Milvus[(Milvus/Zilliz)]
Monstache -->|GridFS| Files[File Processing]
Monstache -->|Scripts| Transform[JS/Go Plugins]
style Monstache fill:#4CAF50
style MongoDB fill:#47A248
style Elasticsearch fill:#005571
style Milvus fill:#00ADD8
Detailed architecture diagram: architecture.mermaid
- Change Detection: Monitors MongoDB using Change Streams or Oplog
- Transformation: Apply custom mappings, filters, and transformations
- Dual Write:
  - Milvus receives vector data for similarity search
  - Elasticsearch receives full documents for text search
- Progress Tracking: Save resume tokens for fault tolerance
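The four steps above can be sketched in Go as follows. This is a minimal illustration, not the daemon's actual code: the `Change`, `Sink`, and `memSink` types and the `process` and `dropPassword` functions are hypothetical names introduced here, and the rule that only documents carrying an embedding reach the vector sink is an assumption.

```go
package main

import "fmt"

// Change is a hypothetical, simplified change event; the real daemon's types differ.
type Change struct {
	ID     string
	Fields map[string]interface{}
	Vector []float32 // embedding stored on the MongoDB document, if any
	Token  string    // resume token identifying this event
}

// Sink is a hypothetical write target (Elasticsearch or Milvus in the steps above).
type Sink interface {
	Write(c Change) error
}

// memSink records the IDs it receives so the sketch runs without external services.
type memSink struct{ ids []string }

func (m *memSink) Write(c Change) error {
	m.ids = append(m.ids, c.ID)
	return nil
}

// process transforms the change, writes it to both sinks, and returns the
// resume token to persist once every write has succeeded.
func process(c Change, transform func(Change) Change, text, vector Sink) (string, error) {
	c = transform(c)
	if err := text.Write(c); err != nil {
		return "", fmt.Errorf("text sink: %w", err)
	}
	// Assumption: only documents that carry an embedding go to the vector sink.
	if len(c.Vector) > 0 {
		if err := vector.Write(c); err != nil {
			return "", fmt.Errorf("vector sink: %w", err)
		}
	}
	return c.Token, nil
}

// dropPassword is an example transformation that removes a sensitive field.
func dropPassword(c Change) Change {
	delete(c.Fields, "password")
	return c
}

func main() {
	es, mv := &memSink{}, &memSink{}
	c := Change{
		ID:     "42",
		Fields: map[string]interface{}{"title": "hello", "password": "secret"},
		Vector: []float32{0.1, 0.2, 0.3},
		Token:  "token-1",
	}
	token, err := process(c, dropPassword, es, mv)
	if err != nil {
		fmt.Println("sync failed:", err)
		return
	}
	fmt.Println("save resume token:", token)
}
```

Returning the token only after both writes succeed means a crash between the two writes replays the event on restart rather than skipping it.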
- Go 1.21+ (for building from source)
- MongoDB 3.6+ (4.0+ recommended for Change Streams)
- Elasticsearch 7.0+ (optional)
- Milvus 2.0+ or Zilliz Cloud account (optional)
# 1. Clone the repository
git clone https://github.com/doing-cr7/monstache-milvus.git
cd monstache-milvus
# 2. Copy and configure
cp config.example.toml config.toml
vim config.toml # Edit with your MongoDB, ES, and Milvus credentials
# 3. Build and run
make build
./bin/monstache -f config.toml

# MongoDB connection
mongo-url = "mongodb://user:pass@localhost:27017"
# Elasticsearch (optional)
elasticsearch-urls = ["http://localhost:9200"]
# Milvus/Zilliz (optional)
zilliz-enabled = true
zilliz-addr = "https://your-cluster.zillizcloud.com:19530"
zilliz-api-key = "your-api-key"
zilliz-collection-name = "your_collection"
# What to sync
change-stream-namespaces = ["mydb.mycollection"]

# Check health
curl http://localhost:8080/healthz
# Check statistics
curl http://localhost:8080/stats

# Clone repository
git clone https://github.com/doing-cr7/monstache-milvus.git
cd monstache-milvus
# Build binary
go build -o bin/monstache monstache.go
# Or use Makefile
make build

# Using Docker
docker pull doing-cr7/monstache-milvus:latest
docker run -d \
-v /path/to/config.toml:/config.toml \
doing-cr7/monstache-milvus:latest \
-f /config.toml

version: '3.8'
services:
  monstache:
    image: doing-cr7/monstache-milvus:latest
    volumes:
      - ./config.toml:/config.toml
    command: -f /config.toml
    environment:
      - MONSTACHE_MONGO_URL=${MONGO_URL}
      - MONSTACHE_ZILLIZ_API_KEY=${ZILLIZ_API_KEY}
    restart: unless-stopped

See docker/release/README.md for Kubernetes deployment examples.
mongo-url = "mongodb://localhost:27017"
elasticsearch-urls = ["http://localhost:9200"]
change-stream-namespaces = [""] # Watch all databases

# MongoDB
mongo-url = "mongodb://user:pass@mongo1:27017,mongo2:27017/admin?replicaSet=rs0"
# Elasticsearch
elasticsearch-urls = ["http://es1:9200", "http://es2:9200"]
elasticsearch-max-conns = 10
# Milvus
zilliz-enabled = true
zilliz-addr = "your-milvus-endpoint:19530"
zilliz-api-key = "your-api-key"
zilliz-collection-name = "embeddings"
zilliz-max-conns = 4
zilliz-max-docs = 256
# High Availability
cluster-name = "prod-sync-cluster"
resume = true
resume-strategy = 1 # Token-based
# Performance
direct-read-concur = 4
elasticsearch-max-docs = 1000
# Monitoring
enable-http-server = true
http-server-addr = ":8080"

- Basic: config.example.toml
- Detailed Guide: CONFIGURATION.md
- Environment Variables: See CONFIGURATION.md
mongo-url = "mongodb://localhost:27017"
elasticsearch-urls = ["http://localhost:9200"]
change-stream-namespaces = ["mydb.products"]
[[mapping]]
namespace = "mydb.products"
index = "products_index"

mongo-url = "mongodb://localhost:27017"
# Enable Milvus sync
zilliz-enabled = true
zilliz-addr = "localhost:19530"
zilliz-api-key = "your-key"
zilliz-collection-name = "document_embeddings"
# Sync specific collection with embeddings
change-stream-namespaces = ["mydb.documents"]

# Sync to both Elasticsearch and Milvus
mongo-url = "mongodb://localhost:27017"
# Text search in Elasticsearch
elasticsearch-urls = ["http://localhost:9200"]
# Vector search in Milvus
zilliz-enabled = true
zilliz-addr = "localhost:19530"
zilliz-api-key = "your-key"
zilliz-collection-name = "vectors"
# Watch same collection
change-stream-namespaces = ["mydb.articles"]

Create a JavaScript transformation:
// transform.js
module.exports = function(doc) {
  // Add computed field
  doc.fullName = doc.firstName + " " + doc.lastName;
  // Filter out sensitive data
  delete doc.password;
  return doc;
}

Configure it:
[[script]]
namespace = "mydb.users"
path = "./transform.js"

# Bulk load existing data
direct-read-namespaces = ["mydb.products"]
direct-read-concur = 4 # Parallel workers
direct-read-split-max = 4 # Split large collections
# Exit after initial sync (optional)
exit-after-direct-reads = true

Monstache requires specific MongoDB permissions to function properly.
// Connect to MongoDB
use admin
// Create dedicated user (replace the placeholders with your own values)
db.createUser({
  user: "<username>",
  pwd: "<password>",
  roles: [
    { role: "readWrite", db: "admin" },
    { role: "readWrite", db: "<logic db>" },
    { role: "readWrite", db: "monstache" },
    { role: "clusterMonitor", db: "admin" }
  ]
})

# Replica Set (recommended)
mongodb://monstache:password@mongo1:27017,mongo2:27017/?replicaSet=rs0
# Standalone (for development only)
mongodb://monstache:password@localhost:27017
# With authentication database
mongodb://monstache:password@localhost:27017/admin?authSource=admin

# Install dependencies
go mod download
# Build
make build
# Build for specific platform
GOOS=linux GOARCH=amd64 make build

monstache-milvus/
├── monstache.go           # Main application
├── monstache_test.go      # Tests
├── dao/
│   └── milvus/            # Milvus integration
├── pkg/
│   └── oplog/             # Oplog processing
├── monstachemap/          # Plugin system
├── docker/                # Docker configurations
└── config.example.toml    # Example configuration
Steps to contribute:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Q: What's the difference from original Monstache?
A: We've added native Milvus/Zilliz support for vector search, enabling hybrid search systems that combine text and semantic search.
Q: Can I sync to only Milvus (without Elasticsearch)?
A: Yes! Set zilliz-enabled = true and omit elasticsearch-urls. You can use either or both.
Q: Does it support MongoDB standalone?
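A: Change Streams require a replica set or sharded cluster, so real-time sync needs one even in development; a single-node replica set is sufficient there. Use a multi-member replica set in production.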
Q: What happens if Monstache crashes?
A: It resumes from the last saved position (timestamp or token) when resume = true is configured.
Q: How fast is the sync?
A: Depends on your setup. Typically processes 1000-5000 docs/sec. Use elasticsearch-max-conns and zilliz-max-conns to tune.
Q: How to handle large existing datasets?
A: Use direct-read-namespaces with direct-read-concur for parallel bulk loading.
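The idea behind parallel bulk loading can be illustrated with a bounded worker pool. The sketch below is not the daemon's implementation: `loadSegments` and the segment names are hypothetical, and the `workers` argument only plays the role that direct-read-concur plays in the configuration.

```go
package main

import (
	"fmt"
	"sync"
)

// loadSegments runs load over every segment using at most `workers` goroutines
// and returns the total number of documents reported by load.
func loadSegments(segments []string, workers int, load func(segment string) int) int {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker pulls segments until the channel is closed.
			for seg := range jobs {
				n := load(seg)
				mu.Lock()
				total += n
				mu.Unlock()
			}
		}()
	}
	for _, seg := range segments {
		jobs <- seg
	}
	close(jobs)
	wg.Wait()
	return total
}

func main() {
	// Hypothetical segments of one collection, e.g. produced by splitting on _id ranges.
	segments := []string{"mydb.products/0", "mydb.products/1", "mydb.products/2", "mydb.products/3"}
	total := loadSegments(segments, 4, func(segment string) int {
		return 100 // stand-in for reading and indexing one segment
	})
	fmt.Println("documents loaded:", total)
}
```

Raising the worker count helps only while MongoDB and the target engines can absorb the extra load, so it is worth increasing it gradually.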
Q: "Unable to connect to MongoDB"
A: Check the connection string, replica set name, and network connectivity. Verify the connection with mongosh (or the legacy mongo shell) first.
Q: "Change streams are not supported"
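A: Change Streams require MongoDB 3.6+ running as a replica set. Where Change Streams are unavailable, enable-oplog = true (legacy) falls back to oplog tailing; note that the oplog also exists only on replica set members, so a standalone server must first be converted to a single-node replica set.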
Q: "Zilliz collection not found"
A: Create the collection in Milvus/Zilliz first. Monstache doesn't auto-create collections.
Q: Performance is slow
A: Tune batch sizes (elasticsearch-max-docs, zilliz-max-docs), increase workers (elasticsearch-max-conns), or check network latency.
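To see how the batch-size settings interact, here is a small Go sketch of a batcher that flushes when either a document-count or a byte-size limit is reached. It is a simplified model under stated assumptions: the `Batcher` type and its methods are hypothetical, and the time-based flush controlled by the max-seconds settings is left out.

```go
package main

import "fmt"

// Batcher buffers encoded documents and flushes when either limit is reached.
// MaxDocs and MaxBytes mirror the roles of the *-max-docs and *-max-bytes settings.
type Batcher struct {
	MaxDocs  int
	MaxBytes int
	Flush    func(batch [][]byte)

	buf   [][]byte
	bytes int
}

// Add appends one document and flushes if a limit has been reached.
func (b *Batcher) Add(doc []byte) {
	b.buf = append(b.buf, doc)
	b.bytes += len(doc)
	if len(b.buf) >= b.MaxDocs || b.bytes >= b.MaxBytes {
		b.FlushNow()
	}
}

// FlushNow sends whatever is buffered, for example at shutdown.
func (b *Batcher) FlushNow() {
	if len(b.buf) == 0 {
		return
	}
	b.Flush(b.buf)
	// Start a fresh buffer so the flushed slice is not reused.
	b.buf, b.bytes = nil, 0
}

func main() {
	b := &Batcher{
		MaxDocs:  3,
		MaxBytes: 1 << 20,
		Flush: func(batch [][]byte) {
			fmt.Printf("flushing %d docs\n", len(batch))
		},
	}
	for i := 0; i < 7; i++ {
		b.Add([]byte(`{"n":1}`))
	}
	b.FlushNow() // send the final partial batch
}
```

Larger batches reduce per-request overhead but increase memory use and the amount of work replayed after a failure, which is why both limits are usually tuned together.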
For more issues: GitHub Issues
# Enable HTTP server
enable-http-server = true
http-server-addr = ":8080"
# Feishu/Lark alerts (customize for your system)
is-feishu = true
alert-api-url = "https://your-webhook-url"
alert-robot-key = "your-key"

Endpoints:
- GET /healthz - Health check
- GET /stats - Sync statistics
- GET /instance - Instance information
elasticsearch-max-conns = 10 # Concurrent workers
elasticsearch-max-docs = 1000 # Batch size
elasticsearch-max-bytes = 8388608 # 8MB batch size
elasticsearch-max-seconds = 1 # Flush interval

zilliz-max-conns = 4 # Concurrent workers
zilliz-max-docs = 256 # Batch size
zilliz-max-bytes = 2097152 # 2MB batch size
zilliz-max-seconds = 500 # 0.5s flush interval (in ms)

direct-read-concur = 4 # Parallel workers
direct-read-split-max = 4 # Split large collections
direct-read-no-timeout = true # No cursor timeout

This project is built upon the excellent work of:
- Monstache Project - The foundation of this tool
- Milvus Team - Amazing vector database
Special thanks to all contributors who help improve this project!
This project is licensed under the MIT License - see the LICENSE file for details.
This project uses:
- Monstache - MIT License
- Milvus Go SDK - Apache 2.0 License
- Elastic Go Client - MIT License
- MongoDB Go Driver - Apache 2.0 License
- Bug Reports: GitHub Issues
- Feature Requests: GitHub Issues
If you find this project helpful, please consider giving it a star!
Made with ❤️ by the Monstache-Milvus community