A full-stack demonstration of distributed coordination and cloud-native storage using:
- Go backend — HTTP API, SHA-256 deduplication, multi-worker goroutines
- etcd — distributed locking so only one worker processes each unique file
- MongoDB / MongoDB Atlas — persistent file metadata and binary storage (GridFS)
- React frontend — drag-and-drop upload UI with live status polling and file downloads
- System Architecture
- Features
- Quick Start
- Connecting to MongoDB Atlas
- Project Structure
- API Reference
- Configuration
- Host Mode (File Deletion)
- How It Works
- Key Concepts
```
User (Browser)
      |
      v
React Frontend (embedded SPA, port 8080)
      |  POST /upload
      |  GET /status/:hash
      |  GET /download/:hash
      |  DELETE /delete/:hash   (host only — requires X-Host-Token header)
      v
Go Backend API (port 8080)
      |
      +-- SHA-256 Hash   --> Deduplication Store (in-memory)
      |
      +-- File Metadata  --> MongoDB "files" collection
      |
      +-- File Content   --> MongoDB GridFS  <-- /download/:hash
      |
      +-- Disk Fallback  --> ./uploads/  (if MongoDB unavailable)
      |
      +-- Work Queue     --> Worker-1 -+
                             Worker-2 -+--> etcd /locks/<hash>
                             Worker-3 -+
                                          |
                                etcd Server (port 2379)
```
| Feature | Description |
|---|---|
| Drag-and-drop upload | Upload any file from the browser |
| SHA-256 deduplication | Same content uploaded twice gives instant duplicate response |
| Distributed locking | etcd ensures exactly one worker processes each file across a cluster |
| MongoDB persistence | File metadata and binaries stored in MongoDB / Atlas across restarts |
| GridFS binary storage | Large files stored in MongoDB GridFS |
| Download button | Download any completed file directly from the browser |
| Host-only file deletion | Authenticated hosts can delete any file via a 🗑 Delete button |
| Graceful degradation | Works fully without etcd or MongoDB (standalone / disk-only mode) |
| Docker-first | One command to start everything |
| Tool | Version |
|---|---|
| Docker & Docker Compose | any recent version |
```
git clone https://github.com/punithsai18/dsCaseStudy.git
cd dsCaseStudy
```

Skip this step to use the bundled local MongoDB container.

Create a `.env` file in the project root:

```
MONGO_URI=mongodb+srv://<username>:<password>@<cluster>.mongodb.net/?retryWrites=true&w=majority
MONGO_DB=dscasestudy
```

See Connecting to MongoDB Atlas for the full setup guide.
```
docker compose up --build
```

Expected output:

```
dscasestudy-app | Connected to MongoDB (db: dscasestudy)
dscasestudy-app | Connected to etcd at etcd:2379
dscasestudy-app | [worker-1] started
dscasestudy-app | [worker-2] started
dscasestudy-app | [worker-3] started
dscasestudy-app | Server listening on http://localhost:8080
```
Visit http://localhost:8080 in your browser.
From another device on the same network, find your IP:
```
hostname -I | awk '{print $1}'   # Linux
ipconfig getifaddr en0           # macOS
ipconfig                         # Windows
```

Then open http://<your-IP>:8080.
- Drag a file onto the drop zone — watch it move through `queued -> processing -> done`
- Click Download once it is done to download the file from MongoDB GridFS
- Upload the same file again — see the instant `duplicate` response
- Upload multiple files simultaneously — observe workers processing them in parallel
MongoDB Atlas is MongoDB's fully-managed cloud database. The backend accepts any valid connection string via the MONGO_URI environment variable — including Atlas SRV URIs.
- Sign up at mongodb.com/atlas (free tier available)
- Create a free M0 cluster (512 MB, 3-node replica set)
- In Database Access, create a database user with `readWrite` permission
- In Network Access, add your IP address (or `0.0.0.0/0` for access from any IP)
- Click Connect > Drivers and copy the connection string. It looks like:

```
mongodb+srv://<username>:<password>@cluster0.abc12.mongodb.net/?retryWrites=true&w=majority
```
Option A — .env file (recommended)
Create a .env file in the project root (next to docker-compose.yml):
```
MONGO_URI=mongodb+srv://<username>:<password>@cluster0.abc12.mongodb.net/?retryWrites=true&w=majority
MONGO_DB=dscasestudy
```

Docker Compose automatically reads this file.
Option B — pass on the command line
```
MONGO_URI="mongodb+srv://..." docker compose up --build
```

Option C — edit docker-compose.yml directly
```yaml
services:
  app:
    environment:
      MONGO_URI: "mongodb+srv://<user>:<pass>@<cluster>.mongodb.net/?retryWrites=true&w=majority"
      MONGO_DB: dscasestudy
```

When using Atlas you do not need the bundled mongo:7 container. Edit docker-compose.yml:
- Delete the `mongo:` service block
- Remove `mongo:` from `app.depends_on`
- Delete the `mongo_data:` volume entry
Then restart:

```
docker compose down && docker compose up --build
```

Check the health endpoint:

```
curl http://localhost:8080/health
```

```json
{
  "etcd": true,
  "mongo": true,
  "status": "ok",
  "timestamp": "2025-01-15T10:23:44Z",
  "workers": 3
}
```

`"mongo": true` confirms the backend is connected to your Atlas cluster.
| Collection / Bucket | Contents |
|---|---|
| `dscasestudy.files` | File metadata (hash, filename, size, status, worker, timestamps) |
| `dscasestudy.fs.files` | GridFS file descriptors |
| `dscasestudy.fs.chunks` | GridFS binary chunks (actual file data) |
```
dsCaseStudy/
+-- main.go              # Go backend: API server, workers, MongoDB integration
+-- go.mod               # Go module dependencies
+-- go.sum               # Dependency checksums
+-- vendor/              # Vendored dependencies (no internet required to build)
+-- docker-compose.yml   # Runs etcd, mongo (local), and the app
+-- Dockerfile           # Multi-stage build (embeds React into Go binary)
+-- uploads/             # Created at runtime (disk fallback for uploaded files)
|
+-- frontend/            # React application
    +-- index.html
    +-- package.json
    +-- src/
        +-- main.jsx
        +-- App.jsx      # Upload UI, status polling, download button
        +-- App.css      # Dark-theme styles
```
`POST /upload`

Upload a file for processing.

Request: `multipart/form-data` with field `file`

Response:

```json
{
  "hash": "a3f5d9e87b12c4f0...",
  "status": "queued",
  "message": "File queued for processing"
}
```

Status values: `queued` | `duplicate`
`GET /status/:hash`

Poll processing status for a file.

Response:

```json
{
  "hash": "a3f5d9e87b12c4f0...",
  "filename": "report.pdf",
  "size": 204800,
  "status": "done",
  "worker": "worker-2",
  "started_at": "2025-01-15T10:23:44Z",
  "done_at": "2025-01-15T10:23:47Z"
}
```

Status values: `queued` | `processing` | `done` | `duplicate` | `error`
List all uploaded files and their statuses.
Response: JSON array of FileStatus objects (same shape as /status/:hash).
`GET /download/:hash`

Download a file by its SHA-256 hash.

- Streams from MongoDB GridFS when available
- Falls back to the local `./uploads/` directory otherwise
- Sets `Content-Disposition: attachment` so the browser downloads the file
`DELETE /delete/:hash`

Delete a file by its SHA-256 hash. Requires host authentication.

Request headers:

| Header | Value |
|---|---|
| `X-Host-Token` | The value configured in the `HOST_TOKEN` environment variable |

Response (200 — success):

```json
{
  "status": "deleted",
  "hash": "a3f5d9e87b12c4f0..."
}
```

Error responses:

| Status | Meaning |
|---|---|
| `403 Forbidden` | Token missing, wrong, or `HOST_TOKEN` not set on the server |
| `404 Not Found` | No file with that hash exists |

On success the file is removed from: the in-memory store, disk (`./uploads/`), MongoDB GridFS, and the `files` metadata collection.
`GET /health`

Backend health check.

Response:

```json
{
  "status": "ok",
  "etcd": true,
  "mongo": true,
  "workers": 3,
  "timestamp": "2025-01-15T10:23:44Z"
}
```

All settings are controlled via environment variables.
| Variable | Default | Description |
|---|---|---|
| `PORT` | `8080` | HTTP server listen port |
| `ETCD_ENDPOINT` | `localhost:2379` | etcd server address |
| `MONGO_URI` | `mongodb://localhost:27017` | MongoDB connection string. Accepts local URIs and Atlas SRV (`mongodb+srv://...`) |
| `MONGO_DB` | `dscasestudy` | MongoDB database name |
| `HOST_TOKEN` | (empty) | Secret token that enables the `DELETE /delete/:hash` endpoint. When empty, all delete requests are rejected. See Host Mode (File Deletion). |
| `VITE_API_URL` | auto-detect | Backend URL used by the React dev server (`frontend/.env.local`) |
```
# Start etcd only (for distributed locking)
docker compose up etcd -d

# Run the Go server
go run main.go

# With Atlas:
MONGO_URI="mongodb+srv://..." go run main.go
```

For frontend hot-reload:

```
cd frontend
npm install
npm run dev   # Vite dev server on :5173
```

Create `frontend/.env.local`:

```
VITE_API_URL=http://localhost:8080
```

```
docker compose down              # stop Docker containers
PORT=9090 go run main.go         # Linux / macOS — different port
$env:PORT="9090"; go run main.go # Windows PowerShell
```

The application supports a host mode that lets a privileged user delete any uploaded file. The feature is disabled by default — it only activates when `HOST_TOKEN` is set on the server.
Docker Compose (`.env` file):

```
HOST_TOKEN=replace-with-a-strong-secret
```

Docker Compose reads `.env` automatically. Add `HOST_TOKEN` to the app service's environment block in docker-compose.yml if you prefer to set it there:

```yaml
services:
  app:
    environment:
      HOST_TOKEN: "replace-with-a-strong-secret"
```

Without Docker:
```
HOST_TOKEN=replace-with-a-strong-secret go run main.go
```

- Open http://localhost:8080
- Click the 🔑 Host Login button in the top-right area of the header
- Enter the same token you set in `HOST_TOKEN`
- Click Login (or press Enter)
A 🔑 Host Mode badge replaces the login button to confirm authentication. The token is stored in sessionStorage — it is automatically cleared when the browser tab is closed.
While in Host Mode, every file row shows a 🗑 Delete button. Click it to permanently delete the file from the server.
If the token is rejected by the server (e.g. it was changed and the page was not reloaded), the UI automatically logs out and displays an error message.
- Choose a long, random token (e.g. `openssl rand -hex 32`).
- The token is compared using a constant-time algorithm on the server to prevent timing attacks.
- `sessionStorage` is accessible to JavaScript on the same page. For higher security, consider serving the application over HTTPS so the token travels only over an encrypted connection.
- When `HOST_TOKEN` is not set, the `DELETE /delete/:hash` endpoint always returns `403 Forbidden`, regardless of what the client sends.
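The constant-time comparison mentioned above is typically done with Go's `crypto/subtle`. A minimal sketch — `validToken` is illustrative, not the server's actual handler:

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// validToken reports whether the presented token matches the configured
// secret. ConstantTimeCompare avoids short-circuiting on the first
// mismatched byte, so match position is not leaked through timing.
// (It does return early on unequal lengths, leaking only the length.)
func validToken(presented, secret string) bool {
	if secret == "" {
		return false // host mode disabled: reject everything
	}
	return subtle.ConstantTimeCompare([]byte(presented), []byte(secret)) == 1
}

func main() {
	fmt.Println(validToken("s3cret", "s3cret")) // true
	fmt.Println(validToken("guess!", "s3cret")) // false
	fmt.Println(validToken("anything", ""))     // false: HOST_TOKEN unset
}
```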
```
Browser --> POST /upload --> Go API
                               |
                   Read bytes, compute SHA-256 hash
                               |
              +-- Already seen? --> return "duplicate"
              |
              +-- New file --> save to disk (sync)
                               save to GridFS (async)
                               upsert metadata to MongoDB (async)
                               enqueue processing job
                               return "queued"
```
```
Worker picks job from queue
      |
      +-- etcd available? --YES--> acquire /locks/<hash>
      |                               |
      |                           Mark "processing"
      |                           Simulate work (3 s)
      |                           Mark "done"
      |                           Upsert metadata to MongoDB
      |                           Release lock
      |
      +-- etcd unavailable? -------> Same steps, no lock
```
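The "exactly one worker per hash" claim can be illustrated in-process. This sketch swaps etcd's `/locks/<hash>` key for a local `sync.Map`, so it demonstrates only the claim semantics, not the distributed part — in the real cluster the lock must live in etcd so it is shared across machines:

```go
package main

import (
	"fmt"
	"sync"
)

// claimed is a local stand-in for etcd's /locks/<hash> key: the first
// worker to store the hash wins; every other worker skips the file.
var claimed sync.Map

// process mirrors the worker steps above: claim the hash, then
// mark "processing", do the work, mark "done".
func process(worker, hash string) bool {
	if _, loaded := claimed.LoadOrStore(hash, worker); loaded {
		return false // another worker already holds this file
	}
	// ... mark "processing", simulate work, mark "done" ...
	return true
}

func main() {
	wins := make(chan string, 3)
	var wg sync.WaitGroup
	for _, w := range []string{"worker-1", "worker-2", "worker-3"} {
		wg.Add(1)
		go func(w string) {
			defer wg.Done()
			if process(w, "a3f5d9e8") {
				wins <- w
			}
		}(w)
	}
	wg.Wait()
	close(wins)
	fmt.Println("workers that processed the file:", len(wins)) // always 1
}
```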
```
Browser --> GET /download/<hash> --> Go API
                                       |
                               Lookup hash in store
                                       |
              +-- GridFS available? --YES--> stream from GridFS
              |
              +-- Fallback --> stream from ./uploads/<hash>_<name>
```
On restart the backend re-hydrates its in-memory store:
- Query the MongoDB `files` collection — restore all known file records
- Scan `./uploads/` — add any files not already in the store
All files survive container restarts even without MongoDB (disk-only mode).
Files are indexed by their SHA-256 hash, not their names. This enables:
- Deduplication — identical content is stored once
- Integrity verification — hash confirms the file was not corrupted
- Deterministic locking — same content always maps to the same lock key
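Content addressing fits in a few lines of Go; a minimal sketch (the helper name `dedupKey` is illustrative, not the actual function in main.go):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// dedupKey returns the hex-encoded SHA-256 digest used as the file's
// identity — the dedup map key and the etcd lock-key suffix.
func dedupKey(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

func main() {
	a := dedupKey([]byte("hello"))
	b := dedupKey([]byte("hello"))
	fmt.Println(a)
	fmt.Println(a == b) // true: identical content maps to one entry, one lock
}
```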
GridFS splits files into 255 KB chunks stored in two collections:
- `fs.files` — file metadata (name, size, upload date, custom metadata)
- `fs.chunks` — binary data chunks
Atlas replicates these automatically across nodes in the cluster.
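Given the 255 KB (255 × 1024 bytes) default chunk size, the number of chunks a file occupies is a ceiling division; a small worked example:

```go
package main

import "fmt"

// chunkCount computes how many default-sized GridFS chunks a file of
// the given byte size occupies (ceiling division).
func chunkCount(size int64) int64 {
	const chunkSize = 255 * 1024 // 261120 bytes, the GridFS default
	return (size + chunkSize - 1) / chunkSize
}

func main() {
	fmt.Println(chunkCount(204800))  // the 200 KB sample file: 1 chunk
	fmt.Println(chunkCount(1 << 20)) // a 1 MiB file: 5 chunks
}
```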
etcd uses the Raft consensus algorithm, which guarantees:
- Linearizability — every lock acquisition is globally ordered
- Fault tolerance — cluster survives minority node failures
- TTL-based lease expiry — crashed workers release locks automatically after 30 s
| Service | If unavailable |
|---|---|
| etcd | Workers still process files; no distributed locking |
| MongoDB | Files saved to ./uploads/ only; downloads served from disk |
The backend always starts regardless of which services are reachable.