Welcome to the course repository containing all homework assignments exploring distributed systems, scalability, and cloud architecture. Each assignment builds on core concepts with hands-on implementation and performance analysis.
A foundational REST API project demonstrating web service basics and load testing.
Key Topics:
- RESTful API design patterns
- Go HTTP framework (Gin)
- In-memory data structures
- Load testing and performance metrics
- Single vs multi-instance deployment
Tech Stack: Go, Gin, Docker, Terraform, Python/Locust
Key Files:
- HW-1/web-service-gin/main.go - Service implementation
- HW-1/web-service-gin/load_testing.py - Basic load tests
- HW-1/web-service-gin/advanced_load_testing.py - Advanced metrics
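For orientation, the sketch below shows the general shape of an in-memory album endpoint. It deliberately uses the standard library's net/http instead of Gin so that it runs without extra dependencies. The Album fields, the sample record, and the callAlbums helper are illustrative assumptions, not the contents of main.go.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Album is a hypothetical record shape; the real fields live in main.go.
type Album struct {
	ID     string  `json:"id"`
	Title  string  `json:"title"`
	Artist string  `json:"artist"`
	Price  float64 `json:"price"`
}

// albums is the in-memory "database": a slice that lives as long as the process.
var albums = []Album{
	{ID: "1", Title: "Blue Train", Artist: "John Coltrane", Price: 56.99},
}

// getAlbums writes the whole slice as JSON and rejects non-GET methods.
func getAlbums(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodGet {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(albums)
}

// callAlbums exercises the handler in-process, without binding a port.
func callAlbums(method string) (int, string) {
	rec := httptest.NewRecorder()
	getAlbums(rec, httptest.NewRequest(method, "/albums", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := callAlbums(http.MethodGet)
	fmt.Println(code, body)
}
```

Because the data lives in a package-level slice, each process has its own copy. That is one reason single-instance and multi-instance deployments can behave differently under load.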
Infrastructure as Code project for cloud resource provisioning and management.
Key Topics:
- Terraform fundamentals
- AWS EC2 instance provisioning
- Security groups and network configuration
- Infrastructure state management
- SSH access and remote management
Tech Stack: Terraform, AWS (EC2, VPC), SSH
Key Files:
- HW-2/main.tf - Infrastructure definition
- HW-2/terraform.tfvars - Configuration variables
- HW-2/hw2.pem - SSH key pair
Deep dive into concurrent programming patterns and performance tradeoffs in Go.
Key Topics:
- Race condition detection and prevention
- Synchronization primitives (Mutex, RWMutex)
- Concurrent-safe data structures (sync.Map)
- Context switching analysis
- I/O blocking patterns
- Atomic operations and lock-free programming
Tech Stack: Go, Docker, Python/Locust
Key Experiments:
- Race conditions (4 implementations with varying fixes)
- Context switching overhead analysis
- File I/O buffering strategies
- Load testing with Locust framework
Key Files:
- HW-3/race-condition*.go - Race condition examples
- HW-3/context-switching-experiment.go - Context switching analysis
- HW-3/file-io-experiment.go - I/O pattern testing
- HW-3/*_EXPERIMENT.md - Detailed analysis documents
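The kind of fix these experiments compare can be sketched as follows. This is a minimal illustration, not code taken from the race-condition*.go files: the function names are hypothetical, and it shows only two of the possible repairs for a shared counter (a Mutex and an atomic add).

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countWithMutex increments a shared counter from n goroutines,
// serializing each increment with a sync.Mutex.
func countWithMutex(n int) int {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		counter int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++ // without the lock, this read-modify-write is a data race
			mu.Unlock()
		}()
	}
	wg.Wait()
	return counter
}

// countWithAtomic does the same work with a lock-free atomic add.
func countWithAtomic(n int) int64 {
	var (
		wg      sync.WaitGroup
		counter int64
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&counter, 1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&counter)
}

func main() {
	fmt.Println(countWithMutex(1000), countWithAtomic(1000)) // prints "1000 1000"
}
```

Removing the Lock/Unlock pair from the first function and running it with go run -race should make the race detector report the conflicting accesses.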
Distributed data processing pipeline demonstrating big data processing patterns.
Key Topics:
- MapReduce programming model
- Data partitioning and distributed processing
- Container orchestration
- Docker networking and communication
- Horizontal scaling analysis
- Fault tolerance and retry mechanisms
Tech Stack: Go, Docker, Docker Compose, Python
Architecture:
- Splitter: Partitions input data into chunks
- Mapper: Processes chunks in parallel (word counting)
- Reducer: Aggregates results from mappers
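The three stages above can be sketched in a single process. The real pipeline runs each stage as its own container, so the goroutines below only stand in for the mapper containers; all function names here are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// split partitions words into at most n roughly equal chunks (splitter stage).
func split(words []string, n int) [][]string {
	if n < 1 {
		n = 1
	}
	size := (len(words) + n - 1) / n // ceiling division
	var chunks [][]string
	for start := 0; start < len(words); start += size {
		end := start + size
		if end > len(words) {
			end = len(words)
		}
		chunks = append(chunks, words[start:end])
	}
	return chunks
}

// mapChunk counts the words in one chunk (mapper stage).
func mapChunk(chunk []string) map[string]int {
	counts := make(map[string]int)
	for _, w := range chunk {
		counts[strings.ToLower(w)]++
	}
	return counts
}

// reduce merges the per-chunk counts into one result (reducer stage).
func reduce(partials []map[string]int) map[string]int {
	total := make(map[string]int)
	for _, p := range partials {
		for w, c := range p {
			total[w] += c
		}
	}
	return total
}

// wordCount runs all three stages, mapping the chunks in parallel.
func wordCount(text string, workers int) map[string]int {
	chunks := split(strings.Fields(text), workers)
	partials := make([]map[string]int, len(chunks))
	var wg sync.WaitGroup
	for i, chunk := range chunks {
		wg.Add(1)
		go func(i int, chunk []string) {
			defer wg.Done()
			partials[i] = mapChunk(chunk) // each goroutine writes only its own slot
		}(i, chunk)
	}
	wg.Wait()
	return reduce(partials)
}

func main() {
	counts := wordCount("the quick brown fox jumps over the lazy dog The end", 3)
	fmt.Println(counts["the"], counts["fox"]) // prints "3 1"
}
```

Mappers share no state, which is what lets the stage scale horizontally; the reducer is the only point where results meet.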
Key Files:
- HW-4/mapreduce/orchestrator.py - Pipeline orchestration
- HW-4/mapreduce/performance.py - Metrics collection
- HW-4/mapreduce/mapper/main.go - Mapper implementation
- HW-4/mapreduce/reducer/main.go - Reducer implementation
- HW-4/mapreduce/splitter/main.go - Data splitter
Running the Pipeline:
cd HW-4/mapreduce
python orchestrator.py run # Run full pipeline
python orchestrator.py scale # Test scaling performance
API-first development using an OpenAPI specification, with local, containerized, and cloud deployment options.
Key Topics:
- API-first development methodology
- OpenAPI 3.0 specification design
- RESTful endpoint implementation
- Docker containerization
- Infrastructure as Code (Terraform)
- Load testing and performance validation
Tech Stack: Go, OpenAPI 3.0, Docker, Terraform, AWS, Python/Locust
API Endpoints:
- POST /products/{productId}/details - Add product details
- GET /products/{productId} - Retrieve product information
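As a rough illustration of what the POST endpoint has to check before storing anything, the sketch below parses the path parameter and validates a request body. The two fields come from the quick-start curl example in this README; the validation rules themselves are assumptions, and api.yaml remains the source of truth for the schema.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// ProductDetails holds only the two fields shown in the curl example;
// the full schema is defined in api.yaml.
type ProductDetails struct {
	ProductID int    `json:"product_id"`
	SKU       string `json:"sku"`
}

// productIDFromPath extracts {productId} from /products/{productId}[/details].
func productIDFromPath(path string) (int, error) {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) < 2 || parts[0] != "products" {
		return 0, errors.New("not a product path")
	}
	id, err := strconv.Atoi(parts[1])
	if err != nil || id < 1 {
		return 0, errors.New("productId must be a positive integer")
	}
	return id, nil
}

// validateDetails applies assumed rules: the body must be valid JSON,
// its product_id must match the path, and sku must be non-empty.
func validateDetails(path string, body []byte) (ProductDetails, error) {
	var d ProductDetails
	id, err := productIDFromPath(path)
	if err != nil {
		return d, err
	}
	if err := json.Unmarshal(body, &d); err != nil {
		return d, fmt.Errorf("invalid JSON: %w", err)
	}
	if d.ProductID != id {
		return d, errors.New("product_id does not match path")
	}
	if d.SKU == "" {
		return d, errors.New("sku is required")
	}
	return d, nil
}

func main() {
	d, err := validateDetails("/products/1/details",
		[]byte(`{"product_id":1,"sku":"ABC-123"}`))
	fmt.Println(d.SKU, err)
}
```

In an API-first workflow these checks would normally be derived from the specification rather than written by hand, so treat the rules above as placeholders.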
Deployment Options:
- Local development with go run
- Containerized with Docker
- Cloud deployment with Terraform (Part 2 & 3 configurations)
Key Files:
- HW-5/src/main.go - API implementation
- HW-5/src/api.yaml - OpenAPI specification
- HW-5/terraform/ - Infrastructure definitions
- HW-5/locustfile-fast.py - Load testing
Advanced software architecture following Parnas' decomposition principles for clean module design.
Key Topics:
- Modular architecture design (Parnas decomposition)
- Interface-based design and dependency injection
- Concurrent data structures (sync.Map)
- Clean code and separation of concerns
- Single responsibility principle
- Testable module design
Tech Stack: Go, Docker, Terraform, AWS, Python/Locust
Module Architecture:
main (composition root)
├── model (data representation)
├── store (concurrent-safe storage - sync.Map)
├── seeddata (seed data source)
├── generator (100K product generation)
├── search (search algorithm)
└── handler (HTTP transport layer)
Key Concept: Each module hides one design decision behind a stable interface:
- model → Product data representation
- store → Storage mechanism & concurrency
- seeddata → Seed catalog source
- generator → Expansion strategy
- search → Matching algorithm
- handler → HTTP serialization
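A minimal sketch of this idea follows: the storage decision (sync.Map) sits behind an interface, and the search code receives that interface instead of a concrete type. The interface methods, the trimmed Product fields, and the substring matching are assumptions for illustration and are not copied from the HW-6 packages.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Product is a hypothetical, trimmed-down model.
type Product struct {
	ID   int
	Name string
}

// Store is the stable interface; callers never learn how products are kept.
type Store interface {
	Put(p Product)
	Get(id int) (Product, bool)
	Each(fn func(Product) bool) // fn returns false to stop iteration
}

// syncMapStore hides the "use sync.Map" decision behind Store.
type syncMapStore struct{ m sync.Map }

func (s *syncMapStore) Put(p Product) { s.m.Store(p.ID, p) }

func (s *syncMapStore) Get(id int) (Product, bool) {
	v, ok := s.m.Load(id)
	if !ok {
		return Product{}, false
	}
	return v.(Product), true
}

func (s *syncMapStore) Each(fn func(Product) bool) {
	s.m.Range(func(_, v interface{}) bool { return fn(v.(Product)) })
}

// Search depends only on the Store interface, which the caller injects.
// It returns up to limit products whose name contains query (case-insensitive).
func Search(s Store, query string, limit int) []Product {
	if limit <= 0 {
		return nil
	}
	var hits []Product
	q := strings.ToLower(query)
	s.Each(func(p Product) bool {
		if strings.Contains(strings.ToLower(p.Name), q) {
			hits = append(hits, p)
		}
		return len(hits) < limit
	})
	return hits
}

func main() {
	var store Store = &syncMapStore{} // composition root picks the implementation
	store.Put(Product{ID: 1, Name: "Gaming Laptop"})
	store.Put(Product{ID: 2, Name: "Desk Lamp"})
	fmt.Println(len(Search(store, "laptop", 10))) // prints "1"
}
```

Swapping sync.Map for a map guarded by an RWMutex would touch only syncMapStore; Search and its callers would stay unchanged, which is the payoff of hiding the decision.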
Performance:
- 100K generated products held in a concurrent-safe store (sync.Map)
- ~5000 RPS single instance
- ~15000 RPS with 3-instance load balancing
- P50 latency: 5-10ms
Key Files:
- HW-6/main.go - Service composition
- HW-6/model/product.go - Data model
- HW-6/store/store.go - Concurrent storage
- HW-6/search/search.go - Search implementation
- HW-6/handler/handler.go - HTTP handlers
- HW-6/terraform/ - Infrastructure
- Go: 1.16+
- Docker & Docker Compose: Latest stable
- Terraform: 1.0+
- Python: 3.8+ with pip
- AWS Account: For HW-2, HW-5, HW-6 deployments
- Git: For version control
HW-1: Album Service
cd HW-1/web-service-gin
go mod tidy
go run main.go
curl http://localhost:8080/albums
HW-2: EC2 Infrastructure
cd HW-2
terraform init
terraform plan
terraform apply
HW-3: Concurrency Experiments
cd HW-3
go run -race race-condition-2.go
go run context-switching-experiment.go
docker-compose -f docker-compose-locust.yml up
HW-4: MapReduce Pipeline
cd HW-4/mapreduce
docker-compose build
python orchestrator.py run
python orchestrator.py scale
HW-5: Product API
cd HW-5/src
go mod tidy
go run main.go
curl -X POST http://localhost:8080/products/1/details -H "Content-Type: application/json" -d '{"product_id":1,"sku":"ABC-123"}'
HW-6: Product Search
cd HW-6
go mod tidy
go run main.go
curl "http://localhost:8080/search?q=laptop"
- HW-1 & HW-2: Foundations
  - Basic REST API design
  - Infrastructure provisioning
- HW-3 & HW-4: Concurrency & Scalability
  - Concurrent programming patterns
  - Distributed data processing
- HW-5 & HW-6: Advanced Architecture
  - API specification-driven design
  - Modular software architecture
- Data partitioning and distribution
- MapReduce programming model
- Horizontal scaling strategies
- Load balancing and failover
- Race conditions and synchronization
- Mutex vs RWMutex tradeoffs
- Lock-free data structures
- Context switching overhead
- Load testing methodologies
- Modular design principles (Parnas decomposition)
- Interface-based design
- Dependency injection
- Single responsibility principle
- Clean code practices
- Infrastructure as Code (Terraform)
- Docker containerization
- Container orchestration
- AWS services (EC2, VPC, ELB, CloudWatch)
- CI/CD pipelines
- RESTful principles
- OpenAPI specification
- Error handling
- Performance optimization
- Scaling patterns
| Assignment | Throughput | Latency (P50) | Optimization |
|---|---|---|---|
| HW-1 | ~500 RPS | 20-30ms | Single instance baseline |
| HW-3 | Varies | Synchronization dependent | Race condition analysis |
| HW-4 | 1 file/sec | Split, map, reduce stages | Horizontal scaling |
| HW-5 | ~1000 RPS | 10-20ms | OpenAPI specification |
| HW-6 | ~5000 RPS | 5-10ms | Optimized architecture |
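For readers unfamiliar with the two metrics in the table, the sketch below shows one common way to compute them: a nearest-rank percentile over latency samples and throughput as requests divided by elapsed seconds. Locust reports both on its own, so this is only to make the definitions concrete; the sample values are made up and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the nearest-rank p-th percentile (0 < p <= 100)
// of latency samples given in milliseconds.
func percentile(samplesMs []float64, p float64) float64 {
	if len(samplesMs) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samplesMs...) // copy so the input is not reordered
	sort.Float64s(sorted)
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// throughput converts a request count and elapsed seconds into requests per second.
func throughput(requests int, seconds float64) float64 {
	if seconds <= 0 {
		return 0
	}
	return float64(requests) / seconds
}

func main() {
	samples := []float64{12, 5, 8, 30, 7} // made-up latencies in ms
	fmt.Printf("P50=%.0fms P95=%.0fms RPS=%.0f\n",
		percentile(samples, 50), percentile(samples, 95), throughput(15000, 3))
}
```

P50 is the median, so half of the requests finished at least that fast; a tail percentile such as P95 is usually reported alongside it because a few slow requests can hide behind a good median.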
- Go 1.16+ - Backend services
- Python 3.8+ - Load testing, orchestration
- Bash/PowerShell - Scripting
- Docker - Containerization
- Docker Compose - Multi-container orchestration
- Terraform - Infrastructure as Code
- AWS - Cloud provider
- Locust - Load testing framework
- Prometheus - Metrics collection (if applicable)
- CloudWatch - AWS monitoring
Each homework directory includes:
- README.md - Complete assignment overview
- Experiment markdown files - Detailed analysis and results
- Source code - Well-commented implementations
- Configuration files - Terraform, Docker compose, API specs
After completing this course, you will understand:
- ✅ How to design and implement scalable distributed systems
- ✅ Concurrency patterns and performance tradeoffs in Go
- ✅ Data processing techniques for large-scale systems
- ✅ Cloud infrastructure provisioning and management
- ✅ API design using industry-standard specifications
- ✅ Modular architecture and clean code principles
- ✅ Performance testing and optimization strategies
- ✅ Container orchestration and deployment patterns
For specific assignment help, refer to the README.md in each homework directory.
Course materials for CS 6650: Advanced Design of Large-Scale Distributed Systems
- Start with HW-1: Get familiar with REST APIs and load testing
- Move to HW-2: Learn infrastructure provisioning
- Deep dive HW-3: Understand concurrency patterns
- Explore HW-4: Master distributed processing
- Build HW-5: Design with specifications
- Architect HW-6: Apply modular design principles
Good luck! 🚀