Description
We've been hitting memory limits with our Python/FastAPI backend. This issue tracks the migration of our backend services from Python to Go, with Memcached added as a cache to improve performance and reduce the memory footprint.
Background
Current issues with Python backend:
- High memory consumption under load
- GIL limitations affecting concurrent request handling
- Memory leaks in long-running async operations
- Slow startup times that affect container orchestration
Objectives
- Reduce memory usage by 70-80%
- Improve request throughput by 3-5x
- Reduce cold start times for containerized deployments
- Maintain feature parity with existing Python implementation
- Improve caching strategy with Memcached
Proposed Architecture Changes
1. Technology Stack Migration
From:
- Python 3.13 with FastAPI
- SQLAlchemy ORM (async)
- Alembic for migrations
- In-memory caching
- asyncio for concurrency
To:
- Go 1.24+
- Gin or Fiber web framework
- GORM or sqlx for database
- golang-migrate for migrations
- Memcached for distributed caching
- Goroutines for concurrency
2. Service Architecture Changes
Current Structure:
backend/
├── app/
│ ├── apis/endpoints/ # FastAPI routers
│ ├── core/ # Config, security
│ ├── crud/ # Database operations
│ ├── models/ # SQLAlchemy models
│ └── schemas/ # Pydantic schemas
Proposed Go Structure:
backend-go/
├── cmd/
│ └── server/ # Main application entry
├── internal/
│ ├── api/ # HTTP handlers
│ ├── auth/ # Authentication logic
│ ├── cache/ # Memcached client wrapper
│ ├── config/ # Configuration management
│ ├── database/ # DB connection and queries
│ ├── middleware/ # HTTP middleware
│ ├── models/ # Domain models
│ ├── realtime/ # WebSocket handlers
│ └── storage/ # Storage service client
├── pkg/
│ └── utils/ # Shared utilities
└── migrations/ # SQL migrations
3. Key Component Migrations
Authentication & Security
Current (backend/app/core/security.py):
def create_access_token(subject: str, expires_delta: timedelta = None) -> str:
    # JWT creation with python-jose
Proposed Go:
// internal/auth/jwt.go
func CreateAccessToken(subject string, expiresDelta time.Duration) (string, error) {
	// Use github.com/golang-jwt/jwt/v5
}
Database Layer
Current (backend/app/db/session.py):
engine = create_async_engine(settings.DATABASE_URL)
AsyncSessionLocal = async_sessionmaker(engine)
Proposed Go:
// internal/database/connection.go
type DB struct {
	*sql.DB
	cache *memcache.Client
}
func NewDB(dsn string, cacheAddr string) (*DB, error) {
	// Initialize both DB and Memcached connections
}
API Endpoints
Current (backend/app/apis/endpoints/files.py):
@router.post("/upload", response_model=FileUploadResponse)
async def upload_file_to_bucket(
    # Parameters without defaults must come before those with defaults
    upload_request: FileUploadInitiateRequest,
    db: AsyncSession = Depends(get_db),
    requester: Union[User, Literal["anon"], None] = Depends(get_current_user_or_anon),
):
    # File upload logic
Proposed Go:
// internal/api/files.go
func (h *Handler) UploadFile(c *gin.Context) {
	// Parse request
	// Check auth via middleware
	// Cache check
	// Process upload
	// Update cache
}
4. Memcached Integration
Caching Strategy:
// internal/cache/client.go
type CacheClient struct {
	mc  *memcache.Client
	ttl int
}
// Cache key patterns
const (
	UserCacheKey      = "user:%s"
	BucketCacheKey    = "bucket:%s"
	FileCacheKey      = "file:%s"
	TableMetaCacheKey = "table:meta:%s"
)
Cache Usage Example:
func (s *UserService) GetUser(ctx context.Context, userID string) (*models.User, error) {
	// Try cache first
	cacheKey := fmt.Sprintf(UserCacheKey, userID)
	if cached, err := s.cache.Get(cacheKey); err == nil {
		return unmarshalUser(cached)
	}
	// Database fetch
	user, err := s.db.GetUser(ctx, userID)
	if err != nil {
		return nil, err
	}
	// Update cache
	s.cache.Set(cacheKey, marshalUser(user), 300) // 5 min TTL
	return user, nil
}
5. Docker Compose Changes
Add Memcached Service:
# docker-compose.yml
services:
  # ... existing services ...
  memcached:
    image: memcached:1.6-alpine
    container_name: selfdb_memcached
    ports:
      - "11211:11211"
    command: memcached -m 256 -I 2m
    restart: unless-stopped
    networks:
      - selfdb_network
    healthcheck:
      # CMD-SHELL is required for the pipe; the exec form (CMD) would pass "|" to echo as a literal argument
      test: ["CMD-SHELL", "echo stats | nc localhost 11211"]
      interval: 10s
      timeout: 5s
      retries: 5
  backend:
    build:
      context: ./backend-go
      dockerfile: Dockerfile
    environment:
      - MEMCACHED_ADDR=memcached:11211
      - CACHE_TTL=300
    depends_on:
      - postgres
      - memcached
      - storage_service
6. Migration Strategy
Phase 1: Core Services (Week 1-2)
- Set up Go project structure
- Implement configuration management
- Create database connection pool with GORM/sqlx
- Set up Memcached client
- Implement JWT authentication
- Create middleware (CORS, Auth, Rate Limiting)
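For the JWT item in Phase 1, the sketch below shows the shape of token creation and verification. The issue proposes github.com/golang-jwt/jwt/v5; this version deliberately uses only the standard library so it runs without dependencies, and it should not be read as the planned implementation. The `secret` parameter, the HS256 algorithm, and the `sub`/`exp` claim set are assumptions, since the issue does not say how the Python backend signs tokens.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"strings"
	"time"
)

// claims holds the two fields assumed here: subject and expiry (Unix seconds).
type claims struct {
	Sub string `json:"sub"`
	Exp int64  `json:"exp"`
}

// sign returns the base64url-encoded HMAC-SHA256 of signingInput.
func sign(secret []byte, signingInput string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(signingInput))
	return base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
}

// CreateAccessToken returns an HS256-signed token for subject that expires
// after expiresDelta. The secret parameter is an addition to the signature
// shown in the issue.
func CreateAccessToken(secret []byte, subject string, expiresDelta time.Duration) (string, error) {
	if len(secret) == 0 {
		return "", errors.New("empty signing secret")
	}
	header := base64.RawURLEncoding.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`))
	payload, err := json.Marshal(claims{Sub: subject, Exp: time.Now().Add(expiresDelta).Unix()})
	if err != nil {
		return "", err
	}
	signingInput := header + "." + base64.RawURLEncoding.EncodeToString(payload)
	return signingInput + "." + sign(secret, signingInput), nil
}

// VerifyAccessToken checks the signature and expiry and returns the subject.
// It always recomputes HMAC-SHA256 and never trusts the header's alg field.
func VerifyAccessToken(secret []byte, token string) (string, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return "", errors.New("malformed token")
	}
	want := sign(secret, parts[0]+"."+parts[1])
	if !hmac.Equal([]byte(want), []byte(parts[2])) {
		return "", errors.New("bad signature")
	}
	raw, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return "", err
	}
	var c claims
	if err := json.Unmarshal(raw, &c); err != nil {
		return "", err
	}
	if time.Now().Unix() >= c.Exp {
		return "", errors.New("token expired")
	}
	return c.Sub, nil
}
```

Tokens issued by the Python backend will only verify here if both sides share the same secret, algorithm, and claim names, which is worth confirming before Phase 2.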
Phase 2: API Endpoints (Week 3-4)
- Migrate health check endpoints
- Migrate auth endpoints (login, register, refresh)
- Migrate user management endpoints
- Add comprehensive caching for user data
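One way to keep the user-data caching testable is to put the cache behind a small interface and treat every cache error as a miss. The following is a minimal sketch under that assumption; `Cache`, `memoryCache`, the `User` fields, and the `load` callback are hypothetical stand-ins for the Memcached wrapper, `models.User`, and the database query.

```go
package main

import (
	"encoding/json"
	"errors"
	"sync"
)

// User is a hypothetical stand-in for models.User.
type User struct {
	ID    string `json:"id"`
	Email string `json:"email"`
}

// Cache is the small surface the service needs. A Memcached-backed
// implementation would satisfy it in production.
type Cache interface {
	Get(key string) ([]byte, error)
	Set(key string, value []byte, ttlSeconds int) error
}

var errMiss = errors.New("cache miss")

// memoryCache is an in-process fake so the sketch runs without a server.
// It ignores TTLs.
type memoryCache struct {
	mu   sync.Mutex
	data map[string][]byte
}

func newMemoryCache() *memoryCache {
	return &memoryCache{data: map[string][]byte{}}
}

func (c *memoryCache) Get(key string) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.data[key]
	if !ok {
		return nil, errMiss
	}
	return v, nil
}

func (c *memoryCache) Set(key string, value []byte, ttlSeconds int) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = value
	return nil
}

// UserService reads through the cache and falls back to load on any
// cache error, so an unavailable cache only costs latency.
type UserService struct {
	cache Cache
	load  func(id string) (*User, error) // stands in for the database query
}

func (s *UserService) GetUser(id string) (*User, error) {
	key := "user:" + id
	if raw, err := s.cache.Get(key); err == nil {
		var u User
		if json.Unmarshal(raw, &u) == nil {
			return &u, nil
		}
		// A corrupt entry is treated like a miss and overwritten below.
	}
	u, err := s.load(id)
	if err != nil {
		return nil, err
	}
	if raw, err := json.Marshal(u); err == nil {
		// Best effort: a failed write only costs a future miss.
		_ = s.cache.Set(key, raw, 300)
	}
	return u, nil
}
```

With this shape, the hit and miss paths can be unit-tested against the fake, and the real Memcached client is only exercised in integration tests.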
Phase 3: Storage Integration (Week 5)
- Migrate storage service client
- Implement file upload/download endpoints
- Migrate bucket management
- Cache file metadata aggressively
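File and bucket identifiers may be long or contain spaces, while Memcached keys are limited to 250 bytes and may not contain whitespace or control characters. A sketch of a key builder that respects those limits follows; `BuildKey` and the choice to hash unsafe identifiers with SHA-256 are assumptions, not part of the proposal.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// Key patterns as listed in the Memcached Integration section.
const (
	UserCacheKey      = "user:%s"
	BucketCacheKey    = "bucket:%s"
	FileCacheKey      = "file:%s"
	TableMetaCacheKey = "table:meta:%s"

	maxKeyLen = 250 // Memcached's key length limit in bytes
)

// isUnsafe reports whether r is whitespace or a control character.
func isUnsafe(r rune) bool {
	return r <= ' ' || r == 0x7f
}

// BuildKey formats pattern with id. If the result would be too long or
// contain characters Memcached rejects, the id is replaced by its
// hex-encoded SHA-256 digest (64 characters).
func BuildKey(pattern, id string) string {
	key := fmt.Sprintf(pattern, id)
	if len(key) <= maxKeyLen && strings.IndexFunc(key, isUnsafe) < 0 {
		return key
	}
	sum := sha256.Sum256([]byte(id))
	return fmt.Sprintf(pattern, hex.EncodeToString(sum[:]))
}
```

Hashed keys are harder to read when debugging, so a real implementation might log the original identifier alongside the key.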
Phase 4: Advanced Features (Week 6-7)
- Migrate WebSocket/realtime functionality
- Implement SQL execution endpoint
- Migrate table management
- Migrate schema management
- Migrate cloud functions management
Phase 5: Testing & Optimization (Week 8)
- Performance benchmarking
- Load testing comparison
- Memory profiling
- Cache hit ratio optimization
- API compatibility testing
7. Performance Targets
Memory Usage:
- Python Backend: ~500MB-1GB per instance
- Go Backend Target: <100MB per instance
- Memcached: 256MB dedicated
Request Latency:
- Auth endpoints: <50ms (with cache)
- File metadata: <20ms (with cache)
- Large file operations: No change (storage-bound)
Throughput:
- Current: ~1000 req/s per instance
- Target: 5000+ req/s per instance
8. Breaking Changes & Compatibility
API Compatibility:
- Maintain exact same REST API structure
- Same request/response formats
- Same authentication headers
- WebSocket protocol unchanged
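Checking "same request/response formats" can be partly automated by replaying a request against both backends and comparing the decoded JSON rather than the raw bytes, because Go and Python serializers order keys and format whitespace differently. A minimal sketch, with `SameJSON` as a hypothetical helper name:

```go
package main

import (
	"encoding/json"
	"reflect"
)

// SameJSON reports whether two response bodies decode to the same JSON
// value, ignoring key order and whitespace. Numbers are decoded as
// float64, so integers above 2^53 may compare equal when they are not.
func SameJSON(a, b []byte) (bool, error) {
	var x, y any
	if err := json.Unmarshal(a, &x); err != nil {
		return false, err
	}
	if err := json.Unmarshal(b, &y); err != nil {
		return false, err
	}
	return reflect.DeepEqual(x, y), nil
}
```

This catches type drift such as an ID serialized as a number by one backend and as a string by the other. Status codes and headers would need a separate comparison.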
Configuration Changes:
- New environment variables for Memcached
- Modified DATABASE_URL format for Go (Go database drivers do not understand SQLAlchemy driver suffixes such as postgresql+asyncpg://)
- New cache TTL configurations
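The new variables could be read as sketched below. `MEMCACHED_ADDR` and `CACHE_TTL` come from the Docker Compose section; the `Config` type, the `LoadConfig` name, the defaults, and the rule that `CACHE_TTL` is a positive number of seconds are assumptions. The lookup function is injected so the loader can be tested without touching the process environment.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// Config holds the cache-related settings introduced by the migration.
type Config struct {
	MemcachedAddr string
	CacheTTL      time.Duration
}

// LoadConfig reads settings through getenv (pass os.Getenv in production).
// Unset variables fall back to assumed defaults.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		MemcachedAddr: "localhost:11211", // assumed default
		CacheTTL:      300 * time.Second, // matches CACHE_TTL=300 in the compose file
	}
	if v := getenv("MEMCACHED_ADDR"); v != "" {
		cfg.MemcachedAddr = v
	}
	if v := getenv("CACHE_TTL"); v != "" {
		secs, err := strconv.Atoi(v)
		if err != nil || secs <= 0 {
			return Config{}, fmt.Errorf("CACHE_TTL must be a positive number of seconds, got %q", v)
		}
		cfg.CacheTTL = time.Duration(secs) * time.Second
	}
	return cfg, nil
}
```

Failing at startup on a malformed `CACHE_TTL` is a design choice here; silently using the default would hide configuration mistakes.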
9. Rollback Strategy
- Keep Python backend in maintenance mode
- Use feature flags for gradual rollout
- Implement API gateway for A/B testing
- Maintain database compatibility
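A gradual rollout needs a stable way to decide which backend serves a given caller. One common approach, sketched here as an assumption rather than the planned gateway logic, is to hash a stable key such as the user ID into one of 100 buckets and send buckets below the rollout percentage to Go. `UseGoBackend` is a hypothetical name.

```go
package main

import "hash/fnv"

// UseGoBackend reports whether requests for key should be routed to the
// Go backend when percent (0 to 100) of traffic is enabled. The same key
// always lands in the same bucket, so raising percent only adds callers
// and never moves an already-migrated caller back to Python.
func UseGoBackend(key string, percent uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32()%100 < percent
}
```

Setting the percentage back to 0 routes all traffic to the Python backend, which is what makes this usable as a rollback switch.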
Acceptance Criteria
- All existing API endpoints migrated with same interface
- All tests passing with >90% coverage
- Memory usage reduced by at least 70%
- Request throughput increased by at least 3x
- Cache hit ratio >80% for frequently accessed data
- Zero downtime migration completed
- Documentation updated for Go implementation
- Docker images optimized (multi-stage builds)
- Monitoring and metrics implemented
- Load testing shows stable performance under stress
Additional Considerations
Dependency Management:
- Use Go modules for dependency management
- Pin all dependency versions
- Regular security updates
Error Handling:
- Implement structured logging with zerolog
- Proper error wrapping and context
- Graceful degradation when cache unavailable
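Error wrapping with context, as listed above, might look like the sketch below: lower layers wrap a sentinel error with `%w`, and the HTTP layer uses `errors.Is` to pick a status code. `ErrNotFound`, `lookupUser`, and `StatusFor` are hypothetical names, and the map stands in for a database.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrNotFound is a sentinel that callers can test for with errors.Is.
var ErrNotFound = errors.New("not found")

// lookupUser returns the email for id, wrapping ErrNotFound with the
// id so logs show which lookup failed.
func lookupUser(users map[string]string, id string) (string, error) {
	email, ok := users[id]
	if !ok {
		return "", fmt.Errorf("lookup user %q: %w", id, ErrNotFound)
	}
	return email, nil
}

// StatusFor maps an error chain to an HTTP status code.
func StatusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrNotFound):
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}
```

Because the sentinel survives wrapping, handlers stay independent of how many layers added context on the way up.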
Monitoring:
- Prometheus metrics for Go application
- Memcached statistics monitoring
- Custom dashboards for cache performance
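The cache hit ratio can be derived from the `get_hits` and `get_misses` counters in the output of Memcached's `stats` command. The sketch below only parses that text; `HitRatio` is a hypothetical helper, and how the stats text is fetched and exported to Prometheus is left open.

```go
package main

import (
	"bufio"
	"strconv"
	"strings"
)

// HitRatio parses the text returned by Memcached's "stats" command and
// returns get_hits / (get_hits + get_misses). ok is false when no get
// requests have been recorded yet.
func HitRatio(stats string) (ratio float64, ok bool) {
	var hits, misses uint64
	sc := bufio.NewScanner(strings.NewReader(stats))
	for sc.Scan() {
		// Each counter line has the form "STAT <name> <value>".
		f := strings.Fields(sc.Text())
		if len(f) != 3 || f[0] != "STAT" {
			continue
		}
		n, err := strconv.ParseUint(f[2], 10, 64)
		if err != nil {
			continue // non-numeric stats such as the version string
		}
		switch f[1] {
		case "get_hits":
			hits = n
		case "get_misses":
			misses = n
		}
	}
	if hits+misses == 0 {
		return 0, false
	}
	return float64(hits) / float64(hits+misses), true
}
```

These counters are cumulative since server start, so a dashboard tracking the >80% target would more usefully compute the ratio over the difference between two samples.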
Development Workflow:
- Hot reload for development (Air)
- Makefile for common tasks
- Pre-commit hooks for formatting
References
- Current Python implementation: main.py
- FastAPI to Gin migration guide
- GORM documentation
- Memcached best practices
- Go concurrency patterns