[Proposal] Add Graceful Shutdown Support with Context-Based Lifecycle Management #10

@codersgyan

Problem Statement

Currently, the application doesn't provide a built-in mechanism for graceful shutdown, which can lead to:

  • Abrupt termination of in-flight requests/operations
  • Potential data loss or corruption during writes
  • Unclean disconnects from databases and external services
  • Failed requests that users can see during deployments or scaling events

When the application receives termination signals (SIGTERM, SIGINT), it should complete ongoing work before exiting rather than terminating immediately.

Proposed Solution

Implement a graceful shutdown mechanism that:

  1. Listens for OS signals (SIGTERM, SIGINT, SIGQUIT)
  2. Triggers a shutdown sequence with a configurable timeout
  3. Stops accepting new requests while completing existing ones
  4. Cleanly closes resources (DB connections, message queues, file handles)
  5. Returns appropriate exit codes based on shutdown success
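
To make steps 1, 2, and 5 concrete, here is a minimal sketch built on the standard library's `signal.NotifyContext`. The helper names (`shutdownOn`, `exitCode`, `run`) and the exit code values are my own assumptions, not part of the proposed API, and the 200ms parent timeout exists only so the example ends without a real signal.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// shutdownOn (hypothetical helper) blocks until ctx is cancelled, for example
// by a signal, then gives cleanup at most timeout to finish.
func shutdownOn(ctx context.Context, timeout time.Duration, cleanup func(context.Context) error) error {
	<-ctx.Done()

	// Use a fresh context: ctx is already cancelled and would abort cleanup at once.
	shutdownCtx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	done := make(chan error, 1) // buffered so a late cleanup does not block forever
	go func() { done <- cleanup(shutdownCtx) }()

	select {
	case err := <-done:
		return err
	case <-shutdownCtx.Done():
		return fmt.Errorf("shutdown exceeded %s: %w", timeout, shutdownCtx.Err())
	}
}

// exitCode maps the shutdown result to a process exit code (assumed convention).
func exitCode(err error) int {
	if err == nil {
		return 0
	}
	return 1
}

// run returns the exit code instead of calling os.Exit directly, so the
// deferred calls below still execute.
func run() int {
	// Demo only: cancel after 200ms so the example ends without a real signal.
	parent, cancel := context.WithTimeout(context.Background(), 200*time.Millisecond)
	defer cancel()

	// ctx is cancelled on the first SIGINT, SIGTERM, or SIGQUIT.
	ctx, stop := signal.NotifyContext(parent, os.Interrupt, syscall.SIGTERM, syscall.SIGQUIT)
	defer stop()

	err := shutdownOn(ctx, 30*time.Second, func(context.Context) error {
		fmt.Println("closing resources")
		return nil
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, "shutdown:", err)
	}
	return exitCode(err)
}

func main() {
	os.Exit(run())
}
```

In a real service the parent would be `context.Background()` and the cleanup would close the server and its dependencies.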

Proposed Implementation

High-Level Approach

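// pkg/shutdown/graceful.go
package shutdown

// GracefulShutdown coordinates signal handling and resource cleanup.
type GracefulShutdown struct {
    timeout time.Duration                 // upper bound for the shutdown sequence
    signals []os.Signal                   // signals that trigger shutdown
    cleanup []func(context.Context) error // ordered list of cleanup functions
}

// New returns a GracefulShutdown with the given shutdown timeout.
func New(timeout time.Duration) *GracefulShutdown

// Register adds a cleanup function to run during shutdown.
func (gs *GracefulShutdown) Register(cleanup func(context.Context) error)

// Wait blocks until a shutdown signal arrives, then runs the registered cleanups.
func (gs *GracefulShutdown) Wait(ctx context.Context) error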

Integration Example

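func main() {
    // The package lives in pkg/shutdown, so it is imported as "shutdown".
    gs := shutdown.New(30 * time.Second)

    // Register cleanup functions.
    // server is assumed to be an *http.Server; its Shutdown already takes a context.
    gs.Register(server.Shutdown)
    // db and cache are assumed to expose Close() error, so adapt the signature.
    gs.Register(func(context.Context) error { return db.Close() })
    gs.Register(func(context.Context) error { return cache.Close() })

    // Wait for signal
    if err := gs.Wait(context.Background()); err != nil {
        log.Fatal(err)
    }
}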
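
For the HTTP case, the "stop accepting new requests while completing existing ones" behavior is what `http.Server.Shutdown` already provides. The sketch below is only an illustration of that behavior: the `/slow` route, the `demo` function, and the timings are made up for the example. It starts a request, calls `Shutdown` while the handler is still running, and shows that the client still receives the full response.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

// demo (hypothetical) returns the body the client received and the error
// returned by Shutdown.
func demo() (string, error) {
	started := make(chan struct{})
	mux := http.NewServeMux()
	mux.HandleFunc("/slow", func(w http.ResponseWriter, r *http.Request) {
		close(started)                     // the request is now in flight
		time.Sleep(200 * time.Millisecond) // simulate ongoing work
		fmt.Fprint(w, "done")
	})

	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: let the OS pick a free port
	if err != nil {
		return "", err
	}
	srv := &http.Server{Handler: mux}
	go srv.Serve(ln) // returns http.ErrServerClosed once Shutdown is called

	body := make(chan string, 1)
	go func() {
		resp, err := http.Get("http://" + ln.Addr().String() + "/slow")
		if err != nil {
			body <- "request failed: " + err.Error()
			return
		}
		defer resp.Body.Close()
		b, _ := io.ReadAll(resp.Body)
		body <- string(b)
	}()

	<-started // wait until the handler is running before shutting down

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	// Shutdown closes the listener, then waits for active requests to finish
	// or for ctx to expire, whichever comes first.
	shutdownErr := srv.Shutdown(ctx)
	return <-body, shutdownErr
}

func main() {
	body, err := demo()
	fmt.Println(body, err)
}
```

Because `Shutdown` has the `func(context.Context) error` signature, it can be passed to `Register` directly, which is why the integration example registers `server.Shutdown` without a wrapper.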

Benefits

  • Supports zero-downtime deployments - in-flight requests complete before the process exits
  • Data integrity - ensure transactions complete
  • Better observability - log shutdown progress and issues
  • Configurable timeouts - balance graceful vs. forced termination
  • Composable - works with various server types (HTTP, gRPC, workers)

Implementation Details

Key Components

  1. Signal Handler: Goroutine listening for OS signals
  2. Cleanup Registry: Ordered list of cleanup functions
  3. Context Management: Propagate cancellation with timeout
  4. Error Aggregation: Collect and report all cleanup errors
  5. Shutdown Coordinator: Orchestrate the shutdown sequence
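
As a rough idea of how these components could fit together, here is a sketch of the proposed type with method bodies filled in. The bodies are assumptions on my part (default signal set, sequential execution, `errors.Join` for aggregation), and the `demo` function cancels a context as a stand-in for a real signal. Note that this version only passes the deadline to each cleanup; it does not interrupt a cleanup that ignores its context.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// GracefulShutdown mirrors the proposed struct; method bodies are assumptions.
type GracefulShutdown struct {
	timeout time.Duration
	signals []os.Signal
	cleanup []func(context.Context) error
}

// New uses SIGINT, SIGTERM, and SIGQUIT as the default signal set.
func New(timeout time.Duration) *GracefulShutdown {
	return &GracefulShutdown{
		timeout: timeout,
		signals: []os.Signal{os.Interrupt, syscall.SIGTERM, syscall.SIGQUIT},
	}
}

// Register appends to the cleanup registry; cleanups run in registration order.
func (gs *GracefulShutdown) Register(cleanup func(context.Context) error) {
	gs.cleanup = append(gs.cleanup, cleanup)
}

// Wait blocks until a signal arrives or ctx is cancelled, then runs every
// cleanup sequentially under one shared deadline and joins their errors.
func (gs *GracefulShutdown) Wait(ctx context.Context) error {
	sigCtx, stop := signal.NotifyContext(ctx, gs.signals...)
	defer stop()
	<-sigCtx.Done()

	// WithoutCancel (Go 1.21+) keeps ctx's values but drops its cancellation,
	// so cleanups still get a live context after ctx is cancelled.
	shutdownCtx, cancel := context.WithTimeout(context.WithoutCancel(ctx), gs.timeout)
	defer cancel()

	var errs []error
	for i, fn := range gs.cleanup {
		if err := fn(shutdownCtx); err != nil {
			errs = append(errs, fmt.Errorf("cleanup %d: %w", i, err))
		}
	}
	return errors.Join(errs...) // nil when no cleanup failed
}

// demo registers three fake cleanups and reports the order they ran in.
func demo() ([]string, error) {
	var order []string
	record := func(name string, err error) func(context.Context) error {
		return func(context.Context) error {
			order = append(order, name)
			return err
		}
	}

	gs := New(time.Second)
	gs.Register(record("server", nil))
	gs.Register(record("db", errors.New("db close failed")))
	gs.Register(record("cache", nil))

	ctx, cancel := context.WithCancel(context.Background())
	cancel() // stand-in for a signal so the example ends on its own
	err := gs.Wait(ctx)
	return order, err
}

func main() {
	order, err := demo()
	fmt.Println(order, err)
}
```

A failing cleanup does not stop the ones after it here, so every resource gets a chance to close and all errors are reported together.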

Considerations

  • Should cleanup functions run in parallel or sequentially?
    • Proposal: Sequential with order control (dependencies)
  • What happens if cleanup exceeds timeout?
    • Proposal: Force exit after timeout, log which cleanups didn't complete
  • Should we support shutdown hooks at different phases?
    • Proposal: Start with simple ordered list, extend if needed
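
The first two proposals above (sequential order, forced exit with a log of unfinished cleanups) might look roughly like this. `namedCleanup` and `runOrdered` are hypothetical names used only for illustration; attaching a name to each cleanup is an assumption I added so the log can say which ones were left.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// namedCleanup (hypothetical) pairs a cleanup function with a label for logging.
type namedCleanup struct {
	name string
	fn   func(context.Context) error
}

// runOrdered runs cleanups one at a time in slice order. If the timeout
// expires it stops waiting and returns the names of the cleanups that did not
// complete, including the one that was still running.
func runOrdered(timeout time.Duration, cleanups []namedCleanup) (pending []string, err error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	var errs []error
	for i, c := range cleanups {
		done := make(chan error, 1) // buffered so an abandoned cleanup can still send
		go func(fn func(context.Context) error) { done <- fn(ctx) }(c.fn)

		select {
		case cerr := <-done:
			if cerr != nil {
				errs = append(errs, fmt.Errorf("%s: %w", c.name, cerr))
			}
		case <-ctx.Done():
			for _, rest := range cleanups[i:] {
				pending = append(pending, rest.name)
			}
			return pending, errors.Join(errs...)
		}
	}
	return nil, errors.Join(errs...)
}

// demo uses a 100ms budget with a "db" cleanup that takes one second.
func demo() []string {
	quick := func(context.Context) error { return nil }
	slow := func(context.Context) error {
		time.Sleep(time.Second) // ignores ctx on purpose to model a stuck cleanup
		return nil
	}
	pending, _ := runOrdered(100*time.Millisecond, []namedCleanup{
		{"server", quick}, {"db", slow}, {"cache", quick},
	})
	return pending
}

func main() {
	if pending := demo(); len(pending) > 0 {
		fmt.Println("timed out; cleanups not completed:", pending)
		// A real program would call os.Exit(1) here to force termination.
	}
}
```

Running each cleanup in its own goroutine is what lets the coordinator give up on one that ignores its context; the trade-off is that the abandoned goroutine keeps running until the process exits.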

Alternatives Considered

  1. Use existing library (e.g., github.com/oklog/run)
    • Pro: Battle-tested
    • Con: Additional dependency, may be overkill
  2. Minimal signal handling only
    • Pro: Simple
    • Con: Doesn't coordinate multiple resources
  3. Built-in package (proposed)
    • Pro: Tailored to project needs, no external deps
    • Con: Need to maintain ourselves

Questions for Maintainers

  • Does this align with the project's architecture and goals?
  • Are there existing patterns in the codebase I should follow?
  • Should this be a separate package or integrated into existing code?
  • What's the preferred approach for testing shutdown behavior?
  • Any specific requirements around signal handling on different platforms?

Next Steps

If this proposal is accepted, I'm happy to:

  1. Create a detailed design doc
  2. Submit a PR with implementation
  3. Add comprehensive tests and documentation
  4. Provide examples for common use cases

Looking forward to your feedback!


Environment:

  • Go version: 1.25+
  • Target platforms: Linux, macOS, Windows
