CAS++: Cold-Aware Scheduling with Adaptive TTL and Prewarming

Overview

CAS++ is the first scheduler to jointly optimize cold-start cost, priorities, burstiness, and fairness in serverless computing. Its three-layer design combines composite priority, adaptive TTL, and intelligent prewarming.

Key Achievements

47% lower average latency compared to FIFO-NoEvict
Roughly 3-4× lower cold start rate for Critical (P0) tasks (12.3% vs 35.8-45.2%)
Best priority score (0.7180) among all schedulers
19% lower tail latency (p95/p99) under bursty workloads
Maintains fairness: an aging mechanism prevents starvation of low-priority tasks

Key Concepts

Three-Layer Design

Layer 1: Priority-Aware Queue

Composite priority formula: P(t) = α·B(t) + β·C_sens(t) + aging(t)
α = 0.6 (business priority weight), β = 0.4 (cold-start sensitivity weight)
Aging mechanism prevents starvation of low-priority tasks
High-priority tasks can evict low-priority warm containers
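
The composite priority above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the README gives only α and β, so the per-level business scores (`BUSINESS_SCORE`), the linear capped aging term (`AGING_RATE`, `AGING_CAP`), and the `Task` fields are all assumptions made for this example.

```python
from dataclasses import dataclass

ALPHA = 0.6  # business priority weight (value from the README)
BETA = 0.4   # cold-start sensitivity weight (value from the README)

# Assumption: numeric business score per priority level; not specified in the README.
BUSINESS_SCORE = {"P0": 1.0, "P1": 0.6, "P2": 0.3, "P3": 0.0}

# Assumption: aging grows linearly with waiting time and is capped.
AGING_RATE = 0.01  # priority gained per second of waiting
AGING_CAP = 1.0


@dataclass
class Task:
    name: str
    level: str                     # "P0" .. "P3"
    cold_start_sensitivity: float  # assumed to be normalized to [0, 1]
    enqueued_at: float             # arrival time in seconds


def composite_priority(task: Task, now: float) -> float:
    """P = ALPHA * B + BETA * C_sens + aging, with the assumed aging term."""
    waited = max(0.0, now - task.enqueued_at)
    aging = min(AGING_CAP, AGING_RATE * waited)
    return (ALPHA * BUSINESS_SCORE[task.level]
            + BETA * task.cold_start_sensitivity
            + aging)


def pick_next(queue: list, now: float) -> Task:
    """Return the queued task with the highest composite priority."""
    return max(queue, key=lambda t: composite_priority(t, now))
```

With these assumed constants, a fresh P0 task with sensitivity 0.9 scores 0.6·1.0 + 0.4·0.9 = 0.96, while a P3 task with sensitivity 0.1 that has waited 100 seconds scores 0.04 + 1.0 = 1.04 and is served first. That is the starvation-prevention effect the aging term is meant to provide.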

Layer 2: Adaptive TTL Optimization

Configurable TTL (default: 60s) balances cold starts and resource usage
Optimal TTL varies by workload and capacity constraints
CAS++ maintains the best performance across the tested TTL range (0.6-2.0s)
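
To make the TTL trade-off concrete, here is a minimal sketch of a warm pool with a fixed, configurable TTL. It is an assumption-laden illustration rather than the CAS++ code: the `WarmPool` class, its one-container-per-function model, and the method names are hypothetical, and no adaptive tuning is shown.

```python
class WarmPool:
    """Tracks warm containers and expires them after `ttl_seconds` of idleness.

    Assumption: at most one warm container per function name.
    """

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self.last_used = {}  # function name -> time of last invocation

    def expire(self, now: float) -> list:
        """Drop containers idle for longer than the TTL and return their names."""
        expired = [f for f, t in self.last_used.items() if now - t > self.ttl]
        for f in expired:
            del self.last_used[f]
        return expired

    def invoke(self, function: str, now: float) -> bool:
        """Run `function` at time `now`; return True if it was a cold start."""
        self.expire(now)
        cold = function not in self.last_used
        self.last_used[function] = now  # container is warm from now on
        return cold
```

A longer TTL keeps containers warm across wider gaps between requests, so fewer invocations are cold, but idle containers occupy capacity for longer. A shorter TTL frees capacity sooner at the cost of more cold starts.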

Layer 3: Intelligent Prewarming

Detects traffic bursts and proactively creates warm containers
Targets high cold-start cost functions (e.g., Critical tasks)
Minimal overhead (~2%) with significant tail latency reduction (19%)
Respects capacity constraints to avoid resource waste
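
One way to picture the prewarming layer is a sliding-window burst detector paired with a capacity-aware prewarm plan. The sketch below is hypothetical: the README does not describe how CAS++ detects bursts, so the window length, the arrival threshold, and the names `BurstDetector` and `plan_prewarm` are assumptions chosen for illustration.

```python
from collections import deque


class BurstDetector:
    """Flags a burst when enough arrivals fall inside a sliding time window.

    Assumption: a simple count threshold; the real detector may differ.
    """

    def __init__(self, window: float = 1.0, threshold: int = 5):
        self.window = window
        self.threshold = threshold
        self.arrivals = deque()

    def record(self, now: float) -> bool:
        """Record one arrival at time `now`; return True if a burst is detected."""
        self.arrivals.append(now)
        # Discard arrivals that have fallen out of the window.
        while self.arrivals and now - self.arrivals[0] > self.window:
            self.arrivals.popleft()
        return len(self.arrivals) >= self.threshold


def plan_prewarm(candidates, warm_count, capacity, cold_start_cost):
    """Choose functions to prewarm, most expensive cold start first,
    without exceeding the remaining container capacity."""
    free_slots = max(0, capacity - warm_count)
    ranked = sorted(candidates, key=lambda f: cold_start_cost[f], reverse=True)
    return ranked[:free_slots]
```

For example, with two free slots and assumed cold-start costs of 800 ms for `pay`, 200 ms for `query`, and 50 ms for `log`, the plan prewarms `pay` and `query` and leaves `log` cold. When the pool is already at capacity, the plan is empty, which matches the requirement to respect capacity constraints.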

Priority Levels

P0 (Critical): Business-critical requests requiring immediate response (e.g., payment transactions)
P1 (Normal): Standard requests with moderate latency tolerance (e.g., user queries)
P2 (Batch): Batch tasks that can be delayed (e.g., log analysis)
P3 (Low): Background tasks with lowest priority (e.g., data backup)

Performance Highlights

Metric	CAS++	FIFO-NoEvict	BizPriority	Improvement
Avg Latency	52.8 ms	100.4 ms	69.5 ms	47% lower (vs FIFO-NoEvict)
Priority Score	0.7180	0.5084	0.6812	5.4% higher (vs BizPriority)
p99 Latency	170.7 ms	203.5 ms	209.7 ms	19% lower (vs BizPriority)
P0 Cold Start Rate	12.3%	45.2%	35.8%	2.9-3.7× lower
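
The improvement figures can be recomputed from the raw values in the table. The short sketch below does that arithmetic; the helper names `pct_lower`, `pct_higher`, and `ratio` are introduced here for illustration only.

```python
def pct_lower(ours: float, baseline: float) -> float:
    """Percentage by which `ours` is below `baseline`."""
    return (baseline - ours) / baseline * 100


def pct_higher(ours: float, baseline: float) -> float:
    """Percentage by which `ours` is above `baseline`."""
    return (ours - baseline) / baseline * 100


def ratio(baseline: float, ours: float) -> float:
    """How many times larger `baseline` is than `ours`."""
    return baseline / ours


# Average latency, CAS++ vs FIFO-NoEvict
print(round(pct_lower(52.8, 100.4), 1))      # 47.4
# Priority score, CAS++ vs BizPriority
print(round(pct_higher(0.7180, 0.6812), 1))  # 5.4
# p99 latency, CAS++ vs BizPriority and vs FIFO-NoEvict
print(round(pct_lower(170.7, 209.7), 1))     # 18.6
print(round(pct_lower(170.7, 203.5), 1))     # 16.1
# P0 cold start rate, FIFO-NoEvict and BizPriority relative to CAS++
print(round(ratio(45.2, 12.3), 1))           # 3.7
print(round(ratio(35.8, 12.3), 1))           # 2.9
```

The 5.4% and 19% figures correspond to the BizPriority baseline, while the 47% figure corresponds to FIFO-NoEvict.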

Experimental Results

Average Latency vs Capacity

CAS++ consistently achieves the lowest average latency across all capacity levels, with the greatest advantage under resource constraints.

Figure: Average latency vs capacity (lower is better). CAS++ excels under resource constraints, showing a 47% improvement over FIFO-NoEvict.

Priority Score vs Capacity

CAS++ maintains the highest priority score, effectively balancing business priorities with cold-start costs.

Figure: Priority score vs capacity (higher is better). CAS++ achieves the best priority score (0.7180) across all capacity levels.

TTL Impact Analysis

Cold Start Rate vs TTL

CAS++ maintains low cold start rates across different TTL configurations (0.6-2.0s).

Priority Score vs TTL

CAS++ demonstrates robust performance with optimal TTL around 60s, balancing cold starts and resource usage.

Demo Features

Section 1: Performance Comparison

Run simulations with adjustable parameters
Compare 5 schedulers side-by-side
View real-time metrics (latency, priority score, cold starts)

Section 2: Interactive Visualization

Step-by-step task execution animation
Live container pool visualization
Prewarming indicator during traffic bursts
Execution log with detailed scheduling decisions

Section 3: Research Results

TTL impact analysis
Capacity constraints comparison
Tail latency breakdown by priority
Experimental setup details

License

This project is licensed under the MIT License - see the LICENSE file for details.
