Description
The current LangGraph Helm chart (langchain/langgraph-cloud) does not support configuring the following critical Kubernetes pod spec fields through Helm values:
- terminationGracePeriodSeconds: controls how long Kubernetes waits before forcefully terminating a pod
- lifecycle.preStop: allows running commands before a container is terminated
These fields are essential for production deployments where long-running graph operations need time to complete gracefully during pod termination.
Use Case
In our production environment, we have LangGraph operations that can run for up to 15 minutes. Without proper termination grace period configuration:
- Long-running graphs are forcefully terminated after the default 30 seconds
- Users experience failures when pods are evicted or during rolling updates
- No proper connection draining occurs, leading to "connection refused" errors
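When sizing these values, note that Kubernetes runs the preStop hook inside the termination grace period, not in addition to it. The time left for in-flight graphs after the hook finishes is therefore roughly the grace period minus the hook duration. A minimal sketch of that arithmetic follows; `drain_budget` is a hypothetical helper name, and the 60-second hook duration is an illustrative assumption:

```shell
#!/bin/sh
# Hypothetical helper: seconds left for in-flight work once the preStop hook
# has finished, given that the hook runs inside the grace period.
drain_budget() {
  grace="$1"    # terminationGracePeriodSeconds
  prestop="$2"  # duration of the preStop hook, in seconds (assumed known)
  remaining=$((grace - prestop))
  # The budget cannot go below zero: the pod is simply killed at the deadline.
  if [ "$remaining" -lt 0 ]; then
    remaining=0
  fi
  echo "$remaining"
}

drain_budget 900 60   # prints 840
drain_budget 30 60    # prints 0
```

Under these assumptions, a 900-second grace period with a 60-second preStop sleep leaves about 840 seconds for a graph run, so a run that needs the full 15 minutes would want a grace period somewhat above 900.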
Current Workaround
Currently, we have to manually patch the deployments after every Helm install or upgrade:
# API Server
kubectl patch deployment langgraph-api-server -p '{
  "spec": {
    "template": {
      "spec": {
        "terminationGracePeriodSeconds": 900,
        "containers": [{
          "name": "api-server",
          "lifecycle": {
            "preStop": {
              "exec": {
                "command": ["/bin/sh", "-c", "sleep 60"]
              }
            }
          }
        }]
      }
    }
  }
}'
# Worker
kubectl patch deployment langgraph-queue -p '{
  "spec": {
    "template": {
      "spec": {
        "terminationGracePeriodSeconds": 900,
        "containers": [{
          "name": "queue",
          "lifecycle": {
            "preStop": {
              "exec": {
                "command": ["/bin/sh", "-c", "sleep 60"]
              }
            }
          }
        }]
      }
    }
  }
}'
Proposed Solution
Add support for these fields in the Helm values.yaml:
apiServer:
  deployment:
    # Add support for pod-level termination grace period
    terminationGracePeriodSeconds: 900
    # Add support for container lifecycle hooks
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "sleep 60"]
    # Existing fields...
    resources:
      requests:
        cpu: 2000m
        memory: 4Gi
queue:
  deployment:
    # Same support for worker pods
    terminationGracePeriodSeconds: 900
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "sleep 60"]
Template Changes Required
In the deployment templates (api-server-deployment.yaml and similar):
spec:
  template:
    spec:
      {{- if .Values.apiServer.deployment.terminationGracePeriodSeconds }}
      terminationGracePeriodSeconds: {{ .Values.apiServer.deployment.terminationGracePeriodSeconds }}
      {{- end }}
      containers:
        - name: {{ .Values.apiServer.name }}
          {{- if .Values.apiServer.deployment.lifecycle }}
          lifecycle:
            {{- toYaml .Values.apiServer.deployment.lifecycle | nindent 12 }}
          {{- end }}
          # ... rest of container spec
Additional Context
This is particularly important for:
- Kubernetes clusters with node autoscaling (Karpenter, Cluster Autoscaler)
- Production environments with strict SLAs
- Long-running AI/ML workloads typical in LangGraph applications