Last Updated: 2026-01-04
Repository: github.com/DataDog/datadog-operator
Purpose: Kubernetes Operator for deploying and managing the Datadog Agent
| Item | Value |
|---|---|
| Language | Go 1.25.5 |
| Framework | controller-runtime v0.20.4 |
| Main Branch | main |
| API Version | v2alpha1 (current), v1alpha1 (deprecated) |
| Kubernetes Support | 1.16+ |
| Entry Point | cmd/main.go |
The Datadog Operator is a Kubernetes Operator that provides declarative deployment and lifecycle management of the Datadog Agent on Kubernetes clusters. It offers advantages over Helm charts and manual DaemonSets:
- Built-in defaults based on Datadog best practices
- Configuration validation to prevent mistakes
- First-class Kubernetes API resource
- Included in Kubernetes reconciliation loop
- Support for 43+ monitoring features
datadog-operator/
├── api/ # Kubernetes API definitions (CRDs)
│ ├── datadoghq/v2alpha1/ # Current API version (PRIMARY)
│ ├── datadoghq/v1alpha1/ # Deprecated API version
│ └── datadoghq/common/ # Shared types
├── cmd/ # Executables
│ ├── main.go # Operator manager (MAIN ENTRY)
│ ├── kubectl-datadog/ # kubectl plugin
│ └── check-operator/ # Health checker
├── internal/controller/ # Controller implementations
│ ├── datadogagent/ # Primary controller (CORE)
│ │ ├── component/ # Agent components (node, cluster, checks)
│ │ ├── feature/ # 43 feature handlers
│ │ ├── merger/ # Config merging logic (31 mergers)
│ │ ├── override/ # Resource overrides
│ │ └── controller.go # Main reconciliation logic
│ ├── datadogagentprofile/ # Agent profiles (beta)
│ ├── datadogmonitor/ # Monitor CRD
│ ├── datadogslo/ # SLO CRD
│ ├── datadogdashboard/ # Dashboard CRD
│ └── setup.go # Controller registration
├── pkg/ # Shared packages
│ ├── kubernetes/ # K8s utilities, RBAC
│ ├── config/ # Configuration management
│ ├── secrets/ # Secret backends
│ ├── constants/ # Global constants
│ └── controller/utils/ # Datadog API clients
├── config/ # Kubernetes manifests (Kustomize)
│ ├── crd/ # CRD definitions
│ ├── rbac/ # RBAC configs
│ ├── manager/ # Operator deployment
│ └── samples/ # Example CRs
├── test/e2e/ # End-to-end tests
├── examples/ # Configuration examples
├── docs/ # Documentation
├── Makefile # Build automation
├── Dockerfile # Container build
└── go.mod # Go dependencies
| CRD | API Version | Status | Purpose |
|---|---|---|---|
| DatadogAgent | v2alpha1 | Current | Primary resource for agent deployment |
| DatadogAgentInternal | v1alpha1 | Internal | Internal reconciliation state |
| DatadogAgentProfile | v1alpha1 | Beta | Advanced agent configuration profiles |
| DatadogMonitor | v1alpha1 | Stable | Datadog monitor management |
| DatadogSLO | v1alpha1 | Stable | Service Level Objective management |
| DatadogDashboard | v1alpha1 | Stable | Dashboard management |
| DatadogGenericResource | v1alpha1 | Stable | Generic resource creation |
Location: api/datadoghq/v2alpha1/datadogagent_types.go (primary)
File: internal/controller/datadogagent/controller.go
Key Methods:
- Reconcile(): Main reconciliation loop
- reconcileV2(): v2alpha1 reconciliation logic
- handleFinalizer(): Cleanup on deletion
Reconciliation Flow:
Reconcile()
→ Load DatadogAgent CR
→ Validate configuration
→ Load features (43 feature handlers)
→ Build component manifests (Agent, ClusterAgent, ClusterChecksRunner)
→ Apply/Update Kubernetes resources
→ Update status
Location: internal/controller/datadogagent/feature/
Architecture: Each feature is a self-contained package implementing the Feature interface.
Examples:
- apm/ - Application Performance Monitoring
- cspm/ - Cloud Security Posture Management
- npm/ - Network Performance Monitoring
- logcollection/ - Log collection
- admissioncontroller/ - Admission controller
Feature Registration: Features self-register via init() functions.
Interface:
type Feature interface {
    ID() string
    Configure(dda *DatadogAgent) error
    ManageDependencies(store *Store) error
    ManageClusterAgent(mgrInterface) error
    ManageNodeAgent(mgrInterface) error
}

Location: internal/controller/datadogagent/component/
Components:
- agent/: Node Agent (DaemonSet)
- clusteragent/: Cluster Agent (Deployment)
- clusterchecksrunner/: Cluster Checks Runner (Deployment)
Each component manager handles:
- Manifest generation
- Resource application
- Status updates
- RBAC creation
Location: internal/controller/datadogagent/merger/
Purpose: Merge user configuration with feature defaults
Pattern: 31 specialized merger handlers for different resource types
- Example: merger/daemonset.go, merger/deployment.go
Flow:
User Config → Feature Defaults → Global Config → Final Manifest
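To make the layering concrete, here is a small sketch of merging environment variables from several layers. It assumes that a later layer overrides an earlier one for the same variable name; the actual precedence between user config, feature defaults, and global config is decided by the operator's mergers and is not confirmed here. `EnvVar` and `mergeEnv` are simplified, hypothetical stand-ins.

```go
package main

import "fmt"

// EnvVar is a simplified stand-in for corev1.EnvVar.
type EnvVar struct {
	Name, Value string
}

// mergeEnv overlays each layer onto the result in order. A later layer replaces
// the value of a variable with the same name; new names are appended.
func mergeEnv(layers ...[]EnvVar) []EnvVar {
	var out []EnvVar
	index := map[string]int{} // variable name -> position in out
	for _, layer := range layers {
		for _, e := range layer {
			if i, ok := index[e.Name]; ok {
				out[i] = e
				continue
			}
			index[e.Name] = len(out)
			out = append(out, e)
		}
	}
	return out
}

func main() {
	featureDefaults := []EnvVar{{"DD_APM_ENABLED", "true"}, {"DD_LOG_LEVEL", "info"}}
	userOverride := []EnvVar{{"DD_LOG_LEVEL", "debug"}, {"MY_VAR", "my_value"}}
	for _, e := range mergeEnv(featureDefaults, userOverride) {
		fmt.Printf("%s=%s\n", e.Name, e.Value)
	}
}
```

Keeping the first-seen position of each variable makes the merged output stable, which avoids spurious diffs when the same inputs are merged on every reconciliation.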
| Package | Purpose | Key Files |
|---|---|---|
| pkg/kubernetes | K8s platform detection, RBAC, object utils | platform.go, rbac/ |
| pkg/config | Configuration management | config.go |
| pkg/secrets | Secret backend integration | secrets.go |
| pkg/controller/utils | Datadog API client, metrics forwarding | datadog.go |
| pkg/constants | Global constants | constants.go |
| pkg/images | Image configuration | image.go |
- Create feature package:
    mkdir -p internal/controller/datadogagent/feature/myfeature
- Implement Feature interface:
    // myfeature/feature.go
    package myfeature

    import (
        "github.com/DataDog/datadog-operator/api/datadoghq/v2alpha1"
        "github.com/DataDog/datadog-operator/internal/controller/datadogagent/feature"
    )

    // buildMyFeature is the constructor passed to Register; define it next to the struct.
    func init() {
        feature.Register(feature.MyFeatureIDType, buildMyFeature)
    }

    type myFeature struct {
        // fields
    }

    func (f *myFeature) ID() string { return string(feature.MyFeatureIDType) }

    func (f *myFeature) Configure(dda *v2alpha1.DatadogAgent) error { /* ... */ }

    // Implement other interface methods
- Add feature to API types:
  - Update api/datadoghq/v2alpha1/datadogagent_types.go
  - Add feature configuration struct
- Write tests:
    // myfeature/feature_test.go
    var _ = Describe("MyFeature", func() {
        // Ginkgo tests
    })
- Generate manifests:
    make generate
    make manifests
- Edit API types: api/datadoghq/v2alpha1/datadogagent_types.go
- Add validation markers:
    // +kubebuilder:validation:Enum=value1;value2
    // +kubebuilder:validation:Optional
- Regenerate CRDs:
    make generate
    make manifests
- Update conversion webhooks if needed:
  - Location: api/datadoghq/v2alpha1/datadogagent_conversion.go
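As an illustration of the marker step, the sketch below shows how validation markers sit on the fields of an API type. `MyFeatureConfig`, its fields, and the enum values are hypothetical and do not exist in the operator's API. The `validMode` helper only mirrors the enum in plain Go for unit tests; the API server enforces the marker through the generated CRD schema.

```go
package main

import "fmt"

// MyFeatureConfig is a hypothetical feature block used only for illustration.
type MyFeatureConfig struct {
	// Enabled turns the feature on. A pointer distinguishes "unset" from false.
	// +kubebuilder:validation:Optional
	Enabled *bool `json:"enabled,omitempty"`

	// Mode selects the collection mode.
	// +kubebuilder:validation:Enum=basic;full
	// +kubebuilder:validation:Optional
	Mode string `json:"mode,omitempty"`
}

// validMode mirrors the Enum marker above. An empty value is accepted because
// the field is optional and omitted from the serialized object when empty.
func validMode(mode string) bool {
	switch mode {
	case "", "basic", "full":
		return true
	}
	return false
}

func main() {
	cfg := MyFeatureConfig{Mode: "full"}
	fmt.Println(validMode(cfg.Mode), validMode("verbose"))
}
```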
Main reconciliation: internal/controller/datadogagent/controller.go
Pattern:
func (r *Reconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    // 1. Fetch resource
    // 2. Handle deletion (finalizers)
    // 3. Validate configuration
    // 4. Process features
    // 5. Generate manifests
    // 6. Apply resources
    // 7. Update status
    // 8. Return result
}

Location: internal/controller/datadogagent/merger/
Example: Merging DaemonSet configurations
// merger/daemonset.go
func DaemonSetMerger(manager *merge.PodTemplateManager, spec *v2alpha1.DatadogAgentComponentOverride) {
    // Merge logic
}

Key Merger Types:
- PodTemplateManager: Pod-level merging
- ContainerManager: Container-level merging
- VolumeManager: Volume merging
# Unit tests
make test
# Unit tests with coverage
make test-coverage
# Integration tests
make integration-tests
# E2E tests
cd test/e2e
make e2e-tests

Framework: Ginkgo + Gomega
Example:
var _ = Describe("DatadogAgent Controller", func() {
    Context("When reconciling a resource", func() {
        It("Should create DaemonSet", func() {
            // Test logic
            Expect(err).NotTo(HaveOccurred())
        })
    })
})

Location: internal/controller/datadogagent/testutils/
Available helpers:
- NewDatadogAgent(): Create test DDA
- NewDeployment(): Create test Deployment
- Mock clients and interfaces
# Install dependencies
go mod download
# Format code
make fmt
# Run linter
make vet
# Build operator
make build
# Run locally (requires kubeconfig)
make run

# Build image
make docker-build IMG=<your-registry>/datadog-operator:tag
# Push image
make docker-push IMG=<your-registry>/datadog-operator:tag

# Install CRDs
make install
# Deploy operator
make deploy IMG=<your-registry>/datadog-operator:tag
# Uninstall
make undeploy

See RELEASING.md for the full release process.
All monitoring capabilities are implemented as pluggable features that self-register at initialization.
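The self-registration pattern can be sketched with a package-level registry that each feature package fills from its own `init()` function. This is a simplified illustration under assumed names: `IDType`, `BuildFunc`, `Register`, and `RegisteredIDs` here are stand-ins, and the operator's real registry may differ in signature and behavior.

```go
package main

import (
	"fmt"
	"sort"
)

// IDType, Feature, and BuildFunc are simplified stand-ins for this sketch.
type IDType string

type Feature interface {
	ID() IDType
}

type BuildFunc func() Feature

// registry maps a feature ID to the function that builds it.
var registry = map[IDType]BuildFunc{}

// Register records a builder under id. Registering the same id twice is an error.
func Register(id IDType, build BuildFunc) error {
	if _, exists := registry[id]; exists {
		return fmt.Errorf("feature %q already registered", id)
	}
	registry[id] = build
	return nil
}

type apmFeature struct{}

func (apmFeature) ID() IDType { return "apm" }

// init runs when the package is loaded, so importing a feature package
// is enough to make the feature known to the registry.
func init() {
	if err := Register("apm", func() Feature { return apmFeature{} }); err != nil {
		panic(err)
	}
}

// RegisteredIDs returns the known feature IDs in sorted order.
func RegisteredIDs() []IDType {
	ids := make([]IDType, 0, len(registry))
	for id := range registry {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

func main() {
	fmt.Println(RegisteredIDs())
}
```

The design keeps the controller free of a hard-coded feature list: adding a feature means adding a package and importing it.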
Three main components, each with dedicated managers:
- NodeAgent: DaemonSet for node-level monitoring
- ClusterAgent: Deployment for cluster-level operations
- ClusterChecksRunner: Deployment for distributed checks
Users can override generated configurations at multiple levels:
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
spec:
  override:
    nodeAgent:
      containers:
        agent:
          env:
            - name: MY_VAR
              value: my_value

Status updates use structured conditions and sub-resource status to report:
- Deployment progress
- Configuration errors
- Component health
RBAC is dynamically generated based on enabled features:
- Location: pkg/kubernetes/rbac/
- Generated at reconciliation time
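One way to picture feature-driven RBAC is as a union of the rules each enabled feature needs, with duplicates removed. The sketch below is illustrative only: `PolicyRule` is a simplified stand-in for the Kubernetes RBAC type, and the feature names and rules in `featureRules` are invented for the example rather than taken from the operator's real permission sets.

```go
package main

import (
	"fmt"
	"strings"
)

// PolicyRule is a simplified stand-in for rbacv1.PolicyRule.
type PolicyRule struct {
	APIGroup string
	Resource string
	Verbs    []string
}

// featureRules maps hypothetical feature names to the rules they would need.
var featureRules = map[string][]PolicyRule{
	"eventCollection": {
		{APIGroup: "", Resource: "events", Verbs: []string{"get", "list", "watch"}},
	},
	"orchestratorExplorer": {
		{APIGroup: "", Resource: "pods", Verbs: []string{"list", "watch"}},
		{APIGroup: "", Resource: "events", Verbs: []string{"get", "list", "watch"}},
	},
}

// rulesFor collects the rules of the enabled features in the given order,
// skipping rules that are identical to one already collected.
func rulesFor(enabled []string) []PolicyRule {
	var out []PolicyRule
	seen := map[string]bool{}
	for _, name := range enabled {
		for _, r := range featureRules[name] {
			key := r.APIGroup + "/" + r.Resource + ":" + strings.Join(r.Verbs, ",")
			if seen[key] {
				continue
			}
			seen[key] = true
			out = append(out, r)
		}
	}
	return out
}

func main() {
	for _, r := range rulesFor([]string{"eventCollection", "orchestratorExplorer"}) {
		fmt.Printf("%s %v\n", r.Resource, r.Verbs)
	}
}
```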
Pluggable secret backend system:
- Support for external secret providers
- Command-based secret resolution
- Location: pkg/secrets/
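Command-based resolution can be sketched as running an executable, sending the requested handles as JSON on stdin, and reading the resolved values from stdout. The payload shapes below follow the exchange documented for the Datadog Agent's secret backend (a version and a list of handles in, a per-handle value and error out); whether pkg/secrets uses exactly this shape is an assumption, and `resolve` and `decodeSecrets` are hypothetical names.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"os/exec"
)

// secretRequest is the JSON document written to the backend's stdin (assumed shape).
type secretRequest struct {
	Version string   `json:"version"`
	Secrets []string `json:"secrets"`
}

// secretResult is the per-handle entry expected on the backend's stdout (assumed shape).
type secretResult struct {
	Value string `json:"value"`
	Error string `json:"error"`
}

// decodeSecrets parses backend output and fails if a handle is missing or reports an error.
func decodeSecrets(out []byte, handles []string) (map[string]string, error) {
	var results map[string]secretResult
	if err := json.Unmarshal(out, &results); err != nil {
		return nil, fmt.Errorf("decode backend output: %w", err)
	}
	resolved := make(map[string]string, len(handles))
	for _, h := range handles {
		r, ok := results[h]
		if !ok {
			return nil, fmt.Errorf("secret %q missing from backend output", h)
		}
		if r.Error != "" {
			return nil, fmt.Errorf("secret %q: %s", h, r.Error)
		}
		resolved[h] = r.Value
	}
	return resolved, nil
}

// resolve runs command, writes the request to its stdin, and decodes its stdout.
func resolve(ctx context.Context, command string, handles []string) (map[string]string, error) {
	payload, err := json.Marshal(secretRequest{Version: "1.0", Secrets: handles})
	if err != nil {
		return nil, err
	}
	cmd := exec.CommandContext(ctx, command)
	cmd.Stdin = bytes.NewReader(payload)
	out, err := cmd.Output()
	if err != nil {
		return nil, fmt.Errorf("run %s: %w", command, err)
	}
	return decodeSecrets(out, handles)
}

func main() {
	// Decode a sample backend response without running an external command.
	sample := []byte(`{"api_key": {"value": "abc123", "error": null}}`)
	resolved, err := decodeSecrets(sample, []string{"api_key"})
	fmt.Println(resolved, err)
}
```

Separating decoding from process execution keeps the parsing logic testable without an external binary.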
| What | Where |
|---|---|
| API definitions | api/datadoghq/v2alpha1/ |
| Controller logic | internal/controller/datadogagent/controller.go |
| Feature implementations | internal/controller/datadogagent/feature/<feature-name>/ |
| Configuration merging | internal/controller/datadogagent/merger/ |
| RBAC generation | pkg/kubernetes/rbac/ |
| Constants/defaults | pkg/constants/ |
| Utilities | pkg/controller/utils/ |
| CRD manifests | config/crd/bases/ |
| Example configs | config/samples/ or examples/ |
| Documentation | docs/ |
| File | Purpose |
|---|---|
| cmd/main.go | Operator entry point, flag parsing, manager setup |
| internal/controller/setup.go | Controller registration |
| internal/controller/datadogagent/controller.go | Main reconciliation logic |
| api/datadoghq/v2alpha1/datadogagent_types.go | Primary CRD definition (~92KB) |
| pkg/constants/constants.go | Global constants |
| Makefile | Build targets and automation |
Common search patterns:
# Find feature implementations
find internal/controller/datadogagent/feature -name "feature.go"
# Find CRD types
grep -r "type Datadog" api/
# Find controller reconcilers
grep -r "func (r \*.*Reconciler) Reconcile" internal/
# Find merger implementations
ls internal/controller/datadogagent/merger/
# Find test files
find . -name "*_test.go" | head -20

Location: cmd/main.go
| Flag | Default | Purpose |
|---|---|---|
| metrics-addr | :8080 | Metrics endpoint |
| enable-leader-election | true | HA support |
| loglevel | info | Log level |
| datadogAgentEnabled | true | Enable DatadogAgent controller |
| datadogMonitorEnabled | false | Enable Monitor controller |
| datadogAgentProfileEnabled | false | Enable Profile controller (beta) |
| supportExtendedDaemonset | false | Use ExtendedDaemonSet |
| secretBackendCommand | "" | Secret backend executable |
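For readers unfamiliar with Go flag handling, the sketch below declares a subset of the flags from the table with the listed defaults, using the standard library `flag` package. The `options` struct and `parseOptions` function are hypothetical; cmd/main.go may organize its flags differently.

```go
package main

import (
	"flag"
	"fmt"
)

// options holds a subset of the flags listed in the table above.
type options struct {
	metricsAddr           string
	enableLeaderElection  bool
	logLevel              string
	datadogAgentEnabled   bool
	datadogMonitorEnabled bool
	secretBackendCommand  string
}

// parseOptions parses args into options, applying the defaults from the table.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("operator", flag.ContinueOnError)
	fs.StringVar(&o.metricsAddr, "metrics-addr", ":8080", "Metrics endpoint")
	fs.BoolVar(&o.enableLeaderElection, "enable-leader-election", true, "HA support")
	fs.StringVar(&o.logLevel, "loglevel", "info", "Log level")
	fs.BoolVar(&o.datadogAgentEnabled, "datadogAgentEnabled", true, "Enable DatadogAgent controller")
	fs.BoolVar(&o.datadogMonitorEnabled, "datadogMonitorEnabled", false, "Enable Monitor controller")
	fs.StringVar(&o.secretBackendCommand, "secretBackendCommand", "", "Secret backend executable")
	err := fs.Parse(args)
	return o, err
}

func main() {
	o, err := parseOptions([]string{"--loglevel=debug", "--datadogMonitorEnabled"})
	fmt.Printf("%+v %v\n", o, err)
}
```

Using a dedicated `flag.FlagSet` rather than the global one keeps parsing testable with arbitrary argument lists.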
| Variable | Purpose |
|---|---|
| DD_API_KEY | Datadog API key |
| DD_APP_KEY | Datadog application key |
| DD_SITE | Datadog site (datadoghq.com, datadoghq.eu, etc.) |
| WATCH_NAMESPACE | Namespace to watch (empty = all) |
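A short sketch of how such variables might be interpreted follows. Two points are assumptions rather than facts from this guide: that WATCH_NAMESPACE may hold a comma-separated list, and that datadoghq.com is the fallback when DD_SITE is unset. The helper names are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// watchNamespaces splits a WATCH_NAMESPACE value into namespace names.
// An empty value yields nil, which this sketch treats as "watch all namespaces".
// Accepting a comma-separated list is an assumption.
func watchNamespaces(value string) []string {
	var out []string
	for _, ns := range strings.Split(value, ",") {
		if ns = strings.TrimSpace(ns); ns != "" {
			out = append(out, ns)
		}
	}
	return out
}

// siteOrDefault returns the configured site, falling back to datadoghq.com (assumed default).
func siteOrDefault(value string) string {
	if value == "" {
		return "datadoghq.com"
	}
	return value
}

func main() {
	namespaces := watchNamespaces(os.Getenv("WATCH_NAMESPACE"))
	if namespaces == nil {
		fmt.Println("watching all namespaces")
	} else {
		fmt.Println("watching:", namespaces)
	}
	fmt.Println("site:", siteOrDefault(os.Getenv("DD_SITE")))
}
```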
The following features are enabled by default when a DatadogAgent resource is created:
- Cluster Agent
- Admission Controller
- Cluster Checks
- Kubernetes Event Collection
- Kubernetes State Core Check
- Live Container Collection
- Orchestrator Explorer
- UnixDomainSocket transport
- Process Discovery
- Control Plane Monitoring
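"Enabled by default" is commonly modeled in Kubernetes API types with a pointer field: nil means the user did not set it, so the default applies, while an explicit false still disables the feature. The sketch below shows that pattern with a hypothetical `FeatureToggle` type; it is not the operator's defaulting code.

```go
package main

import "fmt"

// FeatureToggle is a hypothetical feature block with an optional Enabled field.
type FeatureToggle struct {
	Enabled *bool
}

// isEnabled returns the user's explicit choice, or def when the block or field is unset.
func isEnabled(t *FeatureToggle, def bool) bool {
	if t == nil || t.Enabled == nil {
		return def
	}
	return *t.Enabled
}

func main() {
	off := false
	fmt.Println(isEnabled(nil, true))                          // unset block: default applies
	fmt.Println(isEnabled(&FeatureToggle{}, true))             // unset field: default applies
	fmt.Println(isEnabled(&FeatureToggle{Enabled: &off}, true)) // explicit opt-out wins
}
```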
Issue: CRD not found
Solution: Run make install to install CRDs

Issue: Webhook validation errors
Solution: Ensure webhook certificates are valid and the webhook service is running

Issue: Feature not enabled
Solution: Check feature configuration in DatadogAgent spec, verify feature is registered in init()

Issue: RBAC errors
Solution: Verify ServiceAccount has correct ClusterRole bindings
# Run with debug logging
go run ./cmd/main.go --loglevel=debug
# Enable profiling
go run ./cmd/main.go --profiling-enabled=true

# Check operator logs
kubectl logs -n datadog deployment/datadog-operator-controller-manager
# Get DatadogAgent status
kubectl get datadogagent -o yaml
# Check generated resources
kubectl get daemonset,deployment -l app.kubernetes.io/managed-by=datadog-operator
# Validate CRD
kubectl explain datadogagent.spec.features

- Follow standard Go conventions
- Use gofmt for formatting
- Run make vet before committing
- Add tests for new features
- Update documentation

- Fork and create feature branch
- Make changes with tests
- Run make test and make vet
- Update documentation if needed
- Submit PR against main branch
- Reference issues using #<issue-number>
Main docs: docs/
Key documentation files:
- docs/getting_started.md - Setup guide
- docs/configuration.v2alpha1.md - Configuration reference
- docs/how-to-contribute.md - Contribution guide
- docs/deprecated_configs.md - Deprecation notices
- Documentation: https://github.com/DataDog/datadog-operator/tree/main/docs
- OperatorHub: https://operatorhub.io/operator/datadog-operator
- RedHat Certification: https://catalog.redhat.com/software/operators/detail/5e9874986c5dcb34dfbb1a12
- Datadog Agent: https://github.com/DataDog/datadog-agent
- ExtendedDaemonSet: https://github.com/DataDog/extendeddaemonset
- Helm Chart: https://github.com/DataDog/helm-charts/tree/main/charts/datadog
- v1alpha1: Original API (deprecated in v1.8.0+)
- v2alpha1: Current stable API
- Operator v1.8.0+ does not support direct migration from v1alpha1 to v2alpha1
- Use conversion webhook in v1.7.0 for migration
- See docs/deprecated_configs.md for deprecation notices
Licensed under Apache License 2.0. See LICENSE file.
- Issues: https://github.com/DataDog/datadog-operator/issues
- Discussions: GitHub Discussions
- Slack: Datadog Community Slack
Generated for AI Agents and Developers
This guide provides a comprehensive overview for navigating and contributing to the Datadog Operator codebase.