150 changes: 150 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,150 @@
# Contributing to Druid Operator

First off, thanks for taking the time to contribute to the Druid Operator! 🎉

The following is a set of guidelines for contributing to Druid Operator and its packages. These are mostly guidelines, not rules. Use your best judgment, and feel free to propose changes to this document in a pull request.

## Table of Contents

- [Prerequisites](#prerequisites)
  - [Windows](#windows)
  - [macOS](#macos)
  - [Linux](#linux)
- [Development Setup](#development-setup)
  - [1. Fork and Clone](#1-fork-and-clone)
  - [2. Create a Local Cluster](#2-create-a-local-cluster)
  - [3. Install Dependencies](#3-install-dependencies)
  - [4. Run the Operator](#4-run-the-operator)
- [Testing](#testing)
  - [Unit Tests](#unit-tests)
  - [End-to-End Tests](#end-to-end-tests)
- [Project Structure](#project-structure)

## Prerequisites

You will need the following tools installed on your development machine:

* **Go** (v1.20+)
* **Docker** (v20.10+)
* **Kind** (v0.20+)
* **Kubectl** (latest)
* **Helm** (v3+)
* **Make**
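
As a quick sanity check, the list above can be verified with a small Go program. This is a minimal sketch, not part of the repository: `missingTools` is a hypothetical helper, and it only checks that each binary is on your `PATH`, not that it meets the minimum version.

```go
package main

import (
	"fmt"
	"os/exec"
)

// missingTools returns the names from tools that cannot be found on PATH.
// Hypothetical helper for illustration; it does not check versions.
func missingTools(tools []string) []string {
	var missing []string
	for _, t := range tools {
		if _, err := exec.LookPath(t); err != nil {
			missing = append(missing, t)
		}
	}
	return missing
}

func main() {
	// Binary names of the prerequisite tools listed above.
	required := []string{"go", "docker", "kind", "kubectl", "helm", "make"}
	if m := missingTools(required); len(m) > 0 {
		fmt.Println("missing:", m)
		return
	}
	fmt.Println("all prerequisite tools found on PATH")
}
```

Save it outside the repository (for example as `checktools.go`) and run it with `go run checktools.go`.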

### Windows

**Recommended**: Use Docker Desktop with the WSL 2 backend.

Run the following commands in PowerShell (Admin):

```powershell
# Install core tools
winget install -e --id GoLang.Go
winget install -e --id Docker.DockerDesktop
winget install -e --id Kubernetes.kind
winget install -e --id Kubernetes.kubectl
winget install -e --id Helm.Helm
winget install -e --id GnuWin32.Make
```

### macOS

Using [Homebrew](https://brew.sh/):

```bash
brew install go
brew install --cask docker
brew install kind
brew install kubectl
brew install helm
brew install make  # Homebrew installs GNU Make as `gmake`; the Xcode Command Line Tools already provide `make`
```

### Linux

Using `apt` (Ubuntu/Debian) or `brew` (Linuxbrew):

```bash
# Using Linuxbrew (Recommended for unified versioning)
brew install go kind kubectl helm make

# OR using apt (Ubuntu)
sudo apt update
sudo apt install -y golang-go make
# For Docker, Kind, Kubectl, and Helm, please refer to their official installation guides
# as apt repositories might lag behind.
```

## Development Setup

### 1. Fork and Clone

1. Fork the [druid-operator repository](https://github.com/datainfrahq/druid-operator) on GitHub.
2. Clone your fork locally:

```bash
git clone https://github.com/<your-username>/druid-operator.git
cd druid-operator
```

### 2. Create a Local Cluster

We use **Kind** (Kubernetes in Docker) for local development.

```bash
kind create cluster --name druid
```

### 3. Install Dependencies

Deploy the Druid Operator using Helm to set up CRDs and basic resources.

```bash
# Add the DataInfra Helm repo
helm repo add datainfra https://charts.datainfra.io
helm repo update

# Install the operator (this installs CRDs and the controller)
helm -n druid-operator-system upgrade -i --create-namespace cluster-druid-operator datainfra/druid-operator
```

### 4. Run the Operator

You can run the operator from source against your Kind cluster. This speeds up development because you don't need to build a Docker image for every change.

```bash
# Point kubectl at the Kind cluster created above
kubectl config use-context kind-druid

# Run the controller locally
make run
```

The operator logs will appear in your terminal.

## Testing

### Unit Tests

Run the unit tests to verify your changes.

```bash
make test
```

### End-to-End Tests

To run the full end-to-end suite (this spins up a Kind cluster and runs validation):

```bash
make e2e
```

## Project Structure

* `apis/`: Kubernetes API definitions (CRDs).
* `controllers/`: Core controller logic using Kubebuilder.
* `chart/`: Helm chart for the operator.
* `e2e/`: End-to-end test scripts and configurations.
* `docs/`: Documentation files.
* `Makefile`: Build and test automation commands.
114 changes: 114 additions & 0 deletions learning-docs/01-introduction/README.md
@@ -0,0 +1,114 @@
# 1. Introduction - What is This Project?

## The Problem This Solves

Imagine you want to run Apache Druid (a real-time analytics database) on Kubernetes. Druid is a **distributed system** with multiple components:

- **Coordinator** - Manages data availability
- **Overlord** - Manages data ingestion tasks
- **Broker** - Handles queries from clients
- **Router** - Routes requests to the right service
- **Historical** - Stores and serves historical data
- **MiddleManager/Indexer** - Handles data ingestion

To run Druid manually on Kubernetes, you would need to create:
- Multiple StatefulSets (one for each component)
- Multiple ConfigMaps (configuration files)
- Multiple Services (for networking)
- PersistentVolumeClaims (for storage)
- And more...

This could be **50+ YAML files** that you need to manage, update, and keep in sync!

## The Solution: An Operator

This operator lets you define your entire Druid cluster in **ONE simple YAML file**:

```yaml
apiVersion: druid.apache.org/v1alpha1
kind: Druid
metadata:
  name: my-druid-cluster
spec:
  image: apache/druid:25.0.0
  nodes:
    brokers:
      nodeType: broker
      replicas: 2
    historicals:
      nodeType: historical
      replicas: 3
    # ... other nodes
```

The operator then:
1. **Reads** this YAML file
2. **Creates** all necessary Kubernetes resources automatically
3. **Monitors** the cluster continuously
4. **Heals** the cluster if something goes wrong
5. **Updates** the cluster when you change the YAML

## Key Concepts in This Project

### 1. Custom Resource Definition (CRD)
A CRD extends Kubernetes with new resource types. This project defines two CRDs:
- `Druid` - Represents a Druid cluster
- `DruidIngestion` - Represents a data ingestion job

### 2. Custom Resource (CR)
A CR is an instance of a CRD. When you create a YAML file with `kind: Druid`, you're creating a CR.
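
To make the CRD/CR distinction concrete, here is a minimal Go sketch. The `Druid`, `DruidSpec`, and `NodeSpec` types are simplified, hypothetical stand-ins that mirror only the fields from the YAML example above; the operator's real types live under `apis/` and carry many more fields. The types play the role of the CRD (the schema), and the JSON document decoded in `main` plays the role of a CR (one instance of it).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NodeSpec is a hypothetical, trimmed-down description of one Druid node group.
type NodeSpec struct {
	NodeType string `json:"nodeType"`
	Replicas int    `json:"replicas"`
}

// DruidSpec is a hypothetical stand-in for the schema that the CRD defines.
type DruidSpec struct {
	Image string              `json:"image"`
	Nodes map[string]NodeSpec `json:"nodes"`
}

// Druid pairs kind/name metadata with the spec, like any Kubernetes object.
type Druid struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
	Metadata   struct {
		Name string `json:"name"`
	} `json:"metadata"`
	Spec DruidSpec `json:"spec"`
}

// parseDruid decodes one CR (an instance of the schema) from JSON.
func parseDruid(raw []byte) (Druid, error) {
	var d Druid
	err := json.Unmarshal(raw, &d)
	return d, err
}

// totalReplicas sums the replicas requested across all node groups.
func totalReplicas(d Druid) int {
	total := 0
	for _, n := range d.Spec.Nodes {
		total += n.Replicas
	}
	return total
}

func main() {
	// The same resource as the YAML example, written as JSON.
	raw := []byte(`{
	  "apiVersion": "druid.apache.org/v1alpha1",
	  "kind": "Druid",
	  "metadata": {"name": "my-druid-cluster"},
	  "spec": {
	    "image": "apache/druid:25.0.0",
	    "nodes": {
	      "brokers":     {"nodeType": "broker", "replicas": 2},
	      "historicals": {"nodeType": "historical", "replicas": 3}
	    }
	  }
	}`)
	d, err := parseDruid(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(d.Kind, d.Metadata.Name, totalReplicas(d)) // Druid my-druid-cluster 5
}
```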

### 3. Controller
The controller is the "brain" that watches for CRs and takes action. It runs in a loop:
1. Watch for changes to Druid CRs
2. Compare desired state (what the CR says) vs actual state (what exists in K8s)
3. Take action to make actual state match desired state

### 4. Reconciliation
The process of making the actual state match the desired state is called "reconciliation."
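
The compare-and-act step can be sketched in a few lines of Go. This is an illustration under simplifying assumptions, not the operator's actual code: `Action` and `reconcile` are hypothetical names, and each resource is reduced to a name plus a string that stands in for its full spec.

```go
package main

import (
	"fmt"
	"sort"
)

// Action is one step needed to move the actual state toward the desired state.
type Action struct {
	Verb string // "create", "update", or "delete"
	Name string // resource name, e.g. "statefulset/brokers"
}

// reconcile compares desired and actual state and returns the actions that
// would make them match. Both maps go from resource name to a string that
// stands in for the resource's spec (a simplification for this sketch).
func reconcile(desired, actual map[string]string) []Action {
	var actions []Action
	for name, want := range desired {
		have, exists := actual[name]
		switch {
		case !exists:
			actions = append(actions, Action{Verb: "create", Name: name})
		case have != want:
			actions = append(actions, Action{Verb: "update", Name: name})
		}
	}
	for name := range actual {
		if _, wanted := desired[name]; !wanted {
			actions = append(actions, Action{Verb: "delete", Name: name})
		}
	}
	// Go map iteration order is unspecified; sort for a stable result.
	sort.Slice(actions, func(i, j int) bool { return actions[i].Name < actions[j].Name })
	return actions
}

func main() {
	desired := map[string]string{
		"statefulset/brokers":     "replicas=2",
		"statefulset/historicals": "replicas=3",
	}
	actual := map[string]string{
		"statefulset/brokers": "replicas=1", // drifted from the CR
		"statefulset/routers": "replicas=1", // no longer in the CR
	}
	for _, a := range reconcile(desired, actual) {
		fmt.Println(a.Verb, a.Name)
	}
}
```

Once the actions have been applied, calling `reconcile` again returns no actions. That idempotence is what makes it safe for a controller to run the loop repeatedly.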

## Project Components

```
druid-operator/
├── apis/ # CRD definitions (what a Druid CR looks like)
├── controllers/ # Controller logic (what to do when CR changes)
├── config/ # Kubernetes manifests for deploying the operator
├── chart/ # Helm chart for easy installation
├── examples/ # Example Druid cluster configurations
├── main.go # Entry point - starts the operator
└── docs/ # Documentation
```

## How It Works (High Level)

```
┌─────────────────────────────────────────────────────────────────┐
│                       Kubernetes Cluster                        │
│                                                                 │
│  ┌─────────────┐         ┌─────────────────────────────────┐    │
│  │     You     │         │         Druid Operator          │    │
│  │   (User)    │         │                                 │    │
│  └──────┬──────┘         │  ┌───────────────────────────┐  │    │
│         │                │  │        Controller         │  │    │
│         │ kubectl apply  │  │                           │  │    │
│         │                │  │  1. Watch Druid CRs       │  │    │
│         ▼                │  │  2. Compare states        │  │    │
│  ┌─────────────┐         │  │  3. Create/Update/Delete  │  │    │
│  │  Druid CR   │◄────────┤  │     K8s resources         │  │    │
│  │   (YAML)    │         │  └───────────────────────────┘  │    │
│  └─────────────┘         └─────────────────────────────────┘    │
│         │                                                       │
│         │  Operator creates these automatically:                │
│         ▼                                                       │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │  StatefulSets, Services, ConfigMaps, PVCs, etc.         │    │
│  │  (All the resources needed to run Druid)                │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

## Next Steps

Continue to [Prerequisites & Learning Path](../02-prerequisites/README.md) to understand what technologies you need to learn.