35 changes: 35 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,35 @@
name: Distributed Systems CI

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      # 1. Checkout the code
      - uses: actions/checkout@v3

      # 2. Set up JDK 21
      - name: Set up JDK 21
        uses: actions/setup-java@v3
        with:
          java-version: '21'
          distribution: 'temurin'
          cache: maven

      # 3. Install Root POM
      - name: Install Root POM
        run: mvn clean install -DskipTests -f java/pom.xml

      # 4. Build & Install 'dsl-common'
      - name: Build & Test Common Module
        run: mvn clean install -f java/dsl-common/pom.xml

      # 5. Build & Test 'url-shortener'
      - name: Build & Test URL Shortener
        run: mvn clean package -f java/url-shortener/pom.xml
16 changes: 16 additions & 0 deletions .gitignore
@@ -0,0 +1,16 @@
# IDE files
.idea/

# Build output
build/
out/
target/

# Log files
*.log

# Dependency caches
.mvn/

# OS-specific files
.DS_Store
20 changes: 18 additions & 2 deletions README.md
@@ -1,2 +1,18 @@
# distributed-systems-lab
High-performance distributed system components implemented from scratch, focusing on concurrency, availability, and low-latency architecture.
# Distributed Systems Lab

A collection of **production-ready** distributed services implemented from scratch.

This repository moves beyond the theoretical "boxes and arrows" of system design interviews to provide **working, end-to-end implementations** of complex distributed challenges. It bridges the gap between high-level architecture diagrams and the low-level engineering reality.

Instead of assuming a database scales, we **choose the DB**, implement the schema, and prove the design with **load tests** and **observability** metrics.

>**Core Philosophy:** It's not a valid design until it runs in Docker, passes CI/CD, and handles real traffic.

## Modules & Status

| Module | Description | Key Tech | Status |
|:-----------------------------------------------|:--------------------------------------------------|:--------------------------|:-------------------|
| **[Common Libraries](java/dsl-common)** (Java) | Shared utility libraries for distributed systems. | Distributed ID Generator | ✅ **Completed** |
| **[URL Shortener](java/url-shortener)** | Distributed URL Shortener service | Netty, ScyllaDB | ✅ **Completed** |
| **[Rate Limiter](java/rate-limiter)** | Distributed sliding window rate limiter. | Redis (Lua), Token Bucket | 🚧 **In Progress** |

Binary file added docs/assets/snowflake_bit_layout.png
Binary file added docs/assets/url-shortener/architecture.png
Binary file added docs/assets/url-shortener/grafana_dashboard.png
Binary file added docs/assets/url-shortener/load_test.png
54 changes: 54 additions & 0 deletions java/dsl-common/README.md
@@ -0,0 +1,54 @@
# DSL Common Library

Core distributed system utilities shared across the `distributed-systems-lab` microservices.

## 1. Snowflake ID Generator
A distributed unique ID generator inspired by Twitter's Snowflake algorithm.

### Why Snowflake?

| Feature | **Snowflake ID** | **UUID (v4)** | **DB Auto-Increment** | **Redis `INCR`** |
|:----------------------|:----------------------------|:-------------------------|:------------------------|:-----------------------|
| **Generation** | Local (No Network) | Local (No Network) | Central (Network Call) | Central (Network Call) |
| **Sortable?** | Yes (Time-ordered) | No (Random) | Yes | Yes |
| **Size** | 64-bit (Small) | 128-bit (Large) | 64-bit (Small) | 64-bit (Small) |
| **Index Performance** | **Excellent** (Append-only) | **Poor** (Fragmentation) | Excellent | Excellent |
| **Coordination** | Low (Node ID config) | None | High (Locks) | High (Single Thread) |
| **Collision Risk** | Zero (if clock stable) | Near Zero | Zero | Risk if data lost |

### Bit Layout (64-bit *long*)
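The ID is a 64-bit `long`: 1 unused sign bit, a 41-bit timestamp (milliseconds since a custom epoch), a 10-bit node ID, and a 12-bit sequence number. Placing the timestamp in the high bits makes IDs sort by creation time, while the node ID lets each server generate IDs without coordination.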

![snowflake_bit_layout.png](../../docs/assets/snowflake_bit_layout.png)
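
The layout can be illustrated with a small pack/unpack helper. This is a minimal sketch, assuming the 1 + 41 + 10 + 12 bit split documented in `SnowflakeIdGenerator`; the class `SnowflakeLayout` and its method names are hypothetical and not part of `dsl-common`.

```java
/** Hypothetical helper (not part of dsl-common) that packs and unpacks the 64-bit layout. */
final class SnowflakeLayout {

    static final int NODE_ID_BITS = 10;
    static final int SEQUENCE_BITS = 12;

    static final long MAX_NODE_ID = (1L << NODE_ID_BITS) - 1;   // 1023
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1; // 4095

    private SnowflakeLayout() {
    }

    /** Combines the three fields; the timestamp occupies the highest bits below the sign bit. */
    static long pack(long millisSinceEpoch, long nodeId, long sequence) {
        return (millisSinceEpoch << (NODE_ID_BITS + SEQUENCE_BITS))
                | (nodeId << SEQUENCE_BITS)
                | sequence;
    }

    /** Extracts the milliseconds elapsed since the custom epoch. */
    static long timestampOf(long id) {
        return id >>> (NODE_ID_BITS + SEQUENCE_BITS);
    }

    /** Extracts the 10-bit node ID. */
    static long nodeIdOf(long id) {
        return (id >>> SEQUENCE_BITS) & MAX_NODE_ID;
    }

    /** Extracts the 12-bit per-millisecond sequence number. */
    static long sequenceOf(long id) {
        return id & MAX_SEQUENCE;
    }
}
```

For instance, `SnowflakeLayout.pack(1000, 5, 42)` yields `4194324522`, and the three accessors recover `1000`, `5`, and `42` from that value. Because the timestamp sits in the most significant bits, comparing two IDs numerically compares their creation times first.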

### Handling Clock Drift (NTP)
Distributed systems rely on NTP, which can sometimes move the system clock backwards to sync with the global time. This creates a risk of generating duplicate IDs.

#### Resolution Strategy (Patient Wait):
If the drift is small, the thread pauses (sleeps) until the clock catches up to the last generated timestamp. This prevents service outages during minor NTP adjustments.
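
The waiting strategy might be sketched as follows. Note that `SnowflakeIdGenerator.nextId()` in this change throws an `IllegalStateException` whenever the clock moves backwards, so this is an illustration of the idea rather than the shipped behaviour. The class name `PatientClock`, the 5 ms threshold, and the injectable `LongSupplier` clock are all assumptions made for the example.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper (not part of dsl-common) illustrating the "patient wait" strategy. */
final class PatientClock {

    // Assumed threshold: larger regressions are treated as fatal instead of waited out.
    private static final long MAX_BACKWARD_DRIFT_MS = 5L;

    private final LongSupplier clock; // injectable so the behaviour can be exercised in tests
    private long lastTimestamp = -1L;

    PatientClock(LongSupplier clock) {
        this.clock = clock;
    }

    /** Returns a timestamp that is never smaller than the one returned previously. */
    synchronized long nextTimestamp() {
        long now = clock.getAsLong();
        if (now < lastTimestamp) {
            long drift = lastTimestamp - now;
            if (drift > MAX_BACKWARD_DRIFT_MS) {
                throw new IllegalStateException("Clock moved backwards by " + drift + " ms");
            }
            // Small regression: sleep until the clock catches up with the last timestamp.
            while (now < lastTimestamp) {
                try {
                    Thread.sleep(lastTimestamp - now);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for the clock", e);
                }
                now = clock.getAsLong();
            }
        }
        lastTimestamp = now;
        return now;
    }
}
```

In production the clock would typically be `System::currentTimeMillis`. Sleeping while holding the lock blocks other callers for the duration of the drift, which is acceptable only because the threshold keeps that wait short.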

### Code Usage
```java
// Initialize with Node ID (must be unique per server instance)
SnowflakeIdGenerator generator = new SnowflakeIdGenerator(1);

// Generate ID
long id = generator.nextId();
// Example output (varies with the clock, node ID, and sequence): 839281920192
```

## 2. Base62 Encoder
A high-performance utility for converting unique numeric IDs into short strings with only `[0-9, a-z, A-Z]`.

### The Base Conversion Math:
Direct mathematical conversion between Base10 (Decimal) and Base62.
* **Encode:** ID $\rightarrow$ String (e.g., `1024` $\rightarrow$ `"gw"`)
* **Decode:** String $\rightarrow$ ID (e.g., `"gw"` $\rightarrow$ `1024`)
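
As a worked example of the conversion, take the ID `12345` (a value chosen purely for illustration) and divide repeatedly by 62, assuming the alphabet order `0-9`, `a-z`, `A-Z` so that the digit values 3, 13, and 7 map to `3`, `d`, and `7`:

```math
\begin{aligned}
12345 &= 199 \cdot 62 + 7 \\
199 &= 3 \cdot 62 + 13 \\
3 &= 0 \cdot 62 + 3
\end{aligned}
```

Reading the remainders from last to first gives the digits (3, 13, 7), which encode to `"3d7"`. Decoding reverses the process: $3 \cdot 62^2 + 13 \cdot 62 + 7 = 11532 + 806 + 7 = 12345$.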

### Usage:
```java
String shortUrl = Base62Encoder.encode(178263819283046400L);
// Output: "darUvlwGyI"

long originalId = Base62Encoder.decode("darUvlwGyI");
// Output: 178263819283046400
```
14 changes: 14 additions & 0 deletions java/dsl-common/pom.xml
@@ -0,0 +1,14 @@
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

    <modelVersion>4.0.0</modelVersion>

    <parent>
        <artifactId>distributed-systems-lab</artifactId>
        <groupId>com.dsl</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>

    <artifactId>dsl-common</artifactId>
</project>
@@ -0,0 +1,68 @@
package com.dsl.common.baseencoder;

import java.util.Arrays;

/**
 * Utility to convert between unique Long IDs and short Base62 Strings.
 * <p>
 * Alphabet: [0-9, a-z, A-Z]
 */
public class Base62Encoder {

    private static final String ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final int BASE = ALPHABET.length(); // 62

    // Pre-computed map for fast (char -> value) decoding
    private static final int[] INDEX_MAP = new int[128];

    static {
        Arrays.fill(INDEX_MAP, -1);
        for (int i = 0; i < BASE; i++) {
            INDEX_MAP[ALPHABET.charAt(i)] = i;
        }
    }

    /**
     * Encodes a numeric ID into a Base62 string.
     * Example: 1024 -> "gw"
     */
    public static String encode(long id) {
        if (id < 0) {
            throw new IllegalArgumentException("ID must be non-negative");
        }
        if (id == 0) {
            return String.valueOf(ALPHABET.charAt(0));
        }

        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            int remainder = (int) (id % BASE);
            sb.append(ALPHABET.charAt(remainder));
            id /= BASE;
        }
        return sb.reverse().toString();
    }

    /**
     * Decodes a Base62 string back into a numeric ID.
     * Example: "gw" -> 1024
     */
    public static long decode(String str) {
        if (str == null || str.isEmpty()) {
            throw new IllegalArgumentException("String cannot be null or empty");
        }

        long id = 0;
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            int val = (c < INDEX_MAP.length) ? INDEX_MAP[c] : -1;

            if (val == -1) {
                throw new IllegalArgumentException("Invalid character in Base62 string: " + c);
            }

            id = id * BASE + val;
        }
        return id;
    }
}
@@ -0,0 +1,80 @@
package com.dsl.common.idgenerator;

import java.time.Instant;

/**
 * A distributed unique ID generator inspired by Twitter's Snowflake.
 * <p>
 * Structure (64 bits):
 * 1 bit: Unused (sign bit)
 * 41 bits: Timestamp (milliseconds since custom epoch) -> 2,199,023,255,552 milliseconds or ~69 years
 * 10 bits: Node ID (configured per server) -> 1,024 servers (IDs 0 - 1023)
 * 12 bits: Sequence number (for IDs generated within the same millisecond) -> 4,096 IDs per millisecond per node
 */
public class SnowflakeIdGenerator {

    // Custom Epoch (Jan 1st, 2025 00:00:00 UTC) - Allows for ~69 years of IDs
    private static final long CUSTOM_EPOCH = 1735689600000L;

    private static final long NODE_ID_BITS = 10L;
    private static final long SEQUENCE_BITS = 12L;

    private static final long MAX_NODE_ID = (1L << NODE_ID_BITS) - 1;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    // Bit shifts
    private static final long NODE_ID_SHIFT = SEQUENCE_BITS;
    private static final long TIMESTAMP_SHIFT = SEQUENCE_BITS + NODE_ID_BITS;

    private final long nodeId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    /**
     * @param nodeId Unique ID for this server/process (0 - 1023)
     */
    public SnowflakeIdGenerator(long nodeId) {
        if (nodeId < 0 || nodeId > MAX_NODE_ID) {
            throw new IllegalArgumentException(String.format("Node ID must be between 0 and %d", MAX_NODE_ID));
        }
        this.nodeId = nodeId;
    }

    public synchronized long nextId() {
        long currentTimestamp = timestamp();

        if (currentTimestamp < lastTimestamp) {
            throw new IllegalStateException("Clock moved backwards. Refusing to generate ID.");
        }

        if (currentTimestamp == lastTimestamp) {
            // Same millisecond: increment sequence
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted, wait for next millisecond
                currentTimestamp = waitNextMillis(currentTimestamp);
            }
        } else {
            // New millisecond: reset sequence
            sequence = 0L;
        }

        lastTimestamp = currentTimestamp;

        // Bitwise OR to combine parts
        return ((currentTimestamp - CUSTOM_EPOCH) << TIMESTAMP_SHIFT)
                | (nodeId << NODE_ID_SHIFT)
                | sequence;
    }

    private long waitNextMillis(long currentTimestamp) {
        while (currentTimestamp <= lastTimestamp) {
            currentTimestamp = timestamp();
        }
        return currentTimestamp;
    }

    private long timestamp() {
        return Instant.now().toEpochMilli();
    }
}
@@ -0,0 +1,42 @@
package com.dsl.common.baseencoder;

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.*;

class Base62EncoderTest {

    @Test
    void testEncodeDecodeRoundTrip() {
        long originalId = 178263819283046400L;

        String shortUrl = Base62Encoder.encode(originalId);
        long decodedId = Base62Encoder.decode(shortUrl);

        System.out.println("Original: " + originalId);
        System.out.println("Encoded: " + shortUrl);
        System.out.println("Decoded: " + decodedId);

        assertEquals(originalId, decodedId, "Decoded ID must match the original ID");
    }

    @Test
    void testSimpleValues() {
        // 0 -> "0"
        assertEquals("0", Base62Encoder.encode(0));
        assertEquals(0, Base62Encoder.decode("0"));

        // 61 -> "Z"
        assertEquals("Z", Base62Encoder.encode(61));
        assertEquals(61, Base62Encoder.decode("Z"));

        // 62 -> "10"
        assertEquals("10", Base62Encoder.encode(62));
        assertEquals(62, Base62Encoder.decode("10"));
    }

    @Test
    void testInvalidCharacters() {
        assertThrows(IllegalArgumentException.class, () -> Base62Encoder.decode("abc$"));
        assertThrows(IllegalArgumentException.class, () -> Base62Encoder.decode(""));
    }
}