Jihaoyb/file_server

NebulaFS

NebulaFS is a production-grade, cloud-storage-style file server written in C++20, using Boost.Asio/Beast for async HTTP and Poco for configuration, logging, and utilities. It is a learning-by-building project that scales from a single-node file server to a distributed storage cluster.

Highlights

  • Async HTTP server with Boost.Asio/Beast and multi-threaded IO.
  • Local filesystem storage with atomic writes and checksum-based ETags.
  • SQLite metadata for buckets and objects.
  • Structured logging via Poco with request correlation.
  • Security-first design with OIDC/JWT auth and JWKS validation support.
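The atomic-write-plus-ETag pattern in the second bullet can be sketched in shell (hypothetical file names; the real engine does this in C++ inside the storage layer):

```shell
# Stage the bytes in a temp file, derive the ETag from a checksum,
# then rename into place -- rename is atomic on POSIX filesystems,
# so readers never observe a half-written object.
payload="hello world"
obj_dir=$(mktemp -d)
tmp="$obj_dir/.upload.tmp"
printf '%s' "$payload" > "$tmp"              # stage the bytes
etag=$(sha256sum "$tmp" | cut -d' ' -f1)     # checksum becomes the ETag
mv "$tmp" "$obj_dir/object.bin"              # atomic publish
echo "$etag"
```

If the process crashes mid-write, only the temp file is left behind, which a cleanup pass can reap.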

Architecture (Milestone 0–7 baseline)

flowchart LR
  client(("Client")) -->|"HTTPS"| gateway["HTTP Server"]
  gateway --> auth["Auth (OIDC/JWT + JWKS)"]
  gateway --> local_storage["Local Storage Engine (single_node)"]
  gateway --> local_metadata["SQLite Metadata (single_node)"]
  gateway --> metadata_svc["Metadata Service (distributed)"]
  gateway --> storage_nodes["Storage Nodes (distributed)"]
  gateway --> observability["Metrics/Health"]

Quickstart

Prerequisites

  • CMake 3.20+
  • C++20 compiler
  • vcpkg

Build

cmake --preset debug
cmake --build --preset debug

Run

./build/debug/nebulafs --config config/server.json
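The full schema of config/server.json is not reproduced in this README; a minimal sketch consistent with the keys referenced elsewhere in this document (server.mode, server.limits, auth) might look like:

```json
{
  "server": {
    "mode": "single_node",
    "limits": {
      "request_timeout_ms": 30000,
      "rate_limit_rps": 0,
      "rate_limit_burst": 0
    }
  },
  "auth": {
    "enabled": false
  }
}
```

Treat this as a starting point only; see the sections below for distributed-mode and auth keys.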

Distributed mode binaries

Milestone 6 adds two internal services:

  • ./build/debug/nebulafs_metadata (metadata + placement service)
  • ./build/debug/nebulafs_storage_node (blob storage node)

Gateway distributed mode is enabled with:

  • server.mode = "distributed"
  • distributed.metadata_base_url
  • distributed.storage_nodes
  • distributed.service_auth_token
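Putting those four keys together, a distributed-mode config fragment might look like this (URLs, ports, and the token value are placeholders, not defaults):

```json
{
  "server": { "mode": "distributed" },
  "distributed": {
    "metadata_base_url": "http://127.0.0.1:9090",
    "storage_nodes": [
      "http://127.0.0.1:9101",
      "http://127.0.0.1:9102"
    ],
    "service_auth_token": "change-me"
  }
}
```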

Distributed observability metrics (Milestone 6)

Distributed mode now emits service-specific counters and latency sums via /metrics.

  • Gateway:
    • nebulafs_gateway_storage_put_failures_total
    • nebulafs_gateway_metadata_rpc_failures_total
    • nebulafs_gateway_replica_fallback_total
    • nebulafs_gateway_multipart_compose_failures_total
    • nebulafs_gateway_multipart_rollback_attempts_total
    • nebulafs_gateway_multipart_rollback_failures_total
    • nebulafs_gateway_distributed_cleanup_uploads_total
    • nebulafs_gateway_distributed_cleanup_upload_failures_total
    • nebulafs_gateway_distributed_cleanup_blob_deletes_total
    • nebulafs_gateway_distributed_cleanup_blob_delete_failures_total
  • Metadata service:
    • nebulafs_metadata_allocate_requests_total
    • nebulafs_metadata_allocate_failures_total
    • nebulafs_metadata_allocate_latency_ms_sum
    • nebulafs_metadata_commit_requests_total
    • nebulafs_metadata_commit_failures_total
    • nebulafs_metadata_commit_latency_ms_sum
  • Storage node service:
    • nebulafs_storage_node_blob_writes_total
    • nebulafs_storage_node_blob_write_failures_total
    • nebulafs_storage_node_blob_write_latency_ms_sum
    • nebulafs_storage_node_blob_reads_total
    • nebulafs_storage_node_blob_read_failures_total
    • nebulafs_storage_node_blob_read_latency_ms_sum
    • nebulafs_storage_node_blob_deletes_total
    • nebulafs_storage_node_blob_delete_failures_total
    • nebulafs_storage_node_blob_delete_latency_ms_sum
    • nebulafs_storage_node_blob_composes_total
    • nebulafs_storage_node_blob_compose_failures_total
    • nebulafs_storage_node_blob_compose_latency_ms_sum
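A quick way to triage these counters from a scrape is to flag any failure counter that is non-zero. The metrics payload below is a made-up sample; in practice, pipe `curl -s http://localhost:8080/metrics` into the same filter:

```shell
# Sample /metrics lines (Prometheus text exposition: "name value").
metrics='nebulafs_gateway_storage_put_failures_total 0
nebulafs_gateway_replica_fallback_total 2
nebulafs_gateway_metadata_rpc_failures_total 1'

# Print the name of every failure counter with a non-zero value.
printf '%s\n' "$metrics" | awk '/failures_total/ && $2 > 0 {print $1}'
```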

Traffic controls

config/server.json supports:

  • server.limits.request_timeout_ms (default 30000)
  • server.limits.rate_limit_rps (default 0, disabled)
  • server.limits.rate_limit_burst (default 0, disabled)
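For example, to cap clients at 100 requests/second while allowing bursts of up to 200, the limits block would look like this (a sketch using the keys above; values are illustrative):

```json
{
  "server": {
    "limits": {
      "request_timeout_ms": 30000,
      "rate_limit_rps": 100,
      "rate_limit_burst": 200
    }
  }
}
```

With rate_limit_rps left at 0, no rate limiting is applied regardless of the burst value.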

Example API calls

# Health
curl http://localhost:8080/healthz

# Create bucket
curl -X POST http://localhost:8080/v1/buckets -d '{"name":"demo"}'

# Upload object
curl -X PUT \
  --data-binary @README.md \
  http://localhost:8080/v1/buckets/demo/objects/readme.txt

# Upload object (query-style)
curl -X POST \
  --data-binary @README.md \
  "http://localhost:8080/v1/buckets/demo/objects?name=readme.txt"

# Multipart upload: initiate
UPLOAD_ID=$(curl -s -X POST \
  -H "Content-Type: application/json" \
  -d '{"object":"large.bin"}' \
  http://localhost:8080/v1/buckets/demo/multipart-uploads | jq -r .upload_id)

# Multipart upload: parts
curl -X PUT --data-binary @part1.bin \
  "http://localhost:8080/v1/buckets/demo/multipart-uploads/$UPLOAD_ID/parts/1"
curl -X PUT --data-binary @part2.bin \
  "http://localhost:8080/v1/buckets/demo/multipart-uploads/$UPLOAD_ID/parts/2"

# Multipart upload: complete
PART1_ETAG=$(curl -s "http://localhost:8080/v1/buckets/demo/multipart-uploads/$UPLOAD_ID/parts" \
  | jq -r '.parts[] | select(.part_number==1) | .etag')
PART2_ETAG=$(curl -s "http://localhost:8080/v1/buckets/demo/multipart-uploads/$UPLOAD_ID/parts" \
  | jq -r '.parts[] | select(.part_number==2) | .etag')
curl -X POST -H "Content-Type: application/json" \
  -d "{\"parts\":[{\"part_number\":1,\"etag\":\"$PART1_ETAG\"},{\"part_number\":2,\"etag\":\"$PART2_ETAG\"}]}" \
  "http://localhost:8080/v1/buckets/demo/multipart-uploads/$UPLOAD_ID/complete"

# Multipart upload: abort
curl -X DELETE "http://localhost:8080/v1/buckets/demo/multipart-uploads/$UPLOAD_ID"

# Download object
curl http://localhost:8080/v1/buckets/demo/objects/readme.txt -o readme.txt

# List objects
curl "http://localhost:8080/v1/buckets/demo/objects?prefix=read"

Note: multipart upload endpoints are available in both single-node and distributed mode. In distributed mode, parts are stored on storage nodes and finalized through gateway orchestration.
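The invariant that `complete` must uphold can be checked locally: parts concatenated in part-number order reproduce the original bytes exactly. A self-contained sketch (local files only, no server involved):

```shell
# Split a file into fixed-size parts, compose them back in order,
# and verify the checksums match -- the same property the gateway's
# compose step guarantees.
src=$(mktemp)
dd if=/dev/urandom of="$src" bs=1024 count=8 2>/dev/null
dir=$(mktemp -d)
split -b 4096 -d "$src" "$dir/part."          # part.00, part.01
cat "$dir"/part.* > "$dir/composed"           # compose in part order
[ "$(sha256sum < "$src")" = "$(sha256sum < "$dir/composed")" ] && echo match
```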

Authentication test (Keycloak local)

Use this to validate auth.enabled=true end-to-end.

  1. Start Keycloak:
docker run --name keycloak -p 8081:8080 \
  -e KEYCLOAK_ADMIN=admin \
  -e KEYCLOAK_ADMIN_PASSWORD=admin \
  quay.io/keycloak/keycloak:26.0 start-dev
  2. Set auth in config/server.json:
"auth": {
  "enabled": true,
  "issuer": "http://127.0.0.1:8081/realms/master",
  "audience": "",
  "jwks_url": "http://127.0.0.1:8081/realms/master/protocol/openid-connect/certs",
  "cache_ttl_seconds": 300,
  "clock_skew_seconds": 60,
  "allowed_alg": "RS256"
}
  3. Restart NebulaFS.

  4. Verify protected route without token (should be 401):

curl -i http://127.0.0.1:8080/v1/buckets
  5. Request token and call protected route:
TOKEN=$(curl -s -X POST \
  "http://127.0.0.1:8081/realms/master/protocol/openid-connect/token" \
  -d "grant_type=password" \
  -d "client_id=admin-cli" \
  -d "username=admin" \
  -d "password=admin" | jq -r .access_token)

curl -i -H "Authorization: Bearer $TOKEN" http://127.0.0.1:8080/v1/buckets

Troubleshooting:

  • issuer mismatch: auth.issuer must exactly equal token iss.
  • audience mismatch: set auth.audience to match token aud, or use empty string to skip.
  • jwks fetch failed: verify auth.jwks_url and IdP reachability from NebulaFS process.
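When chasing issuer or audience mismatches, it helps to decode the token payload locally and read iss/aud directly. The token below is a hand-built unsigned sample just to show the decoding steps; with a real token, substitute $TOKEN from the step above:

```shell
# Build a sample JWT-shaped token: header.payload.signature,
# with the payload base64url-encoded (claim values are illustrative).
payload_json='{"iss":"http://127.0.0.1:8081/realms/master","aud":"account"}'
mid=$(printf '%s' "$payload_json" | base64 -w0 | tr '+/' '-_' | tr -d '=')
TOKEN="hdr.$mid.sig"

# Decode: take the second dot-separated segment, undo base64url,
# restore stripped padding, then decode.
seg=$(printf '%s' "$TOKEN" | cut -d. -f2 | tr '_-' '/+')
while [ $(( ${#seg} % 4 )) -ne 0 ]; do seg="$seg="; done
printf '%s' "$seg" | base64 -d
```

Compare the printed iss against auth.issuer and aud against auth.audience character-for-character.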

Known limitations (Milestone 3 baseline)

  • OpenSSL 3 deprecation warnings appear in JWT/JWKS test helper code (RSA_* APIs). They are confined to test code and do not affect runtime behavior.
  • /metrics is currently treated as a protected endpoint when auth.enabled=true (only /healthz and /readyz are public).

Security Model (Current)

  • TLS supported via config; disabled by default for local dev.
  • Auth is available via OIDC/JWT when enabled in config. Health is public; all other endpoints require a valid token.
  • Path traversal protection enforced in storage.
  • Size limits enforced by config.
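The path-traversal bullet boils down to: resolve the requested object key against the storage root and reject anything that escapes it. A shell sketch of the check (hypothetical paths; the server implements this in C++):

```shell
# Resolve the key against the storage root; a key containing "../"
# components that escapes the root must be rejected.
root=$(mktemp -d)
key="../../etc/passwd"
resolved=$(realpath -m "$root/$key")   # normalize without requiring the file to exist
case "$resolved" in
  "$root"/*) verdict=allowed ;;
  *)         verdict=rejected ;;
esac
echo "$verdict"
```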

Performance Notes (Current)

  • Async IO with per-connection strands.
  • Streaming request bodies to disk with size limits.
  • Download supports HTTP range requests.
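HTTP range requests use inclusive byte offsets. Against a running server the request would be `curl -r 4-7 http://localhost:8080/v1/buckets/demo/objects/readme.txt`; the same four-byte slice can be demonstrated locally with dd:

```shell
# "bytes=4-7" means bytes 4,5,6,7 inclusive -- four bytes.
f=$(mktemp)
printf 'NebulaFS range demo' > "$f"
slice=$(dd if="$f" bs=1 skip=4 count=4 2>/dev/null)
echo "$slice"   # the four bytes starting at offset 4
```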

Roadmap

  • Milestone 3: OIDC/JWT validation with JWKS caching (completed).
  • Milestone 3.1: Startup auth config hardening (completed).
  • Milestone 4: Multipart uploads and cleanup baseline (completed).
  • Milestone 5: Metrics (Prometheus), rate limiting, timeouts (completed).
  • Milestone 6: Distributed baseline implemented (gateway + metadata service + storage nodes + distributed CI lane).
  • Milestone 7: Distributed upload maturity (streamed writes + distributed multipart baseline) (completed).
  • Milestone 8: Distributed reliability hardening (compose reliability + distributed cleanup) (in progress).

Milestone 6 completion criteria

  • Distributed mode keeps public object CRUD routes unchanged at the gateway.
  • Metadata and storage-node internal services run as separate binaries with service-token checks.
  • Distributed failure correctness is covered in integration tests (read fallback, write quorum failure, token rejection).
  • Distributed metrics are exposed and validated for gateway, metadata service, and storage node.
  • Current limitation: distributed cleanup coordination is best-effort per gateway instance (no cluster leader election).

Docs

  • Architecture: docs/architecture.md
  • Milestone 4 design: docs/design/milestone-4-multipart-cleanup.md
  • Milestone 6 design: docs/design/milestone-6-distributed-mode.md
  • Threat model: docs/threat-model.md
  • ADRs: docs/adr/
  • Code style: docs/code-style.md

License

MIT. See LICENSE.
