Skip to content

Conversation

@hadv
Copy link
Owner

@hadv hadv commented Oct 5, 2025

Overview

This PR implements the Kafka AFS (Abstract File System) adapter for NebulaStore, enabling Apache Kafka as a storage backend with event-driven capabilities, audit trails, and time-travel features.

Implementation Summary

📦 Components Implemented

  1. KafkaBlob - Blob metadata with 28-byte serialization
  2. KafkaConfiguration - Configuration management with production/development presets
  3. KafkaPathValidator - Topic naming sanitization and validation
  4. KafkaTopicIndex - Per-topic blob metadata management
  5. KafkaConnector - Full IBlobStoreConnector implementation

🏗️ Architecture

File → Kafka Topic (data) + Index Topic (metadata)
  ├── Data Topic: File chunks as Kafka messages (1MB default)
  └── Index Topic: Blob metadata (partition, offset, range)

Key Features:

  • ✅ Blob chunking with configurable size (default 1MB)
  • ✅ Partial reads (read specific byte ranges)
  • ✅ File truncation, move, copy operations
  • ✅ Virtual directory support
  • ✅ Topic naming sanitization for Kafka constraints
  • ✅ Production-ready configuration presets

🧪 Testing

  • 75 unit tests - All passing ✅
  • 100% pass rate - No Kafka required for unit tests
  • Test Coverage:
    • KafkaBlobTests (24 tests)
    • KafkaConfigurationTests (29 tests)
    • KafkaPathValidatorTests (22 tests)

📚 Documentation

  • Comprehensive README with usage examples
  • Configuration guide with production/development presets
  • Example code demonstrating various use cases
  • Troubleshooting guide
  • Comparison with other AFS adapters

📊 Files Changed

  • 13 files added (~2,939 lines)
  • Core implementation: 5 source files
  • Tests: 3 test files with 75 tests
  • Documentation: README + examples

🔧 Dependencies

  • Confluent.Kafka (2.6.1) - Apache Kafka .NET client
  • MessagePack (inherited from NebulaStore.Storage)

✅ Build Status

✅ Build: SUCCESS (0 errors, 0 warnings)
✅ Tests: 75/75 PASSED (100% pass rate)

Feasibility Study

This implementation is based on a comprehensive feasibility study documented in:

  • docs/KafkaAfsFeasibility.md - Technical feasibility analysis
  • docs/KafkaAfsComparison.md - Comparison with other AFS adapters
  • docs/KafkaAfsSummary.md - Executive summary
  • docs/KafkaAfsQuickStart.md - Quick start guide

Verdict: FEASIBLE - Confluent.Kafka provides 100% API parity with Java kafka-clients.

Usage Example

using NebulaStore.Storage.Embedded;
using NebulaStore.Afs.Kafka;

// Configure Kafka
var kafkaConfig = KafkaConfiguration.Production(
    bootstrapServers: "kafka1:9092,kafka2:9092,kafka3:9092",
    clientId: "nebulastore-prod"
);

// Create connector
using var connector = KafkaConnector.New(kafkaConfig);

// Create file system
using var fileSystem = BlobStoreFileSystem.New(connector);

// Use with EmbeddedStorage
var storageConfig = EmbeddedStorageConfiguration.New()
    .SetStorageFileSystem(fileSystem)
    .Build();

using var storage = EmbeddedStorage.Start(storageConfig);

When to Use Kafka AFS

✅ Good Fit

  • Event-driven architectures
  • Audit trail requirements
  • Time travel capabilities needed
  • Multi-datacenter deployments
  • Already using Kafka infrastructure
  • Streaming analytics integration

❌ Poor Fit

  • Simple local storage needs (use NIO)
  • Cost-sensitive scenarios
  • Small-scale applications
  • No Kafka expertise

Roadmap

This PR completes Phase 1 (Core Implementation) of the 3-week MVP roadmap.

Future Enhancements (Phase 2 & 3):

  • Async index loading
  • Kafka transactions for atomic writes
  • Index rebuild capability
  • Integration tests with Docker Compose
  • Performance benchmarks
  • Production deployment guide

Testing Instructions

Unit Tests (No Kafka Required)

dotnet test afs/kafka/tests/NebulaStore.Afs.Kafka.Tests.csproj

Integration Tests (Requires Kafka)

Integration tests will be added in Phase 3. For now, you can test manually:

  1. Start Kafka:

    docker-compose up -d kafka
  2. Run examples:

    dotnet run --project afs/kafka/examples/

Breaking Changes

None - This is a new module with no impact on existing code.

Checklist

  • Code compiles without errors or warnings
  • All unit tests pass (75/75)
  • Documentation added (README + examples)
  • Follows existing code patterns (BlobStoreConnectorBase)
  • Added to solution file
  • Feasibility study completed
  • Integration tests (deferred to Phase 3)
  • Performance benchmarks (deferred to Phase 3)

Related Issues

Feasibility investigation for porting Eclipse Store AFS Kafka adapter to NebulaStore.

References


Ready for review! 🎉

This implementation provides a solid foundation for Kafka-backed storage in NebulaStore with comprehensive testing and documentation.


Pull Request opened by Augment Code with guidance from the PR author

prpeh added 2 commits October 5, 2025 12:31
- Add comprehensive feasibility study (KafkaAfsFeasibility.md)
- Add comparison with other AFS adapters (KafkaAfsComparison.md)
- Add executive summary (KafkaAfsSummary.md)
- Add quick start guide (KafkaAfsQuickStart.md)
- Investigate Confluent.Kafka .NET library compatibility
- Map Eclipse Store Kafka AFS design to .NET implementation
- Provide 3-week MVP roadmap for implementation

Verdict: FEASIBLE - Confluent.Kafka provides 100% API parity with Java kafka-clients.
Recommended for event-driven architectures with audit trail requirements.
- Add KafkaBlob record for blob metadata (28-byte serialization)
- Add KafkaConfiguration with production/development presets
- Add KafkaPathValidator for topic naming sanitization
- Add KafkaTopicIndex for per-topic blob metadata management
- Add KafkaConnector implementing IBlobStoreConnector
- Implement read/write operations with blob chunking (1MB default)
- Implement directory operations and file management
- Add comprehensive unit tests (75 tests, 100% pass rate)
- Add README with usage examples and configuration guide
- Add example code demonstrating various use cases

Architecture:
- Files stored as Kafka topics with configurable chunk size
- Blob metadata stored in separate index topics
- Supports partial reads, truncation, and file operations
- Uses Confluent.Kafka 2.6.1 for .NET Kafka client

Testing:
- Unit tests cover all core components without requiring Kafka
- Tests validate serialization, configuration, and path validation
- Integration tests can be added later when Kafka is available

Note: This is Phase 1 (Core Implementation) of the 3-week MVP roadmap.
Next phases will add advanced index management and integration tests.
@hadv hadv added hacktoberfest Issues suitable for Hacktoberfest contributions hacktoberfest-accepted labels Oct 5, 2025
@hadv hadv merged commit 6606443 into main Oct 5, 2025
5 checks passed
@hadv hadv deleted the feature/kafka-afs-adapter branch October 5, 2025 08:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

hacktoberfest Issues suitable for Hacktoberfest contributions hacktoberfest-accepted

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants