Skip to content

storytellerjr/dht_study

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

389 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

DHT Study Project - Learning Distributed Hash Tables

πŸŽ“ Educational Project: This repository is a comprehensive learning journey into Distributed Hash Tables (DHT) using practical examples and hands-on experimentation.

πŸ“š Study Scope & Learning Objectives

This project documents a complete learning journey from DHT fundamentals to production-ready distributed systems. Perfect for developers new to peer-to-peer networking who want hands-on experience with real DHT implementation.

What This Study Covers:

  • πŸ”— P2P Network Fundamentals - Bootstrap nodes, routing tables, peer discovery
  • πŸ—‚οΈ Distributed Storage - Content addressing, data replication, fault tolerance
  • πŸ—οΈ Network Architecture - Kademlia routing, NAT traversal, censorship resistance
  • πŸ’Ύ Production Systems - Large-scale networks, JSON storage, enterprise patterns
  • πŸ›‘οΈ Security Concepts - Attack vectors, defense strategies, network resilience

🎯 Learning Phases Completed

Phase Focus Area Key Achievements
1-2 Core Concepts & Network Setup Bootstrap nodes, peer discovery, routing tables
3 Connection Process Node lifecycle, bootstrap handshake, network integration
4 Security Analysis Bootstrap attack vectors, censorship vulnerabilities
5 Storage Architecture Request handlers, data persistence, command protocols
6 Advanced Features Warehouse nodes, metadata tracking, analytics
7 Production Scale 63-node networks, JSON storage pipeline, multi-node replication
8 Enterprise Storage ⭐ Production warehouse system, CRUD operations, comprehensive testing

πŸ—‚οΈ Study Files & Examples

πŸ“– Core Learning Documentation

  • step by step learning.md - Complete learning journey with discoveries and insights

πŸš€ Network & Bootstrap Examples

  • verbose_bootstrap.mjs - Enhanced bootstrap with detailed network activity logging
  • new_node.mjs - Educational node showing complete bootstrap process
  • fresh_node.mjs - Debug version for connection troubleshooting

πŸ’Ύ Storage & Data Management

  • storage_node.mjs - Basic storage-enabled DHT node
  • warehouse_node.mjs - Enterprise storage with metadata, analytics, and CRUD operations
  • insert_json.mjs - JSON file insertion with validation and hash generation
  • retrieve_json.mjs - JSON retrieval with integrity checking and formatting
  • sample.json - Test data for storage experiments

πŸ”¬ Testing & Experimentation

  • test_warehouse.mjs - NEW! Comprehensive 7-test validation suite for warehouse storage
  • working_demo.mjs - Functional DHT demonstration
  • simple_demo.mjs - Basic DHT operations
  • small_network.mjs - Controlled network testing
  • fixed_insert.mjs - Corrected insertion logic with proper bootstrapping

πŸ§ͺ Development & Debugging

  • debug_insert.mjs - Insertion debugging and troubleshooting
  • test_*.mjs - Various testing scenarios for network behavior

πŸš€ Quick Start Guide

1. Setup Network

# Start bootstrap node (terminal 1)
node examples/bootstrap.mjs

# Create 63-node storage network (terminal 2)  
node examples/network.mjs

2. Store JSON Data

# Insert JSON file into DHT
node insert_json.mjs sample.json
# Returns hash: fe55bb60efb0b18d59e644a55263b4be25d8c5217dfd55a7d18eb37f5958a494

3. Retrieve JSON Data

# Retrieve by hash
node retrieve_json.mjs fe55bb60efb0b18d59e644a55263b4be25d8c5217dfd55a7d18eb37f5958a494

# Save to file
node retrieve_json.mjs <hash> retrieved_data.json

4. Advanced Warehouse Storage ⭐ NEW!

# Start multiple warehouse nodes for distributed storage
node warehouse_node.mjs  # Warehouse 1
node warehouse_node.mjs  # Warehouse 2 (different terminal)
node warehouse_node.mjs  # Warehouse 3 (different terminal)

# Test the complete warehouse system
node test_warehouse.mjs
# Runs 7 comprehensive tests:
# βœ… Storage operations with metadata
# βœ… Retrieval with integrity validation
# βœ… Inventory listing and management
# βœ… Access tracking and analytics
# βœ… Deletion and lifecycle management
# βœ… Error handling and edge cases
# βœ… Multi-node compatibility

🏭 NEW: Advanced Warehouse Storage System ⭐

Major Improvements & Features

The warehouse system has been completely overhauled with enterprise-grade features for production distributed storage networks.

πŸ”§ Core Enhancements

  • Fixed DHT Protocol Handling: Proper token management and request/response cycles
  • Advanced Metadata Tracking: Access counts, timestamps, storage sizes, source tracking
  • Complete CRUD Operations: Store, Retrieve, List, Delete with full lifecycle management
  • Production-Ready Logging: Comprehensive operation tracking and debugging
  • Multi-Node Compatibility: Seamless operation across distributed warehouse networks

πŸ“Š New Commands & Capabilities

Command Code Purpose Features
STORE 100 Store data with metadata Size tracking, source attribution, timestamp
GET 101 Retrieve with access tracking Increments access count, logs retrieval
LIST 102 Inventory management Complete metadata listing, JSON export
DELETE 103 Lifecycle management Secure deletion with confirmation

πŸ§ͺ Comprehensive Testing Suite

The test_warehouse.mjs provides complete validation:

πŸ§ͺ WAREHOUSE STORAGE TEST RESULTS
==================================================
βœ… Passed: 7/7 - ALL TESTS PASSED!
πŸŽ‰ Warehouse storage is working correctly!

Tests Cover:
1. Storage Operations - Multi-format data storage
2. Retrieval Integrity - Data consistency validation  
3. Inventory Management - Metadata listing and tracking
4. Access Analytics - Usage pattern monitoring
5. Deletion Lifecycle - Secure data removal
6. Error Handling - Edge case resilience
7. Multi-Node Support - Distributed network compatibility

🌐 Multi-Warehouse Network Benefits

Fault Tolerance: Data survives individual node failures

# Run multiple warehouses for redundancy
node warehouse_node.mjs  # Port auto-assigned
node warehouse_node.mjs  # Port auto-assigned  
node warehouse_node.mjs  # Port auto-assigned
# Data automatically distributed across all nodes

Load Distribution: Requests balanced across network

// Clients automatically find optimal warehouse
const warehouses = await findAvailableWarehouses()
// DHT routes to nearest/fastest warehouse automatically

Geographic Scaling: Deploy warehouses globally

# Different geographic locations
Server-US:     node warehouse_node.mjs --location="us-east"
Server-Europe: node warehouse_node.mjs --location="eu-west"  
Server-Asia:   node warehouse_node.mjs --location="ap-south"
# Users connect to nearest warehouse automatically

πŸ“ˆ Real-World Applications

  • Enterprise Storage: Replace centralized databases with distributed warehouses
  • Content Distribution: Media files replicated across geographic locations
  • IoT Data Collection: Sensor data stored in regional warehouse clusters
  • Microservices Storage: Each service gets its own warehouse network
  • Web3 Applications: Decentralized storage without central authorities

πŸš€ Usage Examples

Store Any Data Type:

// JSON data
await store('user-config', JSON.stringify({theme: 'dark', lang: 'en'}))

// Binary files  
const fileBuffer = fs.readFileSync('document.pdf')
await store('document-123', fileBuffer)

// Plain text
await store('session-data', 'user-session-abc123')

Advanced Queries:

// Get complete inventory with metadata
const inventory = await listWarehouseInventory()
console.log(inventory) 
// Shows: keys, sizes, access counts, timestamps

// Track data usage
const stats = await getAccessStatistics('user-data-key')
console.log(`Accessed ${stats.count} times, last: ${stats.lastAccess}`)

Multi-Node Replication:

// Store in multiple warehouses for redundancy
const warehouses = await findWarehouseNodes()
for (const warehouse of warehouses.slice(0, 3)) {
  await storeInWarehouse(warehouse, criticalData)
}
// Data now exists in 3 different locations

πŸ” Key Discoveries & Insights

🧠 Conceptual Breakthroughs

  • Bootstrap Dependency: New nodes cannot join without accessible bootstrap nodes
  • Censorship Vulnerability: Authoritarian states can kill P2P networks by blocking bootstraps
  • Automatic Replication: Data is stored on multiple nodes automatically for fault tolerance
  • Content Addressing: Same content always has same hash-based address globally

πŸ›‘οΈ Security Realizations

  • Single Point of Failure: Bootstrap nodes are critical infrastructure
  • Defense Strategies: Multiple bootstraps, domain fronting, bridge networks needed
  • Attack Vectors: Network partitioning, bootstrap blocking, routing manipulation

πŸ—οΈ Architecture Patterns

  • Storage Node = Regular Node + Request Handlers + Local Storage
  • Warehouse Node = Storage Node + Metadata + Analytics + Lifecycle Management
  • Enterprise DHT = Basic DHT + Audit Trails + Performance Monitoring

πŸ“Š Production Capabilities Demonstrated

βœ… Multi-Node Data Replication

  • JSON stored across multiple nodes simultaneously
  • Fault-tolerant retrieval from any storing node
  • Network resilience with node failures

βœ… Content-Addressable Storage

  • SHA256-based content addressing
  • Global content discovery by hash
  • Data integrity verification

βœ… Enterprise Storage Features ⭐ ENHANCED!

  • Advanced Metadata Tracking: Timestamps, sizes, access counts, source attribution
  • Complete CRUD Operations: Store, GET, LIST, DELETE with lifecycle management
  • Real-time Analytics: Usage patterns, performance monitoring, access statistics
  • Comprehensive Logging: Operation audit trails and debug information
  • Multi-Node Orchestration: Automatic distribution and load balancing
  • Production Testing: 7-test validation suite ensuring reliability

πŸŽ“ Educational Value

This project serves as a complete DHT learning curriculum with:

  • πŸ“š Theory: Step-by-step concept building from basics to advanced topics
  • πŸ”§ Practice: Working code examples for every concept
  • πŸ§ͺ Experimentation: Tools for testing ideas and exploring edge cases
  • πŸ“Š Analysis: Real network behavior observation and documentation
  • 🏭 Production: Scalable patterns for real-world applications

Perfect for:

  • Computer Science Students learning distributed systems
  • Developers transitioning to P2P/blockchain development
  • System Architects designing decentralized applications
  • Researchers studying network resilience and censorship resistance

πŸ™ Credits & Attribution

Original DHT-RPC Library

This study project is built upon the excellent dht-rpc library:

  • Original Author: Mathias Buus (@mafintosh)
  • Original Repository: holepunchto/dht-rpc
  • License: MIT
  • Description: Production-grade Kademlia DHT implementation for Node.js

What This Study Adds

  • Educational layer: Step-by-step learning documentation
  • Practical examples: 19+ working demo files with different use cases
  • Production patterns: Enterprise storage architectures and scaling examples
  • Security analysis: Real-world attack vectors and defense strategies
  • JSON pipeline: Complete content storage and retrieval system

Study Project Attribution

  • Created by: storyteller
  • Learning Documentation: Original educational content and analysis
  • Example Scripts: Custom implementations for learning objectives
  • Architecture Patterns: Original warehouse and enterprise storage designs

πŸ“„ Original DHT-RPC Library Documentation

The core DHT-RPC library provides the foundation for this study. For complete API documentation, see the original repository.

Key Features of dht-rpc:

  • Remote IP / firewall detection
  • Easily add any command to your DHT
  • Streaming queries and updates
  • Production-ready Kademlia implementation
  • NAT traversal and network resilience

Basic Usage Pattern:

import DHT from 'dht-rpc'

// Create bootstrap node
const bootstrap = DHT.bootstrapper(10001, '127.0.0.1')

// Create DHT node  
const node = new DHT({ bootstrap: ['localhost:10001'] })

// Add request handling
node.on('request', (req) => {
  // Handle storage/retrieval requests
})

πŸ“ˆ Learning Outcomes Achieved

After completing this study, you will understand:

πŸ”§ Technical Skills

  • DHT node creation and configuration
  • P2P network bootstrap and discovery
  • Distributed storage implementation
  • Content addressing and hash-based routing

πŸ—οΈ System Design

  • Decentralized architecture patterns
  • Fault tolerance and replication strategies
  • Network security and attack mitigation
  • Production scaling and monitoring

πŸ’‘ Conceptual Understanding

  • P2P vs client-server trade-offs
  • Distributed consensus challenges
  • Censorship resistance mechanisms
  • Real-world deployment considerations

πŸš€ Next Steps & Applications

This study foundation enables building:

πŸ“± Applications

  • Distributed file systems
  • Censorship-resistant messaging
  • Peer-to-peer social networks
  • Blockchain light clients

πŸ”¬ Advanced Topics

  • Custom routing algorithms
  • Encrypted storage systems
  • Cross-network bridging
  • Performance optimization

πŸ“œ License

  • Original dht-rpc library: MIT License (Mathias Buus)
  • Study project additions: MIT License
  • Educational content: Available for learning and reference

πŸŽ‰ Ready to explore the decentralized future? Start with Phase 1 in step by step learning.md!

About

A study of dht to understand what is going on when setting a peer to peer network

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors