gvpaleev/BitTorrentParserRust

BitTorrent Parser Rust

A Rust implementation of a BitTorrent metainfo file (.torrent) parser using the Bencode format. This project demonstrates parser combinators with the nom library and provides detailed educational comments explaining Rust concepts.

Features

  • Complete Bencode format parser supporting all data types:
    • Integers (i42e)
    • Strings (4:spam)
    • Lists (l4:spami42ee)
    • Dictionaries (d4:spami42ee)
  • Extracts comprehensive torrent file metadata
  • Supports both single-file and multi-file torrents
  • Detailed error reporting with hex dump of unparsed data
  • Educational code comments with Q&A format
  • Comprehensive test suite

Technologies

  • Rust (Edition 2024)
  • nom (8.0.0) - Parser combinator library
  • humantime (2.2.0) - Human-readable time formatting

Installation

Prerequisites

  • Rust 1.85 or higher (the first release that supports Edition 2024)
  • Cargo (comes with Rust)

Building from Source

# Clone the repository
git clone git@github.com:gvpaleev/BitTorrentParserRust.git
cd BitTorrentParserRust

# Build the project
cargo build --release

# Run the application
cargo run --release

Usage

Place a .torrent file named ubuntu.torrent in the project root directory and run:

cargo run

Example Output

File size: 245503 bytes
First 50 bytes: [100, 56, 58, 97, 110, 110, 111, 117, 110, 99, ...]

Parsed .torrent file successfully!

=== TORRENT FILE INFO ===
Announce URL: https://torrent.ubuntu.com/announce
Creation date: 2023-10-12T12:00:00Z (timestamp: 1697112000)
Comment: Ubuntu CD releases.ubuntu.com
Created by: mktorrent 1.1

=== FILE INFO ===
Name: ubuntu-23.10-desktop-amd64.iso
Single file
File size: 5234567890 bytes (4992.07 MB)
Piece length: 262144 bytes (256.00 KB)
Number of pieces: 19969
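
The "First 50 bytes" line in the sample output is plain ASCII. As a small sketch (the function name bytes_as_text is hypothetical and not part of the project), decoding the ten bytes shown makes the opening of the top-level dictionary visible:

```rust
/// Renders raw bytes as text, replacing invalid UTF-8 with U+FFFD.
/// Hypothetical helper for illustration; not taken from the repository.
pub fn bytes_as_text(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

Applied to [100, 56, 58, 97, 110, 110, 111, 117, 110, 99], this yields "d8:announc": the d that opens a dictionary, followed by the start of the 8-byte key announce.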

Project Structure

BitTorrentParserRust/
├── src/
│   └── main.rs          # Main parser implementation
├── Cargo.toml           # Project dependencies
├── Cargo.lock           # Dependency lock file
├── ubuntu.torrent       # Sample torrent file
├── ubuntu.dec           # Decoded torrent data
├── launch.json          # VSCode debug configuration
└── README.md            # This file

Code Architecture

Core Components

BencodeValue Enum

Represents all possible Bencode data types:

  • String(Vec<u8>) - Byte strings
  • Integer(i64) - Signed integers
  • List(Vec<BencodeValue>) - Lists of values
  • Dictionary(HashMap<Vec<u8>, BencodeValue>) - Key-value maps
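
A minimal sketch of how such an enum might be declared, using the variant shapes listed above. The derive list and the spam_dictionary helper are assumptions added for illustration, not necessarily what main.rs contains.

```rust
use std::collections::HashMap;

/// One variant per Bencode data type, matching the list above.
/// The derives are an assumption made so the value can be printed and compared.
#[derive(Debug, Clone, PartialEq)]
pub enum BencodeValue {
    String(Vec<u8>),
    Integer(i64),
    List(Vec<BencodeValue>),
    Dictionary(HashMap<Vec<u8>, BencodeValue>),
}

/// Hypothetical helper: builds the value that `d4:spami42ee` stands for,
/// a dictionary with the single entry "spam" => 42.
pub fn spam_dictionary() -> BencodeValue {
    let mut map = HashMap::new();
    map.insert(b"spam".to_vec(), BencodeValue::Integer(42));
    BencodeValue::Dictionary(map)
}
```

Keys are Vec<u8> rather than String because Bencode strings are byte strings and are not guaranteed to be valid UTF-8.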

Parser Functions

  • parse_integer() - Parses integers in format i<number>e
  • parse_string() - Parses strings in format <length>:<data>
  • parse_list() - Parses lists in format l<values>e
  • parse_dictionary() - Parses dictionaries in format d<key><value>...e
  • parse_bencode_value() - Main dispatcher using alt combinator
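
The project builds these parsers from nom combinators. To illustrate the wire format without depending on nom, the following is a hand-rolled, standard-library-only sketch of the integer and string cases. The signatures and the Option return type are assumptions for illustration; only the (rest, value) result shape is borrowed from nom's convention.

```rust
/// Parses `i<number>e` and returns the remaining input plus the value.
/// This sketch is lenient: it does not reject leading zeros or `-0`.
pub fn parse_integer(input: &[u8]) -> Option<(&[u8], i64)> {
    let (&first, body) = input.split_first()?;
    if first != b'i' {
        return None;
    }
    // Everything up to the first 'e' is the decimal text of the number.
    let end = body.iter().position(|&b| b == b'e')?;
    let text = std::str::from_utf8(&body[..end]).ok()?;
    let value = text.parse::<i64>().ok()?;
    Some((&body[end + 1..], value))
}

/// Parses `<length>:<data>` and returns the remaining input plus the data.
/// The data is returned as a borrowed slice, so nothing is copied.
pub fn parse_string(input: &[u8]) -> Option<(&[u8], &[u8])> {
    let colon = input.iter().position(|&b| b == b':')?;
    let digits = &input[..colon];
    if digits.is_empty() || !digits.iter().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let len: usize = std::str::from_utf8(digits).ok()?.parse().ok()?;
    let after_colon = &input[colon + 1..];
    if after_colon.len() < len {
        return None; // declared length runs past the end of the input
    }
    let (data, rest) = after_colon.split_at(len);
    Some((rest, data))
}
```

Returning the unconsumed remainder is what lets parsers be chained: a list parser can call the value parser repeatedly on the remainder until it sees the closing e.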

Helper Methods

  • get_string() - Extracts byte string
  • get_string_utf8() - Converts to UTF-8 string
  • get_integer() - Extracts integer value
  • get_dict() - Extracts dictionary
  • get_list() - Extracts list
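
One plausible shape for these helpers is a set of methods that return Some only when the value is the expected variant. The return types below are assumptions, since the README does not state the real signatures; the enum is repeated so the snippet compiles on its own.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum BencodeValue {
    String(Vec<u8>),
    Integer(i64),
    List(Vec<BencodeValue>),
    Dictionary(HashMap<Vec<u8>, BencodeValue>),
}

impl BencodeValue {
    /// Raw bytes of a string value, or None for any other variant.
    pub fn get_string(&self) -> Option<&[u8]> {
        match self {
            BencodeValue::String(bytes) => Some(bytes.as_slice()),
            _ => None,
        }
    }

    /// String value as UTF-8 text; None if it is not a string or not valid UTF-8.
    pub fn get_string_utf8(&self) -> Option<&str> {
        self.get_string()
            .and_then(|bytes| std::str::from_utf8(bytes).ok())
    }

    pub fn get_integer(&self) -> Option<i64> {
        match self {
            BencodeValue::Integer(n) => Some(*n),
            _ => None,
        }
    }

    pub fn get_dict(&self) -> Option<&HashMap<Vec<u8>, BencodeValue>> {
        match self {
            BencodeValue::Dictionary(map) => Some(map),
            _ => None,
        }
    }

    pub fn get_list(&self) -> Option<&[BencodeValue]> {
        match self {
            BencodeValue::List(items) => Some(items.as_slice()),
            _ => None,
        }
    }
}
```

With Option-returning accessors, a lookup such as the announce URL can be written as a chain of and_then calls that yields None if any step has the wrong type.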

Bencode Format Specification

Bencode is a simple binary serialization format used by BitTorrent:

| Type | Format | Example | Result |
| --- | --- | --- | --- |
| Integer | i<number>e | i42e | 42 |
| String | <length>:<data> | 4:spam | "spam" |
| List | l<values>e | l4:spami42ee | ["spam", 42] |
| Dictionary | d<key><value>...e | d4:spami42ee | {"spam": 42} |

Rules

  • Dictionary keys must be byte strings
  • Dictionary keys must appear in sorted order, compared as raw bytes
  • Strings can contain arbitrary binary data, not only UTF-8 text
  • Integers can be negative, but -0 and leading zeros (such as i03e) are invalid
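
The key-ordering rule, together with the canonical-form rule for integers in the BitTorrent specification (no leading zeros, no negative zero), can be checked with two small predicates. This is a sketch with hypothetical function names; the README does not say whether the project's parser enforces these rules or simply accepts non-canonical input.

```rust
/// Returns true when `text` (the bytes between `i` and `e`) is a canonical
/// Bencode integer: an optional '-', then digits with no leading zero,
/// and never "-0".
pub fn is_canonical_integer(text: &[u8]) -> bool {
    let (negative, digits) = match text.split_first() {
        Some((&b'-', rest)) => (true, rest),
        _ => (false, text),
    };
    let all_digits = !digits.is_empty() && digits.iter().all(|b| b.is_ascii_digit());
    let leading_zero = digits.len() > 1 && digits[0] == b'0';
    let negative_zero = negative && digits.len() == 1 && digits[0] == b'0';
    all_digits && !leading_zero && !negative_zero
}

/// Returns true when the keys are strictly ascending under raw byte
/// comparison, which also rules out duplicate keys.
pub fn keys_are_sorted(keys: &[&[u8]]) -> bool {
    keys.windows(2).all(|pair| pair[0] < pair[1])
}
```

Byte-wise comparison means uppercase ASCII keys sort before lowercase ones, which differs from a case-insensitive alphabetical order.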

Testing

Run the test suite:

cargo test

Test Coverage

  • Integer parsing (positive and negative)
  • String parsing
  • List parsing
  • Dictionary parsing
  • Edge cases and error handling

Educational Features

The code includes extensive educational comments in a Q&A format:

  • Explains Rust concepts (ownership, borrowing, traits)
  • Describes nom parser combinators
  • Details BitTorrent protocol specifics
  • Clarifies design decisions

Perfect for learning:

  • Parser combinators with nom
  • Binary format parsing
  • Rust error handling
  • BitTorrent protocol internals

Torrent File Information Extracted

  • Announce URL - Primary tracker URL
  • Announce List - Backup tracker tiers
  • Creation Date - Unix timestamp with human-readable format
  • Comment - Torrent description
  • Created By - Client software used
  • File Name - Name of file or directory
  • Piece Length - Size of each piece in bytes
  • Number of Pieces - Total piece count
  • File Size - Total size with MB conversion
  • File List - For multi-file torrents (first 10 files shown)
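
Several of these fields are derived rather than read directly. In the BitTorrent metainfo format the pieces value is a concatenation of 20-byte SHA-1 hashes, so the piece count is its length divided by 20, and a multi-file torrent's total size is the sum of its file lengths. The sketch below uses hypothetical function names and assumes the MB figure is computed with 1024 × 1024 bytes, which matches the "262144 bytes (256.00 KB)" line in the sample output.

```rust
/// Length in bytes of one SHA-1 digest in the `pieces` field.
pub const SHA1_LEN: usize = 20;

/// Number of pieces, or None if `pieces` is not a whole number of hashes.
pub fn piece_count(pieces: &[u8]) -> Option<usize> {
    if pieces.len() % SHA1_LEN == 0 {
        Some(pieces.len() / SHA1_LEN)
    } else {
        None
    }
}

/// Total payload size of a multi-file torrent: the sum of all file lengths.
pub fn total_size(file_lengths: &[u64]) -> u64 {
    file_lengths.iter().sum()
}

/// Converts a byte count to megabytes, assuming 1 MB = 1024 * 1024 bytes.
pub fn bytes_to_mb(bytes: u64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0)
}
```

Rejecting a pieces value whose length is not a multiple of 20 is a cheap sanity check that catches truncated or corrupted files early.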

Error Handling

The parser provides detailed error messages:

  • Shows unparsed data in hex format
  • Displays ASCII representation when possible
  • Indicates exact position of parse failures
  • Handles incomplete input gracefully
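
A report of this kind can be produced from the unparsed remainder alone. The following sketch is an assumption about how such a message might be assembled; the function names and the exact output layout are illustrative, not the project's actual format.

```rust
/// Formats up to `max_bytes` of the unparsed input as space-separated hex,
/// followed by an ASCII column in which non-printable bytes appear as '.'.
pub fn hex_preview(rest: &[u8], max_bytes: usize) -> String {
    let shown = &rest[..rest.len().min(max_bytes)];
    let hex: Vec<String> = shown.iter().map(|b| format!("{b:02x}")).collect();
    let ascii: String = shown
        .iter()
        .map(|&b| {
            if b.is_ascii_graphic() || b == b' ' {
                b as char
            } else {
                '.'
            }
        })
        .collect();
    format!("{}  |{}|", hex.join(" "), ascii)
}

/// Byte offset of the failure, assuming `rest` is the unparsed tail of `input`.
pub fn failure_offset(input: &[u8], rest: &[u8]) -> usize {
    input.len().saturating_sub(rest.len())
}
```

For example, hex_preview(b"i42e", 16) produces "69 34 32 65  |i42e|". Capping the preview length keeps the message readable when the remainder is a large block of piece hashes.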

Performance Considerations

  • Zero-copy parsing where possible
  • Efficient use of Rust's ownership system
  • HashMap for average-case O(1) dictionary lookups
  • Minimal allocations during parsing

Contributing

Contributions are welcome! Areas for improvement:

  • Support for additional torrent extensions
  • Bencode encoder implementation
  • Streaming parser for large files
  • CLI argument parsing for custom file paths
  • Additional output formats (JSON, YAML)

License

This project is provided as-is for educational purposes.

Author

Created as an educational project demonstrating Rust parser implementation and BitTorrent protocol understanding.
