A Rust implementation of a BitTorrent metainfo file (.torrent) parser using the Bencode format. This project demonstrates parser combinators with the nom library and provides detailed educational comments explaining Rust concepts.
- Complete Bencode format parser supporting all data types:
  - Integers (`i42e`)
  - Strings (`4:spam`)
  - Lists (`l4:spami42ee`)
  - Dictionaries (`d4:spami42ee`)
- Extracts comprehensive torrent file metadata
- Supports both single-file and multi-file torrents
- Detailed error reporting with hex dump of unparsed data
- Educational code comments with Q&A format
- Comprehensive test suite
Built with:
- Rust (Edition 2024)
- nom (8.0.0) - Parser combinator library
- humantime (2.2.0) - Human-readable time formatting

Requirements:
- Rust 1.85 or higher (the first release that supports Edition 2024)
- Cargo (comes with Rust)
# Clone the repository
git clone git@github.com:gvpaleev/BitTorrentParserRust.git
cd BitTorrentParserRust
# Build the project
cargo build --release
# Run the application
cargo run --release

Place a `.torrent` file named `ubuntu.torrent` in the project root directory and run:
cargo run

Example output:
File size: 245503 bytes
First 50 bytes: [100, 56, 58, 97, 110, 110, 111, 117, 110, 99, ...]
Parsed .torrent file successfully!
=== TORRENT FILE INFO ===
Announce URL: https://torrent.ubuntu.com/announce
Creation date: 2023-10-12T12:00:00Z (timestamp: 1697112000)
Comment: Ubuntu CD releases.ubuntu.com
Created by: mktorrent 1.1
=== FILE INFO ===
Name: ubuntu-23.10-desktop-amd64.iso
Single file
File size: 5234567890 bytes (4992.07 MB)
Piece length: 262144 bytes (256.00 KB)
Number of pieces: 19969
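
The first two lines of the sample output can be produced with the standard library alone. The following is a minimal sketch, not the project's actual code: the names `describe_input` and `load_and_describe` are hypothetical, and the output layout is assumed from the sample above.

```rust
use std::fs;
use std::io;

/// Hypothetical helper: summarizes raw input in the style of the sample output.
fn describe_input(data: &[u8]) -> String {
    // Show at most the first 50 bytes; shorter inputs are shown in full.
    let shown = &data[..data.len().min(50)];
    format!("File size: {} bytes\nFirst 50 bytes: {:?}", data.len(), shown)
}

/// Hypothetical helper: reads the whole file into memory, then summarizes it.
fn load_and_describe(path: &str) -> io::Result<String> {
    let data = fs::read(path)?;
    Ok(describe_input(&data))
}
```

A `.torrent` file normally starts with `d` (byte 100) because its top-level value is a dictionary, which matches the leading `100` in the sample.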
BitTorrentParserRust/
├── src/
│ └── main.rs # Main parser implementation
├── Cargo.toml # Project dependencies
├── Cargo.lock # Dependency lock file
├── ubuntu.torrent # Sample torrent file
├── ubuntu.dec # Decoded torrent data
├── launch.json # VSCode debug configuration
└── README.md # This file
The `BencodeValue` enum represents all possible Bencode data types:
- `String(Vec<u8>)` - Byte strings
- `Integer(i64)` - Signed integers
- `List(Vec<BencodeValue>)` - Lists of values
- `Dictionary(HashMap<Vec<u8>, BencodeValue>)` - Key-value maps

Parser functions:
- `parse_integer()` - Parses integers in the format `i<number>e`
- `parse_string()` - Parses strings in the format `<length>:<data>`
- `parse_list()` - Parses lists in the format `l<values>e`
- `parse_dictionary()` - Parses dictionaries in the format `d<key><value>...e`
- `parse_bencode_value()` - Main dispatcher using the `alt` combinator
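
To show how these five functions fit together, here is a dependency-free sketch of the same structure. It is an approximation under stated assumptions, not the project's code: the project builds its parsers with nom and dispatches with `alt`, whereas this version uses only the standard library, dispatches on the first byte, and returns `Option` instead of detailed errors. The signatures are assumed, and non-canonical input such as leading zeros is not rejected.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum BencodeValue {
    String(Vec<u8>),
    Integer(i64),
    List(Vec<BencodeValue>),
    Dictionary(HashMap<Vec<u8>, BencodeValue>),
}

/// `i<number>e`: returns the integer and the unconsumed input.
fn parse_integer(input: &[u8]) -> Option<(i64, &[u8])> {
    let (first, rest) = input.split_first()?;
    if *first != b'i' {
        return None;
    }
    let end = rest.iter().position(|&b| b == b'e')?;
    let n = std::str::from_utf8(&rest[..end]).ok()?.parse::<i64>().ok()?;
    Some((n, &rest[end + 1..]))
}

/// `<length>:<data>`: the data may be arbitrary bytes, so it stays a Vec<u8>.
fn parse_string(input: &[u8]) -> Option<(Vec<u8>, &[u8])> {
    let colon = input.iter().position(|&b| b == b':')?;
    let len = std::str::from_utf8(&input[..colon]).ok()?.parse::<usize>().ok()?;
    let rest = &input[colon + 1..];
    if rest.len() < len {
        return None; // declared length exceeds the available input
    }
    Some((rest[..len].to_vec(), &rest[len..]))
}

/// `l<values>e`: parses values until the closing `e`.
fn parse_list(input: &[u8]) -> Option<(Vec<BencodeValue>, &[u8])> {
    let (first, mut rest) = input.split_first()?;
    if *first != b'l' {
        return None;
    }
    let mut items = Vec::new();
    while *rest.first()? != b'e' {
        let (value, next) = parse_bencode_value(rest)?;
        items.push(value);
        rest = next;
    }
    Some((items, &rest[1..]))
}

/// `d<key><value>...e`: keys are always byte strings.
fn parse_dictionary(input: &[u8]) -> Option<(HashMap<Vec<u8>, BencodeValue>, &[u8])> {
    let (first, mut rest) = input.split_first()?;
    if *first != b'd' {
        return None;
    }
    let mut map = HashMap::new();
    while *rest.first()? != b'e' {
        let (key, after_key) = parse_string(rest)?;
        let (value, after_value) = parse_bencode_value(after_key)?;
        map.insert(key, value);
        rest = after_value;
    }
    Some((map, &rest[1..]))
}

/// Dispatcher: picks a parser based on the first byte of the input.
fn parse_bencode_value(input: &[u8]) -> Option<(BencodeValue, &[u8])> {
    match *input.first()? {
        b'i' => parse_integer(input).map(|(n, r)| (BencodeValue::Integer(n), r)),
        b'l' => parse_list(input).map(|(v, r)| (BencodeValue::List(v), r)),
        b'd' => parse_dictionary(input).map(|(m, r)| (BencodeValue::Dictionary(m), r)),
        b'0'..=b'9' => parse_string(input).map(|(s, r)| (BencodeValue::String(s), r)),
        _ => None,
    }
}
```

Each function returns the parsed value together with the remaining input, which is the same shape nom parsers use and is what lets the list and dictionary parsers call the dispatcher in a loop.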

Helper functions for extracting typed values:
- `get_string()` - Extracts byte string
- `get_string_utf8()` - Converts to UTF-8 string
- `get_integer()` - Extracts integer value
- `get_dict()` - Extracts dictionary
- `get_list()` - Extracts list
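
One plausible shape for these helpers is a set of methods that return `Option`, yielding `None` when the value has a different type. This is a sketch under that assumption; the project's actual signatures and return types may differ, and `announce_url` is a hypothetical function added only to show chaining.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum BencodeValue {
    String(Vec<u8>),
    Integer(i64),
    List(Vec<BencodeValue>),
    Dictionary(HashMap<Vec<u8>, BencodeValue>),
}

impl BencodeValue {
    /// Raw bytes of a string value.
    fn get_string(&self) -> Option<&[u8]> {
        match self {
            BencodeValue::String(bytes) => Some(bytes.as_slice()),
            _ => None,
        }
    }

    /// String value as UTF-8 text; None if it is not a string or not valid UTF-8.
    fn get_string_utf8(&self) -> Option<&str> {
        std::str::from_utf8(self.get_string()?).ok()
    }

    fn get_integer(&self) -> Option<i64> {
        match self {
            BencodeValue::Integer(n) => Some(*n),
            _ => None,
        }
    }

    fn get_dict(&self) -> Option<&HashMap<Vec<u8>, BencodeValue>> {
        match self {
            BencodeValue::Dictionary(map) => Some(map),
            _ => None,
        }
    }

    fn get_list(&self) -> Option<&[BencodeValue]> {
        match self {
            BencodeValue::List(items) => Some(items.as_slice()),
            _ => None,
        }
    }
}

/// Hypothetical usage: look up the `announce` key of the top-level dictionary.
fn announce_url(root: &BencodeValue) -> Option<&str> {
    root.get_dict()?.get(&b"announce"[..])?.get_string_utf8()
}
```

Returning `Option` lets lookups be chained with `?`, so a missing key and a value of the wrong type are handled the same way.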
Bencode is a simple binary serialization format used by BitTorrent:
| Type | Format | Example | Result |
|---|---|---|---|
| Integer | `i<number>e` | `i42e` | `42` |
| String | `<length>:<data>` | `4:spam` | `"spam"` |
| List | `l<values>e` | `l4:spami42ee` | `["spam", 42]` |
| Dictionary | `d<key><value>...e` | `d4:spami42ee` | `{"spam": 42}` |
- Dictionary keys must be strings
- Dictionary keys must be sorted lexicographically by raw byte value, not alphabetically
- Strings can contain arbitrary binary data
- Integers can be negative
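
The key-ordering rule can be checked with plain slice comparison, since Rust compares byte slices lexicographically. The sketch below uses a hypothetical function name, `keys_are_sorted`, and is not taken from the project.

```rust
/// Returns true when the keys are in strictly ascending byte order.
/// Strict comparison also rejects duplicate keys.
fn keys_are_sorted(keys: &[&[u8]]) -> bool {
    keys.windows(2).all(|pair| pair[0] < pair[1])
}
```

Because the comparison is byte-wise, an uppercase key such as `Z` (byte 90) sorts before a lowercase key such as `a` (byte 97). A `HashMap` does not preserve insertion order, so a check like this has to run on the keys in the order they appear in the input.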
Run the test suite:
cargo test

The tests cover:
- Integer parsing (positive and negative)
- String parsing
- List parsing
- Dictionary parsing
- Edge cases and error handling
The code includes extensive educational comments in a Q&A format:
- Explains Rust concepts (ownership, borrowing, traits)
- Describes nom parser combinators
- Details BitTorrent protocol specifics
- Clarifies design decisions
Perfect for learning:
- Parser combinators with nom
- Binary format parsing
- Rust error handling
- BitTorrent protocol internals

The parser extracts the following fields:
- Announce URL - Primary tracker URL
- Announce List - Backup tracker tiers
- Creation Date - Unix timestamp with human-readable format
- Comment - Torrent description
- Created By - Client software used
- File Name - Name of file or directory
- Piece Length - Size of each piece in bytes
- Number of Pieces - Total piece count
- File Size - Total size with MB conversion
- File List - For multi-file torrents (first 10 files shown)
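
Several of these values are derived rather than read directly. In the BitTorrent v1 metainfo format, `pieces` is a concatenation of 20-byte SHA-1 hashes, a single-file torrent has one `length`, and a multi-file torrent has one `length` per entry in `files`. The sketch below shows the arithmetic; all function names are hypothetical, and "MB" follows the sample output's convention of 1024 × 1024 bytes.

```rust
/// Size in bytes of one SHA-1 digest inside the `pieces` byte string.
const SHA1_LEN: usize = 20;

/// Number of pieces, or None if `pieces` is not a whole number of hashes.
fn piece_count(pieces: &[u8]) -> Option<usize> {
    if pieces.len() % SHA1_LEN == 0 {
        Some(pieces.len() / SHA1_LEN)
    } else {
        None
    }
}

/// Total size: pass one length for a single-file torrent,
/// or every entry's length for a multi-file torrent.
fn total_size(file_lengths: &[u64]) -> u64 {
    file_lengths.iter().sum()
}

/// Bytes to megabytes, using 1 MB = 1024 * 1024 bytes.
fn to_mb(bytes: u64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0)
}

/// Piece count implied by the total size (rounds up; the last piece may be short).
/// Assumes piece_length is greater than zero.
fn expected_pieces(total: u64, piece_length: u64) -> u64 {
    (total + piece_length - 1) / piece_length
}
```

Comparing `piece_count` against `expected_pieces` is a cheap consistency check on a parsed torrent.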
The parser provides detailed error messages:
- Shows unparsed data in hex format
- Displays ASCII representation when possible
- Indicates exact position of parse failures
- Handles incomplete input gracefully
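
A hex-plus-ASCII view of leftover input could be produced along these lines. This is an illustrative sketch with hypothetical names (`hex_dump`, `describe_leftover`) and an assumed output layout; the project's real messages may be formatted differently.

```rust
/// Formats up to `max_bytes` of `data` as hex, followed by an ASCII column
/// in which non-printable bytes are shown as '.'.
fn hex_dump(data: &[u8], max_bytes: usize) -> String {
    let shown = &data[..data.len().min(max_bytes)];
    let hex: Vec<String> = shown.iter().map(|b| format!("{:02x}", b)).collect();
    let ascii: String = shown
        .iter()
        .map(|&b| if b.is_ascii_graphic() || b == b' ' { b as char } else { '.' })
        .collect();
    format!("{}  |{}|", hex.join(" "), ascii)
}

/// Reports where parsing stopped. Assumes `rest` is the unconsumed tail of `input`.
fn describe_leftover(input: &[u8], rest: &[u8]) -> String {
    let offset = input.len() - rest.len();
    format!("unparsed data at offset {}: {}", offset, hex_dump(rest, 16))
}
```

The offset is simply the difference between the original and remaining lengths, which works because each parser hands back the unconsumed tail of its input.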

Performance notes:
- Zero-copy parsing where possible
- Efficient use of Rust's ownership system
- HashMap for average-case O(1) dictionary lookups
- Minimal allocations during parsing
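
As an illustration of what zero-copy means here, a string parser can return a slice that borrows from the input instead of allocating a new buffer. The sketch below is hypothetical (`parse_string_borrowed` is not a project function); the `BencodeValue` enum described earlier stores owned `Vec<u8>` data, so a copy is still made whenever a value is stored in it.

```rust
/// Parses `<length>:<data>` and returns (data, remaining input).
/// Both slices borrow from `input`; nothing is copied or allocated.
fn parse_string_borrowed(input: &[u8]) -> Option<(&[u8], &[u8])> {
    let colon = input.iter().position(|&b| b == b':')?;
    let len: usize = std::str::from_utf8(&input[..colon]).ok()?.parse().ok()?;
    let body = &input[colon + 1..];
    if body.len() < len {
        return None; // declared length exceeds the available input
    }
    Some(body.split_at(len))
}
```

The trade-off is that borrowed results cannot outlive the input buffer, which is why owned values are often simpler for a parsed tree that is kept around.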
Contributions are welcome! Areas for improvement:
- Support for additional torrent extensions
- Bencode encoder implementation
- Streaming parser for large files
- CLI argument parsing for custom file paths
- Additional output formats (JSON, YAML)
This project is provided as-is for educational purposes.
Created as an educational project demonstrating Rust parser implementation and BitTorrent protocol understanding.