This repository contains a Rust implementation of the One Billion Row Challenge. The goal is to efficiently parse and process a large dataset with one billion rows, leveraging Rust's performance capabilities.
The challenge is to read a text file of 1 billion rows, each a key-value pair of the form `Station;Temperature`, where the key is a weather station name and the value is a temperature reading from that station. The program must compute the minimum, average, and maximum temperature for each station and write the results to a JSON file (`output.json`), mapping each station name to a string in the format `min/avg/max`.
The input looks like this:

```
...
Juba;9.2
Dar es Salaam;26.1
Honiara;22.4
San Salvador;6.9
Nashville;21.9
Vientiane;29.4
Edinburgh;22
Gaborone;37.2
...
```
The output looks like this:

```
{
...
Kankan=-22.2/26.5/76.4,
Kano=-35.8/26.4/83.6,
Kansas City=-23.0/12.5/45.2,
Karachi=-26.4/26.0/77.2,
Karonga=-23.9/24.4/72.7,
Kathmandu=-42.6/18.3/75.2,
Khartoum=-14.2/29.9/80.4,
Kingston=-34.3/27.4/86.2,
Kinshasa=-17.4/25.3/71.3,
...
}
```
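The per-station aggregation described above can be sketched in Rust as follows. This is a minimal single-threaded illustration, not the repository's optimized code; the `Stats` struct and `aggregate` function are names chosen here for clarity:

```rust
use std::collections::HashMap;

// Running statistics for one weather station.
struct Stats {
    min: f64,
    max: f64,
    sum: f64,
    count: u64,
}

// Aggregate "Station;Temperature" lines into per-station stats.
fn aggregate(input: &str) -> HashMap<&str, Stats> {
    let mut map: HashMap<&str, Stats> = HashMap::new();
    for line in input.lines() {
        let (name, value) = line.split_once(';').expect("malformed line");
        let temp: f64 = value.parse().expect("malformed temperature");
        let s = map.entry(name).or_insert(Stats {
            min: f64::INFINITY,
            max: f64::NEG_INFINITY,
            sum: 0.0,
            count: 0,
        });
        s.min = s.min.min(temp);
        s.max = s.max.max(temp);
        s.sum += temp;
        s.count += 1;
    }
    map
}

fn main() {
    let stats = aggregate("Juba;9.2\nJuba;11.0\nHoniara;22.4");
    let juba = &stats["Juba"];
    // min/avg/max, matching the output format shown above.
    println!(
        "Juba={:.1}/{:.1}/{:.1}",
        juba.min,
        juba.sum / juba.count as f64,
        juba.max
    ); // prints "Juba=9.2/10.1/11.0"
}
```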
| | Cold Cache | Warm Cache |
|---|---|---|
| ⏱️ Performance | 6-10 seconds | 1.8-2.0 seconds |
Latest Benchmarks:
- Calculations only: ~1.814s
- Full challenge: ~1.820s
Benchmarks were run on my MacBook Pro 14" M3 Max with 36GB Unified Memory and 14 CPU cores (10 performance, 4 efficiency).
- SIMD acceleration for ultra-fast parsing (responsible for around 10-20% of the performance gain)
- Multi-threaded processing using `rayon` (responsible for most of the performance gain)
- Optimized HashMap lookups using `hashbrown` and `ahash`
- Optimized temperature parsing by treating each value as an `i16` scaled by 10, avoiding float parsing entirely
- Efficient memory management
- Handles 1 billion+ rows efficiently and quickly
- Benchmark suite with Criterion for accurate performance measurements
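The scaled-integer parsing trick listed above can be sketched like this. It is an illustrative version, not the repository's actual routine, and it assumes every temperature carries exactly one decimal digit (a reading such as `22` would need an extra scaling step); `parse_temp_x10` is a hypothetical name:

```rust
// Parse a temperature like "-26.4" directly into tenths of a degree
// (-264) as an i16, skipping float parsing entirely. Assumes the
// value always has exactly one decimal digit.
fn parse_temp_x10(s: &[u8]) -> i16 {
    let (neg, digits) = match s.first() {
        Some(b'-') => (true, &s[1..]),
        _ => (false, s),
    };
    let mut v: i16 = 0;
    for &b in digits {
        if b != b'.' {
            v = v * 10 + (b - b'0') as i16;
        }
    }
    if neg { -v } else { v }
}

fn main() {
    assert_eq!(parse_temp_x10(b"26.1"), 261);
    assert_eq!(parse_temp_x10(b"-35.8"), -358);
    assert_eq!(parse_temp_x10(b"0.0"), 0);
}
```

Keeping sums in scaled integers also makes the final average a single integer division followed by one conversion back to a decimal string.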
```sh
# Run in release mode
cargo run --release

# Run benchmarks
cargo bench

# Generate sample data (e.g., 1000 rows)
cargo run --example generate 1000

# Generate the full 1B-row challenge data
cargo run --example generate 1000000000
```

Project structure:

- `src/` - Core source code
- `benches/` - Benchmarks
- `examples/` - Data generators
- `1b_measurements.txt` - Input data (ignored in git)
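For reference, the core of a generator like those in `examples/` might look roughly like this. It is a hypothetical sketch: the station list is abbreviated, and deterministic pseudo-temperatures stand in for whatever RNG the real generator uses:

```rust
use std::io::{self, BufWriter, Write};

// Write `rows` lines of "Station;Temperature" data. Deterministic
// pseudo-temperatures keep the sketch dependency-free; a real
// generator would draw from an RNG and a larger station list.
fn write_rows<W: Write>(out: &mut W, rows: usize) -> io::Result<()> {
    let stations = ["Juba", "Honiara", "Nashville", "Gaborone"];
    for i in 0..rows {
        let station = stations[i % stations.len()];
        let t_x10 = ((i * 73) % 700) as i64 - 350; // tenths of a degree
        let sign = if t_x10 < 0 { "-" } else { "" };
        let a = t_x10.abs();
        writeln!(out, "{};{}{}.{}", station, sign, a / 10, a % 10)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    let mut out = BufWriter::new(stdout.lock());
    write_rows(&mut out, 5)
}
```

Buffering the writer matters here: at a billion rows, unbuffered `writeln!` calls would dominate the generator's runtime.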
PRs welcome! Please benchmark your changes.
This project is licensed under the MIT License. See the LICENSE file for details.