PardoX — Hyper-Fast Data Engine

The Speed of Rust. The Simplicity of Python.

PardoX is a high-performance DataFrame engine for modern ETL and analytics. A Rust core powers SDKs in Python, Node.js, and PHP, with native database I/O, an ultra-fast binary format, and out-of-core processing for datasets larger than RAM.

v0.3.2 is now available. It adds PRDX Streaming to PostgreSQL (validated on 150M rows), GroupBy, Window Functions, String & Date ops, Lazy Pipeline, SQL over DataFrames, Encryption, Data Contracts, Time Travel, Arrow Flight, Linear Algebra, REST Connector, and Cloud Storage, closing 29 feature gaps in total.

✅ What's New in v0.3.2

| Feature | Details |
| --- | --- |
| PRDX Streaming to PostgreSQL | Stream `.prdx` → PostgreSQL via `COPY FROM STDIN` with O(block) RAM. Validated: 150M rows / 3.8 GB in ~490s at ~306k rows/s |
| GroupBy Aggregation (Gap 1) | `df.groupby(col, {col: agg})` with sum, mean, count, min, max, std (Python, JS, PHP) |
| String & Date Operations (Gap 2) | `str_upper`, `str_lower`, `str_contains`, `date_extract`, `date_diff`, `date_add` (all SDKs) |
| Decimal Type (Gap 3) | Native Decimal128 column type with configurable precision and scale |
| Window Functions (Gap 4) | `row_number`, `rank`, `lag`, `lead`, `rolling_mean` (all SDKs) |
| Lazy Pipeline (Gap 5) | `scan_csv().select().filter().limit().collect()` (all SDKs) |
| SQL over DataFrames (Gap 14) | `df.sql("SELECT ... FROM df ...")` (all SDKs) |
| Out-of-Core Processing (Gap 11) | `chunked_groupby`, `external_sort`, `spill_to_disk`; handles datasets larger than RAM |
| Streaming GroupBy on .prdx (Gap 13) | `prdx_groupby()` with O(groups) memory on any file size |
| Encryption (Gap 18) | `write_prdx_encrypted` / `read_prdx_encrypted` |
| Data Contracts (Gap 19) | `df.validate_contract(schema_json)` for row-level validation |
| Time Travel (Gap 20) | `version_write` / `version_read` / `version_list` for snapshot history |
| Arrow Flight (Gap 21) | `pardox_flight_start` / `pardox_flight_read` for high-throughput Arrow transport |
| Linear Algebra (Gap 28) | `cosine_sim`, `l2_normalize`, `matmul`, `pca` |
| REST Connector (Gap 29) | `read_rest(url, method, headers_json)` → DataFrame |
| Cloud Storage (Gap 15) | `read_cloud_csv` from S3, GCS, Azure |
| 29 Gaps Total | All 29 feature gaps implemented in the Rust core across Python, JS, PHP |
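
To make the "O(groups) memory" idea behind streaming GroupBy concrete, here is a minimal plain-Python sketch. It is not PardoX's implementation and does not call the PardoX API; the function name `streaming_groupby` and the sample data are hypothetical. It only borrows the `{column: aggregation}` shape of the spec and shows that one accumulator per group is enough when rows arrive one at a time.

```python
import csv
import io
from collections import defaultdict


def streaming_groupby(rows, key, spec):
    """Aggregate an iterable of dict rows one row at a time.

    Memory grows with the number of distinct groups, not the number of rows.
    `spec` maps a column name to "sum" or "count" (illustrative subset only).
    """
    state = defaultdict(lambda: {col: 0 for col in spec})
    for row in rows:
        group = state[row[key]]
        for col, agg in spec.items():
            if agg == "sum":
                group[col] += float(row[col])
            elif agg == "count":
                group[col] += 1
            else:
                raise ValueError(f"unsupported aggregation: {agg}")
    return dict(state)


# Hypothetical sample data standing in for a large file that is read lazily.
sample = io.StringIO(
    "state,revenue,qty\n"
    "CA,10.0,1\n"
    "TX,5.5,2\n"
    "CA,2.5,3\n"
)
result = streaming_groupby(
    csv.DictReader(sample), "state", {"revenue": "sum", "qty": "count"}
)
print(result)
```

Because `csv.DictReader` yields rows lazily, the only state kept between rows is the per-group accumulator dictionary.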

✅ Core Capabilities (since v0.3.1)

Zero-Copy Architecture: Rust HyperBlock buffers with no intermediate Python/JS/PHP objects.
SIMD + Multithreading: AVX2/NEON vectorized ops for 5x–20x speedups.
Native Database I/O: PostgreSQL, MySQL, SQL Server, MongoDB — no psycopg2, no pymysql.
.prdx format: ~4.6 GB/s read throughput — faster than Parquet for repeated workloads.
GPU Sort: `sort_values(gpu=True)` runs a WebGPU bitonic sort with CPU fallback.
ML Ready: Zero-copy NumPy bridge via the `__array__` protocol.
Multi-SDK: One Rust core, identical API in Python, Node.js, and PHP.
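
As a rough illustration of what "zero-copy" means in practice, the sketch below uses only Python's standard buffer protocol (`array` and `memoryview`). It is an analogy, not PardoX's HyperBlock buffers or its NumPy bridge: the `column` buffer is a hypothetical stand-in for engine-owned memory.

```python
from array import array

# A contiguous buffer of 64-bit floats, standing in for an engine-owned column.
column = array("d", [1.0, 2.0, 3.0, 4.0])

# memoryview exposes the same memory through the buffer protocol; nothing is copied.
view = memoryview(column)
window = view[1:3]  # slicing a memoryview is also copy-free

# Writing through the view changes the original buffer, so the memory is shared.
window[0] = 20.0
print(column.tolist())

# By contrast, tolist() materializes new Python objects, i.e. a copy.
copied = column.tolist()
copied[0] = -1.0
print(column[0])  # the original buffer is unaffected by edits to the copy
```

The same distinction applies to any consumer of a shared buffer: a view costs nothing per element, while a conversion to native objects scales with the data size.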

📦 Quick Install

Python

pip install pardox

Node.js

npm i @pardox/pardox

PHP

composer require betoalien/pardox-php

🚀 Quick Start

import pardox as px

# Load 100k rows — parallel Rust CSV parser
df = px.read_csv("sales.csv")
print(f"{df.shape[0]:,} rows × {df.shape[1]} columns")

# GroupBy — pure Rust
grouped = df.groupby("state", {"revenue": "sum", "qty": "count"})

# Stream 150M rows to PostgreSQL with O(block) RAM
rows = px.write_sql_prdx(
    "sales_150m.prdx",
    "postgresql://user:pass@localhost:5432/db",
    "sales", mode="append", conflict_cols=[], batch_rows=1_000_000
)

# Out-of-core: GroupBy on .prdx without loading all rows into RAM
result = px.prdx_groupby("sales_150m.prdx", ["region"], {"revenue": "sum"})
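
The `batch_rows` argument above suggests that rows are sent in fixed-size blocks, which is what keeps memory at O(block). The following is a minimal sketch of that batching pattern in plain Python, under the assumption that each block is encoded as CSV text before being handed to a `COPY FROM STDIN` stream. The helpers `csv_blocks` and `fake_source` are hypothetical and are not part of PardoX; no database connection is made.

```python
import csv
import io
from itertools import islice


def csv_blocks(rows, batch_rows):
    """Yield CSV-encoded text blocks of at most `batch_rows` rows each."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_rows))
        if not batch:
            return
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerows(batch)
        yield buffer.getvalue()


def fake_source(n):
    """Hypothetical lazy row source standing in for a large on-disk file."""
    for i in range(n):
        yield (i, f"region_{i % 3}", i * 1.5)


sent_rows = 0
block_sizes = []
for block in csv_blocks(fake_source(10), batch_rows=4):
    # A real pipeline would pass `block` to the database driver's COPY interface here.
    n = block.count("\n")
    block_sizes.append(n)
    sent_rows += n
print(block_sizes, sent_rows)
```

Only one block is held in memory at a time, so peak usage depends on `batch_rows` rather than on the total row count.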

📊 Benchmarks (v0.3.2)

| Operation | Baseline | PardoX v0.3.2 | Speedup |
| --- | --- | --- | --- |
| Read CSV (1 GB) | Pandas ~4.2s | ~0.8s | ~5x |
| Column multiply (1M rows) | Pandas ~0.15s | ~0.02s | ~7.5x |
| PostgreSQL write 50k rows | psycopg2 ~18s | ~0.6s (COPY) | ~30x |
| MySQL write 50k rows | pymysql ~22s | ~3s (batch INSERT) | ~7x |
| PRDX → PostgreSQL 150M rows | N/A | ~490s | N/A (throughput: ~306k rows/s) |
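
The speedup and throughput figures follow directly from the reported timings. This short check recomputes them from the approximate values in the table above; it does not run any benchmark.

```python
# (operation, baseline seconds, PardoX seconds) as reported in the table above.
timings = [
    ("Read CSV (1 GB)", 4.2, 0.8),
    ("Column multiply (1M rows)", 0.15, 0.02),
    ("PostgreSQL write 50k rows", 18.0, 0.6),
    ("MySQL write 50k rows", 22.0, 3.0),
]

# Speedup is the baseline time divided by the PardoX time.
speedups = {name: baseline / pardox for name, baseline, pardox in timings}
for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")

# Streaming throughput is rows divided by elapsed seconds.
rows_per_second = 150_000_000 / 490
print(f"{rows_per_second:,.0f} rows/s")
```

The table rounds these ratios, so 22s / 3s appears as 7x and 150M rows / 490s as ~306k rows/s.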

📚 Documentation

Full docs: https://www.pardox.io
Repository: https://github.com/betoalien/PardoX

🧭 Community

X (Twitter): https://x.com/pardox_io

📄 License

MIT
