feat: Add Cube ADBC Driver for CubeSQL #2

Open: borodark wants to merge 73 commits into main from cleanup-take-II

borodark (Owner) commented Dec 13, 2025

Adds a native ADBC driver for CubeSQL that connects to CubeSQL's Arrow Native server (port 8120), enabling high-performance binary data transfer for Elixir/Livebook and any ADBC-compatible client.

Part of a three-project ecosystem (see Related Projects).

What this PR does

Implements a complete ADBC Cube driver in C++ that:

  1. Speaks CubeSQL's Arrow Native protocol - Binary Arrow IPC over TCP
  2. Returns Arrow RecordBatches directly - Zero-copy data transfer to Elixir/Python/R
  3. Handles all Arrow data types - Integers, floats, strings, timestamps, decimals
  4. Supports MEASURE() syntax - Cube's semantic layer aggregation functions
  5. Works with livebook-dev/adbc - Drop-in driver for existing ADBC infrastructure
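To make "Arrow IPC over TCP" in item 1 a little more concrete: in the Arrow IPC streaming format, each encapsulated message starts with a 32-bit continuation marker (0xFFFFFFFF) followed by a 32-bit little-endian metadata length, and a length of 0 marks the end of the stream. The sketch below reads only that 8-byte prefix from a buffer. It assumes a plain Arrow IPC stream; any extra framing that CubeSQL's Arrow Native protocol puts around it is not described in this PR and is not modeled here. `read_message_prefix` is a hypothetical helper written for illustration, not a function from the driver.

```python
import struct

# Continuation marker that opens every Arrow IPC encapsulated message.
CONTINUATION = 0xFFFFFFFF


def read_message_prefix(buf: bytes, offset: int = 0):
    """Return (metadata_size, next_offset) for the message prefix at `offset`.

    Hypothetical helper: it parses only the 8-byte prefix and does not
    decode the FlatBuffers metadata or the message body that follow.
    """
    if len(buf) - offset < 8:
        raise ValueError("need at least 8 bytes for the message prefix")
    # "<Ii" = little-endian uint32 marker, then little-endian int32 length.
    marker, metadata_size = struct.unpack_from("<Ii", buf, offset)
    if marker != CONTINUATION:
        raise ValueError(f"unexpected continuation marker: {marker:#x}")
    return metadata_size, offset + 8


def is_end_of_stream(metadata_size: int) -> bool:
    # A zero metadata length signals that no further messages follow.
    return metadata_size == 0
```

A real client would hand the whole stream to an Arrow library instead of parsing it by hand; the point here is only that the transport is length-prefixed binary rather than text.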

Architecture

  ┌─────────────────────────────────────────────────────────────────┐
  │                    COMPLETE ECOSYSTEM                           │
  ├─────────────────────────────────────────────────────────────────┤
  │                                                                 │
  │  Elixir Application (power_of_three)                            │
  │         │                                                       │
  │         ▼                                                       │
  │  livebook-dev/adbc (Elixir bindings)                            │
  │         │                                                       │
  │         ▼                                                       │
  │  ┌─────────────────────────────────┐                            │
  │  │  ADBC Cube Driver (THIS PR)     │                            │
  │  │  c/driver/cube/                 │                            │
  │  │  - native_client.cc             │                            │
  │  │  - native_protocol.cc           │                            │
  │  │  - arrow_reader.cc              │                            │
  │  └─────────────────────────────────┘                            │
  │         │                                                       │
  │         │ Arrow IPC (Binary, Port 8120)                         │
  │         ▼                                                       │
  │  ┌─────────────────────────────────┐                            │
  │  │  CubeSQL Arrow Native Server    │                            │
  │  │  (cube-js/cube PR)              │                            │
  │  └─────────────────────────────────┘                            │
  │         │                                                       │
  │         ▼                                                       │
  │  Cube API → CubeStore / Source DB                               │
  │                                                                 │
  └─────────────────────────────────────────────────────────────────┘

Performance

Query latency compared to the REST HTTP API:

  ┌────────────┬────────────────────┬──────────┬─────────┐
  │ Query Size │ ADBC (this driver) │ REST API │ Speedup │
  ├────────────┼────────────────────┼──────────┼─────────┤
  │ 200 rows   │ 42ms               │ 1414ms   │ 33x     │
  ├────────────┼────────────────────┼──────────┼─────────┤
  │ 2K rows    │ 2ms                │ 1576ms   │ 788x    │
  ├────────────┼────────────────────┼──────────┼─────────┤
  │ 20K rows   │ 8ms                │ 2133ms   │ 266x    │
  └────────────┴────────────────────┴──────────┴─────────┘
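The Speedup column is the REST latency divided by the ADBC latency, truncated to a whole number. A short check of that arithmetic, using only the figures from the table (the variable and function names are illustrative):

```python
# (query size, ADBC latency in ms, REST latency in ms), copied from the table.
measurements = [
    ("200 rows", 42, 1414),
    ("2K rows", 2, 1576),
    ("20K rows", 8, 2133),
]


def speedup(adbc_ms: float, rest_ms: float) -> float:
    """Ratio of REST latency to ADBC latency."""
    return rest_ms / adbc_ms


for label, adbc_ms, rest_ms in measurements:
    # int() truncates, which matches how the table reports the ratio.
    print(f"{label}: {int(speedup(adbc_ms, rest_ms))}x")
```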

Files Added

  Cube Driver (3rd_party/apache-arrow-adbc/c/driver/cube/):
  ├── cube.cc                 # Driver entry point
  ├── database.cc/h           # Connection management
  ├── connection.cc/h         # Session handling
  ├── statement.cc/h          # Query execution
  ├── native_client.cc/h      # TCP client for Arrow Native protocol
  ├── native_protocol.cc/h    # Wire protocol implementation
  ├── arrow_reader.cc/h       # Arrow IPC deserialization
  ├── cube_types.cc/h         # Type mapping (Arrow ↔ Cube)
  └── format/generated/       # FlatBuffers Arrow schema

  Elixir Tests (test/):
  ├── adbc_cube_basic_test.exs    # Basic connectivity and queries
  └── cube_preagg_benchmark.exs   # Performance benchmarks

  C++ Tests (tests/cpp/):
  ├── test_simple.cpp             # Basic driver tests
  ├── test_all_types.cpp          # Data type coverage
  ├── test_cube_integration.cpp   # End-to-end with CubeSQL
  └── test_error_handling.cpp     # Error scenarios

Usage

Elixir/Livebook:

  # Connect to CubeSQL Arrow Native server
  {:ok, db} = Adbc.Database.start_link(
    driver: :cube,
    uri: "cube://localhost:8120"
  )

  {:ok, conn} = Adbc.Connection.start_link(database: db)

  # Execute query - returns Arrow data directly
  {:ok, result} = Adbc.Connection.query(conn, """
    SELECT market_code, MEASURE(count) as total
    FROM orders
    GROUP BY market_code
  """)

  # Zero-copy conversion to Elixir
  rows = Adbc.Result.to_list(result)

Python:

  import adbc_driver_cube.dbapi as cube

  conn = cube.connect("cube://localhost:8120")
  cursor = conn.cursor()
  cursor.execute("SELECT * FROM orders LIMIT 1000")
  df = cursor.fetch_arrow_table().to_pandas()
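Both examples pass a `cube://host:port` URI. How the C++ driver parses it is not shown in this description, so the following is only a sketch of one plausible way to split such a URI into host and port. `parse_cube_uri` is a hypothetical helper, and falling back to port 8120 when the port is omitted is an assumption based on the Arrow Native server port mentioned above.

```python
from urllib.parse import urlsplit

# Assumption: default to the Arrow Native server port named in this PR.
DEFAULT_PORT = 8120


def parse_cube_uri(uri: str):
    """Split a cube:// URI into (host, port). Hypothetical helper."""
    parts = urlsplit(uri)
    if parts.scheme != "cube":
        raise ValueError(f"expected a cube:// URI, got scheme {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URI is missing a host")
    # urlsplit returns None for the port when the URI does not specify one.
    port = parts.port if parts.port is not None else DEFAULT_PORT
    return parts.hostname, port


host, port = parse_cube_uri("cube://localhost:8120")
print(host, port)
```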

Data Types Supported

  ┌────────────────┬───────────┬────────┐
  │   Arrow Type   │ Cube Type │ Status │
  ├────────────────┼───────────┼────────┤
  │ Int8/16/32/64  │ Integer   │ ✅     │
  ├────────────────┼───────────┼────────┤
  │ UInt8/16/32/64 │ Integer   │ ✅     │
  ├────────────────┼───────────┼────────┤
  │ Float32/64     │ Float     │ ✅     │
  ├────────────────┼───────────┼────────┤
  │ Decimal128     │ Decimal   │ ✅     │
  ├────────────────┼───────────┼────────┤
  │ Utf8/LargeUtf8 │ String    │ ✅     │
  ├────────────────┼───────────┼────────┤
  │ Timestamp      │ DateTime  │ ✅     │
  ├────────────────┼───────────┼────────┤
  │ Date32/64      │ Date      │ ✅     │
  ├────────────────┼───────────┼────────┤
  │ Boolean        │ Boolean   │ ✅     │
  ├────────────────┼───────────┼────────┤
  │ Binary         │ Binary    │ ✅     │
  └────────────────┴───────────┴────────┘
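The table above can be read as a simple lookup from Arrow type name to Cube type. The sketch below encodes exactly the rows listed and nothing more; the real mapping lives in cube_types.cc and may differ in detail (for example in how timestamp units or decimal precision are handled). The dictionary and function names are illustrative.

```python
# Arrow type name -> Cube type, transcribed from the table in this PR.
ARROW_TO_CUBE = {
    **{f"Int{bits}": "Integer" for bits in (8, 16, 32, 64)},
    **{f"UInt{bits}": "Integer" for bits in (8, 16, 32, 64)},
    "Float32": "Float",
    "Float64": "Float",
    "Decimal128": "Decimal",
    "Utf8": "String",
    "LargeUtf8": "String",
    "Timestamp": "DateTime",
    "Date32": "Date",
    "Date64": "Date",
    "Boolean": "Boolean",
    "Binary": "Binary",
}


def cube_type_for(arrow_type: str) -> str:
    """Return the Cube type for an Arrow type name from the table."""
    try:
        return ARROW_TO_CUBE[arrow_type]
    except KeyError:
        # Types outside the table are not claimed as supported here.
        raise KeyError(f"no Cube mapping listed for Arrow type {arrow_type!r}") from None
```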

Testing

  # Build the driver
  make

  # Run Elixir tests (requires CubeSQL running)
  mix test test/adbc_cube_basic_test.exs

  # Run C++ tests
  cd tests/cpp
  ./compile.sh
  ./run.sh

Related Projects

┌─────────────────────────┬──────────────────────────────┬───────────────────────────────────────────────────┐
│         Project         │             Role             │                       Link                        │
├─────────────────────────┼──────────────────────────────┼───────────────────────────────────────────────────┤
│ cube-js/cube            │ Arrow Native server          │ https://github.com/cube-js/cube/pull/10297        │
├─────────────────────────┼──────────────────────────────┼───────────────────────────────────────────────────┤
│ borodark/adbc           │ ADBC driver (this PR)        │ https://github.com/borodark/adbc/pull/2           │
├─────────────────────────┼──────────────────────────────┼───────────────────────────────────────────────────┤
│ borodark/power_of_three │ Elixir semantic layer client │ https://github.com/borodark/power_of_three/pull/5 │
├─────────────────────────┼──────────────────────────────┼───────────────────────────────────────────────────┤
│ livebook-dev/adbc       │ Upstream ADBC bindings       │ https://github.com/livebook-dev/adbc              │
└─────────────────────────┴──────────────────────────────┴───────────────────────────────────────────────────┘

Why ADBC?

ADBC is becoming the standard for high-performance database connectivity in the Arrow ecosystem:

  • Zero-copy data transfer - Arrow buffers passed directly to client
  • Language-agnostic - Same driver works for Python, R, Elixir, Julia
  • Growing ecosystem - Livebook, pandas, DuckDB all support ADBC
  • Future-proof - Apache Arrow project backing

Checklist

  • Driver compiles and links
  • All Arrow data types supported
  • Elixir tests pass
  • C++ tests pass
  • Works with CubeSQL Arrow Native server
  • MEASURE() syntax supported
  • Error handling implemented

borodark mentioned this pull request on Dec 29, 2025 (11 tasks)
borodark changed the title from "Cleanup take ii" to "ADBC Driver For Cube" on Dec 29, 2025
borodark changed the title from "ADBC Driver For Cube" to "ADBC Driver For Cube ADBC Server" on Jan 5, 2026
borodark changed the title from "ADBC Driver For Cube ADBC Server" to "ADBC Client For Cube ADBC Server" on Jan 8, 2026
borodark changed the title from "ADBC Client For Cube ADBC Server" to "feat: Add Cube ADBC Driver for CubeSQL" on Jan 9, 2026