
feat: Add Arrow Native (ADBC) Server Protocol #10297

Open
borodark wants to merge 105 commits into cube-js:master from
borodark:feature/arrow-ipc-api


@borodark commented Jan 8, 2026

Check List

  • Tests have been run in packages where changes have been made if available
  • Linter has been run for changed code
  • Tests for the changes have been added if not covered yet
  • Docs have been added / updated if required

Adds an Arrow Native server to CubeSQL that speaks the Arrow IPC protocol on port 8120, enabling 8-15x faster data transfer compared to the REST HTTP API.

Closes #10296

What this PR does

  • Arrow Native server on configurable port (default: 8120)
  • Binary Arrow IPC protocol - no JSON serialization overhead
  • Optional query result cache - additional 3-10x speedup on repeated queries
  • Works with any ADBC client - Python, Elixir, R, etc.

Architecture

  Client (Python/Elixir/R via ADBC)
           │
           ├─── REST HTTP (Port 4008) - existing
           │    └─> JSON serialization → Cube API
           │
           └─── Arrow Native (Port 8120) - NEW
                └─> Binary Arrow IPC
                     └─> Optional Results Cache
                          └─> Cube API
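
In the diagram, the optional results cache sits between the Arrow Native server and the Cube API. The following is a minimal sketch of that idea, not the code in cache.rs. It assumes LRU eviction, a per-entry time-to-live, and the raw SQL text as the cache key. The names `ResultsCache` and `run_query` are hypothetical.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class ResultsCache:
    """Illustrative LRU cache with a per-entry TTL (assumed policy)."""

    def __init__(self, max_entries: int = 1000, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: OrderedDict = OrderedDict()  # key -> (stored_at, value)

    def get(self, key: str) -> Optional[Any]:
        item = self._entries.get(key)
        if item is None:
            return None
        stored_at, value = item
        if self._clock() - stored_at >= self.ttl_seconds:
            del self._entries[key]  # entry expired: drop it and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used


def run_query(sql: str, cache: ResultsCache, execute: Callable[[str], Any]) -> Any:
    """Return a cached result if present, else call `execute` and store it."""
    cached = cache.get(sql)
    if cached is not None:
        return cached
    result = execute(sql)  # stands in for the round trip to the Cube API
    cache.put(sql, result)
    return result
```

A real cache would likely also need to key on the user or security context and to normalize the query text; both are left out here.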

Performance

  | Query Size | Arrow Native | REST API | Speedup |
  |------------|--------------|----------|---------|
  | 200 rows   | 42ms         | 1414ms   | 33x     |
  | 2K rows    | 2ms          | 1576ms   | 788x    |
  | 20K rows   | 8ms          | 2133ms   | 266x    |
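
The Speedup column can be reproduced from the two latency columns. The sketch below assumes the factor is the REST latency divided by the Arrow Native latency, truncated to a whole number; that rounding rule is inferred from the figures, not stated in the PR.

```python
# Latencies in milliseconds, copied from the table above:
# (query size, Arrow Native, REST API)
measurements = [
    ("200 rows", 42, 1414),
    ("2K rows", 2, 1576),
    ("20K rows", 8, 2133),
]


def speedup(arrow_ms: int, rest_ms: int) -> int:
    """REST latency over Arrow Native latency, truncated (assumed rounding)."""
    return rest_ms // arrow_ms


speedups = {label: speedup(arrow_ms, rest_ms)
            for label, arrow_ms, rest_ms in measurements}

for label, factor in speedups.items():
    print(f"{label}: {factor}x")
# 200 rows: 33x
# 2K rows: 788x
# 20K rows: 266x
```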

Configuration

  # Enable Arrow Native server (enabled by default when port is set)
  CUBEJS_ADBC_PORT=8120

  # Optional query result cache
  CUBESQL_ARROW_RESULTS_CACHE_ENABLED=true      # default: true
  CUBESQL_ARROW_RESULTS_CACHE_MAX_ENTRIES=1000  # default: 1000
  CUBESQL_ARROW_RESULTS_CACHE_TTL=3600          # default: 3600s
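
To show how these variables and their defaults fit together, here is a small sketch that reads them into a settings object. It is not the parsing code from env.ts or config/mod.rs. The `ArrowNativeSettings` and `load_settings` names are hypothetical, and treating only the string "true" as enabled is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ArrowNativeSettings:
    port: Optional[int]      # None means the Arrow Native server stays off
    cache_enabled: bool
    cache_max_entries: int
    cache_ttl_seconds: int


def load_settings(env: Mapping[str, str] = os.environ) -> ArrowNativeSettings:
    """Read the variables listed above, using the documented defaults."""
    raw_port = env.get("CUBEJS_ADBC_PORT")
    # Assumption: only "true" (case-insensitive) enables the cache.
    raw_enabled = env.get("CUBESQL_ARROW_RESULTS_CACHE_ENABLED", "true")
    return ArrowNativeSettings(
        port=int(raw_port) if raw_port else None,
        cache_enabled=raw_enabled.strip().lower() == "true",
        cache_max_entries=int(env.get("CUBESQL_ARROW_RESULTS_CACHE_MAX_ENTRIES", "1000")),
        cache_ttl_seconds=int(env.get("CUBESQL_ARROW_RESULTS_CACHE_TTL", "3600")),
    )


# Only the port is set, so the cache falls back to its defaults.
settings = load_settings({"CUBEJS_ADBC_PORT": "8120"})
print(settings)
```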

Files Changed

Core Implementation (rust/cubesql/cubesql/src/):

  • sql/arrow_native/server.rs - Arrow Native server
  • sql/arrow_native/protocol.rs - Wire protocol
  • sql/arrow_native/stream_writer.rs - Arrow IPC streaming
  • sql/arrow_native/cache.rs - Query result cache
  • config/mod.rs - Configuration and DI

Integration:

  • packages/cubejs-backend-shared/src/env.ts - Environment variables
  • packages/cubejs-server-core/ - Server initialization
  • docs/ - Environment variable documentation

Example (examples/recipes/arrow-ipc/):

  • Complete working example with Python tests
  • Sample data (3000 orders)
  • Performance benchmarks

Testing

  # Unit tests
  cd rust/cubesql
  cargo test arrow_native

  # Integration test with example
  cd examples/recipes/arrow-ipc
  docker-compose up -d postgres
  ./setup_test_data.sh
  ./start-cube-api.sh &
  ./start-cubesqld.sh &
  python test_arrow_native_performance.py

Ecosystem Compatibility

Tested with:

Checklist

  • Code compiles without warnings (cargo clippy)
  • Code is formatted (cargo fmt)
  • Unit tests pass (cargo test)
  • Example works end-to-end
  • Documentation updated
  • No breaking changes to existing APIs

@borodark requested review from a team as code owners January 8, 2026 18:54
@github-actions bot added the cube store, rust, javascript, python, and pr:community labels Jan 8, 2026
@igorlukanin self-assigned this Jan 9, 2026
@borodark force-pushed the feature/arrow-ipc-api branch 2 times, most recently from 14f9efa to 6f1c53f January 12, 2026 18:55
borodark and others added 29 commits January 13, 2026 17:47
PostgreSQL wire protocol (port 4444) was already working.
This PR specifically introduces:
- Arrow IPC native protocol (port 4445)
- Optional query result cache
Port 4444 (PostgreSQL wire protocol) was already there.
Port 4445 (Arrow IPC native) is what this PR introduces.
… MetaContext::new()

Upstream added a second parameter `pre_aggregations: Vec<PreAggregationMeta>`
to MetaContext::new() but the call in transport.rs wasn't updated.

This fix:
- Imports parse_pre_aggregations_from_cubes() function
- Extracts pre-aggregations from cube metadata before creating MetaContext
- Passes pre_aggregations as the 2nd parameter to MetaContext::new()

Matches the implementation in cubesql's cubestore_transport.rs and service.rs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@borodark force-pushed the feature/arrow-ipc-api branch from 6f1c53f to 6fe781d January 13, 2026 22:47
