feat(r/sedonadb): Add basic DataFrame API with sd_select(), sd_transmute(), and sd_filter()
#499
why do we need |
...probably a better example would be just |
Pull request overview
This PR implements basic DataFrame manipulation functions (sd_select(), sd_transmute(), and sd_filter()) for the SedonaDB R package, wrapping the expression translation system introduced in a previous PR. These functions provide a familiar dplyr-like API for column selection, transformation, and row filtering.
Changes:
- Added three new exported functions (sd_select(), sd_transmute(), sd_filter()) in R/dataframe.R with corresponding Rust implementations
- Updated documentation to consistently describe the .data parameter as "A sedonadb_dataframe or an object that can be coerced to one"
- Added comprehensive test coverage for the new DataFrame API functions
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| r/sedonadb/R/dataframe.R | Implements sd_select(), sd_transmute(), and sd_filter() functions with expression translation support |
| r/sedonadb/src/rust/src/dataframe.rs | Adds Rust methods select() and filter() to InternalDataFrame for expression-based operations |
| r/sedonadb/src/rust/src/expression.rs | Makes exprs() method public to support DataFrame operations |
| r/sedonadb/tests/testthat/test-dataframe.R | Adds test cases for the three new DataFrame functions |
| r/sedonadb/R/000-wrappers.R | Auto-generated wrapper functions for new Rust methods |
| r/sedonadb/src/rust/api.h | Auto-generated C API declarations |
| r/sedonadb/src/init.c | Auto-generated C initialization code |
| r/sedonadb/NAMESPACE | Exports new functions |
| r/sedonadb/man/*.Rd | Documentation files for new and updated functions |

This PR implements sd_select(), sd_transmute(), and sd_filter(), wrapping the expression translation implemented in #468. The supported expressions are still very minimal, but this establishes the first API we can expose in this way.

I chose to do this instead of just implementing dplyr::transmute() and dplyr::filter() because those functions have other arguments and perhaps the expectation of exact compatibility. The sd_...() versions have the added benefit of converting to a SedonaDB data frame for you, and usually it's a good idea for this to be explicit (particularly for now).

This doesn't support aggregate expressions in the arguments, which do work in SQL and in dplyr. I took a stab at translating the DataFusion assembler of SELECT statements and it does work, but it is a bit more complicated and needs more testing than I have time to put together right now ( https://gist.github.com/paleolimbot/de220c55c96e721a50a4752397f1cbf9 ).
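
For readers less familiar with the dplyr verbs these functions are modelled on, here is a minimal sketch of the select / transmute / filter semantics on an ordinary data frame. It uses only dplyr and does not call sedonadb; the data frame `df` and its columns are made up for illustration, and whether the sd_...() versions accept exactly these expressions is an assumption based on the description above, not something this snippet verifies.

```r
library(dplyr)

# Hypothetical example data (not from the PR)
df <- data.frame(x = 1:5, y = c(10, 20, 30, 40, 50))

# Column selection: keep only `x`
selected <- dplyr::select(df, x)

# Transformation: compute new columns and drop everything else
transmuted <- dplyr::transmute(df, z = x + 1)

# Row filtering: keep rows where the predicate is TRUE
filtered <- dplyr::filter(df, x > 2)
```

By analogy, one would expect sd_select(), sd_transmute(), and sd_filter() to take a data frame (or something coercible to a sedonadb_dataframe) as the first argument, followed by expressions of this kind, subject to the limited set of expressions currently translated.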
The next step is to add blanket support for all functions in the sedona-specific function registry so that we can do geo stuff.
Also, sd_join() would be particularly useful to expose the (arguably) most useful part of SedonaDB as an engine.

Created on 2026-01-23 with reprex v2.1.1