Skip to content

fix: replace docs, upcasting ints, sf geo encoding#7

Merged
max-romagnoli merged 10 commits intomasterfrom
max/tx-replace-fix
Mar 13, 2026
Merged

fix: replace docs, upcasting ints, sf geo encoding#7
max-romagnoli merged 10 commits intomasterfrom
max/tx-replace-fix

Conversation

@max-romagnoli
Copy link
Contributor

@max-romagnoli max-romagnoli commented Mar 11, 2026

  • Updated documentations for replace
  • Upcasting dataframe from int32 to int64 (R does not support int64)
  • Encoding sf geometries in WKT or WKB depending on schema

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the transaction/cursor workflow to support row-by-row replace/edit/insert patterns by adding a rows() materializer on cursors and allowing OdpTransaction$insert() to accept named lists (single-row “row dictionaries”).

Changes:

  • Add OdpCursor$rows() returning a list of per-row named lists.
  • Allow odp_to_arrow_table() (and thus tx$insert()) to accept named list input.
  • Update docs, examples, and tests around replace + row iteration; bump package version.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
R/cursor.R Adds rows() helper that materializes cursor results as row dictionaries.
R/utils.R Extends Arrow table coercion to accept named list input.
tests/testthat/test-cursor.R Adds tests for cursor$rows() behavior.
tests/testthat/test-table_tx.R Updates list input validation tests; adds replace + row-iteration workflow test.
scripts/example_quick_start.R Demonstrates replace + row iteration in a transaction.
man/OdpCursor.Rd Documents the new rows() method (but currently malformed Rd output).
man/OdpTransaction.Rd Updates insert docs and adds replace example.
R/r6-docs.R Roxygen source updates for cursor/transaction docs.
DESCRIPTION Version bump to 0.1.3.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +279 to +280

expect_true(TRUE)
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't assert any observable outcome beyond expect_true(TRUE), so it won't catch regressions (e.g., that tx$insert(row) + tx$commit() successfully flushes, or that the mock client saw the expected insert/commit calls). Consider adding an assertion that the loop produces no errors and/or verifying the fake client's call_log contains the expected request(s) after committing.

Suggested change
expect_true(TRUE)
tx$commit()
expect_true(length(client$call_log) >= 1L)

Copilot uses AI. Check for mistakes.
man/OdpCursor.Rd Outdated
Comment on lines +14 to +16
\item{$next_batch()$}{Return the next \code{RecordBatch} or \code{NULL} when finished.}
\item{$collect()\code{/}$arrow()\verb{\}\{Materialise unread batches as an Arrow Table.\} \\item\{$dataframe()}}{Materialise unread batches as a base \code{data.frame}.}
\item{$tibble()`}{Materialise unread batches as a tibble (optional
dependency).}
\item{$tibble()\verb{\}\{Materialise unread batches as a tibble (optional dependency).\} \\item\{$rows()}}{Materialise unread batches as a list of named lists (row dictionaries).}
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this generated Rd, the \describe{} / \item{} markup for method docs is malformed (e.g., the $collect()/$arrow() and $tibble()/$rows() entries appear to be merged and contain stray \verb{} / \\item sequences). This is likely to break R CMD check Rd parsing. Please fix the roxygen source in R/r6-docs.R for the OdpCursor Methods section (avoid backticks around $method() and ensure each method has its own \item{label}{description}), then re-run roxygen to regenerate this file.

Copilot uses AI. Check for mistakes.

\describe{
\item{$insert(data)$}{Validate and buffer a data frame. Auto-flushes on size thresholds.}
\item{$insert(data)$}{Validate and buffer a data frame, named list (single row), or Arrow Table. Auto-flushes on size thresholds.}
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs say named list input is a "single row", but the current conversion path accepts any named list and as.data.frame() will happily treat vector elements as full columns (potentially inserting multiple rows). Either enforce single-row named lists (e.g., all elements length 1) or adjust the wording here to reflect the actual accepted shape (named list-as-columns).

Suggested change
\item{$insert(data)$}{Validate and buffer a data frame, named list (single row), or Arrow Table. Auto-flushes on size thresholds.}
\item{$insert(data)$}{Validate and buffer a data frame, named list of columns (treated like a data frame), or Arrow Table. Auto-flushes on size thresholds.}

Copilot uses AI. Check for mistakes.
R/utils.R Outdated
arg
} else if (is.data.frame(arg)) {
arrow::Table$create(arg)
} else if (is.list(arg) && !is.null(names(arg))) {
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

odp_to_arrow_table() currently treats any named list as valid input as long as names(arg) is not NULL. That will also accept partially unnamed lists (e.g., list(a = 1, 2)) or empty-string names, which then become ambiguous/invalid column names after as.data.frame(). Consider tightening validation to require all names are present and non-empty (and, if the intent is a single-row "row dictionary", also validate that each element is length-1) before converting to an Arrow Table.

Suggested change
} else if (is.list(arg) && !is.null(names(arg))) {
} else if (is.list(arg) && !is.null(names(arg))) {
nms <- names(arg)
if (length(nms) != length(arg) || any(is.na(nms)) || any(!nzchar(nms))) {
cli::cli_abort("`arg` must be a fully named list with non-empty names to be converted to an Arrow Table")
}

Copilot uses AI. Check for mistakes.
if (request$operation == "replace") {
buf <- arrow::BufferOutputStream$create()
writer <- arrow::RecordBatchStreamWriter$create(buf, schema = mock_batch$schema)
writer$write(mock_batch)
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Arrow IPC writer is invoked with writer$write(mock_batch), but elsewhere in the repo Arrow stream writers use write_batch() / write_table(). Using a different method name here risks test failures across Arrow versions. Consider switching to the same writer API used in production code (write_batch(mock_batch) or equivalent) for consistency and compatibility.

Suggested change
writer$write(mock_batch)
writer$write_batch(mock_batch)

Copilot uses AI. Check for mistakes.
@max-romagnoli max-romagnoli changed the title fix: transaction replace fix: replace docs, upcasting ints, sf geo encoding Mar 13, 2026
@max-romagnoli max-romagnoli merged commit 8454e05 into master Mar 13, 2026
1 check passed
@max-romagnoli max-romagnoli deleted the max/tx-replace-fix branch March 13, 2026 09:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants