Add PolarsCursor for native Polars DataFrame support #636

Merged: laughingman7743 merged 4 commits into master from feature/polars-cursor on Jan 3, 2026
laughingman7743 commented on Jan 3, 2026

Summary

This PR introduces PolarsCursor and AsyncPolarsCursor for native Polars DataFrame support without requiring PyArrow as a mandatory dependency.

Key Features

  • PolarsCursor: Synchronous cursor that returns results as Polars DataFrames
  • AsyncPolarsCursor: Asynchronous cursor for non-blocking query execution
  • Support for both regular queries and UNLOAD operations
  • Proper type conversion for all Athena data types including decimal with precision/scale
  • CSV reading via fsspec with S3 support
  • Parquet reading via Polars native object_store
  • SQLAlchemy dialect support via AthenaPolarsDialect
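
The non-blocking behaviour of an asynchronous cursor can be pictured with a small standard-library sketch. It assumes the cursor hands back a query ID together with a future, as other PyAthena async cursors are understood to do. `SketchAsyncCursor`, its `_collect` helper, and the placeholder rows are hypothetical stand-ins for illustration, not the code added in this PR.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import List, Tuple


class SketchAsyncCursor:
    """Hypothetical stand-in for an async cursor (not PyAthena's class)."""

    def __init__(self, max_workers: int = 2) -> None:
        self._executor = ThreadPoolExecutor(max_workers=max_workers)

    def execute(self, operation: str) -> Tuple[str, "Future[List[tuple]]"]:
        # Return immediately with an ID and a future; the work runs in a thread.
        query_id = str(uuid.uuid4())
        future = self._executor.submit(self._collect, operation)
        return query_id, future

    def _collect(self, operation: str) -> List[tuple]:
        # Placeholder result; a real cursor would poll the query and load a DataFrame.
        return [(operation, 1)]

    def close(self) -> None:
        self._executor.shutdown(wait=True)


cursor = SketchAsyncCursor()
query_id, future = cursor.execute("SELECT 1")
rows = future.result()  # blocks only when the result is actually needed
cursor.close()
```

The caller is free to submit further queries between `execute()` and `future.result()`, which is the point of the asynchronous variant.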

Implementation Details

  • DefaultPolarsTypeConverter: Converts Athena types to Polars dtypes
  • DefaultPolarsUnloadTypeConverter: Minimal converter for UNLOAD operations
  • AthenaPolarsResultSet: Result set with Polars DataFrame conversion
  • Added get_dtype method to base Converter class for extensible type handling
  • Moved is_unload property to AthenaResultSet base class
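
As a rough illustration of the `get_dtype` idea, the sketch below shows a base converter with an overridable lookup and a subclass that adds precision and scale for decimal columns. Only the method name `get_dtype` and the decimal precision/scale handling come from this PR's description. The class names, the `get_dtype(type_, precision, scale)` signature, and the dtype labels (plain strings rather than real Polars dtypes) are assumptions made for illustration.

```python
from typing import Dict


class SketchConverter:
    """Hypothetical base converter: maps Athena type names to dtype labels."""

    def __init__(self, types: Dict[str, str], default: str = "String") -> None:
        self._types = types
        self._default = default

    def get_dtype(self, type_: str, precision: int = 0, scale: int = 0) -> str:
        # Unknown type names fall back to the default dtype label.
        return self._types.get(type_, self._default)


class SketchPolarsConverter(SketchConverter):
    """Hypothetical subclass that parameterizes decimal columns."""

    def __init__(self) -> None:
        super().__init__(
            {
                "boolean": "Boolean",
                "integer": "Int32",
                "bigint": "Int64",
                "double": "Float64",
                "varchar": "String",
            }
        )

    def get_dtype(self, type_: str, precision: int = 0, scale: int = 0) -> str:
        # Decimal needs the column's precision and scale, not just its name.
        if type_ == "decimal":
            return f"Decimal(precision={precision}, scale={scale})"
        return super().get_dtype(type_, precision, scale)


converter = SketchPolarsConverter()
print(converter.get_dtype("bigint"))
print(converter.get_dtype("decimal", precision=10, scale=2))
print(converter.get_dtype("some_unknown_type"))
```

Putting the lookup behind one method on the base class lets each result set (pandas, Arrow, Polars) ask its converter for a dtype instead of reaching into a mapping directly.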

Additional Improvements

  • Updated pandas and arrow result sets to use get_dtype method
  • Added as_polars() method to Arrow cursors for interoperability
  • Added tests for as_polars() method in Arrow cursors
  • Comprehensive documentation and tests

Future Work (separate PR)

  • Chunking support for large result sets (similar to pandas iter_chunks())
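
One possible shape for such chunking is sketched below over plain Python rows. This is an assumption about a possible design, not the planned implementation: the function name `iter_chunks` is borrowed from the pandas comparison above, and the `chunksize` parameter and tuple rows are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[object, ...]


def iter_chunks(rows: Iterable[Row], chunksize: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most `chunksize` rows without loading everything."""
    if chunksize <= 0:
        raise ValueError("chunksize must be positive")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


# A generator stands in for a large result set that is read lazily.
rows = ((i, f"name-{i}") for i in range(10))
chunk_sizes = [len(chunk) for chunk in iter_chunks(rows, chunksize=4)]
print(chunk_sizes)  # [4, 4, 2]
```

Each yielded chunk could then be turned into its own DataFrame, keeping peak memory proportional to `chunksize` rather than to the full result.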

Test plan

  • Run make chk - all quality checks pass
  • Run Polars cursor tests - all 69 tests pass
  • Run pandas cursor tests - all tests pass
  • Run arrow cursor tests - all tests pass

Closes #436

🤖 Generated with Claude Code

laughingman7743 and others added 4 commits January 3, 2026 16:43
This commit introduces PolarsCursor and AsyncPolarsCursor for native Polars
DataFrame support without requiring PyArrow as a mandatory dependency.

Key features:
- PolarsCursor: Synchronous cursor that returns results as Polars DataFrames
- AsyncPolarsCursor: Asynchronous cursor for non-blocking query execution
- Support for both regular queries and UNLOAD operations
- Proper type conversion for all Athena data types including decimal with precision/scale
- CSV reading via fsspec with S3 support
- Parquet reading via Polars native object_store

Implementation details:
- DefaultPolarsTypeConverter: Converts Athena types to Polars dtypes
- DefaultPolarsUnloadTypeConverter: Minimal converter for UNLOAD operations
- AthenaPolarsResultSet: Result set with Polars DataFrame conversion
- SQLAlchemy dialect support via AthenaPolarsDialect

Additional improvements:
- Added get_dtype method to base Converter class for extensible type handling
- Moved is_unload property to AthenaResultSet base class
- Updated pandas and arrow result sets to use get_dtype method
- Added as_polars() method to Arrow cursors for interoperability
- Comprehensive documentation and tests

Closes #436

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add comprehensive tests for the as_polars() method in ArrowCursor:
- test_as_polars: Basic single row test (with and without UNLOAD)
- test_many_as_polars: Many rows (10000) test (with and without UNLOAD)
- test_complex_as_polars: Complex data types test (standard mode)
- test_complex_unload_as_polars: Complex data types test (UNLOAD mode)
- Updated test_fetch_no_data to verify as_polars raises ProgrammingError
- Updated test_executemany_fetch to verify as_polars raises ProgrammingError

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add docstrings to AsyncPolarsCursor.get_default_converter() and
  arraysize property to match PolarsCursor documentation
- Change default type from "varchar" to "string" in polars/util.py
  to align with arrow/util.py for consistency

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add assertions to verify query_id is not None in tests where the
variable was previously unused, improving test coverage.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
laughingman7743 marked this pull request as ready for review January 3, 2026 08:18
laughingman7743 merged commit 5f75492 into master Jan 3, 2026
5 checks passed
laughingman7743 deleted the feature/polars-cursor branch January 3, 2026 08:40
Linked issues

Impl Polars cursor