4 changes: 3 additions & 1 deletion docs/api.rst
@@ -15,6 +15,7 @@ This section provides comprehensive API documentation for all PyAthena classes a
   api/s3fs
   api/spark
   api/converters
   api/sqlalchemy
   api/filesystem
   api/models
   api/utilities
@@ -38,9 +39,10 @@ Specialized Integrations
- :ref:`api_arrow` - Apache Arrow columnar data integration
- :ref:`api_s3fs` - Lightweight S3FS-based cursor (no pandas/pyarrow required)
- :ref:`api_spark` - Apache Spark integration for big data processing
- :ref:`api_sqlalchemy` - SQLAlchemy dialect implementations

Infrastructure
~~~~~~~~~~~~~~~

- :ref:`api_filesystem` - S3 filesystem integration and object management
- :ref:`api_models` - Athena query execution and metadata models
18 changes: 16 additions & 2 deletions docs/api/arrow.rst
@@ -3,7 +3,7 @@
Apache Arrow Integration
========================

This section covers Apache Arrow-specific cursors and data converters.
This section covers Apache Arrow-specific cursors, result sets, and data converters.

Arrow Cursors
-------------
@@ -16,11 +16,25 @@ Arrow Cursors
   :members:
   :inherited-members:

Arrow Result Set
----------------

.. autoclass:: pyathena.arrow.result_set.AthenaArrowResultSet
   :members:
   :inherited-members:

Arrow Data Converters
----------------------

.. autoclass:: pyathena.arrow.converter.DefaultArrowTypeConverter
   :members:

.. autoclass:: pyathena.arrow.converter.DefaultArrowUnloadTypeConverter
   :members:

Arrow Utilities
---------------

.. autofunction:: pyathena.arrow.util.to_column_info

.. autofunction:: pyathena.arrow.util.get_athena_type
16 changes: 15 additions & 1 deletion docs/api/connection.rst
@@ -35,4 +35,18 @@ Asynchronous Cursors

.. autoclass:: pyathena.async_cursor.AsyncDictCursor
   :members:
   :inherited-members:

Result Sets
-----------

.. autoclass:: pyathena.result_set.AthenaResultSet
   :members:
   :inherited-members:

.. autoclass:: pyathena.result_set.AthenaDictResultSet
   :members:
   :inherited-members:

.. autoclass:: pyathena.result_set.WithResultSet
   :members:
3 changes: 3 additions & 0 deletions docs/api/filesystem.rst
@@ -11,6 +11,9 @@ S3 FileSystem
.. autoclass:: pyathena.filesystem.s3.S3FileSystem
   :members:

.. autoclass:: pyathena.filesystem.s3.S3File
   :members:

S3 Objects
----------

33 changes: 31 additions & 2 deletions docs/api/pandas.rst
@@ -3,7 +3,7 @@
Pandas Integration
==================

This section covers pandas-specific cursors and data converters.
This section covers pandas-specific cursors, result sets, and data converters.

Pandas Cursors
--------------
@@ -16,11 +16,40 @@ Pandas Cursors
   :members:
   :inherited-members:

Pandas Result Set
-----------------

.. autoclass:: pyathena.pandas.result_set.AthenaPandasResultSet
   :members:
   :inherited-members:

.. autoclass:: pyathena.pandas.result_set.DataFrameIterator
   :members:

Pandas Data Converters
-----------------------

.. autoclass:: pyathena.pandas.converter.DefaultPandasTypeConverter
   :members:

.. autoclass:: pyathena.pandas.converter.DefaultPandasUnloadTypeConverter
   :members:

Pandas Utilities
----------------

.. autofunction:: pyathena.pandas.util.get_chunks

.. autofunction:: pyathena.pandas.util.reset_index

.. autofunction:: pyathena.pandas.util.as_pandas

.. autofunction:: pyathena.pandas.util.to_sql_type_mappings

.. autofunction:: pyathena.pandas.util.to_parquet

.. autofunction:: pyathena.pandas.util.to_sql

.. autofunction:: pyathena.pandas.util.get_column_names_and_types

.. autofunction:: pyathena.pandas.util.generate_ddl
63 changes: 63 additions & 0 deletions docs/api/sqlalchemy.rst
@@ -0,0 +1,63 @@
.. _api_sqlalchemy:

SQLAlchemy Integration
======================

This section covers SQLAlchemy dialect implementations for Amazon Athena.

Dialects
--------

.. autoclass:: pyathena.sqlalchemy.rest.AthenaRestDialect
   :members:
   :inherited-members:

.. autoclass:: pyathena.sqlalchemy.pandas.AthenaPandasDialect
   :members:
   :inherited-members:

.. autoclass:: pyathena.sqlalchemy.arrow.AthenaArrowDialect
   :members:
   :inherited-members:

Type System
-----------

.. autoclass:: pyathena.sqlalchemy.types.AthenaTimestamp
   :members:

.. autoclass:: pyathena.sqlalchemy.types.AthenaDate
   :members:

.. autoclass:: pyathena.sqlalchemy.types.Tinyint
   :members:

.. autoclass:: pyathena.sqlalchemy.types.AthenaStruct
   :members:

.. autoclass:: pyathena.sqlalchemy.types.AthenaMap
   :members:

.. autoclass:: pyathena.sqlalchemy.types.AthenaArray
   :members:

Compilers
---------

.. autoclass:: pyathena.sqlalchemy.compiler.AthenaTypeCompiler
   :members:

.. autoclass:: pyathena.sqlalchemy.compiler.AthenaStatementCompiler
   :members:

.. autoclass:: pyathena.sqlalchemy.compiler.AthenaDDLCompiler
   :members:

Identifier Preparers
--------------------

.. autoclass:: pyathena.sqlalchemy.preparer.AthenaDMLIdentifierPreparer
   :members:

.. autoclass:: pyathena.sqlalchemy.preparer.AthenaDDLIdentifierPreparer
   :members:
42 changes: 42 additions & 0 deletions pyathena/arrow/util.py
@@ -1,4 +1,11 @@
# -*- coding: utf-8 -*-
"""Utilities for converting PyArrow types to Athena metadata.

This module provides functions to convert PyArrow schema and type information
to Athena-compatible column metadata, enabling proper type mapping when
reading query results in Apache Arrow format.
"""

from __future__ import annotations

from typing import TYPE_CHECKING, Any, Dict, Tuple, cast
@@ -9,6 +16,22 @@


def to_column_info(schema: "Schema") -> Tuple[Dict[str, Any], ...]:
    """Convert a PyArrow schema to Athena column information.

    Iterates through all fields in the schema and converts each field's
    type information to an Athena-compatible column metadata dictionary.

    Args:
        schema: A PyArrow Schema object containing field definitions.

    Returns:
        A tuple of dictionaries, each containing column metadata with keys:
        - Name: The column name
        - Type: The Athena SQL type name
        - Precision: Numeric precision (0 for non-numeric types)
        - Scale: Numeric scale (0 for non-numeric types)
        - Nullable: Either "NULLABLE" or "NOT_NULL"
    """
    columns = []
    for field in schema:
        type_, precision, scale = get_athena_type(field.type)
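
The return shape documented for `to_column_info` above can be sketched without PyArrow. This is a minimal illustration, not PyAthena's implementation: `SketchField` and `sketch_to_column_info` are hypothetical names, each field's Athena type triple is assumed to be already resolved, and the precision values are illustrative only.

```python
from typing import Any, Dict, List, NamedTuple, Tuple


class SketchField(NamedTuple):
    """Hypothetical stand-in for a schema field; not a PyArrow class."""

    name: str
    athena_type: Tuple[str, int, int]  # (type_name, precision, scale), assumed pre-resolved
    nullable: bool


def sketch_to_column_info(fields: List[SketchField]) -> Tuple[Dict[str, Any], ...]:
    """Build one metadata dictionary per field, using the documented keys."""
    columns = []
    for field in fields:
        type_, precision, scale = field.athena_type
        columns.append(
            {
                "Name": field.name,
                "Type": type_,
                "Precision": precision,
                "Scale": scale,
                "Nullable": "NULLABLE" if field.nullable else "NOT_NULL",
            }
        )
    return tuple(columns)


# Illustrative input; the precision numbers are assumptions, not library output.
info = sketch_to_column_info(
    [
        SketchField("id", ("bigint", 19, 0), nullable=False),
        SketchField("price", ("decimal", 10, 2), nullable=True),
    ]
)
```

Returning a tuple rather than a list matches the annotated return type and keeps the column metadata immutable once built.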
@@ -25,6 +48,25 @@ def to_column_info(schema: "Schema") -> Tuple[Dict[str, Any], ...]:


def get_athena_type(type_: "DataType") -> Tuple[str, int, int]:
    """Map a PyArrow data type to an Athena SQL type.

    Converts PyArrow type identifiers to corresponding Athena SQL type names
    with appropriate precision and scale values. Handles all common Arrow
    types including numeric, string, binary, temporal, and complex types.

    Args:
        type_: A PyArrow DataType object to convert.

    Returns:
        A tuple of (type_name, precision, scale) where:
        - type_name: The Athena SQL type (e.g., "varchar", "bigint", "timestamp")
        - precision: The numeric precision or max length
        - scale: The numeric scale (decimal places)

    Note:
        Unknown types default to "string" with maximum varchar length.
        Decimal types preserve their original precision and scale.
    """
    import pyarrow.lib as types

    if type_.id in [types.Type_BOOL]:  # 1
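
A minimal sketch of the mapping contract described in the `get_athena_type` docstring above, assuming plain string type names in place of PyArrow type ids. `sketch_get_athena_type`, the lookup table, its precision values, and the use of `2147483647` for "maximum varchar length" are all assumptions made for illustration; the real function inspects `type_.id`.

```python
from typing import Dict, Tuple

# Assumed value for "maximum varchar length"; not confirmed by the diff above.
ASSUMED_MAX_VARCHAR = 2147483647

# Hypothetical lookup keyed by type-name strings (illustrative entries only).
_SKETCH_TYPES: Dict[str, Tuple[str, int, int]] = {
    "bool": ("boolean", 0, 0),
    "int64": ("bigint", 19, 0),
    "timestamp": ("timestamp", 3, 0),
}


def sketch_get_athena_type(
    arrow_type: str, precision: int = 0, scale: int = 0
) -> Tuple[str, int, int]:
    """Return (type_name, precision, scale) for a type-name string."""
    if arrow_type == "decimal":
        # Decimal types keep the precision and scale they were declared with.
        return ("decimal", precision, scale)
    # Unknown types fall back to "string" with the assumed maximum length.
    return _SKETCH_TYPES.get(arrow_type, ("string", ASSUMED_MAX_VARCHAR, 0))


known = sketch_get_athena_type("int64")
decimal = sketch_get_athena_type("decimal", precision=10, scale=2)
unknown = sketch_get_athena_type("some_extension_type")
```

The two behaviours called out in the docstring's note are the ones exercised here: the decimal passthrough and the `"string"` fallback for unrecognized types.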