Slow authentication with arrow cursor results in NETWORK_CONNECTION during HeadObject operation #609

@pnorman

Description

When running a query with the Arrow cursor, there is a default timeout of 3 seconds for S3 to respond. This is normally enough, but when a role is being assumed with STS it sometimes takes longer, and the error in #520 appears:

pyathena.error.OperationalError: When reading information for key 'tilelogs/d75a5f52-a89c-4095-a329-1d3be1d1b2d2.csv' in bucket 'openstreetmap-athena-results': AWS Error NETWORK_CONNECTION during HeadObject operation: curlCode: 28, Timeout was reached

The query itself doesn't matter; in this case it returns 0-1 rows. The problem is that Arrow (and hence ArrowCursor) has a default 3s timeout on the HeadObject operation, and authentication makes the operation take longer than that.

This can be seen with the AWS CLI, where heading the object takes up to 7s:

time AWS_REGION=eu-north-1 AWS_PROFILE=osm-service-logs aws s3api head-object --bucket openstreetmap-athena-results --key tilelogs/d75a5f52-a89c-4095-a329-1d3be1d1b2d2.csv
{
    "AcceptRanges": "bytes",
    "Expiration": "expiry-date=\"Thu, 30 Oct 2025 00:00:00 GMT\", rule-id=\"Lifecycle\"",
    "LastModified": "Wed, 15 Oct 2025 22:43:38 GMT",
    "ContentLength": 778,
    "ETag": "\"eb5d9149215475d4fbef7892bbbcdc32\"",
    "VersionId": "wAnbK.mUfCK94G3Q0Kt37kQ.IJcX4bKB",
    "ContentType": "application/octet-stream",
    "ServerSideEncryption": "AES256",
    "Metadata": {}
}

real    0m7.419s
user    0m0.551s
sys     0m0.140s

In this case the delay comes from the account setup: credentials authenticate into one account and then assume a role in another, which adds significant latency.

~/.aws/credentials contains:

[osm-main]
aws_access_key_id = A...
aws_secret_access_key = ...
[osm-service-logs]
role_arn = arn:aws:iam::...:role/OrganizationAccountAccessRole
source_profile = osm-main

This is repeatable for me, but whether it reproduces for someone else will depend on their location, their region, latency, and luck. My dev server is on the US west coast connecting to eu-north-1, so that may be related.
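To check whether a given setup is likely to be affected, the HeadObject latency can be measured from Python and compared against the 3s figure mentioned above. This is only a rough sketch: it assumes boto3 is installed, the helper names (`timed`, `exceeds_timeout`, `head_object_latency`) are made up for illustration, and the measurement deliberately includes session creation so that the STS assume-role step is counted.

```python
import time

# The 3 second default reported in this issue; adjust if your Arrow build differs.
ARROW_DEFAULT_TIMEOUT_S = 3.0


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def exceeds_timeout(elapsed, timeout=ARROW_DEFAULT_TIMEOUT_S):
    """True if a measured latency would have exceeded the timeout."""
    return elapsed > timeout


def head_object_latency(profile, region, bucket, key):
    """Time a HeadObject call, including credential resolution for the profile."""
    import boto3  # imported lazily so the helpers above work without boto3

    def _head():
        session = boto3.Session(profile_name=profile, region_name=region)
        return session.client("s3").head_object(Bucket=bucket, Key=key)

    _, elapsed = timed(_head)
    return elapsed


if __name__ == "__main__":
    elapsed = head_object_latency(
        profile="osm-service-logs",
        region="eu-north-1",
        bucket="openstreetmap-athena-results",
        key="tilelogs/d75a5f52-a89c-4095-a329-1d3be1d1b2d2.csv",
    )
    print(f"HeadObject took {elapsed:.2f}s; over timeout: {exceeds_timeout(elapsed)}")
```

A single measurement is noisy, so running it a few times gives a better picture of how often the timeout would be hit.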

Arrow added the ability to adjust the S3 filesystem (s3fs) timeouts in apache/arrow#13385, but this isn't exposed in pyathena. The relevant parameters are request_timeout and connect_timeout.
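For reference, this is roughly how those two parameters are passed when constructing Arrow's S3 filesystem directly with pyarrow. It is a sketch of the pyarrow side only, not of how pyathena builds its filesystem internally; the helper names and the 30s/10s values are arbitrary choices for illustration, and it assumes a pyarrow version recent enough to include the change above.

```python
def s3_timeout_kwargs(request_timeout=30.0, connect_timeout=10.0):
    """Build timeout keyword arguments (in seconds) for pyarrow.fs.S3FileSystem."""
    if request_timeout <= 0 or connect_timeout <= 0:
        raise ValueError("timeouts must be positive numbers of seconds")
    return {
        "request_timeout": float(request_timeout),
        "connect_timeout": float(connect_timeout),
    }


def make_s3_filesystem(region, **timeouts):
    """Create an S3FileSystem with longer timeouts than the defaults."""
    from pyarrow import fs  # imported lazily; requires pyarrow to be installed

    return fs.S3FileSystem(region=region, **s3_timeout_kwargs(**timeouts))


if __name__ == "__main__":
    # Credentials come from the usual AWS environment, e.g. AWS_PROFILE.
    s3 = make_s3_filesystem("eu-north-1", request_timeout=30, connect_timeout=10)
    info = s3.get_file_info(
        "openstreetmap-athena-results/"
        "tilelogs/d75a5f52-a89c-4095-a329-1d3be1d1b2d2.csv"
    )
    print(info)
```

If pyathena accepted these two values and forwarded them to the filesystem it creates, the slow STS case could be handled without switching cursors.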

Repeatable version of #520

My workaround was to switch to Pandas.
