Skip to content

python: add more precise typehints for immutable structs#3321

Merged
Jens-G merged 1 commit intoapache:masterfrom
markjm:markjm/more-hints
Feb 25, 2026
Merged

python: add more precise typehints for immutable structs#3321
Jens-G merged 1 commit intoapache:masterfrom
markjm:markjm/more-hints

Conversation

@markjm
Copy link
Contributor

@markjm markjm commented Feb 25, 2026

Description

This change extends the type-hinted codegen to work more nicely with frozen structs. We directly use the same logic in member_hint which adds type annotations to the __init__ to provide class-level annotations.

Using the tutorial structs:

Before

class InvalidOperation(TFrozenExceptionBase):
    """
    Structs can also be exceptions, if they are nasty.

    Attributes:
     - whatOp
     - why

    """
    thrift_spec: typing.Any = None

    def __init__(self, whatOp: typing.Optional[int] = None, why: typing.Optional[str] = None,):
        super(InvalidOperation, self).__setattr__('whatOp', whatOp)
        super(InvalidOperation, self).__setattr__('why', why)

After

class InvalidOperation(TFrozenExceptionBase):
    """
    Structs can also be exceptions, if they are nasty.

    Attributes:
     - whatOp
     - why

    """
    thrift_spec: typing.Any = None

    whatOp: typing.Optional[int]
    why: typing.Optional[str]


    def __init__(self, whatOp: typing.Optional[int] = None, why: typing.Optional[str] = None,):
        super(InvalidOperation, self).__setattr__('whatOp', whatOp)
        super(InvalidOperation, self).__setattr__('why', why)

Diff

+   whatOp: typing.Optional[int]
+   why: typing.Optional[str]

Why

Because we use super(...).__setattr__ in frozen structs to initially set the relevant attributes, typecheckers can't see or understand what attributes exist on the struct when typechecking.

We the below examples:

ty playground

Invalid override of method `__hash__`: Definition is incompatible with `TFrozenBase.__hash__` (invalid-method-override) [Ln 119, Col 9]
Object of type `Self@__hash__` has no attribute `httpStatus` (unresolved-attribute) [Ln 122, Col 17]
Object of type `Self@__hash__` has no attribute `message` (unresolved-attribute) [Ln 123, Col 17]
Object of type `Self@__hash__` has no attribute `errorCode` (unresolved-attribute) [Ln 124, Col 17]
Object of type `Self@__hash__` has no attribute `method` (unresolved-attribute) [Ln 125, Col 17]
Object of type `Self@__hash__` has no attribute `info` (unresolved-attribute) [Ln 126, Col 17]
Object of type `FrozenException` has no attribute `errorCode` (unresolved-attribute) [Ln 138, Col 15]

Note: ty currently has the worst behaviour because it does not understand attributes in __slots__ yet. I suspect behaviour would be in line after astral-sh/ruff#20996

pyright playground

Note: here, pyright doesn't throw an error, but if you hove over e.errorCode you will see that the type is Unbound instead of the expected Optional[int]

mypy playground

main.py:138: error: "FrozenException" has no attribute "errorCode"  [attr-defined]
Found 1 error in 1 file (checked 1 source file)

In each of those examples, un-comment out the type annotations and see the errors go away (and/or errorCode be properly typed as int | None

    httpStatus: Optional[int]
    message: str
    errorCode: Optional[int]
    method: str
    info: Optional[Any]

Slots

I believe we use __slots__ here for the memory saving of not carrying a __dict__ around on each object. The below script demonstrates that these annotations do not increase the size of each created thrift object. Forward Warning: the PR and description were not generated using AI tools, but this validation script was:

"""Prove that adding class-level type annotations to a __slots__ class
does not increase instance memory usage."""

import sys
from typing import Any, Optional


class WithoutAnnotations:
    __slots__ = ('httpStatus', 'message', 'errorCode', 'method', 'info')

    def __init__(
        self,
        httpStatus=None,
        message="",
        errorCode=None,
        method="",
        info=None,
    ):
        object.__setattr__(self, 'httpStatus', httpStatus)
        object.__setattr__(self, 'message', message)
        object.__setattr__(self, 'errorCode', errorCode)
        object.__setattr__(self, 'method', method)
        object.__setattr__(self, 'info', info)


class WithAnnotations:
    httpStatus: Optional[int]
    message: str
    errorCode: Optional[int]
    method: str
    info: Optional[Any]

    __slots__ = ('httpStatus', 'message', 'errorCode', 'method', 'info')

    def __init__(
        self,
        httpStatus: Optional[int] = None,
        message: str = "",
        errorCode: Optional[int] = None,
        method: str = "",
        info: Optional[Any] = None,
    ):
        object.__setattr__(self, 'httpStatus', httpStatus)
        object.__setattr__(self, 'message', message)
        object.__setattr__(self, 'errorCode', errorCode)
        object.__setattr__(self, 'method', method)
        object.__setattr__(self, 'info', info)


def main():
    a = WithoutAnnotations(httpStatus=400, message="Bad Request", errorCode=1000, method="GET", info={"key": "value"})
    b = WithAnnotations(httpStatus=400, message="Bad Request", errorCode=1000, method="GET", info={"key": "value"})

    size_a = sys.getsizeof(a)
    size_b = sys.getsizeof(b)

    print(f"Instance size WITHOUT annotations: {size_a} bytes")
    print(f"Instance size WITH    annotations: {size_b} bytes")
    print(f"Difference:                        {size_b - size_a} bytes")
    print()

    assert not hasattr(a, '__dict__'), "WithoutAnnotations should not have __dict__"
    assert not hasattr(b, '__dict__'), "WithAnnotations should not have __dict__"
    print("Neither class has __dict__ (slots still prevent it)")

    assert '__annotations__' not in b.__slots__, "Annotations are not in __slots__"
    assert '__annotations__' not in a.__slots__, "Annotations are not in __slots__"
    assert '__annotations__' not in b.__class__.__dict__.get('__dict__', {}), "No per-instance __annotations__"
    assert '__annotations__' in WithAnnotations.__dict__, "Annotations live on the class object"
    assert '__annotations__' not in WithoutAnnotations.__dict__, "Unannotated class has no __annotations__"
    print("Annotations live on the class, not on instances")
    print(f"  WithAnnotations.__annotations__ = {WithAnnotations.__annotations__}")
    print()

    N = 100_000
    objs_a = [
        WithoutAnnotations(httpStatus=i, message=f"msg{i}", errorCode=i, method="GET", info=None)
        for i in range(N)
    ]
    objs_b = [
        WithAnnotations(httpStatus=i, message=f"msg{i}", errorCode=i, method="GET", info=None)
        for i in range(N)
    ]

    total_a = sum(sys.getsizeof(o) for o in objs_a)
    total_b = sum(sys.getsizeof(o) for o in objs_b)

    print(f"Total for {N:,} instances WITHOUT annotations: {total_a:,} bytes")
    print(f"Total for {N:,} instances WITH    annotations: {total_b:,} bytes")
    print(f"Difference:                                    {total_b - total_a:,} bytes")


if __name__ == "__main__":
    main()

Administravia

I have gone ahead and requested a JIRA account to submit this as a ticket, but IDK if yall would consider this to be trivial or non-trivial so posting ahead of time

  • If a ticket exists: Does your pull request title follow the pattern "THRIFT-NNNN: describe my issue"?
  • Did you squash your changes to a single commit? (not required, but preferred)
  • Did you do your best to avoid breaking changes? If one was needed, did you label the Jira ticket with "Breaking-Change"?
  • If your change does not involve any code, include [skip ci] anywhere in the commit message to free up build resources.

@mergeable mergeable bot added the compiler label Feb 25, 2026
@fishy fishy added the python label Feb 25, 2026
@Jens-G Jens-G merged commit b48ba36 into apache:master Feb 25, 2026
79 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants