Add PythonTypeSpec subclasses for recognised Python types#669

Merged
tonybaloney merged 21 commits into tonybaloney:main from atifaziz:pytypespecs
Sep 15, 2025

Collaborator

@atifaziz atifaziz commented Sep 4, 2025

This PR represents a significant refactoring of the Python AST in CSnakes, transforming the loosely-typed, string-based PythonTypeSpec class into a comprehensive, strongly-typed hierarchy of record types. This change enhances type safety, improves performance through better caching, and provides a more maintainable foundation for Python-to-C# type mapping.

Some of this work will also help with adding type annotation support (see PR #584).

Design Changes

Core Design Transformation

The new design introduces a hierarchy of immutable record types, each representing a recognised Python type:

// Base type
public record PythonTypeSpec(string Name)

// Primitive types (singletons)
public sealed record IntType : PythonTypeSpec
public sealed record StrType : PythonTypeSpec
public sealed record FloatType : PythonTypeSpec
// ... etc

// Generic types with specific type parameters
public sealed record ListType(PythonTypeSpec Of) : ClosedGenericType("list")
public sealed record DictType(PythonTypeSpec Key, PythonTypeSpec Value) : ClosedGenericType("dict")
public sealed record OptionalType(PythonTypeSpec Of) : ClosedGenericType("Optional")
// ... etc

Type System Enhancements

1. Singleton Pattern for Primitive Types

Built-in Python types like int, str, bool, etc., are now implemented as singletons, reducing memory allocation and improving equality comparisons:

public static readonly IntType Int = IntType.Instance;
public static readonly StrType Str = StrType.Instance;
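
The effect of the singleton pattern can be sketched in Python. This is only an illustration of the idea: the PR itself is C#, and SingletonTypeSpec and the class layout below are hypothetical names, not CSnakes code.

```python
class PythonTypeSpec:
    """Illustrative stand-in for the C# base record; not CSnakes code."""

    name: str = ""


class SingletonTypeSpec(PythonTypeSpec):
    """Hypothetical helper: each subclass has exactly one shared instance."""

    def __new__(cls):
        # Look only at this class's own namespace so that a subclass never
        # reuses an instance that was created for a parent class.
        if "_instance" not in cls.__dict__:
            cls._instance = super().__new__(cls)
        return cls._instance


class IntType(SingletonTypeSpec):
    name = "int"


class StrType(SingletonTypeSpec):
    name = "str"


# Every construction yields the same object, so an identity check is enough
# to compare two primitive type specs and no extra allocation takes place.
assert IntType() is IntType()
assert IntType() is not StrType()
```

Because there is only ever one IntType object in this sketch, comparing two primitive specs reduces to a reference comparison.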

2. Strongly-Typed Generic Types

Complex types now have strongly-typed properties instead of generic argument arrays:

// Before: typeSpec.Arguments[0] and typeSpec.Arguments[1]
// After: dictType.Key and dictType.Value

Likewise, LiteralType accepts only an array of PythonConstant, which should help with #24 and fixes test expectations in which Literal[10] was seen as Literal[Literal]:

[InlineData("temp: Literal[10]", "temp", "Literal[Literal]")]
[InlineData("value: Literal[1, 2, 3]", "value", "Literal[Literal]")]
[InlineData("value: Literal[1, 'two', 3.0]", "value", "Literal[Literal]")]
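
For comparison, Python's own typing module also treats the arguments of Literal as constant values rather than types. The check below uses only the standard library and is a Python-side illustration, not CSnakes code:

```python
from typing import Literal, get_args, get_origin

# The same annotation as in the last test case above.
annotation = Literal[1, "two", 3.0]

# The origin is Literal itself; the arguments are the constants, with their
# Python value types preserved.
origin = get_origin(annotation)
args = get_args(annotation)
kinds = [type(value).__name__ for value in args]

assert origin is Literal
assert args == (1, "two", 3.0)
assert kinds == ["int", "str", "float"]
```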

3. Advanced Union Type Normalization

The UnionType.Normalize method implements sophisticated logic to simplify union types:

  • Union[int, None] → Optional[int]
  • Union[int] → int
  • Deduplication of identical types
  • Flattening of nested unions

This functionality existed already but is now housed in the new UnionType.
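
The four rules listed above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the C# UnionType.Normalize implementation: Prim, OptionalSpec, UnionSpec and normalize are hypothetical names, and how the real method treats a union of three or more members that includes None is not stated in this PR, so the sketch simply keeps it as a union.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prim:
    """Hypothetical primitive type spec, identified by name."""
    name: str


@dataclass(frozen=True)
class OptionalSpec:
    of: object


@dataclass(frozen=True)
class UnionSpec:
    choices: tuple


NONE = Prim("None")


def normalize(choices):
    """Simplify a sequence of union members using the four rules above."""
    flat = []

    def visit(spec):
        if isinstance(spec, UnionSpec):
            # Flatten nested unions by visiting their members in place.
            for inner in spec.choices:
                visit(inner)
        elif spec not in flat:
            # Deduplicate while keeping first-seen order.
            flat.append(spec)

    for choice in choices:
        visit(choice)

    if not flat:
        raise ValueError("a union needs at least one member")
    if len(flat) == 1:
        # Union[int] -> int
        return flat[0]
    if len(flat) == 2 and NONE in flat:
        # Union[int, None] -> Optional[int]
        (other,) = [spec for spec in flat if spec != NONE]
        return OptionalSpec(other)
    return UnionSpec(tuple(flat))
```

With INT = Prim("int"), calling normalize([INT, NONE]) returns OptionalSpec(INT) and normalize([INT]) returns INT itself.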

4. ValueArray Collection Type

Introduced a new ValueArray<T> record struct that provides:

  • Value semantics for collections
  • Proper equality comparisons
  • Collection builder support
  • Immutable array backing with default handling

Technical Implementation Details

Parser Enhancements

The type definition parser has been refactored considerably to recognise the new type hierarchy early during parsing:

  • Expanded Type Recognition:
    • collections.abc.Sequence
    • collections.abc.Mapping
    • collections.abc.Generator
    • collections.abc.Coroutine
    • typing.Union
    • typing.Literal
    • Callable or typing.Callable or collections.abc.Callable
    • Variadic tuples like tuple[int, ...], including tuple to mean tuple[Any, ...]
  • Improved Subscript Parsing: Strict parsing of 1, 2, and 3-parameter generic types
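
To make the annotation shapes in the list above concrete, the following Python snippet inspects a few of them with the standard typing helpers. It shows how Python itself sees these annotations at runtime (Python 3.9 or later); it does not describe how the CSnakes parser is written, and describe is a hypothetical helper name.

```python
import collections.abc
from typing import Callable, get_args, get_origin


def describe(annotation):
    """Return the (origin, type arguments) pair of a subscripted annotation."""
    return get_origin(annotation), get_args(annotation)


# One- and two-parameter generics from collections.abc.
one_param = describe(collections.abc.Sequence[int])
two_params = describe(collections.abc.Mapping[str, int])

# A variadic tuple: the second argument is the Ellipsis object.
variadic = describe(tuple[int, ...])

# typing.Callable reports collections.abc.Callable as its origin, with the
# parameter list and the return type as its two arguments.
callable_ = describe(Callable[[int, str], bool])

assert variadic == (tuple, (int, Ellipsis))
```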

Code Generation Updates

All code generation components have been updated to work with the new type system:

  • Type Reflection: Pattern matching on specific type records instead of string comparisons
  • Method Reflection: Cleaner handling of async/coroutine types and optional parameters
  • Result Conversion: More efficient type-specific conversion logic

Performance Improvements

  • Memory Efficiency: Singleton primitive types eliminate duplicate allocations

Testing Coverage

The PR includes comprehensive test coverage with 399 new test cases covering:

  • Primitive Type Parsing: All built-in Python types
  • Generic Type Parsing: Lists, dictionaries, tuples, generators, coroutines
  • Union Type Normalization: Complex union simplification scenarios
  • Error Cases: Proper error messages for malformed type definitions
  • Nested Generics: Complex nested type structures
  • Literal Types: Support for literal value types

Benefits

Type Safety

  • Compile-time verification of type operations
  • Eliminates runtime type checking errors
  • IntelliSense support for type-specific properties (like looking up references for code navigation)

Maintainability

  • Clear separation of concerns between different type categories
  • Potentially easier to add new Python type support
  • Pattern matching on types rather than strings gives cleaner code and fewer chances of making typos
  • Aliases for types like typing.Optional and Optional are in one place (the parser) instead of being spread out

Extensibility

  • Easy to add new type-specific behaviors
  • Interface-based design for common type categories (ISequenceType, IMappingType)
  • Clean foundation for future Python typing features

Conflicts resolved:

- src/CSnakes.SourceGeneration/Reflection/ArgumentReflection.cs
.Subscript()
select new PythonTypeSpec(n, [..callable.Parameters.Append(callable.Return)]),
"Literal" =>
typeDefinitionParser.ManyDelimitedBy(comma)
Owner

I know this is already an issue, but this fails the parser:

P = ParamSpec("P")
T_co = TypeVar("T_co", covariant=True)


def my_decorator(func: Callable[P, T_co]) -> Callable[P, T_co]:

Collaborator Author

@atifaziz atifaziz Sep 15, 2025

Does this need to be resolved in this PR if it was already an issue?

I don't think a decorator function would be a target for the source generator. We already ignore functions starting with an underscore, but for such a case where a decorator is part of a module and name mangling isn't desired, I can imagine supporting something along the lines of # type: ignore comments that would act as a directive for the source generator. Like so?

P = ParamSpec("P")
T_co = TypeVar("T_co", covariant=True)


def my_decorator(func: Callable[P, T_co]) -> Callable[P, T_co]:  # csharp: ignore

That said, one could support Callable where the first type argument is not a subscript and map it to ParsedPythonTypeSpec, but that would only technically achieve something: it would generate a function on the C# end that would never be desired or used.

Owner

It doesn't need to be fixed in this PR, but it's worth noting in the comment that there are circumstances when it's not another subscripted type (e.g. a type alias).

Collaborator Author

Note added with 963029c.

"I can imagine supporting something along the lines of # type: ignore comments that would act as a directive for the source generator."

Tracking in #684.

Owner

@tonybaloney tonybaloney left a comment

This is a great improvement and removes a lot of the brittleness in the parser/source generator logic.

Approved in principle once you're finished

@atifaziz atifaziz changed the title from "🚧 Add PythonTypeSpec subclasses for recognised Python types" to "Add PythonTypeSpec subclasses for recognised Python types" Sep 15, 2025
@atifaziz atifaziz marked this pull request as ready for review September 15, 2025 15:08
@tonybaloney tonybaloney merged commit b352075 into tonybaloney:main Sep 15, 2025
52 checks passed