Skip to content

🚧 Initial safer & stronger API for buffers#505

Draft
atifaziz wants to merge 9 commits intotonybaloney:mainfrom
atifaziz:rebuffer
Draft

🚧 Initial safer & stronger API for buffers#505
atifaziz wants to merge 9 commits intotonybaloney:mainfrom
atifaziz:rebuffer

Conversation

@atifaziz
Copy link
Copy Markdown
Collaborator

@atifaziz atifaziz commented Jun 8, 2025

This PR is a proposal to address and close #504. It refactors the buffer API to avoid critical memory safety vulnerabilities that can easily lead to System.AccessViolationException crashes when juggling the lifetime management rules of native memory backing Python buffer objects, and spans that may be pointing into them. This is done by making most of the API use safe patterns where spans are provided as input to functions as opposed to being handed out, e.g.:

public interface IPyArrayBuffer<T> : IPyBuffer<T>, IMemoryOwner<T> where T : unmanaged
{
    T this[int index]

    public TResult Map<TResult>(SpanFunc<T, TResult> function);

    public TResult Map<TArg, TResult>(TArg arg, SpanFunc<T, TArg, TResult> function);

    public void Do<TArg>(TArg arg, SpanAction<T, TArg> action);

    void CopyFrom(scoped ReadOnlySpan<T> source);
    void CopyTo(scoped Span<T> destination);

    /// <summary>
    /// Gets the memory directly underlying the buffer, which is tied to the lifetime of the buffer.
    /// <em>Usage after disposing the buffer will lead to corruption and crashes</em>.
    /// </summary>
    /// <remarks>
    /// See <see
    /// href="https://learn.microsoft.com/en-us/dotnet/standard/memory-and-spans/memory-t-usage-guidelines"><see
    /// cref="Memory{T}"/> and <see cref="Span{T}"/> usage guidelines</see> for more information.
    /// </remarks>
    Memory<T> UnsafeMemory { get; }
}

The only unsafe access is provided by UnsafeMemory, which returns a Memory<T> (as being discussed in #471). Ideally, this would be marked unsafe, but that's ineffective for interface members that don't expose unsafe types in their signatures (like a pointer).

// Before: Dangerous pattern that caused crashes
Span<byte> GetBufferSpan(byte fill)
{
    using var buffer = test.Test(fill: fill);  // ⬅️ Disposed here
    return buffer.AsSpan<byte>();              // ❌ Dangling pointer
}

// After: Type-safe access with explicit lifetime management
if (buffer is IPyArrayBuffer<byte> typedBuffer)
{
    // Safe: Access within buffer lifetime
    var result = typedBuffer.Map(span => span.IndexOfAnyExcept(fill));
    // or use IMemoryOwner<T> pattern for explicit lifetime control
}

I've made the API more type-safe:

  • Before, there was a single PyBuffer class with run-time type checking and unsafe casting
  • Now there's a generic type-safe hierarchy with:
    • IPyBuffer<T> representing a shape-less base for some buffer of T items
    • IPyArrayBuffer<T> for scalars and 1D arrays
    • IPyArray2DBuffer<T> for 2D arrays
    • PyTensorBuffer<T> for tensors

This eliminates run-time type mismatches and provides compile-time safety. Developers can now rely on standard C# patterns like:

// Type casting
var byteBuffer = (IPyArrayBuffer<byte>)buffer;

// Safe casting with 'as'
if (buffer is IPyArray2DBuffer<float> floatBuffer2D)
{
    // Work with 2D float buffer
}

// Pattern matching
switch (buffer)
{
    case IPyArrayBuffer<int> intBuffer: /* handle 1D int array */
    case IPyArray2DBuffer<double> doubleBuffer2D: /* handle 2D double array */
}

It also improves performance as all checks and creation of the right buffer sub-type is done once. The validation also lies with each sub-type where it makes sense.

Other Notes

  • Buffers now properly handle disposal in both managed and finalizer contexts. Like in the case of PyObject, added buffer-specific disposal queue that safely handles cleanup when the GIL isn't available (see also Add PyBuffer finalizer to prevent potential leaks #544).
  • Deprecated all existing AsSpan<T>() family of methods to discourage unsafe usage patterns.
  • This proposal introduces breaking changes.

Pending Decisions & Work

This is a draft PR to summon early feedback.

  • Decide on IPyBuffer sub-type/hierarchical API
  • Decide on interfaces with default methods vs classes for buffer types
  • Decide on safe API in addition to Map, Do, CopyTo, CopyFrom, etc. (including overloads)
  • Decide on unsafe API
  • Add benchmarks for performance impact of using a safe indexer?
  • Update documentation (buffers.md)

atifaziz added 8 commits June 18, 2025 20:05
Conflicts resolved:

- src/CSnakes.Runtime/Python/PyBuffer.cs
- src/Integration.Tests/BufferTests.cs
Conflicts resolved:

- src/CSnakes.Runtime/PyObjectTypeConverter.cs
- src/CSnakes.Runtime/Python/PyBuffer.cs
- src/CSnakes.Runtime/Python/PyBufferExtensions.cs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Buffer API is fundamentally unsafe

1 participant