
Add deleter support to torch::stable::from_blob #173371

Closed

NicolasHug wants to merge 8 commits into pytorch:main from NicolasHug:from_blob_deleter

Conversation

NicolasHug (Member) commented Jan 26, 2026

This PR adds deleter support to torch::stable::from_blob by adding a new aoti_torch_create_tensor_from_blob_v3. We need it to cleanly port TorchCodec to the stable ABI in meta-pytorch/torchcodec#1188

There's a bit of scaffolding, especially for the tests, where I had to create a new test/cpp_extensions/libtorch_agn_2_11_extension folder. Most of it is copy/pasted from test/cpp_extensions/libtorch_agn_2_10_extension/.
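For context, here is a minimal sketch of what the new overload enables, modeled on the full example further down in this thread (the function name and exact parameter order are illustrative, not a normative signature):

#include <torch/csrc/stable/device.h>
#include <torch/csrc/stable/ops.h>
#include <torch/csrc/stable/tensor.h>

#include <cstdint>
#include <cstdlib>

void make_owning_tensor() {
    const size_t num_elements = 1024;
    float* data = static_cast<float*>(malloc(num_elements * sizeof(float)));
    int64_t sizes[] = {static_cast<int64_t>(num_elements)};
    int64_t strides[] = {1};

    // The tensor takes ownership of `data`: the deleter (here plain `free`)
    // is invoked when the tensor's storage is destroyed.
    torch::stable::Tensor tensor = torch::stable::from_blob(
        data,
        torch::headeronly::IntHeaderOnlyArrayRef(sizes, 1),
        torch::headeronly::IntHeaderOnlyArrayRef(strides, 1),
        torch::stable::Device(torch::headeronly::DeviceType::CPU),
        torch::headeronly::ScalarType::Float,
        free /* deleter */);
}

Without the deleter argument, the caller stays responsible for freeing the buffer, which is exactly the leak the snippet later in this thread demonstrates.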

pytorch-bot commented Jan 26, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/173371

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 1 Pending

As of commit d499fac with merge base 1a7a87d:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions bot commented

Attention! One of the PyTorch C-stable API files was changed

You MUST NOT change existing function declarations in this file, as this header defines a stable C ABI. If you need to change the signature of a function, introduce a new v2 version of the function and modify code generation to target the new version.
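A purely illustrative sketch of that versioning convention follows; the function names and parameters below are hypothetical and are not the actual shim API:

// Hypothetical illustration of the "never change a shipped declaration,
// add a versioned entry point instead" rule for a stable C ABI header.
#include <cstdint>

extern "C" {

// Original declaration: must stay exactly as shipped.
int my_shim_create_thing(void* data, int64_t len, void** out);

// New capability (e.g. a deleter callback) goes into a new versioned
// function instead of changing the signature above.
int my_shim_create_thing_v2(
    void* data,
    int64_t len,
    void (*deleter)(void*),
    void** out);

}  // extern "C"

This is why the PR introduces a new aoti_torch_create_tensor_from_blob_v3 rather than changing an existing declaration.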



NicolasHug changed the title from "Add delete support to torch::stable::from_blob" to "Add deleter support to torch::stable::from_blob" on Jan 26, 2026
NicolasHug (Member, Author) commented Jan 26, 2026

This file is the same as test/cpp_extensions/libtorch_agn_2_10_extension/libtorch_agn_2_10/__init__.py

Same with the other files here: they're copy/pasted or adapted from test/cpp_extensions/libtorch_agn_2_10_extension/

NicolasHug marked this pull request as ready for review on January 26, 2026 at 18:19
mikaylagawarecki (Contributor) left a comment

Thanks for adding all the necessary versioning-related scaffolding for the first op added for 2.11; that part looks good to me.

Leaving the review of the actual deleter_fn / from_blob changes to @janeyx99


@skipIfTorchVersionLessThan(2, 11)
@skipIfTorchDynamo("no data pointer defined for FakeTensor, FunctionalTensor")
def test_my_from_blob_with_deleter(self, device):
A contributor commented:

Did you get a chance to try this on a real use case and see it work?

NicolasHug (Member, Author) replied Jan 30, 2026

Not a true real-world use case yet; that will come when I integrate these changes into meta-pytorch/torchcodec#1188. I'm planning to wait for this to be in the nightlies, if that's OK?

Here's a snippet that's closer to real life than the current test, where we actually call free in the deleter instead of just checking for calls. The snippet allocates thousands of 10MB float tensors (with malloc -> from_blob):

# When passing `free` as the deleter:
~/dev/from_blob_deleter_test/build » /usr/bin/time -v ./test_from_blob_deleter 2>&1 | grep Maximum
	Maximum resident set size (kbytes): 92672

# When not passing a deleter, leak!!
~/dev/from_blob_deleter_test/build » /usr/bin/time -v ./test_from_blob_deleter 2>&1 | grep Maximum
	Maximum resident set size (kbytes): 172764
test_from_blob_deleter.cpp:

#include <torch/csrc/stable/device.h>
#include <torch/csrc/stable/ops.h>
#include <torch/csrc/stable/tensor.h>

#include <cstdint>
#include <cstdlib>
#include <vector>

int main(int argc, char* argv[]) {
    // 10MB per allocation (float = 4 bytes)
    const size_t num_elements = 10 * 1024 * 1024 / sizeof(float);
    std::vector<int64_t> sizes = {static_cast<int64_t>(num_elements)};
    std::vector<int64_t> strides = {1};

    for (int i = 0; i < 10000; i++) {
        float* data = static_cast<float*>(malloc(num_elements * sizeof(float)));
        data[0] = 1.0f;
        data[num_elements - 1] = 2.0f;

        // Create tensor with deleter - tensor now owns the memory
        {
            torch::stable::Tensor tensor = torch::stable::from_blob(
                data,
                torch::headeronly::IntHeaderOnlyArrayRef(sizes.data(), sizes.size()),
                torch::headeronly::IntHeaderOnlyArrayRef(strides.data(), strides.size()),
                torch::stable::Device(torch::headeronly::DeviceType::CPU),
                torch::headeronly::ScalarType::Float,
                free /* deleter!! */
            );
        } // `tensor` is destroyed here; its deleter (free) releases `data`
    }

}
CMakeLists.txt:

cmake_minimum_required(VERSION 3.18)
project(from_blob_deleter_test)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

find_package(Torch REQUIRED)

add_executable(test_from_blob_deleter test_from_blob_deleter.cpp)
target_link_libraries(test_from_blob_deleter "${TORCH_LIBRARIES}")
target_include_directories(test_from_blob_deleter PRIVATE "${TORCH_INCLUDE_DIRS}")

# Set TORCH_TARGET_VERSION to 2.11 (0x02 = major, 0x0b = minor) to use the new from_blob with deleter
target_compile_definitions(test_from_blob_deleter PRIVATE
    TORCH_TARGET_VERSION=0x020b000000000000
)

janeyx99 (Contributor) commented Jan 30, 2026

Oh, I see! Thanks for the example. I was more wanting to ensure that the currently added test does not leak memory, haha. But this is good to exemplify too.

I'm realizing anyone using from_blob signs up for the danger of memory leaks anyway, but can you add something for memory similar to what's in test_my_ones_like for CUDA? Thank you!

NicolasHug (Member, Author) commented:
@pytorchbot rebase

pytorchmergebot (Collaborator) commented:
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

pytorchmergebot (Collaborator) commented:
Successfully rebased from_blob_deleter onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout from_blob_deleter && git pull --rebase)

NicolasHug (Member, Author) commented:
@pytorchbot merge

pytorch-bot added the ciflow/trunk (Trigger trunk jobs on your pull request) label on Feb 3, 2026
pytorchmergebot (Collaborator) commented:
Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here


Labels

ciflow/trunk (Trigger trunk jobs on your pull request), Merged, release notes: inductor (aoti)


4 participants