
Conversation

@matt-o-how (Contributor) commented Jun 17, 2025

Adds an optional field that aborts serialization once a maximum byte count is reached.
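A minimal sketch of the idea (illustrative only; the parameter name, the encode_atom helper, and the serializer internals below are assumptions rather than a copy of the merged code):

def sexp_to_stream(sexp, f, *, max_size=None):
    # Illustrative: clvm's serializer walks the tree iteratively; the change
    # is simply to track bytes written and bail out once past the limit.
    written = 0

    def emit(blob):
        nonlocal written
        written += len(blob)
        if max_size is not None and written > max_size:
            raise ValueError(f"serialized size exceeds {max_size} bytes")
        f.write(blob)

    todo = [sexp]
    while todo:
        node = todo.pop()
        if node.pair is not None:
            emit(b"\xff")  # 0xff prefixes a cons pair in clvm serialization
            left, right = node.pair
            todo.append(right)  # push right first so left is serialized first
            todo.append(left)
        else:
            emit(encode_atom(node.atom))  # encode_atom is hypothetical here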

This change also takes the opportunity to update a couple of out-of-date things: py.test is deprecated in favour of pytest, and mypy now catches a problem in more_ops.py.

coveralls-official bot commented Jun 17, 2025

Pull Request Test Coverage Report for Build 15819603206

Details

  • 7 of 7 (100.0%) changed or added relevant lines in 2 files are covered.
  • 2 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.2%) to 93.934%

File with Coverage Reduction    New Missed Lines    %
clvm/object_cache.py            2                   91.11%
Totals
  • Change from base Build 14520240340: -0.2%
  • Covered Lines: 1207
  • Relevant Lines: 1264

💛 - Coveralls

@matt-o-how matt-o-how requested a review from richardkiss June 18, 2025 09:13
@richardkiss (Contributor) commented:

An issue I can think of is that it makes life difficult for clvm_tools users who could work with ultra-large blobs before this; now they're going to get serialization errors. I think we would have to add an option to clvm_tools to override this default max block size (which of course is a bit tricky, because we have to coordinate with another repo).
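An override like that could be a single extra flag on the clvm_tools entry points; a hypothetical argparse sketch (the flag name and default are made up):

import argparse

parser = argparse.ArgumentParser(prog="brun")  # one of clvm_tools' entry points
parser.add_argument(
    "--max-serialized-size",
    type=int,
    default=None,  # None would mean "use the library default", not "unlimited"
    help="override the default maximum serialized size, in bytes",
)
args = parser.parse_args(["--max-serialized-size", str(10 * 1024 * 1024)])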

@richardkiss (Contributor) commented:

And here's an alternative implementation mechanism, where you just put a wrapper around the file-like object f (courtesy of ChatGPT):

class MaxSizeIOWrapper:
    """File-like proxy that raises once cumulative writes exceed max_size."""

    def __init__(self, fileobj, max_size):
        self._file = fileobj
        self._max_size = max_size
        self._written = 0  # total bytes/characters written

    def write(self, data):
        size = len(data)
        if self._written + size > self._max_size:
            raise IOError(f"Write would exceed max size of {self._max_size}")
        written = self._file.write(data)
        self._written += written
        return written

    def __getattr__(self, attr):
        # Delegate everything else to the underlying file
        return getattr(self._file, attr)

Providing a wrapper like this could make it more opt-in.
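For example, a caller could opt in by wrapping the stream before handing it to the serializer (this assumes clvm's sexp_to_stream and an already-built sexp; the 2 MiB cap is arbitrary):

import io

from clvm.serialize import sexp_to_stream

buf = io.BytesIO()
capped = MaxSizeIOWrapper(buf, max_size=2 * 1024 * 1024)  # arbitrary 2 MiB cap
try:
    sexp_to_stream(sexp, capped)  # sexp is whatever CLVM object is in hand
except IOError as err:
    print(f"serialization aborted: {err}")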

@richardkiss (Contributor) commented:

Or maybe just a new API with a new name that requires the max_size argument, and we deprecate the old API. This is kind of like Program.run, which originally had no limit, which proved to be stupid, so we had to move to Program.run_with_max_cost or something like that, and then the max_cost eventually made its way into .run. Or at least in some places.
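That rename-and-deprecate cycle might look roughly like this (all names here are made up for illustration):

import warnings

def serialize_with_max_size(sexp, f, max_size):
    ...  # new API: the size limit is a required argument

def serialize(sexp, f):
    # Old API kept alive for a deprecation cycle, mirroring the
    # Program.run -> Program.run_with_max_cost migration.
    warnings.warn(
        "serialize() without a size limit is deprecated; "
        "use serialize_with_max_size()",
        DeprecationWarning,
        stacklevel=2,
    )
    return serialize_with_max_size(sexp, f, max_size=2**32)  # generous interim cap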

Conceptually it's fine. But I just worry about getting dinged when trying to use clvm_tools at some point in the future. Maybe we do what we did with Program to give clvm_tools a chance to be revved too. Does anyone besides me even use it?

@arvidn (Contributor) commented Jun 18, 2025

I think changing the default to have a limit is an important feature.

@arvidn (Contributor) commented Jun 23, 2025

I think the current implementation is preferable to implementing the limit in a stream adapter. We don't know of any use cases where it's OK to be unsafe (i.e. unlimited); even clvm_tools ought to have some limit. I don't think it's reasonable to provide an API that's unsafe, and even if we do, it should have a longer name indicating that it's unsafe.

The implementation is also simple to follow and sits in the place where you would look first, so it's easy for other people to have confidence that it's correct.

@matt-o-how matt-o-how requested a review from richardkiss June 24, 2025 10:54
@arvidn arvidn merged commit 426ffd3 into main Jun 25, 2025
21 checks passed
@arvidn arvidn deleted the add_max_size_to_serialize branch June 25, 2025 14:44