Conversation
Previously it was possible to add the contents of a directory at the top
level of a zip by using an `arcname` of `"."`, but this wasn't documented
or discoverable. This commit adds explicit support for giving a
directory an `arcname` of `"/"`, thereby putting all of
its contents at the top level.
For example, to add every file in `/path/to/files/` to the top level of
a zip:
```
ZipStream.from_path("/path/to/files", arcname="/")
```
This also fixes an issue where leading path separators were not being
stripped in the same way as the `zipfile` module, potentially leading to
size mismatches.
Because directories are never compressed (there's no data to compress), the file headers and the metadata returned by `get_info()` and `info_list()` should reflect this.
This looks excellent and I look forward to the merge!
Pull request overview
This PR adds async functionality to ZipStream, introduces support for Zstandard compression (Python 3.14+), and includes various bug fixes and improvements related to path handling and arcname sanitization.
Changes:
- Implements a non-blocking `AsyncZipStream.add()` method using async iterables with a queue-based producer-consumer pattern
- Adds Zstandard compression support for Python 3.14+ with appropriate validation and version checking
- Improves path handling by stripping leading path separators and allowing `"/"` as an arcname for adding directory contents at the zip root level
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| zipstream/ng.py | Core changes including async implementation, zstd support, refactored compression validation, improved flag constants, arcname sanitization, and directory compression enforcement |
| tests/test_zipstream.py | Added zstd compression tests, updated path handling tests for new "/" arcname behavior, updated error message assertions, fixed spelling in comments |
| setup.py | Bumped version to 1.9.0, updated license to SPDX format, added Python 3.13 and 3.14 classifiers, removed license classifier |
| CHANGELOG.md | Added changelog entries for v1.8.0 and v1.9.0 with version links |
| README.md | Fixed syntax error in example code, updated CI badge URL |
| .github/workflows/tests.yml | Updated Python test matrix to 3.8-3.14, removed 3.7 |
```
if not arcname:
    raise ValueError("A valid arcname for the directory is required")

if arcname[-1] != "/":
    arcname += "/"
```
After sanitization with `lstrip("/")`, `arcname` can become an empty string (e.g., if it was `"/"` or `"//"`). Line 837 accesses `arcname[-1]`, which would raise an IndexError on an empty string; the emptiness check at line 834 prevents this, so it is not actually a bug, but the correctness depends on the check at line 834 running before lines 837-838. Consider adding a comment explaining the order dependency so the logic is clearly intentional.
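A minimal sketch of that alternative, using the names from the diff above (the guard must stay before the `arcname[-1]` access):
```
arcname = arcname.lstrip("/")
# NOTE: this emptiness check must run before the arcname[-1] access below;
# an arcname of "/" or "//" sanitizes to "" and would raise an IndexError.
if not arcname:
    raise ValueError("A valid arcname for the directory is required")
if arcname[-1] != "/":
    arcname += "/"
```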
| "Programming Language :: Python :: 3.11", | ||
| "Programming Language :: Python :: 3.12", | ||
| "Programming Language :: Python :: 3.13", | ||
| "Programming Language :: Python :: 3.14", |
The removed classifier "License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3)" should be added back to maintain consistency with the license field. While the license field was updated to use the SPDX identifier "LGPL-3.0-only", the classifier provides important metadata for package managers and should be retained.
| "Programming Language :: Python :: 3.14", | |
| "Programming Language :: Python :: 3.14", | |
| "License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3)", |
```
q = _queue_from_async_iterable(data)

# Build a *synchronous* generator that yields from the queue.
def sync_gen() -> Iterable[bytes]:
```
The return type annotation for sync_gen is imprecise. The function is a generator that yields bytes, so it should be annotated as Generator[bytes, None, None] from typing rather than Iterable[bytes]. While Iterable is technically correct, Generator is more precise and matches the actual implementation.
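A sketch of the more precise annotation (assuming the `typing` import is adjusted accordingly):
```
from typing import Generator

def sync_gen() -> Generator[bytes, None, None]:
    while True:
        chunk = q.get()
        if chunk is None:
            break
        yield chunk
```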
```
from zipfile import ZIP_ZSTANDARD, ZSTANDARD_VERSION
from compression.zstd import CompressionParameter
ZSTD_LEVEL_BOUNDS = CompressionParameter.compression_level.bounds()
```
The import statement assumes that Python 3.14+ will have a 'compression.zstd' module available in the standard library. However, this module path appears to be hypothetical and may not match the actual implementation when Python 3.14 is released. The standard zipfile module typically imports compression modules directly (e.g., zlib, bz2, lzma). This should be verified against the actual Python 3.14 implementation or documentation, as importing a non-existent module will cause an ImportError.
Suggested change:
```
try:
    from zipfile import ZIP_ZSTANDARD, ZSTANDARD_VERSION
    from compression.zstd import CompressionParameter
except ImportError:
    # Fall back to compatibility mode if zstd support is not available
    PY313_COMPAT = True
    warnings.warn(
        "Zstandard compression is not available (failed to import 'compression.zstd'); "
        "disabling ZIP_ZSTANDARD support.",
        RuntimeWarning,
    )
else:
    ZSTD_LEVEL_BOUNDS = CompressionParameter.compression_level.bounds()
```
```
async def add(
    self,
    data: AsyncIterable[bytes],
    *args,
    **kwargs,
) -> None:
    """
    Add *data* to the zip archive where *data* is an asynchronous
    iterable of ``bytes`` (e.g. an async generator that yields HTTP
    chunks).

    The method works by:

    1. Creating a thread-safe ``queue.Queue`` that is filled by a
       background ``asyncio`` task running in the current event loop.
    2. Passing a synchronous generator that reads from that queue
       to the underlying ``ZipStream.add`` method (which runs in a
       thread via ``to_thread``).

    This keeps the event loop non-blocking while the zip compression
    happens in a separate thread.
    """
    # Turn the async iterable into a queue that the thread can read.
    q = _queue_from_async_iterable(data)

    # Build a *synchronous* generator that yields from the queue.
    def sync_gen() -> Iterable[bytes]:
        while True:
            chunk = q.get()  # blocks until a chunk is available
            if chunk is None:  # sentinel → end of stream
                break
            yield chunk

    # Delegate to the original (thread-based) ``add`` implementation.
    # ``to_thread`` runs the whole call in a worker thread, so the
    # zip compression stays off the event loop.
    await to_thread(self._zip.add, sync_gen(), *args, **kwargs)
```
The new AsyncZipStream.add() method implementation lacks test coverage. Given that this is highlighted as a major feature in the PR description ("I can now provide you with an implementation of AsyncZipStream.add() that is actually non-blocking"), there should be tests verifying that the async add method works correctly with async iterables, handles errors properly, and maintains the non-blocking behavior.
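A minimal sketch of such a test (hypothetical: it assumes pytest-asyncio is available and that AsyncZipStream can be consumed as an async iterator of zip chunks, which should be checked against the actual API):
```
import pytest

@pytest.mark.asyncio
async def test_add_async_iterable():
    async def chunks():
        yield b"hello "
        yield b"world"

    zs = AsyncZipStream()
    await zs.add(chunks(), "greeting.txt")

    # Any non-empty zip stream starts with the local file header magic "PK".
    data = b"".join([c async for c in zs])
    assert data.startswith(b"PK")
```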
```
            q.put(chunk)
    finally:
        q.put(None)  # signal that the async iterable is finished
```
The queue.put() operation could block indefinitely if the queue has a maxsize set and the consumer (the sync_gen thread) is slower than the producer. This creates a potential deadlock because the async producer will be blocked waiting to put data, while the consumer thread may be waiting for something else. Consider using q.put(chunk, block=False) with appropriate error handling or using an asyncio.Queue instead of queue.Queue for better async compatibility.
Suggested change:
```
            # Use non-blocking put and yield to the event loop while waiting
            # for space in the queue to avoid blocking the event loop thread.
            while True:
                try:
                    q.put(chunk, block=False)
                    break
                except queue.Full:
                    # Queue is full; yield control and retry.
                    await asyncio.sleep(0)
    finally:
        # Ensure the EOF sentinel is eventually enqueued, even if the queue
        # is temporarily full, without blocking the event loop thread.
        while True:
            try:
                q.put(None, block=False)  # signal that the async iterable is finished
                break
            except queue.Full:
                await asyncio.sleep(0)
```
```
# Schedule the producer in the *current* event loop.
# ``asyncio.create_task`` returns immediately; the task runs
# concurrently with everything else.
asyncio.create_task(_producer())
```
The async task created by asyncio.create_task is not awaited or tracked. If an exception occurs in the _producer task, it will be silently ignored and the queue consumer will hang waiting for data that never arrives. The task should be stored and its result should be checked, or at minimum, exception handling should be added to ensure the sentinel None is always sent even on error.
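One possible sketch (hypothetical): keep a reference to the task and retrieve its exception in a done-callback; the sentinel is already sent in `_producer`'s `finally` block, so the consumer won't hang:
```
import asyncio
import logging

task = asyncio.create_task(_producer())

def _on_done(t: asyncio.Task) -> None:
    # Retrieving the exception also avoids "Task exception was never
    # retrieved" warnings from asyncio.
    if not t.cancelled() and t.exception() is not None:
        logging.getLogger(__name__).error(
            "async zip producer failed", exc_info=t.exception()
        )

task.add_done_callback(_on_done)
```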
```
# directories are always stored
if kwargs["arcname"][-1] == "/":
    kwargs["compress_type"] = ZIP_STORED
```
The directory compression type enforcement at line 1069 happens after the compression validation and default value removal (lines 1062-1065). This means if a user explicitly specifies compress_type=ZIP_DEFLATED for a directory, the validation will pass, but then line 1069 silently changes it to ZIP_STORED. This could be confusing behavior. Consider either: 1) raising a warning/error if a non-STORED compression is specified for a directory, or 2) enforcing ZIP_STORED before the validation/default removal logic so the behavior is consistent.
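A sketch of option 1 (hypothetical, reusing the names from the diff above): warn instead of silently overriding the caller's choice:
```
import warnings

# directories are always stored; warn instead of silently overriding
if kwargs["arcname"][-1] == "/":
    if kwargs.get("compress_type", ZIP_STORED) != ZIP_STORED:
        warnings.warn(
            "compress_type is ignored for directories; using ZIP_STORED",
            UserWarning,
        )
    kwargs["compress_type"] = ZIP_STORED
```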
```
# check compress level set
assert info[0]["compress_level"] == -7
assert info[1]["compress_level"] == None
```
Testing for None should use the 'is' operator.
| assert info[1]["compress_level"] == None | |
| assert info[1]["compress_level"] is None |
| zs.add(test, "default.txt") | ||
| zs.add(test, "22.txt", compress_level=22) | ||
|
|
||
| data = bytes(zs) |
The variable `data` is assigned but never used.
Suggested change:
```
bytes(zs)
```
Hi everyone,
good news from the async front: I can now provide you with an implementation of AsyncZipStream.add() that is actually non-blocking. I hope to get your feedback on this very soon.