⚡ Bolt: Parallelize rule batch uploads in main.py #73

google-labs-jules · 2026-01-05T14:42:50Z

💡 What:
Refactored push_rules in main.py to upload rule batches concurrently using ThreadPoolExecutor instead of sequentially.
Increased max_workers in get_all_existing_rules from 5 to 10.

🎯 Why:
Large blocklists can have thousands of rules, resulting in hundreds of sequential HTTP POST requests. Parallelizing this significantly reduces the total sync time.

📊 Impact:

Reduces rule upload time by ~5x (with 5 workers).
Speeds up initial rule fetching.

🔬 Measurement:
Run uv run --python 3.13 python -m unittest tests/test_performance.py to see the benchmark comparison.
(Sequential: ~1.0s vs Parallel: ~0.2s for 10 batches with 0.1s latency).

PR created automatically by Jules for task 14774676592968160940 started by @abhimehro

- Parallelized `push_rules` batch processing using ThreadPoolExecutor (5 workers) - Increased `get_all_existing_rules` workers from 5 to 10 - Added `tests/test_performance.py` to verify optimization and correctness - Measured ~5x speedup in batch processing (simulated)

google-labs-jules · 2026-01-05T14:42:51Z

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.

For security, I will only act on instructions from the user who triggered this task.

main.py

+            return True
        except httpx.HTTPError as e:
-            log.error(f"Failed to push batch {i} for folder {sanitize_for_log(folder_name)}: {sanitize_for_log(e)}")
+            log.error(f"Failed to push batch {batch_idx} for folder {sanitize_for_log(folder_name)}: {sanitize_for_log(e)}")


tests/test_performance.py

+from unittest.mock import MagicMock, patch
+import time
+import threading
+import main


tests/test_performance.py

+        hostnames = [f"host-{i}.com" for i in range(BATCH_SIZE * num_batches)]
+
+        # Mock delay to simulate network latency
+        def delayed_post(*args, **kwargs):


tests/test_performance.py

+        hostnames = [f"host-{i}.com" for i in range(BATCH_SIZE * num_batches)]
+
+        # Mock delay to simulate network latency
+        def delayed_post(*args, **kwargs):


tests/test_performance.py

+
+        print(f"\n[Performance Test] Duration for {num_batches} batches with 0.1s latency: {duration:.4f}s")
+
+if __name__ == '__main__':


main.py

+            return True
        except httpx.HTTPError as e:
-            log.error(f"Failed to push batch {i} for folder {sanitize_for_log(folder_name)}: {sanitize_for_log(e)}")
+            log.error(f"Failed to push batch {batch_idx} for folder {sanitize_for_log(folder_name)}: {sanitize_for_log(e)}")


tests/test_performance.py

@@ -0,0 +1,81 @@
+import unittest


tests/test_performance.py

+import main
+from main import push_rules, BATCH_SIZE
+
+class TestPushRulesPerformance(unittest.TestCase):


tests/test_performance.py

+        self.profile_id = "test-profile"
+        self.folder_name = "test-folder"
+        self.folder_id = "test-folder-id"
+        self.do = 1


tests/test_performance.py

+        self.existing_rules = set()
+
+    @patch('main._api_post_form')
+    def test_push_rules_correctness_with_lock(self, mock_post):


tests/test_performance.py

+        hostnames = [f"host-{i}.com" for i in range(BATCH_SIZE * num_batches)]
+
+        # Mock delay to simulate network latency
+        def delayed_post(*args, **kwargs):


tests/test_performance.py

+        self.assertTrue(success)
+        self.assertEqual(mock_post.call_count, num_batches)
+
+        print(f"\n[Performance Test] Duration for {num_batches} batches with 0.1s latency: {duration:.4f}s")


tests/test_performance.py

+from unittest.mock import MagicMock, patch
+import time
+import threading
+import main


tests/test_performance.py

+        hostnames = [f"host-{i}.com" for i in range(BATCH_SIZE * num_batches)]
+
+        # Mock delay to simulate network latency
+        def delayed_post(*args, **kwargs):


tests/test_performance.py

+        hostnames = [f"host-{i}.com" for i in range(BATCH_SIZE * num_batches)]
+
+        # Mock delay to simulate network latency
+        def delayed_post(*args, **kwargs):


main.py

+            return True
        except httpx.HTTPError as e:
-            log.error(f"Failed to push batch {i} for folder {sanitize_for_log(folder_name)}: {sanitize_for_log(e)}")
+            log.error(f"Failed to push batch {batch_idx} for folder {sanitize_for_log(folder_name)}: {sanitize_for_log(e)}")


tests/test_performance.py

@@ -0,0 +1,81 @@
+import unittest


tests/test_performance.py

+import main
+from main import push_rules, BATCH_SIZE
+
+class TestPushRulesPerformance(unittest.TestCase):


tests/test_performance.py

+        self.profile_id = "test-profile"
+        self.folder_name = "test-folder"
+        self.folder_id = "test-folder-id"
+        self.do = 1


tests/test_performance.py

+        self.existing_rules = set()
+
+    @patch('main._api_post_form')
+    def test_push_rules_correctness_with_lock(self, mock_post):


tests/test_performance.py

+        self.assertTrue(success)
+        self.assertEqual(mock_post.call_count, num_batches)
+
+        print(f"\n[Performance Test] Duration for {num_batches} batches with 0.1s latency: {duration:.4f}s")


main.py

+            return True
        except httpx.HTTPError as e:
-            log.error(f"Failed to push batch {i} for folder {sanitize_for_log(folder_name)}: {sanitize_for_log(e)}")
+            log.error(f"Failed to push batch {batch_idx} for folder {sanitize_for_log(folder_name)}: {sanitize_for_log(e)}")


tests/test_performance.py

+from unittest.mock import MagicMock, patch
+import time
+import threading
+import main


tests/test_performance.py

+        hostnames = [f"host-{i}.com" for i in range(BATCH_SIZE * num_batches)]
+
+        # Mock delay to simulate network latency
+        def delayed_post(*args, **kwargs):


tests/test_performance.py

+        hostnames = [f"host-{i}.com" for i in range(BATCH_SIZE * num_batches)]
+
+        # Mock delay to simulate network latency
+        def delayed_post(*args, **kwargs):


Copilot

Pull request overview

This PR parallelizes rule batch uploads to significantly improve sync performance for large blocklists. The main change refactors the push_rules function to upload batches concurrently using ThreadPoolExecutor instead of sequentially, reducing upload time by approximately 5x.

Key changes:

Parallelized batch uploads within push_rules using ThreadPoolExecutor with 5 workers
Increased max_workers in get_all_existing_rules from 5 to 10 for faster rule fetching
Added performance tests to verify correctness and measure improvements

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

File	Description
main.py	Refactored `push_rules` to process batches in parallel using ThreadPoolExecutor; increased max_workers for rule fetching from 5 to 10
tests/test_performance.py	Added new performance tests to verify parallel batch upload correctness and measure timing improvements

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-09T00:14:34Z

tests/test_performance.py

+
+    @patch('main._api_post_form')
+    def test_push_rules_concurrency(self, mock_post):
+        # Create enough hostnames for 5 batches


The comment says "Create enough hostnames for 5 batches" but the code sets num_batches = 10. Update the comment to match the actual number of batches being tested.

Suggested change

# Create enough hostnames for 5 batches

# Create enough hostnames for 10 batches

Copilot · 2026-01-09T00:14:34Z

tests/test_performance.py

+    def test_push_rules_correctness_with_lock(self, mock_post):
+        # Create enough hostnames for 5 batches
+        num_batches = 5
+        hostnames = [f"host-{i}.com" for i in range(BATCH_SIZE * num_batches)]
+
+        # Mock success
+        mock_post.return_value = MagicMock(status_code=200)
+
+        lock = threading.Lock()
+
+        start_time = time.time()
+        success = push_rules(
+            self.profile_id,
+            self.folder_name,
+            self.folder_id,
+            self.do,
+            self.status,
+            hostnames,
+            self.existing_rules,
+            self.client,
+            existing_rules_lock=lock
+        )
+        duration = time.time() - start_time
+
+        self.assertTrue(success)
+        self.assertEqual(mock_post.call_count, num_batches)
+        self.assertEqual(len(self.existing_rules), len(hostnames))
+
+        print(f"\n[Sequential Baseline (Lock)] Duration: {duration:.4f}s")


The test name test_push_rules_correctness_with_lock and its print output "Sequential Baseline (Lock)" are misleading. With the parallelization changes, this test now exercises the parallel code path (with a lock), not sequential execution. Consider renaming this test to something like test_push_rules_parallel_with_lock and updating the print message to reflect that it's testing parallel execution with thread-safety.

Copilot · 2026-01-09T00:14:34Z

tests/test_performance.py

+    def test_push_rules_concurrency(self, mock_post):
+        # Create enough hostnames for 5 batches
+        num_batches = 10
+        hostnames = [f"host-{i}.com" for i in range(BATCH_SIZE * num_batches)]
+
+        # Mock delay to simulate network latency
+        def delayed_post(*args, **kwargs):
+            time.sleep(0.1)
+            return MagicMock(status_code=200)
+
+        mock_post.side_effect = delayed_post
+
+        start_time = time.time()
+        success = push_rules(
+            self.profile_id,
+            self.folder_name,
+            self.folder_id,
+            self.do,
+            self.status,
+            hostnames,
+            self.existing_rules,
+            self.client
+        )


The test test_push_rules_concurrency doesn't verify thread-safety. Without passing an existing_rules_lock, the test relies on the fact that set.update() happens to be atomic in CPython for simple cases, but this is an implementation detail. Consider adding a test case that explicitly passes a lock (like the first test does) but with simulated latency to verify that thread-safety works correctly under concurrent load, or alternatively test the no-lock scenario with assertions that verify the set is properly updated despite potential race conditions.

Copilot · 2026-01-09T00:14:34Z

tests/test_performance.py

+import unittest
+from unittest.mock import MagicMock, patch
+import time
+import threading
+import main
+from main import push_rules, BATCH_SIZE
+
+class TestPushRulesPerformance(unittest.TestCase):
+    def setUp(self):
+        self.client = MagicMock()
+        self.profile_id = "test-profile"
+        self.folder_name = "test-folder"
+        self.folder_id = "test-folder-id"
+        self.do = 1
+        self.status = 1
+        self.existing_rules = set()
+
+    @patch('main._api_post_form')
+    def test_push_rules_correctness_with_lock(self, mock_post):
+        # Create enough hostnames for 5 batches
+        num_batches = 5
+        hostnames = [f"host-{i}.com" for i in range(BATCH_SIZE * num_batches)]
+
+        # Mock success
+        mock_post.return_value = MagicMock(status_code=200)
+
+        lock = threading.Lock()
+
+        start_time = time.time()
+        success = push_rules(
+            self.profile_id,
+            self.folder_name,
+            self.folder_id,
+            self.do,
+            self.status,
+            hostnames,
+            self.existing_rules,
+            self.client,
+            existing_rules_lock=lock
+        )
+        duration = time.time() - start_time
+
+        self.assertTrue(success)
+        self.assertEqual(mock_post.call_count, num_batches)
+        self.assertEqual(len(self.existing_rules), len(hostnames))
+
+        print(f"\n[Sequential Baseline (Lock)] Duration: {duration:.4f}s")
+
+    @patch('main._api_post_form')
+    def test_push_rules_concurrency(self, mock_post):
+        # Create enough hostnames for 5 batches
+        num_batches = 10
+        hostnames = [f"host-{i}.com" for i in range(BATCH_SIZE * num_batches)]
+
+        # Mock delay to simulate network latency
+        def delayed_post(*args, **kwargs):
+            time.sleep(0.1)
+            return MagicMock(status_code=200)
+
+        mock_post.side_effect = delayed_post
+
+        start_time = time.time()
+        success = push_rules(
+            self.profile_id,
+            self.folder_name,
+            self.folder_id,
+            self.do,
+            self.status,
+            hostnames,
+            self.existing_rules,
+            self.client
+        )
+        duration = time.time() - start_time
+
+        self.assertTrue(success)
+        self.assertEqual(mock_post.call_count, num_batches)
+
+        print(f"\n[Performance Test] Duration for {num_batches} batches with 0.1s latency: {duration:.4f}s")
+
+if __name__ == '__main__':
+    unittest.main()


The performance tests only verify successful execution paths. Consider adding a test case that verifies behavior when some batches fail (e.g., by making some mock_post calls raise HTTPError). This would ensure that the parallel error handling works correctly and that partial failures are properly reported with the correct successful_batches count.

Copilot · 2026-01-09T00:14:35Z

main.py

+        }
+
+        for future in concurrent.futures.as_completed(futures):
+            if future.result():


The call to future.result() can raise exceptions if the thread execution encounters an unexpected error beyond the handled HTTPError. If an exception occurs in _push_batch that's not caught (e.g., a programming error), it will propagate here and crash the entire upload process. Consider wrapping future.result() in a try-except block to handle unexpected exceptions and log them appropriately, ensuring other batches can continue processing.

Suggested change

if future.result():

batch_idx = futures[future]

try:

result = future.result()

except Exception as e:

log.error(

"Unexpected error while processing batch %d for folder %s: %s",

batch_idx,

sanitize_for_log(folder_name),

sanitize_for_log(e),

)

log.debug("Unexpected exception details", exc_info=True)

continue

if result:

Copilot · 2026-01-09T00:14:35Z

main.py

        return True

-    successful_batches = 0
    total_batches = len(range(0, len(filtered_hostnames), BATCH_SIZE))


The calculation len(range(0, len(filtered_hostnames), BATCH_SIZE)) is inefficient and can be simplified. Consider using integer division with ceiling: (len(filtered_hostnames) + BATCH_SIZE - 1) // BATCH_SIZE or import math; math.ceil(len(filtered_hostnames) / BATCH_SIZE). This avoids creating an unnecessary range object just to count it.

Suggested change

total_batches = len(range(0, len(filtered_hostnames), BATCH_SIZE))

total_batches = (len(filtered_hostnames) + BATCH_SIZE - 1) // BATCH_SIZE

Copilot · 2026-01-09T00:14:35Z

tests/test_performance.py

+from unittest.mock import MagicMock, patch
+import time
+import threading
+import main


Import of 'main' is not used.

Suggested change

import main

abhimehro · 2026-01-09T00:19:58Z

Hi @copilot, please open a new pull request to apply changes based on the comments in this thread. Thank you!

Copilot · 2026-01-09T00:20:06Z

@abhimehro I've opened a new pull request, #86, to work on those changes. Once the pull request is ready, I'll request review from you.

google-labs-jules · 2026-01-09T00:22:00Z

Hi @copilot, please open a new pull request to apply changes based on the comments in this thread. Thank you!

I've implemented the requested changes in this PR. Specifically:

Added tests/test_performance.py with comprehensive test cases including concurrency with failures.
Refactored main.py to use safe future.result() handling and efficient batch calculation.
Fixed logging f-string warnings.

Merge branch 'main' into bolt-perf-optimization-14774676592968160940

026db41

abhimehro marked this pull request as ready for review January 9, 2026 00:10

Copilot AI review requested due to automatic review settings January 9, 2026 00:10

github-actions bot added the python label Jan 9, 2026

abhimehro self-requested a review January 9, 2026 00:10

abhimehro assigned abhimehro and Copilot Jan 9, 2026

abhimehro removed their request for review January 9, 2026 00:11

Copilot started reviewing on behalf of abhimehro January 9, 2026 00:11 View session

github-advanced-security bot found potential problems Jan 9, 2026

View reviewed changes

Copilot AI reviewed Jan 9, 2026

View reviewed changes

Copilot AI mentioned this pull request Jan 9, 2026

Address PR review feedback: improve error handling, test clarity, and calculations #86

Open


		print(f"\n[Performance Test] Duration for {num_batches} batches with 0.1s latency: {duration:.4f}s")

		if __name__ == '__main__':

	# Create enough hostnames for 5 batches
	# Create enough hostnames for 10 batches

-            if future.result():
+            batch_idx = futures[future]
+            try:
+                result = future.result()
+            except Exception as e:
+                log.error(
+                    "Unexpected error while processing batch %d for folder %s: %s",
+                    batch_idx,
+                    sanitize_for_log(folder_name),
+                    sanitize_for_log(e),
+                )
+                log.debug("Unexpected exception details", exc_info=True)
+                continue
+            if result:

	total_batches = len(range(0, len(filtered_hostnames), BATCH_SIZE))
	total_batches = (len(filtered_hostnames) + BATCH_SIZE - 1) // BATCH_SIZE

⚡ Bolt: Parallelize rule batch uploads in main.py #73

Are you sure you want to change the base?

⚡ Bolt: Parallelize rule batch uploads in main.py #73

Uh oh!

Conversation

google-labs-jules bot commented Jan 5, 2026

Uh oh!

google-labs-jules bot commented Jan 5, 2026

Uh oh!

Check warning

Check warning

Check warning

Check warning

Check warning

Check warning

Check warning

Check warning

Check warning

Check warning

Check warning

Check warning

Check notice

Check notice

Check notice

Check warning

Check warning

Check warning

Check warning

Check warning

Check warning

Check notice

Check notice

Check notice

Check notice

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

abhimehro commented Jan 9, 2026

Uh oh!

Copilot AI commented Jan 9, 2026

Uh oh!

google-labs-jules bot commented Jan 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants