Summary
When we add a compute plan with N tasks, we can set autobatching to True and set the batch size.
This submits the tasks to the backend in batches of batch_size tasks. The fastest option is to make the batch size as large as possible without triggering backend errors.
The default batch size is 500, so the question is: how do we find the maximal batch size we can use?
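For illustration, here is a minimal sketch of such a submission with the Substra Python SDK. The connection details and the compute plan builder are placeholders, and the parameter names (auto_batching, batch_size) are taken from the description above rather than verified against the SDK.

```python
import substra

# Placeholder connection details; adapt to your deployment.
client = substra.Client(url="https://backend.example.org", token="***")

# Hypothetical helper that builds the spec containing the N tasks.
compute_plan_spec = build_compute_plan_spec()

# Submit the plan; with autobatching enabled, the tasks are registered
# in batches of `batch_size` rather than in a single request.
compute_plan = client.add_compute_plan(
    compute_plan_spec,
    auto_batching=True,
    batch_size=500,  # default mentioned above; larger is faster until the backend rejects it
)
```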
What happens when the batch size is too big
When the batch size is too big (451 tasks * 400 data samples per task), we get the following error:
```
Requests error status 429: {"message":"grpc: received message larger than max (6228668 vs. 4194304)"}

Traceback (most recent call last):
  File "HIDDEN/substra/sdk/backends/remote/rest_client.py", line 114, in __request
    r.raise_for_status()
  File "HIDDEN/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: HIDDEN/task/bulk_create/
```
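The error comes from the gRPC message size limit on the backend: the serialized batch (6228668 bytes here) exceeded the 4194304-byte (4 MiB) maximum. A rough way to pick a batch size is to estimate the payload size per task and keep the batch below that limit. The sketch below is a back-of-the-envelope estimate, not part of the SDK; the per-task size would have to be measured on a real payload (for example by JSON-serializing one task spec), and the safety margin is an arbitrary assumption.

```python
import json
import math

GRPC_MAX_MESSAGE_BYTES = 4_194_304  # limit reported in the error (4 MiB)
SAFETY_MARGIN = 0.8                 # headroom for request overhead (assumption)

def estimate_max_batch_size(sample_task_spec: dict) -> int:
    """Estimate how many tasks fit in one request under the gRPC limit.

    `sample_task_spec` is a representative task payload; serializing it to
    JSON gives a rough per-task size. This is only an approximation: the
    backend may use a different encoding and adds per-request overhead.
    """
    per_task_bytes = len(json.dumps(sample_task_spec).encode("utf-8"))
    return max(1, math.floor(GRPC_MAX_MESSAGE_BYTES * SAFETY_MARGIN / per_task_bytes))

# Numbers from the issue: 451 tasks totalling ~6228668 bytes,
# i.e. roughly 13.8 kB per task.
per_task_bytes = 6228668 / 451
print(math.floor(GRPC_MAX_MESSAGE_BYTES * SAFETY_MARGIN / per_task_bytes))  # ~242 tasks per batch
```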