feat: add GroupWorkPool for managing global bitswap concurrency #406
rvagg commented Sep 4, 2023
- Removes the per-retrieval preload concurrency handling
- Adds a per-Lassie concurrency pool which has a maximum number of parallel workers and a maximum number of workers per "group"; each retrieval is a "group"
- New default total bitswap concurrency is 32
- New default per-retrieval concurrency is 12
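The two limits described above can be sketched with a pair of counting semaphores. This is a minimal illustration, not the PR's actual `GroupWorkPool` code: the names `Pool`, `Group`, `NewPool`, `NewGroup`, `Go`, `Wait` and `measurePeaks` are all hypothetical, and it starts one goroutine per task gated by semaphores rather than keeping a fixed set of workers.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Pool caps the number of tasks running across all groups.
// Hypothetical sketch; not Lassie's actual GroupWorkPool API.
type Pool struct {
	global chan struct{} // counting semaphore for the total limit
}

// Group caps the number of tasks running for one retrieval.
type Group struct {
	pool  *Pool
	local chan struct{} // counting semaphore for the per-group limit
	wg    sync.WaitGroup
}

func NewPool(total int) *Pool {
	return &Pool{global: make(chan struct{}, total)}
}

func (p *Pool) NewGroup(perGroup int) *Group {
	return &Group{pool: p, local: make(chan struct{}, perGroup)}
}

// Go runs fn once both a per-group slot and a global slot are free.
func (g *Group) Go(fn func()) {
	g.wg.Add(1)
	go func() {
		defer g.wg.Done()
		// Take the per-group slot first, so a task that is still
		// waiting on its own group never holds a global slot.
		g.local <- struct{}{}
		defer func() { <-g.local }()
		g.pool.global <- struct{}{}
		defer func() { <-g.pool.global }()
		fn()
	}()
}

// Wait blocks until every task submitted to the group has finished.
func (g *Group) Wait() { g.wg.Wait() }

// measurePeaks runs `groups` groups of `tasks` tasks each and reports the
// highest number of tasks observed running at once, overall and in any group.
func measurePeaks(total, perGroup, groups, tasks int) (peakTotal, peakGroup int) {
	var mu sync.Mutex
	running := 0
	groupRunning := make([]int, groups)
	pool := NewPool(total)
	gs := make([]*Group, groups)
	for i := range gs {
		gs[i] = pool.NewGroup(perGroup)
		for j := 0; j < tasks; j++ {
			i := i // capture the group index for the closure
			gs[i].Go(func() {
				mu.Lock()
				running++
				groupRunning[i]++
				if running > peakTotal {
					peakTotal = running
				}
				if groupRunning[i] > peakGroup {
					peakGroup = groupRunning[i]
				}
				mu.Unlock()
				time.Sleep(time.Millisecond) // stand-in for a block fetch
				mu.Lock()
				running--
				groupRunning[i]--
				mu.Unlock()
			})
		}
	}
	for _, g := range gs {
		g.Wait()
	}
	return peakTotal, peakGroup
}

func main() {
	peakTotal, peakGroup := measurePeaks(32, 12, 4, 50)
	fmt.Println(peakTotal <= 32, peakGroup <= 12) // prints "true true"
}
```

With four concurrent retrievals, each one is held to 12 tasks in flight while the total across all of them never exceeds 32.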
d22123e to 48abb35
Codecov Report
Additional details and impacted files

| Coverage Diff | main | #406 | +/- |
|---|---|---|---|
| Coverage | 76.05% | 76.65% | +0.59% |
| Files | 84 | 85 | +1 |
| Lines | 6228 | 6275 | +47 |
| Hits | 4737 | 4810 | +73 |
| Misses | 1242 | 1224 | -18 |
| Partials | 249 | 241 | -8 |
48abb35 to 636fff5
Open to suggestions on the defaults: 32 max and 12 per retrieval is a bit arbitrary. Apparently Kubo maintains a global parallelism of 32 somewhere; 12 is double the current 6, but 32 is not a multiple of 12 🤷. Alternatives: 32 & 8, 48 & 12, 32 & 32?
Got flaky failures on Windows. I suspect the first one is because of #398, but the second one should have completed. I think that needs investigating.
6d2793c to 645bf1d
Experimenting with pulling fix attempts from #398 into here, btw.
Relevant failure to investigate: https://github.com/filecoin-project/lassie/actions/runs/6081427236/job/16497917067?pr=406
8484bad to 106a4b9
At this stage, with the current form of #398 included, I'm willing to believe that any bitswap-focused flakies here are the fault of the bitswap client itself. One expected difference here is that we're hitting the client with a little bit more parallelism than we currently are on
78dc756 to cf96aa6
* new total bitswap concurrency is 32
* new per-retrieval concurrency is 12
* `daemon` has both `--bitswap-concurrency` and `--bitswap-concurrency-per-retrieval`
* `fetch` just has `--bitswap-concurrency` that sets both values to be the same
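How the two commands map flags onto the two limits can be sketched as follows. This is only an illustration: it uses the standard library `flag` package rather than whatever CLI library Lassie uses, the `Concurrency` type and the `parseDaemon` / `parseFetch` helpers are hypothetical, and the default used for `fetch` is an assumption since it is not stated here.

```go
package main

import (
	"flag"
	"fmt"
)

// Concurrency holds the two limits described in the commit message.
// Hypothetical type for illustration only.
type Concurrency struct {
	Total        int
	PerRetrieval int
}

// assumedFetchDefault is an assumption; the default for fetch is not stated.
const assumedFetchDefault = 32

// parseDaemon exposes both flags, defaulting to 32 total and 12 per retrieval.
func parseDaemon(args []string) (Concurrency, error) {
	fs := flag.NewFlagSet("daemon", flag.ContinueOnError)
	total := fs.Int("bitswap-concurrency", 32, "total bitswap concurrency")
	per := fs.Int("bitswap-concurrency-per-retrieval", 12, "bitswap concurrency per retrieval")
	if err := fs.Parse(args); err != nil {
		return Concurrency{}, err
	}
	return Concurrency{Total: *total, PerRetrieval: *per}, nil
}

// parseFetch exposes a single flag and applies its value to both limits.
func parseFetch(args []string) (Concurrency, error) {
	fs := flag.NewFlagSet("fetch", flag.ContinueOnError)
	n := fs.Int("bitswap-concurrency", assumedFetchDefault, "bitswap concurrency")
	if err := fs.Parse(args); err != nil {
		return Concurrency{}, err
	}
	return Concurrency{Total: *n, PerRetrieval: *n}, nil
}

func main() {
	d, _ := parseDaemon([]string{"--bitswap-concurrency", "48"})
	f, _ := parseFetch([]string{"--bitswap-concurrency", "8"})
	fmt.Println(d, f) // prints "{48 12} {8 8}"
}
```

A single `fetch` invocation is one retrieval, so one value for both limits is enough there, while `daemon` serves many retrievals and needs the two limits set independently.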
89ce40f to 2c6819b
Argh! failures from bitswap flakies that I thought we'd addressed in #398
bb4c0f3 to 2cea5ef
2e5541c to 153e9b8
Those 2 flakies have been fixed: https://github.com/filecoin-project/lassie/compare/2c6819b8e32258bc02ef6afb4399173ce2205614..68b96aa49ee3f18a926e93e6842c7ce4ed8dd65b. It turns out it was about provider timeouts: 500ms is too short on slower Windows machines, where bitswap doesn't quite get set up in time. So I've extended the timeout on one and removed it on the other.