@agerasev

This PR suggests using a local thread pool instead of the global one.
This makes it possible to selectively limit the number of threads for tasks that consume a lot of RAM (like building FlashAttention) without limiting other tasks. It also resolves #2.
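As a rough illustration of the idea only, and not the code in this PR, here is a minimal Python sketch of a per-call (local) pool. The names `run_limited` and `PeakTracker` are hypothetical, and Python's standard `concurrent.futures` is assumed here purely for demonstration; the project's actual language and pool library may differ.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_limited(jobs, max_threads):
    """Run zero-argument callables on a pool that lives only for this call.

    The pool is local: its thread limit applies to these jobs alone, and
    calling this function again creates a separate, independent pool.
    """
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # map() yields results in the same order as the input jobs.
        return list(pool.map(lambda job: job(), jobs))


class PeakTracker:
    """Hypothetical helper that records how many wrapped jobs ran at once."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def wrap(self, fn):
        def tracked():
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                time.sleep(0.01)  # stand-in for real work
                return fn()
            finally:
                with self._lock:
                    self._active -= 1
        return tracked


if __name__ == "__main__":
    heavy, light = PeakTracker(), PeakTracker()
    # RAM-hungry jobs get a small pool; cheap jobs get a larger one.
    run_limited([heavy.wrap(lambda: None) for _ in range(8)], max_threads=2)
    run_limited([light.wrap(lambda: None) for _ in range(8)], max_threads=8)
    print(heavy.peak, light.peak)
```

Because each call owns its pool, several independent callers can coexist with different limits, whereas a process-wide pool has a single limit that every task shares.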

@agerasev mentioned this pull request on Jul 11, 2024

Development

Successfully merging this pull request may close these issues.

Multiple Builder instances
