Currently, when using the AsyncWriter, it is possible to hit an OOM error because the internal queue can grow without bound.
For instance, this snippet fills up the queue faster than the data can be sent to HDFS over HTTPS:
import csv
import random
import string

import hdfs

client = hdfs.InsecureClient(<valid arguments>)
with client.write("filename", encoding="utf-8") as file_handle:
    writer = csv.writer(file_handle)
    # Write 25 batches of 25 pseudo-random 100-character rows of CSV junk.
    for element in [["".join(random.choice(string.ascii_letters) for _ in range(100)) for _ in range(25)] for _ in range(25)]:
        writer.writerows(element)

This leads to unmanageably large memory usage.
Would it be possible to add a limit on the queue size when creating a file_handle?
If you like, I could open a PR with a possible solution; a rough sketch of the idea follows.
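For illustration, here is a minimal sketch, assuming the writer hands chunks to a background thread through a `queue.Queue` (the `BoundedAsyncWriter` name, `consume` callback, and `max_chunks` parameter are hypothetical, not the library's actual API). Passing `maxsize` to the queue makes `put()` block once the queue is full, so the producer is throttled to the consumer's upload speed instead of buffering everything in memory:

import queue
import threading

class BoundedAsyncWriter:
    """Illustrative only: buffers chunks in a bounded queue so the
    producing thread blocks (backpressure) instead of growing the
    queue without limit."""

    _SENTINEL = object()

    def __init__(self, consume, max_chunks=128):
        # consume: hypothetical callable that uploads one chunk (e.g. over HTTPS).
        # max_chunks: once this many chunks are pending, write() blocks.
        self._queue = queue.Queue(maxsize=max_chunks)
        self._consume = consume
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def _drain(self):
        # Background thread: drain the queue one chunk at a time.
        while True:
            chunk = self._queue.get()
            if chunk is self._SENTINEL:
                break
            self._consume(chunk)

    def write(self, chunk):
        # Blocks once max_chunks items are pending, capping memory use.
        self._queue.put(chunk)

    def close(self):
        # Signal the worker to finish and wait for pending chunks to flush.
        self._queue.put(self._SENTINEL)
        self._worker.join()

Memory usage would then be bounded by roughly `max_chunks` times the chunk size, and the limit could be exposed as an optional keyword argument on `client.write` so existing callers are unaffected.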