Shared SQLAlchemy sessions #111
Description
Hi, I recently ran into an issue with the GCS importer code for openrelik-server.
For some background, I'm attempting to run the OSDFIR project in GKE, pulling data from a GCS bucket.
I built the importer code to deploy as a pod, and it worked well. However, when it received many Pub/Sub notifications at once, it started to fail. After some AI-assisted debugging, the issue appears to be related to shared SQLAlchemy sessions.
Example error:
sqlalchemy.exc.InvalidRequestError: This session is provisioning a new connection; concurrent operations are not permitted.
I've attached an AI research report that hopefully explains it well, certainly much better than I can. I've also tested a solution based on the Session Per Message (Per Thread) option in the report, and it appears to be working.
Review of GCP Importer Code (OpenRelik Server).pdf
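For illustration only, the session-per-message idea can be sketched roughly as below. This is not the importer's actual code: `handle_message`, `SessionLocal`, the in-memory SQLite URL, and the thread pool standing in for concurrent Pub/Sub callbacks are all assumptions made for the example. The point is that each message opens and closes its own Session instead of sharing one across threads.

```python
from concurrent.futures import ThreadPoolExecutor

from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

# Placeholder database URL; a real deployment would point at its own database.
engine = create_engine("sqlite://")

# The engine and session factory are safe to share between threads.
# Individual Session objects are not.
SessionLocal = sessionmaker(bind=engine)


def handle_message(payload: str) -> int:
    """Hypothetical per-message handler: one Session per message."""
    # A fresh Session is created here and closed when the block exits,
    # so no two threads ever operate on the same Session.
    with SessionLocal() as session:
        value = session.execute(
            text("SELECT length(:p)"), {"p": payload}
        ).scalar_one()
        session.commit()
    return value


# Simulate a burst of notifications handled concurrently.
payloads = [f"file-{i}" for i in range(50)]
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(handle_message, payloads))
```

With a single module-level Session used by every callback, the same burst can trigger the `InvalidRequestError` shown above, because several threads would use that Session at once.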
Here are some samples of the files I've worked with for the mentioned testing:
There may be a side issue to look at. While watching the messages from the importer pod I tested, I eventually saw this message:
Dropping 125 items because they were leased too long
In this test, I'd uploaded almost 200 files of various sizes all at once, so there may still be some other limits at play.