Shared SQLAlchemy sessions #111

@kev365

Hi, I recently ran into an issue with the GCS importer code for openrelik-server.
For some background, I'm attempting to run the OSDFIR project in GKE, pulling data from a GCS bucket.

I built the importer code to deploy as a pod, and it worked great. However, when it received many Pub/Sub notifications at once, it started to have trouble. After some handy AI prompting, the issue appears to be caused by shared SQLAlchemy sessions.

Example error:
sqlalchemy.exc.InvalidRequestError: This session is provisioning a new connection; concurrent operations are not permitted.

I've attached an AI research report that hopefully explains it OK, certainly much better than I can. I've also tested a solution based on the Session Per Message (Per Thread) option in the report, and it appears to be working.
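For anyone skimming without opening the report, here is a minimal sketch of what I mean by session per message. It is not the actual importer code: `FakeSession`, `session_scope`, `handle_message`, and `run_demo` are hypothetical names, and `FakeSession` is only a stand-in so the example runs without a database. In a real deployment the factory would presumably be a SQLAlchemy `sessionmaker` bound to the engine.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


class FakeSession:
    """Stand-in for a SQLAlchemy Session; records calls instead of using a database."""

    def __init__(self):
        self.added = []
        self.committed = False
        self.rolled_back = False
        self.closed = False

    def add(self, obj):
        self.added.append(obj)

    def commit(self):
        self.committed = True

    def rollback(self):
        self.rolled_back = True

    def close(self):
        self.closed = True


@contextmanager
def session_scope(session_factory):
    """Open a fresh session, commit on success, roll back on error, always close."""
    session = session_factory()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()


def handle_message(session_factory, payload):
    """Hypothetical callback body: each message gets its own session, never a shared one."""
    with session_scope(session_factory) as session:
        session.add(payload)


def run_demo(n_messages=20, n_workers=8):
    """Process messages on a thread pool and return every session that was created."""
    created = []
    lock = threading.Lock()

    def factory():
        session = FakeSession()
        with lock:  # the list is shared across threads; the sessions are not
            created.append(session)
        return session

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(lambda p: handle_message(factory, p), range(n_messages)))
    return created
```

The point is only that the session is created and closed inside the callback, so two threads never touch the same session at the same time.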

Review of GCP Importer Code (OpenRelik Server).pdf

Here are some samples of the files I've worked with for the mentioned testing:

importer.txt
database.txt

There may be a side issue to look at. While watching the log output from the importer pod I tested, I eventually saw this message:
Dropping 125 items because they were leased too long

In this test, I'd uploaded almost 200 files of various sizes all at once, so there may still be other limits at play.
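My guess is that this message comes from the Pub/Sub client's lease management, which stops extending messages that have been held longer than the configured maximum lease duration. If so, limiting how many messages are outstanding at once might help. Below is a rough sketch, assuming the importer uses the `google-cloud-pubsub` streaming pull client; `flow_control_settings` and `start_subscriber` are hypothetical helper names, and the values are placeholders I have not tuned.

```python
def flow_control_settings(max_messages=10, max_lease_seconds=3600):
    """Build keyword arguments for pubsub_v1.types.FlowControl.

    max_messages caps how many messages are outstanding at once;
    max_lease_duration is how long the client keeps extending a lease.
    """
    return {
        "max_messages": max_messages,
        "max_lease_duration": max_lease_seconds,
    }


def start_subscriber(subscription_path, callback, settings):
    """Start a streaming pull with the given flow control (assumed usage)."""
    # Imported here so the helper above can be used without the client installed.
    from google.cloud import pubsub_v1

    subscriber = pubsub_v1.SubscriberClient()
    flow_control = pubsub_v1.types.FlowControl(**settings)
    # Returns a StreamingPullFuture; call .result() on it to block.
    return subscriber.subscribe(
        subscription_path, callback=callback, flow_control=flow_control
    )
```

With a smaller `max_messages`, fewer files would be in flight at the same time, so each one should be less likely to sit past its lease. I have not confirmed that this removes the warning.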
