Bug: from CE (consistency) dark_action, an unknown database exception has occurred #940

@mrguilima

Bug Description

The consistency enforcement (CE) script declare_dark.py finishes with "rucio error: Database exception.\nDetails: An unknown Database Exception has occurred".
The failure is not intermittent: it happens on every run for two sites, T1_DE_KIT_Tape and T3_US_Colorado.

With Eric's help, I looked into the server-rucio-server logs for more information about this database exception.
Here is the Python traceback:

[Thu Jul 10 14:33:18.651722 2025] [wsgi:error] [pid 58:tid 621] [remote 10.100.2.157:50248]
{"message": "DatabaseException in rucio.web.rest.flaskapi.v1.replicas QuarantineReplicas POST\\n
Database exception.\\n
Details: (cx_Oracle.IntegrityError) ORA-01400: cannot insert NULL into (\\"CMS_RUCIO_PROD\\".\\"QUARANTINED_REPLICAS\\".\\"RSE_ID\\")\\n
[SQL: INSERT INTO \\"CMS_RUCIO_PROD\\".quarantined_replicas (created_at, updated_at) VALUES (:created_at, :updated_at)]\\n
[parameters: {'created_at': datetime.datetime(2025, 7, 10, 14, 33, 18, 647161),
              'updated_at': datetime.datetime(2025, 7, 10, 14, 33, 18, 647163)}]\\n
(Background on this error at: https://sqlalche.me/e/20/gkpj)\\n
  File \\"/usr/local/lib/python3.9/site-packages/rucio/web/rest/flaskapi/v1/common.py\\", line 100, in dispatch_request\\n
     return super(ErrorHandlingMethodView, self).dispatch_request(*args, **kwargs)\\n
  File \\"/usr/local/lib/python3.9/site-packages/flask/views.py\\", line 191, in dispatch_request\\n
     return current_app.ensure_sync(meth)(**kwargs)  # type: ignore[no-any-return]\\n
  File \\"/usr/local/lib/python3.9/site-packages/rucio/web/rest/flaskapi/v1/replicas.py\\", line 952, in post\\n
     quarantine_file_replicas(replicas, issuer, rse=rse, rse_id=rse_id, vo=vo)\\n
  File \\"/usr/local/lib/python3.9/site-packages/rucio/gateway/quarantined_replica.py\\", line 76, in quarantine_file_replicas\\n
     add_quarantined_replicas(rse_id, replica_infos, session=session)\\n
  File \\"/usr/lib64/python3.9/contextlib.py\\", line 137, in __exit__\\n
     self.gen.throw(typ, value, traceback)\\n
  File \\"/usr/local/lib/python3.9/site-packages/rucio/db/sqla/session.py\\", line 524, in db_session\\n
     raise DatabaseException(str(error))\\n",
  "error": {"type": "DatabaseException", "message": "Database exception.\\n

  "@timestamp": "2025-07-10T14:33:18.648Z", "log": {"level": "DEBUG", "logger": "root"}, "process": {"pid": 58}}
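The key detail in the traceback is the ORA-01400 line, which names the column that received NULL (RSE_ID), while the logged INSERT statement carries only created_at and updated_at. To scan other log lines for the same failure, a minimal sketch along these lines may help. The function name `find_null_insert` and the pattern are illustrative assumptions, not part of Rucio; the sample string is copied from the log above.

```python
import re

# Matches Oracle's "cannot insert NULL" error as it appears in the log above.
# The log escapes quotes with backslashes, so any run of backslashes and
# double quotes around each identifier is treated as optional noise.
ORA_01400 = re.compile(
    r'ORA-01400: cannot insert NULL into \('
    r'[\\"]*(?P<schema>\w+)[\\"]*\.'
    r'[\\"]*(?P<table>\w+)[\\"]*\.'
    r'[\\"]*(?P<column>\w+)[\\"]*\)'
)


def find_null_insert(log_text):
    """Return (schema, table, column) for the first ORA-01400 found, else None.

    Hypothetical helper for triage; not a Rucio function.
    """
    match = ORA_01400.search(log_text)
    if match is None:
        return None
    return match.group("schema"), match.group("table"), match.group("column")


# Sample line copied from the traceback in this issue.
sample = (
    r'Details: (cx_Oracle.IntegrityError) ORA-01400: cannot insert NULL into '
    r'(\\"CMS_RUCIO_PROD\\".\\"QUARANTINED_REPLICAS\\".\\"RSE_ID\\")\\n'
)
print(find_null_insert(sample))
```

Run over the server logs for both affected sites, this would show whether every failure points at the same column.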

Reproduction Steps

The problem occurs every time the CE scripts run for these two sites: T1_DE_KIT_Tape and T3_US_Colorado.
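Independently of the CE scripts, the class of error seen in the server log can be illustrated in isolation: an INSERT that supplies only the timestamp columns is rejected when the table requires a non-NULL rse_id. The sketch below is only an analogy under stated assumptions. It uses an in-memory SQLite database instead of Oracle, a deliberately simplified stand-in for the quarantined_replicas table (only RSE_ID is known from the log to be NOT NULL), and a hypothetical `insert_row` helper. It does not reproduce the Rucio code path or explain why the row reached the database without an RSE ID.

```python
import sqlite3
from datetime import datetime

# In-memory SQLite stand-in; the production database is Oracle.
conn = sqlite3.connect(":memory:")

# Simplified, assumed schema: the real table has more columns.
conn.execute(
    """
    CREATE TABLE quarantined_replicas (
        rse_id     TEXT NOT NULL,
        created_at TEXT,
        updated_at TEXT
    )
    """
)


def insert_row(connection, row):
    """Insert a dict of column -> value using named parameters (hypothetical helper)."""
    columns = ", ".join(row)
    placeholders = ", ".join(":" + name for name in row)
    connection.execute(
        f"INSERT INTO quarantined_replicas ({columns}) VALUES ({placeholders})",
        row,
    )


now = datetime(2025, 7, 10, 14, 33, 18).isoformat()

# Same shape as the logged statement: only created_at and updated_at are supplied.
try:
    insert_row(conn, {"created_at": now, "updated_at": now})
    outcome = "inserted"
except sqlite3.IntegrityError as error:
    outcome = f"rejected: {error}"
print(outcome)

# A row that carries an RSE ID (placeholder value) is accepted.
insert_row(conn, {"rse_id": "hypothetical-rse-id", "created_at": now, "updated_at": now})
row_count = conn.execute("SELECT COUNT(*) FROM quarantined_replicas").fetchone()[0]
print(row_count)
```

The point of the comparison is that the logged statement shape alone is enough to trigger the integrity error, so the open question is why the replica entries for these two sites reach the insert without an RSE ID.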

Expected Behavior

The dark files should be quarantined without any database exception.

Possible Solution

No response

Related Issues

No response
