Performance of CouchDB is an issue once reaching 20k records - require indexing of regularly queried fields #2505

@Mark-J-Lawrence

Background

We've had an ecosystem running for a couple of weeks now, and there are 17,774 run records in the CouchDB database.

This number was unexpected. After investigating, I found that our RAS cleanup process wasn't working correctly; that is now resolved and will delete ~11k of the 17k+ rows. However, the reason I started looking into this was a huge performance slowdown for API calls to the ecosystem, which would often result in 504s and group runs failing within our polling CI run. Our processes absolutely hammer the API endpoints and we need them to be performant.

The ecosystem should be able to handle 20k+ records in the database.

I think the key thing here is that we are constantly querying the DSS and RAS using:

  • group
  • runName
  • runId
  • from
  • to
  • we always want detail=methods

via the RAS API, and these parameters will map to database fields.

We also often query via our Eclipse plugin on requestor and owner, and sometimes on certain tags.

We also search regularly on streamName via the Streams API, which will relate to a database field.
Additionally, we search on namespace and propertyName via the CPS API, which will relate to database fields.

I suspect no indexing has been set up on any of the fields that customers regularly query. Queries on those fields would then result in a full database scan, which would explain the poor performance. CouchDB sets up a default "Primary Index" on the document ID (i.e. runId), which would explain why searching on that is very quick.

There are a number of ways to set up indexing in CouchDB, which are described well in this blog.
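As a rough illustration of one of those ways, a Mango JSON index can be created by POSTing an index definition to CouchDB's `/{db}/_index` endpoint. The sketch below is only an assumption of how that might look here: the server URL, the database name `galasa_run`, and the document field names (`runName`, `group`, `requestor`) are hypothetical and would need to match whatever the RAS store actually writes. Authentication is omitted.

```python
import json
import urllib.request

# Hypothetical values: the real server, database and field names may differ.
COUCHDB_URL = "http://localhost:5984"
DATABASE = "galasa_run"


def build_index_request(fields, name):
    """Build the JSON body for CouchDB's POST /{db}/_index endpoint."""
    return {
        "index": {"fields": list(fields)},
        "name": name,
        "type": "json",
    }


def create_index(fields, name, base_url=COUCHDB_URL, database=DATABASE):
    """Send the index definition to CouchDB and return the decoded response."""
    body = json.dumps(build_index_request(fields, name)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/{database}/_index",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example (needs a reachable CouchDB; field names are assumptions):
# for field in ("runName", "group", "requestor"):
#     print(create_index([field], f"{field}-index"))
```

A single-field index per commonly queried field is the simplest starting point. Whether a compound index (for example on group plus a time field) would serve the from/to queries better is something that would need measuring.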

Some evidence:

  • 10.3s (!!!) for a GET on https://<server>/api/ras/runs?runname=U134115
  • 483ms for a GET on the same run but using https://<server>/api/ras/runs?runId=cdb-db069dde-a163-40da-ae5e-ccd6910cf24d-1766079846823-U134115
  • 8.6s for a GET on the group that contains the above run using https://<server>/api/ras/runs?group=yueeYxJiFl
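Timings like these could be re-collected before and after any index change with a small script. This is a minimal sketch, not the method used to gather the numbers above: the `fetch` helper and the use of a bearer token for authentication are assumptions.

```python
import time
import urllib.request


def time_call(fn, repeats=3):
    """Call fn repeatedly and return each elapsed time in milliseconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def fetch(url, token):
    """GET the URL and read the full body (bearer-token auth is an assumption)."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


# Example (hypothetical server and token):
# url = "https://<server>/api/ras/runs?group=yueeYxJiFl"
# print(time_call(lambda: fetch(url, token="..."), repeats=5))
```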

Tasks

  • Investigate existing indexes to see if they are being used correctly
  • Make any necessary adjustments to the indexes that are created when the CouchDB RAS store initialises
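For the first task, CouchDB's `/{db}/_explain` endpoint reports which index a Mango query would use, and `GET /{db}/_index` lists the indexes that exist. The sketch below assumes the RAS store queries via Mango selectors; the server URL, database name and field name are hypothetical, and authentication is omitted.

```python
import json
import urllib.request

# Hypothetical values: the real server and database names may differ.
COUCHDB_URL = "http://localhost:5984"
DATABASE = "galasa_run"


def explain_query(selector, base_url=COUCHDB_URL, database=DATABASE):
    """Ask CouchDB which index it would pick for the given Mango selector."""
    body = json.dumps({"selector": selector}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/{database}/_explain",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def falls_back_to_all_docs(explain_response):
    """Return True when the plan uses the built-in _all_docs ("special") index.

    For selectors on fields other than _id, this typically means every
    document is scanned.
    """
    index = explain_response.get("index", {})
    return index.get("type") == "special"


# Example (needs a reachable CouchDB; "runName" is an assumed field name):
# plan = explain_query({"runName": "U134115"})
# print(plan["index"], falls_back_to_all_docs(plan))
```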

Labels

  • 6-Ecosystem: Ecosystem/Automation system issues
  • Needs Review: This work item needs reviewing by a member of the dev team
  • cics

Projects

Status

🏗 2 In progress
