-
Notifications
You must be signed in to change notification settings - Fork 4
Open
Labels
enhancementNew feature or requestNew feature or requestgood first issueGood for newcomersGood for newcomers
Description
About
- Brief information about ark-services project
- The database's design as Key and Value in Mysql has performance issue when the database is growing. Basically, the issue is in order to pull the report of Ark ID with metatdata, we have to do mysql JOINs, but they take so long to get the results.
How to reproduce
- Install WAMPP or XAMPP into local development
- Follow these step to install the application https://github.com/digitalutsc/arks-service/wiki
- Contact Kyle to get the current database file.
- Go to phpmyadmin portal (http://localhost/phpmyadmin) or use MYSQL workbench , tryout these SQL
SELECT arks.*
FROM `dsu_ark`
AS arks
JOIN (
SELECT bound.id,
COALESCE(redirected._value, 0)
AS _value
FROM (
SELECT DISTINCT REGEXP_SUBSTR(_key, '^([^\\s]+)') AS id
FROM `dsu_ark`
WHERE _key LIKE '61220/utsc%' AND _key NOT REGEXP '(\\s:\/c|\\sREDIRECT|\\sPID|\\sLOCAL_ID|\\sCOLLECTION)$' AND (_key LIKE '%%' OR _value LIKE '%%')
) AS bound
LEFT JOIN (
SELECT REGEXP_SUBSTR(_key, '^([^\\s]+)') AS id, _value
FROM `dsu_ark`
WHERE _key LIKE '61220/utsc%' AND _key REGEXP '\\sREDIRECT$'
) AS redirected ON bound.id = redirected.id
ORDER BY _value ASC
LIMIT 10
OFFSET 0
) AS subquery
ON arks._key LIKE CONCAT(subquery.id, '%')
AND arks._key NOT LIKE '%:\/c'
ORDER BY arks._key ASC;
SELECT arks.*
FROM `dsu_ark`
AS arks
JOIN (
SELECT * FROM (
SELECT DISTINCT REGEXP_SUBSTR(_key, '^([^\\s]+)') AS id
FROM `dsu_ark`
WHERE _key LIKE '61220/utsc%' AND _key NOT REGEXP '(\\s:\/c|\\sREDIRECT|\\sPID|\\sLOCAL_ID|\\sCOLLECTION)$'
INTERSECT
SELECT DISTINCT REGEXP_SUBSTR(_key, '^([^\\s]+)') AS id
FROM `dsu_ark`
WHERE _key LIKE '61220/utsc%' AND _key NOT REGEXP '\\s:\/c' AND (_key LIKE '%%' OR _value LIKE '%%')
) AS target
ORDER BY id ASC
LIMIT 10
OFFSET 0
) AS subquery
ON arks._key LIKE CONCAT(subquery.id, '%')
AND arks._key NOT LIKE '%:\/c'
ORDER BY arks._key ASC;
- These query is querying the ARKs IDs with metadata which is bound to it. As
LIMIT 10which is querying 10 at a time, but it's take at least 15 secs to load which is quite slow for just 10 items.
Goals
- Develop MySQL query which can pull Arks IDs and their metadata more sufficiently in a acceptable loading time
- as 10, 25, 50 at a time (from the UI)
- All items (exporting to CSV feature)
Stretch Goals
- If the Goal above can't be achieved, we may look into restructure the database and look for alternative database platform to use
- Scope the migration and development, and testing performance in the new database platform, for example:
Related tickets
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
enhancementNew feature or requestNew feature or requestgood first issueGood for newcomersGood for newcomers
Type
Projects
Status
No status