Skip to content
sherrya2 edited this page Feb 10, 2017 · 1 revision

Naming Scheme and Proper Practices

A few general things to consider when creating new tables or columns in the database:

  • Table and Column names should be in camel case. One exception is any use of the term “id” as an identifier, such as “schemeid” “gid” or “pid”. In these instances, the first letter of “id” does not need to be capitalized.
  • Avoid redundancy in table names. “project” is preferrable to “projectCatalog”. In general, stick to singular nouns in table names (“project” as opposed to “projects”).
  • Whenever possible, each table should have a primary key.
  • Use the TINYINT(1) type for boolean flags
  • Unless there is a good reason not to, use the VARCHAR type (with a reasonable provided size) for text fields
  • Try to provide specific identifiers, such as “pid” for a project identifier, rather than simply using the generic term “id”
  • If storing an ordering of data items in the database via a column, please name the column “sequence” and have the first object in the order have a sequence value of 1.

Table Descriptions and Assumptions

collection

This table contains a list of the various control collections found in schemes across the KORA install. Its columns:

  • collid - collection id.
  • schemeid - scheme id.
  • name
  • description
  • sequence - corresponds to a collection's sequence within its scheme (the list starts at 1).

controlPreset

This table contains a list of the various control presets (preset sets of control options) throughout KORA.

  • presetid
  • name
  • class - corresponds to the class column of the control table.
  • project - refers to the pid of the project that owns the preset, except in the case of a stock preset, where the project is set to 0.
  • global - flag refers to whether a preset is global (visible to all projects) or local (visible only to the project which owns it).
  • value - the value corresponding to the preset; its format depends on the control.

control

This table contains a set of metadata referring to the controls currently installed in the KORA system. It is populated by the “Update Control List” function on the System Management page. It has three columns:

  • name - the name that should be shown internally for the control.
  • file - the file in the controls/directory where the control is stored.
  • class - the name of the control class.
  • xmlPacked - whether or not the function stores its data in an XML format in the database as opposed to plaintext.

dublinCore

This table contains a set of controls relevant for Dublin Core metadata standards and use within KORA.

  • kid varchar – The object the file is part of
  • pid int – The project ID
  • sid int – The scheme ID
  • title varchar
  • creator varchar
  • subject varchar
  • description varchar
  • publisher varchar
  • contributor varchar
  • dateOriginal varchar
  • dateDigital varchar
  • type varchar
  • format varchar
  • source varchar
  • language varchar
  • relation varcher
  • coverage varcher
  • rights varcher
  • contributingInstitution varcher
  • timestamp – Records the last time the object updated

fixity

This table is used to store information about files stored in KORA and marked for archival. It has seven columns in three groups.

  • kid - the object the file is part of.
  • cid - the id of the control to which the file was ingested.
  • path - the absolute path to the file on the filesystem.
  • initialHash - the hash of the file upon initial ingestion.
  • initialTime - the time of that initial hash computation.
  • computedHash - the most recent hash computed.
  • computerTime - the time at which the most recent hash computation was performed.

The hash method is customizable, although SHA512 is the stock default.

member

This table is used to store a list of memberships in permissions groups, and indirectly to determine what users have access to a project and what projects a user has access to.

  • uid - user id.
  • pid - project id.
  • gid - a group id from the permGroup table,** except for search tokens, where the gid will be set to 0.**

permGroup

This table stores a list of the permissions groups in the KORA system.

  • gid - the group's identifier.
  • pid - the id of the project the group belongs to.
  • name
  • permissions - a bit vector corresponding to the group's permissions. For a more detailed explanation, consult the KORA Permissions Model section of this document.

project

This table stores a list of all the projects in the KORA system.

  • pid
  • name
  • description
  • styleid – will eventually correspond to an identifier for a per-project stylesheet functionality; this feature has not been implemented as of KORA 1.0.0.
  • admingid – the id of the project's administrators group.
  • defaultgid – the group which users are by default joined to when they are first added to the project.
  • active – flag that tells whether a project is active or inactive. Inactive projects are “disabled” – they provide a way to prevent access to a project without actually deleting its data and layout.
  • quota – maximum file size for the project
  • currentsize – current size of the project

recordPreset

This table contains a list of which records in KORA are presets.

  • recpresetid - a primary key, used to have a unique integer associated with each preset (which can simplify the passing of variables in some instances).
  • schemeid - the scheme to which the record belongs.
  • kid - the record which is a preset.
  • name

scheme

This table stores a list of all the schemes in the KORA system.

  • schemeid
  • schemeName
  • description
  • sequence
  • pid – the id of the project to which the scheme belongs.
  • dublinCoreFields – contains an XML string corresponding to a set of associations of controls in the scheme to Dublin Core fields.
  • dublinCoreOutOfDate – a flag indicating whether the dublin core fields have changed since the last time the Dublin Core data for the scheme was rebuilt.
  • nextid – the next integer available to base a unique KORA Identifier from.
  • crossProjectAllowed – contains an xml-packed list of what schemes have permission to associate objects to and from this scheme. Each entry in the block corresponds to a project/scheme which is allowed to associate to the scheme, while each entry in the block corresponds to a project/scheme from which the scheme is allowed to associate.
  • allowPreset – a flag indicating whether this scheme is a preset
  • publicIngestion – a flag indicating whether the record can be ingested from outside of KORA
  • legal – a legal notice that a user can define for a scheme

style

This table is in place for a feature, not implemented as of KORA 1.0.0, to allow per-project selection of different stylesheets.

  • styleid
  • filepath – will contain the name of the stylesheet file.
  • description – will contain a brief description of the stylesheet.

systemInfo

  • version – database version of KORA.
  • baseURL – the base URL for a user's website in KORA.

user

This table stores information about the users in the KORA system.

  • uid – user ID
  • username
  • password – a stored password, salted with the contents of the salt column, and encrypted with the SHA256 algorithm. The exception is for search tokens, where salt will be set to 0, and the username and password will be a 24-character hexadecimal string. Since the password hashes are considerably longer than 24 characters, this prevents search tokens from being used as login accounts.
  • salt – a timestamp used in the authentication process with username and password.
  • email – user email address.
  • realName – user's real name.
  • organization – user's organization.
  • language – user's default language
  • admin – a flag for whether the user is a KORA Administrator.
  • confirmed – stores whether the user has confirmed their e-mail address or not.
  • searchAccount – flag is set if the account is a search token.
  • allowPasswordReset – used if the user requests their password to be reset.
  • resetToken – see following.

Upon requesting the page be sent to them, a 16-character random string from the set [a-zA-z0-9] is filled into the resetToken column and the allowPasswordReset flag is set. If the user resets their password, the allowPasswordReset flag is set to 0, preventing the token from being reused.

Project Control Tables (p*Control)

The control tables are created for each project and correspond to the controls for the various schemes in the project.

  • cid - the table's primary key. The value 0 is reserved as an implied control for storing the list of reverse associations i.e. what records associate to a record.
  • schemeid - scheme the control belongs to.
  • collid - collection the control belongs to.
  • type - the class column in the control table.
  • name
  • description
  • sequence - controls the control's ordering in its collection
  • options - contains an XML string with various configuration options for the control.
  • required - flag that states whether input into the control is required.
  • searchable - controls whether the control's data is looked at during internal searches.
  • advSearchable - controls whether the control's data is looked at during advanced scheme searches
  • showInResults - flag that controls whether the control is shown in search results lists internally.
  • showInPublicResults - related to public ingestion.
  • publicEntry - related to public ingestion.

Public ingestion is a feature that has not been implemented as of KORA 1.0.0.

Project Data Tables (p*Data)

The per-project data tables contain only four columns:

  • id - the KORA record identifier
  • cid - the control the data row is for.
  • schemeid - the scheme the record belongs to.
  • value - the control to store the necessary data.

This data model was chosen after a long debate and careful analysis of the needs of the system. A more traditional data model, with a table for each scheme and a column for each control, had been tried in the past. While more conducive to searching, this led to a considerable amount of wasted data with empty columns in rows and to considerable stress on the database due to an overuse of ALTER TABLE commands when controls were added, deleted, or changed. As well, it reduced the flexibility to change control names and other settings. To avoid these flaws and improve performance, this model was devised to eliminated wasted space while still maintaining a reasonable level of search performance.

Clone this wiki locally