Redesign #27

@SomajitDey

  • refs must be branches; otherwise GraphQL queries fail, GitHub doesn't track activity, and the REST API doesn't list all refs pointing to an expiry commit for GC
  • shard refs as refs/kv-ab/val-cdef01.... for key-commit hash abcdef01...., just as git does in .git/objects. Shard length should be an optional argument (default 2, as in ab), because if pre-sharding (see Discussions on scale #29) is done at the organization level, the shard length within a repo should be > 2
  • data goes in a data-blob; metadata goes in a meta-blob (for CDN access) as well as in the commit message (for REST API access). Each key or value is now a commit consisting of a data-blob, a meta-blob, the usual tree containing paths with extensions (including metadata.json), and a commit message carrying a duplicate of the metadata
  • metadata contains JSON: { version, category, blobOid, treeOid, type, mime, extension, encrypted, encryption }. encryption contains the user-defined {kid, algo} to be fed to the decryptor. version is the git-keyval version; category is one of data | expiry | map | array | counter | redirect
  • expiry commits are special. Every expiry commit encapsulates the same dummy tree (anchored against git-GC by a tag), with the expiry ID residing in the commit message
  • commit messages are obtained at once with GET /ref REST API
  • expiry commit Oids are used as tags in this (public) repo, immutably mapping to the expiry ID, so the mapping is available over CDN even if the user's repo is private. This spares a rate-limited REST API call. Cache lookups using a local lru-cache
  • if most users use the database in a Redis-like fashion, with memorable string-typed keys, then they don't need to commit the keys or have a sharded-key-ref anchor the key against git-GC. Encourage users not to store keys if possible
  • no expiry-ref for persistent keys
  • only one statically-named (hence indexable by the GitHub backend) mandatory value-ref per key, mapping the value commit against keyId (and anchoring it against git-GC), as desired for speed, consistency and economy
  • redirect commits are commit-message-only commits, like expiry commits. They redirect to other repositories containing the object
  • counter is also commit-message-only, for rapid INCR ops and lookups
  • arrays and maps are trees with path-name-based keys/indices, and a commit message recording the metadata shared by all the blobs
  • array indices and map keys can have individual expiries, denoted as a keyId-elementId => expiryID mapping in refs/heads
  • Use base_tree and tree.sha: null with https://docs.github.com/en/rest/git/trees?apiVersion=2022-11-28#create-a-tree
  • counter increment protocol:
    • try a direct update first; if it is denied, then:
    • leave a breadcrumb ref in tags/, with a known prefix and a timestamp in the ref name, pointing to a commit whose message is the increment interval, e.g. +1, +2
    • an unconditional SET op leaves its own breadcrumb, also with a known prefix and a timestamp in the ref name, to invalidate earlier breadcrumbs pointing to null commit
    • a periodic gatherer aggregates the breadcrumbs in tags/ and deletes them
    • breadcrumbs that are not collected and deleted are GC'd based on their timestamp
  • the GETDEL command is to be implemented by renaming the branch into refs/gc/, where refs are dumped for GC
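
The ref-sharding rule above can be sketched as a small helper. This is only an illustration: the function name `sharded_ref` and the `prefix` parameter are hypothetical, and the layout simply follows the refs/kv-ab/val-cdef01.... example with a configurable shard length.

```python
def sharded_ref(key_oid: str, shard_len: int = 2, prefix: str = "refs") -> str:
    """Build a sharded ref name from a key-commit hash.

    The first `shard_len` hex characters name the shard directory and the
    remainder names the value-ref, mirroring the layout of .git/objects.
    """
    oid = key_oid.lower()
    if not oid or any(c not in "0123456789abcdef" for c in oid):
        raise ValueError("key_oid must be a non-empty hex string")
    if not 0 < shard_len < len(oid):
        raise ValueError("shard_len must leave at least one character for the leaf")
    return f"{prefix}/kv-{oid[:shard_len]}/val-{oid[shard_len:]}"


print(sharded_ref("abcdef01"))               # default shard length of 2
print(sharded_ref("abcdef01", shard_len=3))  # longer shard, e.g. with org-level pre-sharding
```

Keeping the shard length as an argument means the same helper works whether or not pre-sharding is done at the organization level.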
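
A minimal sketch of the metadata record listed above, assuming the same serialized string is written both to metadata.json and to the commit message. The helper name `build_metadata`, the validation rules and all field values in the usage are hypothetical; only the field names and categories come from the list.

```python
import json

CATEGORIES = {"data", "expiry", "map", "array", "counter", "redirect"}
FIELDS = ("version", "category", "blobOid", "treeOid", "type",
          "mime", "extension", "encrypted", "encryption")


def build_metadata(**fields) -> str:
    """Validate the metadata fields and serialize them deterministically."""
    missing = [name for name in FIELDS if name not in fields]
    if missing:
        raise ValueError(f"missing metadata fields: {missing}")
    if fields["category"] not in CATEGORIES:
        raise ValueError(f"unknown category: {fields['category']!r}")
    # Assumption: an encrypted object must say how to decrypt it via {kid, algo}.
    if fields["encrypted"] and not {"kid", "algo"} <= set(fields["encryption"] or {}):
        raise ValueError("encrypted objects need encryption = {kid, algo}")
    # Only the known fields are kept; sorted keys give a stable string.
    record = {name: fields[name] for name in FIELDS}
    return json.dumps(record, sort_keys=True, separators=(",", ":"))


meta = build_metadata(
    version="0.0.0", category="data", blobOid="<blob oid>", treeOid="<tree oid>",
    type="string", mime="text/plain", extension="txt",
    encrypted=False, encryption=None,
)
print(meta)
```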
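
For the base_tree point, the request body for the create-a-tree endpoint could be assembled roughly as below. In that API, a tree entry whose sha is null deletes the path from the base tree, which is how a map key or array index could be removed without re-uploading its siblings. The helper name `tree_update_payload` and the paths are hypothetical; sending the request is left out.

```python
import json


def tree_update_payload(base_tree: str, upserts: dict, deletions: list) -> dict:
    """Build the JSON body for POST /repos/{owner}/{repo}/git/trees.

    `upserts` maps path -> blob sha to add or replace on top of `base_tree`;
    `deletions` lists paths to drop, expressed as entries with sha = None.
    """
    entries = [
        {"path": path, "mode": "100644", "type": "blob", "sha": sha}
        for path, sha in upserts.items()
    ]
    entries += [
        {"path": path, "mode": "100644", "type": "blob", "sha": None}
        for path in deletions
    ]
    return {"base_tree": base_tree, "tree": entries}


payload = tree_update_payload(
    base_tree="<oid of current map tree>",
    upserts={"color.txt": "<oid of new blob>"},
    deletions=["size.txt"],
)
print(json.dumps(payload, indent=2))  # the deleted path serializes with "sha": null
```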
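
The counter increment protocol can be illustrated with an in-memory model. This is a sketch under several assumptions, not the planned implementation: `revision` stands in for the commit the counter ref currently points to, a denied update is modelled as a stale revision, breadcrumbs are plain tuples instead of refs in tags/, and a SET breadcrumb is taken to invalidate every increment breadcrumb with an earlier timestamp. All names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CounterModel:
    value: int = 0
    revision: int = 0  # stand-in for the commit the counter ref points to
    breadcrumbs: list = field(default_factory=list)  # (timestamp, kind, amount)

    def incr(self, amount: int, seen_revision: int, now: float) -> bool:
        """Try a direct update; if the ref has moved, leave a breadcrumb instead."""
        if seen_revision == self.revision:
            self.value += amount
            self.revision += 1
            return True
        self.breadcrumbs.append((now, "incr", amount))
        return False

    def set_value(self, new_value: int, now: float) -> None:
        """Unconditional SET: overwrite and leave a breadcrumb that voids older ones."""
        self.value = new_value
        self.revision += 1
        self.breadcrumbs.append((now, "set", 0))

    def gather(self) -> int:
        """Fold increments newer than the latest SET into the value, then delete all."""
        last_set = max((ts for ts, kind, _ in self.breadcrumbs if kind == "set"),
                       default=float("-inf"))
        pending = sum(amount for ts, kind, amount in self.breadcrumbs
                      if kind == "incr" and ts > last_set)
        if pending:
            self.value += pending
            self.revision += 1
        self.breadcrumbs.clear()
        return self.value

    def expire(self, cutoff: float) -> None:
        """Drop uncollected breadcrumbs older than `cutoff` (timestamp-based GC)."""
        self.breadcrumbs = [b for b in self.breadcrumbs if b[0] >= cutoff]


counter = CounterModel()
counter.incr(1, seen_revision=0, now=1)   # direct update succeeds
counter.incr(2, seen_revision=0, now=2)   # stale revision: breadcrumb left
counter.set_value(10, now=3)              # voids the breadcrumb from now=2
counter.incr(5, seen_revision=1, now=4)   # stale again: breadcrumb left
print(counter.gather())
```

In this model only the increment recorded after the SET survives aggregation, so concurrent writers never block each other and the gatherer settles the final value later.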
