
Incremental Data Migration #18

@tomijaga

Description


Problem

We have a schema-based design that strictly adheres to Candid and its migration policy. This means users have to use either variants or optional fields to add new fields and types to their schema. Neither option provides a good user experience in Motoko, since handling them requires a series of switch statements and extractor functions just to access the newly added field or type. The alternative, adding the field directly to a record, produces an entirely new type that is incompatible with the old one. That forces a migration function to be run over all the stored values to convert them to the new type, at the cost of a considerable amount of cycles depending on how large the dataset is.
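For illustration, here is a minimal sketch of the optional-field workaround and the extractor boilerplate it forces on every caller (the Post type and getComments helper are hypothetical, not part of ZenDB):

    // Sketch of the optional-field workaround: the new `comments` field has to be
    // optional so the record type stays Candid-compatible with documents that were
    // stored before the field existed.
    type Post = {
        id : Nat;
        likes : Nat;
        comments : ?[Text];
    };

    // Every read of the new field needs an extractor like this one.
    func getComments(post : Post) : [Text] {
        switch (post.comments) {
            case (?comments) comments;
            case (null) [];
        };
    };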

Solution

Our schema-based approach can be an advantage here, because the user tells us their intended type definitions. With this information the db can format documents so that they match the intended type. Provided with a set of instructions from the user on how to upgrade the data between schemas, the db can retrieve previous documents and present them in the new type on demand, so the user does not have to spend cycles migrating all the data at once. Instead we unlock a new paradigm, incremental updates, where we migrate documents to the new type only when their content is being updated. For example, adding a new field comments to an existing type Post with two fields, id and likes, would look something like this.

    type PostV1 = {
        id : Nat;
        likes : Nat;
    };

    posts.insert({ id = 1; likes = 328_832 });
    posts.insert({ id = 2; likes = 232 });
    posts.insert({ id = 3; likes = 91 });
         

These three documents are stored internally in the db as PostV1.

    // Candid-style schema value for the new version
    let PostV2Schema = #Record([
        ("id", #Nat), ("likes", #Nat), ("comments", #Array(#Text))
    ]);

    // Corresponding Motoko record type
    type PostV2 = {
        id : Nat;
        likes : Nat;
        comments : [Text];
    };

    // Default value for the newly added field
    let migration_instructions = [("comments", #Array([]))];

    posts.update_schema(PostV2Schema, migration_instructions);
    
    assert posts.search(ZenDB.QueryBuilder()) == #ok([
        { id = 1; likes = 328_832; comments = [] },
        { id = 2; likes = 232; comments = [] },
        { id = 3; likes = 91; comments = [] }
    ]);

The schema is updated, but none of the contents are; they are still stored as PostV1.
However, when the data is retrieved, the migration_instructions are applied to the data stored as PostV1 to transform it to PostV2.
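Conceptually, the read path only has to merge the stored fields with the defaults from migration_instructions. The sketch below assumes a candid-style Value representation and an applyInstructions helper; both names are illustrative, not the actual ZenDB internals:

    import Array "mo:base/Array";
    import Option "mo:base/Option";

    // Hypothetical internal value representation, mirroring the candid-style
    // schema values used above (#Nat, #Text, #Array, #Record).
    type Value = {
        #Nat : Nat;
        #Text : Text;
        #Array : [Value];
        #Record : [(Text, Value)];
    };

    // Fill in any instructed field that the stored document is missing,
    // leaving fields that already exist untouched.
    func applyInstructions(doc : [(Text, Value)], instructions : [(Text, Value)]) : [(Text, Value)] {
        let missing = Array.filter<(Text, Value)>(
            instructions,
            func(entry : (Text, Value)) : Bool {
                Option.isNull(
                    Array.find<(Text, Value)>(doc, func(field : (Text, Value)) : Bool { field.0 == entry.0 })
                )
            }
        );
        Array.append<(Text, Value)>(doc, missing)
    };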

Internal Representation after transform and get

    id   content (stored as PostV1)
    1    { id = 1; likes = 328_832 }
    2    { id = 2; likes = 232 }
    3    { id = 3; likes = 91 }

Because this is most likely a query call, the transformed data will not be saved, so the contents are still stored as type PostV1. However, when a field is updated via an update call, the document is migrated and the result is stored as PostV2:

    let #ok(ids) = posts.update(
        ZenDB.QueryBuilder().Where("id", #eq(#Nat(1))),
        [
            ("likes", #add(#currValue, #Nat(100))),
            ("comments", #Array([#Text("migration")]))
        ]
    );

    assert posts.get(ids[0]) == #ok({ id = 1; likes = 328_932; comments = ["migration"] });

    let #ok(ids2) = posts.update(
        ZenDB.QueryBuilder().Where("id", #eq(#Nat(3))),
        [
            ("likes", #mul(#currValue, #Nat(100)))
        ]
    );

    assert posts.get(ids2[0]) == #ok({ id = 3; likes = 9_100; comments = [] });

Internal Representation after update

    id   content (stored as PostV1)
    2    { id = 2; likes = 232 }

    id   content (stored as PostV2)
    1    { id = 1; likes = 328_932; comments = ["migration"] }
    3    { id = 3; likes = 9_100; comments = [] }

Proposed Features

  • Support Schema Versioning, so users still have access to all the previously saved schemas after a schema is updated
  • Add instructions to migrate data when updating a schema
  • Incrementally migrate documents to the new version of the schema whenever their contents are updated
  • An internal index to keep track of the documents and their schema versions (see the sketch after this list)
    • This will allow users to view the number of documents that have been migrated to each schema version
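A minimal sketch of what that version index could look like, assuming a plain TrieMap keyed by document id (all names here are hypothetical, not the actual ZenDB internals):

    import TrieMap "mo:base/TrieMap";
    import Nat "mo:base/Nat";
    import Hash "mo:base/Hash";

    // Hypothetical version index: document id -> schema version it is stored as.
    let versionIndex = TrieMap.TrieMap<Nat, Nat>(Nat.equal, Hash.hash);

    // Record the schema version a document was last written with.
    func setVersion(docId : Nat, version : Nat) {
        versionIndex.put(docId, version);
    };

    // Count how many documents are currently stored at a given schema version.
    func countAtVersion(version : Nat) : Nat {
        var count = 0;
        for ((_, v) in versionIndex.entries()) {
            if (v == version) { count += 1 };
        };
        count
    };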

Incremental get() migrations

Initially, I thought it would be best to just apply the migration function to the results of any query that doesn't update the data, to accommodate canister query calls that can't persist state. After taking a look at the new-base Motoko library, I noticed an isReplicated() method that essentially checks whether the code is currently executing in an update call or a query call: true means execution is replicated (an update call), false means it is not (a query call). With this we can decide whether to store the migrated data: if execution is replicated, we persist the document under the new schema; if not, we just return the migrated data without saving it.
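Putting that together, the read path could branch on the replication check before deciding whether to persist. This is only a sketch: it assumes isReplicated() behaves as described above, reuses the hypothetical applyInstructions, setVersion, and migration_instructions names from the earlier sketches, and uses storedDocs as a stand-in for the db's internal document store.

    // Sketch of a migration-aware read path. isReplicated(), storedDocs,
    // applyInstructions, setVersion, and migration_instructions are assumed
    // helpers from the sketches above, not the actual ZenDB API.
    func getWithMigration(docId : Nat) : ?[(Text, Value)] {
        switch (storedDocs.get(docId)) {
            case (null) null;
            case (?doc) {
                let migrated = applyInstructions(doc, migration_instructions);
                if (isReplicated()) {
                    // Replicated (update) execution: safe to persist the migrated
                    // document and bump its version in the index.
                    storedDocs.put(docId, migrated);
                    setVersion(docId, 2);
                };
                // Non-replicated (query) execution: just return the migrated view;
                // any state change would be discarded anyway.
                ?migrated
            };
        };
    };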
