Example org to demonstrate creating a monorepo with multiple shards that are synced to dedicated read-only repos using git subtrees.
Git subtrees allow nesting one or more repositories inside another, each within its own subdirectory. Changes can then be synced to read-only child repositories in near real time. This allows development of a Crystal project to reap the benefits of a monorepo while still adhering to the "1 shard per repo" requirement of Shards. This repo represents a monorepo of a mock project to demonstrate the process and how it looks.
There are some things worth pointing out based on my experience playing around with this. I am definitely open to suggestions/PRs on how to address these, or to add extra info from your experiences:
- Individual commits from `subtree add` lose association with the prefix.
  - E.g. see https://github.com/crystal-manyrepos/root/commits/master/src/components/one and notice how it does not include the `Add .sum method` commit, only the merge commit from when it was added.
- `git subtree push` needs to traverse EVERY commit, which could lead to performance issues as time goes on.
  - There are ways that this can be improved, so it only needs attention if/when there is a reproducible case of this issue (thousands of commits).
This repo COULD be used as the main shard for the project by defining a `shard.yml` that adds the required components as dependencies, then creating a `src/root.cr` file that does `require "./one"`, where `src/one.cr` in turn does `require "one"`. This way single components can be required individually, or all of them at once.
NOTE: Using this shard results in the source code being duplicated, once from the required child shards, and once from `src/`. However, since the code from `src/` is never directly required, it won't be included in the binary.
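The layout described above can be sketched as a small scaffold script. This is only an illustration under assumptions: the shard name `root`, the version, and the child repo name `crystal-manyrepos/one` are hypothetical stand-ins, and the files are written to a scratch directory rather than a real checkout.

```shell
#!/bin/sh
set -eu

# Scratch directory so the sketch does not touch a real checkout (assumption).
ROOT_DIR="${ROOT_DIR:-$(mktemp -d)}"
mkdir -p "${ROOT_DIR}/src"

# shard.yml: the root shard depends on each child shard.
# Name, version, and repo path are hypothetical.
cat > "${ROOT_DIR}/shard.yml" <<'EOF'
name: root
version: 0.1.0

dependencies:
  one:
    github: crystal-manyrepos/one
EOF

# src/one.cr: entry point for a single component. It requires the
# installed child shard by name, not the copy under src/components/.
cat > "${ROOT_DIR}/src/one.cr" <<'EOF'
require "one"
EOF

# src/root.cr: entry point that pulls in every component at once.
cat > "${ROOT_DIR}/src/root.cr" <<'EOF'
require "./one"
EOF

echo "Scaffolded files in ${ROOT_DIR}"
```

With more components, `shard.yml` would gain one dependency per child repo and `src/root.cr` one `require` line per entry point file.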
With regard to versioning, one option is to version everything together by syncing tags down to the child repos. Another option would be to version each component on its own within the child repos themselves.
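The first option could be sketched roughly as follows. This is not taken from this repo's scripts: the `sync_tag` function, the `ORG_URL` variable, and the assumption that each child repo is named after its component directory are all hypothetical. The idea is to split out the history of one prefix at a tagged commit with `git subtree split`, then push the resulting commit to the child repo under the same tag name.

```shell
#!/bin/sh
set -eu

# Base URL of the child repos (assumption: child repo name == component name).
ORG_URL="${ORG_URL:-git@github.com:crystal-manyrepos}"

# sync_tag <component> <tag>
# Must be run from the top level of the monorepo working tree.
sync_tag() {
  name="$1"
  tag="$2"

  # git subtree split prints the SHA of a commit whose history contains
  # only the files under the given prefix, as of the tagged commit.
  sha="$(git subtree split --prefix "src/components/${name}" "${tag}")"

  # Create a tag with the same name in the child repo, pointing at that commit.
  git push "${ORG_URL}/${name}.git" "${sha}:refs/tags/${tag}"
}
```

Usage would be along the lines of `sync_tag one v0.1.0` for each component after tagging the monorepo.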
- Add the repo as a subtree in a related component directory, keeping past history: `git subtree add --squash --prefix src/components/<component-name> git@github.com:crystal-manyrepos/<repo-name>.git <branch>`
  - The `--squash` option can be used to add the child repo's history as one commit, versus essentially duplicating it into the `root` repo. Due to the first gotcha, squashing the history makes the most sense, as you would need to use the child repo anyway to look at the full history of files/directories. In this repo `three` was squashed while `one` and `two` were not.
  - NOTE: If using a dev branch and creating a PR into `master`, be sure to NOT squash merge the PR, especially when adding more than one child repo, as this will break the special text in the commit message that `git subtree` uses. Create a merge commit or rebase merge to ensure the commits added by `git subtree add` are not altered.
- Update `scripts/sync.sh` to handle the new repo
- Add the new repo to `shard.dev.yml` as a dependency
- (optional) Add it to `shard.yml` and/or `src/root.cr`, as well as making an entry point file within `src/`, if this repo is a shard itself
- Do development in this repo
  - (optional) A `shard.dev.yml` can be used for testing purposes by installing all child components as symlinks to their `src/components/` directory: `SHARDS_OVERRIDE=shard.dev.yml shards update`
- Push up/merge a PR with changes into the `master` branch
- Sync changes to remotes: `./scripts/sync.sh`
- See commits have been synced:
The `sync.sh` script would ideally be invoked as part of a CD flow. I.e. once a PR's checks pass and it is approved and merged into `master`, the sync script would run to push the changes to each child repo in near real time. While there are other options, this can most easily be accomplished by defining a GitHub Action on the `push` event scoped to the `master` branch. An example of how this would look is `.github/workflows/sync.yml`, where `SYNC_TOKEN` is a Personal Access Token (PAT) with the `repo` permissions, added as an Encrypted Secret.
NOTE: Since a PAT cannot be scoped to certain repos, it is suggested to create a dedicated Machine User that only has access to the child repos, and no personal repos, to handle the syncing.
An example run using this workflow for the previous 900cb commit to root would look like: https://github.com/crystal-manyrepos/root/runs/4496182825?check_suite_focus=true.
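For reference, a sync script along the lines described above might look like the following minimal sketch. It is not the actual `scripts/sync.sh` from this repo: the `COMPONENTS`, `ORG_URL`, `BRANCH`, and `DRY_RUN` variables are hypothetical, each child repo is assumed to be named after its component directory, and the sketch defaults to a dry run that only prints the commands it would execute.

```shell
#!/bin/sh
set -eu

# Space-separated component names; each maps to src/components/<name>
# and, by assumption, to a child repo of the same name.
COMPONENTS="${COMPONENTS:-one two three}"
ORG_URL="${ORG_URL:-git@github.com:crystal-manyrepos}"
BRANCH="${BRANCH:-master}"
# Safe default for a sketch: print commands unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# Push the history of one component's prefix to its child repo.
sync_component() {
  name="$1"
  prefix="src/components/${name}"
  remote="${ORG_URL}/${name}.git"

  if [ "${DRY_RUN}" = "1" ]; then
    echo "git subtree push --prefix ${prefix} ${remote} ${BRANCH}"
  else
    git subtree push --prefix "${prefix}" "${remote}" "${BRANCH}"
  fi
}

# Sync every component in turn, stopping at the first failure (set -e).
sync_all() {
  for name in ${COMPONENTS}; do
    sync_component "${name}"
  done
}

sync_all
```

Adding a new component then only requires extending `COMPONENTS`. In a CI context, `ORG_URL` would presumably be overridden with an HTTPS URL that authenticates using the token.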