bocas is an opinionated open source framework for organizing,
orchestrating, and ultimately publishing research experiments.
Some design highlights of bocas include:
- the ability to cache artifacts between experiment runs
- the de-coupling of plot generation and training jobs

bocas augments the ml-collections library to allow you to:
- describe an array of experiments in a single config
- run all of the experiments with a single command
- gather artifacts from the experiments
- aggregate the results into plots, tables, and figures for use in your final report
- easily combine results from multiple experiments
- and more!
- Basic: Oxford 102 flowers classification example
- Intermediate: Object detection benchmarks with KerasCV
- Overview
Using bocas is easy!
To get started, you need to be familiar with a few concepts.
This overview covers everything you need to know.
To quickly jump right into things, check out the Oxford 102 flowers classification example.
In the mental model of bocas, there exist Tasks and Tactics. A Task is
something like "classify images from MNIST", "cluster samples into N classes", or
"perform generative learning in X style".
A Tactic refers to the combination of all the details used to produce a
solution to a Task. For example, one such Tactic for solving MNIST classification might
be to train a ResNet50V2 on data augmented with AugMix.
Typically, a publishable result requires numerous baseline Tactics to benchmark
your novel Tactic against.
A research work will also typically have many Tasks, where the overall goal of the paper is to benchmark a new Tactic's ability to solve a variety of Tasks.
bocas is structured around this idea: you will have at least one Task, and
each Task may be solved by numerous tactics.
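
To make the relationship concrete, here is a minimal sketch in plain Python. The `Task` and `Tactic` dataclasses are hypothetical names used only for illustration; they are not part of the bocas API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tactic:
    # Hypothetical container for the details that make up one solution.
    model: str
    augmentation: str


@dataclass
class Task:
    # Hypothetical container: one problem and every Tactic tried on it.
    name: str
    tactics: list = field(default_factory=list)


mnist = Task(
    name="classify images from MNIST",
    tactics=[
        Tactic(model="ResNet50V2", augmentation="AugMix"),
        Tactic(model="ResNet50V2", augmentation="none"),
    ],
)

# One benchmark entry per (Task, Tactic) pair.
benchmark = [(mnist.name, tactic) for tactic in mnist.tactics]
```

Each additional Task contributes its own list of Tactics, so the benchmark grows with both.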
As such, I recommend breaking your codebase down at the Task level, structuring your
paper's artifact with splits made on the Task level. For example, a classification
paper might have the structure:
- tasks/
  - mnist/
    - ...
  - imagenet/
    - ...
bocas provides an opinionated framework for generating
Keeping these concepts in mind, bocas recommends that you structure your code
into three levels:
- library/ holds anything unique to your report/paper/publication. This might include a new augmentation, a new keras.Layer, a new loss function, or a new metric.
- tasks/ holds all of the tasks to benchmark your new technique on.
- paper/ holds the LaTeX or Markdown code required to render your paper.
  - paper/artifacts/ is a subdirectory of paper/ that holds all of the artifacts produced by the tasks. Typically, when running a Task sweep, you'll want to provide this directory to your scripts.
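
If you would like to create this layout programmatically, the following is a minimal sketch using only the standard library. The `scaffold` helper is a hypothetical name and is not something bocas provides; the demo writes into a temporary directory so that nothing is left behind.

```python
import tempfile
from pathlib import Path


def scaffold(root, task_names):
    """Create the recommended directory layout under `root`."""
    root = Path(root)
    created = []
    for relative in ["library", "tasks", "paper", "paper/artifacts"]:
        directory = root / relative
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    # One subdirectory per task, e.g. tasks/oxford_102/.
    for task in task_names:
        task_dir = root / "tasks" / task
        task_dir.mkdir(parents=True, exist_ok=True)
        created.append(task_dir)
    return created


with tempfile.TemporaryDirectory() as tmp:
    created = scaffold(tmp, ["oxford_102"])
    names = sorted(p.relative_to(tmp).as_posix() for p in created)
```

In a real project you would call `scaffold(".", ["oxford_102"])` (or similar) once from the repository root.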
Your tasks should be structured as follows:
All code for a task should reside in tasks/{task}/, e.g. tasks/oxford_102/.
You should create a run.py script. This script must define a run() function that
accepts an ml_collections.ConfigDict as its first positional argument. If you follow
the Oxford Flowers 102 example, your
run.py file will support both independent runs and mass-scale sweeps:
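
    import bocas
    import keras_cv
    import tensorflow_datasets as tfds


    def run(config):
        # Name the run after the swept value so results are easy to tell apart.
        name = f"{config.optimizer}"
        # The TFDS identifier for Oxford 102 flowers is "oxford_flowers102".
        # Resizing and batching are omitted here for brevity.
        train_ds, test_ds = tfds.load(
            "oxford_flowers102", as_supervised=True, split=["train", "test"]
        )
        model = keras_cv.models.ResNet50V2(
            include_rescaling=True,
            include_top=True,
            classes=102,
        )
        # Integer class labels call for a sparse categorical loss; tracking
        # accuracy on validation data yields the "accuracy" and "val_accuracy"
        # series used when plotting.
        model.compile(
            loss="sparse_categorical_crossentropy",
            optimizer=config.optimizer,
            metrics=["accuracy"],
        )
        history = model.fit(train_ds, validation_data=test_ds, epochs=10)
        return bocas.Result(
            name=name,
            artifacts=[
                bocas.artifacts.KerasHistory(history, name="fit_history"),
            ],
        )

Once you are happy with the results from a single run.py run, create a sweep.py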
config file. In sweep.py, specify an ml_collections.ConfigDict containing
bocas.Sweep objects for any value you'd like to sweep over.

    import bocas
    import ml_collections

    config = ml_collections.ConfigDict()
    config.static_value = 'any-string-or-int-or-float-or-python-object'
    config.optimizer = bocas.Sweep(['sgd', 'adam'])

Whenever a value of type bocas.Sweep is encountered, it is combined with every
other bocas.Sweep parameter: one run is launched for each combination in the
product of all swept values.
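
To build intuition for how swept values multiply, here is a minimal sketch of the expansion in plain Python. The `Sweep` class and `expand` function below are hypothetical stand-ins written for this illustration; they are not bocas's implementation, and a plain dict stands in for the ConfigDict.

```python
import itertools


class Sweep:
    """Hypothetical stand-in that marks a list of values to sweep over."""

    def __init__(self, values):
        self.values = list(values)


def expand(config):
    """Return one plain dict per combination of swept values."""
    swept = {k: v.values for k, v in config.items() if isinstance(v, Sweep)}
    static = {k: v for k, v in config.items() if not isinstance(v, Sweep)}
    runs = []
    # itertools.product yields every combination of the swept values.
    for combination in itertools.product(*swept.values()):
        runs.append({**static, **dict(zip(swept.keys(), combination))})
    return runs


config = {
    "static_value": 42,
    "optimizer": Sweep(["sgd", "adam"]),
    "model": Sweep(["resnet50", "densenet101"]),
}
runs = expand(config)
```

Two swept parameters with two values each produce four runs, and every run carries the same static values.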
Be careful with this! It is easy to create a lot of experiments:

    config = ml_collections.ConfigDict()
    config.learning_rate = bocas.Sweep([x/100 for x in range(5, 21)])
    config.optimizer = bocas.Sweep(['sgd', 'adam'])
    config.model = bocas.Sweep(
        ['resnet50', 'resnet50v2', 'densenet101', 'efficientnet']
    )

This configuration already contains 16 * 2 * 4, or 128, runs! That is probably
way more than you'd like. Try to define a few experiments that are all-encompassing.
To accomplish this, run hyperparameter sweeps separately, and hardcode the chosen values into
the final runs that are used to produce the charts.
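
A quick way to check the size of a sweep before launching it is to multiply the lengths of the swept lists. The sketch below does this in plain Python for the configuration above. The staged count assumes the learning rate is tuned once, on a single optimizer and model, which is an assumption made for this illustration.

```python
import math

sweep_sizes = {
    "learning_rate": len([x / 100 for x in range(5, 21)]),  # 0.05 through 0.20
    "optimizer": len(["sgd", "adam"]),
    "model": len(["resnet50", "resnet50v2", "densenet101", "efficientnet"]),
}
total_runs = math.prod(sweep_sizes.values())

# Tuning the learning rate separately and hardcoding the chosen value
# removes that axis from the final sweep.
final_sizes = {k: v for k, v in sweep_sizes.items() if k != "learning_rate"}
final_runs = math.prod(final_sizes.values())

# Assumption: the separate learning-rate sweep uses one optimizer and one model.
staged_runs = sweep_sizes["learning_rate"] + final_runs
```

Here the full product is 128 runs, while the staged approach needs 16 tuning runs plus 8 final runs.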
After all of your runs are complete, create some charts and plots. Save them to the
designated directory inside your paper/ directory so that they are rendered
into your updated paper.
I recommend writing a script that produces the desired plots from the artifacts and that can
be run entirely separately from the experiments themselves. An example of this can
be found in the oxford_102 example:

    # scripts/create_plots.py
    import bocas
    import luketils

    # Assumed location of the paper/ directory; adjust to your layout.
    paper_dir = "paper"

    results = bocas.Result.load_collection("artifacts/")
    metrics_to_plot = {}
    for experiment in results:
        metrics = experiment.get_artifact("fit_history").metrics
        metrics_to_plot[f"{experiment.name} Train"] = metrics["accuracy"]
        metrics_to_plot[f"{experiment.name} Validation"] = metrics["val_accuracy"]

    # Draw every experiment on one combined chart.
    luketils.visualization.line_plot(
        metrics_to_plot,
        path=f"{paper_dir}/results/combined-accuracy.png",
        title="Model Accuracy",
    )

Check out the full code in oxford_102.
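
The reason this separation works is that training only has to write metrics to disk, and plotting only has to read them back. Below is a minimal sketch of that idea using JSON files and the standard library. The `save_history` and `collect_metric` helpers, the one-file-per-experiment layout, and the accuracy numbers are all hypothetical placeholders; bocas's own on-disk format may differ.

```python
import json
import tempfile
from pathlib import Path


def save_history(artifacts_dir, name, history):
    """Training side: persist per-epoch metrics for one experiment."""
    path = Path(artifacts_dir) / f"{name}.json"
    path.write_text(json.dumps(history))


def collect_metric(artifacts_dir, metric):
    """Plotting side: gather one metric from every saved experiment."""
    series = {}
    for path in sorted(Path(artifacts_dir).glob("*.json")):
        history = json.loads(path.read_text())
        series[path.stem] = history[metric]
    return series


with tempfile.TemporaryDirectory() as artifacts_dir:
    # Placeholder numbers standing in for real training output.
    save_history(artifacts_dir, "sgd", {"accuracy": [0.2, 0.4]})
    save_history(artifacts_dir, "adam", {"accuracy": [0.3, 0.6]})
    # No training happens here; only the files on disk are read.
    metrics_to_plot = collect_metric(artifacts_dir, "accuracy")
```

Because the plotting side depends only on saved files, you can restyle or regenerate a figure without rerunning a single experiment.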
That's all it takes to get running with bocas. Please check out the
examples/ directory for more reading. It contains a few more patterns
that might be useful in structuring your experiments.
bocas is under active development
While the API is relatively straightforward and simple, bocas
lacks support for multi-worker experiment runs. This means that you will need to run
all of your experiments on a single machine. If you are running 10-20
fit() loops to convergence, this will likely be an extremely expensive process.
Personally, I'd rather just wait for my experiments to run than fiddle with a ton of infrastructure. That being said, I mainly run small-scale research.
If someone wants to contribute distributed runs, feel free!
Contributions are more than welcome to bocas.
Please see the GitHub issue tracker, and feel free to pick up any issue annotated
with Contribution Welcome.
Additionally, bug reports are not only welcome but encouraged.
Help me improve bocas!
I made this project because I needed the tool.
I'm sure many others do as well.
If you find this tool helpful, please toss a GitHub star on the repo and follow me on Twitter.
Thank you to all of our GitHub contributors: