
Working with multiple source files w/ .cb.nb #45

@rossbar


I recently ran into a problem when combining multiple source files into one larger document, e.g. multiple files representing book chapters. If each file is set up to run on its own with the notebook executor (i.e. .cb.nb), then execution fails silently when the files are executed and combined into a single document.

Minimal reproducing example

Say you have two source files ch1.md and ch2.md that you want to execute+compile into book.pdf:

Contents of ch1.md:

# Ch. 1 - Uniform distribution

A histogram of uniformly-distributed random numbers.

```{.python .cb.nb jupyter_kernel=python3}
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng()
plt.hist(rng.uniform(size=1000))
```

Contents of ch2.md:

# Ch 2. - Normal Distribution

A histogram of normally-distributed random numbers.

```{.python .cb.nb jupyter_kernel=python3}
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng()
plt.hist(rng.standard_normal(size=1000))
```

Executing/converting the files individually works as expected:

$ codebraid pandoc --from markdown --to pdf ch1.md --standalone -o book.pdf

However, if you try to compile both documents into a single book, neither document is executed, and no warning or error is given on the command line:

$ codebraid pandoc --from markdown --to pdf ch1.md ch2.md --standalone -o book.pdf

In the latter case, if you look at the output book.pdf you will find an error printed:

SOURCE ERROR in "ch2.md" near line 6:
Some options are only valid for the first code chunk in a session: "jupyter_kernel"

IMO it would be helpful to the user if this error were raised at the command line rather than (or in addition to being) embedded in the output document. In my actual use-case with much larger chapters, it was a very long time before I noticed this in the output book.

The error in book.pdf seems to suggest that the problem lies with the "special" metadata jupyter_kernel, which is only supposed to be supplied in the first code cell. This suggests that an author would have to modify source file metadata if they wanted to switch between building individual chapters and the entire book. I hadn't noticed this mentioned in the docs before - if it's not there, then it would be an improvement if this behavior were documented.
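To make the suspected cause concrete, here is a rough pre-flight check that flags the same condition before a build. It is only a sketch of my reading of the error message, not Codebraid's actual logic: the function name `find_first_chunk_violations`, the regex-based attribute parsing, and the `<default>` session placeholder are all made up for illustration. It assumes chunks that do not set a `session=` attribute end up in one shared session, and it ignores any other detail of how Codebraid groups chunks into sessions.

```python
import re
from collections import defaultdict

# Opening fence with an attribute block, e.g. {.python .cb.nb jupyter_kernel=python3}.
# Simplification: attributes are split on whitespace, so quoted values with
# spaces are not handled.
CHUNK_RE = re.compile(r"^`{3,}\s*\{([^}]*)\}\s*$")

# The only option named in the error message; others may exist.
FIRST_ONLY = ("jupyter_kernel",)


def find_first_chunk_violations(sources):
    """Return (filename, line_number, option) for each suspicious chunk.

    `sources` is a list of (filename, text) pairs in the order the files
    would be passed to pandoc. A chunk is reported when it sets a
    first-chunk-only option but is not the first chunk of its session
    (under the simplified session model described above).
    """
    seen = defaultdict(int)  # session name -> number of chunks seen so far
    problems = []
    for name, text in sources:
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = CHUNK_RE.match(line.strip())
            if not match:
                continue
            attrs = match.group(1).split()
            # Only consider Codebraid chunks (.cb.nb, .cb.run, ...).
            if not any(a.startswith(".cb.") for a in attrs):
                continue
            opts = dict(a.split("=", 1) for a in attrs if "=" in a)
            session = opts.get("session", "<default>")
            if seen[session] > 0:
                problems.extend(
                    (name, lineno, opt) for opt in FIRST_ONLY if opt in opts
                )
            seen[session] += 1
    return problems
```

Run on ch1.md alone, this reports nothing. Run on ch1.md followed by ch2.md, it reports the `jupyter_kernel` option on the opening fence of the chunk in ch2.md, which matches the file named in the error embedded in book.pdf.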

Perhaps this can be avoided if .cb.run is used instead of .cb.nb? Is there a preferred way of using codebraid to have flexible outputs w/ multiple source files?
