diff --git a/README.md b/README.md
index f73b81021..18cd54d6d 100644
--- a/README.md
+++ b/README.md
@@ -5,6 +5,7 @@ This is the website repo for [viash.io](https://viash.io).
 
 ## Requirements
 
 [Quarto](https://quarto.org/docs/get-started/), R 4.2 and Python 3.10.
+The `tree` utility must also be installed.
 
 ## First setup
diff --git a/guide/nextflow_vdsl3/channels_and_workflows.qmd b/guide/nextflow_vdsl3/channels_and_workflows.qmd
new file mode 100644
index 000000000..78add7b36
--- /dev/null
+++ b/guide/nextflow_vdsl3/channels_and_workflows.qmd
@@ -0,0 +1,93 @@
+---
+title: Channels and workflows
+order: 10
+---
+
+Here we introduce the main concepts of Nextflow programming with DSL2.
+
+```{r setup, include = FALSE}
+knitr::opts_chunk$set(
+  fig.path = "images/200-"
+)
+```
+
+Nextflow DSL2 borrows some elements from event-driven functional programming. Strictly speaking, Nextflow's [`Channel`] concept is an example of the [DataFlow Programming Model], but one could argue that it can also be regarded as an (albeit limited) implementation of a [Functional Reactive Programming] library.
+
+Contents of `main.nf`:
+
+```{embed, lang="groovy"}
+workflows/200-first_nextflow_pipeline/main.nf
+```
+
+This workflow consists of three steps:
+
+* A `channel` is created containing 4 strings.
+* A `map` removes the whitespace around the strings (`.trim()`).
+* A `view` displays the contents of the events to the user.
+
+The pipe operator (`|`) connects steps (which might generate and/or consume events) together.
+
+Quite a lot is going on in these 3 lines of code. Before we dissect this in detail, let us first explore the `Channel` concept, more commonly known as a (reactive) stream.
+
+## `Channel` and data flow
+
+Below you can see an illustration of how an empty `channel` can be created and how events can be _put_ on that `channel`. The technical term for _putting_ events on the `channel` is `bind`.
+
+```{embed, lang="groovy"}
+workflows/201-first_nextflow_pipeline_revisited/main.nf
+```
+
+This pipeline definition does exactly the same as our previous example; it only aims to show what is happening under the hood. The `Channel.fromList()` used in the first example is an illustration of a [`Channel` factory method](https://www.nextflow.io/docs/latest/channel.html#channel-factory).
+
+The data flow of channels (and later processes) can be visualised as shown in @fig-dataflow.
+
+```{swirly echo=FALSE, label="fig-dataflow", fig.cap="The data flow of `main.nf`."}
+-a-b-c-d----------|
+a := " a "
+b := " b"
+c := " c"
+d := "d "
+
+> map{ f -> f.trim() }
+
+---a-b-c-d--------|
+a := "a"
+b := "b"
+c := "c"
+d := "d"
+
+> view
+```
+
+### A note on Nextflow / Groovy syntax
+
+Nextflow is a DSL on top of the Groovy programming language, so
+you can use any Groovy code to manipulate `Channel` events in whatever way you like[^well].
+
+* `[ 1, 2, 3 ]`: A list of integers
+* `[ a: 1, b: 2, c: 3 ]`: A hash map (dictionary, named list)
+* `{ elem -> elem.trim() }`: An anonymous function, aka _closure_
+* `{ it.trim() }`: The same anonymous function with the implicit variable `it`
+* `a ? b : c`: If `a` then `b` else `c`
+
+~~Here is a [cheat sheet](http://www.cheat-sheets.org/saved-copy/rc015-groovy_online.pdf) on Groovy syntax.~~
+
+:::{.callout-note}
+There is at least one exception with consequences for how VDSL3's API is defined: function overloading is not available in Nextflow code.
+:::
+
+### Running the pipeline
+
+Let's see what happens when we run the pipeline above using Nextflow:
+
+```{bash}
+nextflow run workflows/200-first_nextflow_pipeline/main.nf
+```
+
+```{r echo=FALSE}
+unlink("./output/", recursive = TRUE)
+```
+
+[DataFlow Programming Model]: https://en.wikipedia.org/wiki/Dataflow_programming
+[`Channel`]: https://www.nextflow.io/docs/latest/channel.html
+[Functional Reactive Programming]: https://gist.github.com/staltz/868e7e9bc2a7b8c1f754
\ No newline at end of file
diff --git a/guide/nextflow_vdsl3/create-and-use-a-module.qmd b/guide/nextflow_vdsl3/create-and-use-a-module.qmd
index f3ab0cfff..edd3c82c7 100644
--- a/guide/nextflow_vdsl3/create-and-use-a-module.qmd
+++ b/guide/nextflow_vdsl3/create-and-use-a-module.qmd
@@ -1,6 +1,6 @@
 ---
 title: Create and use a module
-order: 10
+order: 20
 ---
 
 
diff --git a/guide/nextflow_vdsl3/workflows/200-first_nextflow_pipeline/main.nf b/guide/nextflow_vdsl3/workflows/200-first_nextflow_pipeline/main.nf
new file mode 100644
index 000000000..a541ab0eb
--- /dev/null
+++ b/guide/nextflow_vdsl3/workflows/200-first_nextflow_pipeline/main.nf
@@ -0,0 +1,5 @@
+workflow {
+  Channel.fromList( [" a ", " b", " c", "d "] )
+    | map{ elem -> elem.trim() }
+    | view
+}
\ No newline at end of file
diff --git a/guide/nextflow_vdsl3/workflows/201-first_nextflow_pipeline_revisited/main.nf b/guide/nextflow_vdsl3/workflows/201-first_nextflow_pipeline_revisited/main.nf
new file mode 100644
index 000000000..2e37392eb
--- /dev/null
+++ b/guide/nextflow_vdsl3/workflows/201-first_nextflow_pipeline_revisited/main.nf
@@ -0,0 +1,12 @@
+workflow {
+  ch = Channel.empty()
+
+  ch << " a "
+  ch << " b"
+  ch << " c"
+  ch << "d "
+
+  ch
+    | map{ elem -> elem.trim() }
+    | subscribe{ print "$it" }
+}
\ No newline at end of file
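A rough way to sanity-check what the `200-first_nextflow_pipeline/main.nf` added in this diff does to its four strings is to replay the same closure in plain Groovy, without Nextflow. The sketch below is only an analogy: an ordinary list stands in for the channel, `collect` stands in for the `map` operator and `each` with `println` stands in for `view`. The variable names `events` and `trimmed` are made up for illustration and do not appear in the diff. A real channel delivers its events asynchronously, which a list does not model.

```groovy
// Plain Groovy, no Nextflow required: a list stands in for the channel.
// Assumption: this only mimics the data transformation, not the
// asynchronous event delivery of a real Nextflow channel.
def events = [" a ", " b", " c", "d "]

// Same closure as the `map` step in main.nf: strip surrounding whitespace.
def trimmed = events.collect { elem -> elem.trim() }

// Stand-in for `view`: print every event on its own line.
trimmed.each { println it }
```

Run as a Groovy script, this prints `a`, `b`, `c` and `d` on separate lines, which matches the second timeline in the `fig-dataflow` diagram.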