From 3908f275b5b754f582b6d8d22c4f63f2a12f1973 Mon Sep 17 00:00:00 2001
From: Hendrik Cannoodt
Date: Fri, 13 Jun 2025 11:12:15 +0200
Subject: [PATCH 1/3] Add channels and workflows page from the course and add the referenced workflows too

---
 .../nextflow_vdsl3/channels_and_workflows.qmd | 91 +++++++++++++++++++
 .../create-and-use-a-module.qmd               |  2 +-
 .../200-first_nextflow_pipeline/main.nf       |  5 +
 .../main.nf                                   | 12 +++
 4 files changed, 109 insertions(+), 1 deletion(-)
 create mode 100644 guide/nextflow_vdsl3/channels_and_workflows.qmd
 create mode 100644 guide/nextflow_vdsl3/workflows/200-first_nextflow_pipeline/main.nf
 create mode 100644 guide/nextflow_vdsl3/workflows/201-first_nextflow_pipeline_revisited/main.nf

diff --git a/guide/nextflow_vdsl3/channels_and_workflows.qmd b/guide/nextflow_vdsl3/channels_and_workflows.qmd
new file mode 100644
index 000000000..df814e428
--- /dev/null
+++ b/guide/nextflow_vdsl3/channels_and_workflows.qmd
@@ -0,0 +1,91 @@
+---
+title: Channels and workflows
+order: 10
+---
+
+Here we introduce the main concepts of Nextflow programming with DSL2.
+
+```{r setup, include = FALSE}
+knitr::opts_chunk$set(
+  fig.path = "images/200-"
+)
+```
+
+Nextflow DSL2 borrows some elements from event-driven functional programming. As a matter of fact, one could argue that Nextflow's [`Channel`] concept being strictly speaking an example of the [DataFlow Programming Model] can in fact be regarded as an implementation of a (albeit limited) [Functional Reactive Programming] library.
+
+Contents of `guide/nextflow_vdsl3/workflows/200-first_nextflow_pipeline/main.nf`:
+
+```{embed, lang="groovy"}
+guide/nextflow_vdsl3/workflows/200-first_nextflow_pipeline/main.nf
+```
+
+This workflow consists of three steps:
+
+* A `channel` is created containing 4 strings
+* A `map` which removes spaces around the strings (`.trim()`).
+* A view which displays the contents of the events to the user
+
+The pipe operator (`|`) allows connecting steps (which might generate and/or consume events) together.
+
+Quite a lot is going on in these 3 lines of code. Before we dissect this in detail, let us first explore the `Channel` or usually called (reactive) stream concept.
+
+## `Channel` and data flow
+
+Below you can see an illustration of how an empty `channel` can be created and how events can be _put_ on that `channel`. The technical term for _putting_ events on the `channel` is `bind`.
+
+```{embed, lang="groovy"}
+guide/nextflow_vdsl3/workflows/201-first_nextflow_pipeline_revisited/main.nf
+```
+
+This pipeline definition does exactly the same as our previous example and just aims to describe what is happening under the hood. The `Channel.fromList()` used in the first example is an illustration of a [`Channel` factory method](https://www.nextflow.io/docs/latest/channel.html#channel-factory).
+
+The data flow of channels (and later processes) can be visualised as shown in @fig-dataflow.
+
+```{swirly echo=FALSE, label="fig-dataflow", fig.cap="The data flow of `guide/nextflow_vdsl3/workflows/200-first_nextflow_pipeline/main.nf`."}
+-a-b-c-d----------|
+a := " a "
+b := " b"
+c := " c"
+d := "d "
+
+> map{ f -> f.trim() }
+
+---a-b-c-d--------|
+a := "a"
+b := "b"
+c := "c"
+d := "d"
+
+> view
+```
+
+### A note on Nextflow / Groovy syntax
+
+Nextflow is a DSL on top of the Groovy programming language, so
+you can use whatever Groovy code to manipulate `Channel` events in however way you like[^well].
+
+* `[ 1, 2, 3 ]`: A list of integers
+* `[ a: 1, b: 2, c: 3 ]`: A hash map (dictionary, named list)
+* `{ elem -> elem.trim() }`: An anonymous function, aka _closure_
+* `{ it.trim() }`: The same anonymous function with the implicit variable `it`
+* `a ? b : c`: If `a` then `b` else `c`
+
+Here is a [cheat sheet](http://www.cheat-sheets.org/saved-copy/rc015-groovy_online.pdf) on Groovy syntax.
+
+[^well]: Actually, there is at least one exception we've bumped into with consequences for how VDSL3's API is defined: function overloading is not available in Nextflow code.
+
+### Running the pipeline
+
+Let's see what happens when we run the pipeline above using Nextflow:
+
+```{bash}
+nextflow run workflows/200-first_nextflow_pipeline/main.nf
+```
+
+```{r echo=FALSE}
+unlink("./output/", recursive = TRUE)
+```
+
+[DataFlow Programming Model]: https://en.wikipedia.org/wiki/Dataflow_programming
+[`Channel`]: https://www.nextflow.io/docs/latest/channel.html
+[Functional Reactive Programming]: https://gist.github.com/staltz/868e7e9bc2a7b8c1f754
\ No newline at end of file
diff --git a/guide/nextflow_vdsl3/create-and-use-a-module.qmd b/guide/nextflow_vdsl3/create-and-use-a-module.qmd
index f3ab0cfff..edd3c82c7 100644
--- a/guide/nextflow_vdsl3/create-and-use-a-module.qmd
+++ b/guide/nextflow_vdsl3/create-and-use-a-module.qmd
@@ -1,6 +1,6 @@
 ---
 title: Create and use a module
-order: 10
+order: 20
 ---
 
 
diff --git a/guide/nextflow_vdsl3/workflows/200-first_nextflow_pipeline/main.nf b/guide/nextflow_vdsl3/workflows/200-first_nextflow_pipeline/main.nf
new file mode 100644
index 000000000..a541ab0eb
--- /dev/null
+++ b/guide/nextflow_vdsl3/workflows/200-first_nextflow_pipeline/main.nf
@@ -0,0 +1,5 @@
+workflow {
+  Channel.fromList( [" a ", " b", " c", "d "] )
+    | map{ elem -> elem.trim() }
+    | view
+}
\ No newline at end of file
diff --git a/guide/nextflow_vdsl3/workflows/201-first_nextflow_pipeline_revisited/main.nf b/guide/nextflow_vdsl3/workflows/201-first_nextflow_pipeline_revisited/main.nf
new file mode 100644
index 000000000..2e37392eb
--- /dev/null
+++ b/guide/nextflow_vdsl3/workflows/201-first_nextflow_pipeline_revisited/main.nf
@@ -0,0 +1,12 @@
+workflow {
+  ch = Channel.empty()
+
+  ch << " a "
+  ch << " b"
+  ch << " c"
+  ch << "d "
+
+  ch
+    | map{ elem -> elem.trim() }
+    | subscribe{ print "$it" }
+}
\ No newline at end of file
From
7988fda0d731d55843684011d9c586e6605e09cf Mon Sep 17 00:00:00 2001
From: Hendrik Cannoodt
Date: Fri, 13 Jun 2025 14:50:30 +0200
Subject: [PATCH 2/3] fix a few things in the new page. add comment in readme that we need the tree command

---
 README.md                                       | 1 +
 guide/nextflow_vdsl3/channels_and_workflows.qmd | 8 ++++----
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index f73b81021..18cd54d6d 100644
--- a/README.md
+++ b/README.md
@@ -5,6 +5,7 @@ This is the website repo for [viash.io](https://viash.io).
 ## Requirements
 
 [Quarto](https://quarto.org/docs/get-started/), R 4.2 and Python 3.10.
+We also need the 'tree' util to be present.
 
 ## First setup
 
diff --git a/guide/nextflow_vdsl3/channels_and_workflows.qmd b/guide/nextflow_vdsl3/channels_and_workflows.qmd
index df814e428..c96471b77 100644
--- a/guide/nextflow_vdsl3/channels_and_workflows.qmd
+++ b/guide/nextflow_vdsl3/channels_and_workflows.qmd
@@ -13,10 +13,10 @@ knitr::opts_chunk$set(
 
 Nextflow DSL2 borrows some elements from event-driven functional programming. As a matter of fact, one could argue that Nextflow's [`Channel`] concept being strictly speaking an example of the [DataFlow Programming Model] can in fact be regarded as an implementation of a (albeit limited) [Functional Reactive Programming] library.
 
-Contents of `guide/nextflow_vdsl3/workflows/200-first_nextflow_pipeline/main.nf`:
+Contents of `main.nf`:
 
 ```{embed, lang="groovy"}
-guide/nextflow_vdsl3/workflows/200-first_nextflow_pipeline/main.nf
+workflows/200-first_nextflow_pipeline/main.nf
 ```
 
 This workflow consists of three steps:
@@ -34,14 +34,14 @@ Quite a lot is going on in these 3 lines of code. Before we dissect this in deta
 Below you can see an illustration of how an empty `channel` can be created and how events can be _put_ on that `channel`. The technical term for _putting_ events on the `channel` is `bind`.
 
 ```{embed, lang="groovy"}
-guide/nextflow_vdsl3/workflows/201-first_nextflow_pipeline_revisited/main.nf
+workflows/201-first_nextflow_pipeline_revisited/main.nf
 ```
 
 This pipeline definition does exactly the same as our previous example and just aims to describe what is happening under the hood. The `Channel.fromList()` used in the first example is an illustration of a [`Channel` factory method](https://www.nextflow.io/docs/latest/channel.html#channel-factory).
 
 The data flow of channels (and later processes) can be visualised as shown in @fig-dataflow.
 
-```{swirly echo=FALSE, label="fig-dataflow", fig.cap="The data flow of `guide/nextflow_vdsl3/workflows/200-first_nextflow_pipeline/main.nf`."}
+```{swirly echo=FALSE, label="fig-dataflow", fig.cap="The data flow of `main.nf`."}
 -a-b-c-d----------|
 a := " a "
 b := " b"
From 6f6dd2005d721991e09a7e99844bc4873ac5d56a Mon Sep 17 00:00:00 2001
From: Hendrik Cannoodt
Date: Fri, 13 Jun 2025 15:02:19 +0200
Subject: [PATCH 3/3] reword the callout and add the callout-note syntax we use in this repo

---
 guide/nextflow_vdsl3/channels_and_workflows.qmd | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/guide/nextflow_vdsl3/channels_and_workflows.qmd b/guide/nextflow_vdsl3/channels_and_workflows.qmd
index c96471b77..78add7b36 100644
--- a/guide/nextflow_vdsl3/channels_and_workflows.qmd
+++ b/guide/nextflow_vdsl3/channels_and_workflows.qmd
@@ -70,9 +70,11 @@ you can use whatever Groovy code to manipulate `Channel` events in however way y
 * `{ it.trim() }`: The same anonymous function with the implicit variable `it`
 * `a ? b : c`: If `a` then `b` else `c`
 
-Here is a [cheat sheet](http://www.cheat-sheets.org/saved-copy/rc015-groovy_online.pdf) on Groovy syntax.
+~~Here is a [cheat sheet](http://www.cheat-sheets.org/saved-copy/rc015-groovy_online.pdf) on Groovy syntax.~~
 
-[^well]: Actually, there is at least one exception we've bumped into with consequences for how VDSL3's API is defined: function overloading is not available in Nextflow code.
+:::{.callout-note}
+There is at least one exception with consequences for how VDSL3's API is defined: function overloading is not available in Nextflow code.
+:::
 
 ### Running the pipeline
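
Not part of the series: a minimal sketch for trying the Groovy constructs from the syntax note in `channels_and_workflows.qmd` outside Nextflow. It assumes a standalone `groovy` interpreter, and all variable names are illustrative rather than taken from the patches. Plain `collect` stands in for the channel `map` operator here, so this shows only the closure syntax, not Nextflow's asynchronous data flow.

```groovy
// Plain Groovy, no Nextflow required; every name below is illustrative.
def strings = [" a ", " b", " c", "d "]   // a list, like the one passed to Channel.fromList()
def counts  = [a: 1, b: 2, c: 3]          // a hash map with String keys

// Closure with an explicit parameter, as in `map{ elem -> elem.trim() }`
def trimExplicit = { elem -> elem.trim() }
// The same closure written with the implicit variable `it`
def trimImplicit = { it.trim() }

// collect() applies a closure to each list element and returns a new list
def trimmed = strings.collect(trimExplicit)
assert trimmed == ["a", "b", "c", "d"]
assert trimmed == strings.collect(trimImplicit)

// Ternary operator: use the value stored under 'a' if the key exists, else 0
def first = counts.containsKey('a') ? counts.a : 0
assert first == 1

// Print each trimmed string on its own line
trimmed.each { println it }
```

Saved under a hypothetical name such as `syntax_demo.groovy`, this would be run with `groovy syntax_demo.groovy`. The assertions pass silently and the four trimmed strings are printed.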