How to run a workflow?


To run a workflow, you can use any of the following:

  • a Nextflow file (.nf)
  • a project name (from a repository)
  • a repository URL

This part shows how to run a workflow from a file or from a project name, and also how to change a parameter value, resume a workflow, and use a scheduler.

Here is the usage of the run command:

$ nextflow run -help
Execute a pipeline project
Usage: run [options] Project name or repository url
Options:
  -E
     Exports all current system environment
     Default: false

  [...]

  -without-docker
     Disable process execution with Docker
     Default: false
  -without-podman
     Disable process execution in a Podman container
  -w, -work-dir
     Directory where intermediate result files are stored

All Nextflow options are prefixed with a single '-'.

Run a workflow from a .nf file

$ nextflow run tutorial.nf

The file tutorial.nf contains the following content:

#!/usr/bin/env nextflow
params.str = 'Hello world!'

process splitLetters {
    output:
    file 'chunk_*' into letters
    """
    printf '${params.str}' | split -b 6 - chunk_
    """
}

process convertToUpper {
    input:
    file x from letters.flatten()
    output:
    stdout result
    """
    cat $x | tr '[a-z]' '[A-Z]'
    """
}
result.view { it.trim() }

The workflow contains two main steps (called processes). The first process splits a string into 6-character chunks and writes each one to a file with the prefix chunk_. The second process receives these files and converts their contents to uppercase letters.
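To make the two steps concrete, they can be reproduced by hand in a plain shell, outside Nextflow. This is only a sketch of what each process script does: the variable scratch and the use of a temporary directory are assumptions made for illustration, and the loop stands in for the separate tasks that Nextflow would launch.

```shell
# Hand-run sketch of the two processes, outside Nextflow.
# 'scratch' is a hypothetical temporary directory used only for this illustration.
str='Hello world!'
scratch=$(mktemp -d)

# Step 1 (what splitLetters does): cut the string into 6-byte files
# named chunk_aa, chunk_ab, ... inside the scratch directory.
printf '%s' "$str" | split -b 6 - "$scratch/chunk_"

# Step 2 (what convertToUpper does): upper-case the content of each chunk.
# Nextflow would start one task per file; here a loop handles them in turn.
for f in "$scratch"/chunk_*; do
    upper=$(tr '[a-z]' '[A-Z]' < "$f")
    printf '%s\n' "$upper"
done
```

The first chunk keeps its trailing space ("Hello "), which the workflow removes with it.trim() before printing.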

Run the workflow:

$ nextflow run tutorial.nf

It will output something similar to the text shown below:

N E X T F L O W  ~  version 19.10.0
Launching `tutorial.nf` [lonely_fourier] - revision: e3b475a61b
executor >  local (3)
[8a/f998c2] process > splitLetters       [100%] 1 of 1 ✔
[20/6d0533] process > convertToUpper (2) [100%] 2 of 2 ✔
HELLO
WORLD!

You can see that the first process is executed once, and the second twice. Finally the result string is printed.

It's worth noting that the process convertToUpper is executed in parallel, so there's no guarantee that the instance processing the first split (the chunk Hello) will be executed before the one processing the second split (the chunk world!).

Thus, it is perfectly possible that you will get the final result printed out in a different order:

WORLD!
HELLO
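The same effect can be observed in a plain shell by starting both conversions as background jobs. This is a minimal sketch, not Nextflow code: the helper name upper and the temporary file out are assumptions made for illustration.

```shell
# Two independent conversions started in the background, like two parallel tasks.
out=$(mktemp)

# upper TEXT: print TEXT in upper case followed by a newline (hypothetical helper).
upper() { printf '%s\n' "$(printf '%s' "$1" | tr '[a-z]' '[A-Z]')"; }

upper 'Hello ' >> "$out" &
upper 'world!' >> "$out" &
wait    # block until both background jobs have finished

cat "$out"     # both lines are present, but their order is not guaranteed
sort "$out"    # sorting gives a reproducible order when one is needed
```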

What does this create?

  • work: a directory that contains temporary files
  • .nextflow: a directory that contains the execution cache
  • .nextflow.log: the log of the last execution

$ ls -altr
total 18
-rw-r--r-- 1 pervenche formation  372 17 janv. 17:04 tutorial.nf
drwx--x--x 5 pervenche formation 8192 20 janv. 14:29 ..
drwxr-xr-x 4 pervenche formation 4096 20 janv. 15:08 .
drwxr-xr-x 5 pervenche formation 4096 20 janv. 15:08 work
drwxr-xr-x 3 pervenche formation 4096 20 janv. 15:08 .nextflow
-rw-r--r-- 1 pervenche formation 5306 20 janv. 15:08 .nextflow.log

The contents of these directories are explained in the Outputs section.

Change parameter value

This workflow has one parameter, named str (not documented). To override its default value, pass --str on the command line.

$ nextflow run tutorial.nf --str "mon texte a mettre en majuscule"

It will output something similar to the text shown below:

N E X T F L O W  ~  version 19.10.0
Launching `tutorial.nf` [goofy_kilby] - revision: e3b475a61b
executor >  local (7)
[ed/5b43df] process > splitLetters       [100%] 1 of 1 ✔
[6f/51ab49] process > convertToUpper (6) [100%] 6 of 6 ✔
XTE A
EN MA
METTRE
MON TE
JUSCUL
E
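The task counts in the run above follow from the string length: the 31-character string is cut into 6-byte chunks, so convertToUpper receives six files, and together with the single splitLetters task the local executor runs seven tasks. The arithmetic can be sketched in shell as follows; the function name chunk_count is hypothetical, and the sketch assumes plain ASCII text, where characters and bytes coincide.

```shell
# chunk_count STRING [SIZE]: number of files 'split -b SIZE' would produce.
# Assumes ASCII input, so the character count equals the byte count.
chunk_count() {
    str=$1
    size=${2:-6}
    len=${#str}
    # Ceiling division: a final partial chunk still counts as one file.
    echo $(( (len + size - 1) / size ))
}

chunk_count 'Hello world!'                      # prints 2
chunk_count 'mon texte a mettre en majuscule'   # prints 6
```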

Resume a workflow

Nextflow keeps track of all the processes executed in your pipeline. With the -resume option, processes that have not changed are skipped and their cached results are used instead.

$ nextflow run tutorial.nf --str "mon texte a mettre en majuscule" -resume

N E X T F L O W  ~  version 19.10.0
Launching `tutorial.nf` [scruffy_mccarthy] - revision: e3b475a61b
[87/29d13f] process > splitLetters       [100%] 1 of 1, cached: 1 ✔
[b0/ca75e0] process > convertToUpper (6) [100%] 6 of 6, cached: 6 ✔
XTE A
METTRE
MON TE
JUSCUL
EN MA
E

As shown above, the results of all the processes are retrieved from the cache.
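Conceptually, resuming works like a lookup: if the same work on the same input has already produced a result, that result is reused instead of being recomputed. The sketch below illustrates only this idea in plain shell. It is not how Nextflow is implemented, and the function name run_cached, the counter runs, and the cache directory layout are all assumptions made for illustration.

```shell
cache_dir=$(mktemp -d)   # hypothetical cache location
runs=0                   # counts how many times the work was really executed

# run_cached INPUT: upper-case INPUT, reusing a stored result when one exists.
run_cached() {
    # Derive a key from the input text (cksum is a simple checksum,
    # good enough for a sketch but not collision-proof).
    key=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    if [ ! -f "$cache_dir/$key" ]; then
        printf '%s' "$1" | tr '[a-z]' '[A-Z]' > "$cache_dir/$key"
        runs=$((runs + 1))
    fi
    cat "$cache_dir/$key"
}

run_cached 'hello'; echo    # executed, prints HELLO
run_cached 'hello'; echo    # served from the cache, prints HELLO
echo "executions: $runs"    # prints: executions: 1
```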

Important:

  • Nextflow options are prefixed with a single -
  • workflow parameters are prefixed with --

Run a workflow from a repository

When Nextflow runs a pipeline that is not available locally, it downloads it from a Bitbucket, GitHub, or GitLab repository. See the Nextflow documentation on pipeline sharing for more information.

$ nextflow run nextflow-io/hello
N E X T F L O W  ~  version 20.10.0
Launching `nextflow-io/hello` [evil_jones] - revision: 96eb04d6a4 [master]
NOTE: Your local project version looks outdated - a different revision is available in the remote repository [e6d9427e5b]
executor >  local (4)
[55/d11576] process > sayHello (4) [100%] 4 of 4 ✔
Hello world!

Ciao world!

Bonjour world!

Hola world!

How to use SLURM?

Nextflow is designed to work with many executors, such as SGE and SLURM, and on cloud platforms such as Kubernetes and Amazon Web Services.

On Genotoul, the batch scheduler is SLURM. To activate it, create a file named nextflow.config in the current directory containing the following line:

process.executor = 'slurm'
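One way to create this file from the shell is a here-document. The sketch below writes it into a scratch directory, an assumption made so that the example cannot overwrite an existing configuration; in practice the file goes in the directory from which nextflow is launched. The variable name config_dir is hypothetical.

```shell
# Write a minimal nextflow.config that selects the SLURM executor.
# 'config_dir' is a hypothetical scratch location used for illustration.
config_dir=$(mktemp -d)
cat > "$config_dir/nextflow.config" <<'EOF'
process.executor = 'slurm'
EOF

# Show what was written.
cat "$config_dir/nextflow.config"
```

Quoting the here-document delimiter ('EOF') keeps the shell from expanding anything inside the file content.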

Run the workflow:

$ nextflow run nextflow-io/hello

In another terminal, you can check the execution with:

$ squeue -u <username>

Exercise 2:

  • Copy the content of this file into tutorial.nf.
  • Run tutorial.nf with the workflow parameter --str "ceci est un Exercice". How many processes were run? Where did they run?
  • List the content of the current directory with the tree command.
  • Re-execute the command with the Nextflow option -resume.
  • Run the workflow named nextflow-io/hello. Where are the processes run?
  • Where is the workflow downloaded to? If you cannot find it, try nextflow info nextflow-io/hello.
  • Create a file named nextflow.config in the current directory containing the line process.executor = 'slurm', then rerun the hello pipeline (without the -resume option). Where are the processes run?
