How to run a workflow?
To run a workflow, you can use either:
- a Nextflow file (.nf)
- a project name (from a repository)
- a repository URL
In this part we will see how to run a workflow from a file or a project name, and also how to change a parameter value, resume a workflow, and use a scheduler.
Here is the usage of the run command.
$ nextflow run -help
Execute a pipeline project
Usage: run [options] Project name or repository url
Options:
-E
Exports all current system environment
Default: false
[...]
-without-docker
Disable process execution with Docker
Default: false
-without-podman
Disable process execution in a Podman container
-w, -work-dir
Directory where intermediate result files are stored
All Nextflow options are prefixed with a single '-'.
Run a workflow from a *.nf file
$ nextflow run tutorial.nf
The file tutorial.nf contains the following content:
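#!/usr/bin/env nextflow

// Default input string
params.str = 'Hello world!'

// Split the string into 6-byte chunks, one file per chunk
process splitLetters {
    output:
    file 'chunk_*' into letters

    """
    printf '${params.str}' | split -b 6 - chunk_
    """
}

// Convert the content of each chunk to uppercase (one task per chunk)
process convertToUpper {
    input:
    file x from letters.flatten()

    output:
    stdout result

    """
    cat $x | tr '[a-z]' '[A-Z]'
    """
}

// Print each result, stripped of surrounding whitespace
result.view { it.trim() }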
The workflow contains two main steps (called processes). The first process splits a string into 6-character chunks, writing each one to a file with the prefix chunk_. The second receives these files and converts their contents to uppercase letters.
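As a rough illustration of what the two processes do, the same steps can be reproduced by hand in a shell, outside Nextflow. This is only a sketch: the scratch directory and the loop are assumptions made for the example, and the loop runs sequentially whereas Nextflow launches one task per chunk.

```shell
# Work in a scratch directory (hypothetical, created just for this example).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Step 1, as in splitLetters: cut the string into 6-byte pieces
# named chunk_aa, chunk_ab, ...
printf 'Hello world!' | split -b 6 - chunk_

# Step 2, as in convertToUpper: uppercase the content of each chunk.
for f in chunk_*; do
    tr '[a-z]' '[A-Z]' < "$f"
    echo    # add a line break after each chunk for readability
done
```

The 12-character string yields two chunk files, which is why the second process runs twice.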
$ nextflow run tutorial.nf
It will output something similar to the text shown below:
N E X T F L O W ~ version 19.10.0
Launching `tutorial.nf` [lonely_fourier] - revision: e3b475a61b
executor > local (3)
[8a/f998c2] process > splitLetters [100%] 1 of 1 ✔
[20/6d0533] process > convertToUpper (2) [100%] 2 of 2 ✔
HELLO
WORLD!
You can see that the first process is executed once, and the second twice. Finally the result string is printed.
It's worth noting that the process convertToUpper is executed in parallel, so there's no guarantee that the instance processing the first split (the chunk Hello) will be executed before the one processing the second split (the chunk world!).
Thus, it is perfectly possible that you will get the final result printed out in a different order:
WORLD!
HELLO
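The effect can be imitated in plain shell with background jobs. This is a minimal sketch, not how Nextflow schedules its tasks: the function name run_parallel is made up for the example, and the two chunks are hard-coded.

```shell
# run_parallel (hypothetical helper): uppercase two chunks concurrently.
run_parallel() {
    for chunk in 'Hello ' 'world!'; do
        # Each chunk is handled by its own background job, so either
        # job may finish, and print, first.
        printf '%s\n' "$chunk" | tr '[a-z]' '[A-Z]' &
    done
    wait    # block until both background jobs have finished
}

run_parallel
```

Both lines are always printed, but their order can change from one run to the next.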
What does this create?
- a work directory, which contains temporary files
- a .nextflow directory, which contains the execution cache
- .nextflow.log, the log of the last execution
$ ls -altr
total 18
-rw-r--r-- 1 pervenche formation 372 17 janv. 17:04 tutorial.nf
drwx--x--x 5 pervenche formation 8192 20 janv. 14:29 ..
drwxr-xr-x 4 pervenche formation 4096 20 janv. 15:08 .
drwxr-xr-x 5 pervenche formation 4096 20 janv. 15:08 work
drwxr-xr-x 3 pervenche formation 4096 20 janv. 15:08 .nextflow
-rw-r--r-- 1 pervenche formation 5306 20 janv. 15:08 .nextflow.log
The content of these directories is explained in the Outputs section.
Change parameter value
This workflow has one (undocumented) parameter named str. To change its default value, use --str on the command line.
$ nextflow run tutorial.nf --str "mon texte a mettre en majuscule"
It will output something similar to the text shown below:
N E X T F L O W ~ version 19.10.0
Launching `tutorial.nf` [goofy_kilby] - revision: e3b475a61b
executor > local (7)
[ed/5b43df] process > splitLetters [100%] 1 of 1 ✔
[6f/51ab49] process > convertToUpper (6) [100%] 6 of 6 ✔
XTE A
EN MA
METTRE
MON TE
JUSCUL
E
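The task count reported on the "executor > local (7)" line follows from the length of the string: one splitLetters task plus one convertToUpper task per 6-character chunk. The arithmetic can be sketched in shell as follows; the variable names are chosen for the example only, and the character count equals the byte count only for plain ASCII text such as this one.

```shell
str="mon texte a mettre en majuscule"
chunk_size=6

# Length of the string in characters
len=${#str}

# Number of chunks, rounded up (the last chunk may be shorter)
chunks=$(( (len + chunk_size - 1) / chunk_size ))

# One splitLetters task plus one convertToUpper task per chunk
tasks=$(( chunks + 1 ))

echo "$len characters -> $chunks chunks -> $tasks tasks"
```

For this string, the result is 31 characters, 6 chunks and 7 tasks, in line with the output above.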
Resume a workflow
Nextflow keeps track of all the processes executed in your pipeline.
With the -resume option, the execution of processes that have not changed is skipped and the cached result is used instead.
$ nextflow run tutorial.nf --str "mon texte a mettre en majuscule" -resume
N E X T F L O W ~ version 19.10.0
Launching `tutorial.nf` [scruffy_mccarthy] - revision: e3b475a61b
[87/29d13f] process > splitLetters [100%] 1 of 1, cached: 1 ✔
[b0/ca75e0] process > convertToUpper (6) [100%] 6 of 6, cached: 6 ✔
XTE A
METTRE
MON TE
JUSCUL
EN MA
E
All the processes are retrieved from the cache, as shown above.
Important:
- Nextflow options are prefixed with a single -
- workflow parameters are prefixed with --
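To make the naming convention concrete, the toy function below labels each argument of a command line by its prefix. It is purely illustrative: classify_args is a hypothetical helper, it does not reproduce how Nextflow actually parses its arguments, and a value that itself starts with a dash would be mislabelled.

```shell
# classify_args (hypothetical helper): label each argument by its prefix.
classify_args() {
    for arg in "$@"; do
        case "$arg" in
            --*) echo "workflow parameter: $arg" ;;   # two dashes
            -*)  echo "Nextflow option: $arg" ;;      # one dash
            *)   echo "other (command, script or value): $arg" ;;
        esac
    done
}

classify_args run tutorial.nf --str 'mon texte a mettre en majuscule' -resume
```

Here --str is reported as a workflow parameter and -resume as a Nextflow option.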
Run a workflow from a repository
When Nextflow runs a pipeline that is not available locally, it downloads it from a Bitbucket, GitHub, or GitLab repository. See the Nextflow documentation for more information.
nextflow run nextflow-io/hello
N E X T F L O W ~ version 20.10.0
Launching `nextflow-io/hello` [evil_jones] - revision: 96eb04d6a4 [master]
NOTE: Your local project version looks outdated - a different revision is available in the remote repository [e6d9427e5b]
executor > local (4)
[55/d11576] process > sayHello (4) [100%] 4 of 4 ✔
Hello world!
Ciao world!
Bonjour world!
Hola world!
How to use SLURM?
Nextflow is designed to work with many executors, such as SGE and SLURM, and also with cloud platforms such as Kubernetes and Amazon, among others.
On Genotoul, we have the SLURM batch scheduler. To activate it, create a file named nextflow.config in the current directory and write the following line:
process.executor = 'slurm'
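The file can be created with any text editor. As one possible shortcut, it can also be written from the shell; the check that avoids overwriting an existing file is an addition made for this sketch, not something Nextflow requires.

```shell
# Create nextflow.config in the current directory, unless one already exists.
if [ -e nextflow.config ]; then
    echo "nextflow.config already exists, leaving it untouched" >&2
else
    printf "process.executor = 'slurm'\n" > nextflow.config
fi

# Show the configuration file present in this directory.
cat nextflow.config
```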
Run the workflow:
nextflow run nextflow-io/hello
In another terminal, you can check the execution with:
squeue -u <username>
Exercise 2:
- Copy the content of this file into tutorial.nf.
- Run tutorial.nf with the workflow parameter --str "ceci est un Exercice". How many processes were run? Where did the processes run?
- List the content of the current directory with the command tree.
- Re-execute the command with the Nextflow option -resume.
- Run the workflow named nextflow-io/hello. Where are the processes run?
- Where is the workflow downloaded? If you cannot find it, try nextflow info nextflow-io/hello.
- Create a file named nextflow.config in the current directory and write the line process.executor = 'slurm', then rerun the hello pipeline (without the -resume option). Where are the processes run?