Calibrating resources
Nextflow submits jobs with default resources defined in a configuration file named base.config, which can be found in ~/.nextflow/assets/nf-core/rnaseq/conf/base.config or displayed with the command nextflow config nf-core/rnaseq.
Jobs may be killed because they exceed their memory or time limit, or may never run because they request too many resources. The default configuration can be overridden with a configuration file.
By default, nf-core workflows apply a "retry strategy" that depends on the exit status returned by the job. If a job is killed because it does not have enough CPUs, time, or memory (e.g. exit code 137 means the process was killed with SIGKILL, 143 with SIGTERM), you can tell Nextflow to retry with these options:
process {
    errorStrategy = { task.exitStatus in [143,137,104,134,139] ? 'retry' : 'terminate' }
    maxRetries    = 1
    maxErrors     = '-1'

    // Process-specific resource requirements
    withLabel:process_high {
        cpus   = { check_max( 12    * task.attempt, 'cpus'   ) }
        memory = { check_max( 72.GB * task.attempt, 'memory' ) }
        time   = { check_max( 16.h  * task.attempt, 'time'   ) }
    }
}
In the genotoul profile, we fixed the maximum resources:
params {
    // Max resources requested by a normal node on genotoul.
    max_memory = 120.GB
    max_cpus   = 48
    max_time   = 96.h
}
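These limits are enforced by the check_max helper seen above. check_max is not built into Nextflow; it is a function defined in each nf-core pipeline's nextflow.config. A simplified sketch of its behaviour (the real implementation also handles parse errors):

```groovy
// Simplified sketch of nf-core's check_max helper (defined in the pipeline's
// nextflow.config, not in Nextflow itself): it caps each requested value at
// the corresponding params.max_* limit. Error handling is omitted here.
def check_max(obj, type) {
    if (type == 'memory') {
        return obj > (params.max_memory as nextflow.util.MemoryUnit) ? params.max_memory : obj
    } else if (type == 'time') {
        return obj > (params.max_time as nextflow.util.Duration) ? params.max_time : obj
    } else if (type == 'cpus') {
        return Math.min(obj as int, params.max_cpus as int)
    }
    return obj
}
```

So on the second attempt, 72.GB * task.attempt requests 144.GB, which check_max clips back to max_memory (120.GB on genotoul) instead of asking for more than a node provides.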
The user can override those parameters by creating a file called nextflow.config in the current directory, or by passing a file with the -c file.conf option.
For example, if you want to give more memory to all large tasks across most nf-core pipelines, the following config could work:
process {
    withLabel:process_high {
        memory = 120.GB
    }
}
You can be more specific than this by targeting a given process name instead of its label, using withName. For example:
process {
    withName:bwa_align {
        cpus = 32
    }
}
Good practice:
- Try to run the pipeline on one sample only.
- Check the HTML report available in results/pipeline_info.

Calibrate resources to:
- pass faster on the cluster
- reduce carbon impact (fewer kill/retry cycles)
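Besides the HTML report, the tab-separated trace file can be mined directly to see how much memory a task actually used. A sketch below, on an illustrative two-line trace; the column positions are assumptions, since the fields recorded depend on the trace configuration, so check the header line of your own file first:

```shell
# Sketch: list task name and peak memory from a Nextflow trace file to guide
# calibration. The trace content below is illustrative, and the column
# positions (4 = name, 6 = peak_rss) are assumptions to verify in the header.
printf 'task_id\thash\tnative_id\tname\tstatus\tpeak_rss\n' >  trace.txt
printf '8\t64/8e430a\t4242\tbismark_align (MT_rep1)\tCOMPLETED\t9.2 GB\n' >> trace.txt
cut -f 4,6 trace.txt
```

A peak memory well below the requested amount is the signal that the request can be lowered.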
Calibration can be done:
- in the workflow directory, in ~/.nextflow/assets/nf-core/rnaseq/conf/base.config (you can keep the retry strategy)
- by overriding with a nextflow.config file in the current directory
Exercise 11:
We are going to resume the Methylseq workflow after downgrading the memory available for the bismark step and removing its results.
- Check the file ResultsMeth/pipeline_info/execution_report.html of your last execution of methylseq. Which value of memory could be set for bismark?
- Edit ~/.nextflow/assets/nf-core/methylseq/conf/base.config and set the memory of bismark_align to 800.MB (to test the retry strategy):
withName:bismark_align {
    cpus   = { check_max( 12     * task.attempt, 'cpus'   ) }
    memory = { check_max( 800.MB * task.attempt, 'memory' ) }
    time   = { check_max( 8.d    * task.attempt, 'time'   ) }
}
- Delete the job directory of the previously completed bismark_align process of MT_rep1 (use the file ResultsMeth/pipeline_info/execution_trace.txt to find the working path):

$ cut -f 2,4,5 ResultsMeth/pipeline_info/execution_trace.txt
hash       name                       status
35/1a236f  makeBismarkIndex (1)       CACHED
...
64/8e430a  bismark_align (MT_rep1)    COMPLETED
a2/ed8201  get_software_versions      COMPLETED
...

The complete path for this job is ./work/64/8e430ac54a4eebb11f81e34aceb154/: the path starts with the key 64/8e430a; use tab completion to get the complete path name.
- Rerun the methylseq workflow with the -resume option.
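The hash-to-path lookup in the delete step can also be scripted rather than done by hand. A sketch, using an illustrative trace excerpt (file and process names follow the exercise):

```shell
# Sketch: recover the work-directory prefix of a task from a Nextflow trace
# file. The trace excerpt below mirrors the exercise output (tab-separated);
# field 2 holds the hash and field 4 the process name, as in: cut -f 2,4,5.
printf 'task_id\thash\tnative_id\tname\tstatus\n' >  trace.txt
printf '8\t64/8e430a\t4242\tbismark_align (MT_rep1)\tCOMPLETED\n' >> trace.txt

hash=$(grep 'bismark_align (MT_rep1)' trace.txt | cut -f 2)
echo "work/${hash}"
# The hash is only a prefix of the real directory name; expand it with a glob
# before removing, e.g.: rm -rf work/${hash}*
```

Double-check the expanded path before running any rm -rf.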