Calibrating resources


Nextflow submits jobs with default resources defined in a configuration file named base.config, which can be found in ~/.nextflow/assets/nf-core/rnaseq/conf/base.config or displayed with the command nextflow config nf-core/rnaseq.
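Before overriding anything, it can help to dump the configuration that will actually be applied and look at the labels and limits it contains; a minimal sketch (the genotoul profile is the one used in this training, and the output file name is arbitrary):

# Print the resolved configuration for the genotoul profile and keep a copy for inspection
nextflow config -profile genotoul nf-core/rnaseq > resolved.config
less resolved.config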

Jobs may be killed because they exceed their memory or time limit, or may never run because they request too many resources. The default configuration can be overridden with a configuration file.

By default, nf-core workflows ship with a "retry strategy" that depends on the exit status returned by the job (statuses such as 137 or 143 typically indicate that the job was killed for exceeding its memory or time limit). If a job is killed because it does not have enough CPUs, time, or memory, you can tell Nextflow to retry it thanks to these options:


process {
  errorStrategy = { task.exitStatus in [143,137,104,134,139] ? 'retry' : 'terminate' }
  maxRetries = 1
  maxErrors = '-1'


  // Process-specific resource requirements
  withLabel:process_high {
    cpus = { check_max( 12 * task.attempt, 'cpus' ) }
    memory = { check_max( 72.GB * task.attempt, 'memory' ) }
    time = { check_max( 16.h * task.attempt, 'time' ) }
  }
}
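On every retry, task.attempt multiplies the base request (72.GB becomes 144.GB on the second attempt, and so on), while the check_max() helper caps each value at the per-cluster maxima defined in params (shown just below for genotoul). As a rough idea, here is a simplified sketch of how nf-core pipelines typically define this helper in their main nextflow.config; the real function also handles and reports invalid values:

// Simplified sketch of the check_max() helper used in the resource directives above:
// it caps every request at the params.max_* limits of the cluster profile.
def check_max(obj, type) {
    if (type == 'memory') {
        // e.g. 144.GB on the second attempt is capped at params.max_memory (120.GB on genotoul)
        return obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1
            ? params.max_memory as nextflow.util.MemoryUnit : obj
    } else if (type == 'time') {
        return obj.compareTo(params.max_time as nextflow.util.Duration) == 1
            ? params.max_time as nextflow.util.Duration : obj
    } else if (type == 'cpus') {
        return Math.min(obj as int, params.max_cpus as int)
    }
    return obj
}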

In the genotoul profile, we fixed the maximum resources:

params {
  // Max resources requested by a normal node on genotoul.
  max_memory = 120.GB
  max_cpus = 48
  max_time = 96.h
}

See the complete genotoul profile file here.

You can override those parameters by creating a file named nextflow.config in the current directory or by using the -c <file>.config option.

For example, if you want to give more memory to all large tasks across most nf-core pipelines, the following config could work:

process {
  withLabel:process_high {
    memory = 120.GB
  }
}

You can be more specific than this by targeting a given process name instead of its label, using withName. For example:

process {
  withName:bwa_align {
    cpus = 32
  }
}
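Such a file is picked up automatically when it is named nextflow.config and sits in the launch directory; otherwise pass it explicitly with -c. A sketch of a launch command, in which custom_resources.config and the trailing placeholder stand for whatever you normally use:

# The custom file adds to (and overrides) the pipeline defaults and the genotoul profile
nextflow run nf-core/rnaseq -profile genotoul -c custom_resources.config <your usual pipeline options>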

Good practice:

  • Try to run the pipeline on one sample only first.
  • Check the HTML report available in results/pipeline_info.
  • Calibrate resources to:

    • get through the cluster queue faster
    • reduce the carbon footprint (fewer kill/retry cycles)
  • Calibration can be done:

    • in the workflow directory, by editing ~/.nextflow/assets/nf-core/rnaseq/conf/base.config (the retry strategy can be kept)
    • by overriding it with a nextflow.config file in the current directory

Exercise 11:

We are going to resume the Methylseq workflow after downgrading the memory available for the bismark step and removing its results.

  • Check the file ResultsMeth/pipeline_info/execution_report.html from your last execution of methylseq.
  • Which memory value could be set for bismark?
  • Edit ~/.nextflow/assets/nf-core/methylseq/conf/base.config and set the memory of bismark_align to 800.MB (to test the retry strategy):
withName:bismark_align {
    cpus = { check_max( 12 * task.attempt, 'cpus') }
    memory = { check_max( 800.MB * task.attempt, 'memory') }
    time = { check_max( 8.d * task.attempt, 'time') }
}
  • Delete the work directory of the previously completed bismark_align process for MT_rep1 (use the file ResultsMeth/pipeline_info/execution_trace.txt to find its working path):
$  cut -f 2,4,5 ResultsMeth/pipeline_info/execution_trace.txt

hash    name    status
35/1a236f    makeBismarkIndex (1)    CACHED
...
64/8e430a    bismark_align (MT_rep1)    COMPLETED
a2/ed8201    get_software_versions    COMPLETED
...

The complete path for this job is ./work/64/8e430ac54a4eebb11f81e34aceb154/; the path starts with the hash 64/8e430a, so use tab completion to get the complete directory name.
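From the launch directory, the deletion then boils down to a single command (the hash is the one from this example trace; yours will differ):

# Remove the cached task directory so that -resume has to recompute bismark_align for MT_rep1
rm -rf work/64/8e430ac54a4eebb11f81e34aceb154/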

  • Rerun the methylseq workflow with the -resume option, as sketched below.
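The important point is to reuse exactly the same command line as the first run and simply append -resume; the profile below is an assumption based on this training and should match whatever you used originally:

# Relaunch from the same directory with the same options as before, plus -resume
nextflow run nf-core/methylseq -profile genotoul <same options as the first run> -resume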

View correction
