TP 1.2: Single execution on a cluster
Goal: Identify the genes in a transcript FASTA file with the NCBI BLAST alignment software, running it on the cluster's compute nodes.
Simple submission command¶
We will use sequence alignment with NCBI_Blast+ as a use case.
Interactive job¶
Question
Connect to a node in interactive mode.
Warning
When you connect to a cluster node in interactive mode, you are always placed in your home directory.
Grumpy administrator
Never run a calculation on a login node! Use an interactive job or a batch job.
Solution
srun --pty bash
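A small guard can make the administrator's rule harder to break by accident. The sketch below assumes that Slurm exports the SLURM_JOB_ID environment variable inside every job (interactive or batch) and that it is unset on a login node. The function name on_compute_node is hypothetical.

```shell
# Return success only when the shell runs inside a Slurm job.
# Assumption: Slurm exports SLURM_JOB_ID inside jobs; it is unset on login nodes.
on_compute_node() {
    [ -n "${SLURM_JOB_ID:-}" ]
}

if on_compute_node; then
    echo "Inside job ${SLURM_JOB_ID}: safe to compute."
else
    echo "Not inside a Slurm job: start one with 'srun --pty bash' first."
fi
```

Placed at the top of a personal analysis script, a check like this stops the script before any heavy command runs on the login node.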
Prerequisite¶
Load the NCBI_Blast+ module:
module load bioinfo/NCBI_Blast+/2.10.0+
Run blast¶
Question
Launch a blast against the ensembl_danio_rerio_pep databank in interactive mode on the cluster.
Your query is nucleic and your databank is proteic, so you need to use the blastx program.
Tip
For more help on blast, type
blastx -help
Solution
blastx -query contigs.fasta -db ensembl_danio_rerio_pep -evalue 10e-10 -out contigs.blastx_dr
The databanks are located in /bank/blastdb; however, the cluster is configured so that you don't need to specify the path.
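Before launching the alignment, it can help to know how many sequences the query holds, since that gives a rough idea of the size of the job. A minimal sketch, assuming contigs.fasta is a regular FASTA file in which every record starts with a '>' header line; the function name count_fasta_records is hypothetical.

```shell
# Count the records in a FASTA file: each record starts with a '>' header line.
count_fasta_records() {
    grep -c '^>' "$1"
}

# contigs.fasta is the query file used above; skip quietly if it is not here.
if [ -f contigs.fasta ]; then
    echo "contigs.fasta holds $(count_fasta_records contigs.fasta) sequences"
fi
```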
Look for running jobs¶
Question
Open a new terminal and check all the jobs running or waiting on the cluster. Check your own job.
Solution
squeue
squeue -t R
squeue -t PD
squeue -u <username>
Question
Which node is your job running on?
Solution
squeue -u <username>
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1232823 workq bash user R 0:05 1 node129
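The node name is the last column of the default squeue listing, so it can also be pulled out automatically. The sketch below assumes the default column layout shown above, with NODELIST(REASON) last; the helper name job_node is hypothetical.

```shell
# Print the last column (NODELIST(REASON)) for one job ID,
# reading default-format squeue output on standard input.
job_node() {
    awk -v id="$1" 'NR > 1 && $1 == id { print $NF }'
}

# Usage on the cluster (not run here):
#   squeue -u "$USER" | job_node 1232823
```

For a pending job, the same column holds the reason in parentheses instead of a node name.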
Stop a running job¶
Question
Kill your job.
Solution
scancel 1232823
Batch mode¶
Question
Use a text editor to create a command file blastn.sh with the same module load and almost the same blast command line (replace blastx with blastn). The first line of the file must be:
#!/bin/bash
Launch it in batch mode.
Solution
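File blastn.sh contains:
#!/bin/bash
module load bioinfo/NCBI_Blast+/2.10.0+
blastn -db ensembl_danio_rerio_cdna -query contigs.fasta -evalue 10e-10 -out contigs.blastn_dr
Launch it with: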
sbatch blastn.sh
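A batch script can also carry its own resource requests as #SBATCH comment lines, which sbatch reads at submission. The sketch below writes a hypothetical variant of the script; the file name blastn_resources.sh, the job name, and the values (4 CPUs, 8G of memory) are illustrative assumptions, not cluster recommendations.

```shell
# Write a hypothetical variant of the batch script; the file name and the
# resource values (job name, 4 CPUs, 8G) are illustrative assumptions.
cat > blastn_resources.sh <<'EOF'
#!/bin/bash
#SBATCH -J blastn_dr
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH -o blastn_dr.%j.out

module load bioinfo/NCBI_Blast+/2.10.0+
# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is requested.
blastn -db ensembl_danio_rerio_cdna -query contigs.fasta -evalue 10e-10 \
       -num_threads "$SLURM_CPUS_PER_TASK" -out contigs.blastn_dr
EOF

# Check the shell syntax without running anything.
bash -n blastn_resources.sh && echo "syntax OK"
```

It would be submitted in the same way, with sbatch blastn_resources.sh. In the -o pattern, %j is replaced by the job ID.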
Check running job¶
Question
Check the execution. When it's over, look at the blast output file and the execution trace file slurm-xxxxx.out.
Has the job finished correctly?
Solution
squeue -u <username>
less contigs.blastn_dr
less slurm-XXXXX.out
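Besides reading the files, the final state of a job can be queried from Slurm's accounting with sacct. This is a sketch that assumes accounting is enabled on the cluster and that a clean run is reported as State COMPLETED with ExitCode 0:0; the helper name job_ok is hypothetical.

```shell
# Read lines of the form JobID|State|ExitCode and succeed only when the
# main job record is COMPLETED with exit code 0:0.
job_ok() {
    awk -F'|' -v id="$1" '
        $1 == id { found = 1; ok = ($2 == "COMPLETED" && $3 == "0:0") }
        END { exit !(found && ok) }'
}

# Usage on the cluster (not run here):
#   sacct -j 1232823 -n -P -o JobID,State,ExitCode | job_ok 1232823 \
#       && echo "finished correctly"
```

The -n option drops the header and -P prints pipe-separated fields, which keeps the parsing simple.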
Batch mode with inline command¶
Question
Launch the same command without using a file (option --wrap='command').
Check the execution.
When it's over, look at the blast output file and the execution trace file (slurm-xxxxx.out).
Has the job finished correctly?
Solution
sbatch -J blastdr --wrap='module load bioinfo/NCBI_Blast+/2.10.0+; blastn -db ensembl_danio_rerio_cdna -query contigs.fasta -evalue 10e-10 -out contigs.blastn_dr'
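The job ID printed at submission is needed later by squeue, scancel and seff, so it can be worth capturing. A minimal sketch, assuming sbatch confirms the submission with a line of the form "Submitted batch job <id>"; the helper name submitted_job_id is hypothetical.

```shell
# Extract the job ID from sbatch's confirmation line,
# assumed to look like "Submitted batch job 1232823".
submitted_job_id() {
    awk '$1 == "Submitted" && $2 == "batch" && $3 == "job" { print $4 }'
}

# Usage on the cluster (not run here):
#   jobid=$(sbatch blastn.sh | submitted_job_id)
#   squeue -j "$jobid"
```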
Look at the trace file¶
If you haven't had any errors so far, redo the previous submission with an error in the command. Have a look at the trace file.
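When a trace file is long, a quick filter can point at the suspicious lines. The sketch below is only a heuristic: the pattern list is an assumption and will miss errors worded differently, and the helper name trace_errors is hypothetical.

```shell
# List, with line numbers, the lines of a trace file that look like errors.
# The pattern list is an assumption; extend it for your own tools.
trace_errors() {
    grep -n -i -E 'error|command not found|no such file' "$1"
}

# Usage (not run here):
#   trace_errors slurm-XXXXX.out
```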
How many resources¶
Question
Look at the resources used by previous jobs. In particular, pay attention to CPU and memory usage.
Solution
seff <job_id>
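To focus on the lines that matter here, the seff report can be filtered. This sketch assumes the report contains lines beginning with "State:", "CPU Efficiency:" and "Memory Efficiency:", which may differ between seff versions; the helper name seff_summary is hypothetical.

```shell
# Keep only the state and efficiency lines of a seff report read on stdin.
# Assumption: seff labels them "State:", "CPU Efficiency:", "Memory Efficiency:".
seff_summary() {
    grep -E '^(State|CPU Efficiency|Memory Efficiency):'
}

# Usage on the cluster (not run here):
#   seff <job_id> | seff_summary
```

A low CPU or memory efficiency suggests that the next submission could request fewer resources.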