Synopsis

This page contains examples of how to request resources for different types of jobs on our HPC system Snellius. See the Snellius usage and accounting sections for more information on the partition configuration and node hardware.


Single node, serial program

This job script launches a single, serial program on one CPU node. The program itself uses only a single core, but since the smallest allocatable unit on Snellius is 1/4 of a node, 32 cores are requested.

Snellius
#!/bin/bash
#Set job requirements
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32
#SBATCH --partition=rome
#SBATCH --time=01:00:00

#Execute program located in $HOME
$HOME/my_serial_program


This script will allocate 1/4 of the resources of the node (32 cores) for the requested time (1h) or until completion of the job.
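
To submit the job, save the script to a file and pass it to sbatch; you can then follow the status of the job with squeue. The file name below is only an example.

sbatch my_serial_job.sh
squeue -u $USER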




Single node, multithreaded program (CPU and GPU)

If you have a multithreaded program, e.g. through the use of OpenMP, a single instance can already use multiple cores on the node. 

Snellius
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=64
#SBATCH --partition=rome
#SBATCH --time=01:00:00

# Execute program located in $HOME
srun $HOME/my_multithreaded_program


This script will allocate 1/2 of the resources of the node (64 cores) for the requested time (1h) or until completion of the job.
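
For OpenMP programs the number of threads usually has to be set explicitly, since it is not always derived from the allocation automatically. A minimal sketch, assuming the program honours the standard OMP_NUM_THREADS variable:

# Assumption: the program is an OpenMP program that reads OMP_NUM_THREADS
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun $HOME/my_multithreaded_program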

The minimum size of allocatable resources on Snellius is 1/4th of a node. For single node jobs, users can allocate 1/4, 1/2, 3/4 or the full node.

Snellius - GPU
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=18
#SBATCH --gpus=1
#SBATCH --partition=gpu
#SBATCH --time=01:00:00

# Execute program located in $HOME
srun $HOME/my_multithreaded_GPU-program


This script will allocate 1/4th of a GPU node (18 cores + 1 GPU) for the requested time (1h) or until completion of the job.
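
If you want to verify which GPU the job can see, an optional quick check is to run nvidia-smi as a job step before starting your own program:

# Optional check: list the GPU(s) visible inside the allocation
srun nvidia-smi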

Snellius - Multi-instance GPU (MIG)
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=9
#SBATCH --gpus=1
#SBATCH --partition=gpu_mig
#SBATCH --time=01:00:00

# Create folder and copy input to scratch. This will copy the input file 'input_file' to the shared scratch space
mkdir -p /scratch-shared/$USER
cp $HOME/input_file /scratch-shared/$USER
 
cd /scratch-shared/$USER
# Execute the program. In this example we request 1/2 of a GPU (one MIG instance)

$HOME/my_gpu_program input_file output

# Copy output back from scratch to the directory from where the job was submitted
cp -r output $SLURM_SUBMIT_DIR

This script will allocate one half of one GPU in the node (+ 9 CPU cores) for the requested time (1h) or until completion of the job. Here, "one half of a GPU" means one of the two MIG instances available on a GPU.



Single node, concurrent programs on the same node (CPU and GPU)

These job scripts execute multiple copies of the same program concurrently, using all the cores (and, in the GPU example, all the GPUs) available on the node.

The program takes two arguments: an input file and an output file name. In this example, the program uses the same input file for every execution. To avoid reading it repeatedly from the slower home file system, we first copy the input to the shared scratch space and work from there.

Snellius - Serial program
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=128
#SBATCH --partition=rome
#SBATCH --time=01:00:00

# Create folder and copy input to scratch. This will copy the input file 'input_file' to the shared scratch space
mkdir -p /scratch-shared/$USER
cp $HOME/input_file /scratch-shared/$USER
 
cd /scratch-shared/$USER
# Execute the program N times, where N is the number of tasks requested from SLURM. In this example we request all the cores available on the node.
# The '&' sign is used to start each program in the background, so that the programs start running concurrently.

for i in `seq 1 $SLURM_NTASKS`; do
  srun --ntasks=1 --nodes=1 --cpus-per-task=1 --exclusive $HOME/my_serial_program input_file output_$i &
done
wait

 
#Copy output back from scratch to the directory from where the job was submitted
cp -r output_* $SLURM_SUBMIT_DIR


This script will allocate 128 cores on a node in the rome partition for the requested time (1h) or until completion of the job.

Snellius - Parallel program
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=128
#SBATCH --partition=rome
#SBATCH --time=01:00:00

# Create folder and copy input to scratch. This will copy the input file 'input_file' to the shared scratch space
mkdir -p /scratch-shared/$USER
cp $HOME/input_file /scratch-shared/$USER
 
cd /scratch-shared/$USER
# Execute the program 4 times, each run using 32 cores in parallel. In this example we request all the cores available on the node.
# The '&' sign is used to start each program in the background, so that the programs start running concurrently.

for i in `seq 1 4`; do
  srun --ntasks=32 --exclusive $HOME/my_parallel_program input_file output_$i &
done
wait
 
#Copy output back from scratch to the directory from where the job was submitted
cp -r output_* $SLURM_SUBMIT_DIR


This script will allocate 128 cores on a node in the rome partition for the requested time (1h) or until completion of the job.

Snellius - GPU
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=18
#SBATCH --gpus=4
#SBATCH --partition=gpu
#SBATCH --time=01:00:00

# Create folder and copy input to scratch. This will copy the input file 'input_file' to the shared scratch space
mkdir -p /scratch-shared/$USER
cp $HOME/input_file /scratch-shared/$USER
 
cd /scratch-shared/$USER
# Execute the program N times, where N is the number of tasks requested from SLURM. In this example we request all the cores and all the GPUs available on the node.
# The '&' sign is used to start each program in the background, so that the programs start running concurrently.

for i in `seq 1 $SLURM_NTASKS`; do
  srun --ntasks=1 --gpus=1 --exclusive $HOME/my_gpu_program input_file output_$i &
done
wait
 
#Copy output back from scratch to the directory from where the job was submitted
cp -r output_* $SLURM_SUBMIT_DIR


This script will allocate 4 GPUs on the node (+ 72 cores) for the requested time (1h) or until completion of the job.




Single node, concurrent pipelines

Sometimes, you want to execute the same pipeline (or a single program) on different samples of data. For example, you want to do some preprocessing, then run the actual analysis, and then run some postprocessing on a number of different input files. Such a pipeline has to be run sequentially for a given input, i.e. you cannot start the analysis before the preprocessing step has finished. You can, however, start multiple instances of the pipeline at the same time, each on a different input.

This example runs a pipeline with three steps (preprocessing, analysis and postprocessing) on 10 different input files (input_file_1 ... input_file_10), each using a single core on the node.

Snellius
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=10
#SBATCH --partition=rome
#SBATCH --time=01:00:00

# Create folder and copy input to scratch. This will copy all files with a name starting with 'input_file_'
mkdir -p /scratch-shared/$USER
cp $HOME/input_file_* /scratch-shared/$USER
 
# Create output directory on scratch
mkdir -p  /scratch-shared/$USER/output
 
# Execute a series of serial programs (i.e. a 'pipeline'). This example pipeline consists of preprocessing, analysis and postprocessing.
# In this example, each of our programs takes two arguments: the input (defined with the -i argument) and an output (defined with -o).
# The '(' and ')' parentheses group the commands into a subshell, i.e. a code block.
# The '&' sign is used to start each code block in the background, so that the pipeline starts running on input_file_1 to input_file_10 concurrently.
# Within the code block, the commands are executed sequentially. That way, the analysis will not start before the preprocessing is finished.

cd /scratch-shared/$USER

for i in `seq 1 $SLURM_NTASKS`; do
(
  $HOME/preprocessing -i input_file_$i -o output/preprocessed_$i
  $HOME/analysis -i output/preprocessed_$i -o output/analyzed_$i
  $HOME/postprocessing -i output/analyzed_$i -o output/output_$i
) &
done
wait
 
#Copy output folder back from scratch to the directory from where the job was submitted
cp -r output $SLURM_SUBMIT_DIR


This script will allocate 1/4 of the resources of the node (32 cores, the minimum allocatable size) for the requested time (1h) or until completion of the job, even though only 10 tasks are requested.
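
If you want a pipeline to stop as soon as one of its stages fails, you can enable bash's errexit option inside the subshell. A minimal sketch of a single loop iteration, using the same programs as in the script above:

(
  set -e   # abort this pipeline as soon as one stage exits with an error
  $HOME/preprocessing -i input_file_$i -o output/preprocessed_$i
  $HOME/analysis -i output/preprocessed_$i -o output/analyzed_$i
  $HOME/postprocessing -i output/analyzed_$i -o output/output_$i
) &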




Single node, parallel program (CPU and GPU)

This example runs a program in parallel on a CPU node and on a GPU node, using half of the resources available on the node.

Snellius
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=64
#SBATCH --partition=rome
#SBATCH --time=01:00:00

# Load modules for MPI and other parallel libraries
module load 2021 
module load foss/2021a

# Create folder and copy input to scratch. This will copy the input file 'input_file' to the shared scratch space
mkdir -p /scratch-shared/$USER
cp $HOME/input_file /scratch-shared/$USER
 
cd /scratch-shared/$USER

# Execute the program in parallel on ntasks cores 

srun $HOME/my_parallel_program input_file output 
 
# Copy output back from scratch to the directory from where the job was submitted
cp -r output $SLURM_SUBMIT_DIR


This script will allocate 64 cores on a node in the rome partition for the requested time (1h) or until completion of the job.

Snellius - GPU
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --gpus=2
#SBATCH --cpus-per-task=18
#SBATCH --partition=gpu
#SBATCH --time=01:00:00

# Load modules for MPI and other parallel libraries
module load 2021 
module load foss/2021a
module load CUDA/version

# Create folder and copy input to scratch. This will copy the input file 'input_file' to the shared scratch space
mkdir -p /scratch-shared/$USER
cp $HOME/input_file /scratch-shared/$USER
 
cd /scratch-shared/$USER
# Execute the program in parallel with 2 tasks (18 cores per task, 2 GPUs in total)

srun $HOME/my_parallel_program input_file output 
 
# Copy output back from scratch to the directory from where the job was submitted
cp -r output $SLURM_SUBMIT_DIR


This script will allocate 2 GPUs on the node (+ 36 cores) for the requested time (1h) or until completion of the job.




Multiple nodes, parallel program (CPU and GPU)

This example runs a program in parallel across multiple CPU or GPU nodes. The CPU example uses half of the cores on each node; the GPU example uses all four GPUs on each node.



Snellius
#!/bin/bash
#SBATCH --nodes=100
#SBATCH --ntasks-per-node=64 
#SBATCH --partition=rome
#SBATCH --time=01:00:00

# Load modules for MPI and other parallel libraries
module load 2021 
module load foss/2021a

# Create folder and copy input to scratch. This will copy the input file 'input_file' to the shared scratch space
mkdir -p /scratch-shared/$USER
cp $HOME/input_file /scratch-shared/$USER
 
cd /scratch-shared/$USER

# Execute the program in parallel on 6400 cores

srun $HOME/my_parallel_program input_file output 
 
# Copy output back from scratch to the directory from where the job was submitted
cp -r output $SLURM_SUBMIT_DIR


This script will allocate 100 nodes in the rome partition for the requested time (1h) or until completion of the job. Jobs on multiple nodes get exclusive access to the nodes, independently of the number of cores requested and used.
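
Before launching a large multi-node run, it can be useful to check how SLURM places the tasks. A quick, purely illustrative test is to run a trivial command with the same job settings:

# Placement check: prints one line per task, showing the node it runs on
srun hostname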


Snellius - GPU
#!/bin/bash
#SBATCH --nodes=10
#SBATCH --ntasks-per-node=4 
#SBATCH --gpus-per-node=4
#SBATCH --cpus-per-task=18
#SBATCH --partition=gpu
#SBATCH --time=01:00:00

# Load modules for MPI and other parallel libraries
module load 2021 
module load foss/2021a
module load CUDA/version

# Create folder and copy input to scratch. This will copy the input file 'input_file' to the shared scratch space
mkdir -p /scratch-shared/$USER
cp $HOME/input_file /scratch-shared/$USER
 
cd /scratch-shared/$USER

# Execute the program in parallel. The code will run 4 tasks per node (18 cores each) and use 4 GPUs per node

srun $HOME/my_parallel_program input_file output 
 
# Copy output back from scratch to the directory from where the job was submitted
cp -r output $SLURM_SUBMIT_DIR


This script will allocate 40 GPUs across the 10 nodes (+ 720 cores) for the requested time (1h) or until completion of the job. Jobs on multiple nodes get exclusive access to the nodes, independently of the number of cores and GPUs requested and used.





Multiple nodes, multiple programs with SLURM array jobs

Through SLURM it is possible to launch a large number of independent jobs with a single sbatch command using the job array functionality. For example, the command:


sbatch --array 1-100 myjob.sh

will launch 100 identical copies of the job script "myjob.sh". The tasks of an array job are copies of the master script that are submitted to the scheduler. A job array can be specified in different ways:

# Array with tasks numbered from 0 to 100.
#SBATCH --array=0-100

# Array with tasks numbered 1, 7, 20, 27, 101.
#SBATCH --array=1,7,20,27,101

# Array with tasks numbered from 1 to 15 with 2 spacing (1,3,5,7,9,11,13,15)
#SBATCH --array=1-15:2

A job script for an array job can be identical to any of the job scripts in the examples above. However, specifically for array jobs, you can use the environment variable SLURM_ARRAY_TASK_ID to differentiate what each job in the array does. For example, you might pass SLURM_ARRAY_TASK_ID as an argument to your program, so that each copy of the job script processes a different input.
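
A minimal sketch of this idea, with hypothetical file and program names: each array task selects its own input file based on its task id.

# Hypothetical example: array task i processes input_file_i
INPUT=input_file_${SLURM_ARRAY_TASK_ID}
srun $HOME/my_program $INPUT output_${SLURM_ARRAY_TASK_ID}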

The example below uses the %A_%a notation to fill in the output/error file names, where %A is the master job id and %a is the array task id. This is a simple way to create output files in which the file name is different for each job in the array.

Snellius
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=128
#SBATCH --partition=rome
#SBATCH --time=01:00:00
#SBATCH --output=array_%A_%a.out
#SBATCH --error=array_%A_%a.err

# Execute program located in $HOME, passing $SLURM_ARRAY_TASK_ID as argument
srun $HOME/my_multithreaded_program $SLURM_ARRAY_TASK_ID
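
With this script saved to a file, it can be submitted as an array job, for example (the file name is illustrative):

sbatch --array=1-100 myjob.sh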