A GPU node on Snellius is available for interactive software development and compiling codes that utilize GPUs.
This page includes instructions on how to connect to this node and an example compilation.
This node is meant for users who want to compile their GPU codes on Snellius and perform small test runs, not lasting more than a few minutes.
Just like any other A100 GPU node on Snellius, this node consists of 4 GPU cards, each divided into 7 MIG instances, for a total of 28 MIG instances available to users.
Restrictions for the interactive GPU node
- A user needs to have access to the gpu partition and at least 1 SBU of GPU budget.
- This node is not exposed to the external world and is therefore not a login node: it can only be reached via ssh from one of the login nodes.
- This node is not meant for production runs, it is meant for sanity checking of your code.
- This is a shared node and the regular usage policy applies, meaning your processes will automatically be killed if they run there for more than 15 minutes.
- Once you assign yourself a MIG instance, you will need to load the necessary modules to run properly in your environment, including at least the CUDA runtime libraries.
- MIG instances are slices of a GPU in terms of memory and CUDA cores, meaning a full GPU will not be available to you on this node. If you need a full GPU for your testing, allocate a full node using salloc, or start an interactive session.
- A MIG instance is not exclusive; you may therefore run out of GPU memory when running your software.
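For example, a full GPU node can be requested with salloc; the partition name, GPU count, and time limit below are illustrative, so adjust them to your project:

```shell
# Request one full GPU node interactively (illustrative values).
salloc --partition=gpu --nodes=1 --ntasks=1 --gpus-per-node=4 --time=00:30:00
```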
Logging into the GPU node
You can log in to gcn1 using ssh from a login node:
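For example, once you are on a login node:

```shell
# gcn1 is only reachable from within Snellius, e.g. from a login node.
ssh gcn1
```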
If you cannot ssh into gcn1, check that you have access to the GPU partition, i.e. that the 'partition' field of your accounting information contains 'gpu'.
And check that you have a positive GPU budget:
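Both checks can be done with the accinfo command (assuming accinfo is available on your login node; the exact output format may differ):

```shell
# Show your accounting information; look for 'gpu' in the partition
# field and a positive remaining GPU budget.
accinfo
```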
If you have access to the GPU partition and have a positive budget, but still cannot log in to gcn1, please contact the service desk (https://servicedesk.surf.nl).
Compilation and testing
In this section we compile a simple CUDA application that performs ping-pong cycles using CUDA-aware MPI. The MIG instances are treated as individual devices and assigned one per MPI rank using a wrapper script.
Code (file name: pp_cuda_aware.cu):
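The original listing is not reproduced on this page; the sketch below is a minimal CUDA-aware MPI ping-pong between two ranks, where device pointers are passed directly to MPI calls. The message size, cycle count, and output are illustrative, not the exact contents of pp_cuda_aware.cu:

```cuda
// Minimal CUDA-aware MPI ping-pong sketch (illustrative, not the
// original pp_cuda_aware.cu). Rank 0 and rank 1 bounce a device
// buffer back and forth; MPI receives the GPU pointer directly.
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size != 2) {
        if (rank == 0) fprintf(stderr, "This example needs exactly 2 ranks\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    const size_t n = 1 << 20;           // 1 Mi doubles per message
    double *d_buf;
    cudaMalloc(&d_buf, n * sizeof(double));
    cudaMemset(d_buf, 0, n * sizeof(double));

    const int cycles = 100;
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < cycles; i++) {
        if (rank == 0) {
            // Device pointers go straight into MPI (CUDA-aware MPI).
            MPI_Send(d_buf, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(d_buf, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(d_buf, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(d_buf, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();
    if (rank == 0)
        printf("average ping-pong time: %f s\n", (t1 - t0) / cycles);

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}
```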
Wrapper script (file name: mpi_wrapper.sh):
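The wrapper's contents are not reproduced on this page; a minimal version, assuming Open MPI (which exports OMPI_COMM_WORLD_LOCAL_RANK to each rank), could look like:

```shell
#!/bin/bash
# mpi_wrapper.sh (sketch): give each local MPI rank its own MIG instance.
# nvidia-smi -L prints one line per MIG device, including its UUID.
MIG_IDS=( $(nvidia-smi -L | grep -o 'MIG-[0-9a-f-]*') )
export CUDA_VISIBLE_DEVICES=${MIG_IDS[$OMPI_COMM_WORLD_LOCAL_RANK]}
exec "$@"
```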
This wrapper script needs to be made executable using `chmod +x mpi_wrapper.sh`.
Load required modules:
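For example, with the 2022 software environment (module names and versions are indicative; run `module avail` to see what is actually installed):

```shell
module load 2022                # select the 2022 software environment
module load foss/2022a          # GCC + Open MPI toolchain
module load CUDA/11.7.0         # CUDA runtime and nvcc
```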
Compilation and execution:
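A possible compile-and-run sequence (flags and rank count are illustrative; with the foss toolchain, the MPI compiler wrapper supplies the MPI include and library paths when used as nvcc's host compiler):

```shell
# Compile with nvcc, using the MPI C++ wrapper as host compiler/linker.
nvcc -ccbin=mpicxx pp_cuda_aware.cu -o pp_cuda_aware
# Run two ranks; the wrapper pins each rank to its own MIG instance.
mpirun -np 2 ./mpi_wrapper.sh ./pp_cuda_aware
```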
- Please note that the modules used in this example are from the 2022 environment; this can vary based on the environment your application uses.
If we run nvidia-smi, we see that the four available GPUs are split into a total of 4 x 7 = 28 MIG devices:
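For example, counting only the MIG device lines (on this node the count should be 28):

```shell
# nvidia-smi -L lists one line per GPU and per MIG device;
# MIG UUIDs start with 'MIG-'.
nvidia-smi -L | grep -c 'MIG-'
```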
Assign yourself a different MIG instance
MIG instances are not exclusive, which means that another user may already be utilizing the MIG instance you are trying to use; in that case you can assign yourself another MIG instance.
First, check which processes are running on a particular MIG instance using nvidia-smi:
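For example:

```shell
# The 'Processes' section at the bottom of the output lists, for each
# process, the GPU and the GI/CI (GPU instance / compute instance)
# it is running on.
nvidia-smi
```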
Then, to assign yourself a different MIG instance, you can use the code snippet below:
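A snippet along these lines (index 14 selects the 15th instance; on the actual node, pick one that is free):

```shell
# Load the UUIDs of all MIG instances into a bash array.
MIG_IDS=( $(nvidia-smi -L | grep -o 'MIG-[0-9a-f-]*') )
# Make only the 15th MIG instance (index 14) visible to CUDA.
export CUDA_VISIBLE_DEVICES=${MIG_IDS[14]}
```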
As you can see in the snippet above, you first load the ids of the available MIG instances into a bash array. You can then assign a specific id to the environment variable CUDA_VISIBLE_DEVICES; in this case it is the id of the 15th MIG instance (array index 14).