Here we list the issues that are known to us and that you don't have to report to the Service Desk. Of course, if you encounter issues on Snellius not listed here then please let us know (through the Service Desk).
InfiniBand and file system performance
We found that the InfiniBand connections are not always stable and may underperform. Similarly, the GPFS file systems (home, project, and scratch) are not always performing as expected: reading and writing large files performs as expected, but reading and writing many small files is slower than expected. Among other things, this can affect the time it takes to start a binary, run commands, etc.
We are looking into both issues.
Using NCCL for GPU <=> GPU communication
NCCL is a communication library that offers optimized primitives for inter-GPU communication. We have found that it often hangs during initialization on Snellius; the probability of a hang during init increases with the number of GPUs in the allocation. The issue is that NCCL sets up its communication using an Ethernet-based network interface. By default, it selects the 'ib-bond0' interface, which supports IP over the InfiniBand network in Snellius. However, this interface seems to be experiencing issues.
As a workaround, you can configure NCCL to use the traditional Ethernet interface, which on Snellius GPU nodes is called 'eno1np0', by exporting the following environment variable:
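A minimal sketch, assuming the variable meant here is NCCL's documented `NCCL_SOCKET_IFNAME` selector for its socket-based bootstrap interface:

```shell
# Tell NCCL to set up its socket-based bootstrap communication over the
# traditional Ethernet interface instead of the default ib-bond0
export NCCL_SOCKET_IFNAME=eno1np0
```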
Note that if you use mpirun as the launcher, you should make sure that the variable gets exported to the other nodes in the job too.
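With Open MPI's mpirun, this can be done with the `-x` flag, which re-exports an environment variable to the processes launched on all nodes (the application name below is a placeholder):

```shell
# -x forwards NCCL_SOCKET_IFNAME to the ranks on every node in the job;
# ./my_app is a placeholder for your actual application
mpirun -x NCCL_SOCKET_IFNAME ./my_app
```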
(Note that when launching your parallel application with srun, your environment is exported automatically, so this second step is not needed.)
The performance impact of this workaround is expected to be minimal: the traditional Ethernet interface is only used to initialize the connection. Any further NCCL communication between nodes is performed over native InfiniBand.
Cartopy: ibv_fork_init() warning
Users can encounter the following warning message when importing the "cartopy" and "netCDF" modules in Python:
The issue is similar to the one reported here. The warning will disappear if "cartopy" is imported before "netCDF".
Another solution is to disable OFI before running the Python script:
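A sketch of how this could be done, assuming the fix excludes the OFI (libfabric) components via Open MPI's MCA environment variables (the specific variable values are an assumption, not preserved from the original):

```shell
# Exclude the OFI components from Open MPI's byte-transfer and matching
# transport layers before starting Python (assumed settings)
export OMPI_MCA_btl='^ofi'
export OMPI_MCA_mtl='^ofi'
```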
Attaching to a process with GDB can fail
When using gdb -p <pid> (or the equivalent attach <pid> command within gdb) to attach to a process running in a SLURM job, you might encounter errors or warnings about executable and library files that cannot be opened:
Such issues will also prevent symbols from being resolved correctly, making debugging really difficult.
The reason this happens is that processes in a SLURM job get a slightly different view of the file system mounts (through a so-called namespace). When you log into the node over SSH and attach GDB to the running process, the gdb process is not in the same namespace, so GDB cannot directly access the binary (and its libraries) you're trying to debug.
The workaround is to use a slightly different method for attaching to the process:
$ gdb <executable>
(gdb) set sysroot /
(gdb) attach <pid>
For the example above, to attach to /usr/bin/sleep (PID 1054730), the steps would become:
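Substituting the concrete values into the three steps above, the session looks like:

```
$ gdb /usr/bin/sleep
(gdb) set sysroot /
(gdb) attach 1054730
```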
Allocating multiple GPU nodes
Normally, batch scripts like
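A sketch of such a batch script, requesting the total number of GPUs (the partition name and application are placeholders, not from the original):

```shell
#!/bin/bash
#SBATCH --partition=gpu        # assumed partition name
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --gpus=8               # total GPU count: this form triggers the hang

srun ./my_mpi_app              # placeholder application
```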
should get you an allocation with 2 GPU nodes, 8 GPUs, and 4 MPI tasks per node. However, right now there is an issue when specifying a total number of GPUs larger than 4: jobs with the above SBATCH arguments that use OpenMPI and call srun or mpirun will hang.
Instead of specifying the total number of GPUs, please specify the number of GPUs per node, combined with the number of nodes. E.g.
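For example, the same request expressed per node (again a sketch; the partition name and application are placeholders):

```shell
#!/bin/bash
#SBATCH --partition=gpu        # assumed partition name
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --gpus-per-node=4      # 2 nodes x 4 GPUs = 8 GPUs in total

srun ./my_mpi_app              # placeholder application
```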
This will give you the desired allocation with a total of 2 GPU nodes, 8 GPUs, and 4 MPI tasks per node, and srun (or mpirun) will not hang.
Running my MPI job
I am getting the following error when I run my MPI job:
This error occurs when one tries to use srun with a process management interface (PMI) version that is not available. The reason for the unavailability could be that the PMI version was upgraded recently. The user can also force a particular PMIx version to be used by their application by invoking the execution command in the following manner:
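A sketch of such an invocation, pinning a specific PMIx version via Slurm's `--mpi` option (the application name is a placeholder):

```shell
# Force srun to use the pmix_v2 plugin for process management
srun --mpi=pmix_v2 ./my_mpi_app
```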
In the above case, pmix_v2 is not available anymore. The best approach is to use srun without the --mpi option or, if you still want to force PMIx usage, to omit the version; the scheduler will then choose the latest version that is installed:
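For example (the application name is a placeholder):

```shell
# Unversioned: the scheduler picks the latest installed PMIx plugin
srun --mpi=pmix ./my_mpi_app
```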
If you want to list the PMI versions that are available, you can do so by executing the following on the command line:
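Slurm provides a built-in option for this:

```shell
# List the MPI/PMI plugin types that this Slurm installation supports
srun --mpi=list
```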
Some background regarding the Process Management Interface (PMI):
PMI provides an API and a library that interact with different MPI libraries via that API to facilitate inter-process communication. PMI libraries typically store processor/rank information in a database which the MPI libraries can query in order to perform communication. For further reading, please refer to: https://docs.openpmix.org/en/latest/history.html and https://link.springer.com/chapter/10.1007/978-3-642-15646-5_4 and https://dl.acm.org/doi/pdf/10.1145/3127024.3127027 .