This page explains the partitions available to users and the accounting for each partition. The configuration of Snellius also allows users to use a node in a “shared” mode where they can use a subset of the resources of a full node, with implications for accounting.
This page assumes knowledge of partition usage and of how to submit a job using SLURM. Please refer to the HPC user guide for a general introduction to these topics.
Snellius partitions
Compute nodes are grouped into partitions to allow the user to select different hardware to run their software on. Each partition includes a subset of nodes with a different type of hardware and a specific maximum wall time.
A partition can be selected via one of the following SLURM options (long or short form):
#SBATCH --partition=<partition name>
#SBATCH -p <partition name>
The partitions available on Snellius are summarised in the table below. For details of the different hardware available on each node, please look at the Snellius hardware page.
The “Available memory per node” is the amount of memory available to users, i.e. what can be requested within a job. This value is smaller than the “Total memory” of the node, as it excludes the memory reserved for the OS and other system processes.
Partition name | Node type | # cores per node | Available memory per node | Smallest possible allocation | Max wall time | Notes |
---|---|---|---|---|---|---|
rome / thin | tcn (thin compute node, AMD Rome CPU) | 128 | 224 GiB | 1/8 node: 16 cores + | 120 h (5 days) | The “thin” and “rome” partitions are currently aliases for the same set of nodes; this might change in the near future. |
genoa | tcn (thin compute node, AMD Genoa CPU) | 192 | 336 GiB | 1/8 node: 24 cores + | 120 h (5 days) | |
fat_rome | fcn (fat compute node, Rome) | 128 | 960 GiB | 1/8 node: 16 cores + | 120 h (5 days) | Access only through NWO Large Compute applications or Small Compute applications. Please contact the service desk for more information. |
fat_genoa | fcn (fat compute node, Genoa) | 192 | 1440 GiB | 1/8 node: 24 cores + | 120 h (5 days) | Access only through NWO Large Compute applications or Small Compute applications. Please contact the service desk for more information. |
himem_4tb | hcn, PH.hcn4T (High memory node 4 TiB) | 128 | 3840 GiB | 1/8 node: 16 cores + | 120 h (5 days) | Access only through NWO Large Compute applications or Small Compute applications. Please contact the service desk for more information. |
himem_8tb | hcn, PH.hcn8T (High memory node 8TiB) | 128 | 7680 GiB | 1/8 node: 16 cores + | 120 h (5 days) | Access only through NWO Large Compute applications or Small Compute applications. Please contact the service desk for more information. |
gpu_a100 / gpu | gcn | 72 | 480 GiB | 1/4 node: 18 cores + 1 GPU + | 120 h (5 days) | NVIDIA A100 GPUs |
gpu_h100 | gcn | 64 | 720 GiB | 1/4 node: 16 cores + 1 GPU + | 120 h (5 days) | NVIDIA H100 GPUs, 94 GiB per GPU, AMD 4th Gen EPYC CPUs |
gpu_mig | gcn (MIG) | 72 | 480 GiB | 1/8 node: 9 cores + 1 GPU (MIG) + | 120 h (5 days) | Multi-Instance GPU (MIG) is NVIDIA technology that partitions a single GPU into multiple instances, each fully isolated with its own high-bandwidth memory, cache, and compute cores. On Snellius, each GPU on these nodes is partitioned into 2 independent instances, for a total of 8 MIG instances per node. |
gpu_vis | gcn | 72 | 480 GiB | 1/4 node: 18 cores + 1 GPU + | 24h (1 day) | The nodes in this partition are meant for (interactive) data visualization usage only, not for GPU compute. Access is restricted by default. Please contact the service desk to request access to this partition for visualization purposes. |
staging | srv | 16 (32 threads with SMT) | 224 GiB | 1 thread + | 120 h (5 days) | SMT is activated on the srv nodes, enabling up to 32 threads per node (2 threads/core). |
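For illustration, a minimal job script that requests a quarter of a gpu_a100 node (18 cores and 1 of the 4 A100 GPUs, per the table above) could look like the sketch below; the executable name is a placeholder:
#!/bin/bash
# Hypothetical example: request 1/4 of a gpu_a100 node (18 cores, 1 A100 GPU)
#SBATCH --partition=gpu_a100
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=18
#SBATCH --gpus=1
#SBATCH --time=02:00:00

# Run the (placeholder) GPU application
srun ./my_gpu_app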
Short jobs (at most 1 hour walltime)
Whenever you submit a job with a wall time of at most 1 hour to the “thin”, “fat” or “gpu” partitions, SLURM will schedule the job on a node that is only available for such short jobs. This effectively reduces the wait time for short jobs compared to longer jobs, which is useful for testing the setup and correctness of your jobs before submitting long-running production runs.
Note that the number of nodes that can run short jobs is relatively small, so submitting a short-running job that uses many (e.g. tens or hundreds of) nodes will not work.
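For example, a hedged sketch of a short test job that is eligible for the short-job nodes (the executable name is a placeholder):
#!/bin/bash
# Request 1/8 of a rome node for at most 1 hour, so the job qualifies as a short job
#SBATCH --partition=rome
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --time=01:00:00

srun ./my_test_app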
Accounting
Resource usage is measured in SBUs (System Billing Units). An SBU can be thought of as a “weighted” or “normalised” core hour. Because nodes differ in the type of CPU, the amount of memory, and attached resources like a GPU or a local NVMe disk, SBUs are assigned and weighted per node type. On Snellius, charging for resource usage is based on how long a resource was used (wall-clock time) in addition to the type and number of nodes (or partial nodes) used. A more detailed tutorial on how to estimate SBU usage can be found here.
The tables below show the “SBU pricing” of core and GPU hours for the various node types.
Accounting (CPU nodes)
Node type | Description | Weight | # CPU cores per node | Smallest possible allocation | SBUs per 1 hour (full node) | SBUs per 1 hour (smallest allocation) |
---|---|---|---|---|---|---|
tcn rome | Thin compute node, AMD Rome CPU | 1.0 | 128 | 1/8 node: 16 cores | 128 SBUs | 16 SBUs |
tcn genoa | Thin compute node, AMD Genoa CPU | 1.0 | 192 | 1/8 node: 24 cores | 192 SBUs | 24 SBUs |
fcn rome | Fat compute node, AMD Rome CPU | 1.5 | 128 | 1/8 node: 16 cores | 192 SBUs | 24 SBUs |
fcn genoa | Fat compute node, AMD Genoa CPU | 1.5 | 192 | 1/8 node: 24 cores | 288 SBUs | 36 SBUs |
hcn, PH1.hcn4T | High memory node 4 TiB | 2.0 | 128 | 1/8 node: 16 cores | 256 SBUs | 32 SBUs |
hcn, PH1.hcn8T | High memory node 8 TiB | 3.0 | 128 | 1/8 node: 16 cores | 384 SBUs | 48 SBUs |
srv | Service node, for data transfer jobs | 2.0 | 16 (32 threads with SMT) | 1 thread | 32 SBUs | 1 SBU |
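For example, a single-node job using 32 cores of a rome thin node (2/8 of the node, see the shared usage accounting below) for 12 hours is charged 2 × 16 SBUs × 12 hours = 384 SBUs, the same as using the full node for 3 hours (128 SBUs × 3 hours).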
Accounting (GPU nodes)
Accounting for GPU node types is on a “per GPU” basis.
For example:
- If you use a single A100 GPU for 1 hour, you will be charged: 1 GPU x 128 Accounting Weight Factor (per GPU) x 1 hour = 128 SBUs
- If you use a full H100 node (i.e. 4 GPUs) for a full day: 4 GPUs x 192 Accounting Weight Factor (per GPU) x 24 hours = 18432 SBUs
Node type | Description | Accounting Weight Factor (per GPU) | # GPUs per node | # CPU cores per node | Smallest possible allocation | SBUs per 1 hour (full node) | SBUs per 1 hour (smallest allocation) |
---|---|---|---|---|---|---|---|
gcn_a100 | GPU-enhanced compute node with 4 A100 GPUs | 128 | 4 | 72 | 1/4 node: 18 cores + 1 GPU | 512 SBUs | 128 SBUs |
gcn_h100 | GPU-enhanced compute node with 4 H100 GPUs | 192 | 4 | 64 | 1/4 node: 16 cores + 1 GPU | 768 SBUs | 192 SBUs |
gcn (MIG) | GPU-enhanced compute node with 8 MIG GPU instances | 64 | 8 | 72 | 1/8 node: 9 cores + 1 GPU (MIG) | 512 SBUs | 64 SBUs |
Shared usage accounting
It is possible to submit a single-node job on Snellius that uses only part of the resources of a full node. “Resources” here means either cores or memory of a node. The rules for shared resource accounting are described below. Example shared usage job scripts can be found here.
For single-node jobs (only), users can request part of a node's resources. Jobs that require multiple nodes will always allocate (and get charged for) full nodes, i.e. there are no multi-node jobs that share nodes with other jobs.
The requested resources, i.e. CPU and memory, will be enforced by cgroups limits. This means that when you request, say, 1 CPU core and 1 GB of memory, those will be the hardware resources your job gets access to, and only those (even if a node has more hardware resources).
However, the accounting of shared jobs using less than a full node is done in increments of 1/8th of a node (1/4th of a node for the GPU nodes). So any combination of memory and/or cores (or GPUs) will be rounded up to the next eighth node (quarter for the GPU nodes), up to a full node. An eighth of a node's resources is defined to be an eighth of a node's total cores or total memory. The resource (memory/cores) that is requested at the highest fraction will define the resource allocation of the job. So requesting a quarter of the memory and half the CPU cores will lead to half the node being accounted.
For nodes with attached GPUs, a quarter of a node implies: 1 GPU + a quarter of the cores of the CPU and memory.
Here is a list of example shared usage allocations.
For CPU nodes:
- 1/8 node reservation
  - Single-node jobs requesting up to and including 16 cores for a rome or high memory node
  - Single-node jobs requesting up to and including 28 GiB memory on a thin node or 224 GiB on a fat node
- 1/2 node reservation
  - Single-node jobs requesting up to and including 64 cores for a rome or high memory node
  - Single-node jobs requesting up to and including 120 GiB memory on a thin node or 480 GiB on a fat node
- 3/4 node reservation
  - Single-node jobs requesting up to and including 96 cores for a rome or high memory node
  - Single-node jobs requesting up to and including 180 GiB memory on a thin node or 720 GiB on a fat node
- Full node reservation
  - Jobs requesting all the cores in the node
  - Jobs requesting all the memory of a node
For GPU nodes:
- 1/4 node reservation
  - Single-node jobs requesting up to and including 1 GPU (or 1/4 of the node memory, or 1/4 of the cores)
- 1/2 node reservation
  - Single-node jobs requesting up to and including 2 GPUs (or 1/2 of the node memory, or 1/2 of the cores)
- 3/4 node reservation
  - Single-node jobs requesting up to and including 3 GPUs (or 3/4 of the node memory, or 3/4 of the cores)
- Full node reservation
  - Single-node jobs requesting up to and including 4 GPUs (or all the node memory or cores)
  - Multi-node jobs (on any node type), independent of the number of cores (memory, GPUs) requested
You will be charged for this share of the node, independent of the number of cores actually used.
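As an illustration, a hedged sketch of a single-node job that fits within a 1/8 share of a rome node (the executable and input names are placeholders):
#!/bin/bash
# Request 1/8 of a rome node: 16 of the 128 cores and 28 GiB of the 224 GiB available memory
#SBATCH --partition=rome
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --mem=28G
#SBATCH --time=08:00:00

srun ./my_app input.dat
Such a job is accounted as 1/8 of a node, i.e. 16 SBUs per hour on a rome node.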
Jobs requesting more than 1 node will get exclusive access to the allocated nodes (only one job can run on them at the same time), independent of the amount of cores/memory requested. The batch system will accept jobs that request 1 node, 2 nodes, 3 nodes, and so on, providing exclusive use of all the cores, GPUs and memory on the node(s). It is important to note that Snellius is a machine designed for large compute jobs. We encourage users to develop workflows that schedule jobs running on at least a full node of a particular type.
Service nodes
The "odd one out" node type is the service node (srv node). Srv nodes are dedicated for the automation of data transfer tasks. The transferring of data in or out of the system, is a task that does not involve much "compute" at all. Usually it is more limited by network bandwidth than by CPU resources. Therefore, jobs submitted to srv nodes by default are jobs using just a single thread out of the 32 available per node (on srv nodes we enabled SMT).
Core hours versus job time limit
The use of the unit "core hour" above does not imply anything about the minimum or maximum duration of a job. The job scheduling and accounting systems have a time resolution of 1 second. Accounts are charged only for the time the resources were actually used, independent of the requested wall time.
How resources are accounted in terms of SBU budget subtracted differs between regular jobs and jobs run within a reservation:
- For regular jobs (i.e. not part of a reservation), the accounted wall-clock time runs from the actual start of the allocation of the resources to the actual end and de-allocation of the resources. If such a job ends before its requested time limit (as specified with -t <duration> to sbatch) is over, then only SBUs for the actual run time in wall-clock time are consumed. Jobs that are submitted and subsequently cancelled before they were ever provided with an allocation of nodes do not consume any SBU budget.
- A reservation will always be accounted for the full duration and the full set of resources reserved. This is the case even when all or part of the reserved resources are left idle, e.g. because smaller jobs than would be possible were run within the reservation.
Our HPC User Guide contains guidelines and several examples on how to request resources on our HPC systems. Check the Creating and running jobs section or the Example job scripts for more details.
Costs of inefficient use
You will be charged for all cores in the node(s) that you reserved, regardless of the actual number of cores used by the job/application. So if your application uses only a few (or even one) of the CPU cores of a node then it makes sense to write a job script that runs multiple instances of this application in parallel, in order to fully utilize the reserved resources and your budget.
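For instance, a hedged sketch of a job script that fills a full rome node with 128 independent single-core runs (the application, input and output names are placeholders):
#!/bin/bash
#SBATCH --partition=rome
#SBATCH --nodes=1
#SBATCH --ntasks=128
#SBATCH --time=10:00:00

# Start one single-core instance per core in the background, then wait for all of them
for i in $(seq 1 128); do
  ./my_serial_app input_${i}.dat > output_${i}.log &
done
wait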
Getting account and budget information
You can view your account details using
$ accinfo
This shows information such as the e-mail associated with the account, the initial and remaining budget, and until when the account is valid.
An overview of the SBU consumption for the current account can be obtained with
$ accuse
By default, consumption is shown for the current login, per month, over the last year. Per-day usage can be obtained by adding the -d flag. The start and end of the period shown in the overview can be changed with the -s DD-MM-YYYY and -e DD-MM-YYYY flags, respectively. Finally, consumption for a specific account or login can be obtained using -a accountname and -u username, respectively.
In case you want to know the CPU/GPU budgets separately, you can try:
$ accinfo --product=cpu
$ accinfo --product=gpu
Note that accinfo and accuse report the state of your account's budget as it is registered on the SURF central accounting server. The data on the central accounting server is updated asynchronously, typically only once every 24 hours. So the output of accuse and accinfo does not take recently finished or still running jobs into account. Use the budget-overview tool, described below, for this.
The budget-overview tool
Another option, with a slightly different focus, is the budget-overview tool. This tool checks and/or reports the usage of your budget for batch jobs on Snellius, and it reports how much budget you have left more accurately than accinfo and accuse, since (as described above) those tools only reflect the budget state registered on the SURF central accounting server, which is updated asynchronously, typically only once every 24 hours.
The budget-overview tool interacts with the accounting server to get the last known centrally registered budget state, plus it interacts with the Slurm batch system. From the latter it can take recently finished jobs and jobs that are still active into account, and overall check and report more accurately how much budget is left, and how fast it diminishes during the day.
The budget-overview tool can also inform you about the cost of recently finished, active and queued jobs. It is a tool that is complementary to other SURF accounting tools. Since budget-overview only reports about jobs that have not yet been registered at the central accounting server, it has a horizon of at most 24-48 hours (usually less). It is not suitable for producing overviews of, say, last month's batch usage. You need "accuse" for that sort of longer-term overview.
$ budget-overview
Monitoring usage of larger accounts (e.g. courses)
If you manage an account with lots of logins (e.g. a course account), monitoring your total budget with accinfo probably isn't sufficient, and you want to know more about the consumption of resources from your account. For example, you may have questions like "Who are my biggest users?" and "What type of allocations do my users use (how many nodes per job, which partitions, etc.)?". Here, we'll provide some examples that show how to retrieve that information.
How many SBUs do my users consume?
To determine how many SBUs your individual users consume, you can run accuse with the following arguments:
$ accuse --account <account_name> --sbu
Month     Account      User           SBUs             Restituted
-------   ----------   ------------   --------------   --------------
...
2025-01   jhssrf019    scur1239       6.1              0.0
Totals for this user                  6.1              0.0
2025-01   jhssrf019    scur1279       8.5              0.0
Totals for this user                  8.5              0.0
Totals for this account               116.9            0.0
In this example, our account name was jhssrf019. The left column with numbers shows the amount of SBUs subtracted. By default, accuse reports the monthly usage, but you can get the usage per day by adding the -d flag:
$ accuse --account <account_name> --sbu -d
Month        Account      User           SBUs             Restituted
-------      ----------   ------------   --------------   --------------
...
2025-01-19   jhssrf019    scur1239       6.1              0.0
Totals for this user                     6.1              0.0
2025-01-16   jhssrf019    scur1279       2.0              0.0
2025-01-19   jhssrf019    scur1279       6.5              0.0
Totals for this user                     8.5              0.0
Totals for this account                  116.9            0.0
Note that with accuse you get aggregate information on the records being sent to our accounting database. I.e. you don't see individual job details, but you do see if (and how many) SBUs were actually subtracted for a given user in a given period.
What type of allocations do my users use?
To answer this question, we will use sacct, the Slurm accounting tool. This tool provides information on individual jobs. For example, to list all jobs for all users from a given account:
$ sacct --accounts <account_name> --allusers -X
JobID         JobName     Partition    Account     AllocCPUS   State       ExitCode
------------  ----------  ----------   ----------  ----------  ----------  --------
...
9513274       jupyterhu+  gpu_course   jhssrf019   2           COMPLETED   0:0
9513294       jupyterhu+  gpu_course   jhssrf019   2           RUNNING     0:0
9513370       pi          gpu_mig      jhssrf019   0           PENDING     0:0
Again, in this example, our account name was jhssrf019. The -X flag makes sure we only see the parent allocations, not individual job steps (as you're likely not interested in those). The standard fields shown per job are the job ID, job name, partition, account, allocated CPUs, job state, and exit code, but you will probably want different information. You can select which fields show up in the output using the --format argument. A combination of fields that gives a reasonable amount of information is:
$ sacct --accounts <account_name> --allusers -X --format="jobid,user,partition,start,end,Elapsed,AllocCPUS,AllocNodes,AllocTRES%80,reservation"
JobID    User      Partition   Start                End                  Elapsed   AllocCPUS  AllocNodes  AllocTRES                                                                         Reservation
...
9514232  scur1282  gpu_course  2025-01-20T18:05:09  2025-01-20T18:12:41  00:07:32  2          1           billing=18,cpu=2,gres/cpu=2,gres/gpu:a100_1g.5gb=1,gres/gpu=1,mem=16G,node=1      jhs_homework_gpu
9517029  scur1187  rome        2025-01-20T19:17:40  2025-01-20T19:17:57  00:00:17  16         1           billing=16,cpu=16,gres/cpu=16,mem=28G,node=1
9517034  scur1187  gpu_mig     2025-01-20T19:18:55  Unknown              00:00:09  36         1           billing=256,cpu=36,gres/cpu=36,gres/gpu:a100_3g.20gb=1,gres/gpu=1,mem=60G,node=1
As you can see, this is already a pretty large number of fields. You can add the --parsable option, redirect your output to a file, and then e.g. import the data into Microsoft Excel (using | as the field separator) for more careful inspection.
$ sacct --accounts jhssrf019 --allusers -X --format="jobid,user,partition,start,end,Elapsed,AllocCPUS,AllocNodes,AllocTRES%80,reservation" --parsable > /tmp/casparl/sacct_output.csv
$ head /tmp/casparl/sacct_output.csv
JobID|User|Partition|Start|End|Elapsed|AllocCPUS|AllocNodes|AllocTRES|Reservation|
9500985|scur1224|gpu_mig|2025-01-20T01:56:14|2025-01-20T01:56:39|00:00:25|36|1|billing=256,cpu=36,gres/cpu=36,gres/gpu:a100_3g.20gb=1,gres/gpu=1,mem=60G,node=1||
9502407|scur1224|gpu_mig|None|2025-01-20T11:52:39|00:00:00|0|0|||
9502439|scur1187|staging|2025-01-20T09:32:26|2025-01-20T09:32:37|00:00:11|32|1|billing=32,cpu=32,gres/cpu=32,mem=224G,node=1||
9502940|scur1206|gpu_mig|2025-01-20T13:12:05|2025-01-20T13:12:21|00:00:16|36|1|billing=256,cpu=36,gres/cpu=36,gres/gpu:a100_3g.20gb=1,gres/gpu=1,mem=60G,node=1||
9503324|scur1206|gpu_mig|2025-01-20T13:12:32|2025-01-20T13:12:48|00:00:16|36|1|billing=256,cpu=36,gres/cpu=36,gres/gpu:a100_3g.20gb=1,gres/gpu=1,mem=60G,node=1||
9506846|scur1224|gpu_mig|2025-01-20T13:13:00|2025-01-20T13:13:23|00:00:23|36|1|billing=256,cpu=36,gres/cpu=36,gres/gpu:a100_3g.20gb=1,gres/gpu=1,mem=60G,node=1||
9509050|scur1187|staging|2025-01-20T13:40:25|2025-01-20T13:40:34|00:00:09|32|1|billing=32,cpu=32,gres/cpu=32,mem=224G,node=1||
9509258|scur1237|staging|2025-01-20T13:48:33|2025-01-20T13:48:49|00:00:16|32|1|billing=32,cpu=32,gres/cpu=32,mem=224G,node=1||
9509381|scur1222|gpu|2025-01-20T13:53:09|2025-01-20T13:54:36|00:01:27|18|1|billing=128,cpu=18,gres/cpu=18,gres/gpu:a100=1,gres/gpu=1,mem=120G,node=1||
To find out all additional formatting fields, check sacct --helpformat.
The one thing sacct does not give you is the amount of SBUs deducted for these jobs, because that calculation is slightly non-trivial. You should be able to compute it by multiplying the value of the billing item in the AllocTRES field by the elapsed time (in hours). In the example above, the last job in the output (9509381) ran for 1 minute and 27 seconds, i.e. 87 seconds in total. The billing field says 128 for this entry. Thus, the total SBU cost of this job was 128 * (87/3600) = 3.1 SBUs.
Note that jobs run in a reservation are not charged. I.e. even though those have a non-zero billing field in AllocTRES, the amount of deducted SBUs for these jobs will be 0.
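If you want to automate this calculation, a rough sketch along the following lines could work. Note that this is our own assumption of how one might do it, not an official SURF tool: it uses the ElapsedRaw field (elapsed time in seconds) instead of Elapsed, and it ignores the reservation exception mentioned above.
$ sacct --accounts <account_name> --allusers -X --noheader --parsable2 --format="jobid,elapsedraw,alloctres" | \
    awk -F'|' 'match($3, /billing=[0-9]+/) { printf "%s %.1f SBUs\n", $1, substr($3, RSTART+8, RLENGTH-8) * $2 / 3600 }'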