Synopsis

This page gives an overview of the Dutch national supercomputer Snellius and details the various types of file systems, nodes, and system services available to end-users.

Snellius is a general-purpose capability system, designed to be well balanced, meaning it can handle tasks that require:

  • many cores
  • large symmetric multi-processing nodes
  • high memory
  • a fast interconnect
  • a lot of work space on disk
  • a fast I/O subsystem

Nodes Overview


Node types

The set of Snellius nodes available to end-users comprises three interactive nodes and a large number of batch nodes, or "worker nodes". We distinguish the following node flavours:

  • (int): CPU-only interactive nodes,
  • (tcn): CPU-only "thin" compute nodes, some of which have truly node-local NVMe-based scratch space,
  • (fcn): CPU-only "fat" compute nodes, which have more memory than the default worker nodes as well as truly node-local NVMe-based scratch space,
  • (hcn): CPU-only "high-memory" compute nodes with even more memory than fat nodes,
  • (gcn): GPU-enhanced "gpu" compute nodes with NVIDIA GPUs, some of which have truly node-local NVMe-based scratch space,
  • (srv): CPU-only "service" nodes, not intended for computing, that primarily facilitate running user-submitted jobs which automate data transfers into or out of the Snellius system.
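
These flavour abbreviations are the ones used in the tables and in the node type acronyms later on this page. The snippet below is only an illustrative sketch that encodes the list above as a small Python vocabulary, for example for use in your own bookkeeping or job-generation scripts; it is not an official SURF tool.

```python
from enum import Enum

class NodeFlavour(Enum):
    """Snellius node flavours, as described in the list above (illustrative only)."""
    INT = "int"  # CPU-only interactive nodes
    TCN = "tcn"  # CPU-only "thin" compute nodes (some with node-local NVMe scratch)
    FCN = "fcn"  # CPU-only "fat" compute nodes (more memory, node-local NVMe scratch)
    HCN = "hcn"  # CPU-only "high-memory" compute nodes (even more memory than fat nodes)
    GCN = "gcn"  # GPU-enhanced compute nodes with NVIDIA GPUs
    SRV = "srv"  # service nodes for data-transfer jobs, not intended for computing

# Example: map the abbreviation used in the tables back to an enum member.
print(NodeFlavour("gcn"))  # NodeFlavour.GCN
```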


The table below lists the currently available Snellius node types.

int: 3 nodes, ThinkSystem SR665
  • CPU SKU: AMD EPYC 7F32 (2x), 8 cores/socket, 3.7 GHz, 180 W
  • CPU cores per node: 16
  • Accelerator(s): N/A
  • DIMMs: 16 x 16 GiB, 3200 MHz, DDR4
  • CPU memory per node: 256 GiB DRAM (16 GiB per core)
  • Network connectivity: 1x HDR100, 100GbE ConnectX-6 VPI dual port; 2x 25GbE SFP28 Mellanox OCP

tcn (Rome): 525 nodes, ThinkSystem SR645
  • CPU SKU: AMD Rome 7H12 (2x), 64 cores/socket, 2.6 GHz, 280 W
  • CPU cores per node: 128
  • Accelerator(s): N/A
  • DIMMs: 16 x 16 GiB, 3200 MHz, DDR4
  • CPU memory per node: 256 GiB DRAM (2 GiB per core)
  • Local storage: a subset of 21 nodes have node-local NVMe-based scratch space
  • Network connectivity: 1x HDR100 ConnectX-6 single port; 2x 25GbE SFP28 OCP

tcn (Genoa): 738 nodes, ThinkSystem SD665 V3
  • CPU SKU: AMD Genoa 9654 (2x), 96 cores/socket, 2.4 GHz, 360 W
  • CPU cores per node: 192
  • Accelerator(s): N/A
  • DIMMs: 24 x 16 GiB, 4800 MHz, DDR5
  • CPU memory per node: 384 GiB DRAM (2 GiB per core)
  • Local storage: a subset of 72 nodes have /scratch-node: 6.4 TB NVMe SSD
  • Network connectivity: 1x NDR ConnectX-7 single port (200 Gbps within a rack, 100 Gbps outside the rack); 2x 25GbE SFP28 OCP

fcn (Rome): 72 nodes, ThinkSystem SR645
  • CPU SKU: AMD Rome 7H12 (2x), 64 cores/socket, 2.6 GHz, 280 W
  • CPU cores per node: 128
  • Accelerator(s): N/A
  • DIMMs: 16 x 64 GiB, 3200 MHz, DDR4
  • CPU memory per node: 1 TiB DRAM (8 GiB per core)
  • Network connectivity: 1x HDR100 ConnectX-6 single port; 2x 25GbE SFP28 OCP

fcn (Genoa): 48 nodes, ThinkSystem SD665 V3
  • CPU SKU: AMD Genoa 9654 (2x), 96 cores/socket, 2.4 GHz, 360 W
  • CPU cores per node: 192
  • Accelerator(s): N/A
  • CPU memory per node: 1.5 TiB DRAM (8 GiB per core)
  • Network connectivity: 1x NDR ConnectX-7 single port (200 Gbps within a rack, 100 Gbps outside the rack); 2x 25GbE SFP28 OCP

hcn (4 TiB): 2 nodes, ThinkSystem SR665
  • CPU SKU: AMD Rome 7H12 (2x), 64 cores/socket, 2.6 GHz, 280 W
  • CPU cores per node: 128
  • Accelerator(s): N/A
  • DIMMs: 32 x 128 GiB, 2666 MHz, DDR4
  • CPU memory per node: 4 TiB DRAM (32 GiB per core)
  • Local storage: N/A
  • Network connectivity: 1x HDR100 ConnectX-6 single port; 2x 25GbE SFP28 OCP

hcn (8 TiB): 2 nodes, ThinkSystem SR665
  • CPU SKU: AMD Rome 7H12 (2x), 64 cores/socket, 2.6 GHz, 280 W
  • CPU cores per node: 128
  • Accelerator(s): N/A
  • DIMMs: 32 x 256 GiB, 2666 MHz, DDR4
  • CPU memory per node: 8 TiB DRAM (64 GiB per core)
  • Local storage: N/A
  • Network connectivity: 1x HDR100 ConnectX-6 single port; 2x 25GbE SFP28 OCP

gcn (A100): 72 nodes, ThinkSystem SD650-N v2
  • CPU SKU: Intel Xeon Platinum 8360Y (2x), 36 cores/socket, 2.4 GHz (Speed Select SKU), 250 W
  • CPU cores per node: 72
  • Accelerator(s): NVIDIA A100 (4x), 40 GiB HBM2 memory with 5 active memory stacks per GPU
  • DIMMs: 16 x 32 GiB, 3200 MHz, DDR4
  • CPU memory per node: 512 GiB DRAM (7.111 GiB per core); 160 GiB HBM2 (40 GiB per GPU)
  • Local storage: a subset of 36 nodes have /scratch-node: 7.68 TB NVMe SSD ThinkSystem PM983
  • Network connectivity: 2x HDR200 ConnectX-6 single port; 2x 25GbE SFP28 LOM; 1x 1GbE RJ45 LOM

gcn (H100): 88 nodes, ThinkSystem SD665-N V3
  • CPU SKU: AMD EPYC 9334 (2x), 32 cores/socket, 2.7 GHz, 210 W
  • CPU cores per node: 64
  • Accelerator(s): NVIDIA H100 (SXM5) (4x), 94 GiB HBM2e memory with 5 active memory stacks per GPU
  • DIMMs: 24 x 32 GiB, 4800 MHz, DDR5
  • CPU memory per node: 768 GiB DRAM (12 GiB per core); 376 GiB HBM2e (94 GiB per GPU)
  • Local storage: a subset of 22 nodes have /scratch-node: NVMe SSD
  • Network connectivity: 4x NDR200 ConnectX-7; 2x 25GbE SFP28 LOM; 1x 1GbE RJ45 LOM

srv: 7 nodes, ThinkSystem SR665
  • CPU SKU: AMD EPYC 7F32 (2x), 8 cores/socket, 3.7 GHz, 180 W
  • CPU cores per node: 16
  • Accelerator(s): N/A
  • DIMMs: 16 x 16 GiB, 3200 MHz, DDR4
  • CPU memory per node: 256 GiB DRAM (16 GiB per core)
  • Network connectivity: 1x HDR100, 100GbE ConnectX-6 VPI dual port; 2x 25GbE SFP28 Mellanox OCP
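
The "GiB per core" figures in the overview above follow directly from dividing each node's total DRAM by its core count, which is useful to keep in mind when deciding how much memory to request per core or per task. The short Python check below reproduces those figures from the numbers listed above; it is purely illustrative.

```python
# Per-core DRAM for the main node flavours, using the figures from the overview above.
# Values are (total DRAM in GiB, CPU cores per node).
nodes = {
    "tcn (Rome)":  (256,  128),   # -> 2 GiB/core
    "tcn (Genoa)": (384,  192),   # -> 2 GiB/core
    "fcn (Rome)":  (1024, 128),   # -> 8 GiB/core
    "fcn (Genoa)": (1536, 192),   # -> 8 GiB/core
    "hcn (4 TiB)": (4096, 128),   # -> 32 GiB/core
    "hcn (8 TiB)": (8192, 128),   # -> 64 GiB/core
    "gcn (A100)":  (512,  72),    # -> ~7.111 GiB/core
    "gcn (H100)":  (768,  64),    # -> 12 GiB/core
}

for name, (dram_gib, cores) in nodes.items():
    print(f"{name:12s}: {dram_gib / cores:7.3f} GiB per core")
```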

Nodes per expansion phase

Snellius has been built up in three consecutive expansion phases. All phases are planned to remain in operation until the end of life of the machine. Because Snellius grows in phases, it becomes increasingly heterogeneous as the phase 2 and phase 3 extensions come into operation. In order to keep a clear reference to the node flavours (int, tcn, gcn, ...), we introduce a node type acronym that combines the node flavour with the phase in which the node was installed (PH1, PH2, PH3). For example, a thin CPU-only node installed in phase 1 has the node type acronym PH1.tcn.
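
As a small illustration of this naming convention (not an official tool), the helper below composes an acronym from a phase number and a node flavour, and splits it again; suffixes such as the memory size in PH1.hcn4T are simply kept as part of the flavour field.

```python
def node_type_acronym(phase: int, flavour: str) -> str:
    """Compose a node type acronym such as 'PH1.tcn' from a phase number and a flavour."""
    return f"PH{phase}.{flavour}"

def split_acronym(acronym: str) -> tuple[int, str]:
    """Split an acronym such as 'PH1.tcn' back into (phase, flavour)."""
    phase_part, flavour = acronym.split(".", 1)
    return int(phase_part.removeprefix("PH")), flavour

print(node_type_acronym(1, "tcn"))    # PH1.tcn
print(split_acronym("PH1.hcn4T"))     # (1, 'hcn4T')
```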

Phase 1 (Q3 2021)

The table below lists the Snellius node types available in Phase 1.

int (PH1.int): 3 nodes, ThinkSystem SR665
  • CPU SKU: AMD EPYC 7F32 (2x), 8 cores/socket, 3.7 GHz, 180 W
  • CPU cores per node: 16
  • Accelerator(s): N/A
  • DIMMs: 16 x 16 GiB, 3200 MHz, DDR4
  • Memory per node: 256 GiB DRAM (16 GiB per core)
  • Network connectivity: 1x HDR100, 100GbE ConnectX-6 VPI dual port; 2x 25GbE SFP28 Mellanox OCP

tcn (PH1.tcn): 504 nodes, ThinkSystem SR645
  • CPU SKU: AMD Rome 7H12 (2x), 64 cores/socket, 2.6 GHz, 280 W
  • CPU cores per node: 128
  • Accelerator(s): N/A
  • DIMMs: 16 x 16 GiB, 3200 MHz, DDR4
  • Memory per node: 256 GiB DRAM
  • Local storage: N/A
  • Network connectivity: 1x HDR100 ConnectX-6 single port; 2x 25GbE SFP28 OCP

fcn (PH1.fcn): 72 nodes, ThinkSystem SR645
  • CPU SKU: AMD Rome 7H12 (2x), 64 cores/socket, 2.6 GHz, 280 W
  • CPU cores per node: 128
  • Accelerator(s): N/A
  • DIMMs: 16 x 64 GiB, 3200 MHz, DDR4
  • Memory per node: 1 TiB DRAM (8 GiB per core)
  • Network connectivity: 1x HDR100 ConnectX-6 single port; 2x 25GbE SFP28 OCP

hcn (PH1.hcn4T): 2 nodes, ThinkSystem SR665
  • CPU SKU: AMD Rome 7H12 (2x), 64 cores/socket, 2.6 GHz, 280 W
  • CPU cores per node: 128
  • Accelerator(s): N/A
  • DIMMs: 32 x 128 GiB, 2666 MHz, DDR4
  • Memory per node: 4 TiB DRAM (32 GiB per core)
  • Local storage: N/A
  • Network connectivity: 1x HDR100 ConnectX-6 single port; 2x 25GbE SFP28 OCP

hcn (PH1.hcn8T): 2 nodes, ThinkSystem SR665
  • CPU SKU: AMD Rome 7H12 (2x), 64 cores/socket, 2.6 GHz, 280 W
  • CPU cores per node: 128
  • Accelerator(s): N/A
  • DIMMs: 32 x 256 GiB, 2666 MHz, DDR4
  • Memory per node: 8 TiB DRAM (64 GiB per core)
  • Local storage: N/A
  • Network connectivity: 1x HDR100 ConnectX-6 single port; 2x 25GbE SFP28 OCP

gcn (PH1.gcn): 36 nodes, ThinkSystem SD650-N v2
  • CPU SKU: Intel Xeon Platinum 8360Y (2x), 36 cores/socket, 2.4 GHz (Speed Select SKU), 250 W
  • CPU cores per node: 72
  • Accelerator(s): NVIDIA A100 (4x), 40 GiB HBM2 memory with 5 active memory stacks per GPU
  • DIMMs: 16 x 32 GiB, 3200 MHz, DDR4
  • Memory per node: 512 GiB DRAM (7.111 GiB per core); 160 GiB HBM2 (40 GiB per GPU)
  • Local storage: N/A
  • Network connectivity: 2x HDR100 ConnectX-6 single port; 2x 25GbE SFP28 LOM; 1x 1GbE RJ45 LOM

srv (PH1.srv): 7 nodes, ThinkSystem SR665
  • CPU SKU: AMD EPYC 7F32 (2x), 8 cores/socket, 3.7 GHz, 180 W
  • CPU cores per node: 16
  • Accelerator(s): N/A
  • DIMMs: 16 x 16 GiB, 3200 MHz, DDR4
  • Memory per node: 256 GiB DRAM (16 GiB per core)
  • Network connectivity: 1x HDR100, 100GbE ConnectX-6 VPI dual port; 2x 25GbE SFP28 Mellanox OCP

Phase 1A + 1B + 1C (Q4 2022)

tcn: 21 nodes, ThinkSystem SR645
  • CPU SKU: AMD Rome 7H12 (2x), 64 cores/socket, 2.6 GHz, 280 W
  • CPU cores per node: 128
  • Accelerator(s): N/A
  • DIMMs: 16 x 16 GiB, 3200 MHz, DDR4
  • Memory per node: 256 GiB DRAM (2 GiB per core)
  • Network connectivity: 1x HDR100 ConnectX-6 single port; 2x 25GbE SFP28 OCP

gcn: 36 nodes, ThinkSystem SD650-N v2
  • CPU SKU: Intel Xeon Platinum 8360Y (2x), 36 cores/socket, 2.4 GHz (Speed Select SKU), 250 W
  • CPU cores per node: 72
  • Accelerator(s): NVIDIA A100 (4x), 40 GiB HBM2 memory with 5 active memory stacks per GPU
  • DIMMs: 16 x 32 GiB, 3200 MHz, DDR4
  • Memory per node: 512 GiB DRAM (7.111 GiB per core); 160 GiB HBM2 (40 GiB per GPU)
  • Local storage: 7.68 TB NVMe SSD ThinkSystem PM983
  • Network connectivity: 2x HDR100 ConnectX-6 single port; 2x 25GbE SFP28 LOM; 1x 1GbE RJ45 LOM

Phase 2 (Q3 2023)

tcn: 714 nodes, ThinkSystem SD665 V3
  • CPU SKU: AMD Genoa 9654 (2x), 96 cores/socket, 2.4 GHz, 360 W
  • CPU cores per node: 192
  • Accelerator(s): N/A
  • DIMMs: 24 x 16 GiB, 4800 MHz, DDR5
  • Memory per node: 384 GiB DRAM (2 GiB per core)
  • Local storage: N/A
  • Network connectivity: 1x NDR ConnectX-7 single port (200 Gbps within a rack, 100 Gbps outside the rack); 2x 25GbE SFP28 OCP

Phase 2A (LISA replacement, Q3 2023)

tcn: 72 nodes, ThinkSystem SD665 V3
  • CPU SKU: AMD Genoa 9654 (2x), 96 cores/socket, 2.4 GHz, 360 W
  • CPU cores per node: 192
  • Accelerator(s): N/A
  • DIMMs: 24 x 16 GiB, 4800 MHz, DDR5
  • Memory per node: 384 GiB DRAM (2 GiB per core)
  • Local storage: 6.4 TB NVMe SSD
  • Network connectivity: 1x NDR ConnectX-7 single port (200 Gbps within a rack, 100 Gbps outside the rack); 2x 25GbE SFP28 OCP

Phase 3 (Q2 2024)

gcn: 88 nodes, ThinkSystem SD665-N V3
  • CPU SKU: AMD EPYC 9334 (2x), 32 cores/socket, 2.7 GHz, 210 W
  • CPU cores per node: 64
  • Accelerator(s): NVIDIA H100 (SXM5) (4x), 94 GiB HBM2e memory with 5 active memory stacks per GPU
  • DIMMs: 24 x 32 GiB, 4800 MHz, DDR5
  • Memory per node: 768 GiB DRAM (12 GiB per core); 376 GiB HBM2e (94 GiB per GPU)
  • Local storage: a subset of 22 nodes have /scratch-node: NVMe SSD
  • Network connectivity: 4x NDR200 ConnectX-7; 2x 25GbE SFP28 LOM; 1x 1GbE RJ45 LOM

Interconnect

All compute nodes on Snellius use the same interconnect, which is based on InfiniBand HDR100 (100 Gbps) in a fat tree topology.

With the phase 2 and phase 3 extensions added, there is still a single InfiniBand fabric, but part of it is based on InfiniBand NDR in order to connect the older tree and the new tree with sufficient bandwidth.
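
As a rough illustration of what these link rates mean in practice, the sketch below converts the nominal 100 Gbps (HDR100) and 200 Gbps (NDR within a rack) figures into an idealised transfer time for a given amount of data. Real throughput is lower because of protocol overhead, latency and contention, so treat the numbers only as an upper bound.

```python
def ideal_transfer_time_s(data_gib: float, link_gbps: float) -> float:
    """Idealised time to move `data_gib` GiB over a link with a nominal rate of `link_gbps` Gbps.

    Ignores protocol overhead, latency and contention, so this is a lower bound on the time
    (equivalently, an upper bound on achievable throughput).
    """
    data_bits = data_gib * 1024**3 * 8      # GiB -> bits
    return data_bits / (link_gbps * 1e9)    # nominal link rate in bits per second

for label, gbps in [("HDR100 (100 Gbps)", 100), ("NDR within a rack (200 Gbps)", 200)]:
    t = ideal_transfer_time_s(100, gbps)    # moving 100 GiB of data
    print(f"{label}: ~{t:.1f} s to move 100 GiB (ideal)")
```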