Hardware Overview
ARIS is the Greek supercomputer, deployed and operated by GRNET S.A. (National Infrastructures for Research and Technology S.A.) in Athens. ARIS consists of 67 computational nodes separated into three “islands”, as listed here:
- 48 thin nodes: Regular compute nodes without an accelerator.
- 3 GPU nodes: Compute nodes accelerated with “4 x NVIDIA Ampere A100 80GB” each.
- 16 fat nodes: Compute nodes with a larger amount of memory per core than a thin node.
All nodes are connected via an InfiniBand network and share 2.7 PB of GPFS storage.
Access to the system is provided by two login nodes.
Nodes Summary
| Node Type | Count | Accelerator | Memory | Cores |
| --- | --- | --- | --- | --- |
| THIN nodes | 48 | w/o | 512 GB | 128@2.45 GHz (two sockets) |
| GPU nodes | 3 | 4 x NVIDIA Ampere A100 80GB | 512 GB | 128@2.45 GHz + 4 x A100 |
| FAT nodes | 16 | w/o | 1024 GB | 128@2.45 GHz (two sockets) |

| | |
| --- | --- |
| Architecture | x86-64 |
| Operating System | Rocky Linux 9 (Blue Onyx) |
| Interconnect | |
| Technology | NVIDIA InfiniBand HDR 100 Gbps |
| Topology | Fat tree |
| Bandwidth [Gb/s] | 100 |
| Storage | |
| Type | IBM GPFS |
| Size [PByte] | 2.7 |
| Bandwidth [GB/s] | 35 |
| System Software | |
| Operating system | Rocky Linux 9 (Blue Onyx) |
| Batch system | SLURM |
| System Management | xCat IBM |
| Monitoring | Nagios, Ganglia |
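As a quick cross-check of the summary figures, the node inventory can be written down as a small data structure and aggregated. This is only an illustrative sketch: the `Island` class and its field names are hypothetical, and the numbers are copied from the Nodes Summary above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Island:
    """One node 'island' (hypothetical helper type, not part of any ARIS tool)."""
    name: str
    nodes: int
    cores_per_node: int
    memory_gb_per_node: int
    gpus_per_node: int = 0


# Values taken from the Nodes Summary table.
ISLANDS = [
    Island("thin", nodes=48, cores_per_node=128, memory_gb_per_node=512),
    Island("gpu", nodes=3, cores_per_node=128, memory_gb_per_node=512, gpus_per_node=4),
    Island("fat", nodes=16, cores_per_node=128, memory_gb_per_node=1024),
]

total_nodes = sum(i.nodes for i in ISLANDS)
total_cores = sum(i.nodes * i.cores_per_node for i in ISLANDS)
total_gpus = sum(i.nodes * i.gpus_per_node for i in ISLANDS)
# 1 TByte is taken as 1024 GByte, matching the per-island totals in this page.
total_memory_tb = sum(i.nodes * i.memory_gb_per_node for i in ISLANDS) / 1024

print(f"nodes={total_nodes} cores={total_cores} gpus={total_gpus} "
      f"memory={total_memory_tb} TB")
```

The node count agrees with the 67 computational nodes stated in the overview; the core and memory totals are derived values, not figures quoted by the source.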
Technical Info
Thin nodes
The 48 thin compute nodes (thin node island) have a theoretical peak performance (Rpeak) of 240.87 TFlops. The thin island is best suited for highly scalable applications that use MPI or hybrid MPI/OpenMP programming.
| THIN nodes technical information | |
| --- | --- |
| Architecture | x86-64 |
| System | Dell PowerEdge R6525 |
| Total number of nodes | 48 |
| Total number of cores | 6144 |
| Total amount of RAM [TByte] | 24 |
| Total Linpack Performance [TFlop/s] | 240 |
| Components | |
| Processor Type | AMD EPYC 7763 |
| Nominal Frequency [GHz] | 2.45 |
| Processors per Node | 2 |
| Cores per Processor | 64 |
| Cores per Node | 128 |
| Hyperthreading | OFF |
| Memory | |
| Memory per Node [GByte] | 512 |
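The theoretical peak of a CPU island can be estimated as cores x clock frequency x floating-point operations per cycle. The sketch below assumes 16 double-precision FLOPs per cycle per core for the AMD EPYC 7763 (two 256-bit FMA units); that factor is an assumption not stated in this page, and `cpu_rpeak_tflops` is a hypothetical helper name.

```python
def cpu_rpeak_tflops(cores: int, ghz: float, flops_per_cycle: int = 16) -> float:
    """Estimate theoretical peak in TFlops.

    flops_per_cycle=16 is an assumed double-precision figure per core,
    not a value taken from the source tables.
    """
    # cores * GHz * FLOPs/cycle gives GFlops; divide by 1000 for TFlops.
    return cores * ghz * flops_per_cycle / 1000.0


# Thin island: 48 nodes x 128 cores at a nominal 2.45 GHz.
thin_rpeak = cpu_rpeak_tflops(cores=48 * 128, ghz=2.45)
print(f"thin island Rpeak estimate: {thin_rpeak:.2f} TFlops")
```

Under these assumptions the estimate comes out at roughly 240.84 TFlops, within rounding distance of the 240.87 TFlops quoted for the thin island.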
GPU nodes
The 3 GPU nodes offer a combined theoretical peak performance of 249.05 TFlops (15.05 TFlops from CPUs and 234 TFlops from GPUs). Each NVIDIA A100 GPU incorporates 6912 CUDA cores.
| GPU nodes technical information | |
| --- | --- |
| Architecture | x86-64 |
| System | Dell PowerEdge XE8545 |
| Total number of nodes | 3 |
| Total number of cores | 384 |
| Total number of GPUs | 12 |
| Total amount of RAM [TByte] | 1.5 |
| Total Linpack Performance [TFlop/s] | 240 |
| Components | |
| Processor Type | AMD EPYC 7763 |
| Nominal Frequency [GHz] | 2.45 |
| Processors per Node | 2 |
| Cores per Processor | 64 |
| Cores per Node | 128 |
| Hyperthreading | OFF |
| Accelerators | |
| Accelerator type | GPU - NVIDIA Ampere A100 |
| Accelerators per node | 4 |
| Accelerator memory [GByte] | 80 |
| Memory | |
| Memory per Node [GByte] | 512 |
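The GPU island figures can be broken down with a few lines of arithmetic. The sketch below uses only numbers quoted in this section; the per-GPU peak is derived by division rather than taken from a vendor datasheet, and all variable names are illustrative.

```python
# Figures quoted in this section.
GPU_NODES = 3
GPUS_PER_NODE = 4
CUDA_CORES_PER_GPU = 6912
GPU_MEMORY_GB = 80
CPU_PEAK_TFLOPS = 15.05   # combined CPU contribution of the 3 nodes
GPU_PEAK_TFLOPS = 234.0   # combined GPU contribution of the 3 nodes

total_gpus = GPU_NODES * GPUS_PER_NODE
# Derived: implied peak per A100, assuming the GPU total is split evenly.
per_gpu_tflops = GPU_PEAK_TFLOPS / total_gpus
island_peak_tflops = CPU_PEAK_TFLOPS + GPU_PEAK_TFLOPS
total_cuda_cores = total_gpus * CUDA_CORES_PER_GPU
total_gpu_memory_gb = total_gpus * GPU_MEMORY_GB

print(f"{total_gpus} GPUs, {per_gpu_tflops:.1f} TFlops each (derived)")
print(f"island peak: {island_peak_tflops:.2f} TFlops")
print(f"CUDA cores: {total_cuda_cores}, GPU memory: {total_gpu_memory_gb} GB")
```

The sum reproduces the 249.05 TFlops quoted above, and the GPU share works out to 19.5 TFlops per A100.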
Fat nodes
Fat nodes offer more memory per server than the regular two-socket nodes (thin, GPU). Their total theoretical peak performance is 80.29 TFlops. Fat nodes are best suited for shared-memory applications (e.g. OpenMP-based) and, in general, for applications that need to process large datasets in memory.
| FAT nodes technical information | |
| --- | --- |
| Architecture | x86-64 |
| System | Dell PowerEdge R6525 |
| Total number of nodes | 16 |
| Total number of cores | 2048 |
| Total amount of RAM [TByte] | 16 |
| Total Linpack Performance [TFlop/s] | 80 |
| Components | |
| Processor Type | AMD EPYC 7763 |
| Nominal Frequency [GHz] | 2.45 |
| Processors per Node | 2 |
| Cores per Processor | 64 |
| Cores per Node | 128 |
| Hyperthreading | OFF |
| Memory | |
| Memory per Node [TByte] | 1 |
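To make the memory advantage of the fat nodes concrete, the memory available per core can be compared across node types. This is a minimal sketch using the per-node figures from this page; `memory_per_core_gb` is a hypothetical helper, and the result ignores memory reserved by the operating system.

```python
def memory_per_core_gb(memory_gb_per_node: int, cores_per_node: int) -> float:
    """Nominal memory per core in GByte (no allowance for OS overhead)."""
    return memory_gb_per_node / cores_per_node


CORES_PER_NODE = 128  # same for thin, GPU and fat nodes

thin_gb = memory_per_core_gb(512, CORES_PER_NODE)
fat_gb = memory_per_core_gb(1024, CORES_PER_NODE)  # 1 TByte taken as 1024 GByte

print(f"thin/GPU nodes: {thin_gb} GB per core")
print(f"fat nodes:      {fat_gb} GB per core")
```

A fat node therefore offers nominally twice the memory per core of a thin or GPU node (8 GB versus 4 GB), which is relevant when sizing memory requests for jobs that occupy only part of a node.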
Login nodes
| Login nodes | |
| --- | --- |
| Number of Nodes | 2 |
| Processor Type | AMD EPYC 7413 |
| Nominal Frequency [GHz] | 2.65 |
| Processors per Node | 2 |
| Cores per Processor | 24 |
| Cores per Node | 48 |
| Hyperthreading | OFF |
| Memory per Node [GByte] | 256 |