HPC Glossary


CUDA: Compute Unified Device Architecture. An NVIDIA parallel computing platform and programming model that enables GPUs to be used for general-purpose computing.
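A minimal sketch of the CUDA model, assuming an NVIDIA GPU and the nvcc compiler; the vecAdd kernel name and launch configuration are illustrative, not taken from any particular codebase. Each GPU thread handles one array element:

    #include <stdio.h>
    #include <cuda_runtime.h>

    // Each thread adds one pair of elements.
    __global__ void vecAdd(const float *a, const float *b, float *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) c[i] = a[i] + b[i];
    }

    int main(void) {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);
        float *a, *b, *c;
        // Unified memory keeps the sketch short; explicit cudaMemcpy is also common.
        cudaMallocManaged(&a, bytes);
        cudaMallocManaged(&b, bytes);
        cudaMallocManaged(&c, bytes);
        for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }
        vecAdd<<<(n + 255) / 256, 256>>>(a, b, c, n);  // grid of 256-thread blocks
        cudaDeviceSynchronize();
        printf("c[0] = %f\n", c[0]);                    // expect 3.0
        cudaFree(a); cudaFree(b); cudaFree(c);
        return 0;
    }
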
Computing node: The basic building block of a computing cluster. Equipped with latest-generation processors, it rapidly processes the jobs sent to it by the master server. An HPC cluster can be made up of several dozen or several hundred computing nodes.
Distributed memory architecture: A system in which each computing unit has direct access to only part of the global memory space. In an HPC cluster, each computing node is independent and so has no direct access to the memory of the other nodes during simultaneous processing. Such access is nevertheless possible via dedicated programming interfaces such as MPI, which requires suitably adapted code; adapting the code may involve considerable work.
GPGPU: General-Purpose (computing on) Graphics Processing Units: offloading computation to a GPU to benefit from its capacity for massively parallel processing of all kinds of data (not only graphics). By extension, the term GPGPU may designate the GPU itself, or this use of it.
GPU: Graphics Processing Unit: originally the processing unit of a graphics card. Modern GPUs are designed around massively parallel architectures whose raw capacity exceeds that of traditional processors (CPUs) for certain tasks.
HPC: High Performance Computing.
HPC cloud: A hosted computing service, often based on virtual machines.
HPC cluster: A group of servers intended to run an application or code across multiple servers simultaneously. Clusters are largely built from hardware similar to that of general-purpose computers, making them far less costly than the vector supercomputers used for HPC in the past, which they have all but replaced over the last decade.
HPCDrive: A web portal developed by OVH to facilitate the use and administration of an HPC cluster.
HPCaaS: HPC as a Service.
InfiniBand: A very high-speed, very low-latency network widely used in HPC due to its superior performance in comparison with traditional Ethernet technologies.
Job scheduler: A programme that records descriptions of the tasks to be carried out and allocates the resources needed to execute them as efficiently as possible, minimising the time a task waits before actually being launched. Job scheduler efficiency is one of the cornerstones of HPC clusters.
MPI: Message Passing Interface. A communication standard for nodes running parallel programmes on distributed memory systems such as HPC clusters.
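A minimal point-to-point sketch in C, assuming an MPI implementation such as Open MPI or MPICH is installed; compile with mpicc and launch with mpirun -np 2:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  // this process's id
        MPI_Comm_size(MPI_COMM_WORLD, &size);  // total number of processes

        if (rank == 0) {
            int value = 42;
            // Rank 0 sends a message to rank 1 (assumes at least 2 processes).
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            int value;
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d from rank 0\n", value);
        }

        MPI_Finalize();
        return 0;
    }
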
Master server: Manages the servers and other resources of the HPC cluster and centralises all the data. It is the central point of a computing infrastructure.
Numerical simulation: The execution of a computer programme on a computer or network so as to reproduce a complex real-world phenomenon. Numerical simulation aims to virtually carry out experiments that are too complex, too long, too expensive or simply impossible to perform in reality.
OpenMP: A programming interface for parallel computation on shared memory architectures. This interface enables a more lightweight parallelisation of certain parts of the code than MPI would require, but without providing the means to run the code across several nodes of a cluster.
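A minimal sketch, assuming a compiler with OpenMP support (e.g. gcc -fopenmp); a single directive spreads the loop over the cores of one node, all threads sharing the same memory:

    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        const int n = 1000000;
        static double a[1000000];
        double sum = 0.0;

        // One directive parallelises the loop across the cores of a single
        // node; every thread reads and writes the shared array 'a'.
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < n; ++i) {
            a[i] = i * 0.5;
            sum += a[i];
        }

        printf("sum = %f (up to %d threads)\n", sum, omp_get_max_threads());
        return 0;
    }
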
OpenACC: The OpenACC API lets programmers insert simple directives into their code to accelerate certain loops and functions by exploiting CPU or GPU parallelism.
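A minimal sketch, assuming a compiler with OpenACC support (for example NVIDIA's nvc with -acc, or gcc with -fopenacc); the loop itself is unchanged, only a directive is added:

    #include <stdio.h>

    int main(void) {
        const int n = 1000000;
        static float x[1000000], y[1000000];
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        // A single directive asks the compiler to offload the loop to an
        // accelerator (or parallelise it on the CPU) without rewriting it.
        #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
        for (int i = 0; i < n; ++i)
            y[i] = 2.0f * x[i] + y[i];

        printf("y[0] = %f\n", y[0]);  // expect 4.0
        return 0;
    }
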
OpenCL: A programming interface designed to program heterogeneous systems, meaning CPUs and GPUs (GPGPU) at the same time. Broadly speaking, OpenCL can be used as an open-standard alternative to CUDA (which is proprietary, but older and more established).
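A minimal sketch of the host-plus-kernel structure, assuming an OpenCL 1.2 runtime and omitting all error handling for brevity; note how much host-side setup OpenCL requires compared with the CUDA sketch above:

    #include <stdio.h>
    #define CL_TARGET_OPENCL_VERSION 120
    #include <CL/cl.h>

    // Kernel source: the same vector add as the CUDA example, in OpenCL C.
    static const char *src =
        "__kernel void vec_add(__global const float *a,\n"
        "                      __global const float *b,\n"
        "                      __global float *c) {\n"
        "    size_t i = get_global_id(0);\n"
        "    c[i] = a[i] + b[i];\n"
        "}\n";

    int main(void) {
        enum { N = 1024 };
        float a[N], b[N], c[N];
        for (int i = 0; i < N; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

        // Pick a platform and device, then build the kernel at run time.
        cl_platform_id plat; cl_device_id dev;
        clGetPlatformIDs(1, &plat, NULL);
        clGetDeviceIDs(plat, CL_DEVICE_TYPE_DEFAULT, 1, &dev, NULL);
        cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
        cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);
        cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
        clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
        cl_kernel k = clCreateKernel(prog, "vec_add", NULL);

        // Copy inputs to device buffers and run one work-item per element.
        cl_mem da = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                                   sizeof a, a, NULL);
        cl_mem db = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                                   sizeof b, b, NULL);
        cl_mem dc = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, sizeof c, NULL, NULL);
        clSetKernelArg(k, 0, sizeof da, &da);
        clSetKernelArg(k, 1, sizeof db, &db);
        clSetKernelArg(k, 2, sizeof dc, &dc);
        size_t global = N;
        clEnqueueNDRangeKernel(q, k, 1, NULL, &global, NULL, 0, NULL, NULL);
        clEnqueueReadBuffer(q, dc, CL_TRUE, 0, sizeof c, c, 0, NULL, NULL);

        printf("c[0] = %f\n", c[0]);  // expect 3.0
        return 0;
    }
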
Parallelism: A set of hardware and software techniques for processing multiple data items simultaneously so as to reduce the total processing time. Parallelism is the key to the power of HPC clusters and modern processors: a numerical simulation is typically divided into more or less independent sub-parts or tasks, each of which is then processed by an execution unit (a processor core) at the same time as the others. Parallelism saves a huge amount of time for some types of calculation, but imposes its own constraints: not all algorithms can be parallelised, and the exchange of data between execution cores often requires very fast input/output buses to limit the time lost during these exchanges.
SLURM: Simple Linux Utility for Resource Management. A job scheduler increasingly used on HPC clusters, including the most powerful ones to date.
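A minimal batch script sketch; the resource values and the ./my_simulation binary are placeholders, and the script would be submitted with sbatch:

    #!/bin/bash
    #SBATCH --job-name=demo        # name shown in the queue
    #SBATCH --nodes=2              # number of computing nodes requested
    #SBATCH --ntasks-per-node=16   # MPI processes per node
    #SBATCH --time=01:00:00        # walltime limit (hh:mm:ss)

    # Launch an MPI programme (here a hypothetical ./my_simulation binary)
    # across all allocated tasks.
    srun ./my_simulation
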
SaaS: Software as a Service.
Shared memory architecture: A system in which each processing unit, such as a processor core, can access the entire memory space. This is typically the case in a traditional computer, where two processes can easily share memory to exchange information quickly.
Tesla: A range of NVIDIA accelerator cards dedicated to GPGPU.
Xeon Phi: An Intel many-core architecture for parallel processing. Xeon Phi cards compete with GPGPU solutions such as Tesla, but aim to simplify programming by using processing units closer to traditional CPU cores than to those of GPUs.