IT Brief New Zealand - Technology news for CIOs & IT decision-makers
NVIDIA launches 'world's most advanced AI system' - the DGX A100
Fri, 15th May 2020

NVIDIA today announced its DGX A100 AI system, which delivers 5 petaflops of AI performance and is available now, shipping worldwide.

NVIDIA says the first order of the system will be delivered to a lab in the US, which will use the DGX A100's computing power to ‘better understand COVID-19’.

The system integrates eight of the new NVIDIA A100 Tensor Core GPUs, also announced today.

This will provide a whopping 320GB of memory for training large AI datasets.

“NVIDIA DGX A100 is the ultimate instrument for advancing AI,” says NVIDIA founder and CEO Jensen Huang.

“[It's] the first AI system built for the end-to-end machine learning workflow — from data analytics to training to inference.

“And with the giant performance leap of the new DGX, machine learning engineers can stay ahead of the exponentially growing size of AI models and data.”

DGX A100 technical specifications
  • Eight NVIDIA A100 Tensor Core GPUs, delivering 5 petaflops of AI power, with 320GB of total GPU memory and 12.4TB per second of memory bandwidth.
     
  • Six NVIDIA NVSwitch interconnect fabrics with third-generation NVIDIA NVLink technology for 4.8TB per second of bi-directional bandwidth.
     
  • Nine Mellanox ConnectX-6 HDR 200Gb per second network interfaces, offering a total of 3.6Tb per second of bi-directional bandwidth.
     
  • Mellanox In-Network Computing and network acceleration engines such as RDMA, GPUDirect® and Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) to enable the highest performance and scalability.
     
  • 15TB Gen4 NVMe internal storage, which is 2x faster than Gen3 NVMe SSDs.
     
  • NVIDIA DGX software stack, which includes optimised software for AI and data science workloads, delivering maximised performance and enabling enterprises to achieve a faster return on their investment in AI infrastructure.
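
The headline figures in the list above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the constant and function names are invented for this example, and the per-GPU memory value assumes the quoted total is split evenly across the eight GPUs, which the announcement does not state explicitly.

```python
# Figures quoted in the DGX A100 specification list.
GPU_COUNT = 8
TOTAL_GPU_MEMORY_GB = 320
NIC_COUNT = 9
NIC_SPEED_GBPS = 200  # HDR 200Gb per second, in each direction


def per_gpu_memory_gb() -> float:
    """Memory per GPU, assuming the 320GB total is split evenly."""
    return TOTAL_GPU_MEMORY_GB / GPU_COUNT


def network_bidirectional_tbps() -> float:
    """Aggregate network bandwidth in Tb/s, counting both directions."""
    return NIC_COUNT * NIC_SPEED_GBPS * 2 / 1000


if __name__ == "__main__":
    print(f"Memory per GPU: {per_gpu_memory_gb():.0f}GB")
    print(f"Network, bi-directional: {network_bidirectional_tbps():.1f}Tb/s")
```

Under these assumptions, nine 200Gb per second interfaces counted in both directions reproduce the 3.6Tb per second quoted for the network.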


Multiple smaller workloads can be accelerated by partitioning the DGX A100 into as many as 56 instances per system, using the A100 multi-instance GPU feature, according to NVIDIA.
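
As a rough check on that figure, 56 instances across eight GPUs works out to seven instances per GPU. The sketch below assumes the instances are spread evenly across the GPUs; the function name is hypothetical and is not part of any NVIDIA tool or API.

```python
def instances_per_gpu(total_instances: int, gpu_count: int) -> int:
    """Return the number of instances per GPU for an even split.

    Raises ValueError if the instances cannot be divided evenly.
    """
    per_gpu, remainder = divmod(total_instances, gpu_count)
    if remainder:
        raise ValueError("instances do not divide evenly across GPUs")
    return per_gpu


if __name__ == "__main__":
    # 56 instances per system and eight GPUs are the figures quoted above.
    print(instances_per_gpu(56, 8))
```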

Enterprises can optimise both computing power and resources on-demand to bolster diverse workloads – including data analytics, training and inference, on a single software-defined platform.

Availability

NVIDIA DGX A100 systems start at US$199,000 and are shipping now through NVIDIA Partner Network resellers worldwide.


Also introducing... the NVIDIA DGX SuperPOD

In conjunction with the DGX A100, NVIDIA also revealed its next-generation DGX SuperPOD, a cluster of 140 DGX A100 systems capable of achieving 700 petaflops of AI computing power.

The DGX SuperPOD combines 140 DGX A100 systems with Mellanox HDR 200Gbps InfiniBand interconnects, and was designed for internal research in areas such as conversational AI, genomics and autonomous driving.
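
The 700-petaflop figure follows from multiplying the per-system number by the size of the cluster. The sketch below shows that arithmetic only; it assumes peak performance adds up linearly across systems, and the function names are made up for illustration.

```python
def cluster_petaflops(system_count: int, petaflops_per_system: float) -> float:
    """Aggregate peak AI performance, assuming it adds up linearly."""
    return system_count * petaflops_per_system


def cluster_gpu_count(system_count: int, gpus_per_system: int) -> int:
    """Total number of GPUs in the cluster."""
    return system_count * gpus_per_system


if __name__ == "__main__":
    # 140 systems, 5 petaflops and eight GPUs each, as quoted in the article.
    print(cluster_petaflops(140, 5))
    print(cluster_gpu_count(140, 8))
```

By the same arithmetic, 140 systems with eight GPUs each give 1,120 A100 GPUs in total.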

NVIDIA says the cluster achieves a level of performance which previously required ‘thousands’ of servers – making it one of the world's fastest supercomputers.

It was designed and built with immense help from the DGX A100 – the system's architecture enabled NVIDIA to build the DGX SuperPOD in less than a month, rather than the months or years of planning and assembly previously required to deliver such supercomputing capabilities.