Multi-GPU Scaling and Peering / Enabling Real GPU Compute Performance

Enabling Massive Multi-GPU Scaling and Peering

GPU-accelerated computing is enabling scientific, analytics, engineering, consumer, and enterprise applications worldwide. GPU accelerators now power energy-efficient datacenters in government labs, universities, enterprises, and small and medium businesses around the world. However, as GPU application developers begin to truly discover the power that can be harnessed to accelerate their applications, they also begin to discover some limitations. That's where Cirrascale and the GB5600 Series servers can help.

The Cirrascale GB5600 Series servers are designed to provide near-linear scaling of GPUs, enabling up to eight GPU accelerator cards, such as the NVIDIA® Tesla® P100 GPU Accelerators for HPC and Deep Learning, to communicate peer-to-peer on a single PCIe root hub. This marriage of peering and scaling provides the best performance for your GPU-accelerated applications.

Discover the True Advantages: Why the GB5600 is Different

The GB5600 Series servers are different from other GPU-supporting hardware implementations. Most hardware configurations available today only provide maximum performance between specific pairs of GPUs; since GPUs are paired up, jobs requiring communication between arbitrary GPUs suffer a performance penalty. Additionally, there can be significant performance impacts when trying to scale beyond four GPUs on multi-socket systems. These have been persistent problems for customers who are pushing the limits of GPUs with large, complex datasets and calculations, or where data must be streamed between GPUs. Cirrascale has been able to overcome these issues and achieve near-linear performance scaling with its design.

Maximize PCIe Bandwidth

Cirrascale is a strong believer in utilizing a technology to its fullest potential whenever possible, and GPUs and GPU accelerators are no exception. If the GPU has a PCIe Gen3 x16 link, then it should use it when communicating with other GPUs — any other GPUs. Our switch riser technology allows us to scale and peer multiple PCIe Gen3 x16 cards on a single root hub, ensuring that the maximum PCIe bandwidth is available and fully utilized for inter-card communication.

Minimize Intercard Latency and Obtain Consistent Performance Between GPUs

Our switch riser allows GPUs to communicate as if they are all on the same bus... because they are. Gone are the days of needing a bounce-buffer in host memory, or leaving GPU DMA engines unused because they couldn't address other devices in the system. This reduces intercard latency while helping to maintain a consistent performance level between GPUs.
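With all GPUs on one PCIe fabric, direct peer-to-peer transfers can be used in place of host staging. The sketch below uses the standard CUDA runtime peer-access calls to copy a buffer directly from one GPU to another; the device indices and buffer size are illustrative, and the check-then-enable pattern is a general CUDA idiom rather than anything specific to Cirrascale hardware.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Minimal sketch: enable peer-to-peer access between two GPUs and copy a
// buffer directly from device 0 to device 1, with no staging ("bounce")
// buffer in host memory. Device indices and sizes are illustrative.
int main(void) {
    int can01 = 0, can10 = 0;
    cudaDeviceCanAccessPeer(&can01, 0, 1);
    cudaDeviceCanAccessPeer(&can10, 1, 0);
    if (!can01 || !can10) {
        printf("Peer access not available between devices 0 and 1\n");
        return 1;
    }

    // Enable access in both directions; each call is made from the
    // device that will issue the peer reads/writes.
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);
    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);

    const size_t bytes = 64 << 20;  // 64 MiB
    float *buf0, *buf1;
    cudaSetDevice(0);
    cudaMalloc(&buf0, bytes);
    cudaSetDevice(1);
    cudaMalloc(&buf1, bytes);

    // Direct GPU-to-GPU copy: the GPUs' DMA engines move the data over
    // the PCIe fabric without touching host memory.
    cudaMemcpyPeer(buf1, 1, buf0, 0, bytes);
    cudaDeviceSynchronize();

    cudaFree(buf1);
    cudaSetDevice(0);
    cudaFree(buf0);
    return 0;
}
```

On a single-root-hub design such as the GB5600's, any pair of devices can take this direct path, rather than only specific GPU pairs.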

Enable GPU-Centric Development and Usage

Since nearly all GPU traffic is passed between the GPUs directly via the Cirrascale SR3415 switch riser, only a negligible amount of host resources is needed to perform GPU work. Additionally, with a single address space and simultaneous inter-card communication at full PCIe Gen3 x16 speeds, software can spend more time doing work than deciding when to schedule data copies.
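The single-address-space point can be sketched with CUDA's Unified Virtual Addressing: once peer access is enabled, a kernel on one device can dereference a pointer to memory allocated on another, so no explicit copy needs to be scheduled. The kernel, names, and sizes below are illustrative assumptions, not part of any Cirrascale software.

```cuda
#include <cuda_runtime.h>

// Sketch: with CUDA Unified Virtual Addressing, all device allocations
// share one address space, so a kernel on device 0 can read memory that
// lives on device 1 once peer access is enabled. Names are illustrative.
__global__ void scale(const float *src, float *dst, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) dst[i] = s * src[i];  // src may live on a peer GPU
}

int main(void) {
    const int n = 1 << 20;
    float *remote, *local;

    cudaSetDevice(1);
    cudaMalloc(&remote, n * sizeof(float));  // allocated on device 1

    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);        // let device 0 access device 1
    cudaMalloc(&local, n * sizeof(float));

    // The kernel runs on device 0 but reads 'remote' directly over the
    // PCIe fabric; no explicit copy of the data is scheduled.
    scale<<<(n + 255) / 256, 256>>>(remote, local, 2.0f, n);
    cudaDeviceSynchronize();

    cudaFree(local);
    cudaSetDevice(1);
    cudaFree(remote);
    return 0;
}
```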

Supports the Largest Number of GPU Offerings

We work closely with our technology partners to ensure you're given the broadest offerings for your application. The Cirrascale GB5600 Series supports both professional and consumer cards from the leading manufacturers including ground-breaking GPU accelerators, such as the NVIDIA® Tesla® K80 Dual-GPU Accelerators and the new Tesla M40 GPU accelerators specifically designed for Deep Learning applications.

Cirrascale's peer-to-peer technology allows us to unlock the true power of NVIDIA GPUs for scientific computation. This is the first platform of its kind and provides the unprecedented bandwidth needed to scale an individual molecular dynamics simulation with AMBER across all the GPUs within a node. This previously unattainable performance will have an impact on many fields.
Ross Walker
Associate Research Professor, SDSC and
Adjunct Associate Professor, UCSD

Deep Learning Case Study

Read about how NYU researchers take on bigger challenges and create deep learning models that let computers do human-like perceptual tasks for research projects and educational programs at the NYU Center for Data Science.

Download Our Case Study on
NYU Center for Data Science

Download NVIDIA's Popular GPU-Accelerated Applications Catalog

NVIDIA identifies over two hundred applications across a wide range of industries that are already optimized for GPUs to help you accelerate your work. Then, put those applications into action with a GB5600 Series blade server.


©2016 Cirrascale Corporation. All Rights Reserved.