Machine / Deep Learning Solutions / Enabling Deep Discovery

Understanding Machine and Deep Learning

In 1959, Arthur Samuel defined machine learning as a "field of study that gives computers the ability to learn without being explicitly programmed". Since then, machine learning has evolved greatly and now plays an ever-increasing role in the development of critical applications used throughout a variety of industries, taking on problems that cannot be solved by numerical means alone. Common applications include data mining, natural language processing, expert systems, video analytics and security, and image recognition.

Over the past several years, Cirrascale has been working with a variety of application developers and data scientists in both industry and academia. Together, our goal has been to develop some of the most advanced hardware available, using multi-GPU compute solutions to increase the overall speed and flexibility of discovery in these areas.

Cirrascale works closely with its partner NVIDIA to deploy some of the world's fastest deep learning training solutions. In fact, NVIDIA recently released the Tesla M40 GPU accelerator, which was purpose-built for scale-out deep learning training deployments. It dramatically reduces the time to train deep neural networks — as much as 8X faster than a CPU. The new Tesla M40 features NVIDIA GPU Boost™ technology, which converts power headroom into user-controlled performance boosts, enabling the Tesla M40 to deliver 7 TFLOPS of single-precision peak performance. Additionally, it provides 12GB of ultra-fast GDDR5 memory, which enables a single Cirrascale GB5600 blade server to house up to an incredible 96GB of GPU memory.
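As a quick sanity check on the memory figure above, the 96GB total implies eight M40 cards per blade; the card count is inferred from the numbers in the text, not stated directly:

```python
# Back-of-the-envelope check of the GPU memory figure quoted above.
M40_MEMORY_GB = 12     # GDDR5 memory per Tesla M40 (from the text)
GPUS_PER_BLADE = 8     # inferred: 96 GB total / 12 GB per card

total_gb = M40_MEMORY_GB * GPUS_PER_BLADE
print(f"{total_gb} GB of GPU memory per GB5600 blade")  # 96 GB
```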

Cirrascale Multi-GPU Solutions for Deep Learning

Deep learning neural networks learn from many levels of abstraction, ranging from simple concepts to complex ones. This is what puts the "deep" in deep learning. Each layer categorizes some kind of information, refines it, and passes it along to the next. Deep learning lets a machine use this process to build a hierarchical representation. So, the first layer might look for simple edges. The next might look for collections of edges that form simple shapes like rectangles or circles. The third might identify features like eyes and noses. After five or six layers, the neural network can put these features together. The result: a machine that can recognize faces.
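The layer-by-layer refinement described above can be sketched as a stack of matrix transforms, each taking the previous layer's output as input. This is a minimal illustration only; the layer sizes, random weights, and ReLU activation are assumptions for the sketch, not details of any particular model:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Standard rectified-linear activation used between layers.
    return np.maximum(0.0, x)

# Four levels of representation: raw pixels -> edges -> shapes -> parts.
layer_sizes = [784, 256, 64, 16]
weights = [rng.standard_normal((m, n)) * 0.01
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

x = rng.standard_normal(784)      # a flattened "image"
for i, w in enumerate(weights):
    x = relu(x @ w)               # each layer refines and passes along
    print(f"layer {i + 1} output size: {x.shape[0]}")
```

Each pass through the loop compresses the representation into fewer, higher-level features, which is the hierarchical structure the text describes.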

GPUs are ideal for deep learning, speeding a process that could otherwise take a year or more to just weeks or days. That's because GPUs perform many calculations at once — or in parallel. And once a system is "trained" with GPUs, scientists and researchers can put that learning to work. That's why researchers at top universities worldwide and a host of startups are rushing to put deep learning to work, and doing so with Cirrascale GB Series blade and rackmount solutions, powered by NVIDIA Tesla M40 GPU Accelerators.

Cirrascale Solutions are Different

Our machine and deep learning solutions are different from other GPU-supporting hardware implementations. Most of the hardware configurations available today only provide maximum performance between specific pairs of GPUs; and since GPUs are paired up, jobs requiring communication between arbitrary GPUs suffer a performance penalty. Additionally, there can be significant performance impacts when trying to scale beyond four GPUs on multi-socket systems. These have been persistent problems for customers who are pushing the limits of GPUs with large, complex datasets and calculations, or where data must be streamed between GPUs. Cirrascale has been able to overcome these issues and achieve near-linear performance scaling with its design.

Maximize PCIe Bandwidth

Cirrascale is a strong believer in utilizing a technology to its fullest potential whenever possible, and GPUs and GPU accelerators are no exception. If the GPU has a PCIe Gen3 x16 link, then it should use it when communicating with other GPUs — any other GPUs. Our switch riser technology allows us to scale and peer multiple PCIe x16 Gen3 cards on a single root hub, ensuring that the maximum PCIe bandwidth is available and utilized for inter-card communication.
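To put a number on that link, the theoretical per-direction bandwidth of a PCIe Gen3 x16 link follows from the raw signaling rate and the 128b/130b line encoding. This is a rough calculation that ignores protocol overhead beyond the line coding, so real measured throughput will be somewhat lower:

```python
# Theoretical PCIe Gen3 x16 bandwidth, per direction.
GT_PER_S = 8e9          # Gen3 raw transfer rate per lane (8 GT/s)
ENCODING = 128 / 130    # 128b/130b line-coding efficiency
LANES = 16

bytes_per_s = GT_PER_S * ENCODING * LANES / 8   # bits -> bytes
gb_per_s = bytes_per_s / 1e9
print(f"theoretical x16 Gen3 bandwidth: {gb_per_s:.2f} GB/s")  # ~15.75 GB/s

# Time to move a 1 GB buffer between two peered GPUs at that rate:
print(f"1 GB transfer: {1e9 / bytes_per_s * 1e3:.1f} ms")
```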

Minimize Intercard Latency and Obtain Consistent Performance Between GPUs

Our switch riser allows GPUs to communicate as if they are all on the same bus... because they are. Gone are the days of needing a bounce-buffer in host memory, or leaving GPU DMA engines unused because they couldn't address other devices in the system. This reduces intercard latency while helping to maintain a consistent performance level between GPUs.

Enable GPU-Centric Development and Usage

Since almost all GPU traffic passes between the GPUs directly via the Cirrascale SR3415 switch riser, only a negligible amount of host resources is needed to perform GPU work. Additionally, with a single address space and simultaneous inter-card communication at full PCIe x16 Gen3 speeds, software can spend more time doing work than thinking about when to schedule data copies.

Supports the Largest Number of GPU Offerings

We work closely with our technology partners to ensure you're given the broadest offerings for your application. The Cirrascale GB5600 Series supports both professional and consumer cards from the leading manufacturers such as NVIDIA®. We can support any NVIDIA® GTX, Quadro®, or Tesla® GPU Accelerators. In fact, Cirrascale is an NVIDIA Tesla Preferred Partner and can create some of the most unique and powerful GPU-enabled solutions.

Download Our White Paper

The challenge faced by researchers, software developers, and engineers for applications such as deep learning, molecular dynamics and high performance computing has shifted from "How can I make use of a GPU?" to "How can I get more performance out of GPUs?" Learn how we're helping them make the shift.

Download Our White Paper on
Scaling GPU Compute Performance

Deep Learning Case Study

Read about how NYU researchers take on bigger challenges and create deep learning models that let computers do human-like perceptual tasks for research projects and educational programs at the NYU Center for Data Science.

Download Our Case Study on
NYU Center for Data Science

Learn More About the GB5600 Series

The GB5600 Series solution is the only blade server on the market that can support up to 16 GPUs on a single PCIe root complex. Our advanced PCIe switch riser technology enables our multi-GPU solutions to maximize PCIe bandwidth, minimize intercard latency, and enable GPU-centric usage. Click here to learn more.

Our Deep Learning Solutions In The News

The articles and websites below mention Cirrascale GB Series products in connection with deep learning use cases. We appreciate the coverage provided by these organizations and their help in furthering the message of scalable, peer-to-peer GPU deep learning implementations.

NYU to Advance Deep Learning Research with Multi-GPU Cluster

April 30, 2015, Kimberly Powell, NVIDIA

Multi-GPU Cluster to Power Deep Learning Research at NYU

April 30, 2015, Inside HPC Staff

Deep Learning Pioneer Pushing GPU Neural Network Limits

May 11, 2015, Nicole Hemsoth, The Platform


©2017. All Rights Reserved. Cirrascale Corporation.