Graphcore Cloud Services and Graphcore Servers for Natural Language Processing

Faster Training and Inference Performance

Experience up to 100x faster training and inference performance on machine intelligence workloads
such as BERT for Natural Language Processing and MCMC sampling for financial risk analysis.

Accelerating Next Generation Artificial Intelligence

Graphcore IPU accelerators and Poplar software together make the fastest and most flexible platform for current and future machine intelligence applications. They lower the cost of AI in the cloud and datacenter while improving performance and efficiency for applications such as Natural Language Processing, financial risk analysis and more.

Graphcore systems excel at both training and inference. The highly parallel computational resources, together with graph software tools and libraries, allow researchers to explore machine intelligence across a much broader front than current solutions. This technology lets recent successes in deep learning evolve rapidly towards useful, general artificial intelligence.

Graphcore IPU Servers

Graphcore IPU Solutions are Made for Machine Learning

What is the Graphcore IPU?

The Intelligence Processing Unit (IPU) is completely different from today’s CPU and GPU processors. It is a highly flexible, easy-to-use parallel processor that has been designed from the ground up to deliver state-of-the-art performance on current machine intelligence models for both training and inference. More importantly, the IPU has been designed to allow new and emerging machine intelligence workloads to be realized.

Arithmetic Efficiency

The IPU delivers much better arithmetic efficiency at small batch sizes for both training and inference. In training, this means faster model convergence, models that generalize better, and the ability to parallelize across many more IPU processors to reduce training time at a given batch size. For inference, it delivers much higher throughput at lower latencies.
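As a rough illustration of the data-parallel scaling point above (a minimal sketch with hypothetical numbers, not Graphcore measurements): for a fixed global batch size, a processor that stays efficient at a smaller per-device batch can spread the same workload across more devices.

```python
# Hypothetical illustration: with a fixed global batch size, the
# smallest per-device batch a processor runs efficiently bounds
# how many devices can share the work in data-parallel training.
def max_data_parallel_devices(global_batch: int, min_efficient_batch: int) -> int:
    """Number of devices that can each process at least
    `min_efficient_batch` samples of `global_batch`."""
    return global_batch // min_efficient_batch

# A processor efficient at per-device batch 2 can use 8x more
# parallel workers than one needing batch 16 to stay efficient:
print(max_data_parallel_devices(256, 2))   # 128 devices
print(max_data_parallel_devices(256, 16))  # 16 devices
```

The batch sizes here are illustrative only; the general point is that small-batch efficiency translates directly into wider data parallelism, and hence shorter training time, at the same global batch size.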

Natural Language Processing - BERT

Graphcore has achieved state-of-the-art performance and accuracy with the BERT language model, training BERT Base in 56 hours with seven C2 IPU-Processor PCIe cards, each with two IPUs, in an IPU Server system. For BERT inference, we see 3x higher throughput with over 20% lower latency, serving results faster than ever.

Image Classification - ResNeXt

The Graphcore C2 IPU-Processor PCIe card achieves 3.7x higher throughput at 10x lower latency than a leading alternative processor. High throughput at the lowest possible latency is key for many of today’s most important use cases.

Detailed Benchmarks

Graphcore has published training and inference benchmarks, including BERT, ResNeXt and MCMC, that show how Graphcore C2 Intelligence Processing Units (IPUs) perform against GPUs.


Citadel Technical Report

Download and read Citadel's Technical Report which focuses on the architecture and performance of the Graphcore C2 IPU.

Get the Graphcore Benchmark Code to Run Your Own Tests