Cirrascale Cloud Services

Expand Horizons with NVIDIA in the Cloud

Purpose-Built for AI and HPC

AI, complex simulations, and massive datasets require multiple GPUs with extremely fast interconnections and a fully accelerated software stack. The NVIDIA HGX™ AI supercomputing platform brings together the full power of NVIDIA GPUs, NVIDIA NVLink™, NVIDIA networking, and fully optimized AI and high-performance computing (HPC) software stacks to provide the highest application performance and drive the fastest time to insights.

This fully connected topology from NVSwitch enables any GPU to talk to any other GPU concurrently. Notably, this communication runs at the NVLink bidirectional speed of 900 gigabytes per second (GB/s), which is more than 14x the bandwidth of the current PCIe Gen4 x16 bus.
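
As a rough check on the 14x figure, the ratio can be computed directly. This is a minimal sketch: the 900 GB/s value comes from the text above, while the 64 GB/s bidirectional figure for PCIe Gen4 x16 is an assumption based on the commonly quoted rate of about 32 GB/s per direction.

```python
# NVLink bidirectional bandwidth per GPU, as stated in the text above (GB/s).
NVLINK_BIDIRECTIONAL_GBPS = 900

# Assumption: PCIe Gen4 x16 moves roughly 32 GB/s per direction,
# so about 64 GB/s bidirectional. This figure is not from the page itself.
PCIE_GEN4_X16_BIDIRECTIONAL_GBPS = 64

ratio = NVLINK_BIDIRECTIONAL_GBPS / PCIE_GEN4_X16_BIDIRECTIONAL_GBPS
print(f"NVLink is about {ratio:.1f}x PCIe Gen4 x16")
```

Under that assumption the ratio lands just above 14, which is consistent with the "more than 14x" claim.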

Accelerating HGX With NVIDIA Networking

The data center is the new unit of computing, and networking plays an integral role in scaling application performance across it. Paired with NVIDIA Quantum InfiniBand, HGX delivers world-class performance and efficiency, which ensures the full utilization of computing resources.

At Cirrascale, our NVIDIA HGX B200 and H200 clusters are built using NVIDIA InfiniBand NDR networking so you receive the most performant cluster for your training and inference needs. Our infrastructure is set up and optimized for your specific configuration so that your training experiments maximize compute per dollar.

Popular NVIDIA Offerings on the Cirrascale AI Innovation Cloud

NVIDIA HGX B200: The New Era of Accelerated Computing is Here

The NVIDIA Blackwell architecture introduces groundbreaking advancements for generative AI and accelerated computing. The incorporation of the second-generation Transformer Engine, alongside the faster and wider NVIDIA NVLink interconnect, propels the data center into a new era, with orders of magnitude more performance compared to the previous architecture generation.

Cirrascale offers the HGX B200 in its AI Innovation Cloud as an 8-GPU configuration, giving you full GPU-to-GPU bandwidth through NVIDIA NVLink™ Switch. As a premier accelerated scale-up x86 platform with up to 15X faster real-time inference performance, 12X lower cost, and 12X less energy use, HGX B200 is designed for the most demanding AI, data analytics, and high-performance computing (HPC) workloads.

NVIDIA HGX H200: The World’s Leading AI Computing Platform

As workloads explode in complexity, there’s a need for multiple GPUs to work together with extremely fast communication between them. NVIDIA HGX H200 combines multiple H200 GPUs with a high-speed interconnect powered by NVIDIA NVLink and NVSwitch™ to enable the creation of the world’s most powerful scale-up servers.

Cirrascale offers the HGX H200 as a dedicated, bare-metal offering in an eight H200 GPU configuration. The eight-GPU configuration offers full GPU-to-GPU bandwidth through NVIDIA NVSwitch. Leveraging the power of H200 multi-precision Tensor Cores, an eight-way HGX H200 provides over 32 petaFLOPS of FP8 deep learning compute and over 1.1TB of aggregate HBM memory for the highest performance in generative AI and HPC applications.
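
The aggregate memory figure follows from simple multiplication. The sketch below assumes 141 GB of HBM per H200 GPU, which is NVIDIA's published capacity and is not stated on this page; the GPU count of eight comes from the text above.

```python
GPUS_PER_SERVER = 8    # eight-GPU HGX H200 configuration, from the text above
HBM_PER_GPU_GB = 141   # assumption: per-GPU HBM capacity from NVIDIA's H200 spec

aggregate_gb = GPUS_PER_SERVER * HBM_PER_GPU_GB
aggregate_tb = aggregate_gb / 1000  # decimal terabytes

print(f"Aggregate HBM: {aggregate_gb} GB (about {aggregate_tb:.2f} TB)")
```

With those inputs the total is 1,128 GB, in line with the "over 1.1TB" figure.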

HGX H200 enables standardized servers that provide the highest performance on various application workloads, including LLM training and inference for the largest models beyond 175 billion parameters, while accelerating time to market for NVIDIA’s ecosystem of partner server makers.

NVIDIA HGX H100 in the Cloud with Cirrascale Cloud Services

The NVIDIA HGX H100 brings together the full power of NVIDIA H100 Tensor Core GPUs, NVIDIA® NVLink®, NVSwitch technology, and NVIDIA Quantum-2 InfiniBand networking. As a specialized cloud services provider, Cirrascale delivers all of this to you via the cloud. We offer fully managed NVIDIA GPU-based clusters at a fraction of the cost of traditional cloud service providers. These bare-metal servers are completely dedicated to you, with no contention and no performance issues due to virtualization overhead.

Our flat-rate, no-surprises billing model means we can provide you with a price that is up to 30% lower than other cloud service providers. We also don't nickel-and-dime you by charging to get your data into or out of our cloud. We charge no ingress or egress fees, so you never receive a supplemental bill.

Pricing

AMD Instinct Series Instance Pricing

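| Instance | Processors | Memory | Local storage | Network | Listed prices |
| --- | --- | --- | --- | --- | --- |
| 8X AMD Instinct MI325X | Dual 48-Core | 2.3TB | (1) 960GB NVMe, (4) 3.84TB NVMe | 25Gb Bonded (3200Gb Available) | |
| 8X AMD Instinct MI300X | Dual 48-Core | 2.3TB | (1) 960GB NVMe, (4) 3.84TB NVMe | 25Gb Bonded (3200Gb Available) | $22,499 / $20,249 / $17,999 |
| 4X AMD Instinct MI250 | Dual 64-Core | 1TB | (1) 960GB NVMe, (1) 3.84TB NVMe | 25Gb Bonded | $4,679 / $4,211 / $3,743 |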

All pricing above is based on Cirrascale's No Surprises billing model. There are no hidden fees, and discounts may apply for long-term commitments depending on the service requested. All pricing shown for servers is per server per month.

Pricing

NVIDIA GPU Cloud

| Instance | Processors | Memory | Local storage | Network | Listed prices |
| --- | --- | --- | --- | --- | --- |
| 8-GPU NVIDIA B200 | Dual 48-Core | 2TB | (1) 960GB NVMe, (4) 3.84TB NVMe | 25Gb Bonded (3200Gb Available) | $34,999 / $31,499 / $27,999 |
| 8-GPU NVIDIA H200 | Dual 48-Core | 2TB | (1) 960GB NVMe, (4) 3.84TB NVMe | 25Gb Bonded (3200Gb Available) | $26,499 / $23,849 / $21,199 |
| 8-GPU NVIDIA H100 | Dual 48-Core | 2TB | (1) 960GB NVMe, (4) 3.84TB NVMe | 25Gb Bonded (3200Gb Available) | $24,999 / $22,499 / $19,999 |

Cirrascale Cloud Services has one of the largest selections of NVIDIA GPUs available in the cloud.
The above represents our most popular instances, but check out our pricing page for more instance types.
Not seeing what you need? Contact us for a specialized cloud quote for the configuration you need.
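
To compare the listed NVIDIA prices, it can help to express the second and third price for each instance as a discount against the first, and to break each price down per GPU. The sketch below uses only the prices listed above; what each of the three price points corresponds to is not labeled here, so the code treats them simply as first, second, and third. The helper names are illustrative.

```python
# Listed prices for each instance, in the order they appear on this page.
LISTED_PRICES = {
    "8-GPU NVIDIA B200": (34_999, 31_499, 27_999),
    "8-GPU NVIDIA H200": (26_499, 23_849, 21_199),
    "8-GPU NVIDIA H100": (24_999, 22_499, 19_999),
}
GPUS_PER_SERVER = 8


def discount_vs_first(prices):
    """Percent saved by each later price relative to the first listed price."""
    base = prices[0]
    return [round(100 * (base - p) / base, 1) for p in prices[1:]]


def price_per_gpu(prices):
    """Each listed server price divided evenly across the GPUs in the server."""
    return [round(p / GPUS_PER_SERVER, 2) for p in prices]


for name, prices in LISTED_PRICES.items():
    print(name, discount_vs_first(prices), price_per_gpu(prices))
```

For all three instances, the second and third prices work out to roughly 10% and 20% below the first.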

Pricing

Qualcomm Cloud AI 100 Series Bare-Metal Pricing

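| Instance | | Memory | Local storage | Listed prices |
| --- | --- | --- | --- | --- |
| 8X AI 100 Ultra | 128 | 512GB | (2) 3.84TB NVMe | $4,699 / $3,759 |
| Octo AI 100 Pro | | 384GB | 1TB NVMe | $2,499 / $2,019 |
| Quad AI 100 Pro | | 182GB | 1TB NVMe | $1,259 / $1,009 |
| Dual AI 100 Pro | | 48GB | 1TB NVMe | $629 / $519 |
| Single AI 100 Pro (128) | | 128GB | 1TB NVMe | $549 / $439 |
| Single AI 100 Pro (64) | | 64GB | 1TB NVMe | $369 / $289 |
| Single AI 100 Pro (48) | | 48GB | 1TB NVMe | $329 / $259 |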

Pricing

The Cerebras AI Model Studio

Fine-Tuning - Standard Offering Pricing

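| Model | Parameters (B) | Listed prices | Estimated wall-clock time** |
| --- | --- | --- | --- |
| Eleuther GPT-J | | $0.00055 / $0.0011 / $0.0023 | 132 |
| Eleuther GPT-NeoX | | $0.00190 / $0.0039 / $0.0078 | 451 |
| CodeGen* 350M | 0.35 | $0.00003 / $0.00006 / $0.00013 | |
| CodeGen* 2.7B | 2.7 | $0.00026 / $0.0005 / $0.0027 | |
| CodeGen* 6.1B | 6.1 | $0.00065 / $0.0013 / $0.0030 | 154 |
| CodeGen* 16.1B | 16.1 | $0.00147 / $0.0030 / $0.011 | 350 |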

* T5 tokens to train from the original T5 paper. Chinchilla scaling laws not applicable.
** Note that GPT-J was pre-trained on ~400B tokens. Fine-tuning jobs can employ a wide range of dataset sizes, but often use on the order of 1-10% of the pre-training tokens. As such, one might fine-tune a model like GPT-J with ~4-40B tokens. The table above gives estimated wall-clock time to fine-tune the listed model checkpoints with 10B tokens on Cerebras AI Model Studio and on an AWS p4d instance, to give you a sense of how long jobs of this scale could take.
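
The 4-40B range in the note is just 1-10% of the ~400B pre-training tokens. A minimal sketch of that arithmetic, using only the figures quoted in the note (the function name is illustrative):

```python
PRETRAIN_TOKENS = 400_000_000_000       # ~400B tokens, from the note above
REFERENCE_RUN_TOKENS = 10_000_000_000   # the 10B-token reference job in the note


def fine_tune_range(pretrain_tokens, low_pct=1, high_pct=10):
    """Token budget at low_pct and high_pct percent of the pre-training tokens."""
    return pretrain_tokens * low_pct // 100, pretrain_tokens * high_pct // 100


low, high = fine_tune_range(PRETRAIN_TOKENS)
share = 100 * REFERENCE_RUN_TOKENS / PRETRAIN_TOKENS

print(f"Fine-tuning budget: {low / 1e9:.0f}B to {high / 1e9:.0f}B tokens")
print(f"A 10B-token job is {share:.1f}% of the pre-training tokens")
```

The 10B-token reference job sits at 2.5% of the pre-training tokens, inside the 1-10% band the note describes.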

Fixed-Price Production Model Training

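| Model | Parameters (B) | Tokens to train (B) | Days** | Price |
| --- | --- | --- | --- | --- |
| GPT3-XL | 1.3 | | 0.4 | $2,500 |
| GPT-J | | 120 | | $45,000 |
| GPT-3 6.7B | 6.7 | 134 | | $40,000 |
| T-5 11B | | 34* | | $60,000 |
| GPT-3 13B | | 260 | | $150,000 |
| GPT NeoX | | 400 | | $525,000 |
| GPT 70B | | 1,400 | | Contact For Quote |
| GPT 175B | 175 | 3,500 | | Contact For Quote |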

* T5 tokens to train from the original T5 paper. Chinchilla scaling laws not applicable.
** Expected number of days, based on training experience to date, using a 4-node Cerebras Wafer-Scale Cluster. Actual training of a model may take more or less time.
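
The token counts listed for the GPT-style models appear to follow the commonly cited Chinchilla heuristic of about 20 training tokens per parameter, with T5 as the stated exception. The sketch below checks that reading. It assumes the listed values are billions of tokens and takes parameter counts from the model names; both are interpretations rather than something the page states outright.

```python
# Assumption: the widely quoted Chinchilla rule of thumb, ~20 tokens per parameter.
TOKENS_PER_PARAM = 20

# model -> (parameters in billions, listed tokens in billions)
# Parameter counts are read off the model names above.
LISTED = {
    "GPT-3 6.7B": (6.7, 134),
    "GPT-3 13B": (13, 260),
    "GPT 70B": (70, 1400),
    "GPT 175B": (175, 3500),
    "T-5 11B": (11, 34),  # footnoted: Chinchilla scaling not applicable
}


def chinchilla_tokens_b(params_b):
    """Estimated training tokens (billions) for a model of params_b billion parameters."""
    return round(params_b * TOKENS_PER_PARAM)


for model, (params_b, listed_b) in LISTED.items():
    estimate = chinchilla_tokens_b(params_b)
    status = "matches" if estimate == listed_b else "differs"
    print(f"{model}: estimate {estimate}B, listed {listed_b}B ({status})")
```

Under these assumptions every GPT row matches the 20x estimate, while the T5 row differs, as the first footnote indicates.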
