NVIDIA GPU Cloud

Unmatched End-to-End Accelerated Computing Platform

NVIDIA AI acceleration devices, hosted by Cirrascale, provide multiple GPUs with extremely fast interconnects and a fully accelerated software stack, creating an optimal platform for HPC and AI training, tuning, and inference.

Expand Horizons with NVIDIA in the Cloud

Purpose-Built for AI and HPC

AI, complex simulations, and massive datasets require multiple GPUs with extremely fast interconnections and a fully accelerated software stack. The NVIDIA HGX™ AI supercomputing platform brings together the full power of NVIDIA GPUs, NVIDIA NVLink™, NVIDIA networking, and fully optimized AI and high-performance computing (HPC) software stacks to provide the highest application performance and drive the fastest time to insights.

NVSwitch provides a fully connected topology in which any GPU can talk to any other GPU concurrently. Notably, this communication runs at the full NVLink bidirectional speed of 900 gigabytes per second (GB/s), more than 14x the bandwidth of a PCIe Gen4 x16 bus.
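The 14x figure is easy to sanity-check from published peak rates. A rough sketch in Python, using spec-sheet numbers rather than measured throughput:

```python
# Rough bandwidth comparison: NVLink vs. PCIe Gen4 x16.
# Figures are published peak rates, not measured throughput.
nvlink_bidirectional_gbps = 900   # NVLink per-GPU bidirectional bandwidth, GB/s
pcie_gen4_x16_gbps = 64           # PCIe Gen4 x16 bidirectional (~32 GB/s each way)

ratio = nvlink_bidirectional_gbps / pcie_gen4_x16_gbps
print(f"NVLink is ~{ratio:.1f}x PCIe Gen4 x16")  # ~14.1x
```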

Accelerating HGX With NVIDIA Networking

The data center is the new unit of computing, and networking plays an integral role in scaling application performance across it. Paired with NVIDIA Quantum InfiniBand, HGX delivers world-class performance and efficiency, which ensures the full utilization of computing resources.

At Cirrascale, our NVIDIA HGX H100 and H200 clusters are built with NVIDIA InfiniBand NDR networking, so you receive the most performant cluster for your training and inference needs. Our infrastructure is set up and optimized for your specific configuration, ensuring your training experiments maximize compute per dollar.

Discover the Benefits of NVIDIA AI Hosted by Cirrascale

Flexibility for Training, Fine-Tuning, and Inference

  • Supports model development and tuning
  • Provides tuned performance for all leading AI models
  • Serves as the most recognized platform for developing and deploying AI and HPC solutions

Ease of Use

  • Direct support for leading frameworks
  • The reference platform for model libraries such as Hugging Face, enabling easy model deployment and use

Extensive Software Support

  • Native CUDA support for the most extensive compatibility with existing compute GPU software, frameworks and tools
  • Predictable pricing model deployed on Cirrascale

Simple & Secure Cloud Operations

  • Simple onboarding – No DevOps required
  • SDKs, storage and network pre-configured and ready to go

Popular NVIDIA Offerings on the Cirrascale AI Innovation Cloud

NVIDIA HGX H200: The World’s Leading AI Computing Platform

As workloads explode in complexity, there’s a need for multiple GPUs to work together with extremely fast communication between them. NVIDIA HGX H200 combines multiple H200 GPUs with a high-speed interconnect powered by NVIDIA NVLink and NVSwitch™ to enable the creation of the world’s most powerful scale-up servers.

Cirrascale offers the HGX H200 as a dedicated, bare-metal offering in an eight H200 GPU configuration. The eight-GPU configuration offers full GPU-to-GPU bandwidth through NVIDIA NVSwitch. Leveraging the power of H200 multi-precision Tensor Cores, an eight-way HGX H200 provides over 32 petaFLOPS of FP8 deep learning compute and over 1.1TB of aggregate HBM memory for the highest performance in generative AI and HPC applications.
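Those aggregate numbers follow directly from the per-GPU specs. A quick sketch, assuming NVIDIA's published H200 figures of 141 GB of HBM3e and roughly 4 petaFLOPS of FP8 compute (with sparsity) per GPU:

```python
# Sanity-check the aggregate figures for an eight-way HGX H200 node.
# Per-GPU numbers are NVIDIA's published H200 specs.
num_gpus = 8
hbm_per_gpu_gb = 141          # HBM3e per H200, GB
fp8_pflops_per_gpu = 3.958    # FP8 Tensor Core petaFLOPS, with sparsity

total_hbm_tb = num_gpus * hbm_per_gpu_gb / 1000
total_fp8_pflops = num_gpus * fp8_pflops_per_gpu

print(f"Aggregate HBM: {total_hbm_tb:.3f} TB")   # ~1.1 TB
print(f"Aggregate FP8: ~{total_fp8_pflops:.0f} petaFLOPS")
```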

HGX H200 enables standardized servers that provide the highest performance on various application workloads, including LLM training and inference for the largest models beyond 175 billion parameters, while accelerating time to market for NVIDIA’s ecosystem of partner server makers.

Learn More >

NVIDIA HGX H100 in the Cloud with Cirrascale Cloud Services

The NVIDIA HGX H100 brings together the full power of NVIDIA H100 Tensor Core GPUs, NVIDIA® NVLink®, NVSwitch technology, and NVIDIA Quantum-2 InfiniBand networking. As a specialized cloud services provider, Cirrascale delivers all of this to you via the cloud. We offer fully managed NVIDIA GPU-based clusters at a fraction of the cost of traditional cloud service providers. These bare-metal servers are completely dedicated to you, with no contention and no virtualization overhead.

Our flat-rate, no-surprises billing model means we can offer pricing up to 30% lower than other cloud service providers. We also don't nickel-and-dime you with charges to move data into or out of our cloud: there are no ingress or egress fees, so you never receive a supplemental bill.

Pricing

AMD Instinct Series Instance Pricing

OAM | Processor Specs | System RAM | Local Storage | Network | Monthly Pricing | 6-Month Pricing | Annual Pricing
8X AMD Instinct MI300X | Dual 48-Core | 2.3TB | (1) 960GB NVMe, (4) 3.84TB NVMe | 25Gb Bonded (3200Gb Available) | $22,499 | $20,249 | $17,999
4X AMD Instinct MI250 | Dual 64-Core | 1TB | (1) 960GB NVMe, (1) 3.84TB NVMe | 25Gb Bonded | $4,679 | $4,211 | $3,743
All pricing above is based on Cirrascale's No Surprises billing model. There are no hidden fees, and discounts may apply for long-term commitments depending on the service requested. All server pricing shown is per server per month.
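For a sense of how the commitment tiers relate, here is a quick sketch computing the discounts implied by the MI300X rates above; the same roughly 10% / 20% pattern holds for the MI250 row as well:

```python
# Implied commitment discounts from the 8X MI300X instance pricing above.
monthly = 22_499     # month-to-month rate, $/server/month
six_month = 20_249   # 6-month commitment rate
annual = 17_999      # annual commitment rate

def discount(committed, on_demand=monthly):
    """Percent saved versus the month-to-month rate."""
    return 100 * (on_demand - committed) / on_demand

print(f"6-month: {discount(six_month):.1f}% off")  # ~10% off
print(f"annual:  {discount(annual):.1f}% off")     # ~20% off
```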

Pricing

NVIDIA GPU Instance Pricing

Instance | Processor Specs | System RAM | Local Storage | Network | Monthly Pricing | 6-Month Pricing | Annual Pricing
8-GPU NVIDIA H200 | Dual 48-Core | 2TB | (1) 960GB NVMe, (4) 3.84TB NVMe | 25Gb Bonded (3200Gb Available) | $26,499 | $23,849 | $21,199
8-GPU NVIDIA H100 | Dual 48-Core | 2TB | (1) 960GB NVMe, (4) 3.84TB NVMe | 25Gb Bonded (3200Gb Available) | $24,999 | $22,499 | $19,999

Cirrascale Cloud Services has one of the largest selections of NVIDIA GPUs available in the cloud.
The above represents our most popular instances, but check out our pricing page for more instance types.
Not seeing what you need? Contact us for a specialized cloud quote for the configuration you need.


Pricing

Qualcomm Cloud AI 100 Series Pricing

Config | vCPUs | System RAM | Local Storage | Monthly Pricing | Annual Pricing
8X AI 100 Ultra | 128 | 512GB | (2) 3.84TB NVMe | $4,699 | $3,759
Octo AI 100 Pro | 64 | 384GB | 1TB NVMe | $2,499 | $2,019
Quad AI 100 Pro | 48 | 182GB | 1TB NVMe | $1,259 | $1,009
Dual AI 100 Pro | 24 | 48GB | 1TB NVMe | $629 | $519
Single AI 100 Pro (128) | 32 | 128GB | 1TB NVMe | $549 | $439
Single AI 100 Pro (64) | 32 | 64GB | 1TB NVMe | $369 | $289
Single AI 100 Pro (48) | 12 | 48GB | 1TB NVMe | $329 | $259

Qualcomm branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries.

Pricing

The Cerebras AI Model Studio

Fine-Tuning - Standard Offering Pricing
Model | Parameters (B) | Fine-tuning price per 1K tokens | Fine-tuning price per example (MSL 2048) | Fine-tuning price per example (MSL 4096) | Cerebras time to 10B tokens (h)** | AWS p4d (8xA100) time to 10B tokens (h)
Eleuther GPT-J | 6 | $0.00055 | $0.0011 | $0.0023 | 17 | 132
Eleuther GPT-NeoX | 20 | $0.00190 | $0.0039 | $0.0078 | 56 | 451
CodeGen* 350M | 0.35 | $0.00003 | $0.00006 | $0.00013 | 1 | 8
CodeGen* 2.7B | 2.7 | $0.00026 | $0.0005 | $0.0027 | 8 | 61
CodeGen* 6.1B | 6.1 | $0.00065 | $0.0013 | $0.0030 | 19 | 154
CodeGen* 16.1B | 16.1 | $0.00147 | $0.0030 | $0.011 | 44 | 350
* T5 tokens to train from the original T5 paper. Chinchilla scaling laws not applicable.

** Note that GPT-J was pre-trained on ~400B tokens. Fine-tuning jobs can use a wide range of dataset sizes, but often on the order of 1-10% of the pre-training tokens, so one might fine-tune a model like GPT-J with ~4-40B tokens. The table above lists estimated wall-clock time to fine-tune each model checkpoint with 10B tokens on the Cerebras AI Model Studio and on an AWS p4d instance, to give a sense of how long jobs of this scale could take.
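To turn the per-1K-token rates above into a budget estimate, a small sketch (prices taken from the table; the 10B-token job size matches the timing note above):

```python
# Fine-tuning cost estimate from the per-1K-token rates in the table above.
price_per_1k_tokens = {"GPT-J": 0.00055, "GPT-NeoX": 0.00190}

def fine_tune_cost(model, tokens):
    """Dollar cost of fine-tuning `model` on `tokens` training tokens."""
    return price_per_1k_tokens[model] * tokens / 1_000

# Fine-tuning GPT-J on 10B tokens (the scale used in the timing note):
cost = fine_tune_cost("GPT-J", 10_000_000_000)
print(f"${cost:,.0f}")  # $5,500

# The per-example prices are consistent with the per-token rate:
# one example at MSL 2048 is 2.048K tokens.
per_example = price_per_1k_tokens["GPT-J"] * 2.048
print(round(per_example, 4))  # 0.0011, matching the MSL 2048 column
```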
Fixed-Price Production Model Training
Model | Parameters (B) | Tokens to Train to Chinchilla Point (B) | Cerebras AI Model Studio CS-2 Days to Train** | Cerebras AI Model Studio Price to Train
GPT3-XL | 1.3 | 26 | 0.4 | $2,500
GPT-J | 6 | 120 | 8 | $45,000
GPT-3 6.7B | 6.7 | 134 | 11 | $40,000
T-5 11B | 11 | 34* | 9 | $60,000
GPT-3 13B | 13 | 260 | 39 | $150,000
GPT NeoX | 20 | 400 | 47 | $525,000
GPT 70B | 70 | 1,400 | Contact For Quote | Contact For Quote
GPT 175B | 175 | 3,500 | Contact For Quote | Contact For Quote
* T5 tokens to train from the original T5 paper. Chinchilla scaling laws not applicable.

** Expected number of days, based on training experience to date, using a 4-node Cerebras Wafer-Scale Cluster. Actual training of a model may take more or less time.
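The "Tokens to Train to Chinchilla Point" column tracks the common Chinchilla heuristic of roughly 20 training tokens per model parameter (T-5 11B is the exception, per the * footnote). A quick check against the table:

```python
# The Chinchilla heuristic: compute-optimal training uses roughly
# 20 tokens per model parameter. Values reproduce the table's
# "Tokens to Train to Chinchilla Point (B)" column (T-5 excepted).
TOKENS_PER_PARAM = 20

params_billions = {"GPT3-XL": 1.3, "GPT-J": 6, "GPT-3 6.7B": 6.7,
                   "GPT-3 13B": 13, "GPT NeoX": 20, "GPT 70B": 70,
                   "GPT 175B": 175}

for name, params_b in params_billions.items():
    tokens_b = TOKENS_PER_PARAM * params_b
    print(f"{name}: {tokens_b:g}B tokens")
```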

Ready To Get Started?

Ready to take advantage of our flat-rate monthly billing, no ingress/egress data fees, and fast multi-tiered storage?

Get Started