The Qualcomm® Cloud AI 100 is a performance- and cost-optimized AI inference solution,
purpose-designed for generative AI, large language models, natural language processing,
and computer vision.

Purpose-built for high-performance, low-power AI processing in the cloud.

The Qualcomm Cloud AI 100 is designed for AI inference acceleration and addresses the unique requirements of the cloud, including power efficiency, scale, process node advancements, and signal processing, enabling inference to run faster and more efficiently. It is designed to be a leading solution for customers scaling AI inference workloads across their enterprises, worldwide.

Qualcomm's industry-leading solutions draw on over a decade of research and development to deliver high-performance, low-power deep learning inference acceleration. This scalable architecture enables AI processing and analytics on both real-time and offline multimedia streams.

Ready to Go?

Sign up to access the Qualcomm Cloud AI 100 and run inference on the edge cloud faster and more efficiently.


Qualcomm Cloud AI 100 and the Cirrascale AI Innovation Cloud

We've partnered with Qualcomm to offer their cutting-edge AI inference accelerator for customers to test, evaluate, and fully deploy in the cloud. Whatever your application, whether large language models (LLMs), natural language processing (NLP), or object detection, the Cirrascale AI Innovation Cloud with the Qualcomm Cloud AI 100 is for you.

Our flat-rate, no-surprises billing model means the price we quote for Qualcomm Cloud AI 100 instances won't fluctuate, so you can count on what we've presented as your final price. We also don't nickel-and-dime you for moving data into or out of our cloud: we charge no ingress or egress fees, so you never receive a supplemental bill.

Cloud AI 100

Use Cases

The Qualcomm Cloud AI 100 accelerator enables high performance deep learning inference across computer vision, object detection, natural language processing, generative AI models, and more.

Typical LLM use cases include: text-to-code generation for greatly accelerated application development and site building, customer service chatbots for online retail, document summarization and copilot-style summaries of meetings or emails, language translation, and expanding business access to markets across geographies.

The Qualcomm Cloud AI 100 supports dozens of NLP models, including GPT-2 and its variants and Bidirectional Encoder Representations from Transformers (BERT) and its variants. Beyond NLP, the Qualcomm Cloud AI 100 supports models in domains from computer vision (image classification, object detection, semantic segmentation, pose estimation, face detection) to autonomous driving.

Qualcomm Inference Performance

Qualcomm Cloud AI 100 Benchmark Results

These latest results demonstrate Qualcomm Cloud AI 100 leadership across AI inferencing applications in both the datacenter and edge categories, delivering the highest number of inferences at the lowest latency and energy utilization. The Cloud AI 100 provides a unique blend of high computational performance, low latency, and low power utilization, and is well suited to a broad range of applications.



Qualcomm Cloud AI 100 Pricing

All pricing below is based on Cirrascale's No Surprises billing model. There are no hidden fees, and discounts may apply for long-term commitments depending on the service requested. All server pricing shown is per server per month.

Config               vCPUs   System RAM   Local Storage   Monthly Pricing   Annual Pricing
Single AI 100 (48)   12      48 GB        1 TB NVMe       $329              $259
Single AI 100 (64)   32      64 GB        1 TB NVMe       $369              $289
Single AI 100 (128)  32      128 GB       1 TB NVMe       $549              $439
Dual AI 100          24      48 GB        1 TB NVMe       $629              $519
Quad AI 100          48      182 GB       1 TB NVMe       $1,259            $1,009
Octo AI 100          64      384 GB       1 TB NVMe       $2,499            $2,019
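To make the monthly-versus-annual comparison concrete, here is a small sketch of the yearly savings per configuration. It assumes (an interpretation, not stated explicitly above) that the "Annual Pricing" column is the discounted per-month rate under a 12-month commitment:

```python
# Pricing from the table above, in USD per server per month:
# (standard monthly rate, monthly rate with annual commitment).
# Assumption: "Annual Pricing" is a discounted monthly rate, billed for 12 months.
PRICING = {
    "Single AI 100 (48)":  (329, 259),
    "Single AI 100 (64)":  (369, 289),
    "Single AI 100 (128)": (549, 439),
    "Dual AI 100":         (629, 519),
    "Quad AI 100":         (1259, 1009),
    "Octo AI 100":         (2499, 2019),
}

def yearly_savings(config: str) -> int:
    """Dollars saved over a year by choosing the annual commitment."""
    monthly, annual = PRICING[config]
    return (monthly - annual) * 12

for name in PRICING:
    print(f"{name}: ${yearly_savings(name)} saved per year")
```

For example, a Single AI 100 (48) saves (329 − 259) × 12 = $840 over a year under this reading.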

Get Started Today!

Sign up to access the Qualcomm Cloud AI 100 and experience unprecedented AI inferencing.