The NVIDIA Ampere architecture is now available with cloud solutions
based on NVIDIA A100, A40, A30, and A6000 GPUs.

Introducing NVIDIA Ampere Architecture on the Cirrascale Cloud Services Platform

Scientists, researchers, and engineers—the da Vincis and Einsteins of our time—are working to solve the world’s most important scientific, industrial, and big data challenges with AI and high-performance computing (HPC). Meanwhile, businesses and even entire industries seek to harness the power of AI to extract new insights from massive data sets, both on-premises and in the cloud. The NVIDIA Ampere architecture, designed for the age of elastic computing, delivers the next giant leap by providing unmatched acceleration at every scale, enabling these innovators to do their life’s work.

Third-Generation Tensor Cores

First introduced in the NVIDIA Volta™ architecture, NVIDIA Tensor Core technology has brought dramatic speedups to AI, bringing down training times from weeks to hours and providing massive acceleration to inference. The NVIDIA Ampere architecture builds upon these innovations by bringing new precisions—Tensor Float 32 (TF32) and Floating Point 64 (FP64)—to accelerate and simplify AI adoption and extend the power of Tensor Cores to HPC.

TF32 works just like FP32 while delivering speedups of up to 20X for AI without requiring any code change. Using NVIDIA Automatic Mixed Precision with FP16, researchers can gain an additional 2X performance by adding just a couple of lines of code. And with support for bfloat16, INT8, and INT4, Tensor Cores in NVIDIA A100 Tensor Core GPUs create an incredibly versatile accelerator for both AI training and inference. Bringing the power of Tensor Cores to HPC, A100 also enables matrix operations in full, IEEE-certified, FP64 precision.
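Why does TF32 need no code changes? It keeps FP32's full 8-bit exponent range while trimming the 23-bit mantissa to the 10 bits the Tensor Cores use, so values stay in range and only low-order precision is dropped. The sketch below (illustrative only; the hardware rounds rather than truncates, and `to_tf32` is not an NVIDIA API) shows the idea on the bit level:

```python
import struct

def to_tf32(x: float) -> float:
    """Approximate TF32 by truncating an FP32 value's 23-bit mantissa
    to the 10 bits TF32 keeps. (Sketch: real hardware rounds to nearest.)"""
    # Reinterpret the float as its raw 32-bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Clear the low 13 mantissa bits: TF32 = 1 sign + 8 exponent + 10 mantissa.
    bits &= ~((1 << 13) - 1)
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# Values exactly representable in 10 mantissa bits pass through unchanged;
# others lose only low-order precision, never dynamic range.
print(to_tf32(1.0))   # unchanged
print(to_tf32(0.1))   # slightly truncated, still close to 0.1
```

Because the exponent field is untouched, any model that trains in FP32 keeps the same numeric range under TF32, which is why frameworks can enable it transparently.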



Our NVIDIA RTX™ A6000-based cloud solutions are perfect for massive data sets and workloads. The RTX A6000 has 48GB of ultra-fast GDDR6 memory, scalable up to 96GB with NVLink, giving data scientists, engineers, and creative professionals the large memory necessary for demanding data science and simulation workflows.

New Tensor Float 32 (TF32) precision provides up to 5X the training throughput over the previous generation to accelerate AI and data science model training without requiring any code changes. Hardware support for structural sparsity doubles the throughput for inferencing.
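The structural sparsity the hardware exploits is the 2:4 pattern: in every contiguous group of four weights, at most two are nonzero, so the Tensor Cores can skip the zeros and roughly double inference throughput. A minimal sketch of the pattern check (illustrative only; actual pruning is done with NVIDIA's tooling, and `is_2_4_sparse` is a hypothetical helper):

```python
def is_2_4_sparse(weights):
    """Return True if every contiguous group of 4 values has at most
    2 nonzeros, i.e. the row satisfies the 2:4 structured-sparsity
    pattern that Ampere Tensor Cores accelerate."""
    assert len(weights) % 4 == 0, "length must be a multiple of 4"
    return all(
        sum(1 for w in weights[i:i + 4] if w != 0) <= 2
        for i in range(0, len(weights), 4)
    )

# A row pruned to 2:4 sparsity: two zeros in each group of four.
pruned = [0.5, 0.0, -1.2, 0.0,  0.0, 0.3, 0.0, 0.9]
dense  = [0.5, 0.1, -1.2, 0.2,  0.4, 0.3, 0.8, 0.9]
print(is_2_4_sparse(pruned))  # True
print(is_2_4_sparse(dense))   # False
```

Because exactly half the weights can be dropped in a regular pattern, the hardware stores the nonzeros compactly plus a small index, which is where the 2X inference throughput comes from.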



Built on the NVIDIA Ampere architecture, the NVIDIA® RTX™ A5000 perfectly balances power, performance, and memory to spearhead the future of innovation in the cloud. It combines 64 second-generation RT Cores, 256 third-generation Tensor Cores, and 8,192 CUDA® cores with 24GB of graphics memory to supercharge rendering, AI, graphics, and compute tasks. Connect two RTX A5000s with NVIDIA NVLink for 48GB of combined GPU memory, unlocking the ability to work with larger models, renders, and scenes, tackle memory-intensive tasks like natural language processing, and run higher-fidelity simulations to enhance your product development process.



The NVIDIA RTX A4000 is the most powerful single-slot GPU for professionals, delivering real-time ray tracing, AI-accelerated compute, and high-performance graphics. Built on the NVIDIA Ampere architecture, the RTX A4000 combines 48 second-generation RT Cores, 192 third-generation Tensor Cores, and 6,144 CUDA cores with 16GB of graphics memory. So, you can engineer next-generation products, design cityscapes of the future, and create immersive entertainment experiences of tomorrow, today.



The NVIDIA A100 Tensor Core GPU delivers unprecedented acceleration at every scale to power the world’s highest-performing elastic data centers for AI, data analytics, and HPC. Powered by the NVIDIA Ampere architecture, A100 is the engine of the NVIDIA data center platform. A100 provides up to 20X higher performance over the prior generation and can be partitioned into as many as seven GPU instances to dynamically adjust to shifting demands. Available in 40GB and 80GB memory versions, A100 80GB debuts the world’s fastest memory bandwidth at over 2 terabytes per second (TB/s) to run the largest models and datasets.
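The seven-way partitioning mentioned above is Multi-Instance GPU (MIG): the A100 exposes seven compute slices, and each MIG profile consumes a fixed number of them. The sketch below illustrates the slice arithmetic using the published A100 40GB profile names; the `fits_on_a100` helper is hypothetical (real placement via `nvidia-smi mig` has additional layout rules this simplification ignores):

```python
# Published MIG profiles for the A100 40GB: name -> compute slices used.
A100_40GB_PROFILES = {
    "1g.5gb": 1,    # smallest instance: 1 slice, 5 GB
    "2g.10gb": 2,
    "3g.20gb": 3,
    "4g.20gb": 4,
    "7g.40gb": 7,   # the whole GPU
}
TOTAL_SLICES = 7

def fits_on_a100(requested):
    """Return True if the requested MIG instances fit within the
    A100's 7 compute slices (slice-count check only; a sketch)."""
    used = sum(A100_40GB_PROFILES[p] for p in requested)
    return used <= TOTAL_SLICES

print(fits_on_a100(["1g.5gb"] * 7))          # seven small instances fit
print(fits_on_a100(["3g.20gb", "4g.20gb"]))  # 3 + 4 = 7 slices, fits
print(fits_on_a100(["4g.20gb", "4g.20gb"]))  # 8 slices, does not fit
```

This is why "up to seven instances" is the headline number: seven 1g.5gb instances exhaust the GPU, while larger profiles trade instance count for per-instance compute and memory.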



The NVIDIA A40 GPU, with 48GB of GDDR6 memory, is an evolutionary leap in performance and multi-workload capabilities for the data center, combining best-in-class professional graphics with powerful compute and AI acceleration to meet today’s design, creative, and scientific challenges. Driving the next generation of virtual workstations and server-based workloads, the NVIDIA A40 brings state-of-the-art features for ray-traced rendering, simulation, virtual production, and more to professionals anytime, anywhere. The NVIDIA A40 combines the latest NVIDIA Ampere architecture RT Cores, Tensor Cores, and CUDA® Cores, bringing next-generation NVIDIA RTX™ technology to the data center for professional visualization workloads.



The NVIDIA A30 Tensor Core GPU delivers versatile performance across a broad range of AI inference and mainstream enterprise compute workloads, such as recommender systems, conversational AI, and computer vision. The A30 supports MIG technology, delivering superior price/performance with up to four instances, each with 6GB of memory, perfectly suited to entry-level applications. Cirrascale’s accelerated cloud server solutions with NVIDIA A30 GPUs provide the needed compute power, along with large HBM2 memory, 933GB/sec of memory bandwidth, and scalability with NVIDIA NVLink® interconnect technology, to tackle massive datasets and turn them into valuable insights.


Ready for NVIDIA Ampere in the Cloud?

Ready to take advantage of our flat-rate monthly billing, no ingress/egress data fees, and fast multi-tiered storage with NVIDIA Ampere GPUs?