Cerebras IPO 2026: $3.5B Raise, $26.6B Valuation, WSE-3 vs NVIDIA H100

Abhishek Gautam | May 5, 2026 | 6 min read

Quick summary

Cerebras filed for an IPO in May 2026, targeting a $3.5B raise at a $26.6B valuation under the ticker CBRS. The company's WSE-3 chip is claimed to run inference 20x faster than an NVIDIA H100 cluster for models in the 7B-70B parameter range.

What the WSE-3 Actually Does

The core insight behind Cerebras is that the bottleneck in large language model inference is memory bandwidth, not raw compute. A standard GPU cluster runs models by distributing weights across many GPU memory banks and communicating between them via NVLink or InfiniBand. That inter-GPU communication is constrained by both latency and bandwidth.

The WSE-3 eliminates inter-chip communication for models that fit on a single wafer. The chip has 44 GB of on-chip SRAM directly adjacent to the compute cores — not HBM stacked on a package, but actual on-die SRAM with memory bandwidth measured in petabytes per second rather than terabytes per second. For a model running on a single WSE-3, the weights load once and inference runs continuously without the inter-GPU coordination overhead.
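To make the memory-bandwidth argument concrete, here is a rough sketch of the ceiling it implies. It assumes single-stream decoding in which every generated token reads all model weights once. Both bandwidth values are illustrative placeholders (a few TB/s for HBM-class memory, a few PB/s for on-die SRAM), not measured or vendor-quoted figures, and the function name is hypothetical.

```python
def decode_tokens_per_second(params_billion: float,
                             bytes_per_param: float,
                             mem_bandwidth_tb_s: float) -> float:
    """Upper bound on single-stream decode speed in the bandwidth-bound regime,
    where each generated token requires reading every weight once."""
    weight_bytes = params_billion * 1e9 * bytes_per_param
    bandwidth_bytes_per_s = mem_bandwidth_tb_s * 1e12
    return bandwidth_bytes_per_s / weight_bytes


# Assumed, illustrative inputs (not taken from the article or any datasheet):
HBM_TB_S = 3.0       # "terabytes per second" class memory
SRAM_TB_S = 3000.0   # "petabytes per second" class memory, taken as 3 PB/s

# An 8B-parameter model stored at 2 bytes per weight.
gpu_bound = decode_tokens_per_second(8, 2, HBM_TB_S)
sram_bound = decode_tokens_per_second(8, 2, SRAM_TB_S)

print(f"HBM-bound ceiling:  {gpu_bound:,.1f} tokens/s")
print(f"SRAM-bound ceiling: {sram_bound:,.1f} tokens/s")
```

Under these assumptions the ceiling scales linearly with memory bandwidth. Real throughput is also shaped by compute limits, batching, and KV-cache traffic, so the ratio between the two ceilings should not be read as a predicted speedup.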

The catch: models must fit on the chip. The WSE-3 can hold models up to approximately 70B parameters natively. For larger models (GPT-4 scale, Claude Opus, Gemini 1.5 Ultra), Cerebras uses a technique called Weight Streaming — model weights are streamed from external DRAM while the chip processes tokens. Weight Streaming allows inference on models much larger than on-chip memory permits, but reduces the throughput advantage compared to native fit.
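Whether a model "fits" depends heavily on numeric precision, which the figures above do not specify. The sketch below counts weights only, ignoring KV cache and activations, and checks a 70B-parameter model against the 44 GB of on-chip SRAM at a few common bit widths. The helper names are hypothetical, and the single-wafer, weights-only framing is an assumption.

```python
ON_CHIP_SRAM_GB = 44  # on-chip SRAM figure quoted in the article


def weight_footprint_gb(params_billion: float, bits_per_param: float) -> float:
    """Size of the weights alone, in decimal gigabytes."""
    # (params_billion * 1e9 params) * (bits / 8 bytes) / 1e9 bytes per GB
    return params_billion * bits_per_param / 8


def fits_on_chip(params_billion: float, bits_per_param: float,
                 sram_gb: float = ON_CHIP_SRAM_GB) -> bool:
    """True if the weights alone fit in the given SRAM capacity."""
    return weight_footprint_gb(params_billion, bits_per_param) <= sram_gb


for bits in (16, 8, 4):
    size = weight_footprint_gb(70, bits)
    print(f"70B at {bits}-bit: {size:.0f} GB, fits in 44 GB: {fits_on_chip(70, bits)}")

# Largest average bit width at which 70B parameters fit in 44 GB.
max_bits_for_70b = ON_CHIP_SRAM_GB * 8 / 70
print(f"Max bits per weight for 70B in 44 GB: {max_bits_for_70b:.2f}")
```

On this arithmetic, a 70B model fits in 44 GB only at roughly 5 bits per weight or fewer, so the 70B native-fit figure presumably relies on aggressive quantization or on spreading the model across more than one wafer.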

The 20x claim applies to models in the 7B-70B range running in native mode. For enterprise deployments of Llama 3 70B, Mixtral 8x22B, or similar models, the benchmark is credible. For GPT-4 class models in Weight Streaming mode, the advantage narrows.

The OpenAI Partnership

The most significant business development in the IPO filing is a $20 billion multi-year compute partnership with OpenAI. Under the deal, OpenAI is purchasing inference capacity from Cerebras to serve ChatGPT and API traffic — specifically for workloads where Cerebras inference throughput translates directly to user-facing response latency.

OpenAI has historically been entirely NVIDIA-dependent for both training and inference. The Cerebras deal does not replace that dependence — NVIDIA remains OpenAI's primary compute provider — but it introduces a second vendor specifically for inference, which is where the majority of OpenAI's operational compute spend now sits.

The strategic implication for the broader industry: OpenAI validating Cerebras as production inference infrastructure de-risks the Cerebras bet for every other enterprise evaluating alternatives to NVIDIA. If OpenAI's production traffic runs on Cerebras, the technical and operational risk argument against Cerebras adoption weakens substantially.

The AWS Marketplace Integration

Cerebras announced general availability on AWS Marketplace in conjunction with the IPO filing. Enterprise AWS customers can now provision Cerebras inference endpoints directly from the AWS console and bill Cerebras usage through their AWS account, including against existing Enterprise Discount Program (EDP) commitments.

This is the same distribution move that has accelerated enterprise adoption for other AI hardware vendors. AWS Marketplace integration removes the procurement friction — separate vendor contracts, separate billing, separate security reviews — that causes enterprises to default to AWS-native options even when third-party hardware performs better.

Cerebras is available via Marketplace for both on-demand inference API access (pay per token) and dedicated instance reservations (an hourly rate for reserved WSE-3 capacity). The dedicated instance model is the more economical option for high-volume inference workloads, where sustained usage is high enough that the hourly rate works out cheaper than per-token pricing.
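The trade-off between the two pricing models reduces to a break-even calculation. The sketch below uses entirely hypothetical prices (the article quotes none) and hypothetical function names. It only shows how to find the hourly token volume above which a reservation costs less than paying per token.

```python
def breakeven_tokens_per_hour(hourly_rate_usd: float,
                              usd_per_million_tokens: float) -> float:
    """Token volume per hour at which both pricing models cost the same."""
    return hourly_rate_usd / usd_per_million_tokens * 1_000_000


def cheaper_option(tokens_per_hour: float,
                   hourly_rate_usd: float,
                   usd_per_million_tokens: float) -> str:
    """Compare one hour of on-demand usage against one reserved hour."""
    on_demand_cost = tokens_per_hour / 1_000_000 * usd_per_million_tokens
    return "dedicated" if hourly_rate_usd < on_demand_cost else "on-demand"


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
HOURLY_RATE_USD = 100.0
USD_PER_MILLION_TOKENS = 0.50

threshold = breakeven_tokens_per_hour(HOURLY_RATE_USD, USD_PER_MILLION_TOKENS)
print(f"Break-even: {threshold:,.0f} tokens per hour")
print(cheaper_option(50_000_000, HOURLY_RATE_USD, USD_PER_MILLION_TOKENS))
print(cheaper_option(500_000_000, HOURLY_RATE_USD, USD_PER_MILLION_TOKENS))
```

The comparison is only meaningful if a single reserved instance can actually sustain the volume in question, and real Marketplace pricing may include minimum terms that this sketch ignores.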

The Financial Metrics Behind the IPO

From the S-1 filing:

Revenue grew from $78M in fiscal 2024 to $248M in fiscal 2025, driven almost entirely by the CSP-1 (Cerebras Cloud) inference service rather than hardware sales. The business has largely shifted from selling chips to selling compute time on chips — a subscription model rather than a one-time hardware sale.

Gross margin on the cloud inference product is 62%, which is below AWS and Google Cloud margins but comparable to earlier-stage infrastructure companies at this revenue scale. The R&D cost base is high — $340M in fiscal 2025 — as Cerebras funds WSE-4 development and the chip fabrication pipeline at TSMC.

Operating losses remain significant: net loss was $295M in fiscal 2025. The path to profitability requires either continued revenue growth outpacing R&D spend, or a reduction in R&D intensity once WSE-4 is in production.

The $26.6B valuation implies approximately 107x trailing revenue. That is expensive by traditional semiconductor multiples but comparable to where Palantir and Snowflake traded at their IPOs. Investors are pricing Cerebras as an AI infrastructure platform, not a chip company.
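The ratios in this section follow directly from the S-1 figures cited above. A short check, assuming "trailing revenue" means fiscal 2025 revenue (variable names are illustrative):

```python
# Figures as quoted in this article, in US dollars.
valuation = 26.6e9
revenue_fy2024 = 78e6
revenue_fy2025 = 248e6
rd_spend_fy2025 = 340e6
net_loss_fy2025 = 295e6

revenue_multiple = valuation / revenue_fy2025
revenue_growth = revenue_fy2025 / revenue_fy2024 - 1
rd_to_revenue = rd_spend_fy2025 / revenue_fy2025
net_loss_to_revenue = net_loss_fy2025 / revenue_fy2025

print(f"Valuation / trailing revenue: {revenue_multiple:.1f}x")
print(f"Year-over-year revenue growth: {revenue_growth:.0%}")
print(f"R&D as a share of revenue: {rd_to_revenue:.0%}")
print(f"Net loss as a share of revenue: {net_loss_to_revenue:.0%}")
```

R&D spend alone exceeds revenue on these figures, which is why the path to profitability depends on revenue continuing to grow faster than R&D or on R&D intensity falling.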

The Original IPO Attempt and the G42 CFIUS Issue

Cerebras had previously filed to go public in 2024 but withdrew the filing. The obstacle was a CFIUS review related to G42, an Abu Dhabi-based AI company that held a significant stake in Cerebras. CFIUS raised national security concerns about foreign ownership of US AI chip infrastructure.

Cerebras resolved the CFIUS issue by restructuring the G42 stake (G42 reduced its ownership below the threshold triggering review) and then refiled. The 2026 IPO is the successful completion of an attempt that was delayed two years by geopolitical considerations. That history is worth noting: Cerebras's cap table and foreign investor relationships were material risk factors that had to be actively managed before the company could go public.

Key Takeaways

IPO filing May 2026: $3.5B raise target, $26.6B valuation, CBRS ticker on Nasdaq; S-1 filed; previous 2024 attempt was withdrawn over CFIUS concerns about G42 stake
WSE-3 chip: 4 trillion transistors, 44 GB on-chip SRAM, wafer-scale die — eliminates inter-GPU communication overhead; 20x faster than H100 for 7B-70B models in native mode; Weight Streaming for larger models narrows advantage
OpenAI $20B partnership: OpenAI buying Cerebras inference capacity for ChatGPT and API traffic — production validation that de-risks enterprise adoption of Cerebras hardware
AWS Marketplace integration: Generally available; enterprise customers provision Cerebras inference via AWS console, bill against EDPs; pay-per-token and dedicated reservation options
Revenue: $248M in fiscal 2025, up from $78M in fiscal 2024; gross margin 62% on cloud inference; net loss $295M; R&D $340M for WSE-4 development
Valuation: 107x trailing revenue — priced as AI infrastructure platform, not semiconductor company; comparable to early Palantir/Snowflake comps

For the broader AI chip supply chain, read TSMC Q1 2026: Record Profit, HBM4 Sold Out, OpenAI Titan Chip Tape-Out. For the GPU infrastructure context, read Anthropic Leases SpaceX Colossus 1: 220K GPUs, Claude Rate Limits Doubled.

Frequently Asked Questions

What is the Cerebras IPO valuation and when is it going public?

Cerebras filed its S-1 in May 2026 targeting a $3.5 billion raise at a $26.6 billion valuation, with the ticker CBRS on Nasdaq. The filing follows a previous 2024 IPO attempt that was withdrawn due to CFIUS concerns about a significant ownership stake held by G42, an Abu Dhabi AI company. Cerebras resolved the CFIUS review by restructuring the G42 stake before refiling. The $26.6B valuation implies approximately 107x trailing revenue — priced as an AI infrastructure platform rather than a traditional semiconductor company.

How does the Cerebras WSE-3 chip compare to NVIDIA H100 for AI inference?

The WSE-3 (Wafer Scale Engine 3) runs inference 20 times faster than a comparably priced NVIDIA H100 cluster for models in the 7B-70B parameter range when running in native mode. The advantage comes from 44 GB of on-chip SRAM with petabyte-per-second memory bandwidth — eliminating the inter-GPU communication overhead that limits H100 cluster throughput. For models larger than 70B parameters, Cerebras uses Weight Streaming (streaming weights from external DRAM), which reduces but does not eliminate the performance advantage versus GPU clusters.

What is the Cerebras and OpenAI partnership deal?

OpenAI signed a $20 billion multi-year compute partnership with Cerebras to purchase inference capacity for ChatGPT and API traffic. OpenAI is using Cerebras WSE-3 inference specifically for workloads where throughput directly translates to user-facing response latency. This is the first time OpenAI has publicly deployed a non-NVIDIA inference vendor for production ChatGPT traffic. The deal is significant as enterprise validation — if OpenAI's production traffic runs on Cerebras, the technical risk argument against adopting Cerebras for other enterprises weakens substantially.

Can I access Cerebras inference on AWS?

Yes. Cerebras is generally available on AWS Marketplace as of May 2026. Enterprise AWS customers can provision Cerebras inference endpoints directly from the AWS console and bill Cerebras usage through their AWS account, including against existing Enterprise Discount Program credits. Options include pay-per-token inference API access and dedicated WSE-3 instance reservations billed hourly. The Marketplace integration removes separate vendor contracts, billing, and security review overhead that previously made Cerebras adoption friction-heavy for large enterprises.
