OpenAI Sidesteps Nvidia with Unusually Fast GPT-5.3-Codex-Spark Coding Model on Cerebras Chips

Model Release and Performance

On Thursday, OpenAI released its first production AI model to run on non-Nvidia hardware, deploying the new GPT-5.3-Codex-Spark coding model on chips from Cerebras. The model delivers code at more than 1,000 tokens (chunks of data) per second, which is reported to be roughly 15 times faster than its predecessor.

For comparison, Anthropic's Claude Opus 4.6, in its new premium-priced fast mode, runs at about 2.5 times its standard speed of 68.2 tokens per second. It remains a larger and more capable model than Spark.
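To put those figures side by side, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers quoted above and treats Spark's "more than 1,000 tokens per second" as exactly 1,000, so the result is a rough lower bound rather than a benchmark. The 5,000-token response size and the helper name seconds_to_generate are assumptions made for illustration.

```python
# Figures quoted in the article; Spark's rate is a stated lower bound.
SPARK_TOKENS_PER_SEC = 1000.0
OPUS_STANDARD_TOKENS_PER_SEC = 68.2
OPUS_FAST_MULTIPLIER = 2.5


def seconds_to_generate(num_tokens: int, tokens_per_sec: float) -> float:
    """Time to stream num_tokens at a constant generation rate."""
    return num_tokens / tokens_per_sec


# Implied fast-mode throughput for Opus 4.6 (about 170.5 tokens/s).
opus_fast_tokens_per_sec = OPUS_STANDARD_TOKENS_PER_SEC * OPUS_FAST_MULTIPLIER

# How many times faster Spark is than Opus fast mode, at minimum.
spark_vs_opus_fast = SPARK_TOKENS_PER_SEC / opus_fast_tokens_per_sec

# Hypothetical 5,000-token response, chosen only for illustration.
RESPONSE_TOKENS = 5000
for name, rate in [
    ("Spark", SPARK_TOKENS_PER_SEC),
    ("Opus 4.6 fast mode", opus_fast_tokens_per_sec),
    ("Opus 4.6 standard", OPUS_STANDARD_TOKENS_PER_SEC),
]:
    print(f"{name}: {seconds_to_generate(RESPONSE_TOKENS, rate):.1f} s")

print(f"Spark vs. Opus fast mode: at least {spark_vs_opus_fast:.1f}x")
```

Under these assumptions, the implied fast-mode rate for Opus 4.6 is about 170.5 tokens per second, which would leave Spark a little under six times faster on raw throughput. The calculation says nothing about output quality.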

"Cerebras has been a great engineering partner, and we’re excited about adding fast inference as a new platform capability," said Sachin Katti, head of compute at OpenAI.

Availability and Specifications

Codex-Spark is available as a research preview to ChatGPT Pro subscribers ($200 per month) through the Codex app, command-line interface, and VS Code extension. OpenAI is also rolling out API access to select design partners.

The model has a 128,000-token context window and handles only text at launch.