Model Release and Performance
On Thursday, OpenAI released its first production AI model to run on non-Nvidia hardware, deploying the new GPT-5.3-Codex-Spark coding model on chips from Cerebras. The model generates code at more than 1,000 tokens (chunks of data) per second, which is reported to be roughly 15 times faster than its predecessor.
For comparison, Anthropic's Claude Opus 4.6 runs at a standard speed of 68.2 tokens per second, and its new premium-priced fast mode reaches about 2.5 times that rate. Opus 4.6 remains a larger and more capable model than Spark.
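To put those figures side by side, the short Python sketch below works out the throughput each claim implies. It is back-of-the-envelope arithmetic only: the four constants come from the numbers quoted above, "more than 1,000" is treated as a lower bound of exactly 1,000, and the helper names and the 2,000-token response size are made up for illustration. The derived rates are estimates, not measured benchmarks.

```python
# Figures quoted in the article; everything derived from them is an estimate.
SPARK_TOKENS_PER_SEC = 1_000          # "more than 1,000", so a lower bound
SPARK_SPEEDUP_OVER_PREDECESSOR = 15   # "roughly 15 times faster"
OPUS_STANDARD_TOKENS_PER_SEC = 68.2   # Claude Opus 4.6 standard speed
OPUS_FAST_MODE_MULTIPLIER = 2.5       # "about 2.5 times" in fast mode


def implied_rates() -> dict[str, float]:
    """Return tokens-per-second estimates implied by the quoted figures."""
    return {
        "spark": float(SPARK_TOKENS_PER_SEC),
        "spark_predecessor": SPARK_TOKENS_PER_SEC / SPARK_SPEEDUP_OVER_PREDECESSOR,
        "opus_standard": OPUS_STANDARD_TOKENS_PER_SEC,
        "opus_fast": OPUS_STANDARD_TOKENS_PER_SEC * OPUS_FAST_MODE_MULTIPLIER,
    }


def seconds_to_generate(num_tokens: int, tokens_per_sec: float) -> float:
    """Time to emit num_tokens at a steady rate, ignoring startup latency."""
    return num_tokens / tokens_per_sec


if __name__ == "__main__":
    rates = implied_rates()
    for name, rate in rates.items():
        # 2,000 tokens is an arbitrary, illustrative response length.
        wait = seconds_to_generate(2_000, rate)
        print(f"{name}: {rate:.1f} tok/s, 2,000 tokens in {wait:.1f} s")
    print(f"spark vs opus_fast: {rates['spark'] / rates['opus_fast']:.1f}x")
```

On these assumptions, Opus 4.6's fast mode works out to about 170.5 tokens per second and Spark's predecessor to roughly 67, so Spark's quoted floor is a little under six times the fast-mode rate. Real-world throughput depends on prompt length, load, and time to first token, none of which this sketch models.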
“Cerebras has been a great engineering partner, and we’re excited about adding fast inference as a new platform capability,” OpenAI said.
Availability and Specifications
Codex-Spark is available as a research preview to ChatGPT Pro subscribers ($200 per month) through the Codex app, command-line interface, and VS Code extension. OpenAI is rolling out API access to select design partners.
The model has a 128,000-token context window and handles text only at launch.





