OpenAI and Broadcom Launch Jalapeño, Custom ASIC with 50% Cost Advantage Over GPUs

OpenAI's first chip went from initial design to tape-out in nine months and is set to run inference workloads by the end of 2026, a cycle that Hock Tan calls the fastest ever seen in an advanced ASIC.
OpenAI and Broadcom unveiled Jalapeño on Wednesday, the first AI accelerator designed by OpenAI specifically for language-model inference workloads. The chip is an ASIC, tailored to a single class of workloads, and it went from initial design to tape-out in nine months, a pace that Hock Tan, CEO of Broadcom, described as the fastest cycle ever achieved in a high-performance advanced semiconductor. The first units are expected to start operating by the end of 2026 as the first step in a multi-generational platform.
The number that matters to a CIO is a different one. In an interview following the announcement, Tan stated that early tests indicate cost savings of around 50% compared to traditional AI GPUs for the same inference tasks. As an ASIC, Jalapeño does not attempt to compete with the flexibility of a GPU: it covers fewer use cases, but with performance per watt that, according to the company, substantially exceeds the state of the art. For an operation that spends billions on compute every quarter just to serve ChatGPT at scale, cutting the unit cost of the inference layer in half changes the margin equation for OpenAI’s paid products before it alters any external roadmap.
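To make the margin arithmetic concrete, here is a minimal sketch in Python. The price and the GPU serving cost are hypothetical figures chosen only for illustration, not numbers disclosed by OpenAI or Broadcom; the only input taken from the announcement is the roughly 50% saving cited by Tan.

```python
def gross_margin(price: float, cost: float) -> float:
    """Return gross margin as a fraction of the selling price."""
    return (price - cost) / price


# Hypothetical figures, in dollars per million tokens (illustration only).
price_per_m_tokens = 10.00
gpu_cost_per_m_tokens = 6.00

# Roughly 50% saving, as cited by Tan for the same inference tasks.
savings = 0.50
asic_cost_per_m_tokens = gpu_cost_per_m_tokens * (1 - savings)

margin_gpu = gross_margin(price_per_m_tokens, gpu_cost_per_m_tokens)
margin_asic = gross_margin(price_per_m_tokens, asic_cost_per_m_tokens)

print(f"Gross margin on GPUs: {margin_gpu:.0%}")
print(f"Gross margin on ASIC: {margin_asic:.0%}")
```

Under these assumed figures, halving the serving cost lifts the gross margin from 40% to 70% with no change in price. The actual effect depends on how much of the real cost base is inference and on how much of the saving is passed through to customers.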
Why It Matters That the Cycle Was Nine Months
An ASIC of this magnitude typically takes between 18 and 24 months from RTL to validated silicon. Jalapeño's nine-month timeline, according to both companies, was made possible because OpenAI used its own models to accelerate design and optimization stages, particularly verification and design space exploration. This detail is non-trivial: it implies that the time-to-market advantage, historically a costly barrier to entry in the accelerator market, is yielding to those who operate their own models and possess symbolic computing capacity. Nvidia maintains an advantage in general-purpose architectures; however, it loses ground on this new axis: the race for verticalized, disposable silicon.
Broadcom enters as the execution partner. The company provides the expertise in physical implementation, manufacturing via TSMC, and the networking and SerDes IP stack that connects thousands of these chips into a coherent inference pod. The contract follows a model that Broadcom already executes with Google and Meta, but for the first time it has a client whose addressable business relies almost entirely on the silicon it is purchasing.
Who Loses with Jalapeño
Nvidia loses less than the headline suggests and more than its own narrative implies. Jalapeño does not replace training, which continues to rely on H100, H200, and Blackwell GPUs. However, inference is the side of the workload that grows with product adoption, and it is where the 50% savings mentioned by Tan compound with every token generated. For Nvidia, the most important takeaway is that its largest declared customer has begun sourcing part of its inference stack elsewhere, which changes the calculation of how long the data center revenue supercycle will last.
AMD and Intel find themselves even more exposed. Both the MI400 and Gaudi 3 are pitched on better cost-effectiveness in inference. That pitch shrinks in the face of an OpenAI capable of designing its own ASIC within a nine-month window and cutting TCO in half without outsourcing architecture design.
Global Takeaway: Two Markets Where the Effect Hits Quickly
In the United Kingdom, OpenAI maintains its second-largest engineering contingent outside the United States, with a dedicated office in London focusing, among other things, on enterprise applications. Banks such as Barclays and HSBC, which have already contracted dedicated ChatGPT Enterprise inference capacity for internal workflows, are likely to see Jalapeño arrive as a lower per-token price in corporate plans before it reaches any end customers. In Singapore, a regional hub where OpenAI processes demand from Southeast Asia, the same trend alters the economic viability of paid subscriptions for governments and central banks that previously rejected 2025 pricing.
Brazil feels it as a second-order effect, via pass-through. AI operations at Itaú, Bradesco, Stone, and Cosan run inference on OpenAI within BPO and customer service pipelines; halving the cost per token halves a cost curve that had been local CTOs' main argument for slowing negotiations.
The product decision announced on Wednesday is significant: OpenAI has gone from being purely a customer of its chip suppliers to also competing with them in part of the stack.