OpenAI’s Jalapeño chip aims to cut LLM serving costs
OpenAI and Broadcom built the Jalapeño ASIC for LLM inference. The design reached tape-out in nine months, and initial data-center deployment is planned by the end of 2026.
OpenAI and Broadcom developed the Jalapeño processor, an application-specific integrated circuit intended to lower the cost of running large language models. The design reached tape-out in nine months, and initial deployment into data centers is scheduled to begin by the end of 2026. TSMC will manufacture the wafers, and Celestica will assemble boards and rack systems.
OpenAI calls the chip its first ‘Intelligence Processor’ and says it is optimized for LLM inference rather than general-purpose AI tasks. Broadcom handled silicon engineering and high-performance networking integration, and the design embeds Broadcom’s Tomahawk networking silicon to support communication across large clusters.
OpenAI reported early lab samples running frontier workloads, including an unreleased GPT-5.3-Codex-Spark model, at the company’s target frequency and power levels. Richard Ho, head of OpenAI’s hardware program, noted the architecture “minimizes data movement to push realized utilization closer to its theoretical peak performance.”
Infrastructure costs are a stated driver for the project. OpenAI reported that it spent $8.4 billion last year to keep ChatGPT servers responsive. With about 900 million weekly users, the company projects roughly $14 billion in operational costs this year and has committed about $1.4 trillion to computing power over the next eight years. OpenAI reports about $25 billion in annual revenue and keeps approximately 33 cents of each revenue dollar as profit after expenses. The company contrasts that margin with estimates that high-end processor vendors such as Nvidia earn profit margins of roughly 75%.
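To put those figures on a common scale, the short Python sketch below derives a few per-unit numbers from them. The inputs are the values reported above; the derived quantities are back-of-the-envelope illustrations, not figures published by OpenAI. Two assumptions are worth flagging: the $1.4 trillion commitment is spread evenly across eight years, and last year's $8.4 billion is treated as comparable to this year's $14 billion projection, although the two may not cover the same cost categories.

```python
# Figures as reported in the article, in US dollars.
WEEKLY_USERS = 900e6          # about 900 million weekly users
PROJECTED_OPEX = 14e9         # projected operational costs this year
LAST_YEAR_SERVING = 8.4e9     # spent last year on ChatGPT serving
COMPUTE_COMMITMENT = 1.4e12   # committed to computing power
COMMITMENT_YEARS = 8
ANNUAL_REVENUE = 25e9
PROFIT_PER_DOLLAR = 0.33      # about 33 cents kept per dollar


def derived_figures():
    """Return illustrative ratios derived from the reported figures."""
    return {
        # Projected operating cost per weekly user, per year.
        "opex_per_weekly_user": PROJECTED_OPEX / WEEKLY_USERS,
        # Year-over-year growth, ASSUMING the two cost figures are comparable.
        "opex_growth": PROJECTED_OPEX / LAST_YEAR_SERVING - 1,
        # Average yearly commitment, ASSUMING an even split across the period.
        "avg_commitment_per_year": COMPUTE_COMMITMENT / COMMITMENT_YEARS,
        # Profit implied by the reported revenue and margin.
        "implied_annual_profit": ANNUAL_REVENUE * PROFIT_PER_DOLLAR,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions, the projected cost works out to roughly $15.56 per weekly user per year, and the compute commitment averages $175 billion per year, about seven times the reported annual revenue.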
OpenAI says the chip reduces data movement and balances compute, memory and networking to improve realized utilization. The company now manages chip architecture, software kernels, memory systems, network scheduling and application-layer serving to align hardware with its model roadmaps. Greg Brockman, president and co-founder, described Jalapeño as part of a “long-term full-stack infrastructure strategy to make compute more abundant” and added that designing more of the stack allows the firm to “serve more intelligence with greater efficiency.”
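The link between data movement and realized utilization can be illustrated with the standard roofline model, in which throughput is capped either by peak compute or by memory bandwidth multiplied by the work done per byte moved. The sketch below is a generic illustration, not a description of Jalapeño's design: the peak throughput, bandwidth, and workload numbers are hypothetical placeholders, since the article gives no specifications for the chip.

```python
def attainable_flops(peak_flops, mem_bandwidth, flops_per_byte):
    """Roofline bound: limited by compute or by memory traffic, whichever is lower."""
    return min(peak_flops, mem_bandwidth * flops_per_byte)


def realized_utilization(peak_flops, mem_bandwidth, flops, bytes_moved):
    """Fraction of peak compute a workload can reach under the roofline bound."""
    intensity = flops / bytes_moved  # arithmetic intensity, FLOPs per byte
    return attainable_flops(peak_flops, mem_bandwidth, intensity) / peak_flops


# HYPOTHETICAL accelerator and workload; these are not Jalapeño specifications.
PEAK_FLOPS = 1.0e15      # 1 PFLOP/s peak compute
MEM_BANDWIDTH = 4.0e12   # 4 TB/s memory bandwidth
WORK_FLOPS = 2.0e12      # FLOPs performed by some kernel

# Same work, but the second variant moves half as many bytes.
baseline = realized_utilization(PEAK_FLOPS, MEM_BANDWIDTH, WORK_FLOPS, bytes_moved=2.0e10)
reduced = realized_utilization(PEAK_FLOPS, MEM_BANDWIDTH, WORK_FLOPS, bytes_moved=1.0e10)

if __name__ == "__main__":
    print(f"baseline utilization: {baseline:.0%}")
    print(f"with half the data movement: {reduced:.0%}")
```

With these placeholder numbers, halving the bytes moved doubles the attainable fraction of peak, from about 40% to about 80%, because the workload remains memory-bound in both cases. Once a workload becomes compute-bound, further reductions in data movement stop improving utilization in this model.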
The engineering teams shortened the development timeline by using OpenAI’s own language models to automate parts of the hardware design process. OpenAI says that approach created a feedback loop in which models used in production help optimize future infrastructure.
OpenAI enters a market where other large firms have invested in custom silicon. Google began deploying Tensor Processing Units in 2015 and now controls about a quarter of global AI computing capacity outside Nvidia’s supply chain. Amazon has deployed over one million custom chips, while Meta and Microsoft continue to scale their infrastructure. Broadcom CEO Hock Tan confirmed the Jalapeño rollout will expand with infrastructure partners, including Microsoft, to prepare for gigawatt-scale data center integration.
OpenAI plans initial deployment by the end of 2026, with broader scaling to follow as rack-level systems and partners are brought online. The company expects lower serving costs to free capital for reinvestment in future generations of infrastructure.