IT Brief New Zealand - Technology news for CIOs & IT decision-makers
OpenAI & Broadcom unveil Jalapeño AI inference chip

Thu, 25th Jun 2026
Sean Mitchell, Publisher

OpenAI and Broadcom have unveiled Jalapeño, an inference chip for large language models. It is the first processor in a multi-generation computing platform the companies are developing together.

The chip marks OpenAI's first move into custom silicon, extending its infrastructure work beyond models and products into hardware design. Broadcom worked with OpenAI on silicon implementation and networking, while Celestica contributed board, rack and system integration.

Engineering samples are already running machine learning workloads in the lab at target production frequency and power levels, according to the companies. Those tests include GPT-5.3-Codex-Spark, although OpenAI said it is still measuring final performance.

Early testing indicates Jalapeño will deliver performance per watt above the current state of the art. The companies said the architecture is designed to reduce data movement and balance computing, memory and networking resources so actual utilisation is closer to theoretical peak performance.

Jalapeño was designed specifically for large language model inference rather than adapted from older AI workloads. OpenAI said the design reflects how it runs the systems behind ChatGPT, Codex, its API and planned agentic products, while remaining flexible enough to support current and future large language models across the wider industry.

Stack strategy

The launch gives OpenAI a clearer role in the hardware needed to serve its models at scale. It said it is now working across chip architecture, kernels, memory systems, networking, scheduling, deployment systems and product design, arguing that tighter control of each layer should improve efficiency and lower the cost of serving AI.

That is a significant step for a company better known for model development and consumer-facing applications. Custom chips could also reduce dependence on established AI accelerator suppliers if the programme reaches full-scale deployment.

Greg Brockman, President and Co-Founder of OpenAI, outlined the company's broader rationale for the effort.

"The world is moving to a compute-powered economy," Brockman said. "Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems. By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access."

Richard Ho, who leads OpenAI's hardware programme, said the chip was built around close analysis of the demands of advanced AI inference.

"Jalapeño was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers," Ho said. "We optimized the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models. Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware's theoretical limits."

Nine-month cycle

One of the more striking details is the reported pace of development. OpenAI and Broadcom said the chip moved from initial design to manufacturing tape-out in nine months, which they described as the fastest ASIC development cycle yet achieved in advanced semiconductors.

The companies attributed that speed to close software and hardware co-development, Broadcom's chip design expertise, and the use of OpenAI's own models in parts of the design and optimisation process. In effect, AI tools used in customer products were also applied to improve the infrastructure needed to run later generations of those models.

Broadcom said its role extended beyond implementation to include networking technologies such as Tomahawk silicon, which will support production deployment. It is positioning the partnership as part of a longer-term AI infrastructure build-out rather than a one-off chip project.

"Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the next decade of AI," said Hock Tan, President and CEO of Broadcom. "This is just the beginning of a multi-generation roadmap. By co-developing our industry-leading silicon directly with OpenAI, we are enabling the deployment of gigawatt-scale data centres with Microsoft and other partners beginning in 2026."

Deployment plans

Jalapeño is intended as the first generation of a broader computing platform to be deployed with data centre partners at gigawatt scale. OpenAI said the goal is to improve the speed, reliability and cost of inference, the stage at which trained AI models generate answers for users.

That focus reflects a shift in the economics of AI. As usage grows across chatbots, coding tools and application programming interfaces, inference workloads are becoming a larger share of computing demand. That is prompting model developers to look more closely at custom hardware that can reduce operating costs and improve responsiveness.

OpenAI linked the chip directly to user-facing services, saying efficiency gains could translate into faster ChatGPT responses, lower-cost API products and more dependable access during periods of high demand. It also said stronger infrastructure would support a cycle in which lower compute costs improve model training and serving, which in turn supports better products and more revenue for reinvestment.

For Broadcom, the project adds another high-profile partner in the race to supply the infrastructure underpinning the AI market. For OpenAI, it signals a deeper move into the physical layer of AI computing, with the company saying inference is where AI reaches people.