OpenAI and Broadcom Reveal Jalapeno, a Custom AI Chip for Efficient Large Language Model Inference, Launching in Late 2026
OpenAI and Broadcom have unveiled Jalapeno, a custom silicon processor optimized for large language model inference. According to a joint statement from the two companies, it will be the first generation of a multi-generation compute platform. The chip is designed to improve the performance and efficiency of running advanced AI models, with physical deployment in data centers planned for late 2026.
The custom application-specific integrated circuit (ASIC) went from architecture to manufacturing tape-out in only nine months. This is regarded as one of the fastest timelines any industry leader has achieved for an advanced high-performance semiconductor. OpenAI used its own AI models to help engineers design and optimize the silicon layout, while Broadcom contributed silicon implementation and networking technologies, including its Tomahawk networking silicon. Celestica integrated the board and rack system for scalable production environments.
Engineering samples of the chip are now running machine learning workloads in the lab at the target frequency and power. These test workloads include the next-generation GPT 5.3 Codex Spark model. Although final benchmarks are yet to be completed, early lab tests indicate that the chip delivers far better performance per watt than the current state-of-the-art accelerators. Its architecture is designed to balance compute, memory and networking resources across the chip.
Unlike general-purpose GPU technology that has been adapted for AI workloads, Jalapeno is an entirely new design from architects building specifically for today's inference workloads. By designing the hardware around its own software, OpenAI intends to reduce the cost of operating consumer products such as ChatGPT and its developer API tools. According to Greg Brockman, President of OpenAI:
"Jalapeno is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems."
According to Richard Ho, who heads the hardware program at OpenAI, the team optimized the chip architecture by focusing directly on the kernels, memory movement and serving patterns that drive the performance of frontier models. Hock Tan, CEO of Broadcom, said the silicon partnership will enable gigawatt-scale data center facilities, which are to be deployed in partnership with Microsoft and other operators starting in 2026.
