OpenAI and Broadcom Reveal Jalapeno LLM Inference Chip

OpenAI and Broadcom Reveal Jalapeno Custom AI Chip for Efficient Large Language Model Inference Performance Launching in Late 2026

OpenAI and Broadcom have revealed plans for Jalapeno, a custom silicon processor optimized for large language model inference. According to a joint statement from the two companies, it will be the first generation of a multi-generation compute platform. The chip is designed to improve the performance and efficiency of running advanced AI models, with physical deployment in data centers planned for late 2026.

The custom application-specific integrated circuit (ASIC) went from architecture to manufacturing tape-out in only nine months. This is regarded as one of the fastest timelines achieved by any industry leader for an advanced high-performance semiconductor. OpenAI used its own artificial intelligence to help engineers design and optimize the silicon layout, while Broadcom contributed silicon implementation and networking technologies, including its Tomahawk networking silicon. Celestica handled integration of the board and rack system for scalable production environments.

Engineering samples of the chip are now running machine learning workloads in lab tests at the target frequency and power. The next-generation GPT 5.3 Codex Spark model is included in these test workloads. Although final benchmarks are yet to be completed, early lab tests indicate that the chip delivers far better performance per watt than the current state-of-the-art accelerators. Its architecture is designed to balance compute, memory and networking resources across the chip.

Unlike general-purpose GPU technology that has been adapted for AI workloads, Jalapeno is an entirely new design from architects building specifically for today's inference. By building the hardware around its own software, OpenAI intends to lower the cost of operating consumer products like ChatGPT and its developer API tools. According to Greg Brockman, President of OpenAI:

"Jalapeno is part of our long term full stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems."

According to Richard Ho, who heads the hardware program at OpenAI, the team optimized the chip architecture by focusing directly on the kernels, memory movement and serving patterns that drive the performance of frontier models. Hock Tan, CEO of Broadcom, said that the silicon partnership will enable gigawatt-scale data center facilities, which are to be deployed in partnership with Microsoft and other operators starting in 2026.

Technetbook | The Tech Experts
