Microsoft Toolkit Enables NVIDIA CUDA Models to Run on AMD AI GPUs for Cheaper AI Inference

Microsoft is developing a toolkit that translates NVIDIA CUDA models for use on AMD's AI GPUs via the ROCm stack, aiming to reduce the cost of AI inference.
Microsoft Builds Toolkits for Running NVIDIA CUDA Models on AMD AI GPUs

Microsoft is developing toolkits that allow NVIDIA CUDA-based models to run on AMD's AI GPUs. The effort is an attempt to lessen dependence on NVIDIA's software ecosystem, driven by a tremendous surge in demand for AI inference workloads and by the search for more cost-efficient hardware.

The Dilemma of NVIDIA's CUDA Dominance

NVIDIA's hold on the AI market rests on what is often called "CUDA lock-in": the industry depends so heavily on the CUDA software ecosystem that developers and cloud service providers have had little practical choice but NVIDIA hardware. Breaking that dependency is genuinely difficult.

Microsoft's Solution: A CUDA-to-ROCm Translation Toolkit

According to a senior Microsoft employee, the company has built tools to address the issue. The toolkit, referred to as a "CUDA-to-ROCm Translation Toolkit," translates CUDA code into a form that can run on AMD hardware through AMD's ROCm software stack.

He added, "We built some toolkits to help convert like CUDA models to ROCm so you could use it on an AMD, like a 300X. We have had a lot of inquiries about what is our path with AMD, the 400X and the 450X."
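Microsoft has not published details of how its toolkit works, but source-level CUDA-to-ROCm translation is a known technique: AMD's own hipify tools rewrite CUDA API calls and kernel launches into equivalent HIP calls, which ROCm then compiles for AMD GPUs. The sketch below shows what such a translation looks like for a simple SAXPY kernel; the names and structure are illustrative, not Microsoft's actual output.

```cpp
// Illustrative CUDA-to-HIP source translation for a SAXPY kernel, in the
// spirit of AMD's hipify tools (not Microsoft's unreleased toolkit).
//
// CUDA original, as a developer would have written it for NVIDIA GPUs:
//   saxpy<<<grid, block>>>(n, 2.0f, d_x, d_y);
//   cudaDeviceSynchronize();
//
// HIP/ROCm translation:
#include <hip/hip_runtime.h>

// Kernel bodies usually survive translation unchanged: HIP keeps CUDA's
// __global__ qualifier, the blockIdx/blockDim/threadIdx built-ins, and
// the same grid semantics.
__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

void run_saxpy(int n, float* d_x, float* d_y) {
    dim3 block(256);
    dim3 grid((n + block.x - 1) / block.x);
    // The triple-chevron launch becomes hipLaunchKernelGGL(kernel, grid,
    // block, dynamicSharedBytes, stream, args...).
    hipLaunchKernelGGL(saxpy, grid, block, 0, 0, n, 2.0f, d_x, d_y);
    hipDeviceSynchronize();  // direct replacement for cudaDeviceSynchronize
}
```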

There could also be runtime compatibility layer methods to achieve such translations. One such example is the ZLUDA tool, which intercepts CUDA API calls and translates them to ROCm on the fly without requiring a complete rewrite of the source code.
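ZLUDA's actual implementation targets the CUDA driver API and is far more involved, but the interception idea can be sketched in a few lines: a shared library exports CUDA runtime symbols and forwards each call to the matching HIP function, which runs on ROCm. The sketch below is illustrative, covers only three calls, and leans on the fact that the HIP and CUDA enum values and error codes line up for these particular functions.

```cpp
// Minimal sketch of a runtime interception layer, in the spirit of ZLUDA
// (not its actual implementation). Built as a shared library, it exports
// CUDA runtime symbols and forwards each call to the HIP equivalent.
#include <hip/hip_runtime.h>

extern "C" int cudaMalloc(void** ptr, size_t size) {
    // hipMalloc mirrors cudaMalloc's signature, and both APIs use 0 for
    // success, so the error code can pass straight through.
    return static_cast<int>(hipMalloc(ptr, size));
}

extern "C" int cudaFree(void* ptr) {
    return static_cast<int>(hipFree(ptr));
}

extern "C" int cudaMemcpy(void* dst, const void* src, size_t size, int kind) {
    // The common cudaMemcpyKind values (HostToDevice = 1, DeviceToHost = 2,
    // DeviceToDevice = 3) use the same numbering as hipMemcpyKind.
    return static_cast<int>(hipMemcpy(dst, src, size,
                                      static_cast<hipMemcpyKind>(kind)));
}
```

Preloaded into an unmodified CUDA binary (for example via LD_PRELOAD on Linux), such a shim lets the program's CUDA runtime calls resolve against ROCm with no source rewrite. The hard part in practice is covering the hundreds of remaining API entry points, and the cases where no ROCm equivalent exists at all.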

Remaining Difficulties and Limitations

This is not a trouble-free exercise. AMD's ROCm is a less mature software stack than CUDA, and some CUDA code paths and API calls have no direct ROCm equivalent. Falling back on partial translations can significantly degrade performance, a serious risk in large-scale datacenter deployments. Microsoft's toolkits do not yet appear to be intended for general use.

Cost Pressure: The Push for Economical AI Inference

These efforts reflect the changing shape of AI workloads. Microsoft is seeing a sharp increase in demand for inference (running trained AI models) relative to training. For such workloads, AMD's AI GPUs can be considerably cheaper than their NVIDIA counterparts. But because most inference environments are built around CUDA models, getting those models to behave reliably after translation to ROCm is a prerequisite for Microsoft to take full advantage of AMD's hardware and lower its costs.
