Qualcomm Unveils AI200 & AI250 Chips to Redefine Rack-Scale AI Inference
The AI200 will be available commercially in 2026, with the AI250 following in 2027.
Qualcomm Inc. is making a bold entry into the data-centre AI inference market with the launch of its next-generation solutions, the AI200 and AI250, designed to deliver rack-scale performance with high memory capacity, efficiency and reduced cost of ownership.
Built on Qualcomm’s neural-processing-unit (NPU) expertise, the AI200 solution offers 768 GB of LPDDR memory per card and is engineered for large-language-model (LLM) and multimodal-model inference workloads.
The AI250, arriving in 2027, introduces a “near-memory-computing” architecture that delivers more than 10× higher effective memory bandwidth and lower power consumption.
“With Qualcomm AI200 and AI250, we’re redefining what’s possible for rack-scale AI inference. These innovative new AI infrastructure solutions empower customers to deploy generative AI at unprecedented TCO, while maintaining the flexibility and security modern data centres demand,” said Durga Malladi, Senior Vice President and General Manager at Qualcomm Technologies, in a press release.
Both rack solutions support direct liquid cooling, PCIe scaling, Ethernet connectivity, and confidential computing for secure workloads, and they consume approximately 160 kW per rack.
Qualcomm’s move places it in direct competition with dominant AI-chip players such as NVIDIA and AMD, as the data-centre market shifts toward full-rack solutions built for generative AI. In one sign of traction, Saudi Arabia-based AI company HUMAIN has committed to deploying approximately 200 MW of Qualcomm’s rack solutions starting next year.
Despite the hype, Qualcomm’s announcement left analysts unimpressed: the company disclosed no FLOPS figures, pricing, chips-per-rack details, or benchmarks.
Critics noted that the launch was “all vibes,” offering buzzwords like “low TCO” and “confidential computing” without real performance data.