
What’s Next for AI on Devices? Ask Qualcomm

Qualcomm held a one-day analyst event in San Diego where we all learned about their AI research. Pretty awesome, but the big news is yet to come: a new Oryon-based Snapdragon is expected this fall, and maybe a new Cloud AI 100 next year.

Qualcomm was pretty clear last week that its lead in AI is strong and getting stronger. Most of what the company discussed was familiar, and we've covered a lot of it here and elsewhere. But there were also a few hints that big updates are coming.

Qualcomm AI: Bigger is better if it can be made smaller.

While the industry is in a frenzy to make AI ever bigger, with mixture-of-experts models exceeding a trillion parameters, Qualcomm has been busy squeezing these massive models down to fit on a mobile device, a robot, or a car. The company's view: you can always go back to the cloud when you need bigger AI, but the pot of gold is already in your hand: your phone.

Qualcomm has invested in five areas that allow these massive models to slim down. While most AI developers are familiar with quantization and compression, distillation is newer and really cool: a smaller "student" model learns to mimic a larger "teacher" model but runs on a phone. Speculative decoding is also gaining popularity. In short, smaller models can be much more affordable than massive models while still delivering the quality you need.
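Of these techniques, speculative decoding is perhaps the least intuitive. Here is a minimal sketch of the general idea, using hypothetical toy stand-ins for the draft and target models; this illustrates the technique in the abstract, not Qualcomm's actual implementation:

```python
# Toy sketch of speculative decoding. A cheap "draft" model proposes several
# tokens at once; the expensive "target" model verifies them in a single pass
# and keeps the longest prefix it agrees with. Both models here are
# hypothetical stand-ins for illustration only.

def draft_model(prefix):
    """Cheap stand-in: proposes the next few tokens (hypothetical)."""
    return ["the", "pot", "of", "gold"]

def target_model(prefix, proposed):
    """Expensive stand-in: one True/False verdict per proposed token."""
    reference = ["the", "pot", "of", "silver"]
    return [t == r for t, r in zip(proposed, reference)]

def speculative_step(prefix):
    proposed = draft_model(prefix)
    verdicts = target_model(prefix, proposed)
    accepted = []
    for tok, ok in zip(proposed, verdicts):
        if not ok:
            break  # first disagreement: stop and let the target model take over
        accepted.append(tok)
    return prefix + accepted

print(speculative_step(["follow"]))  # → ['follow', 'the', 'pot', 'of']
```

The win is that the large model checks a whole batch of proposed tokens in one forward pass instead of generating them one at a time, which is exactly the kind of latency savings that matters on a phone.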

Qualcomm has presented data showing that its optimized 8-billion-parameter Llama 3 model can deliver the same quality as the 175-billion-parameter GPT-3.5 Turbo model.

All of this AI is available to developers in the Qualcomm AI Hub, which we wrote about here, and it runs really fast on phones with the Snapdragon 8 Gen 3 processor. Advocates have said that this tiny chip, which uses less power than an LED light bulb, can generate AI images 30 times more efficiently than cloud data center infrastructure.

More interestingly, Qualcomm has confirmed that it will announce its next step up in Snapdragon SoCs in the fall, and that it will be based on the same Oryon cores that power its laptop offering, the Snapdragon X Elite. Stay tuned!

Data Center: Qualcomm is Just Getting Started

The Cloud AI 100 Ultra has been racking up wins lately, with nearly every server company providing support, as well as public clouds like AWS. Cerebras, the company that brought us the Wafer Scale Engine, has partnered with Qualcomm, naming the Cloud AI 100 its preferred inference platform. NeuReality also chose the Cloud AI 100 Ultra as the deep learning accelerator in its CPU-less inference device.

The reason for all this attention is simple: the Cloud AI 100 runs all the AI applications you could possibly need, with very little power consumption. And the PCIe card can support models of up to 100 billion parameters, thanks in part to the card's larger DRAM. Net-net: the Qualcomm Cloud AI 100 Ultra delivers two to five times more performance per dollar than its competitors in generative AI, LLM, NLP, and compute workloads.

And for the first time we know of, a Qualcomm engineer has confirmed the company is working on a new version of the Cloud AI 100, likely using the same Oryon cores as the X Elite, the cores Qualcomm gained when it acquired Nuvia. We expect this third generation of Qualcomm's data center inference engine to focus on generative AI. The Ultra laid a solid foundation for Qualcomm, and the next-generation platform could be a significant additional business for the company.

Automotive

Qualcomm recently said its automotive pipeline had grown to $30 billion thanks to its Snapdragon Digital Chassis, up more than $10 billion since it reported third-quarter results last July, and more than double the size of Nvidia’s automotive pipeline, which the company said would be about $14 billion in 2023.

Conclusions

We recently said that Qualcomm is becoming an AI powerhouse on the edge, and a session with Qualcomm executives last week solidified our position. The company has been researching AI for more than a decade, with the foresight to recognize that it could leverage AI to its advantage over Apple. Now, it is turning that research into products in silicon and software, and has made them available to developers in the new AI Hub. The addition of automotive application processing and data center silicon drives revenue into new markets and diversifies the company beyond its roots in modems and the Snapdragon mobile business. Finally, the company doubled down on its Nuvia acquisition, despite a two-year licensing dispute with Arm.

Disclosures: This article expresses the views of the author and should not be taken as advice to buy from or invest in the companies mentioned. Cambrian-AI Research is fortunate to have many, if not most, semiconductor companies as clients, including Blaize, BrainChip, Cadence Design, Cerebras, D-Matrix, Eliyan, Esperanto, GML, Groq, IBM, Intel, NVIDIA, Qualcomm Technologies, Si-Five, SiMa.ai, Synopsys, Ventana Microsystems, Tenstorrent, and dozens of investment clients. We do not have any investment positions in any of the companies mentioned in this article and do not plan to have any in the near future. For more information, visit our website at https://cambrian-AI.com.