
What’s Next for AI on Devices? Ask Qualcomm

Qualcomm held a one-day analyst event in San Diego where we all learned about their AI research. Pretty awesome, but the big news is yet to come: a new Oryon-based Snapdragon is expected this fall, and maybe a new Cloud AI 100 next year.

Qualcomm was pretty clear last week that its lead in AI is strong and getting stronger. Most of what the company discussed was familiar, and we've covered a lot of it here and elsewhere. But there were also a few hints that big updates are coming.

Qualcomm AI: Bigger is better if it can be made smaller.

While the industry is in a frenzy to make AI ever bigger, with mixture-of-experts models exceeding a trillion parameters, Qualcomm has been busy squeezing these massive models down to fit on a mobile device, a robot, or a car. The company's view: you can always go back to the cloud when you need bigger AI, but the pot of gold is already in your hand: your phone.

Qualcomm has invested in five areas that allow these massive models to slim down. While most AI developers are familiar with quantization and compression, distillation is newer and really cool: a smaller "student" model learns to mimic a larger "teacher" model but runs on a phone. Speculative decoding is also gaining popularity. In short, smaller models can be much more affordable than massive models while still delivering the quality you need.
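Of these techniques, speculative decoding is perhaps the least intuitive. Here is a minimal sketch of the general idea, using hypothetical toy stand-ins for the draft and target models; this illustrates the technique in the abstract, not Qualcomm's actual implementation:

```python
# Toy sketch of speculative decoding. A cheap "draft" model proposes several
# tokens at once; the expensive "target" model verifies them in a single pass
# and keeps the longest prefix it agrees with. Both models here are
# hypothetical stand-ins for illustration only.

def draft_model(prefix):
    """Cheap stand-in: proposes the next few tokens (hypothetical)."""
    return ["the", "pot", "of", "gold"]

def target_model(prefix, proposed):
    """Expensive stand-in: one True/False verdict per proposed token."""
    reference = ["the", "pot", "of", "silver"]
    return [t == r for t, r in zip(proposed, reference)]

def speculative_step(prefix):
    proposed = draft_model(prefix)
    verdicts = target_model(prefix, proposed)
    accepted = []
    for tok, ok in zip(proposed, verdicts):
        if not ok:
            break  # first disagreement: stop and let the target model take over
        accepted.append(tok)
    return prefix + accepted

print(speculative_step(["follow"]))  # → ['follow', 'the', 'pot', 'of']
```

The win is that the large model checks a whole batch of proposed tokens in one forward pass instead of generating them one at a time, which is exactly the kind of latency savings that matters on a phone.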

Qualcomm has presented data showing that its optimized 8-billion-parameter Llama 3 model can deliver the same quality as the 175-billion-parameter GPT-3.5 Turbo model.

All of this AI is available to developers in the Qualcomm AI Hub, which we wrote about here, and it runs really fast on phones with the Snapdragon 8 Gen 3 processor. Advocates have said that this tiny chip, which uses less power than an LED light bulb, can generate AI images 30 times more efficiently than cloud data center infrastructure.

More interestingly, Qualcomm has confirmed that it will announce its next step up in Snapdragon SoCs in the fall, and that it will be based on the same Oryon cores that power its laptop offering, the Snapdragon X Elite. Stay tuned!

Data Center: Qualcomm is Just Getting Started

The Cloud AI 100 Ultra has been racking up wins lately, with nearly every server company providing support, as well as public clouds like AWS. Cerebras, the company that brought us the Wafer Scale Engine, has partnered with Qualcomm, naming the Cloud AI 100 its preferred inference platform. NeuReality also chose the Cloud AI 100 Ultra as the deep learning accelerator in its CPU-less inference device.

The reason for all this attention is simple: the Cloud AI 100 runs all the AI applications you could possibly need, with very little power consumption. And the PCIe card can support models of up to 100 billion parameters, thanks in part to the card's larger DRAM. Net-net: the Qualcomm Cloud AI 100 Ultra delivers two to five times more performance per dollar than its competitors in generative AI, LLM, NLP, and compute workloads.

And for the first time we know of, a Qualcomm engineer has confirmed the company is working on a new version of the Cloud AI 100, likely using the same Oryon cores as the X Elite, the cores Qualcomm gained when it acquired Nuvia. We expect this third generation of Qualcomm's data center inference engine to focus on generative AI. The Ultra laid a solid foundation for Qualcomm, and the next-generation platform could be a significant additional business for the company.

Automotive

Qualcomm recently said its automotive pipeline had grown to $30 billion thanks to its Snapdragon Digital Chassis, up more than $10 billion since it reported third-quarter results last July, and more than double the size of Nvidia’s automotive pipeline, which the company said would be about $14 billion in 2023.

Conclusions

We recently said that Qualcomm is becoming an AI powerhouse on the edge, and a session with Qualcomm executives last week solidified our position. The company has been researching AI for more than a decade, with the foresight to recognize that it could leverage AI to its advantage over Apple. Now, it is turning that research into products in silicon and software, and has made them available to developers in the new AI Hub. The addition of automotive application processing and data center silicon drives revenue into new markets and diversifies the company beyond its roots in modems and the Snapdragon mobile business. Finally, the company doubled down on its Nuvia acquisition, despite a two-year licensing dispute with Arm.

Disclosures: This article expresses the views of the author and should not be taken as advice to buy from or invest in the companies mentioned. Cambrian-AI Research is fortunate to have many, if not most, semiconductor companies as clients, including Blaize, BrainChip, Cadence Design, Cerebras, D-Matrix, Eliyan, Esperanto, GML, Groq, IBM, Intel, NVIDIA, Qualcomm Technologies, Si-Five, SiMa.ai, Synopsys, Ventana Microsystems, Tenstorrent, and dozens of investment clients. We do not have any investment positions in any of the companies mentioned in this article and do not plan to have any in the near future. For more information, visit our website at https://cambrian-AI.com.