Potato Codex

May 4, 2025

Reading: LM Studio for Ryzen AI

So I've been down a rabbit hole today.

It started with a simple question: what's the best way to run local LLMs on Windows without an NVIDIA GPU? One search led to another, and before I knew it I had fifteen tabs open, all about AMD Ryzen AI and LM Studio.

How It Started
I'd already tried LM Studio on a basic Intel machine. CPU-only inference. It worked, technically — but "technically working" and "actually usable" are two different things. Waiting six seconds between each sentence gets old fast.
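For context on what "using LM Studio" means here: beyond the chat UI, it can run a local server that speaks the OpenAI-compatible HTTP API (by default on `http://localhost:1234`). Here's a minimal sketch of talking to that local server from Python — the model name and prompt are placeholders, and it assumes you've already loaded a model and started the server:

```python
import json
import urllib.request

def build_chat_request(prompt, model="local-model", max_tokens=128):
    """Build an OpenAI-style chat-completion payload for LM Studio's local server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "stream": False,
    }

def ask_local_llm(prompt, url="http://localhost:1234/v1/chat/completions"):
    """POST the request to LM Studio's default endpoint and return the reply text."""
    payload = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

The nice part is that any OpenAI-client code works unchanged against this endpoint, whether the backend underneath is a CPU, an iGPU, or an NPU.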

So I started looking at what better hardware options look like. And AMD Ryzen AI kept coming up.

What I Found While Reading
AMD and LM Studio have an actual partnership. There's a dedicated portal — lmstudio.ai/ryzenai — which I did not expect. This isn't just "it works on AMD hardware." They've built something together, optimized specifically for Ryzen AI chips.

The more I read, the more I realized how different this is from generic CPU inference. The Ryzen AI chips have three compute engines working together — the CPU, the integrated Radeon GPU, and a dedicated NPU. LM Studio on Windows uses the Vulkan backend to push work onto the iGPU, while the NPU helps with first-token latency. It's an actual architecture decision, not just "more cores go faster."

The Numbers That Got My Attention
On a Ryzen AI 9 HX 375 — a laptop chip — LM Studio hit over 50 tokens per second on a 1B model. That's readable speed. That feels fast in real use.
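A quick sanity check on the "readable speed" claim. Using the common rule of thumb of roughly 0.75 English words per token (my assumption, not a measured figure), 50 tokens per second comfortably outruns typical silent-reading speed of around 4–5 words per second:

```python
def words_per_second(tokens_per_second, words_per_token=0.75):
    # Rough conversion: ~0.75 English words per token (rule-of-thumb assumption).
    return tokens_per_second * words_per_token

gen_speed = words_per_second(50)   # ~37.5 words/s generated
read_speed = 4.5                   # typical silent reading, words/s (assumption)
print(gen_speed / read_speed)      # the model outruns the reader by ~8x
```

So at 50 tok/s the text arrives faster than you can read it, which is exactly the point where local inference stops feeling like waiting.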

And then there's the Ryzen AI Max+ 395 with 128GB unified memory. That machine can run 70B models. On a laptop. On Windows. That kind of number stops you mid-scroll.

The Part That Made Me Think
Reading about Variable Graphics Memory (VGM) — where the iGPU can dynamically borrow from system RAM — made me realize how much headroom the AMD platform has compared to what I was using before. On my Intel machine, the model either fits or it doesn't. On Ryzen AI Max+, you're working with up to 96GB of graphics-addressable memory. That's a completely different problem space.
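A back-of-envelope way to see why 96GB changes the problem space: a model's weight footprint is roughly parameter count times bits per weight, divided by 8, ignoring KV cache and runtime overhead. These are my own rough estimates, not vendor figures:

```python
def weight_footprint_gb(params_billion, bits_per_weight):
    """Approximate size of model weights in GB (weights only;
    KV cache and runtime overhead come on top)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# A 70B model at 4-bit quantization:
print(weight_footprint_gb(70, 4))    # 35.0 GB -> fits in 96 GB graphics-addressable memory
# The same model at 16-bit:
print(weight_footprint_gb(70, 16))   # 140.0 GB -> does not fit
```

That's the whole story of the Max+ 395 in two lines: quantized 70B weights land around 35GB, which is hopeless on an 8GB discrete card but routine when the iGPU can borrow up to 96GB of system RAM.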

Where I Am Now
Still reading. Still planning. No new hardware yet — but the research is getting serious.

If you're in the same situation — running local LLMs on Windows, frustrated with CPU-only speeds, not ready to drop NVIDIA GPU money — the AMD Ryzen AI platform is worth a serious look. The ecosystem around it on Windows is more mature than I expected.

More updates as this rabbit hole continues.


About Potato Codex

I'm Vicky, a solutions manager; robotics, AI & EV builder; researcher and entrepreneur 🇮🇩