Oliver Servín

January 29, 2026

Why I pay for imperfect AI (and you should too)

I'll be honest: I was staring at my dashboard today, watching the token speed fluctuate between 25 and 45 tokens per second. It was peak hours, and I could feel the frustration building.

The thought crossed my mind again: Why do I put up with this? Why not just pay for Claude Code's $200/month plan and get consistent 40+ tokens per second?

It's a fair question. But here's the thing: after months of using GLM 4.7 with coding agents, I've realized something that most developers aren't talking about.

The perfect model you can't afford is worth less than the imperfect one you can actually use.

What I wish someone had told me earlier

I used to believe I needed the "best" model. I thought anything less was settling. But then I stumbled across some benchmarks that surprised me.

GLM 4.7 scores 84.9% on LiveCodeBench V6. Claude Sonnet 4.5? 64.0%.

On LiveBench, GLM 4.7 hits 58.09 while Claude Opus 4.1 sits at 54.45.

Here's what I wish someone had told me earlier: the narrative that "you need proprietary models" is based on marketing, not reality. Open source models aren't catching up anymore - they're already here. And in some benchmarks, they're winning.

The 90% rule I've learned

I'll be real with you: GLM 4.7 isn't perfect. There are moments when it struggles with complex custom rules, like the test case where a developer tried to rebuild a Hawaiian Rummy game and the model kept reverting to standard rummy rules. After 38 prompts and several hours, the game logic was still broken.

But here's what I've found in my day-to-day work with coding agents:

90% of what I do is routine: generating files, refactoring code, writing documentation, creating tests, prototyping features. And in that 90%, GLM 4.7 performs beautifully.

The 10% where it struggles? I can handle that manually. But I can't manually handle 90% of my workload.

I've learned to stop obsessing over edge cases that might never happen. The perfect model that costs $200/month and I can't afford to use freely helps no one. The imperfect model that handles my actual work every day? That's invaluable.

The mental game of API pricing

I'll admit something that might sound silly: API pricing stressed me out.

When I was using GitHub Copilot at $10/month, I only had access to GPT-4.1 with limited usage. Every time I thought about prompting the AI, I'd do a quick mental calculation: Is this worth it? Will I hit my limit?

That anxiety kills creativity. It makes you second-guess yourself. It turns "let me try this idea" into "is this idea worth $0.03?"

When I switched to Z.ai's subscription model, something changed. I paid $113.40 for a full year during Black Friday - that's $9.45/month. Unlimited GLM 4.7 usage.
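The math here is simple enough to check in a couple of lines. A minimal sketch, using only the figures quoted in this post (the $113.40 annual subscription and the $200/month plan):

```python
# Rough cost comparison using the prices quoted in this post:
# a $113.40/year subscription vs. a $200/month proprietary plan.
annual_sub = 113.40
monthly_sub = annual_sub / 12          # effective monthly price
claude_monthly = 200.00
cost_ratio = monthly_sub / claude_monthly

print(f"${monthly_sub:.2f}/month, {cost_ratio:.1%} of the $200 plan")
```

That works out to $9.45/month, under 5% of the $200 plan.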

The mental overhead of counting tokens disappeared.

I've found that I build differently when I don't have to worry about cost. I iterate faster. I test more ideas. I let the coding agent take on more tasks because I'm not constantly calculating the price tag of each prompt.

Here's the thing: I write more code, get more value, and pay less - all because I stopped counting tokens.

Why open source matters to me

I'll share why this isn't just about price for me.

I started my career as a software developer. Along the way, I met some incredible people in the open source community. People who built tools, shared code, and helped others without expecting anything in return.

That experience shaped how I think about software. When I pay Z.ai, I'm not just buying access to a model. I'm supporting a team that develops their own AI models AND open sources them. That matters.

I'm not saying open source is perfect. But I'm paying today so tomorrow's developers have choices. If everyone pays for $200/month proprietary models, open source dries up. And then who can afford to build?

I've come to see it this way: supporting open source isn't charity or ideology. It's investing in competition. And competition is what keeps AI at $9/month instead of $200/month.

What really matters

After months of using this setup, here's what I've learned:

GLM 4.7 hits 25-45 tokens per second during peak hours. Claude Max might hit a consistent 40+. That speed difference matters less than you think. What matters more is that I can use GLM 4.7 all day, every day, without checking my bank account or watching a usage meter.

Coding agents amplify your ideas. They speed up your work. They help you build things faster than you could alone. But amplification costs shouldn't limit your ideation.

I get about 95% of Claude's value at roughly 5% of the cost. The speed fluctuates, but my momentum doesn't.

My takeaway

I'll be honest: the speed issues can still be frustrating. There are moments when I wish I had the absolute best model. Sometimes I wonder if I'm making the right tradeoff.

But here's what I know: I can actually use GLM 4.7. Every day. For everything I need to build. I can iterate freely, prototype quickly, and let the coding agent handle the heavy lifting without anxiety.

You don't need perfect AI. You need AI you can afford to use.

If you're tired of API pricing anxiety, if $200/month is out of reach, if you want to build without constantly calculating the cost of every idea - try GLM 4.7 with a subscription.

Sometimes the best tool isn't the perfect one. It's the one that lets you build.

About Oliver Servín

Working solo at AntiHQ, a one-person software company.