How a Food Delivery Giant Built a $0.30 AI Model to Rival Claude

Two months ago, a nameless model called “On Alpha” appeared on OpenRouter. No company attached. No description. Just a ghost in the machine that started writing code better than almost everything else—and doing it for roughly one-tenth the price of Claude’s strongest models. Developers like me started using it quietly. Usage spiked 242%. It reached #1 on the leaderboards. Everyone asked the same question: who built this thing?

Yesterday, the mask dropped. The creator wasn’t a Silicon Valley lab. It was Meituan—yes, the Chinese food delivery giant. Think of it as their equivalent of Yemeksepeti or Deliveroo. Through their AI division Kyutai, they had secretly trained one of the world’s largest open-source models: 1.5 trillion parameters, 1 million token context window, trained entirely on Chinese Cambricon chips—not a single Nvidia GPU involved.

This matters for anyone building with AI, and I’ll explain exactly why—including how to use it for pennies.

Key Takeaways

1.5 trillion parameters with 1 million token context—quality near Claude Sonnet at roughly 1/10th the price
Trained on 50,000 Cambricon cards, making it the first trillion-parameter model built without American hardware
MIT licensed—fully open source, free for commercial use
Input tokens cost roughly $0.30 per million versus Claude’s ~$3.00
Free context caching: re-read the same project files without paying again
Available via OpenRouter, Kyutai’s own API, or through Kimi’s Client multi-agent system
Weights not yet downloadable—API-only for now, with local deployment coming

The Mystery Model That Took Over OpenRouter

I noticed On Alpha the same way most developers did: it just appeared. No announcement, no blog post, no corporate branding. On OpenRouter—a platform I use regularly to compare models—it started climbing the rankings with disturbing speed.

Here’s what caught my attention:

Code generation quality that matched or exceeded top-tier models
Pricing that seemed like a bug—not a feature
1 million token context window, meaning it could ingest entire codebases, hundreds of pages of documentation, or massive project files in one go

Usage jumped 242% as word spread through developer channels. Meanwhile, Claude—previously dominant on the platform—slipped to #2. People were unknowingly adopting a Chinese model, and nobody knew who to thank (or blame).

The reveal came when Moonshot AI (the company behind Kimi) publicly confirmed what investigators had suspected: On Alpha was built by Kyutai, Meituan’s AI research lab. A food delivery company had created one of the most capable open models in existence.

Why the Hardware Story Changes Everything

The technical achievement here goes deeper than model architecture. Kyutai trained this 1.5 trillion parameter system on 50,000 Cambricon AI chips—Chinese-designed, Chinese-manufactured processors. No Nvidia H100s. No A100s. No American hardware at any stage.

This is historically significant. For years, US export controls on advanced semiconductors were treated as a hard ceiling on Chinese AI development. When Fablo 5 (apparently a reference to a previous model or service) was restricted, American users lost access for three weeks. The assumption was that without Nvidia chips, you couldn’t compete at the frontier.

Kyutai just proved that assumption wrong. A food delivery platform’s research team built a trillion-parameter model using domestic alternatives. The rules of this game are rewriting themselves in real-time.

For entrepreneurs like me, based in the UK but working globally, this means hardware diversification is accelerating. More training pipelines, more model providers, more resilience against single-point-of-failure restrictions. The monopoly on cutting-edge AI is cracking.

Real Numbers: What This Costs vs. Claude

Here’s where this becomes immediately practical. I run multiple automation workflows and AI-assisted development projects. Token costs are a real line item in my monthly expenses.

Model Tier	Approx. Cost per Million Tokens
Claude 3.5 Sonnet (strongest)	~$3.00 (input), up to $30 for extended thinking
Kyutai On Alpha	~$0.30 (discounted launch pricing)

That’s roughly a 10x price difference for comparable quality. When you’re running thousands of API calls monthly—processing documents, generating code, analyzing data—this isn’t marginal savings. It’s the difference between a $300/month AI bill and a $30 one.

Kyutai’s current promotion drops their already-low pricing even further. Their token packages (prepaid, 30-day validity) beat pay-as-you-go API pricing for consistent usage. During launch, they’re offering what was normally a $299 tier for $60.

But the feature that genuinely impressed me: free context caching. When you’re working on the same project repeatedly—refining code, iterating on documents—the model doesn’t charge you again for re-reading the same context. This is how it should work, and frankly, Western providers should take note. For long-running projects with extensive context, this alone can cut costs by 30-50%.

How I’m Actually Using It: Three Pathways

Option 1: OpenRouter (Simplest)

If you already have an OpenRouter account, search for the model directly and select it. No additional setup. This is how I first tested it—took under two minutes to start making calls.

Option 2: Kyutai’s Native API

For direct access, create an account on Kyutai’s platform, generate an API key, and integrate. They’ve made this deliberately compatible with OpenAI and Claude SDK formats, so if you’re already using those clients, you typically just change the base URL. I’ve tested this in Cursor and standard API clients—it works cleanly.

Option 3: Kimi Client (Most Powerful for Multi-Agent Work)

This is where it gets interesting for serious builders. Kimi’s Client (from Moonshot AI) has evolved into something I haven’t seen elsewhere: a multi-agent routing system that automatically selects the best Chinese AI model for each specific task.

Here’s how it works in practice: you submit a project, and Client dynamically routes to DeepSeek, MiniMax, Kimi itself, or now Kyutai’s models—whichever performs best for that particular job. I’ve watched it switch between models mid-workflow based on task characteristics.

Setup is straightforward:

Install the Kimi Client extension in VS Code (most popular option)
Or run directly from terminal using their provided code snippets
Enable the extension, and free-tier models auto-select without configuration

For my automation workflows, this eliminates the manual model-selection overhead I used to spend significant time on.

Current Limitations I Need to Flag

I’m not going to oversell this. There are genuine constraints:

Weights unavailable for download—API-only currently. Kyutai says local weights are coming; I’ll test when they arrive.
Quality ceiling: In my testing, it doesn’t consistently beat Claude 3.5 Opus (Anthropic’s absolute strongest model). It’s competitive with Sonnet-level performance—excellent for daily work, but not universally superior.
Language quirks: The voice and mobile interfaces I tested defaulted to Chinese. English text interfaces work fine, but don’t expect seamless multilingual voice interaction yet.
Documentation is Chinese-first—browser translation handles this, but it’s friction.

For the price, these are manageable trade-offs. But manage your expectations: this is a Claude Sonnet competitor at Claude Haiku pricing, not a free lunch.

The Bigger Picture for Builders

What Meituan/Kyutai achieved here signals something I predicted in my community discussions: AI capability is decentralizing faster than consensus expects. A food delivery company trained a frontier-class model. They did it without the hardware everyone assumed was mandatory. They released it under MIT license, letting anyone build commercial products on top.

For my own projects—automating e-commerce operations, building AI-assisted content workflows, training specialized agents—this expands the viable toolset dramatically. When I mentor developers in our live sessions, I emphasize cost sustainability: your AI infrastructure needs to survive your revenue ramp-up period. Models like this make that math work.

The companies actually building with AI right now—not just consuming chat interfaces—are going to benefit most from this wave. The moat isn’t access to expensive models anymore. It’s knowing how to compose, route, and deploy the right models for specific outcomes.

FAQ

Is Kyutai On Alpha really free to use commercially?

Yes. It’s released under an MIT license, which permits commercial use, modification, and distribution without restriction. When weights become available for download, you’ll be able to self-host for internal products. Currently, API usage has standard metering but no licensing fees.

How does the quality compare to Claude 3.5 Sonnet?

In my testing, it’s comparable to Sonnet-level performance for coding and analysis tasks, but doesn’t consistently exceed Claude 3.5 Opus (Anthropic’s top tier). The 1 million token context window matches or exceeds most Claude tiers. For daily development work, document analysis, and automation—it’s genuinely competitive. For frontier research or the most demanding reasoning tasks, Claude Opus still holds an edge.

Can I use this if I only speak English?

Text interfaces and API calls work fully in English—that’s how I’ve been using it. The web dashboard and documentation are Chinese-first, but browser translation handles this adequately. Voice features and some mobile app functions currently default to Chinese. If you’re comfortable with API integration or using OpenRouter as an intermediary, language isn’t a blocker.

What’s the cheapest way to get started?

OpenRouter with existing credits is fastest—no new account needed. For dedicated usage, Kyutai’s token packages (prepaid, ~$60 promotional tier) beat pay-as-you-go API pricing for consistent workloads. Their free context caching means repeated work on the same project costs progressively less. If you’re experimenting, start with OpenRouter; if you’re building a production workflow, the direct API with token packages optimizes costs.

Conclusion

I tested this model because the numbers didn’t make sense—a ghost topping leaderboards at impossibly low prices. Two months of quiet usage by developers like me validated the quality before the corporate reveal.

What Meituan’s Kyutai built isn’t just a cheaper Claude alternative. It’s proof that AI model training is escaping its hardware cage, that open-source licensing is becoming a competitive weapon, and that the companies willing to operate in secrecy for months can reshape market dynamics overnight.

For my daily work, this joins my toolkit alongside Claude, GPT-4, and specialized open models. The routing logic matters now more than model loyalty. I use what delivers the right quality at sustainable cost for each specific task.

The weights aren’t downloadable yet. When they are, I’ll run local benchmarks and share results. Until then, the API is production-ready, the pricing is genuinely disruptive, and the context window handles projects that would chunk and degrade on smaller models.

If you’re building AI-powered systems and haven’t pressure-tested Chinese models recently, you’re working with an incomplete picture of what’s possible. The food delivery company just schooled the pure-play AI labs on cost-engineering at scale. That’s worth paying attention to.

Watch the full video (in Turkish — English subtitles available):

Tools & Community

TurkoLister — the AI listing tool I use to turn Amazon products into optimized eBay UK listings in about 60 seconds (from £4.99/month, £1 one-week trial).
AI & E-commerce Community — my Turkish-speaking community ($19/month) with weekly live sessions.
Subscribe on YouTube — new experiments every week.

Blog