3.2 billion language app downloads. Almost zero speakers. The first wave gave us flashcards. The second gave us gamified quizzes. Both optimized for the wrong thing. It is time for a third wave that starts with what actually works: speaking.
Over a billion people are learning a new language right now. Almost none of them will learn to speak it.
That should shock us more than it does. We've had language learning apps for over a decade. They're beautifully designed, wildly popular, and backed by billions in venture capital. Duolingo alone has over 100 million monthly active users. And yet: fewer than 10% of language learners reach conversational fluency.
If you look back at how language learning technology has developed over the past 30 years, a pattern emerges. Every generation has solved one problem while leaving a bigger one untouched.
The first wave: Rosetta Stone, CD-ROMs, textbooks. For the first time, you could study a language without a classroom. But the experience was static and one-size-fits-all. You consumed content. You never produced language.
The second wave: Duolingo, Babbel, Busuu. Streaks, points, leaderboards. Brilliantly addictive. But fundamentally recognition-based: you tap, match, and swipe your way through lessons. The most desirable language skill is speaking, yet these apps barely teach it.
The third wave: real-time voice AI. Learners can now practice actual conversation at near-zero marginal cost. No scheduling, no judgment, no $50-per-hour tutor.
But technology alone isn't enough. 50 years of language acquisition research tells us what actually works. At its core, learning a language requires three things:
Recognition. Matching words to meanings, translating sentences, picking the right answer. This is what most apps are built around, and they do it well.
Listening. Hearing the language spoken in real context. Podcasts, audio lessons, native speakers. Valuable, and increasingly available.
Speaking. Actually producing language. Forming your own sentences, out loud, in real time. The hardest skill, the most desirable, and the one almost no app teaches.
Apps have gotten remarkably good at the first two. But speaking, the skill that matters most when you're standing in front of a real person, is almost entirely absent.
Anyone who's studied a language recognizes this. You know the words. You've passed the quizzes. But when it's time to actually talk, everything locks up. That's not a lack of knowledge. It's a lack of practice in the right mode.
The research backs this up consistently. Learners who only comprehend without producing plateau quickly. Concrete words stick faster than abstract ones because the brain anchors vocabulary to things it can picture. And learning happens at the edge of what you can already do: too easy and you coast, too hard and you shut down. Every lesson needs to sit in that productive middle ground.
This is established science, built up over decades across multiple disciplines. And almost none of it has made it into the products a billion people use every day.
I believe the next generation of language learning must be built on this research. Not just AI-powered, but research-grounded. Not just conversational, but structured around a curriculum. A system where every lesson maps to a communicative outcome, every exercise is chosen based on what the learner needs to practice, and every session adapts to what the learner actually knows.
This is what we're building with eevi. Not because the technology is interesting (though it is), but because speaking a language changes your life. It connects you to people, cultures, and parts of the world you couldn't access before. The tools to learn and preserve languages should be accessible to everyone, not gated behind $50-an-hour tutoring fees.
The third wave isn't about better apps.
It's about better outcomes.
It's about the moment a learner opens their mouth and realizes: I can actually do this.