In 2013, Andrej Karpathy took a flawless ride in a self-driving Waymo. Yet in 2025, we’re still working on autonomy and driving agents. It’s one thing to demo intelligence, another to deliver reliability.

Here’s what I learned from Karpathy’s recent talk at YC, and why it matters—especially for students and developers building in this evolving space.

Software 1.0 — Code by Hand

Traditional hand-written code that explicitly instructs computers what to do.

Software 2.0 — Code by Gradient

Neural network models trained on data; code learned rather than explicitly written.

Software 3.0 — Code by Conversation

Programming large language models (LLMs) using English prompts, bridging natural language and machine instructions.

Karpathy draws an interesting parallel to the 1960s time-sharing era:

LLM compute remains extremely costly, forcing models to be centralized in the cloud.
We’re still waiting for the equivalent of a “personal computing revolution” for AI. Today feels similar to the early mainframe era, where computation is remote and centralized.

Lessons from Tesla Autopilot

Karpathy highlighted Tesla Autopilot as a cautionary tale for AI implementation:

As neural networks grew more capable, traditional C++ code usage significantly decreased.
Achieving production readiness with AI takes considerable time and iteration, far longer than initial demos suggest.

“2025 isn’t the year of AI agents—it’s the decade.” — Andrej Karpathy

How LLMs differ from humans

Despite their capabilities, LLMs have distinct cognitive deficits:

They possess vast encyclopedic knowledge but significant cognitive limitations.
They still hallucinate occasionally, despite improvements.
Their “jagged intelligence” makes them brilliant in specific areas but prone to mistakes no human would make (e.g., misunderstanding numerical comparisons like 9.9 < 9.11).
“Anterograde amnesia”: inability to retain information beyond their immediate context.
Susceptibility to prompt injection attacks due to inherent gullibility.

Effective developers recognize these limitations and strategically engineer around them, rather than expecting flawless, human-like behavior.

Partial-autonomy apps are the way forward

Software should keep a human-in-the-loop, gradually delegating repetitive tasks to AI. Successful examples include:

Cursor for coding, offering quick validation through code diffs.
Perplexity for search, providing easy verification and transparency.

Today’s software remains mostly human-focused, indicating substantial opportunities to design applications specifically with LLMs in mind.

Developers cheat-sheet: Start with partial autonomy, design tight verification loops, and version-control your prompts.

Karpathy emphasizes patience, recalling his flawless 2013 Waymo self-driving demo. Even in 2025, fully autonomous vehicles still aren’t ubiquitous and often require human oversight. Reliable autonomy grows incrementally—small steps at a time.

Your turn:

What’s one AI demo or idea you’re excited to see become reliably production-ready?

If you enjoyed this article and are exploring the gaps between AI demos and practical products, follow for more insights!