Dwarkesh Podcast

What does the next training paradigm look like?

With Dwarkesh Patel, Dario · Sunday, June 28, 2026

Dwarkesh Patel explores the current AI training paradigm, focusing on the "big research bet" on scaling RL in verifiable environments. He critiques its limitations in generalizing to real-world, non-grindable tasks and the inefficiency of current inference, advocating for advanced continual learning techniques like On-Policy Self-Distillation and "dreaming" to enable AIs to learn on the job and improve through broad deployment.

More from Tech

AI's Next Edge: OPSD and 'Dreaming' for Continual Learning

Dwarkesh Patel reveals advanced AI techniques like On-Policy Self-Distillation and 'dreaming' that let models learn on the job and scale beyond fixed datasets.


Dwarkesh Patel: AI's 'Ephemeral' Learning Waste

Dwarkesh Patel exposes a huge waste in AI: models forget in-context lessons. Is your product caught in the same 'ephemeral' learning trap?
