AI Agents: No Excuse for Lack of Rigor

Key Takeaways

AI agents are solving deep technical challenges like database query optimization by exhaustively trying different column store formats and execution engines, a task impractical for humans.
Ankur Goyal's team at Braintrust uses coding agents to reproduce slow database query patterns and then tests “a bunch of ideas from database literature” to fix them.
Claire Vo echoes this, applying agents to complex, large-scale data migrations by programmatically testing against “pretty longtail data structures.”
The critical shift is that AI agents allow engineers to achieve a level of experimental rigor and problem-solving scale previously impossible, especially for high-cost, high-risk platform changes.

The Method: Exhaustive Rigor with AI Agents

Imagine a persistent, almost obsessive staff engineer who never tires, never gets distracted, and can cycle through a long list of candidate technical solutions in hours. That's the power Ankur Goyal, CEO of Braintrust, and Claire Vo are seeing in AI agents for critical infrastructure problems. Forget AI for simple code generation; this is about using agents to tackle the kinds of technical debt and performance bottlenecks that make engineers sigh.

Goyal describes how his team uses coding agents to optimize slow database queries. Instead of relying on manual inspection or limited human-driven experimentation, they first “reproduce those things” – the specific slow query patterns. Then a coding agent is deployed to systematically try out a wide range of solutions. “[The agent tries] a bunch of ideas from database literature,” Goyal explains, exploring different column store formats, execution engines, and more, well beyond what a human engineer could attempt in a reasonable timeframe. It's an exhaustive, experimental approach to a problem traditionally solved by intuition and limited trial-and-error.
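To make the reproduce-then-try-candidates loop concrete, here is a minimal sketch in Python. It is not Braintrust's harness: the table, the query, and the candidate fixes are all hypothetical, and plain SQLite indexes stand in for the column store formats and execution engines Goyal mentions, simply because they run anywhere with the standard library. The shape is the point: rebuild the slow pattern, apply one candidate, time it, and confirm the answer has not changed.

```python
import sqlite3
import time

# Hypothetical slow pattern: an aggregate filtered on an unindexed column.
QUERY = "SELECT COUNT(*), SUM(amount) FROM events WHERE user_id = ?"

# Hypothetical candidate fixes (name -> setup statements). Indexes are a
# stand-in here for the storage formats and engines discussed in the text.
CANDIDATES = {
    "baseline": [],
    "index_user_id": ["CREATE INDEX idx_user ON events (user_id)"],
    "covering_index": ["CREATE INDEX idx_user_amount ON events (user_id, amount)"],
}


def build_db(n_rows=20000):
    """Reproduce the slow pattern in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO events (user_id, amount) VALUES (?, ?)",
        ((i % 500, float(i % 97)) for i in range(n_rows)),
    )
    conn.commit()
    return conn


def run_candidate(setup_sql, repeats=100):
    """Apply one candidate, record a reference answer, and time the query."""
    conn = build_db()
    for stmt in setup_sql:
        conn.execute(stmt)
    answer = conn.execute(QUERY, (42,)).fetchone()
    start = time.perf_counter()
    for i in range(repeats):
        conn.execute(QUERY, (i % 500,)).fetchone()
    elapsed = time.perf_counter() - start
    conn.close()
    return answer, elapsed


def benchmark():
    """Run every candidate and flag any whose answer differs from baseline."""
    results = {}
    for name, setup in CANDIDATES.items():
        answer, elapsed = run_candidate(setup)
        results[name] = {"answer": answer, "seconds": elapsed}
    expected = results["baseline"]["answer"]
    for entry in results.values():
        entry["correct"] = entry["answer"] == expected
    return results


if __name__ == "__main__":
    ranked = sorted(benchmark().items(), key=lambda kv: kv[1]["seconds"])
    for name, entry in ranked:
        print(f"{name:15s} correct={entry['correct']} {entry['seconds'] * 1000:.1f} ms")
```

An agent's job in this picture is to keep adding entries to the candidate list and rerunning the loop; the correctness check is what keeps a fast but wrong candidate from winning.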

Claire Vo corroborates this, sharing her experience using agents for large-scale data migrations. She emphasizes that pairing code execution environments with advanced GPT models creates a unique setup: “it has been the only setup where I have been able to set up a very similar process, which is: the outcome I want is XYZ. We need to programmatically test against pretty longtail data structures to figure out which of these potential solutions are going to get us closer to the outcome we want.” This isn't just about finding a solution that works; it's about finding the best of the candidates by running far more tests, against far more varied real-world data, than a person would run by hand.
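A rough sketch of what “programmatically test against pretty longtail data structures” could look like for a migration, again in Python. Everything here is illustrative: the record shapes, the two candidate migration functions, and the invariant standing in for “the outcome I want is XYZ” are assumptions, not details from Vo's actual setup.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]

# Hypothetical long-tail inputs: legacy shapes the happy path rarely sees.
LONG_TAIL_RECORDS: List[Record] = [
    {"id": 1, "name": "Ada Lovelace", "tags": ["a", "b"]},
    {"id": "2", "name": "  Grace Hopper  ", "tags": "x,y"},  # string id, joined tags
    {"id": 3, "name": None, "tags": None},                   # nulls
    {"id": 4, "full_name": "Alan Turing"},                   # renamed field, no tags
    {"id": 5, "name": "", "tags": []},                       # empty values
]


def migrate_naive(rec: Record) -> Record:
    """Candidate A: assumes every record matches the documented schema."""
    return {"id": rec["id"], "name": rec["name"].strip(), "tags": list(rec["tags"])}


def migrate_defensive(rec: Record) -> Record:
    """Candidate B: tolerates renamed fields, nulls, and joined tag strings."""
    raw_name = rec.get("name") or rec.get("full_name") or ""
    raw_tags = rec.get("tags") or []
    if isinstance(raw_tags, str):
        raw_tags = [t for t in raw_tags.split(",") if t]
    return {"id": int(rec["id"]), "name": raw_name.strip(), "tags": list(raw_tags)}


def meets_outcome(out: Record) -> bool:
    """The desired outcome, stated as a checkable invariant on the new schema."""
    return (
        isinstance(out.get("id"), int)
        and isinstance(out.get("name"), str)
        and out["name"] == out["name"].strip()
        and isinstance(out.get("tags"), list)
        and all(isinstance(t, str) for t in out["tags"])
    )


def score(migrate: Callable[[Record], Record], records: List[Record]) -> Dict[str, Any]:
    """Run one candidate over every record; a crash counts as a failure."""
    failed_ids = []
    for rec in records:
        try:
            ok = meets_outcome(migrate(rec))
        except Exception:
            ok = False
        if not ok:
            failed_ids.append(rec["id"])
    return {"passed": len(records) - len(failed_ids), "failed_ids": failed_ids}


if __name__ == "__main__":
    for candidate in (migrate_naive, migrate_defensive):
        print(candidate.__name__, score(candidate, LONG_TAIL_RECORDS))
```

The scores give an agent something objective to climb: each new candidate either moves closer to passing every odd-shaped record or it doesn't.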

The core insight? AI agents strip away any excuse for lacking rigor in high-stakes engineering. “There's just no excuse to not have rigor,” Goyal says. “There's no staff engineer who is running as many rigorous benchmarks and trying out different algorithms and analyzing ideas manually than someone who's using an agent.” The practical quality of solutions to hard problems rises because agents can “run at the problem” with a consistency and endurance no person can sustain.

Where This Breaks Down

This method shines brightest for complex, high-cost technical problems where the solution space is vast and the impact of failure is significant. It's less suited to simple bug fixes or well-understood feature development, where the overhead of setting up agent-driven experimentation would outweigh the benefits. The initial investment in building robust evaluation environments, defining precise measurable outcomes, and orchestrating agents requires a specific skillset and a willingness to commit resources to a new, rigorous workflow. It's not a magic bullet; it's a powerful tool that demands careful setup and a careful definition of the problem at hand.

What to Do With This

Identify one stubbornly high-cost, high-risk technical problem in your current infrastructure – a database query that consistently causes performance bottlenecks, a complex data migration, or a critical platform refactor you've been delaying. Can you define a clear, measurable outcome for success? Now, brainstorm how you could programmatically test potential solutions. This means sketching out a sandbox environment, defining the edge cases, and outlining the variables an agent could systematically explore. Don't build the agent yet; just define the problem in a way that could be handed off to an exhaustive, tireless machine. The next step is to research tools that allow you to create such an environment for programmatic testing.
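One way to do that definition step on paper is to write the problem down as data before any agent exists. The Python sketch below is only an illustration: the ExperimentSpec class, its field names, and the example values (metric, threshold, variables, edge cases) are all made up for this purpose.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Dict, List


@dataclass
class ExperimentSpec:
    """A hypothetical hand-off document: what to measure and what to vary."""

    problem: str
    metric: str                      # the measurable outcome
    target: float                    # threshold that counts as success
    lower_is_better: bool = True
    variables: Dict[str, List[str]] = field(default_factory=dict)
    edge_cases: List[str] = field(default_factory=list)

    def configurations(self) -> List[Dict[str, str]]:
        """Enumerate every combination of variable values to explore."""
        names = list(self.variables)
        combos = product(*(self.variables[name] for name in names))
        return [dict(zip(names, combo)) for combo in combos]

    def is_success(self, measured: float) -> bool:
        """Compare a measurement against the target in the right direction."""
        if self.lower_is_better:
            return measured <= self.target
        return measured >= self.target


# Illustrative values only; replace them with your own problem.
spec = ExperimentSpec(
    problem="Dashboard query is slow at peak load",
    metric="p95_latency_ms",
    target=200.0,
    variables={
        "storage_format": ["row", "columnar"],
        "execution_engine": ["current", "vectorized"],
        "index": ["none", "single_column", "covering"],
    },
    edge_cases=["empty table", "single very large tenant", "null-heavy columns"],
)

if __name__ == "__main__":
    print(f"{len(spec.configurations())} configurations to test for: {spec.problem}")
    for case in spec.edge_cases:
        print("edge case:", case)
```

If you can't fill in the metric and target fields, the problem probably isn't ready to hand to an agent yet.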