Key Takeaways
- Early LLMs (pre-GPT-5.2) could not make meaningful progress on building a poker bot; in Brown's words, they "could not basically do anything." GPT-5.2 could help build a river solver, but required constant oversight.
- GPT-5.5 could build a river solver essentially zero-shot, in a domain with deep academic research but scarce open-source code, and Brown is now using it on a full-scale poker solver.
- The true superpower of advanced LLMs lies in optimization: GPT-5.5 made Noam Brown's PhD-era poker algorithms 10-100x faster, revealing how inefficient his original code was.
- While LLMs are brilliant at refining and accelerating known solutions, they still lack "research taste": the ability to invent truly novel algorithms or approaches.
The Poker Bot Sandbox: A New Arena for AI Reasoning
If you want to push the limits of LLM reasoning, forget generic benchmarks. OpenAI research scientist Noam Brown found his ideal test ground in building poker bots. As he put it, “there is very little um open source code for making poker bots and there's a lot of published papers on it but you really have to reason through everything and it's like requires a lot of just reasoning and iteration and like a lot of small gotchas.” The field is rich in published theory but short on reference implementations, so a model has to understand the strategies and implement them itself rather than regurgitate existing code.
Brown tracked the progression. Early models were essentially useless: “they could not basically do anything.” With GPT-5.2, he could work with the model to construct a river solver, a crucial component of a poker bot. This wasn't a hands-off experience, though. Brown described having to manage it carefully: “The downsides with 5.2 is I felt like it was gaslighting me a lot and I always had to be very careful checking it and making sure like, okay, is it actually doing what it said it did?” The model helped, but every claim it made had to be verified.
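For readers unfamiliar with what a "solver" computes, here is a minimal sketch. The interview does not say which algorithm Brown's river solver uses; published poker solvers are commonly built on counterfactual regret minimization, whose core update is regret matching. The code below applies regret matching to a tiny two-player zero-sum matrix game, not to poker, and every name in it (`strategy_from_regrets`, `solve_matrix_game`, `TOY_PAYOFF`) is hypothetical and chosen for illustration only.

```python
def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to the uniform strategy.
    return [1.0 / len(regrets)] * len(regrets)


def solve_matrix_game(payoff, iterations=20000):
    """Approximate an equilibrium of a two-player zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives
    the negative. Returns the (row, column) average strategies.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regrets, col_regrets = [0.0] * n_rows, [0.0] * n_cols
    row_total, col_total = [0.0] * n_rows, [0.0] * n_cols

    for _ in range(iterations):
        row = strategy_from_regrets(row_regrets)
        col = strategy_from_regrets(col_regrets)

        # Expected payoff of each pure action against the opponent's current mix.
        row_values = [sum(payoff[i][j] * col[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * row[i] for i in range(n_rows))
                      for j in range(n_cols)]
        row_ev = sum(p * v for p, v in zip(row, row_values))
        col_ev = sum(p * v for p, v in zip(col, col_values))

        # Regret = how much better each action would have done than the mix played.
        for i in range(n_rows):
            row_regrets[i] += row_values[i] - row_ev
            row_total[i] += row[i]
        for j in range(n_cols):
            col_regrets[j] += col_values[j] - col_ev
            col_total[j] += col[j]

    # The average strategy over all iterations is what approaches equilibrium.
    return ([t / iterations for t in row_total],
            [t / iterations for t in col_total])


# Hypothetical 2x2 game, used only to exercise the sketch.
TOY_PAYOFF = [[2.0, -1.0], [-1.0, 1.0]]

if __name__ == "__main__":
    row_strategy, col_strategy = solve_matrix_game(TOY_PAYOFF)
    print("row:", row_strategy)
    print("col:", col_strategy)
```

In this toy game the equilibrium has each player choosing their first action 40% of the time, and the averaged strategies drift toward that mix as iterations increase. A real river solver faces the same kind of fixed-point problem over a vastly larger game tree, which is where the "small gotchas" Brown mentions come in.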
The 100x Optimization Engine (Not the Inventor)
By the time GPT-5.5 rolled around, the game changed entirely. Brown found it “way better. It was able to basically do it zero shot. And in fact, I've been working on just doing a full scale poker solver.” The shift was from closely supervised collaboration to the model producing a working solver largely on its own. The most striking discovery, though, wasn't that the LLM could build the solver, but how well it optimized what Brown had already created.
Brown was candid about the experience: “I was really impressed with the model's ability to optimize the algorithms that I had developed in my in my PhD. It was honestly it it was it was shocking to see how inefficient I was um in retrospect and they were able to make it like you know 10 100x faster.” That is not a minor tweak; it is a large performance multiplier on existing logic. The implication is that LLMs cannot yet invent the next breakthrough algorithm from scratch, since they still lack the "research taste" needed for novel paradigms, but they are very good at taking existing, well-defined (if clunky) solutions and making them far more efficient. Think of them as a code refiner that can find optimizations a human might miss or spend weeks discovering.
What to Do With This
Stop asking LLMs to invent your next big strategy from scratch. Instead, feed them your existing bottleneck code, messy internal processes, or clunky but proven algorithms. Task them with specific optimization goals: "Make this 10x faster," or "Refactor this for maximum memory efficiency." Brown's experience suggests large speedups are possible on these refinement tasks, but treat every output as a draft: check that the optimized version still produces the same results as the original, especially with less mature models.
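One way to act on the "treat outputs as drafts" advice is to keep the original implementation as a reference and test any model-optimized version against it before trusting it. The sketch below is an assumption about how such a check might look, not Brown's workflow. All function names are hypothetical, and the pair-counting functions are stand-ins for real bottleneck code.

```python
import random
import time
from collections import Counter


def count_pairs_reference(ranks):
    """Original, slow version: compare every pair of cards, O(n^2)."""
    return sum(1 for i in range(len(ranks))
               for j in range(i + 1, len(ranks))
               if ranks[i] == ranks[j])


def count_pairs_optimized(ranks):
    """Candidate rewrite: count each rank once, then use c*(c-1)/2 pairs per rank."""
    return sum(c * (c - 1) // 2 for c in Counter(ranks).values())


def count_pairs_buggy(ranks):
    """A plausible-looking but wrong rewrite (counts disjoint pairs only)."""
    return sum(c // 2 for c in Counter(ranks).values())


def check_equivalence(reference, candidate, make_input, trials=200, seed=0):
    """Run both versions on random inputs; return (ok, first_failing_input)."""
    rng = random.Random(seed)
    for _ in range(trials):
        args = make_input(rng)
        if reference(*args) != candidate(*args):
            return False, args
    return True, None


def measure_speedup(reference, candidate, args, repeats=5):
    """Ratio of best-of-N wall-clock times (reference / candidate)."""
    def best_time(fn):
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(*args)
            best = min(best, time.perf_counter() - start)
        return best
    return best_time(reference) / best_time(candidate)


def random_ranks(rng):
    """Random list of 27 to 40 card ranks (2..14), wrapped as an argument tuple."""
    return ([rng.randint(2, 14) for _ in range(rng.randint(27, 40))],)


if __name__ == "__main__":
    print(check_equivalence(count_pairs_reference, count_pairs_optimized, random_ranks))
    print(check_equivalence(count_pairs_reference, count_pairs_buggy, random_ranks)[0])
    big_input = ([i % 13 for i in range(2000)],)
    print("speedup:", measure_speedup(count_pairs_reference, count_pairs_optimized, big_input))
```

The equivalence check runs before the timing on purpose: a speedup number means nothing if the faster code computes something different. Random testing of this kind only raises confidence; it does not prove the two versions agree on every input.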