Anthropic's Fable 5: Is 'Safety' Just Gatekeeping?

Key Takeaways

Anthropic's Fable 5 AI model explicitly blocks requests related to biology, cybersecurity, and frontier LLM development, triggering a sharp debate about its true intent.
While framed as safety measures, critics like Ben Thompson point out these restrictions also serve Anthropic's business interests: they keep competitors from leveraging its tech and limit its potential liability.
The model's rejection threshold is notably low; John Coogan cited instances of biologists getting "kicked down to Opus" just by saying hello.
Critics like Dean Ball and Doug Olaflin argue these guardrails are anti-competitive gatekeeping by highly compensated AI developers, eroding trust in the AI safety community.
A subtle degradation of research answers without clear disclosure further fuels concerns about transparency and could invite heavier government regulation.

The Disagreement

Anthropic's Fable 5 AI model arrived with impressive capabilities, but it also came with a stern set of explicit guardrails: it flat-out rejects requests concerning biology, cybersecurity, and frontier large language model (LLM) development. Anthropic positions these as essential safety measures, designed to prevent misuse and ensure responsible AI deployment. This stance has its sympathizers, who acknowledge the need for caution in developing powerful new technologies.

However, a vocal chorus of critics argues these restrictions reek of something far less altruistic than pure safety. As John Coogan put it, while these rules align with Anthropic's safety focus, “as many people have pointed out, it's also just good business. You don't want competitors using your products to directly create competitors.” The line between safety and self-interest, for critics, is blurring into oblivion. The rejection threshold itself is a flashpoint. Coogan recounted hearing “tons of examples on the timeline of a biologist just saying hi to the model and getting kicked down to Opus,” Anthropic's less capable model. This isn't just about controversial use cases; it's about basic research being kneecapped.

Critics like Dean Ball and Doug Olaflin pull no punches. They view these guardrails as anti-competitive gatekeeping, imposed by a select group of "highly compensated AI developers" who are out of touch with the broader research community. The biggest casualty, in their eyes, is trust. Coogan emphasized, “My last observation re: Anthropic's secret sabotage safety policy is that it undermines actually good safety policy.” When guardrails look more like moats, the integrity of the entire AI safety movement takes a hit. The lack of transparent disclosure about these limitations, particularly the "subtle degradation of AI research answers without explicit disclosure," only deepens the suspicion. Coogan even suggested it is “very plausible to describe this as anti-competitive behavior. Even if you are maximally sympathetic to Anthropic here, you must admit this.”

Who's Right (and When They're Wrong)

Both sides have a point, but the critics carry more weight here. Yes, frontier AI models present real risks, and responsible development requires guardrails. No one wants to see powerful AI misused in biology or cybersecurity. However, Anthropic's current implementation feels less like measured risk management and more like strategic market control masquerading as safety.

The critical error is the opaque, overzealous application of these restrictions. A blanket ban on an entire field like "biology," to the point of a researcher getting downgraded for a simple query, is an impediment to legitimate scientific inquiry, not just a safeguard against malevolence. It stifles innovation and concentrates power, especially when combined with a lack of clear disclosure about how answers are being degraded or censored. As John Coogan pointed out, if Anthropic's CEO Dario (Amodei) is concerned about inequality, "he has to realize that he is the inequality and the unilateral gatekeeping feels whack as hell."

While Anthropic is within its rights to define its product's capabilities, the broader implication for the AI ecosystem is dire. When a leading model limits access to critical research areas, it starves the ecosystem of diverse perspectives, slows collective progress on genuine safety issues, and centralizes control over the future of AI. This approach breeds distrust and inadvertently strengthens the case for heavy-handed government intervention, which, ironically, neither builders nor responsible safety advocates want.

What to Do With This

Founders, when evaluating AI tools for your own products, don't just check the marketing materials for "safety" claims. Dig into the specifics: What are the explicit limitations? What isn't it doing, and why? Push your vendors for transparency, especially around how their models handle sensitive or frontier topics. If you're building your own AI models, adopt a "safety.txt" approach: publish your guardrails, explain the rationale, and invite scrutiny, rather than letting restrictions emerge as a suspected byproduct of business strategy. This builds trust, attracts talent, and insulates you from accusations of gatekeeping.
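
To make the "safety.txt" idea concrete, here is a minimal sketch of what publishing guardrails could look like in practice. There is no established safety.txt standard that this follows; the field names (`Topic`, `Behavior`, `Rationale`, `User-Notified`), the `Guardrail` class, and the `render_safety_txt` helper are all hypothetical, loosely modeled on the key-value style of files like robots.txt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guardrail:
    """One published restriction (hypothetical schema, not a standard)."""
    topic: str              # restricted area, e.g. "biology"
    behavior: str           # what the model does when it triggers
    rationale: str          # why the restriction exists
    disclosed: bool = True  # whether the user is told when it triggers


def render_safety_txt(model: str, contact: str, guardrails: list[Guardrail]) -> str:
    """Render guardrails as a plain-text file anyone can read and critique."""
    lines = [f"Model: {model}", f"Contact: {contact}", ""]
    for g in guardrails:
        lines += [
            f"Topic: {g.topic}",
            f"Behavior: {g.behavior}",
            f"Rationale: {g.rationale}",
            f"User-Notified: {'yes' if g.disclosed else 'no'}",
            "",  # blank line between entries
        ]
    # Exactly one trailing newline at end of file.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    rails = [
        Guardrail("biology", "route-to-smaller-model", "misuse risk"),
        Guardrail("frontier-llm-development", "refuse", "stated business policy"),
    ]
    print(render_safety_txt("example-model", "safety@example.com", rails))
```

The design choice that matters here is the `User-Notified` field: it forces the publisher to state, per restriction, whether users are told when an answer is refused, rerouted, or degraded, which is exactly the disclosure gap the critics above object to.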