Your User Signal Is the Moat for Independent AI Applications

Key Takeaways

Sarah Guo, during her conversation with Baseten CEO Tuhin Srivastava, challenged the core premise of independent AI application layers, asking if they can truly exist against powerful frontier model labs.
Srivastava argues that independent applications survive on proprietary “user signal”: data encoded in customer workflows.
This user signal lets application companies post-train models and build long-horizon agentic systems that frontier labs cannot easily replicate.
Examples like Abridge, which acts as an ambient scribe for physicians, illustrate how deep integration into specific workflows generates this defensible data advantage.
Frontier model companies lack direct access to this highly specific workflow data, creating a natural moat for specialized AI application builders.

The “Existential Question” Facing AI Apps

Sarah Guo cut straight to it on a recent podcast, posing an “existential question” that haunts every founder building on top of large language models: can the independent application layer even survive? With giants like OpenAI and Anthropic constantly pushing the boundaries of what foundation models can do, it’s easy to wonder whether their capabilities will eventually swallow up specialized applications. Why build a niche tool when a frontier model can do most of it out of the box?

Tuhin Srivastava, CEO of Baseten, doesn't just believe the application layer will exist; he sees it as essential. His core argument isn't about raw model power, but something far more subtle and defensible: user signal. This isn't just any data; it's the rich, contextual information embedded in how real users interact with an application within their specific, daily workflows. Without this, even the most advanced frontier model struggles to understand the nuances that make an application truly valuable.

Your Workflow Data Is Your Deep Moat

Srivastava laid out his case clearly: what becomes valuable to a company is the “user signal that they can gather that only they can gather.” This signal isn't floating in a generic data lake; it's “encoded in workflows.” A general-purpose model might write a decent email, but it won't understand the specific jargon, the implied context, or the intricate approval process of your company's sales team unless it has been trained on that team's interactions.

This workflow data creates a powerful feedback loop. As users engage with the application, they generate signals (corrections, preferences, task completions, failures) that are specific to their professional context. Srivastava explains that this allows application builders to “start to post-train models on that reward signal” and develop “long long horizon agentic models running that.” These aren't just fine-tuned models; they're systems that learn and adapt over extended interactions, anticipating needs and executing complex tasks with a deep understanding of their environment. Frontier models cannot easily replicate this without direct access to those proprietary workflow signals.
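To make the feedback loop concrete, here is a minimal sketch of turning workflow events into a per-output reward that a post-training job could consume. Everything in it is an assumption for illustration: the `WorkflowEvent` shape, the event kinds, and the reward weights are hypothetical and do not come from Srivastava or from any specific product.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical event kinds and weights; the numbers are illustrative only.
REWARD_BY_KIND = {
    "task_completed": 1.0,
    "accepted": 0.5,
    "corrected": -0.5,
    "failed": -1.0,
}


@dataclass(frozen=True)
class WorkflowEvent:
    """One user action tied to a specific model output (assumed schema)."""
    session_id: str
    output_id: str
    kind: str


def reward_per_output(events):
    """Sum the weighted signals for each model output.

    Unknown event kinds contribute 0.0 rather than raising, so new
    event types can be logged before they are assigned a weight.
    """
    totals = defaultdict(float)
    for event in events:
        totals[event.output_id] += REWARD_BY_KIND.get(event.kind, 0.0)
    return dict(totals)


events = [
    WorkflowEvent("s1", "draft-1", "accepted"),
    WorkflowEvent("s1", "draft-1", "corrected"),
    WorkflowEvent("s2", "draft-2", "accepted"),
    WorkflowEvent("s2", "draft-2", "task_completed"),
]
rewards = reward_per_output(events)
```

In this sketch, `draft-2` ends up with a higher reward than `draft-1` because the user accepted it and finished the task, while `draft-1` was accepted and then corrected. A real system would need far more care in choosing and validating the weights.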

From Data to Defensibility: The Post-Training Advantage

The power of user signal isn't just about making a model slightly better; it's about building defensibility. Companies like Abridge, an ambient scribe for physicians, exemplify this. They sit deep within a doctor’s workflow, capturing subtle cues, medical terminology, and patient interactions that are unique to the healthcare system. This constant stream of highly specific, real-world data lets Abridge train models that are far more accurate and context-aware for medical use cases than a general-purpose model trained without that data.

Srivastava points out that Baseten serves companies like Abridge and OpenEvidence, giving it a unique vantage point on this trend. By enabling these specialized applications, Baseten sees firsthand how deep integration and access to proprietary user data create a differentiated layer that frontier model companies can't easily penetrate. The competitive edge isn't in who has the biggest model, but in who has the most relevant and deeply integrated user signal for a specific problem.

What to Do With This

Take a hard look at your own product. Pinpoint the specific workflows where your users generate unique, proprietary signal that no other company, especially a frontier model lab, could easily access. Then, aggressively build systems to capture, encode, and leverage that signal to post-train your models, developing agentic capabilities that deepen your moat.
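One way to start capturing and encoding that signal is to log what the model proposed next to what the user actually kept, then turn the differences into preference pairs. The sketch below assumes a hypothetical `Interaction` record and a simple chosen/rejected dictionary format; both are illustrative choices, not a description of how any company mentioned here does it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interaction:
    """A logged model suggestion and the user's final version (assumed schema)."""
    prompt: str
    model_output: str
    user_final: Optional[str]  # None if the user abandoned the task


def to_preference_pairs(interactions):
    """Build chosen/rejected pairs from user corrections.

    Abandoned interactions and outputs accepted unchanged are skipped,
    because neither gives a contrast between two candidate answers.
    """
    pairs = []
    for item in interactions:
        if item.user_final is None:
            continue
        if item.user_final.strip() == item.model_output.strip():
            continue
        pairs.append(
            {
                "prompt": item.prompt,
                "chosen": item.user_final,
                "rejected": item.model_output,
            }
        )
    return pairs


log = [
    Interaction("Summarize the visit", "Patient is fine.", "Patient reports mild cough for 3 days."),
    Interaction("Draft follow-up", "See you in two weeks.", "See you in two weeks."),
    Interaction("Write referral", "Referral to cardiology.", None),
]
pairs = to_preference_pairs(log)
```

Only the first interaction yields a pair here, since it is the only one where the user edited the model's draft. Pairs in this shape are a common input for preference-based post-training, though the exact format depends on the training stack you use.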