Part 2: Cheap subsidised AI has made our products lazy
Part one ended on a question: what does thinking harder about AI actually look like? The recalibration is already underway. Here's the framework I keep coming back to.
Part one ended on a question. What does thinking harder about AI actually look like in practice? Before we get there, it's worth noticing that the shift is already underway. In places that aren't making headlines, some companies are pulling back. Teams that were encouraged to burn tokens on every available workflow are being asked to be more selective. The return on all that token spend isn't matching the bill, and the same companies that were beating the AI drum a year ago are starting to ask different questions. This is the recalibration. It's begun.
None of this should surprise anyone who was paying attention a year ago. In 2025, MIT's NANDA initiative published a study on enterprise AI called The GenAI Divide. The headline finding was that 95% of enterprise generative AI pilots had failed to produce a measurable return. Not most. Ninety-five percent.
What's interesting is the diagnosis. Executives in the study tended to blame the models. The researchers found something else. The problem wasn't the AI. It was what the report called a learning gap. People and organisations didn't understand which use cases AI was suited to, and they didn't know how to design workflows that captured the benefit while managing the downsides. The tools worked. The way we were using them didn't.
A year on, the study reads less like a warning and more like a forecast. The recalibration happening now is the predictable response to what was already true then. The shift isn't being driven by AI getting worse. It's being driven by companies finally noticing that the way they've been building with it hasn't been working.
The learning gap shows up first in the process. Most product teams build the same way. Find a pain point. Diverge on solutions. Converge on a feature. It's a good process for most product work. It's the wrong shape for AI.
AI isn't best at solving fresh problems. It's best at optimising friction that's already there. Smoothing the slow parts of a workflow. Reducing the cognitive load on a step a user already does. Surfacing context they would otherwise have to dig for. When teams start from a pain point and ask "how can AI solve this?", they end up reaching for a model whether or not the situation calls for one. When teams start from an existing friction and ask "is there a capability that would make this lighter?", they end up somewhere more useful.
The starting question matters. Most of the lazy features in part one came from the wrong one.
There's a second piece of the learning gap. Most people think of AI as one thing, and that one thing is usually a chatbot. It's the loudest, most expensive, most general form of the technology, and it's the one everyone reaches for by default. But AI isn't a single capability. It's a wide set of approaches with very different costs, speeds, and use cases.
Classification is AI. So is semantic search. So are intent detection, routing, ranking, summarisation, and prediction. Some of these run cheaply on small models or even on-device. Some don't need a large language model anywhere near them. A lot of the friction worth removing in a product can be removed without ever calling the most expensive form of the tool.
Teams that treat AI as a single primitive end up reaching for the most expensive option every time. The teams that know the wider toolkit pick the right tool for the friction in front of them.
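One way to make that concrete is a rough sketch in Python. Everything in it is an assumption for illustration: the capability names, the cost ranks, and the kinds of friction each one covers are made up, not measurements. The point is only the shape of the decision, which is to list what could handle the friction and reach for the cheapest option first.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Capability:
    name: str
    relative_cost: int        # hypothetical unitless rank, 1 = cheapest to run
    handles: frozenset        # kinds of friction this capability can address


# Hypothetical catalogue. Names, costs and coverage are illustrative only.
CATALOGUE = [
    Capability("keyword_rules", 1, frozenset({"routing"})),
    Capability("small_classifier", 2,
               frozenset({"routing", "intent_detection", "classification"})),
    Capability("embedding_search", 3, frozenset({"semantic_search", "ranking"})),
    Capability("large_language_model", 10,
               frozenset({"routing", "intent_detection", "classification",
                          "semantic_search", "ranking", "summarisation"})),
]


def cheapest_capability(friction: str, catalogue=CATALOGUE) -> Optional[Capability]:
    """Return the lowest-cost capability that covers the friction, or None."""
    candidates = [c for c in catalogue if friction in c.handles]
    return min(candidates, key=lambda c: c.relative_cost, default=None)


for friction in ("routing", "summarisation", "translation"):
    choice = cheapest_capability(friction)
    print(friction, "->", choice.name if choice else "nothing in the catalogue fits")
```

With this toy catalogue, "routing" lands on the keyword rules rather than the large language model, and a friction nothing covers comes back as None, which is its own useful answer.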
The framework I keep coming back to is simple, and it works because it puts the starting question in the right place. Context plus AI capability equals concept.
Context is the use case. The specific friction. The specific user, doing a specific thing, at a specific moment in a workflow. Not "users want to be more productive". Not "people get overwhelmed in long documents". Something concrete enough that you can describe what changes when the friction is gone.
AI capability is the specific tool you're reaching for, picked from the wider spectrum. You should be able to say which one, why, what value it gives the user, what it costs to run, and what its limitations are. If you can't answer those, you haven't picked one yet.
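Those five questions can be held as a simple checklist. The sketch below is a minimal illustration in Python, not a tool I'm claiming anyone uses: the class name, field names, and the example answers are all hypothetical. It only encodes the rule above, that a capability with unanswered questions hasn't been picked yet.

```python
from dataclasses import dataclass, fields


@dataclass
class CapabilityBrief:
    """Hypothetical checklist: one field per question a team should be able to answer."""
    which: str = ""        # which capability
    why: str = ""          # why this one
    user_value: str = ""   # what value it gives the user
    run_cost: str = ""     # what it costs to run
    limitations: str = ""  # what its limitations are

    def unanswered(self) -> list:
        """Names of the questions that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def is_picked(self) -> bool:
        """A capability only counts as picked once every question has an answer."""
        return not self.unanswered()


# Illustrative, partially filled brief.
brief = CapabilityBrief(
    which="semantic search",
    why="users already search, but exact keywords miss related documents",
)
print(brief.is_picked(), brief.unanswered())
```

Here the brief names a capability and a reason but says nothing about value, cost, or limitations, so it doesn't count as picked.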
Concept is what emerges when context and capability are matched well. It isn't a feature you reverse-engineered from a model. It's the design move that lands when you put the right capability against the right friction. Most of the time it's smaller and quieter than the AI feature most teams would have shipped. That's usually a sign you've done it right.
Even with the right context and the right capability, four things have to be true for the concept to actually land.
The first is risk. AI is excellent on bounded, low-stakes friction. It is unreliable on high-stakes decisions, and dressing that unreliability in confident language doesn't make it less unreliable. If the cost of getting it wrong is significant, either the capability needs guardrails, automated checks, and human review built into the design, or it needs to not ship.
The second is integration. Most AI features fail not at the capability layer but at the touch point where the capability meets the user's behaviour. The model can do the thing. The user has to want to do the thing with the model in the loop. Designing that touch point is the most underrated work in AI right now, and it's almost never given the time it needs.
The third is measurement. Teams need to know, before they ship, what success looks like and how they'll know they've reached it. Most don't. AI outputs are messy enough that without a clear evaluation method, every result becomes a matter of opinion. That's how lazy features survive long after they've stopped earning their place.
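A clear evaluation method doesn't have to be elaborate. Here is a minimal sketch in Python, assuming a small set of hand-labelled examples and a success threshold agreed before anything ships. The examples, the toy keyword router, and the 0.9 threshold are all invented for illustration, and accuracy is just one possible measure.

```python
def evaluate(predict, labelled_examples, threshold):
    """Return (accuracy, passed) for a predictor against a threshold fixed in advance."""
    if not labelled_examples:
        raise ValueError("need at least one labelled example")
    correct = sum(1 for text, expected in labelled_examples if predict(text) == expected)
    accuracy = correct / len(labelled_examples)
    return accuracy, accuracy >= threshold


# Hypothetical capability under test: a keyword-based ticket router.
BILLING_WORDS = ("refund", "charged", "invoice")


def keyword_router(text):
    return "billing" if any(word in text.lower() for word in BILLING_WORDS) else "account"


# Hypothetical hand-labelled examples, written down before shipping.
LABELLED_EXAMPLES = [
    ("refund my order", "billing"),
    ("reset my password", "account"),
    ("charged twice for one order", "billing"),
    ("where is my invoice", "billing"),
    ("update my card details", "billing"),
]
SUCCESS_THRESHOLD = 0.9  # assumed bar, agreed up front

accuracy, passed = evaluate(keyword_router, LABELLED_EXAMPLES, SUCCESS_THRESHOLD)
print(f"accuracy={accuracy:.2f} passed={passed}")
```

The toy router gets four of the five examples right and misses the bar. What matters is that the bar was written down first, so the result is a number to argue with rather than an opinion.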
The fourth is judgement. I've written elsewhere about this, but it's worth repeating here. Knowing when not to build is part of the work. The equation lets you check whether a concept is worth shipping. The harder discipline is being willing to walk away from a concept that isn't, even when leadership wants you to ship something.
The recalibration won't be kind to teams that spent the cheap era bolting AI on. It will reward the teams that were already thinking like this. The ones who started from context, picked the right capability for the job, and shipped concepts that earned their cost from day one.
None of this is news. Product teams have become complacent about the basics, because cheap AI gave us permission to be. The teams that snap out of it now will be a long way ahead of the ones who don't.
If you take one thing from these two pieces, let it be the question. Not "how can we add AI to this?" Ask, "what's the friction worth removing, what capability fits, and is the concept that emerges worth what it costs?" That's the shift. It's not bigger than that.