I am a strategist. I cover Technology, Creativity, Product, and AI: the four forces rewriting how organisations work. Patient Comet is where I do that thinking in the open, one researched argument at a time.
Most of my work is about the same thing regardless of what we call it: helping organisations tell the difference between a shift that will matter and one that won’t. In AI, that means knowing which capability is genuine and which is a demo in a good suit. In product, which interface survives the AI transition and which doesn’t. In creativity, which tool gives you leverage and which just gives you more output. Closing that gap is worth doing carefully.
The mission
Why this exists
Most writing about technology is reactive. It chases the launch and the share price, and by Friday it is landfill.
This is built the other way round. Each week I take one shift and stay with it. A shift in AI. In how products are built. In how creativity works. In how organisations make decisions. I stay with it until it is useful to you: where it stands now, where I think it is heading, and the one move worth making.
Researched. Openly opinionated. Honest about what we do not yet know. Written for people who need more than a headline and have no patience for hype.
The metaphor
The name
Every major technological wave is the comet: vast, cold, fast-moving, and very easy to be swept along by. Mobile was one. Social was one. AI is the current one. The comet changes. The patient instinct doesn’t.
Strategy is the opposite instinct. The deliberate position you hold while everyone else reacts. Hold that position. Let the noise burn off, and the impact arrives the way a comet does. Rare, and impossible to miss.
Patient Comet is a standing reminder: patience and precision outlast speed and panic.
The method
How each article is built
Every piece follows the same discipline. It opens on something real and documented, never a hypothetical. It makes one argument, stated plainly.
Every load-bearing number is checked against a primary source and cited, and any figure that is shaky gets flagged as shaky rather than dressed up to look certain. Before I land my own view, I give you the strongest version of the case against it. Then it ends on one thing you can actually do on Monday.
Patient Comet · Software · 3 April 2026 · 15 min read
The Vibe Coder Fallacy
In February 2026, Moltbook, an AI-built social network, was fully breached three days after launch. In September 2025, Quittr, a vibe-coded habit app that reached one million dollars in revenue and an Oprah mention, exposed the private records of 600,000 people for months. In late 2025, a SaaS product watched users find and use a free bypass of its paid subscription within 72 hours of going live. All three looked completely finished. None were.
Most people building with AI tools today have never seen a production-ready product from the inside. They have seen the front end: what loads, what clicks, what responds in the demo. The eleven layers underneath are invisible to them. So is the design system that was never built. This is about what actually happens between a vibe-coded demo and a product someone can safely use, and what you need to know before you ship.
Nadim A. Massih
4.75M
Records exposed by Moltbook (the AI social network), three days after launch and exploitable in under three minutes (Wiz, 2026)
91.5%
AI-built apps with at least one security vulnerability, in an independent audit of more than 200 apps (GuardMint, Q1 2026)
63%
of people building apps with vibe coding tools have no coding background (TechTimes, 2026)
What these three mean together: Most people building AI apps today have never seen the layers underneath what they are shipping. Nine in ten of those apps prove it, with security holes nobody found because nobody with the right background was ever involved.
The Best Demo Anyone Had Ever Seen
Vibe coding, a term coined in early 2025 by Andrej Karpathy (a researcher who co-founded OpenAI and previously led AI at Tesla), is a way of building software by describing what you want to an AI in plain English, rather than writing the code yourself. You tell the model what you want, it writes the code, and you refine by describing problems rather than debugging lines. The model types. You provide the vision.
There is a moment in every demo room, physical or digital, where something clicks.
The product loads. It responds. Someone types something and it does the right thing. Faces change. The questions stop. Somebody says “when can we have this?” and everybody else is already nodding.
That moment happened millions of times in 2025 and 2026. It happened with unprecedented frequency because anyone could now create a working, responsive, convincing product in an afternoon.
In February 2026, a founder brought Moltbook into existence this way. An AI social network (a place for AI agents to interact and post) built without a single line of human-written code. The founder said so publicly. Karpathy praised it. The internet arrived. It loaded. It animated. It responded.
It looked, by every measure a normal person would use, completely finished.
Three days later, a security firm called Wiz (a US cybersecurity company) opened the JavaScript that Moltbook’s server sent to every visitor’s browser. Inside it: a database key (a password to the entire database) sitting in plain view. Moltbook was built on Supabase (a popular cloud database service), which requires Row Level Security (the policy that controls which users can read which rows) to be explicitly switched on. It had not been. Default: everything open to anyone with the key. The key was in the code every visitor downloaded automatically.
A researcher exploited the full breach in under three minutes.
Behind those rows: 4.75 million records. One and a half million API keys (passwords to other connected services). Thirty thousand email addresses. Thousands of private messages (Wiz, 2026).
Three days from launch. Under three minutes to exploit.
The demo had been perfect. The product had never existed.
That gap has a name. And it is not a bug in the tools.
Frontend and Backend Is Not Full-Stack
The misconception at the centre of almost every vibe coding failure is not about the tools.
When a non-technical founder, product manager, or first-time builder thinks “full-stack” (the complete thing, front to back) they picture two layers: the interface the user sees, and the server that powers it. Frontend and backend. The AI builds both. The prototype demonstrates both.
A production-ready product requires thirteen layers, not two. Among the eleven beyond frontend and backend: Auth & Permissions (controlling who can access what). Security & RLS (the policies that enforce those controls at the database level). Rate Limiting (stopping a single user from overwhelming the system). CI/CD & Version Control (a pipeline of automated steps for updating the product safely without breaking what already works). Error Tracking & Logs (knowing when something fails, how, and why). Availability & Recovery (continuing to run, and returning to normal when it does not).
None of these appear in a demo. The demo is a performance of layers one and two. The other eleven have no role in it.
And the diagram does not show everything. It shows the technical stack. What it does not show is the design layer, the discipline that determines whether anyone actually wants to use what you built.
A vibe-coded interface assembles functional elements without a design intention running through them. It is like a room furnished from eleven different houses. The chair came from a kitchen, the desk from a bedroom, and the light was designed for a hallway. Each piece works on its own. The room never does. This is not a cosmetic problem. It shows up in conversion rates, churn, and the gut feeling every user has that something is slightly off, before they can name it.
In an audit of twelve production-ready vibe-coded apps, four had live payment credentials or database master keys shipped directly in the front-end JavaScript, visible to anyone who opens the browser developer tools (getautonoma.com, 2026).
These are not edge cases from one bad actor. Moltbook was not even the worst of them.
What vibe coders think full-stack is vs production reality
Frontend and backend are the top two layers of thirteen. The remaining eleven are where real products live, and where vibe-coded apps break. (Patient Comet / getautonoma.com, 2026)
The Gap a PM Can't See
Moltbook is the story everyone cites. The Quittr story is the one that matters more.
In September 2025, a habit-tracking app called Quittr launched with AI coding tools. Within ten days it reached one million dollars in revenue. Oprah Winfrey mentioned it. It was the kind of launch that makes investors reach for their phones at midnight.
What no one knew was that Quittr’s Firebase database (the cloud service where it stored all its user data) was publicly readable. Any authenticated user could access the full backend, where the records of more than 600,000 people were stored openly. The records included ages, triggers for behavioural habits, and personal confessions about users’ struggles (Cybernews, 2026).
A security researcher found it and contacted Alex Slater, Quittr’s co-founder, on 10 September 2025. Slater’s response: he would fix it “in the next hour.”
He did not fix it for months.
Quittr co-founder Alex Slater, whose app was earning $500,000 a month while exposing the private records of 600,000 users. He was told about the breach in September 2025 and promised to fix it in an hour. (404 Media, March 2026)
This is the product manager trap. As of May 2026, 63% of the people building applications with vibe coding tools have no coding background (TechTimes, 2026). The majority of people making build-and-ship decisions about AI-generated products cannot see the eleven layers that are missing. Not because they are careless. Because they have no frame for what is behind the wall.
Slater had built something that worked, that was making real money, that Oprah had mentioned. In his mental model, things were mostly working. The security misconfiguration was invisible to him. So was its severity.
The same blindspot hides design failures. A product manager without design experience cannot see that the first few screens of the app are losing eight in ten users before they ever reach the product. They cannot see that error messages are written in developer shorthand that no user understands. They cannot see that the mobile layout is technically responsive but requires a precision of thumb movement real users do not have.
A building inspector knows what to look for behind the wall. A designer knows what to look for in a user’s first sixty seconds. Someone who has only ever toured the show home sees neither.
The Quittr timeline: from demo to disclosure
Quittr went from demo-room success to an Oprah mention to a 600,000-record data exposure, all without the security layer ever being built. (Cybernews, 2026)
This Pattern Keeps Repeating
If Moltbook and Quittr were isolated incidents, you could call them bad luck.
In late 2025, Enrichlead (a software product sold on subscription) watched users find a complete paywall bypass within 72 hours of launch, by changing one value in a hidden developer panel built into every web browser. The check that verified whether a user had paid existed only in the front end, the layer the user controls. Anyone could simply remove it (getautonoma.com, 2026).
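The shape of that failure fits in a few lines. This is a minimal sketch, not Enrichlead's actual code: the `PAID_USERS` store, the `client_claims_paid` flag, and both function names are hypothetical, chosen only to contrast a check the browser controls with one the server controls.

```python
# Hypothetical server-side record of who has paid. In a real product this
# would be a database table updated by the payment provider, not a set.
PAID_USERS = {"user-123"}


def unsafe_can_access(client_claims_paid: bool) -> bool:
    """Trusts a flag sent by the browser.

    The user controls the browser, so the user controls this value.
    """
    return client_claims_paid


def safe_can_access(user_id: str) -> bool:
    """Ignores anything the client claims and checks the server's own record."""
    return user_id in PAID_USERS
```

With the unsafe version, a visitor who flips the flag to `True` gets in without paying. With the safe version, the same visitor is refused because the server never asks the browser for the answer.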
Also in 2025, Lovable’s AI generated access control logic (the rules that decide who can touch what) across 170 production applications and got it backwards. Users who had logged in were blocked. Everyone else had open access. The AI had implemented the security layer. It had simply inverted it (CVE-2025-48757, a formally recorded software vulnerability).
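An inverted rule is easy to write and easy to catch, provided someone tests both sides of it. The sketch below is illustrative only and is not the code behind CVE-2025-48757: `inverted_rule`, `correct_rule`, and `audit_rule` are hypothetical names, and the rule is reduced to a single logged-in flag.

```python
from typing import Callable, List

# An access rule takes "is this visitor logged in?" and returns "allow?".
AccessRule = Callable[[bool], bool]


def inverted_rule(is_authenticated: bool) -> bool:
    # The bug: the condition is negated, so logged-in users are refused
    # and anonymous visitors are let in.
    return not is_authenticated


def correct_rule(is_authenticated: bool) -> bool:
    return is_authenticated


def audit_rule(rule: AccessRule) -> List[str]:
    """Exercise the rule from both sides and report anything backwards."""
    problems = []
    if rule(False):
        problems.append("anonymous visitor was allowed in")
    if not rule(True):
        problems.append("logged-in user was blocked")
    return problems
```

Running `audit_rule(inverted_rule)` reports both problems, while `audit_rule(correct_rule)` returns an empty list. A rule that is present but never tested from the anonymous side can pass every demo.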
Amazon’s internal teams, deploying vibe coding tools with experienced engineers, produced four Sev-1 failures (the most critical level of production incident) in ninety days, including a six-hour outage.
And in April 2026, Lovable the platform (used by eight million people, valued at $6.6 billion) ran for forty-eight days with an authorisation flaw. Every user’s source code, database credentials, and AI chat histories were readable by any other logged-in user. A bug report was filed. Lovable’s team reviewed it, marked it “intentional behavior,” and closed it without escalation. They fixed it in two hours once the story went public (The Next Web, 2026).
The Next Web’s investigation into Lovable’s 48-day exposure: source code, database credentials, and AI chat histories readable by any logged-in user. (The Next Web, April 2026)
In an independent audit of 200+ vibe-coded apps, 91.5% contained at least one vulnerability (GuardMint, Q1 2026). More than 380,000 vibe-coded apps are publicly accessible on the open web, and 2,000+ are actively leaking data, no attacker required (Red Access, 2026). AI-generated code now causes one in five enterprise security breaches (per industry analysis, 2026). These are not rare edge cases. They are the default outcome of shipping eleven unbuilt layers.
AI-generated code CVEs: published vulnerabilities, Jan-Mar 2026
CVEs attributed to AI-written code tripled in three months. The true count is estimated to be 5-10× higher. (Georgia Tech Vibe Security Radar, 2026)
Four Failure Modes
These incidents look different. One founder, three days. Another founder, months. A platform, forty-eight days. Amazon, ninety days. But they all fail in the same four ways.
Technical failure is the most documented. Missing authentication, exposed database keys, access rules never configured, or configured backwards. These are fixable. The thirteen-layer framework tells you exactly where to look. Technical failure happens when the layers under the interface are never built by people who know what building them requires.
Design failure is less discussed and equally real. A vibe-coded interface assembles functional elements without a design intention running through them. The user journey was never mapped. Error states were never written in language users can act on. Onboarding was never tested against how a real person reads and moves. Mobile “works” but is not designed for how people actually use their thumbs. Users leave products that feel wrong before they can name why. And a non-technical PM cannot see this failure any more than they can see a missing security layer, because they have no frame for what a well-designed product experience looks like from the inside.
Organisational failure is quieter and more consequential. Slater was not negligent. He was operating in a mental model where the demo was the product. When the person making decisions about readiness has never built or designed any of the layers underneath, those layers are permanently invisible to the decision-making process. The same blindspot covers security and design.
Perceptual failure is the one that makes the other three inevitable. It is last on this list but first in the chain: everything else flows from it. It happens in the demo room. The prototype loads, clicks, and does the right thing. The investor nods. The client says “when can we have this?” The PM says “soon.” Everyone has experienced the same convincing illusion: a working product. What they experienced was two layers of thirteen, with no design system behind it.
Prototypes used to look like prototypes. Now a vibe-coded prototype loads, responds, and delivers the product’s core value proposition completely. The gap between it and a real product is invisible. Invisible gaps are the most expensive kind.
Vibe coding’s honest deal: speed in, risk out
Vibe coding is 74% faster to build, and produces 2.74× more vulnerabilities per deployment. Speed in, risk out. (ortemtech / industry analysis, 2026)
What the Prototype Is Actually For
None of this is an argument against building fast. The speed is real and useful.
The prototype is the most precise brief you will ever hold.
It communicates to engineers, to designers, to security specialists, to stakeholders exactly what you are trying to build, with a fidelity that no slide deck or written spec can match. You built the interface. You can demonstrate the flow, the feel, the user journey. You have eliminated the most common cause of failed software projects: the gap between what the people building it imagine and what the people wanting it actually need.
That is where the prototype’s job ends.
Treating it as a brief resolves all four failure modes at once.
It fixes the technical failure, because the brief goes to specialists before launch, not after the breach. It fixes the design failure, because the prototype is handed to a designer who uses it as a reference for the actual design system: the user journey, the error states, the onboarding, the mobile experience, the copy. The prototype shows what is wanted. The designer builds what will actually work. It fixes the organisational failure, because the PM’s role shifts from “is this ready to ship?” to “have the engineers and the designer finished their layers?” Those are questions anyone can ask and verify. And it fixes the perceptual failure, because when the prototype is a brief, the demo room’s question becomes “is this what we want to build?” not “when can we launch this?”
The right workflow: prototype → brief engineers and a designer → engineers build layers 3-13 → designer builds the design system → security review → launch.
The wrong workflow is: prototype → someone who cannot see the gap says “it looks ready” → launch.
Most vibe-coded products that reach production fail. Not because the tool failed. Because the prototype was shipped as the product, without the engineers who build the technical layers, without the designer who builds the product experience, and without the security review that confirms both are done.
Right vs wrong workflow: what happens after the prototype works
The right workflow puts engineers and a designer between the prototype and the launch. The wrong workflow is the one most vibe-coded products follow. (Patient Comet analysis, 2026)
The Take
The Cost of Looking Finished
There are roughly 380,000 vibe-coded apps on the public internet. More than 2,000 are leaking sensitive data as you read this, with no attacker required. The demo worked. The product never existed.
Most were built by people who never saw past the front end. Most were managed by someone whose entire experience of the product was the demo. Most had a demo room where everyone agreed it was ready, and nobody in the room had the background to know what they could not see.
Amazon used these tools with experienced engineers and still produced four critical failures in ninety days. Lovable built the tools and still ran inverted access control logic in 170 of the apps it generated. When the people who built the tools are running into eleven missing layers, the people using the tools to build their first product certainly will too.
The prototype is extraordinary. It is the best brief you have ever held. It is not the product.
The product is built by the people who know what the demo cannot show: the engineers who build the layers underneath, and the designers who build the system above. Use the prototype to brief them. Then let them work.
Where to start
Treat the prototype as a brief for two disciplines, not one. Hand it to an engineer for the technical layers. Hand it to a designer for the product experience: the onboarding, the error states, the mobile layout, the copy, the user journey. Both briefs come from the same prototype. Neither discipline can do the other’s job.
List your thirteen layers and mark which are actually built. Frontend · APIs & Backend Logic · Database & Storage · Auth & Permissions · Hosting & Deployment · Cloud & Compute · CI/CD & Version Control · Security & RLS · Rate Limiting · Caching & CDN · Load Balancing & Scaling · Error Tracking & Logs · Availability & Recovery. Walk each one. Ask plainly: built, or looks built?
Open your own front-end code the way a researcher would. Everything your server sends to a visitor’s browser is visible to that visitor. If there is a database key, a payment credential, or a service token in there, someone will find it. This takes ten minutes to check. Do it before you launch.
Bring in a security engineer and a backend engineer before launch, not after. A security engineer audits for the layer failures that do not appear in demos. A backend engineer confirms that authentication, authorisation, and access rules are correctly implemented: not just present, and not backwards. The review is the gate between a prototype and a product.
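The ten-minute front-end check described above can be partly automated. What follows is a rough sketch under stated assumptions: the pattern list is illustrative and incomplete, `scan_bundle` is a hypothetical helper, and the input is whatever JavaScript your own site serves, saved to a string. It flags lines that look like they carry a credential so a human can read them.

```python
import re
from typing import List, Tuple

# Rough heuristics only. This pattern list is an illustrative assumption,
# not a complete or authoritative catalogue of credential formats.
SUSPICIOUS_PATTERNS = {
    "Stripe-style live secret key": re.compile(r"sk_live_[0-9A-Za-z]{10,}"),
    "JWT-shaped token": re.compile(
        r"eyJ[0-9A-Za-z_-]{10,}\.[0-9A-Za-z_-]{10,}\.[0-9A-Za-z_-]{10,}"
    ),
    "secret-looking name": re.compile(
        r"(service_role|secret|private_key|password)\s*[:=]", re.IGNORECASE
    ),
}


def scan_bundle(js_source: str) -> List[Tuple[int, str]]:
    """Return (line number, label) for every line that matches a pattern."""
    findings = []
    for line_number, line in enumerate(js_source.splitlines(), start=1):
        for label, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, label))
    return findings
```

An empty result is not proof of safety, since the patterns are narrow and minified bundles often sit on one very long line. A non-empty result is a reason to stop and look before launch.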
The question is never “how fast did they build it?” The question is who built the parts that do not appear in demos, and whether those people were ever in the room.
Written by Nadim A. Massih, AI & Tech Strategist
Common questions
Questions, answered first
Why do most vibe-coded apps fail in production?
Because they ship two layers and call it full-stack. The interface works. The eleven layers between the UI and a secure, stable, scalable product were never built by anyone who knows how to build them. In a 2026 audit of 200+ vibe-coded apps, 91.5% contained at least one vulnerability for exactly this reason (GuardMint, Q1 2026).
What does UX/UI failure in a vibe-coded app actually look like?
The interface assembles functional elements without a design system behind them. Buttons are in technically reasonable places, not the places users expect. Error messages are written in developer language users cannot act on. Onboarding functions but was never tested against how a real person reads and moves. Mobile works technically but was never laid out for how people use their thumbs. Users leave products that feel wrong before they can name why, and those users rarely come back.
What is the PM trap?
A non-technical product manager steering a vibe-coded prototype toward production cannot see the eleven missing technical layers or the missing design system, because they have no frame for either. The demo worked. In their mental model, things are mostly working. As of 2026, 63% of people building with vibe coding tools have no coding background (TechTimes, 2026).
Is vibe coding worth using?
Yes, for building the brief. The prototype is an extraordinarily precise communication tool. Use it to brief engineers and a designer. Let them build what the prototype cannot show.
What if I already shipped?
Audit immediately. Start with Auth & Permissions and Security & RLS. Check whether any credentials are in your front-end code. Confirm your database access rules limit each user to their own data. Bring in a security engineer for a pre-incident review. Have a designer walk your onboarding with fresh eyes. Both are faster and cheaper than the failure they prevent.
Receipts
Sources & references
Wiz, 2026
Moltbook exposed 4.75 million records (1.5M API keys, 30K email addresses, thousands of private messages) via database key in client-side JavaScript. Row Level Security not configured. Exploitable in under three minutes.
Cybernews, 2026
Quittr: 600,000 users’ data exposed via publicly readable Firebase database. Records included ages, behavioral triggers, and personal confessions. Co-founder contacted 10 September 2025; fix took months.
GuardMint, Q1 2026
200+ vibe-coded apps audited: 91.5% contained at least one security vulnerability.
Red Access / Shadow Builders, 2026
380,000+ publicly accessible vibe-coded apps on the open web; 2,000+ actively leaking sensitive data. “Default-public is the breach.”
getautonoma.com, 2026
Audit of 12 production-ready vibe-coded apps: 4 had live credentials in front-end code. Enrichlead: subscription bypass found within 72 hours of launch.
CVE-2025-48757, 2025
Lovable AI-generated access control logic inverted across 170 production applications. Authenticated users blocked; unauthenticated users had access.
The Next Web, 2026
Lovable platform: broken authorisation flaw, 48 days. Every user’s source code, database credentials, and AI chat histories readable by any logged-in user. Bug marked “intentional behavior,” not escalated. Fixed in two hours once public.
TechCrunch, 2026
Lovable: $6.6 billion valuation, approximately eight million users.
TechTimes, May 2026
63% of vibe coding users have no coding background.
Patient Comet · Economics · June 2026 · 14 min read
LLMflation: The Price of the Wrong Hands
In May 2026, Microsoft cancelled its Claude Code licences after engineers burned through the company’s entire annual AI budget in months. At the same time, one unnamed enterprise received a $500 million bill from Anthropic for a single month. Both made the same mistake: treating frontier AI as a universal tool rather than a professional resource.
The price of AI is falling faster than almost any technology in history. The bill is rising anyway, because ungoverned access to the most expensive models, combined with the multiplication effect of agentic AI (software that takes sequences of actions without a human approving each step), overwhelms any per-unit saving. This is about the economics behind both bills: the hidden meter running inside your infrastructure, why AI amplifies skill in both directions (and amplifies the absence of it even faster), and what the six per cent of companies that have solved this actually do differently.
Nadim A. Massih
$500M
Spent by a single enterprise on Claude in a single month: one uncontrolled agentic deployment, no usage caps, access granted to the entire workforce (cybernews.com, May 2026)
6%
Companies that have meaningful AI cost control. The rest are watching a meter they cannot read (McKinsey, 2025)
1,000×
More tokens an agentic AI task burns compared with a standard conversation (Stanford Digital Economy Lab, 2025)
What these three mean together: Most companies have handed their entire workforce access to the most expensive AI tools available, with no skill requirements and no way to measure whether the output was worth what it cost. The meter is running. Six per cent of companies know what it is reading.
The Bill That Changed the Conversation
In May 2026, the story that changed how enterprise AI is discussed did not make every front page.
It surfaced quietly. An unnamed company had run up a $500 million bill on Anthropic’s Claude in a single month. Not through negligence, exactly, but through deployment. They had given their entire workforce access to the model, with no usage caps and no governance. The agentic workflows (automated loops where the AI plans, acts, reads the result, and acts again, re-sending the full conversation at every step) had turned what looked like a modest per-query cost into a figure that exceeded most companies’ entire annual technology budget (byteiota.com, May 2026).
Microsoft’s story broke around the same time and was more specific. The company had invited thousands of engineers in its Experiences and Devices division to use Claude Code starting in December 2025. By May 2026, the tool had become the preferred choice inside the division. Token costs per engineer began outpacing what Microsoft paid those engineers in salary. The company set a June 30 cutoff and redirected engineers to Copilot CLI (Cybernews; The Next Web, 2026).
Uber had got there first. Its operations chief told The Information in April that the company had burned through its entire 2026 AI coding budget in four months: roughly 5,000 engineers on Claude Code, the heaviest users spending $500 to $2,000 each per month (Tom’s Hardware, 2026).
None of these are outliers. They are case studies in a structural problem: the price of AI has collapsed, so every new use case looks affordable. Then you do the arithmetic on volume.
The term for the dynamic: LLMflation, coined by a16z (Andreessen Horowitz, a major Silicon Valley venture fund) in 2024 to describe the paradox where AI costs per call keep falling while AI bills keep rising. A model good enough to pass a standard knowledge test cost roughly $60 per million tokens (each token is roughly three-quarters of a word) in late 2021. By 2024 it was $0.06, a fall of roughly 1,000× in three years, faster than the PC era’s decline in compute. But AI is billed per action, not per seat. Lower prices triggered far more actions, not lower bills: total enterprise AI spend rose more than 300% in two years (a16z, 2024; Ramp, 2026).
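The arithmetic behind the paradox is short. Spend is price times volume, so if price falls and spend still rises, volume must have grown by the product of the two ratios. The sketch below uses the figures quoted above purely as an illustration: the function name is hypothetical, the price and spend figures cover different time windows, and real enterprises buy a mix of models rather than only the cheapest one, so treat the result as an upper-bound intuition, not a measurement.

```python
def implied_volume_multiplier(price_drop_factor: float, spend_rise_pct: float) -> float:
    """How much token volume must have grown, given spend = price x volume.

    price_drop_factor: old price divided by new price (1000 means 1,000x cheaper)
    spend_rise_pct:    percentage increase in total spend (300 means +300%)
    """
    spend_multiplier = 1 + spend_rise_pct / 100  # +300% means 4x the old bill
    return spend_multiplier * price_drop_factor


# Figures quoted in the text: $60 falling to $0.06 per million tokens,
# and total enterprise spend up 300%.
price_drop = 60 / 0.06
volume_growth = implied_volume_multiplier(price_drop, 300)
```

Under those assumptions, volume would need to grow on the order of thousands of times for the bill to quadruple while the unit price collapsed. That is the sense in which cheaper tokens produced bigger bills.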
Why Microsoft cancelled Claude Code: not because the tool failed, but because AI billing breaks the maths when engineers use it constantly. (AI analysis, 2026)
The LLMflation paradox
Token prices fell roughly 1,000× over three years. Enterprise AI spend rose more than 300% in two. This is LLMflation. (a16z, 2024; Ramp, 2026)
The Meter Never Stops
Three things are happening simultaneously inside most enterprises. They look unrelated. They are the same problem.
The first: usage consistently outruns the forecast. In 2025, 85% of companies missed their AI cost projections by more than 10%; 24% missed by more than 50% (Mavvrik / Benchmarkit, 2025). Two years ago, 31% of finance teams actively managed AI spend. Today it is 98%. The share more than tripled not because companies got disciplined, but because the bills arrived and demanded attention (State of FinOps, 2026).
The second: agents turn the meter into a fire hose. A standard AI chat interaction uses a few thousand tokens. A million tokens is roughly a 750,000-word document, the equivalent of asking the model to read and write War and Peace from scratch, for every complex task. An agentic workflow re-sends the entire conversation context (the AI’s working memory) at every step. Industry data puts agent token use at 10-100× a standard chat. Stanford’s Digital Economy Lab measured the most complex agentic tasks at 1,000× a standard reasoning call. Simple tool-calling agents consume 5,000-15,000 tokens per task; complex multi-agent systems consume 200,000 to over 1,000,000 tokens per task (Stanford Digital Economy Lab, 2025). Goldman Sachs estimates agents will multiply total enterprise token demand 24× by 2030.
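Re-sending the context at every step is why agent costs grow faster than the step count. A minimal sketch, with made-up numbers: the function name and the 2,000 / 1,000 / 50 figures are hypothetical, and the model ignores prompt caching and the price difference between input and output tokens.

```python
def agent_tokens(initial_context: int, tokens_added_per_step: int, steps: int) -> int:
    """Total tokens processed by a loop that re-sends its whole context each step.

    Each step reads the full context so far, then appends new tokens to it,
    so the next step has more to read.
    """
    total = 0
    context = initial_context
    for _ in range(steps):
        total += context + tokens_added_per_step  # read everything, write a little
        context += tokens_added_per_step          # the context only ever grows
    return total


single_chat = agent_tokens(2_000, 1_000, steps=1)   # one question, one answer
agent_run = agent_tokens(2_000, 1_000, steps=50)    # a 50-step agentic task
ratio = agent_run / single_chat
```

With these illustrative inputs, one exchange processes 3,000 tokens and the 50-step run processes 1,375,000, several hundred times more. The growth is roughly quadratic in the number of steps, which is how a task that looks like fifty small calls ends up in the million-token range described above.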
The third: the most expensive model is rarely the right model. The price gap between the cheapest production AI model and the most expensive frontier reasoning model is currently about 4,500×. The cheapest runs at around $0.04 per million tokens. The most expensive frontier reasoning models run at $180 per million tokens. Most enterprises have deployed the top tier to everyone. Analysis consistently shows that roughly 85% of enterprise queries could be handled by budget-tier models (the ones at the $0.04 end) with no meaningful quality loss (FinOps Foundation, the industry body tracking AI infrastructure costs, 2026).
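The effect of routing can be estimated with a weighted average. This is a sketch under a simplifying assumption that is not in the source: every query uses the same number of tokens whichever tier handles it. The two prices are the ones quoted above, and `blended_price` is a hypothetical helper.

```python
BUDGET_PRICE = 0.04     # dollars per million tokens, cheapest tier quoted above
FRONTIER_PRICE = 180.0  # dollars per million tokens, top tier quoted above


def blended_price(budget_share: float) -> float:
    """Average price per million tokens when a share of traffic goes to the budget tier.

    Assumes equal token usage per query across tiers (a simplification).
    """
    if not 0 <= budget_share <= 1:
        raise ValueError("budget_share must be between 0 and 1")
    return budget_share * BUDGET_PRICE + (1 - budget_share) * FRONTIER_PRICE


everyone_on_frontier = blended_price(0.0)
routed = blended_price(0.85)
saving = 1 - routed / everyone_on_frontier
```

Routing 85% of traffic to the budget tier takes the average from $180 to roughly $27 per million tokens, a saving of about 85%. Almost all of the remaining bill is the 15% still on the frontier tier, which is why the routing decision matters more than the budget model's exact price.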
Match the model to the task
The price gap between the cheapest and most expensive model tier is approximately 4,500×. Most enterprises deploy the top tier to everyone. (Industry pricing, 2026)
The hidden overhead compounds everything. OpsLyft (an AI cost analytics firm) found that most teams underestimate their true AI bill by 40-60%. A call that looks like $0.05 typically lands at $0.20 once embeddings (text to searchable numbers), retrieval, re-ranking, output validation, retries, and prompt overflow are counted (OpsLyft, 2026).
This Comes Straight Off Your Margin
Inference is not a technology line item. It is cost of goods sold, the costs that come directly out of gross margin, the number investors use to price companies.
84% of companies reported that AI was already eroding gross margins by more than 6% (Mavvrik / Benchmarkit, 2025). Bessemer’s portfolio data puts AI-native gross margins at 50-60%, against 70-90% for mature SaaS: software economics is quietly turning into manufacturing economics (Bessemer, 2025). Even OpenAI, the company whose products power much of this spending, ran $13B in revenue against $22B in costs in 2025, a $9B net loss at scale (Fortune, 2025).
The pattern is consistent: the more AI gets used, the more it costs, and the less of that cost is recovered in measurable output. This is not a price problem. It is a volume and governance problem. The companies that have protected their margins are not the ones with cheaper models. They are the ones that controlled who was using which model for what. There is a reason for that, and it comes before usage.
The Access Problem
The piece most AI cost discussions miss is the one that makes every other problem harder to fix.
In 2026, Meta created internal leaderboards ranking employees by token consumption, a practice that quickly became known as “tokenmaxxing” (spending tokens competitively, not productively). Employees competed to spend the most tokens, throwing entire projects at agentic AI without any discipline about whether the output was good, whether a cheaper model would have worked, or whether the task was suited to AI at all. The leaderboard rewarded consumption, not outcomes (industry reporting, 2026).
Microsoft’s experience tells a version of the same story. The engineers who used Claude Code most heavily were not necessarily the ones producing the best software. Token costs outpaced salaries. And Faros AI found that code churn (lines of code deleted versus added, a measure of rework) increased by more than 800% in teams with high AI adoption. The teams spending the most on AI were producing the most waste (Faros AI, 2026).
The data points to something the economics alone do not explain: AI amplifies what you bring to it.
It is worth saying plainly that broad access to powerful tools has genuinely produced unexpected value. The marketer with no technical background who caught a product flaw through an unusual prompt. The junior analyst who used AI to try an approach a veteran would have ruled out on instinct, and was right. Ungated access is not without merit. The argument here is not for restriction; it is for matching model tier to demonstrated task competence, with a clear path to qualification for anyone willing to earn it.
That said, the amplification argument is real and the data is hard to ignore. A senior engineer with fifteen years of experience using Claude Code produces faster, better-reviewed code at lower cost, because they can evaluate the output, catch the errors the AI cannot catch, and prompt with precision. Evaluation means spotting when the model has hallucinated a citation (invented a source that does not exist), reversed a logical condition in code, or produced a plausible-looking answer that is wrong in a way only domain knowledge reveals. A junior developer with six months of experience using the same tool produces faster, worse code, because they cannot tell good output from plausible-looking output, they iterate blindly on prompts they cannot evaluate, and they burn tokens on approaches that an experienced engineer would immediately dismiss.
The 4,500× price gap between the cheapest and most expensive models is not the real cost driver. The real driver is the gap between what a skilled user produces per dollar and what an unskilled user produces per dollar. A budget model in the hands of someone with deep domain expertise costs almost nothing and produces real value. A frontier reasoning model in the hands of someone without the background to evaluate its output costs $180 per million tokens and produces fast, confident, unreviewed work that someone else will eventually have to fix.
This is not a controversial idea when applied anywhere else. We do not give junior surgeons operating theatres without supervision. We do not give junior analysts sole authority over financial models that go to a board. We understand, in every other context, that tools amplify skill, and amplifying the absence of skill at speed is not progress. We have simply forgotten to apply this to AI.
Access to frontier AI models (most powerful, most expensive) is a professional resource. It should be treated like one.
How agents multiply token cost
The shift to agentic AI did not change what the model charges per token. It changed how many tokens each task uses. (Stanford Digital Economy Lab, 2025)
Access to frontier AI models is a professional resource. The companies that figured this out first are the ones paying half as much for twice the output.
Govern the Usage and the Access
The companies that have escaped LLMflation, the roughly 6% McKinsey calls “AI high performers”, are not the ones with the cheapest models. They are running two governance layers simultaneously: usage controls and access controls.
Usage governance: four controls that compound
Route every task to the cheapest capable model. A cascade architecture (routing tasks to cheaper models first) sends simple calls to the budget tier and escalates complex ones to frontier models only when the task requires it. Analysis consistently shows this cuts inference cost (the cost of running the model) by 40-70%, because most queries never need to leave the budget tier (FinOps Foundation; arXiv, 2025).
Cache the answers you have already paid for. Semantic caching (reusing stored AI answers) removes about 31% of repeat queries before they reach the model. Prompt caching (reusing unchanged prompt sections) makes cached tokens roughly one-tenth the standard price. Add batch pricing on top (roughly 50% discount) and a call that would have run at $0.20 runs at about $0.01, about 5% of standard cost (OpenAI; Anthropic API docs, 2026).
Cap the spend before it surprises you. Set token budgets as hard ceilings per team and per workflow. Fix retry logic that can loop indefinitely. Define maximum context windows for each use case. The $500M bill happened because there were no ceilings.
Measure cost per outcome, not cost per call. “Cost per call” tells you almost nothing. “Cost per resolved customer query,” “cost per code review,” “cost per draft approved”: these are the numbers that tell you whether the spend is working. The vendors that price this way (Intercom at $0.99 per resolved query, Salesforce Agentforce at roughly $2 per conversation) already know their cost per outcome. The discipline turns the meter into a dial (Intercom, 2026; SaaStr, 2026).
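The arithmetic behind two of these controls is short enough to write down. The sketch below, in Python, stacks the cached-token price and the batch discount quoted above, then computes a cost per outcome. It assumes the whole call qualifies for both discounts, which is the best case; the function names and the spend and outcome figures in the usage lines are hypothetical.

```python
def stacked_cost(list_price: float, cached_price_factor: float = 0.10,
                 batch_discount: float = 0.50) -> float:
    """Price of a call when cached-token pricing and a batch discount both apply in full."""
    return list_price * cached_price_factor * (1 - batch_discount)


def cost_per_outcome(total_spend: float, outcomes: int) -> float:
    """Total spend divided by completed outcomes (resolved queries, approved drafts)."""
    if outcomes <= 0:
        raise ValueError("no completed outcomes to divide by")
    return total_spend / outcomes


call = stacked_cost(0.20)
print(f"${call:.2f} per call, {call / 0.20:.0%} of list price")

# Hypothetical month: 1,200 spent on a support workflow that resolved 800 queries.
print(f"${cost_per_outcome(1_200.0, 800):.2f} per resolved query")
```

One-tenth of the price multiplied by half of the price gives one-twentieth, which is where the 5% figure comes from. The second function is deliberately trivial: the hard part is counting outcomes honestly, not dividing.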
Access governance: the layer most companies skip
Tier model access by role and demonstrated competence. Not everyone needs frontier reasoning. Not everyone should have it. The qualification for moving up a tier is not seniority; it is demonstrated domain knowledge: the ability to evaluate what the model produces in that domain and catch what it cannot catch.
The upgrade is available to anyone. A junior developer on the budget tier earns frontier access when they can demonstrate they can evaluate what a frontier model produces in their domain, spotting the errors, assessing the quality, catching the hallucinations. That is not hierarchy preservation. It is competence development with a measurable gate.
The advantage is real, and it is unequal: when skilled engineers have frontier access and unskilled ones are routed to budget tiers, the skilled engineers work faster and the budget recovers its balance. The companies with ungated access to everything get neither the speed nor the savings.
Run both governance layers and the cost reduction runs to 60-90% against ungoverned baseline, according to industry analysis (FinOps Foundation, 2026).
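Running both layers at once can be sketched as a single routing decision: pick the cheapest tier assumed capable of the task, then cap it at the tier the user has qualified for. The Python below is a minimal illustration, not a description of any vendor's router. The tier names follow the budget, standard and frontier split used in this article; the task kinds, the mapping, and the choice to flag capped requests for review are assumptions made for the example.

```python
from dataclasses import dataclass

TIERS = ["budget", "standard", "frontier"]  # ordered cheapest first

# Hypothetical mapping from task kind to the cheapest tier assumed able to do it.
CHEAPEST_CAPABLE = {
    "summarise": "budget",
    "draft": "budget",
    "analysis": "standard",
    "agentic": "frontier",
}


@dataclass(frozen=True)
class Decision:
    tier: str           # the tier the request is actually sent to
    needs_review: bool  # True when the task wanted a tier the user has not qualified for


def route(task_kind: str, user_max_tier: str) -> Decision:
    """Pick the cheapest capable tier, capped by the user's qualified tier."""
    if task_kind not in CHEAPEST_CAPABLE:
        raise ValueError(f"no routing rule for task kind {task_kind!r}")
    needed = TIERS.index(CHEAPEST_CAPABLE[task_kind])
    allowed = TIERS.index(user_max_tier)
    return Decision(tier=TIERS[min(needed, allowed)], needs_review=needed > allowed)


print(route("summarise", "frontier"))  # a qualified user still gets the budget tier
print(route("agentic", "budget"))      # capped at budget and flagged for review
```

The design choice worth noticing is that qualification raises a ceiling rather than setting a default: a user cleared for frontier access is still routed to the budget tier when the task does not need more.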
Before the verdict, the three honest positions deserve a fair hearing.
Bubble, or Breakthrough?
Three honest positions, all with something right in them.
The builder
“AI is cheap and getting cheaper. Building cost controls too early slows down experiments that might be valuable.”
This is half right: over-governing early-stage experiments can kill them. Where it fails: “too early” was 2021. In 2026, the bill has arrived.
The bear
“Do not build on prices propped up by a loss-making machine.”
OpenAI’s $9B net loss, roughly $1 trillion in AI infrastructure pledged across seven major vendors in overlapping commitments, and Gartner’s estimate that 25% of planned 2026 AI budgets will slip to 2027: these are real signals (The Register; Calcalist, 2025; Gartner, 2026). Two-thirds of companies already plan to move some workloads back to their own hardware.
The operator
“Both describe the same reason to run with controls.”
The strongest counter-argument to this article’s access governance prescription is worth hearing directly: skill-gated access preserves existing hierarchies rather than disrupting them. The companies that will win are the ones that train everyone to use AI well, not the ones that restrict it to people who already know what they are doing. That is a genuinely strong position. The answer to it is in the framing: this is not permanent restriction, it is a qualification path, open to anyone willing to demonstrate they can evaluate the output in their domain. If prices are artificially low, they will rise, and the companies that know their cost per outcome will adapt. Control wins under either future.
The Take
Ungoverned Access Is the Real Bill
LLMflation is not a price problem. It is a volume and access problem, and the companies solving it are the ones who decided, before deploying anything, who needs access to what and why.
Microsoft cancelled the Claude licences because the engineers used the tool. That is not a failure of the tool, it is a failure of the access model. When every engineer gets frontier AI access with no ceiling and no qualification requirement, the ones who use it most are not necessarily the ones producing the best work. They are the ones most comfortable spending tokens on tasks they cannot fully evaluate.
The $500M bill is the same story at enterprise scale: no caps, no tiers, no way to measure whether the output was worth what the month cost.
The price of AI is falling faster than almost any technology in history. The bill is rising because ungoverned access at scale, combined with the multiplication effect of agentic AI (software that takes sequences of actions without a human approving each step), overwhelms any per-unit saving.
The companies that solve this are not the ones negotiating better token rates. They are the ones that have decided, before deploying anything, who needs access to what and why. They measure cost per outcome. They route tasks to the cheapest capable model. They gate frontier access to people with the domain competence to use it properly, and they make the qualification path clear and earnable.
Six per cent of companies have built this. The other ninety-four are paying for the gap (McKinsey, 2025).
Where to start
Get one honest cost-per-outcome number this quarter. Pick one use case: customer query resolution, code review, or document drafting. Measure what it costs per completed outcome. Not per call, not per seat. That one number will tell you more about your AI economics than any tool audit.
Audit who has access to what model tier. List every frontier model deployment in your organisation. For each one, ask: does the person using this have the domain expertise to evaluate what it produces? If the answer is no, move them to the standard tier and define what qualification looks like for the upgrade.
Route before you spend. Before your next AI deployment, define which tasks go to which tier. Budget for summarising and drafting. Standard for analysis. Frontier only for tasks where the specific capability of the frontier model is the difference between a good outcome and a bad one.
Cap before you surprise your board. Set token budget ceilings per team and per workflow before you deploy, not after the bill arrives. Fix any retry logic that can loop. Define maximum context windows. The ceiling is not a constraint on productivity, it is the thing that makes AI spend predictable.
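The cap in the last step can be very small in code. The Python sketch below shows one way to combine a hard token ceiling, a per-use-case context limit and a bounded retry loop. Everything in it is hypothetical: `TokenBudget`, `call_with_cap` and the numbers are illustrative names and values, and `call` stands in for whatever function actually sends the request to a model.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spend past the ceiling."""


class TokenBudget:
    """A hard token ceiling for one team or workflow."""

    def __init__(self, ceiling: int) -> None:
        self.ceiling = ceiling
        self.used = 0

    def charge(self, tokens: int) -> None:
        if self.used + tokens > self.ceiling:
            raise BudgetExceeded(
                f"{self.used + tokens} tokens would exceed the ceiling of {self.ceiling}"
            )
        self.used += tokens


def call_with_cap(call, budget: TokenBudget, est_tokens: int,
                  max_context: int, max_attempts: int = 3):
    """Run `call` under a context limit, a budget check and a bounded retry loop."""
    if est_tokens > max_context:
        raise ValueError("request is larger than the context limit for this use case")
    last_error = None
    for _ in range(max_attempts):    # bounded: this loop cannot run forever
        budget.charge(est_tokens)    # every attempt is paid for, so every attempt is charged
        try:
            return call()
        except TimeoutError as exc:  # retry only on a transient failure
            last_error = exc
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


team_budget = TokenBudget(ceiling=10_000)
print(call_with_cap(lambda: "done", team_budget, est_tokens=3_000, max_context=8_000))
print(team_budget.used)
```

Charging before the call, rather than after, is the point: a request that would break the ceiling is refused instead of being discovered on the invoice.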
The right AI tool in the right hands costs almost nothing and produces real value. The wrong AI tool in the wrong hands costs $180 per million tokens and produces confident, fast, unreviewed work that someone else will fix.
Written by Nadim A. Massih, AI & Tech Strategist
Common questions
Questions, answered first
If AI is getting cheaper, why is the bill going up?
Because AI is priced per action, not per seat. When each call gets 1,000× cheaper, companies find 1,000 new things worth doing, and the total volume climbs far faster than the price falls. Token prices fell roughly 1,000× over three years while total enterprise AI spend rose more than 300% in two.
What is LLMflation?
The paradox where the price of each AI call keeps falling but the total cost of AI keeps rising, driven by volume, agentic multiplication, and the tendency to use the most expensive model for every task regardless of whether it is needed.
Why did Microsoft cancel Claude?
Cost. Engineers used it heavily, it became the preferred tool over Microsoft’s own Copilot, and token costs per engineer began outpacing what Microsoft paid those engineers in salary. The company cancelled most licences in its Experiences and Devices division with a June 30, 2026 cutoff and redirected developers to Copilot CLI, which Microsoft also has a strategic interest in building adoption for.
Why does experience matter for AI access?
AI amplifies what the user brings. A domain expert using AI produces faster, better, more reliable work, because they can evaluate the output, catch the errors, and prompt with precision. Evaluation means spotting when the model has hallucinated a citation (invented a source that does not exist), reversed a logical condition in code, or produced a plausible-looking answer that is wrong in a way only domain knowledge reveals. A person without domain background produces output they cannot evaluate, burns tokens on approaches an expert would immediately dismiss, and ships fast, confident, unreviewed work. The cost is tokens. The risk is what those tokens produced.
What is the cheapest way to cut an AI bill?
Route (send simple tasks to budget models), cache (stop re-paying for queries already answered), cap (set hard ceilings before deploying), and measure cost per outcome. Combined effect: 60-90% cost reduction against ungoverned baseline. None of these require new models or new vendors.
Receipts
Sources & references
byteiota.com, May 2026
$500M spent by one enterprise on Anthropic’s Claude in a single month following uncapped employee access and agentic loop deployment.
Cybernews / The Next Web, 2026
Microsoft cancels Claude Code licences in Experiences & Devices division (Windows, Microsoft 365, Outlook, Teams, Surface), June 30, 2026. Token costs outpaced engineer salaries. Engineers redirected to Copilot CLI.
Tom’s Hardware; eeNews Europe, 2026
Uber burned through 2026 AI coding budget in four months; ~5,000 engineers on Claude Code; heavy users $500-$2,000/month individually. GitHub Copilot moved to usage-based billing June 2026.
Faros AI, 2026
Code churn (lines deleted vs added) increased 800%+ under high AI adoption. More AI tokens did not produce better code.
Ramp, 2026
Average business spends 13× more on AI than at start of 2025.
a16z “LLMflation,” Nov 2024; Epoch AI
Roughly 1,000× cost decline per million tokens 2021-2024; $60 to $0.06.
OpsLyft, 2026
True AI bills run 40-60% higher than visible token cost once hidden overhead is counted.
Industry data; Stanford Digital Economy Lab, 2025
Agentic tasks consume 10-100× standard chat tokens; most complex tasks roughly 1,000×.
Mavvrik / Benchmarkit, 2025
85% of companies miss AI cost forecasts by more than 10%; 24% by more than 50%; 84% see gross margin erosion more than 6%.
Bessemer, 2025
AI-native gross margins 50-60% vs mature SaaS 70-90%.
FinOps Foundation; arXiv, 2025
Routing / cascade architecture cuts inference cost 40-70%; all four controls combined: 60-90%.
OpenAI; Anthropic API docs, 2026
Prompt caching roughly 90% off cached tokens; Batch API roughly 50%; stacked to about 5% of standard cost. Semantic caching removes about 31% of repeat queries.
Intercom, 2026; SaaStr, 2026
Fin at $0.99 per resolved query; Salesforce Agentforce roughly $2 per conversation.
McKinsey, State of AI, Nov 2025
About 6% of companies are AI high performers; more than 80% see no clear enterprise bottom-line effect.
Goldman Sachs, 2026
AI agents will multiply enterprise token demand roughly 24× by 2030.
Gartner, 2026
25% of planned 2026 AI budgets will slip to 2027.
The Register; Calcalist, 2025
About $1T in AI infrastructure pledged across the same seven vendors in overlapping commitments.
Meta internal reporting, 2026
“Tokenmaxxing” leaderboards; employees ranked by token consumption.
Patient Comet · Infrastructure · 30 April 2026 · 12 min read
Own Your AI
AI is shifting from a service you subscribe to, to a feature you ship. For a decade, building with AI meant plugging into OpenAI, Google, or Anthropic: paying for every small unit of text processed (a token), sending your customers’ data to their servers, renting intelligence you did not own. That model just broke.
DeepSeek trained a world-class AI model for $294,000 and released it free. Google’s Gemma 4 runs on a single graphics card. Apple ships AI that never sends data anywhere. Together, these three events mean that AI is shifting from a service you subscribe to, to a feature you ship. This is what that means for anyone who builds software, and for the organisations they build it for.
Nadim A. Massih
$294K
to train a world-class open AI model, then released free: the era of renting intelligence is ending (Nature, 2025)
1 GPU
is all the hardware Google’s Gemma 4 needs to beat models ten times its size (Google, 2026)
$0
on-device inference (running directly on the device, not in a remote cloud) that never sends data anywhere: the cheapest path is now the safest (Apple, 2025)
What these three mean together: training a world-class AI model now costs less than a family car; running it takes one piece of hardware you can own; and the cheapest option is also the safest one for your data. Together, they mark the moment AI stopped being something you rent from three companies in three data centres, and became something you can build into the products you ship.
The AI Subscription Is Breaking
For the past five years, shipping an AI feature meant one thing.
You built a product. You had an idea for an AI feature (a smart search, an automated summary, a document assistant). You signed up for an API (a connection to someone else’s AI server), and it worked. Impressively, immediately, well.
Your customers loved it.
What they did not see was what happened on every query. Their data (their documents, their messages, their records) left your product, travelled to a server owned by Microsoft, Google, or Anthropic, got processed by a model you did not own, and came back as an answer. You paid for every token, every word, every interaction. The cost scaled with your users. The faster you grew, the larger the bill.
The intelligence was rented. Your product was the front end. The AI lived somewhere else.
For most teams, this was the only option. The models worth using required infrastructure so large and expensive that building or owning them was simply not realistic. You connected to the cloud because the cloud was where the intelligence lived.
That assumption just broke. And it broke on three separate occasions, between January 2025 and April 2026.
Three Events That Changed the Math
Each one dismantled a different part of the old model.
January 2025: DeepSeek releases R1. DeepSeek (a Chinese AI research lab) released a reasoning model (an AI system built to think through complex problems) and published the full method openly. When the training cost emerged in a peer-reviewed Nature paper, it landed like a small bomb: $294,000 (Nature; CNN, 2025). Not $294 million. Less than the price of a family car. For a model that matched the best in the world, then given away free.
The markets understood immediately. The first R1 release had already wiped roughly $589 billion off Nvidia (the company that makes the specialist chips AI runs on) in a single day, the largest single-day loss in US market history (CNBC, 2025). Investors were not frightened of one lab. They were frightened of the assumption under their entire position: that only the very largest companies could build intelligence worth using.
April 2026: Google releases Gemma 4. Google’s Gemma 4, at 27 billion parameters (a rough measure of how much the model has absorbed, like a proxy for its depth of knowledge), beats models ten times its size on human-preference testing (including Llama 405B, Meta’s large open model), and does this on a single GPU (the specialist processor AI uses) (Google, 2026). A model you could host yourself, on one card you own, winning blind taste-tests against rivals that need a computing cluster.
Ongoing 2025: Apple ships AI on the device. Apple builds a roughly three-billion-parameter model directly into its devices, available to every developer, with inference that costs nothing and runs locally, meaning the data never leaves the phone (Apple, 2025). No contract. No server. No log sitting somewhere waiting for a legal request.
Three events. One implication. The ingredient that made powerful AI expensive and remote has become something a product team can own, fine-tune, and ship. The question is no longer whether you can build AI into your product without the cloud. The question is whether you are going to.
Training a world-class AI model: what it cost
The cost of training a frontier-class AI model has fallen from over $100 million in 2020 to $294,000 in September 2025, and inference cost is falling at roughly the same rate. (Nature/CNN 2025; a16z, 2025)
What Builders Can Now Do
Until recently, shipping an AI feature meant one thing: connect to an API and rent someone else’s intelligence. It was the only viable route. Your product was the shell; the intelligence was always somewhere else, always rented, always metered.
Open models (AI models whose design and weights are published freely) change the product equation entirely.
A software team can now take one of these models, fine-tune it (adapt its behaviour by training it further on their specific domain and data), and ship it as a permanent, built-in part of their product. The model travels with the software. When a customer buys the product, they get the AI too.
Not a subscription to the AI. The AI itself.
The product model is changing
The old model required every AI interaction to leave the product and reach a third-party server. The new model puts the intelligence inside the product. Same feature, different architecture, and a fundamentally different business.
Think carefully about what this removes.
No per-query token costs at scale. Once the model is built in, the marginal cost of an AI interaction drops to the compute the customer already owns. No external API dependency. The product works offline, in environments where data cannot leave the building: hospitals, law firms, government offices, banks. And no third-party subscription invisibly embedded in your pricing.
The customer owns what they paid for. Completely.
What happens to your margins as you scale
Under a cloud API model, AI costs scale directly with your user base, compressing margins as you grow. With a built-in model, AI cost is largely fixed. Growth stops working against you.
Now think about what it creates.
A product with a fine-tuned model built in is structurally harder to replicate than one that connects to a shared API. A competitor cannot switch to a better API endpoint and close the gap overnight. The model, trained on your domain knowledge, shaped by your users’ actual needs, integrated into your product’s logic, becomes part of what you ship, and part of what makes it yours.
The pricing model changes too. SaaS products with AI features charge recurring subscriptions partly because they pass through API costs. When the model is built in, that cost disappears. You could sell the software once. Or with a simpler subscription. The AI is included, like a camera in a phone, not a streaming service on your phone.
One honest note: fine-tuning a model for production is a genuine engineering effort. Tools like Hugging Face and Unsloth have made it achievable without a research lab, but it requires a competent ML engineer, proper evaluation, and a realistic timeline. It is not a weekend project. It is, however, now within reach for any well-resourced product team, which would not have been true two years ago.
Apple understood this at the operating system level. The on-device model in every device is not an add-on you pay extra for. It is the product. Every software builder now has the same option at the application level.
That option opens markets that the subscription model could not reach at all.
What This Unlocks for Regulated Industries
There is a version of the builder opportunity that is not just about economics. It is about which markets you can serve at all.
For a significant and fast-growing portion of the software market, cloud AI is not a choice. It is off the table.
A medical device company cannot sell a diagnostic tool that sends patient data to a US cloud server under GDPR and HIPAA. A legal technology firm cannot win enterprise contracts in regulated jurisdictions if their AI feature sends every query to OpenAI. A government software supplier cannot pass a security review if the intelligence in their product lives in a data centre they do not control.
For these markets, the product that wins is the one where the AI runs locally, the data never moves, and the intelligence ships with the software.
This is not a compliance headache. It is a competitive opening, and it just got significantly larger.
In June 2025, the legal counsel of Microsoft France was asked under oath at a French Senate hearing whether he could guarantee that data stored in France by Microsoft would never be passed to US authorities without French approval. His answer was one short sentence.
“Non, je ne peux pas le garantir.” No. I cannot guarantee that (The Register, 2025).
Under the US CLOUD Act (a 2018 law giving American authorities the right to demand data from US-headquartered companies regardless of where that data physically sits), an EU data region gives you lower latency and a reassuring label. It does not give you jurisdiction. A Microsoft executive said so, on the record, to a parliament.
The legislative response has followed. On 27 May 2026, the European Commission proposed restricting Microsoft Azure, AWS, and Google Cloud from processing financial, judicial, and healthcare data across all 27 EU member states (CNBC, 2026). Those three providers control roughly 70 per cent of Europe’s cloud market. The proposal carves out the exact categories of data where the most valuable enterprise software operates.
A product maker who ships with a local, fine-tuned model is not just removing an API dependency. They are entering markets that their cloud-dependent competitors structurally cannot. That is a durable advantage, because the legislative direction is accelerating, not reversing.
The subscription model worked when intelligence was scarce. It is not scarce anymore.
That said, local models do not win every situation. The honest version of this decision has four camps.
What runs where: the four tiers of AI in 2026
Tier | Model | Runs on | Approx. cost | Data in-house? | Best for
On-device | Apple (~3B params) | Your device | Free | Yes | Mobile apps, sensitive consumer data
Self-hosted open | Gemma 4 / Llama 4 (27-70B) | One GPU you own | £15-40K hardware | Yes | Most business tasks, document processing
Mid-tier cloud | GPT-4 class APIs | Cloud (shared) | Per-token | No | General reasoning, low-volume tasks
Frontier closed | o3, Gemini Ultra | Cloud (proprietary) | Premium per-token | No | Hardest agentic work, frontier reasoning
Source: Google DeepMind, 2026 · Apple, 2025 · a16z, 2025. “Data in-house” means the workload data never leaves your infrastructure.
When Cloud Still Wins
Local models do not win everything. The honest version of this decision has four camps.
The owner says
“Sovereignty stopped being optional. Every cloud call is a copy of the crown jewels leaving the building. Parity has arrived for most of what we do: the disciplined move is to stop renting our own confidentiality back.”
They are right about the risk, right about parity for most everyday tasks, and right that the default needs to be challenged. The Microsoft Senate testimony is not an abstract legal warning. It is a documented fact about the present.
The renter says
“The gap that matters has not closed. The frontier still leads on the hardest work.”
On deep multimodal reasoning and complex multi-step agentic tasks (the hardest, highest-value analysis), closed frontier models still lead. What you rent from a cloud provider includes reliability guarantees, enterprise support, and someone else’s engineering team on call at three in the morning. Below serious volume, a cloud API almost always wins on price.
The router says
“Hybrid is the only honest answer, but be clear-eyed about what it costs.”
Self-hosting is not a binary switch. A single server capable of running a production-grade open model costs between £15,000 and £40,000, and IDC research suggests hidden costs add another 40 to 60 per cent on top (IDC, 2025). Below roughly £2,000 to £3,000 per month in API costs, the cloud almost always wins. Above roughly 100 million queries per month, self-hosting saves millions annually (Silverthread Labs, 2026).
The compliance-mandated mover
“Our regulator has already decided. Our job is execution.”
For organisations in regulated European sectors, and for the product makers who serve them, the debate is close to resolved by law. If the European Commission’s Tech Sovereignty Package passes as proposed, the routing decision for financial, judicial, and healthcare data will have been made by legislation. Move deliberately, and move early.
When self-hosting beats the cloud on cost
Below roughly £2,000-3,000 per month in API spend, the cloud almost always wins on price. Above around 100 million queries per month, self-hosting can save millions annually. (Silverthread Labs, 2026; IDC, 2025)
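A rough break-even check makes the comparison concrete. The Python sketch below amortises the hardware price, adds the hidden-cost uplift on top, and compares the result with a monthly API bill. The hardware range and the 40 to 60 per cent uplift come from the figures above; the 36-month amortisation period and the monthly running cost are assumptions chosen for the example, and the function names are hypothetical.

```python
def self_host_monthly_cost(hardware_cost: float, hidden_overhead: float,
                           amortisation_months: int,
                           running_cost_per_month: float) -> float:
    """Amortised monthly cost of owning the server, with hidden costs added on top."""
    return hardware_cost * (1 + hidden_overhead) / amortisation_months + running_cost_per_month


def cheaper_option(monthly_api_spend: float, monthly_self_host_cost: float) -> str:
    """Return which side of the comparison costs less per month."""
    return "self-host" if monthly_self_host_cost < monthly_api_spend else "cloud"


# Assumed inputs: a 36,000 server, 50% hidden overhead, written off over 36 months,
# plus an assumed 1,500 a month for power, hosting and engineering time.
owned = self_host_monthly_cost(36_000, 0.50, 36, 1_500)
for api_spend in (2_000, 10_000):
    print(api_spend, round(owned), cheaper_option(api_spend, owned))
```

The result is only as good as the running-cost assumption, which is the figure most teams leave out. Change it and the crossover point moves, which is why the threshold above is quoted as a range rather than a single number.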
Where I stand
The router wins the argument, but only when the routing is designed rather than defaulted.
The owner is right that the old assumption has expired. The renter is right that the frontier gap is real on the hardest work. Both observations are correct and neither is a complete policy on its own. The mistake is letting either one become the answer for everything.
The organisations, and the product teams, that come out ahead will be the ones that make a genuine per-workload decision: sensitivity, volume, capability required. Write it down. Apply it consistently. Do not revisit it every time a new model is announced. That one-page document is worth more than almost any model selection you make this year.
Four moves do most of the work once you decide to act on this.
Which workload goes where: a routing framework
A simple routing framework. Sensitive, high-volume workloads go local first. Occasional, complex reasoning stays cloud. The framework is the same whether you are consuming AI or building it into a product.
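Written down, the per-workload decision can be as small as one function. The Python sketch below encodes the three questions named above: sensitivity, volume, and capability required. The order of the checks (sensitivity first, then capability, then volume) and the volume threshold are assumptions for illustration; the threshold borrows the 100 million queries per month figure quoted earlier and should be replaced with your own break-even.

```python
from dataclasses import dataclass

# Placeholder threshold in queries per month; an assumption, not a recommendation.
HIGH_VOLUME = 100_000_000


@dataclass(frozen=True)
class Workload:
    name: str
    sensitive: bool        # must the data stay on infrastructure you control?
    monthly_queries: int
    needs_frontier: bool   # does the task need frontier-level reasoning?


def place(w: Workload) -> str:
    """Return "local" or "cloud" for one workload."""
    if w.sensitive:
        return "local"     # assumed rule: sensitivity overrides everything else
    if w.needs_frontier:
        return "cloud"     # occasional, complex reasoning stays with the frontier
    if w.monthly_queries >= HIGH_VOLUME:
        return "local"     # at high volume, owning the model is assumed cheaper
    return "cloud"         # below serious volume, the API is assumed cheaper


for w in (
    Workload("contract review", sensitive=True, monthly_queries=5_000, needs_frontier=True),
    Workload("market research", sensitive=False, monthly_queries=500, needs_frontier=True),
    Workload("support triage", sensitive=False, monthly_queries=200_000_000, needs_frontier=False),
):
    print(w.name, "->", place(w))
```

The value is less in the code than in the fact that the rules are explicit and applied the same way every time, which is the point of writing the one-page document in the first place.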
Four Moves for Builders
The engineer’s deliverable, before and now
When the code is the cheap part, shipping the code is not the job. The new deliverable is the reasoning chain: the spec that defines it, the decision record that explains it, and the verification that proves it did what you meant.
The moves below are written for anyone running or shaping a team. They are all expressions of the same idea: move human attention from items 1-10 to items 11-15.
1
Fine-tune on your domain and ship the model with your product
The generic open model is the starting point, not the destination. Fine-tune it on your specific domain (legal clauses, medical terminology, financial documents, customer support patterns) and it becomes a meaningfully better product for your users, at no additional per-query cost. Budget for it as a proper engineering project: a competent ML engineer, several weeks of work, and a rigorous evaluation process. The payoff compounds as your user base grows.
Product engineering
2
Build retrieval into the product: the model reads, not copies
Retrieval means the model queries your customer’s documents at the moment they ask a question, rather than those documents being stored or copied anywhere. The customer’s data stays on their infrastructure. The model reads it in place, returns an answer, and nothing leaves. This architecture is what makes your product viable in legal, medical, and financial markets, and worth building correctly from the start.
Data architecture
3
Know which markets need local: go there first
European regulated sectors are the clearest immediate opportunity: financial services, healthcare, government, legal. These markets are where cloud AI is increasingly constrained by law, and where a locally-running, data-sovereign product wins on architecture before the sales conversation even starts. The EU Tech Sovereignty Package and the CLOUD Act exposure of US cloud providers are moving this market in your direction. Position deliberately.
Go-to-market
4
Ship with a model you can upgrade, not one you are married to
The open-model release cadence is fast: Gemma 4 succeeded Gemma 3 in months; Llama 4 succeeded Llama 3. Fine-tune in a way that keeps you portable: build your prompting and retrieval layer so the underlying model can be swapped when a better one arrives. Teams that fine-tune so deeply they cannot switch will spend 2027 maintaining a model that has already been superseded. Stay portable.
Engineering strategy
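Moves 2 and 4 share one design idea: keep retrieval and prompting in a layer that does not care which model sits underneath. The sketch below is an illustration under stated assumptions. `Retriever`, `ModelBackend`, `KeywordRetriever`, and `EchoBackend` are hypothetical names invented here, the keyword matching is a toy stand-in for a real search index, and the echo backend stands in for a real model so the code runs anywhere.

```python
from dataclasses import dataclass
from typing import Protocol


class ModelBackend(Protocol):
    def generate(self, prompt: str) -> str: ...


class Retriever(Protocol):
    def search(self, question: str, limit: int) -> list[str]: ...


@dataclass
class KeywordRetriever:
    """Toy retriever: reads passages in place and ranks them by shared words."""
    passages: list[str]

    def search(self, question: str, limit: int) -> list[str]:
        terms = set(question.lower().split())
        ranked = sorted(
            self.passages,
            key=lambda p: len(terms & set(p.lower().split())),
            reverse=True,
        )
        return ranked[:limit]


class EchoBackend:
    """Stand-in model so the sketch runs without any real model installed."""

    def __init__(self, label: str) -> None:
        self.label = label

    def generate(self, prompt: str) -> str:
        return f"[{self.label}] {len(prompt)} prompt characters received"


def answer(question: str, retriever: Retriever, model: ModelBackend) -> str:
    # The prompt is assembled here, outside the model, so the model can be swapped.
    context = "\n".join(retriever.search(question, limit=2))
    prompt = f"Use only this context:\n{context}\n\nQuestion: {question}"
    return model.generate(prompt)
```

Swapping the model is then a one-argument change to `answer`, and the retrieval layer is untouched. That is the portability the fourth move asks for.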
The Take
The Era of Renting Intelligence Is Ending
In 2025, a lab trained a world-class model for the price of a family car, then gave it away. The same year, a Microsoft executive told a national parliament he could not protect data stored in his company’s European buildings. In 2026, the European Commission responded by proposing to restrict three of the world’s largest cloud providers from the most valuable categories of enterprise data. These are not predictions. They are the current situation.
I think the shift underneath both of these facts is the one most product teams have not yet acted on. AI is transitioning from a service you subscribe to, to a feature you ship. That transition does not happen overnight and it does not happen for every use case: the cloud still wins on the hardest frontier work, and still wins below serious volume. But for most of what most software products actually do, the transition is already technically possible.
The builders who move first will find three things waiting for them: lower costs at scale, access to regulated markets that their cloud-dependent competitors cannot enter, and a product that is structurally harder to replicate because the intelligence is theirs.
The subscription model worked when intelligence was scarce. It is not scarce anymore.
What kind of AI would you ship inside your product if tokens cost nothing and the model was yours?
Where to start
Identify one AI feature you currently pay per-token for. Pick one that runs frequently on predictable inputs and handles sensitive data. That is your first candidate for bringing in-house.
Estimate what it costs you today. Pull three months of API invoices, attribute the cost to that feature, then project it as your user base doubles. That number is what changes with a built-in model.
Talk to one ML engineer this week. Ask: how long would it take to fine-tune an open model on our domain for this specific use case? Get a real estimate. Most teams are surprised by how achievable it has become.
Map your regulated-market opportunity. If you sell to healthcare, legal, financial, or government customers in Europe, find out specifically whether your current cloud AI architecture creates compliance exposure for them. Start that conversation before your competitors do.
Written by Nadim A. Massih, AI & Tech Strategist
Common questions
Questions, answered first
Can a fine-tuned open model really match a frontier cloud model for my use case?
For domain-focused tasks (document processing, structured data extraction, customer support in a defined context), a well-fine-tuned open model frequently outperforms a generic frontier model. On open-ended complex reasoning and long agentic tasks, frontier closed models still lead. The only way to know for your specific workload is to run the benchmark. Do that before committing either way.
How much does it cost to fine-tune and host an open model?
Hardware for a production-grade open model server: £15,000 to £40,000. Engineering for a proper fine-tuning project: four to eight weeks for a small ML team. Ongoing hosting and maintenance: estimate 40-60% of hardware cost annually (IDC, 2025). Below roughly £2-3K per month in current API spend, cloud almost always wins on total cost. Above that, run the numbers for your situation.
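To run the numbers for your situation, a rough break-even check looks like this. It is a sketch under assumptions, not a costing model: the three-year amortisation period is my assumption, the overhead rate is read as a yearly share of hardware cost as stated above, and the one-off fine-tuning project is left out.

```python
def self_hosting_monthly_cost(
    hardware_cost: float,
    annual_overhead_rate: float,
    amortisation_years: float,
) -> float:
    """Monthly cost of owning the server: amortised hardware plus running overhead."""
    yearly_hardware = hardware_cost / amortisation_years
    yearly_overhead = hardware_cost * annual_overhead_rate
    return (yearly_hardware + yearly_overhead) / 12


def cheaper_option(monthly_api_spend: float, monthly_self_hosting: float) -> str:
    """Compare current API spend with the estimated cost of self-hosting."""
    return "self-host" if monthly_api_spend > monthly_self_hosting else "cloud"


monthly = self_hosting_monthly_cost(
    hardware_cost=30_000,      # inside the £15,000 to £40,000 range
    annual_overhead_rate=0.5,  # midpoint of the 40-60% estimate
    amortisation_years=3,      # assumption, not a figure from the article
)
print(round(monthly), cheaper_option(1_500, monthly), cheaper_option(5_000, monthly))
```

With these illustrative inputs the monthly figure comes to about £2,083, which sits inside the £2-3K range quoted above. Change the amortisation period or the hardware price and the threshold moves with it.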
Does an EU cloud data region protect our customers from US legal demands?
Not reliably. The US CLOUD Act allows American authorities to demand data from US-headquartered providers regardless of where the data sits. In June 2025, Microsoft France confirmed this under oath at a French Senate hearing. Genuine sovereignty requires a locally operated provider, or data that never leaves the customer’s infrastructure in the first place.
What is fine-tuning and do we actually need it?
Fine-tuning means continuing a model’s training on your specific data so it becomes better at your particular tasks. You do not always need it. For many use cases, a well-designed retrieval architecture works better and is cheaper to maintain. Fine-tuning makes most sense when you need the model to consistently follow domain-specific patterns or terminology. Start with retrieval. Fine-tune when retrieval is not enough.
What exactly is the CLOUD Act?
The Clarifying Lawful Overseas Use of Data Act is a 2018 US law giving American authorities the right to demand data from US-headquartered technology companies, regardless of where that data physically sits. It applies to Microsoft, Google, Amazon, and every other major US cloud provider, including when operating in Europe.
Can a model I run myself really compete with the big cloud ones?
For most real-world business tasks, yes, meaningfully so. A 27-billion-parameter open model on a single GPU now beats much larger cloud-only rivals on human-preference testing (Google, 2026). At the genuine frontier (complex reasoning, long agentic tasks) closed models still lead. Run the benchmark on your specific use case. That number, not the benchmark chart, is the one that matters.
Receipts
Sources & references
Nature / CNN, 2025
DeepSeek-R1 peer-reviewed on the cover of Nature; core reasoning training run approximately $294,000; became the most-downloaded open model in the world.
CNBC, 2025
Nvidia lost roughly $589 billion in market value in a single day after the first R1 release, the largest single-day market loss in US history.
Google DeepMind, 2026
Gemma 4 released April 2026 under Apache 2.0 licence; beats much larger models including Llama 405B on human-preference testing; runs on a single GPU.
Apple, 2025
On-device model (~3B parameters) with free local inference; data stays on the device; available to all developers.
a16z, 2025
Inference cost falling approximately 10x per year; open-model enterprise adoption concentrated at larger, regulated firms driven by on-premise and compliance requirements.
The Register / French Senate, 2025
Microsoft France confirmed under oath at a French Senate hearing (June 2025) that it cannot guarantee data sovereignty for data stored in France against US authority demands.
CNBC / TechRadar, 2026
EU Tech Sovereignty Package proposed 27 May 2026; proposes restricting Microsoft Azure, AWS, and Google Cloud from processing financial, judicial, and healthcare data across all 27 EU member states.
IDC, 2025
Hidden costs of on-premise AI infrastructure represent 40-60% of total cost of ownership beyond hardware purchase.
Silverthread Labs, 2026
Self-hosting break-even: below ~£2-3K/month in API spend, cloud wins; above ~100M queries/month, savings of £5M-£50M annually.
Patient Comet · Discovery · 3 June 2026 · 12 min read
The Last Human Reader
In May 2026, Roger Lynch, CEO of Condé Nast (one of the world’s largest magazine publishers, home to Vogue, GQ, and Wired), sent a directive to every brand in his portfolio: plan as if Google search traffic will be zero. Not declining. Zero. He had been forecasting search cuts for three consecutive years, and each year Google referrals had fallen faster than his model predicted. Google once sent a majority of Condé Nast’s readers to their pages. By 2026 it was roughly a quarter. Where it ends up, Lynch told his teams, is the low single digits.
The pages you publish are no longer primarily read by people. They are read first by machines: by crawlers, retrievers, and language models that synthesise an answer and decide whether to send a visitor your way. For most queries, for most publishers, that decision is increasingly “no.” This is about the structural inversion now underway in how the web works: what changed, why it happened faster than anyone predicted, and what the sites that are still standing actually do differently.
Nadim A. Massih
48%
of all Google searches now return an AI Overview instead of sending visitors to pages, up from 34.5% three months earlier (Digital Applied, 2026)
90%
of global search still flows through Google: the threat is not a rival engine, it is Google absorbing its own replacement (StatCounter, March 2026)
−58%
click-through rate drop when an AI Overview appears, measured across 300,000 keywords (Ahrefs, Dec 2025)
What these three mean together: Google has not lost its grip on search. It has changed what search does. AI Overviews answer the query before a click is needed. The platform that built the click-through economy is now the platform ending it. For publishers built on Google traffic, the intermediary is no longer neutral.
When the Reader Was Always Human
The web had always been built for a simple transaction: a person looks, a page loads, a person reads. The entire architecture of search engine optimisation (SEO) grew up around that transaction. SEO is simply the discipline of making your page findable: visible to the right person at the right moment.
That model lasted thirty years, which, in technology, is an eternity.
What ended it was not ChatGPT. It was not Perplexity, or Google’s AI Overviews (AI-generated answer summaries that appear above search results), or any single product announcement. It was something quieter: the slow accumulation of a new primary reader.
Search engines have always had crawlers. Crawlers read your pages constantly, invisibly. They understood what you had written and surfaced it to humans who searched for it. The crawler was always there; the human was always the destination. The crawler served the human.
What changed is that the machine stopped serving. It started answering.
Google’s AI Overviews now appear on 48% of all search queries as of March 2026, up from 34.5% just three months before. In those queries, Google no longer sends the user to a page to read. It synthesises an answer from many pages, presents it at the top, and the user never clicks through. They have their answer. They are gone.
The click-through rate for pages that appear in AI Overview queries has dropped 58% according to Ahrefs (an SEO analytics and research platform) in their December 2025 study of 300,000 keywords. Seer Interactive (a digital analytics agency) measured 61% in their September 2025 research. The precise number depends on who measures it, but the direction is not ambiguous.
The machine is no longer a courier. It is the reader. That particular reader still has a name you know well.
AI Reads First Now
Google still controls 90% of the global search market. As of March 2026, StatCounter (a web service that tracks global search market share) puts it at 90.01%. All AI search engines combined, including ChatGPT, Perplexity, Claude, and Gemini Search, account for somewhere between 0.28% and 0.9% of referral traffic, depending on how you measure. The death-of-Google narrative is empirically wrong.
What is happening is not Google dying. It is Google absorbing its own replacement.
Google is building the machine reader into itself. AI Mode, the full conversational search interface Google launched at I/O 2026, crossed one billion monthly users, with queries more than doubling every quarter. The product that replaced the blue links is Google. The company cannibalising the click-through economy is the same company that created it.
The distinction matters practically. Google’s AI features run on RAG, or retrieval-augmented generation (a technique where Google’s AI searches its own index first, then composes an answer from those pages, rather than inventing one from scratch). In plain terms: it reads real pages before it writes a summary. Being well-indexed by Google still matters for being cited by Google’s AI.
But ChatGPT and Perplexity run their own independent crawlers (the software bots that read websites) and do not use Google’s index. Good Google SEO does not automatically translate to citation by those systems. The machine reader is plural, and each instance of it has its own appetite.
The structural shift, stated plainly: the pages you publish are now read, primarily, by a machine that decided your content is worth summarising to a human, rather than worth sending the human to read directly. The human may never arrive. The machine took what it needed.
For thirty years, the relationship was: search engine helps human find page. The page is the destination.
The relationship is now: AI engine helps human get answer. The page is the source material.
That shift, from destination to source material, is not a minor update to how search works. It is a different model of what the web is for.
The citation paradox: cited by AI, not visited by humans
Major publishers like Reuters are among the most-cited sources in AI systems. They receive less than 1% of referral traffic from those systems. Citation and traffic have decoupled. (Goodie, 2026)
Source: Goodie 2026 AI Search Traffic Report
The Traffic Number That Changes the Math
That data has a specific implication for anyone running a content-dependent business.
Roger Lynch’s “plan for zero” directive deserves to be understood in its full weight.
Lynch did not say Google traffic will become unimportant. He said his teams should build their editorial and revenue strategies as if it does not exist, because counting on it will produce a plan that fails. You plan for zero not because zero is certain, but because every year you plan for anything above zero, the actual number undershoots your model.
He is not alone. Chartbeat (a web analytics platform that tracks traffic for major publishers) data shows search referrals fell 60% for small publishers, 47% for medium publishers, and 22% for large publishers over two years. News publishers collectively lost more than 600 million monthly visits. Chegg (an online homework and tutoring company) declined 49%. Forbes, HuffPost, and Business Insider each lost roughly half their search-driven traffic.
These are not struggling operations that made wrong bets. These are established publishers with large audiences, real editorial teams, and decades of SEO expertise. They did everything right by the old model. The old model changed.
Lynch named the shape of what remains: a barbell. Heavy at both ends. Nearly empty in the middle.
Large, authoritative brands with genuine audience trust are holding up. Small, niche publications with devoted direct subscribers are holding up. The middle, the vast category of SEO-dependent content built on traffic arbitrage, ad revenue, and Google as the primary distribution channel, is collapsing.
This is not a media story. It is a business model story. The barbell predicts the shape of every knowledge business over the next five years. The question for anyone who publishes, teaches, advises, or sells expertise on the web is: am I the authoritative end of the barbell, the loyal-niche end, or the exposed middle?
The practical answer requires understanding what the machine is actually reading, and why most of what you currently publish is invisible to it.
Zero-click searches now account for 60% of all Google queries. Publisher traffic fell 33% globally in 2025, with some news sites reporting drops as steep as 89%. (The Next Web, May 2026)
The machine does not read for keywords. It reads for credibility signals. Being cited is the new ranking signal.
What the Machine Actually Reads
The practical question, what do you do about this, runs directly into a problem: the machine reader does not rank things the way Google did.
With Google, you could build a strategy. You had data: which keywords brought traffic, how you ranked, how competitors ranked, where you had gaps. It was imperfect, but it was measurable. You could spend Tuesday afternoon in Search Console and come out with an action list.
There is no equivalent for AI citation. According to practitioner tracking across AI platforms, forty to sixty per cent of the sources cited by AI systems change month to month. There is no stable “position one” in ChatGPT or Perplexity. There is no dashboard showing how often AI Overview cited you. There are no reports showing which questions users asked conversational AI: every conversation is unique and private.
The most counterintuitive finding from all of the available data:
Reuters and The Guardian are two of the most frequently cited sources in ChatGPT and Perplexity. They receive less than 1% of referral traffic from those systems. They are cited. They are not visited. The machine read their work, extracted what it needed, and kept the visitor. Being cited by AI does not mean being visited. Citation is an awareness signal. Traffic is a separate transaction, and for most publishers, it is not happening.
This breaks the old logic entirely. You cannot optimise your way from “cited” to “visited” using the tools that worked in search. The game has changed.
Google’s own guidance, published in May 2026 as the company’s first official AI optimisation guide, is notable for what it debunks. Specifically: llms.txt files (a proposed standard for declaring your content to AI crawlers, similar in concept to robots.txt, the file that tells web crawlers which pages to access); “chunking” content into small pieces; and rewriting pages specifically for AI. Google says to ignore all three, and independent testing confirms no major AI crawler reads llms.txt. The consultants selling these services are selling solutions to problems that do not exist as described.
What Google says does help is simpler and harder: create content that a human would find genuinely useful, that could not be produced by an AI summarising other people’s work. The word they use repeatedly in the guide is “non-commodity.” Generic how-to articles, roundup posts, and content that merely restates what others have written will not be cited, because the AI can produce it itself without needing your page.
The Princeton KDD 2024 study (the only large-scale academic empirical test of AI citation optimisation, covering 10,000 queries) found three things that increase citation probability. Statistics improve AI visibility by 41%. Named expert attribution with quotation marks improves it by 28%. Inline citations to primary sources improve it by 115% for pages that do not already rank in the top results. Keyword stuffing degrades it. The machine reads for credibility signals, not keyword frequency.
What makes AI systems more likely to cite your content
The three content signals that measurably increase AI citation probability, from the only large-scale academic study of this question. (Princeton KDD 2024)
Source: Aggarwal et al., Princeton / IIT Delhi / Georgia Tech / Allen AI, KDD 2024
There is also a technical layer that matters: entity clarity (naming and declaring a specific, identifiable thing, your brand, your author, your publication, in a way that machines can reference with confidence) is becoming the primary competitive moat. A site that has declared its author name, its publication name, its topic area, and the relationships between its content using JSON-LD (a small code block that tells AI systems exactly who you are and what you publish) is easier for AI systems to cite with confidence than a site that is merely indexed. The gap between “we have schema markup” and “we operate a coherent content knowledge graph” is, according to SchemaApp (a structured data platform that measures AI citation rates), the primary factor in whether AI systems deeply understand a brand or just parse fragments of it.
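As a concrete picture of what that declaration looks like, here is a minimal sketch that builds a linked JSON-LD graph for one article. The `@type` and property names are real schema.org vocabulary (note that the vocabulary spells the type `Organization`). The function name, the `#org` and `#author` identifiers, and the `https://example.com` address are placeholders I have made up for illustration.

```python
import json


def article_jsonld(headline: str, author: str, publisher: str,
                   date_published: str, site_url: str) -> str:
    """Return a JSON-LD string linking an article to its author and publisher."""
    org = {
        "@type": "Organization",
        "@id": f"{site_url}/#org",        # hypothetical identifier scheme
        "name": publisher,
        "url": site_url,
    }
    person = {
        "@type": "Person",
        "@id": f"{site_url}/#author",     # hypothetical identifier scheme
        "name": author,
        "worksFor": {"@id": org["@id"]},
    }
    article = {
        "@type": "Article",
        "headline": headline,
        "datePublished": date_published,
        "author": {"@id": person["@id"]},
        "publisher": {"@id": org["@id"]},
    }
    graph = {"@context": "https://schema.org", "@graph": [org, person, article]}
    return json.dumps(graph, indent=2, ensure_ascii=False)


print(article_jsonld(
    headline="The Last Human Reader",
    author="Nadim A. Massih",
    publisher="Patient Comet",
    date_published="2026-06-03",
    site_url="https://example.com",       # placeholder address
))
```

The output goes inside a `<script type="application/ld+json">` tag on the page. The `@id` references are what turn three separate declarations into one small graph: the article points at its author, and the author points at the publication.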
But none of this is a ranking system. None of it produces a predictable number you can optimise toward. You are not building a strategy for a search engine. You are building a reputation with a reader that makes probabilistic judgements from large amounts of evidence. The play is identity, not tactics.
With the machine reader understood, the investment priorities become clear.
The Four Moves
The investment has four destinations. If you run a publication, a brand, or any knowledge-based operation that depends on being found and read, all four apply.
01
Build a machine-readable identity
Declare who you are in structured data using JSON-LD: Organisation schema naming your publication, Person schema naming your author, Article schema on every piece you publish, FAQ schema on the questions each piece answers. This is a one-time technical task (an hour for a developer, or a plugin for WordPress if you run one). Machines that want to cite you correctly need to know who you are. Most sites do not tell them.
Technical · One-time · High impact
02
Write in self-contained, extractable units
Every paragraph should be able to stand alone as a cited passage. Open each paragraph with its claim. Support it with a sourced number. Close with the implication. AI systems extract passages, not pages. A 2,000-word piece that buries its key claim in paragraph fourteen will be cited less frequently than one that opens each section with a quotable sentence.
Editorial · Ongoing · Immediate
03
Put named expertise on the page
Named authors, named credentials, named sources. The machine is biased toward content that looks evidentiary. First-hand accounts from identified humans with identifiable expertise outperform anonymous content on every citation metric. This is not about celebrity; it is about legibility. A named expert says “someone is accountable for this claim.”
Editorial · Ongoing · High impact
04
Measure citation presence, not traffic volume
The right question is no longer “how many people visited?” but “when someone asks an AI engine about my topic, does my name appear in the answer?” There are no perfect tools for this yet, but manually querying ChatGPT, Perplexity, Google AI Mode, and Claude on your core topics takes twenty minutes and gives you a real signal. Run it monthly. Track changes. Treat it as your new impressions metric.
Strategy · Monthly · New metric
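The monthly citation check in move 04 needs nothing more than a log and one ratio. The sketch below assumes you record each manual query by hand. The row layout and function name are hypothetical, and the sample rows are made-up placeholders, not measurements.

```python
from collections import defaultdict

# One row per manual check: (month, engine, question, brand_was_cited)
AuditRow = tuple[str, str, str, bool]


def citation_rate_by_month(rows: list[AuditRow]) -> dict[str, float]:
    """Share of checked answers that mentioned the brand, per month."""
    hits: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for month, _engine, _question, cited in rows:
        total[month] += 1
        hits[month] += int(cited)
    return {month: hits[month] / total[month] for month in sorted(total)}


# Illustrative placeholder log: the same question asked of four engines, twice.
sample_log: list[AuditRow] = [
    ("2026-05", "ChatGPT", "who covers AI strategy?", False),
    ("2026-05", "Perplexity", "who covers AI strategy?", True),
    ("2026-05", "Google AI Mode", "who covers AI strategy?", False),
    ("2026-05", "Claude", "who covers AI strategy?", False),
    ("2026-06", "ChatGPT", "who covers AI strategy?", True),
    ("2026-06", "Perplexity", "who covers AI strategy?", True),
    ("2026-06", "Google AI Mode", "who covers AI strategy?", False),
    ("2026-06", "Claude", "who covers AI strategy?", False),
]

print(citation_rate_by_month(sample_log))
```

Because cited sources shift so much from month to month, a single reading says little. The trend across several months is the signal worth tracking.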
None of this is new. It is a return to what good journalism always required: expertise, attribution, sourced claims, a clear identity. The difference is that now the first audience those things need to satisfy is a machine.
Three Ways to Think About This
None of this settles the argument entirely. There are three coherent positions on what all of this means, and they do not easily resolve, but all three are worth hearing before committing to any of them.
The optimiser
“Citation converts better. A smaller, motivated audience is worth more than a large passive one.”
AI engines send very little traffic by volume, 0.28% to 0.9% of total referrals, depending on who measures. But the visitors they do send convert at four to five times the rate of standard organic search traffic. Ahrefs’ internal data shows that AI-referred visitors, who were 0.5% of their total traffic, drove 12.1% of paid signups, a 23x conversion multiple for their specific B2B SaaS product (business software sold to other companies). The argument: a smaller audience of genuinely motivated visitors is worth more than a large audience of people who landed by accident and left in twelve seconds.
The resister
“You are cited. You are not visited. Build what AI cannot steal.”
Major publishers like Reuters and The Guardian are among the most frequently cited sources in ChatGPT and Perplexity, yet they receive less than 1% of referral traffic from those systems. The machines are reading their work, extracting the value, and returning almost none of the audience. The rational response is not to optimise for AI citation but to build the things AI cannot disintermediate: subscriber relationships, paid communities, proprietary data, live events, direct email. Condé Nast’s counter-move is concrete: 29% digital subscription revenue growth, treating direct paid relationships as the strategic replacement for search referrals.
The machine cannot steal a subscriber.
The sceptic
“Google has 90% of search. AI engines send under 1% of referrals. Do not panic.”
The disruption is real, but it is concentrated on informational and research queries, and it has hit certain content categories, including recipes, how-to guides, and educational explainers, far harder than others. Brand discovery queries, local search, commercial-intent searches, and navigational queries are largely unchanged. The advice to “plan for zero” makes strategic sense for a media conglomerate whose entire business model runs on Google referrals. It does not make the same sense for a local services business or a direct-to-consumer brand. Do not let the collapse of one traffic model become a panic about all of them.
Each of these is partly right. The conversion data supports the first. The citation-without-traffic paradox supports the second. The market share numbers support the third. The mistake is picking one and ignoring the others.
The Take
The Play Is Identity
The sites that will survive this are not the ones with better SEO tactics. They are the ones with a clear enough identity that a machine can represent them accurately, and a loyal enough audience that losing the machine’s referral does not end the business.
The web is not ending. Reading is not ending. But the transaction at the centre of web publishing, a person searches, a page loads, a person reads, has acquired a new intermediary that was not there before. That intermediary has its own preferences, its own limitations, and its own criteria for what gets surfaced.
The barbell is the destination, not the threat. Build toward one end of it. Either become genuinely authoritative on a topic, the publication or the expert that AI systems cite because there is no credible alternative, or build a direct relationship with a specific audience that values what you do and will seek you out regardless of what any search engine shows them.
The middle is where being “pretty good at SEO” and “reasonably popular” and “broadly about AI” gets you. The middle is what Roger Lynch told his teams to stop depending on. The middle was always more fragile than the traffic numbers made it look.
The question is not “how do I optimise for AI search?”
The question is: what am I that a machine can trust and cite consistently, and what is my relationship with a human audience that does not require a machine’s permission?
Those are the only two questions that matter now.
Where to start
Run the 20-minute citation audit. Open ChatGPT, Perplexity, Google AI Mode, and Claude. Ask each one a question your publication or brand should be the answer to. Check whether your name appears. Do this monthly.
Add entity schema to every page. Install JSON-LD with Organisation, Person, and Article types. One-time task: an hour for a developer, or a plugin for WordPress. It declares your identity to every AI crawler permanently.
Rewrite one piece in claim-first structure. Take your best existing piece and rewrite so each section leads with a sourced, attributable claim: one paragraph, one claim, one source. Track whether your AI citation rate changes over 30 days.
Build one direct channel. Email, paid newsletter, or community. One channel where you reach your audience without a machine’s permission. Not as a hedge against AI. As the foundation that should always have been there.
The machine is no longer a courier. It is the reader. And it has decided, for most queries, that it already has what it needs.
Written by Nadim A. Massih, AI & Tech Strategist
Common questions
Questions, answered first
Is Google search actually going away?
No. Google controls 90.01% of global search as of March 2026 (StatCounter). What is changing is that Google is building the intermediary layer into itself: AI Overviews and AI Mode answer queries without sending visitors to pages. The threat to publishers is not a rival search engine. It is Google absorbing the “answer” function into its own product.
Does schema markup actually help with AI citation?
Yes, with an important nuance. Google’s own May 2026 guide says structured data is not required for AI visibility and there is no special schema that unlocks AI Overviews. But SchemaApp’s research shows that tier-one schema types (Organisation, Person, Article, FAQ) produce a 3:1 improvement in AI citation rate versus unstructured content. Schema helps, but it is a foundation, not a shortcut. The real advantage is building a coherent, linked entity layer across your entire site, not just adding a type to individual pages.
What is llms.txt and should I implement it?
An llms.txt file is a proposed standard for declaring website content to AI crawlers, and no, you should not implement it. As of October 2025, only 951 domains had implemented one, and independent testing showed zero visits from any major AI crawler (GPTBot, PerplexityBot, ClaudeBot). Google’s official May 2026 guidance explicitly says to ignore it. It is a solution without an adopter.
Why does being cited by AI not always lead to more traffic?
Two reasons. First, AI systems answer the query directly, so users who would previously have clicked through now get their answer without leaving the AI interface. Second, when AI systems do cite sources, they typically surface a link as a footnote rather than as the primary result. Reuters and The Guardian are among the most-cited sources in ChatGPT and Perplexity; they receive less than 1% of referral traffic from those systems (Goodie, an AI search analytics research firm, 2026). Citation is an awareness signal. It is not, for most publishers, a traffic driver.
Receipts
Sources & references
Search Engine Journal, 2026
Condé Nast CEO Roger Lynch directed every brand to plan as if Google search traffic will be zero. Google went from a majority of their traffic to a quarter, heading to low single digits.
Ahrefs, December 2025
AI Overviews correlate with 58% lower CTR across 300,000 keywords, worsening from a 34.5% decline in April 2025.
Aggarwal et al., Princeton KDD 2024
First large-scale empirical study of AI citation optimisation across 10,000 queries. Statistics +41%, quotation +28%, inline citations +115% for lower-ranked pages.
Google Search Central, May 2026
Official guide to optimising for generative AI features. Confirms RAG pipeline, debunks llms.txt and chunking. AEO/GEO are “still SEO.”
Goodie, 2026
2026 AI search traffic report. Reuters and Guardian receive less than 1% referral traffic from AI despite being top-cited sources. AI referral traffic: 0.28-0.9% of total.
Ahrefs, 2026
AI-referred visitors drove 12.1% of paid signups while representing 0.5% of total traffic, a 23x conversion multiple for their B2B SaaS product.
Digiday, October 2025
WTF are GEO and AEO. Recipe publishers earn almost zero traffic from AI systems despite frequent citation. Publisher strategy perspectives.
SchemaApp, 2025
Tier-one schema types produce 3:1 improvement in AI citation rate versus unstructured content. Entity layer vs per-page schema gap.
Patient Comet · Creative Production · 3 June 2026 · 14 min read
Anyone Can Make It Now
In February 2026, Google made its film studio free. The same month, WPP (one of the world’s largest advertising and communications groups) shed nearly ten thousand jobs. These two facts are the same story.
AI handed every marketing team a broadcast-quality studio. Most used it to produce more. A handful used it to produce better. The gap between those two decisions is the story.
Nadim A. Massih
9,389
Jobs shed by WPP in 2025 alone as brands moved production in-house (WPP SEC filing)
2×
Click-through rate Unilever recorded after building its AI studio across 18 markets (Digiday, Aug 2025)
0
AI-generated brand assets automatically protected by copyright, under the US Copyright Office’s January 2025 report (US Copyright Office, Jan 2025)
What these three mean together: The agencies are shrinking, the brands are building, and almost nobody has read the legal small print. Having the pipeline is not the competitive advantage. It is the floor.
What the Pipeline Actually Looks Like
Google Flow was not the starting gun. It was the confirmation.
The pipeline now fits in one window. In March 2026, ElevenLabs (a voice and creative AI platform used by 41 per cent of the Fortune 500) launched Flows: a single canvas connecting script, footage, voice, music, and distribution, without a single external hire. These launches were months apart. The capability gap that once separated a marketing team from a broadcast production facility had already closed.
Let us look at the mechanics. The speed of the change is still not fully understood.
Three years ago, a brand producing a 30-second video advertisement needed a creative brief, an agency, a production company, a shoot day, a post-production house, a sound designer, and a music licensing deal. Timeline: eight to twelve weeks. Budget: five figures minimum, climbing quickly.
Today, a marketing team at a mid-sized company can move from brief to published video in under a working day. The 2026 production stack looks like this.
Script. ChatGPT (OpenAI’s AI writing assistant) or Claude (Anthropic’s AI assistant) writes a first draft from a brief. Tone, length, platform format, all specified in the prompt. Turnaround: under a minute.
Images and storyboard. Midjourney v7 (from a San Francisco AI company of the same name) creates cinematic concept frames and editorial illustrations, and is the standard for artistic, mood-driven imagery. Flux 1.1 Pro (from Black Forest Labs, a German AI research company) leads on photorealism, producing images that look like actual photographs rather than generated artwork. For any image requiring readable text inside it (a poster, a product label, a social card), Ideogram v3 (from a Toronto AI company) is the specialist. It is the only major image tool that reliably renders legible typography; every other model still produces distorted or misspelled text.
Video. Veo 3.1 (Google’s AI video generation tool) produces broadcast-quality video clips from text descriptions, generating dialogue, ambient sound, and music in a single pass. It is the only major AI video tool that creates audio natively alongside footage without a separate step. Runway Gen-4.5 (from Runway, a New York AI company) is the professional alternative for work requiring precise creative control, consistent characters across multiple scenes, and post-generation editing. Seedance 2.0 (from ByteDance, the Chinese technology company behind TikTok) accepts text, images, video, and audio together as input (up to twelve assets simultaneously) and produces multi-shot, cinema-quality video with native audio sync and frame-level character consistency. Its April 2026 API launch made it directly accessible to brand production teams at scale. One notable absence from this list: Sora, OpenAI’s video generator, was shut down in April 2026. Per industry analysis, the product was generating significant operating losses against minimal revenue.
Voice. ElevenLabs generates voiceovers in more than 70 languages from text input. Voice cloning is available for brands that have established audio identities. Their technology is now embedded in IBM’s enterprise AI systems.
Music. Suno v5 (an AI music generation platform) creates complete original tracks with vocals, instruments, and lyrics from a text description, in seconds. Following Suno’s November 2025 settlement with Warner Music Group (one of the three largest recorded-music companies in the world), paid subscribers hold commercial rights, though as explained in §04, the legal picture is more complicated than that agreement suggests. Udio (another AI music generation platform) settled similarly with Universal Music Group (the world’s largest recorded-music company) in October 2025.
Assembly. ElevenLabs Flows, launched March 2026, connects all of the above in a single visual workspace: image generation, video, voice, music, and sound effects linked together in a pipeline. Non-destructive: change the voiceover without rerunning the video generation. The entire sequence can be triggered from a brand team’s form submission.
Distribution-ready design. Canva AI 2.0 reads your brand’s colour palette, fonts, and logo rules before generating anything. It has Google’s Veo 3 embedded directly for video, and outputs social-ready assets in every required format simultaneously.
That is the 2026 creative production stack: the generation layer, the tools that create the content.
There is a second layer that most brands discover only after they start producing at scale. It is the orchestration layer: the pipeline that takes a brief at one end and delivers a finished, distributed asset at the other, without a human copying files between applications.
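The "non-destructive" behaviour described above (change the voiceover without rerunning the video generation) can be sketched as a small dependency-aware pipeline. This is a minimal illustration under stated assumptions, not how ElevenLabs Flows or any other product is implemented: the `Pipeline` class, the stage names, and the placeholder stage functions are all hypothetical, and no real vendor API is called.

```python
import hashlib
import json


class Pipeline:
    """Toy orchestration layer: a stage reruns only when its own inputs change."""

    def __init__(self):
        self.stages = {}      # stage name -> (function, upstream stage names)
        self.cache = {}       # stage name -> (input fingerprint, cached output)
        self.run_counts = {}  # stage name -> number of real executions

    def add_stage(self, name, func, depends_on=()):
        self.stages[name] = (func, list(depends_on))
        self.run_counts[name] = 0

    def run(self, name, params):
        func, deps = self.stages[name]
        # Resolve upstream stages first (each one may come from the cache).
        upstream = {dep: self.run(dep, params) for dep in deps}
        # Fingerprint this stage's own parameters plus its upstream outputs.
        payload = json.dumps(
            {"params": params.get(name, {}), "upstream": upstream},
            sort_keys=True,
        )
        fingerprint = hashlib.sha256(payload.encode()).hexdigest()
        cached = self.cache.get(name)
        if cached is not None and cached[0] == fingerprint:
            return cached[1]  # nothing changed: reuse the earlier output
        output = func(params.get(name, {}), upstream)
        self.run_counts[name] += 1
        self.cache[name] = (fingerprint, output)
        return output


# Hypothetical stages; each lambda stands in for a call to a generation tool.
pipeline = Pipeline()
pipeline.add_stage("script", lambda p, up: f"script[{p.get('brief', '')}]")
pipeline.add_stage("video", lambda p, up: f"video[{up['script']}]",
                   depends_on=["script"])
pipeline.add_stage("voice",
                   lambda p, up: f"voice[{p.get('voice_id', 'default')}|{up['script']}]",
                   depends_on=["script"])
pipeline.add_stage("assembly", lambda p, up: f"{up['video']}+{up['voice']}",
                   depends_on=["video", "voice"])

params = {"script": {"brief": "spring launch"}, "voice": {"voice_id": "warm"}}
pipeline.run("assembly", params)

params["voice"]["voice_id"] = "bright"  # change only the voiceover
pipeline.run("assembly", params)

print(pipeline.run_counts)
```

After the second run, the script and video stages have each executed once, while voice and assembly have executed twice: only the stages downstream of the change are regenerated.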
Production economics: before and after AI
The production ceiling collapsed. The volume ceiling disappeared with it. (APR, 2026)
ElevenLabs Flows in action: the full creative pipeline from script to published asset in one canvas. (ElevenLabs, 2026)
Google Flow: mood board, image generation, and video animation in one unified browser window. Free to use. (Google Labs, 2026)
APR (an advertising production consultancy that advises more than 70 global brands) named the scale of the shift plainly in March 2026: “the biggest restructuring of the agency and production world since the 1990s.” They identified three tracks that brands now run simultaneously.
Craft: high-touch artistry for hero content and long-form branded work. Human-led, AI-assisted. This is where premium shoots still happen, but once, with AI handling global versioning.
Maker: nimble, social-first content for rapid-turnaround engagement. A small team, running the stack, producing at volume for specific platforms and moments.
Content Engine: fully automated, globally versioned output at scale. One master asset. AI generates every market variant (language, imagery, format, local pricing, regional cultural context) without a human touching the downstream production.
The Content Engine requires no external agency. It runs on the stack above, available to any brand that can afford a set of software subscriptions.
The 2026 AI Creative Production Stack
Script: ChatGPT / Claude. Brief-to-draft in under a minute, any tone or format.
Video: Veo 3.1 / Runway Gen-4.5 / Seedance 2.0. Native audio / creative control / multi-input pipeline.
Voice & Music: ElevenLabs / Suno v5 / Udio. 70+ languages, full-track music with commercial rights.
Assembly: ElevenLabs Flows / Google Flow. End-to-end canvas; all generation tools in one window.
Orchestration: ComfyUI / Wireflow / Vilva. Pipeline automation; brief in, distributed asset out.
Every Brand Is a Studio Now
Unilever (the consumer goods company behind Dove, Lynx, and Hellmann’s) runs 18 AI creative studios across 18 markets. A brief goes in. Campaign-ready video assets come out: in local languages, featuring the product, formatted for every platform. No production company booked. No crew scheduled. No shoot. The system uses 3D digital models of each product, known as digital twins, to generate the imagery. Since deployment: creative assets produced 30 per cent faster. Video completion rate doubled. Click-through rate doubled. A parallel system called Sketch Pro, built with IPG Studios (the production arm of Interpublic Group), delivers testable assets in two hours at three times the previous production speed. The programme is scaling to 21 markets by the end of 2026.
This is not exceptional. It is now the direction of travel.
82 per cent of members of the Association of National Advertisers (the main US industry body for major brands) now operate an in-house agency, up from 78 per cent in 2018. 60 per cent of US senior marketing leaders told Forrester Research (a global market research firm) in 2025 they spent less on external agencies as a direct result of AI. Forrester projects a 15 per cent job reduction across agencies in 2026, following 8 per cent cuts in 2025.
The numbers at the holding-company level tell the story directly.
WPP (one of the world’s largest advertising networks, owning agencies including Ogilvy and Grey) cut close to 10,000 jobs in a single year. Its “Elevate28” restructuring programme targets £500 million in annual savings by 2028.
Omnicom (another major global advertising group), following its $13 billion acquisition of IPG (Interpublic Group, also one of the world’s largest advertising companies), announced more than 4,000 job cuts and doubled its cost-saving target to $1.5 billion over 30 months. Iconic agency names were retired.
Nike built Nike Icon Studios in Culver City, Los Angeles: a single facility consolidating every global brand pre-production and post-production function under one roof. Not an external agency. A production house owned by Nike.
For a mid-sized brand, the equivalent is a small internal team running ElevenLabs Flows, Midjourney, Runway, and Canva AI, producing monthly content volumes that would previously have required a full-service agency retainer.
Every brand is a production house now. The question is what they are producing.
When Everyone Has the Same Pipeline
What happens when everyone has the same pipeline is not obvious until the feeds fill up.
McDonald’s Netherlands released its 2025 Christmas advertisement using generative video tools. It was pulled within days. Viewers described it as “AI slop”, the term internet users coined for AI-generated content characterised by warped movement, uncanny expressions, and a synthetic quality that is difficult to name but impossible to ignore. Comments said it “ruined Christmas spirits.” Tom Williams, global creative director at Incubeta (a digital growth agency): “It seemed to lack intent. The whole thing gave off Uncanny Valley vibes.” Valentino (the luxury fashion house), Skechers, Colgate, Samsung, and Burger King received similar audience backlash for AI imagery during the same period.
iHeartMedia (one of the largest radio and digital audio companies in the United States) launched a “guaranteed human” tagline after its own research found 90 per cent of listeners wanted their media made by people, not machines.
A 2025 study in the Journal of Business Research (a peer-reviewed academic journal) found that when consumers believe marketing communications are AI-written, they judge the content as less authentic, feel moral disgust, and show weaker engagement and weaker purchase intentions, even when the content is otherwise identical to human-made equivalents.
Europol (the European Union’s law enforcement agency) projected that 90 per cent of online content may be synthetically generated by 2026. Not bad content. Just content. Indistinguishable in volume, similar in texture, arriving from every brand simultaneously.
AI is a homogeneity engine. It produces what is statistically most likely. When every marketing team uses the same tools trained on the same data, output converges. The feeds fill up.
Having the pipeline is not the competitive advantage. It is the floor.
There is a second problem most brands running this pipeline have not yet found. It is quieter than the slop problem. It is, in some ways, more dangerous.
You Own Nothing
Most brands running an AI pipeline have not read the licence clause that defines who owns what the AI creates and what legal rights the company keeps.
In January 2025, the United States Copyright Office published the second part of its Artificial Intelligence and Copyright Report. The finding: writing a prompt into a generative AI tool does not make you the author of the output. Prompts are instructions, not expressions of creativity. The Supreme Court declined to hear the appeal in Thaler v. Perlmutter in 2026, leaving the lower court ruling in place. The rule stands: only human beings can hold copyright authorship in the United States.
This has a direct and serious consequence for every brand running an AI creative pipeline.
If a marketing team generates a campaign image in Midjourney, a video in Runway, and a voiceover in ElevenLabs, and a human simply reviews and approves the output without substantially editing or modifying it, those assets are not copyrightable. They enter the public domain. A competitor can use them. A stock agency can sell them. Anyone can copy them freely, and legally.
The tools themselves offer very different levels of protection.
IP Protection by Tool: What Brands Actually Own
IP protection by AI creative tool. (US Copyright Office, January 2025; Adobe, 2026)
Midjourney tells paying subscribers they “own” what they create commercially. But intellectual property lawyers have identified the gap: if the law does not recognise AI-generated output as copyrightable, Midjourney cannot grant a copyright it does not have. What it is actually offering is a promise not to assert its own rights against the user. There is no legal protection standing between a Midjourney-generated brand asset and a copyright claim.
Adobe Firefly (Adobe’s AI creative toolset, part of Creative Cloud) is the meaningful exception. Its training data was sourced from Adobe Stock (a library of commercially licensed images), alongside public domain content. Adobe explicitly guarantees commercial safety on Firefly outputs and includes IP protection: if someone sues you claiming your Firefly output infringes their copyright, Adobe will cover your legal defence costs up to $10,000 per claim for paid subscribers. It is the only major AI image tool with contractual legal protection built in.
Suno (the AI music generation platform) technically retains authorship even for paid subscribers. Subscribers receive a commercial use licence, not ownership. The music in last quarter’s campaign belongs, in a legal sense, to Suno.
The practical exposure is real. Brands are building creative libraries (social assets, campaign images, advertising videos, product photography) using AI tools without documenting the human creative decisions that would establish copyright. Those libraries may contain no legally protected intellectual property at all.
Reinvent IP (an intellectual property law advisory firm), in a March 2026 briefing for agencies, identified this directly: “the content created with AI tools might not actually belong to the organisation.”
The remedy is not to stop using AI. It is to document the process. The Copyright Office’s guidance: copyright protection applies where a human makes specific, substantial creative choices (editing, selecting, arranging, or modifying) in ways that express genuine creative judgment. Brands need to build what practitioners now call a Creative Audit Trail: a documented record of the human decisions that converted AI-generated output into a protected brand asset. Which elements were generated. Which were edited or restructured by a human. What creative choices were made during that process.
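One way to picture a Creative Audit Trail is as a structured record kept alongside each asset. The sketch below is a minimal illustration, assuming a simple in-house log: the class names, field names, and sample values are all hypothetical, and nothing here is a format prescribed by the Copyright Office or by any tool vendor.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class HumanEdit:
    """One documented human creative decision (hypothetical schema)."""
    editor: str
    action: str        # e.g. "edited", "selected", "arranged", "modified"
    description: str   # what was changed and why


@dataclass
class AuditTrailEntry:
    """Record for a single AI-assisted asset (hypothetical schema)."""
    asset_id: str
    tool: str
    prompt: str
    generated_elements: list
    human_edits: list = field(default_factory=list)

    def has_documented_human_input(self) -> bool:
        # Bookkeeping check only: it shows that decisions were recorded,
        # not that the asset would qualify for copyright protection.
        return len(self.human_edits) > 0


# Illustrative values, not real campaign data.
entry = AuditTrailEntry(
    asset_id="spring-launch-hero-01",
    tool="image generator",
    prompt="product on a kitchen table, morning light",
    generated_elements=["background", "lighting"],
)
entry.human_edits.append(
    HumanEdit(
        editor="art director",
        action="modified",
        description="Recomposed the frame and repainted the label by hand",
    )
)

# Serialise the record so it can be stored next to the asset.
print(json.dumps(asdict(entry), indent=2))
```

The design choice worth noting is that the record separates what the tool generated from what a person changed, which mirrors the three questions the trail is meant to answer.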
Disney, Universal, and DreamWorks filed a federal lawsuit against Midjourney in June 2025, alleging the platform functions as a “virtual vending machine, generating endless unauthorised copies” of copyrighted characters. Midjourney filed a fair use defence, arguing that training an AI on existing images is comparable to a human artist learning from existing work, not directly copying it. The case is ongoing. Its outcome will shape what brands are permitted to build with AI creative tools, and what they legally own when they do.
Any brand building significant creative output with AI tools should be having this conversation with legal counsel now, not next quarter.
The legal question and the competitive question are not separate. Both reduce to the same thing: whether any human being in the building made a real decision about what the brand is saying. A Creative Audit Trail protects the asset legally. A clear brief protects it commercially. The work is the same work.
The exposure
No Creative Audit Trail = no copyright. A brand that generates 500 social assets a month using AI, with a human only reviewing and approving each one, may have built a creative library containing zero legally protected intellectual property. Those assets are freely copyable by anyone, including direct competitors.
The exception
Adobe Firefly is the only major AI image tool with contractual IP protection. For brand-critical assets where copyright matters, it is the only tool in the 2026 stack that includes a legal guarantee. The $10,000-per-claim defence cover for paid subscribers is not symbolic. It is the gap between the tool that hands you a liability and the tool that shares it.
The Taste Gap
The pipeline is shared. The legal exposure is shared. The tools are identical. What is not shared is the judgment about what to make.
When production costs approach zero and the tools are available to every competitor, the resource that becomes scarce is the one the tools cannot supply: judgment about what to make, for whom, and why.
The VC Corner (a venture capital and technology newsletter), in an April 2026 analysis of what differentiates companies in the AI era, put it plainly: “When everyone has the same jet engine, speed is no longer a moat.” Rex Woodbury (a technology analyst and writer) called the current moment the “Costco era of software”: mass-produced, instantly forgettable output that looks like everything else because it came from the same engine as everything else.
Production skill still matters. The brands getting the best results from Runway and Veo 3 are not the ones who opened the tool for the first time this week. There is real expertise in building consistent characters, writing prompts that hold brand voice, and understanding which model does what. That expertise has value. But it has a ceiling. Because the expertise is learnable, and the tools are available to anyone willing to learn them. Production mastery is not the final differentiator; it is a cost of entry.
The brands solving this are not using different tools. They are making different decisions before they open the tools.
Heinz (the food brand known globally for its ketchup) ran a campaign using DALL-E 2 (an AI image generator from OpenAI) prompted with phrases like “ketchup in outer space.” Then it invited consumers to generate and submit their own AI ketchup artwork, displayed in a digital gallery and on physical advertising such as billboards and posters. The campaign generated over two billion impressions and 25,000 personalised videos. Heinz did not use AI to make an advertisement. It used AI to make an invitation. The tool was the same as every other brand’s. The judgment about what to do with the tool was not.
Coca-Cola’s Islam ElDessouky, global vice president for creative strategy and content, described the internal reality after the 2025 AI holiday campaign: “The masses, the audiences, do not necessarily look behind the technology. They just look at the story that they’re receiving.” By Coca-Cola’s own testing metrics, the AI-generated advertisement was one of the best-performing in the brand’s history. The online backlash came from people who noticed how it was made. General audiences responded to what it said.
The gap is not between brands with AI and brands without. It is between brands that know what they are saying and brands that are simply producing.
What taste actually is, how it is built, and what it means for creative strategy gets its full treatment in a separate piece. Here, the argument is narrower: direction precedes production. The brief is the competitive asset. The tools are not.
The Brief Is the Product
The diagnosis is clear. The new job differs from the old one in every layer that matters.
The brief (the specific human argument about what a piece of content should make an audience feel, and why) is where the work is now. Not in execution. Execution is automated. The brief is where competitive advantage begins, where intellectual property can be established through documented human creative decisions, and where brand identity is either held or surrendered to the algorithm.
The New York Times, in a June 2025 analysis of creative work in the AI era, described the shift: writers becoming “article designers,” “story designers” emerging as a distinct role in film and television, “world designers” shaping marketing and gaming. The paper summarised: “More people will be tasked with making creative and taste decisions, steering the AI where they want it to go.”
That is the new job. Not making. Steering.
APR named the three roles a brand needs at the top of its AI creative operation. An AI Workflow Producer designs and maintains the pipeline: how the tools are connected, which workflow runs for which content type, and how governance works across the system. A Prompt Engineer embeds brand voice, audience context, and quality standards directly into the instructions the AI receives, turning the brief into a repeatable, brand-consistent specification. An AI Supervisor holds the standard across all output: the person who reads every finished asset against the brand before it publishes, and removes what does not meet it.
At the level of strategy, the question is: what does the brand stand for, and is the brief specific enough for an AI tool to reproduce that accurately? At the level of tactics: which workflow track is running (Craft, Maker, Content Engine), and does each have a human responsible for the brief? At the level of daily operations: is there a documented Creative Audit Trail for every asset produced, establishing the human creative decisions that make it protectable?
All three questions need answers before the tools are opened.
AI may be a multiplier. It is not a shortcut. A weak idea at scale is still a weak idea. The only thing a brand truly owns is its point of view, and the brief is how that point of view survives contact with the machines.
That shift from tool-access to creative judgment has a direct analogue in an earlier industry.
Music Production Got Cheap. Taste Won.
This has happened before.
In 1991, a Bay Area company called Digidesign launched Pro Tools: a digital audio workstation (software that replaced physical recording equipment, letting music be recorded, edited, and produced entirely on a computer) that destroyed the economics of professional recording studios within a decade. Artists including Billie Eilish and Clairo built careers recording in spare bedrooms. Production that required a £5,000-a-day studio could be done for the cost of a laptop and a software subscription.
What happened to the recording engineers?
The great ones became more valuable than ever. The ears that could distinguish between a bedroom recording that sounded genuinely professional and one that merely sounded decent were suddenly rare, because the tools no longer revealed the distinction automatically. The middle ground of competent-but-interchangeable engineers, who relied on expensive equipment to provide their edge, was hollowed out. The floor dropped. The ceiling rose.
A 2025 ScienceDirect (a major academic research publishing platform) analysis of technology’s impact on content production found the same pattern repeating across every creative technology wave: “each episode shares a common logic: the automation of a critical human task at scale, and the democratisation of that capability to the wider population.” The difference with AI is that it “simulates the expert labour itself,” not just enabling more people to participate, but potentially automating the creative functions that once defined professional advantage.
Pixar (the American animation studio behind Toy Story, Up, and WALL-E) has always had better rendering technology than most competitors. Pixar won because it never mistook the technology for the art. Its defining films were stories about longing, loss, and belonging. The technology served the story. The story came from people who had something to say.
The brands in the most exposed position right now are those in the middle, outsourcing creative direction to agencies whose primary value was production capacity, now watching those agencies restructure and contract. The brands have not yet built the internal creative direction capability to replace what those agencies provided.
The window to build it is now. Not when the tools are more advanced. Not when the legal framework is clearer. Now, while the advantage is still available to those who move.
The Take
Where to Start
Every brand is now answering the same question, whether it knows it or not: are we building the capability to direct the machines, or just running them?
Four actions. One for each person who needs to hear a different version of this.
01
If you are a marketing director
Commission a Creative Audit Trail review for your current AI-generated asset library before it grows any larger. Establish which assets have documented human creative interventions and which do not: the distinction between protected intellectual property and content anyone can freely copy. For commercially sensitive work, switch from Midjourney to Adobe Firefly, the only major AI image tool that includes contractual IP protection. This is a legal conversation as much as a creative one.
02
If you are a founder or brand owner
Invest in direction before you invest in production. The Content Engine (Midjourney, Runway, Seedance, ElevenLabs, Canva) can be operated by a marketing coordinator. What cannot be automated is the brief: the specific articulation of what your brand means, who it is for, and what you want people to feel. Write that down in enough detail that an AI tool could reproduce it accurately without a human in the room. That document is both your creative strategy and the foundation of your intellectual property.
03
If you are a creative professional
The job has changed. The value is no longer in execution; it is in direction. Creative directors, art directors, and strategists who understand how to write precise briefs, build repeatable prompt frameworks, and maintain brand coherence across an AI-generated creative library are in significant demand. The ones who are not developing these skills are competing with the tools directly. That competition cannot be won.
04
If you are building a team
Hire for the three roles APR identified: an AI Workflow Producer, a Prompt Engineer, and an AI Supervisor. These are not technical roles. They are creative and editorial roles that require deep knowledge of what the brand stands for. They are also the only roles in the pipeline that cannot be automated by the tools they oversee.
The closing argument
One strategy scales. The other fills a feed. Every brand is now a production house. Not every brand knows yet what it is producing, or why.
Common questions
Questions, answered first
What is generative AI creative production?
Generative AI creative production is the use of artificial intelligence tools (image generators, video generators, voice synthesisers, and music generators) to produce advertising and marketing content from written instructions. What previously required a team of specialists, a significant budget, and weeks of work can now be done by a small team in hours, using software available as monthly subscriptions.
Do brands own copyright in their AI-generated content?
Not automatically. The US Copyright Office confirmed in January 2025 that writing a prompt into a generative AI tool does not constitute authorship. For AI-generated output to be protected by copyright, a human must make documented, substantial creative decisions: editing, arranging, or modifying the output in ways that express genuine creative judgment. Adobe Firefly is currently the only major AI image tool that includes contractual IP protection for commercial use.
What is AI slop?
AI slop is the informal term for AI-generated content that is technically functional but aesthetically flat or visually uncanny: content that looks generated rather than made. Common characteristics include warped movement, synthetic textures, and a “not quite real” quality that audiences detect instinctively. Several major brand campaigns, including one from McDonald’s Netherlands, were withdrawn after audiences described the content as AI slop.
What tools make up the 2026 AI creative production stack?
The stack has two layers. The generation layer creates content: Midjourney v7 or Flux 1.1 Pro for images, Google Veo 3.1, Runway Gen-4.5, or Seedance 2.0 for video, ElevenLabs for voice, Suno v5 for music. For commercially protected work, Adobe Firefly should replace or supplement Midjourney for brand-critical assets. The orchestration layer wires the tools together: ElevenLabs Flows, Google Flow, or ComfyUI (and its cloud commercial versions, Comfy Cloud and Comfy Enterprise) for technical teams; Wireflow or Vilva for marketing teams without programming resources.
What is a Creative Audit Trail?
A Creative Audit Trail is a documented record of human creative decisions made during AI-assisted content production. It records which elements were AI-generated, which were edited or substantially modified by a human, and what creative choices were made during that modification. It is the evidence that establishes copyright protection for AI-assisted brand assets; without it, those assets may be legally unprotectable and freely copyable by anyone.
What is digital twinning in creative production?
Digital twinning is the practice of creating a precise 3D digital model of a physical product, then using AI tools to generate campaign imagery and video from that model, eliminating the need for physical product shoots. Unilever deployed digital twinning across 18 markets in its AI Beauty Studio, using virtual product models to generate all paid social, programmatic (automated, targeted digital advertising), and e-commerce creative.
APR, March 2026
The Great Production Pivot: “biggest restructuring of the agency and production world since the 1990s”; three-track model (Craft, Maker, Content Engine); three new roles defined.
Forrester Research, 2025-2026
Predictions 2026: Marketing Agencies. 60% of US senior marketing leaders reduced agency spend due to AI; 15% agency job reduction projected for 2026.
Written by Nadim A. Massih, AI & Tech Strategist
Patient Comet · Product · 4 June 2026 · 12 min read
The Second Customer
On 9 March 2026, a federal judge blocked an AI agent from shopping on Amazon. The agent had the human’s permission. Amazon had not agreed to it. That gap, between what a user authorises and what a product was built to handle, is the interface problem every product now faces.
AI-sourced traffic to US retail grew 393% in the first quarter of 2026. It now converts 42% better than human traffic, an 80-percentage-point swing from a year ago. Agents are already on your product. Most products were built for the human who clicks. This is about the second user: the one you did not design for, already there, converting better than the one you did.
Nadim A. Massih
+393%
growth in AI-sourced retail traffic year on year: your fastest-growing buyer is already there (Adobe, 2026)
42% better
conversion rate for AI-sourced vs human traffic in March 2026, reversed from 38% worse a year prior (Adobe, 2026)
$262B
of global online spend influenced by AI and agents over the 2025 holiday season (Salesforce, 2025)
What these three mean together: The traffic is growing faster than anyone anticipated, it is outperforming human traffic on the metric that matters most, and the money is already real. Your product is already serving a second kind of user. The question is whether you built it that way, or whether it is working around you.
On 31 October 2025, Amazon sent a cease-and-desist letter to Perplexity AI. The accusation was not that Perplexity had built a competing product. It was that Perplexity’s Comet browser was helping people use Amazon’s own product, in a way Amazon had not designed for and had not authorised.
Comet is a shopping agent. You tell it what you want. It navigates the web, finds the product, and completes the purchase on your behalf. The person using Comet gave it explicit permission to act for them. Amazon, the platform Comet visited, had not agreed to that arrangement.
On 9 March 2026, a federal judge in California granted Amazon a preliminary injunction. US District Judge Maxine Chesney wrote that Amazon had provided strong evidence that Comet had accessed its website “at the user’s direction, but without authorisation” from the platform itself.
The legal proceedings may run for years. The distinction Judge Chesney drew is not a legal technicality. It is the structural problem every product now faces, whether they have been to court about it or not.
The user said: go to that product on my behalf. The platform said: I never agreed to that.
That gap, between what the user authorised and what the platform was built for, is the interface problem of 2026.
The scale of that problem became visible in Q1 2026, when the traffic data arrived.
The Traffic You Did Not Send For
Adobe Analytics, which covers more than one trillion visits to US retail sites annually, measured AI-sourced traffic through the first quarter of 2026. It grew 393% year over year. During the 2025 holiday season, the growth had been 693%. The rate is not levelling off.
Salesforce’s analysis of the same period found that AI and agents drove 20% of all retail sales and influenced $262B of global online spend over November and December 2025. Autonomous agent actions increased by 142% over the same window.
The most striking figure is the conversion reversal. In March 2025, AI-sourced traffic converted 38% worse than human traffic. In March 2026, it converted 42% better. Revenue per visit was 37% higher. Time on site was 48% longer.
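The 80-point swing behind that reversal is a difference in percentage points, not a ratio. A minimal sketch of the arithmetic, using only the two Adobe figures quoted above (the variable and function names are illustrative):

```python
# Conversion of AI-sourced traffic relative to human traffic, in percent.
# Both figures are the Adobe Analytics numbers quoted in the text.
relative_conversion_march_2025 = -38.0  # 38% worse than human traffic
relative_conversion_march_2026 = 42.0   # 42% better than human traffic


def percentage_point_swing(before: float, after: float) -> float:
    """Return the change between two percentages, in percentage points."""
    return after - before


swing = percentage_point_swing(
    relative_conversion_march_2025, relative_conversion_march_2026
)
print(f"Swing: {swing:.0f} percentage points")
```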
The product did not change. The agent got better at finding what the human actually wanted, and sending only the sessions most likely to convert.
Video: The Amazon vs. Perplexity case and what it means for AI agent commerce: the court ruling that set the terms for who can authorise an agent to act on a user’s behalf. (Front Page, May 2026)
The conversion crossover
In a year, agent-driven traffic went from converting 38% worse than humans to 42% better. Revenue per visit was 37% higher for AI-sourced sessions. Your product already has a second kind of user, and it is now your best-converting one. (Adobe Analytics, April 2026)
Two Users, One Interface
The human who uses your product and the agent acting on their behalf want the same outcome: the right thing at the right price, quickly. The paths they take are entirely different.
The human navigates. They land on the homepage, scan the headline, watch the video, read the pricing comparison, hesitate, close the tab, come back. The interface earns their trust across several visits. The copy matters. The personality matters. The design matters.
The agent parses. It arrives at a URL and extracts: price, specifications, availability, a countable proof signal. It compares across three competitors in the time the human was reading the first paragraph. It executes when the criteria match.
The agent does not watch the video. It does not respond to brand warmth. It does not care about the scroll animation. What it needs is information that is structured and labelled (organised so software can find and extract specific values), and readable by a machine.
Two users, one product page
The human earns the decision across multiple interactions. The agent extracts and executes in a single pass. Both arrive at the same interface. Most were designed for one of them.
What the Machine Cannot Read
Adobe’s 2026 analysis of US retail sites found that the average homepage is 75% machine-readable (structured so software can understand and extract it). The average product page: 66%.
The 25 to 34% that is invisible to an agent is, in most cases, the differentiating content: the main promotional video at the top of the page, JavaScript-rendered pricing comparisons (content that only appears after the page loads fully in a browser, invisible to anything reading the underlying document), CSS star ratings, interactive demos, scroll-triggered copy. Every design pattern that builds trust with a human reader is, to an agent, a void.
The 34% invisible on a product page frequently contains the specific differentiators that separate the product from its competitors: the copy written to justify the price, the proof that addresses the main objection, the specification that wins the decision. An agent that cannot read those sections defaults to price alone.
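One rough way to see this gap is to read a page the way a non-rendering client would: parse the raw HTML and ignore anything a script would have filled in later. The sketch below does that with Python's standard-library parser. The sample page, the class name, and the three regex checks are all illustrative assumptions; real agents differ in whether and how they execute JavaScript.

```python
import re
from html.parser import HTMLParser

# Hypothetical product page: the price is injected by a script,
# so it never appears in the static document.
SAMPLE_HTML = """
<html><body>
  <h1>Trail Jacket</h1>
  <div id="price"></div>
  <p>Rated 4.7 out of 5 by 1,284 buyers</p>
  <a href="/checkout">Buy now</a>
  <script>document.getElementById("price").textContent = "$129.00";</script>
</body></html>
"""


class StaticTextExtractor(HTMLParser):
    """Collects text outside <script> and <style>, as a non-rendering reader would."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def audit_static_surface(html: str) -> dict[str, bool]:
    """Report which of three signals survive without JavaScript (illustrative checks)."""
    parser = StaticTextExtractor()
    parser.feed(html)
    text = " ".join(parser.chunks)
    return {
        "price": bool(re.search(r"\$\s?\d", text)),
        "countable_proof": bool(re.search(r"\d[\d,]*\s+(buyers|reviews|customers)", text)),
        "primary_action": bool(re.search(r"\b(buy|add to cart|subscribe)\b", text, re.I)),
    }


result = audit_static_surface(SAMPLE_HTML)
print(result)
```

On this sample the proof line and the buy link survive, while the script-injected price does not, which is the failure pattern described above.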
Early analysis of Shopify merchant data by Presta in 2026 found that stores optimised for agentic discovery saw 28% higher conversion from AI-driven traffic compared to stores that had not made those changes. These results come from a single study; real-world variation by category is expected. But the direction is consistent with Adobe’s broader data: the structured surface converts better for agents and, in most cases, better for humans too.
The product that loses the agent comparison on price loses the sale before the human ever sees the page.
Understanding how that comparison happens requires understanding how agents connect to products in the first place.
What the Protocols Require
MCP: The Interaction Layer
The Model Context Protocol (MCP), published by Anthropic in November 2024, is the agreed standard that lets AI agents interact with products without scraping webpages. In plain terms: rather than guessing what the price is from the HTML, the agent has a direct, structured conversation with the product itself. Within ten months, OpenAI, Google, and Microsoft had adopted it. By March 2026, over 10,000 active MCP servers were running in production, with 97 million monthly downloads of the SDK. MCP is infrastructure, not experiment.
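To make "a direct, structured conversation" concrete: MCP messages are JSON-RPC 2.0, and an agent asks a server to run a named tool with structured arguments. The sketch below shows the general shape of such a call and its reply, with no network transport and without the official SDK. The `get_product` tool, the `sku` argument, and the catalogue are hypothetical; a real server would also handle initialisation, tool listing, and errors.

```python
import json

# Hypothetical catalogue; a real server would read from the product database.
CATALOGUE = {"trail-jacket": {"price": 129.00, "currency": "USD", "in_stock": True}}


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialise an MCP-style tool call as a JSON-RPC 2.0 request."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


def handle_tool_call(raw_request: str) -> str:
    """Toy stand-in for a server: answers the hypothetical 'get_product' tool."""
    request = json.loads(raw_request)
    sku = request["params"]["arguments"]["sku"]
    product = CATALOGUE[sku]
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"content": [{"type": "text", "text": json.dumps(product)}]},
    })


request = build_tool_call(1, "get_product", {"sku": "trail-jacket"})
response = json.loads(handle_tool_call(request))
product = json.loads(response["result"]["content"][0]["text"])
print(product["price"], product["currency"])
```

The point of the sketch is the contrast with scraping: the price arrives as a labelled value, not as something inferred from markup.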
ACP: The Transactional Layer
The Agentic Commerce Protocol (ACP), co-developed by OpenAI and Stripe and published as an open standard in September 2025, is the checkout flow for agents. If MCP is the door, ACP is the checkout. It handles how an agent identifies itself to a merchant, how payment credentials are handled securely, and how a purchase completes without a human clicking through a single screen. Salesforce, Etsy, and major Shopify merchants have adopted it. US ChatGPT users can already complete purchases from Etsy sellers inside the chat interface, without visiting Etsy’s website. The specification is open-sourced on GitHub.
The Amazon case turns on the Computer Fraud and Abuse Act (the US federal law making it illegal to access a computer system without the owner’s permission). The distinction: the user’s authorisation versus the platform’s authorisation. The Agentic Commerce Protocol is precisely what is designed to resolve that gap before it reaches a court.
The legal picture is still settling. The commercial case for acting now is not.
One Fix, Two Better Users
The changes that make a product agent-readable also make it a better human product. Optimising for agents is not a concession to a secondary user at the expense of the primary one. The improvements compound.
Clear information hierarchy (price stated plainly, specifications structured, social proof as text with a countable number, a single unambiguous action) is what an agent needs to parse and act. It is also precisely what reduces friction for the human buyer. The page that gets out of the way of the decision converts better for both.
The agent’s requirements are a forcing function for the human-readable improvements most product teams should have made already. The plays that follow serve both users.
Three Plays
1
The Surface Audit
Disable JavaScript in your browser and reload your primary product page. Try to extract four things from what remains: the current price, three key differentiators, one countable piece of social proof, and a clear primary action. If any are missing, your agent surface has gaps. This audit takes thirty minutes and requires neither a developer nor specialist tools. The missing items are your work.
Founder-executable · 30 minutes · No tools required
2
Schema Markup
Schema markup is a set of small code blocks that describe your page content to machines in a standardised format: product name, price, availability, review rating; FAQ question-answer pairs; navigation breadcrumbs. These are already required for Google AI Overviews and are the first structured layer an agent reads. For most products, product schema and FAQ schema can be added through a plugin (Yoast SEO or Rank Math on WordPress). Custom builds need an engineer. One implementation, two audiences. Gartner projects 40% of enterprise applications will include task-specific AI agents by the end of 2026.
Plugin on standard platforms · Engineer on custom builds · Serves AI Overviews too
3
Agent Endpoint
Read the Agentic Commerce Protocol specification (open-sourced at github.com/agentic-commerce-protocol). Identify which of your existing checkout steps it maps to. Adopting ACP does not require rebuilding your checkout: it requires exposing a structured, callable surface over what you already have. For most products, this is a 2 to 4 week engineering project. Your product becomes accessible to every AI assistant that has adopted the standard, across every platform, without building separate integrations for each.
2 to 4 weeks engineering · Open standard, free to adopt · Resolves the authorisation gap
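For the second play, the markup in question is usually JSON-LD using the schema.org vocabulary, embedded in the page as a script tag. A minimal sketch of a Product block, assuming a hypothetical product and made-up values; the function name is illustrative, and a plugin or your own templates would normally generate this for you.

```python
import json


def product_json_ld(name: str, price: str, currency: str,
                    rating: float, review_count: int) -> str:
    """Build a schema.org Product block as an embeddable JSON-LD script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "offers": {
            "@type": "Offer",
            "price": price,
            "priceCurrency": currency,
            "availability": "https://schema.org/InStock",
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating,
            "reviewCount": review_count,
        },
    }
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical product values, for illustration only.
snippet = product_json_ld("Trail Jacket", "129.00", "USD", 4.7, 1284)
print(snippet)
```

Because the block sits in the static document, it survives the no-JavaScript audit from the first play: price, availability, and a countable proof signal are all present as labelled values.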
The Take
Serve the Second User on Your Terms
The serving problem is solvable in a quarter. The second customer is already at the window.
The Amazon case will not resolve cleanly. The Ninth Circuit will draw a line between user authorisation and platform authorisation, and wherever it lands, new questions will open. The legal argument will continue for years. The commercial pressure will not pause for it.
AI-sourced traffic grew 393% in the first quarter of 2026. It converts 42% better than human traffic. Agents influenced a quarter of a trillion dollars in holiday spend. These are not projections. They are what already happened, measured across more than one trillion retail visits.
Your product is serving a second customer whether you designed for it or not. Build the door. They are already at the window.
Where to start
Run the no-JavaScript audit on your primary product page. Write down what disappears. Identify which disappeared items contain differentiating arguments.
Add schema markup to your top 10 pages: product schema, FAQ schema, breadcrumb. Plugin on standard platforms, engineer on custom builds.
Read the Agentic Commerce Protocol spec (github.com/agentic-commerce-protocol). Identify which checkout steps it maps to before a competitor adopts it and starts appearing in ChatGPT shopping results instead of you.
Designate one page as an agent-first test. Rebuild it with agent-readable hierarchy as the primary constraint. Measure conversion against the current version for 30 days.
If an agent can complete a purchase on your product without a human ever seeing the page, what is the interface actually for?
Common questions
Questions, answered first
What is MCP and why does it matter if I am not a developer?
The Model Context Protocol is the agreed standard that lets AI assistants interact with products without scraping a webpage. If your product supports MCP, every AI assistant that has adopted it (Claude, ChatGPT, Gemini) can work with it directly. Over 10,000 MCP servers were active by March 2026. You do not write the code. You ask your engineering team whether this is relevant. In most cases, it is.
Does building for agents compromise the human experience?
In almost every case, no. The changes agents require (clear information hierarchy, structured pricing, plain-text proof, unambiguous call to action) are the same changes that reduce friction for human buyers. The page that gets out of the way of the decision converts better for both users.
What is the difference between this and standard SEO?
SEO stops at the click. Agent optimisation covers what happens during the session: whether the agent can extract, compare, and complete the transaction. Schema markup and structured data serve both AI Overviews and agentic commerce. But the surface audit, checking what is readable without JavaScript, is the additional step SEO work does not address.
How do I know if my product page is currently agent-readable?
Disable JavaScript in your browser. Navigate your product page. Try to extract: current price, three key differentiators, one countable piece of social proof, and a clear action. If any are missing, those are your gaps. The most common failure is pricing in a JavaScript-rendered comparison table, invisible to agents, leaving price as the only comparison signal.
Receipts
Sources & references
Adobe Analytics, April 2026
+393% AI-sourced traffic Q1 2026; 75%/66% machine-readability; 42% conversion reversal from −38% in March 2025; 37% higher revenue per visit. business.adobe.com
Salesforce, December 2025
$262B influenced spend; 20% of retail sales driven by AI; agent actions +142%; 66% increase in agentic conversations Nov-Dec 2025. salesforce.com
CNBC / GeekWire, March 2026
Amazon preliminary injunction against Perplexity’s Comet, 9 March 2026; paused by Ninth Circuit; oral arguments 11 June 2026. Judge Chesney ruling on user vs platform authorisation.
Anthropic / modelcontextprotocol.io, 2026
MCP adoption: 10,000+ active servers, 97M monthly SDK downloads, 1,000+ production deployments as of March 2026.
OpenAI / Stripe, September 2025
Agentic Commerce Protocol: open standard for agent commerce; Etsy and Shopify merchants adopted; ChatGPT Instant Checkout live for US users. openai.com, stripe.com
Presta / Shopify, 2026
28% higher conversion for agent-optimised stores vs unoptimised. Single vendor study; real-world variation expected by category. wearepresta.com
Patient Comet · Engineering · 28 May 2026 · 8 min read
When Code Is Cheap, What Do You Ship?
In April 2026, Snap fired about a thousand people and told the market why: AI writes more than 65% of its code. The stock went up around eight per cent. Everyone read it as a story about jobs disappearing. It is actually a story about what engineering is, and always was. The tools changed. Engineering did not. That distinction is good news for anyone who understands it.
AI now writes code, reviews it, explains it, tests it, and documents it. The hardest problems in software (understanding what to build, debugging what is live, making the right call under pressure) have not changed. That is the news. And for engineers who understand their craft, it is good news.
Nadim A. Massih
65%
of Snap’s new code is AI-generated, the stated reason for cutting 16% of staff (CNBC, 2026)
1.7×
more bugs and security flaws in AI-written code than human-written (CodeRabbit, 2025)
76%
of developers say unclear requirements are the #1 blocker when using AI, the problem no model has ever solved (Stack Overflow, 2024)
What these three mean together: AI is generating code at scale, but that code carries 1.7× more bugs, and the biggest obstacle to using it well is not the model but unclear requirements, which no model solves. More investment speeds up the code-writing layer. The scarce resource is now the engineer who knows what to build, not the one who can build it.
The Number in the Layoff Memo
On 15 April 2026, Snap fired about a thousand people. Sixteen per cent of the company. Gone in a memo.
The memo did something unusual. It told the truth.
Most layoff notes hide behind weather words. Headwinds. Realignment. A challenging macro environment. Snap’s chief executive skipped all of that and named the cause out loud: AI agents now generate more than 65% of the company’s new code, and small squads using AI tools can do the work that used to need larger engineering teams (CNBC, 2026).
Then the part nobody could have scripted. The stock went up. Around eight per cent.
Read that sequence again. A company says machines now write most of its software, says it therefore needs far fewer humans, and the market rewards the confession. A sceptic will say the stock went up because Snap has been burning cash for years and the market simply applauded any sign of discipline. That reading is not wrong. But it misses what was actually being priced: not the headcount reduction, but the named cause attached to it. The layoff was not the bad news. The layoff was the proof of concept.
Worth noting: the roles Snap actually cut were in product and partnerships, not engineering directly. Chief executive Evan Spiegel named engineering efficiency as the reason; the redundancies landed elsewhere. That gap between stated cause and actual cuts is precisely why the sceptics in this piece deserve a hearing.
The question the memo raises is the right one: if AI writes the code, what does the engineer write?
To answer it properly, it helps to look at what AI can actually do for an engineer right now, in 2026.
AI handles this now
1. Writes code
2. Reviews code
3. Explains code
4. Generates tests
5. Creates documentation
6. Finds bugs
7. Writes SQL
8. Builds UIs
9. Creates agents
10. Talks to other AI agents
Still requires a human
11. Understanding requirements
12. Debugging production issues
13. Making good technical decisions
14. Communicating with humans
15. Building reliable systems
Technology changed. Engineering did not.
That is the entire story. Everything else in this piece is evidence for it, or implication from it.
The Deliverable Moves Up the Stack
The code was never the valuable part, not for most teams. The code was the toll you paid to find out whether your judgement was right. Now the toll is cheap.
“The code was the toll you paid to find out whether your judgement was right. Now the toll is cheap. The judgement is worth more.”
Nadim A. Massih · Patient Comet · 2026
This is not the first time a craft's tooling got cheap. It happened to photography when digital arrived; to music production when software replaced the studio; to design when templates made layout trivial. Each time, the commodity layer collapsed and the layer above it, the judgement, the eye, the taste, became the thing that separated people who were genuinely good at it. Software is the latest version of that pattern.
For thirty years, the scarce and defining act of building software was the writing of it. The code was the work. You hired for it, you queued for it, you protected the people who could do it.
That bottleneck is dissolving. Google says 75% of its new code is now AI-generated and approved by engineers; Microsoft puts its figure at 20 to 30%; Snap is past 65% (Google; CNBC, 2025-2026). Roughly nine in ten developers now reach for an AI tool as a matter of course. The act that used to be the job is becoming the cheap part of the job.
Most teams are not at 65 or 75 per cent yet. But the direction is not in question, and preparing now, before the transition arrives at your organisation, is the entire point.
Video: Sundar Pichai at Google I/O 2026. At Cloud Next the month before, he confirmed that 75% of all new Google code is now AI-generated and approved by engineers, up from 25% in October 2024. (Google I/O 2026 Keynote)
So where did the value go?
Most people get this wrong in the same direction. They assume that when the cost of code collapses, the value collapses with it. The opposite is happening. The value did not vanish. It moved, up the stack, to the things the model cannot do for you: deciding what to build, knowing whether it is any good, specifying it precisely enough that an agent can execute it, and getting it in front of the right people.
What is left exposed is the engineering. Items 11 through 15 (the understanding, the decisions, the reliable systems, the communication with humans) were always the job. Now they are the only part of the job that compounds in value as the tools get cheaper.
IBM CEO Arvind Krishna has described what he calls the $8 trillion math problem: reaching 100 gigawatts of global AI compute, at roughly $80 billion per gigawatt, requires $8 trillion of capital expenditure, committed entirely to making items 1 through 10 faster and cheaper. Not one watt of those data centres makes items 11 through 15 easier to do (Futurism / IBM, 2025).
The return data confirms the gap. Goldman Sachs calculates that sustaining current investor return expectations requires AI companies to generate more than $1 trillion in annual profit from their AI infrastructure. The 2026 consensus estimate: around $450 billion, roughly $0.45 back for every $1 of capital expenditure (capex) (Goldman Sachs, 2024-2026). The report named the problem plainly: “too much spend, too little benefit.” The DORA 2026 developer research programme found the missing variable: strong engineering foundations are what actually drive AI return on investment. The organisations seeing real returns are the ones with mature practices in the things AI cannot do: requirements, decisions, reliable systems.
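The two headline figures in this section are simple arithmetic, and it is worth seeing how they line up. The sketch below uses only the numbers quoted above; it treats the $0.45 figure as the ratio of the consensus estimate to the required profit, which is my reading of how the numbers fit together, and the constant names are illustrative.

```python
# Figures as quoted in the text; all values in US dollars.
TARGET_GIGAWATTS = 100
COST_PER_GIGAWATT = 80e9          # roughly $80 billion per gigawatt
REQUIRED_ANNUAL_PROFIT = 1e12     # "more than $1 trillion"
CONSENSUS_ANNUAL_PROFIT = 450e9   # "around $450 billion"

# Krishna's "$8 trillion math problem": capacity times cost per unit.
total_capex = TARGET_GIGAWATTS * COST_PER_GIGAWATT

# Assumed reading of the $0.45 figure: consensus estimate over required profit.
coverage = CONSENSUS_ANNUAL_PROFIT / REQUIRED_ANNUAL_PROFIT

print(f"Total capex: ${total_capex / 1e12:.0f} trillion")
print(f"Consensus covers {coverage:.2f} of each required dollar")
```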
As code authorship rises, the human deliverable moves up
The share of new code written by machines roughly tripled in eighteen months. As that line climbs, the human’s job slides up the stack, from keystrokes to spec, review, and taste (Google; Snap, 2024-2026).
Cheap to Write Is Not Cheap to Own
Those numbers describe where the value actually lives, and it is not in the production layer. Cheap to write is not the same as cheap to own. This is not a theoretical concern: the evidence is in, and it is not flattering. The cost of code did not disappear when the typing got fast. It relocated, downstream, to review, to maintenance, to the slow tax of running software nobody on the team fully understands.
CodeRabbit (a code review analytics platform) found that AI-co-authored pull requests (bundled sets of code changes submitted for review) carried about 1.7 times more issues than human-only code, with security problems up to 2.7 times worse (CodeRabbit, 2025). The code arrives faster, and arrives carrying more of the kind of problem you do not see until later.
Then the part that should unsettle you. A controlled trial (sixteen experienced developers, 246 tasks, randomised assignment) put developers on code they knew well. With AI, they were about 19% slower (METR, an AI evaluation safety research organisation, 2025). Not faster. Slower. And they believed they were faster the whole time. The tool did not just cost them time. It cost them the ability to notice they were losing it, which is a different kind of problem entirely.
Developers forecast a 24% speedup from AI and observed a 19% slowdown. Even after experiencing the slowdown, they still believed AI had made them faster. (METR, 2025)
A large industry study found the pattern underneath: AI raises throughput (the rate at which code ships) and worsens delivery stability (DORA, the annual industry developer research report, 2025). It does not fix your team. It amplifies whatever your team already is. Disciplined shops get a multiplier. Sloppy ones get a faster way to ship the mess. There is a name worth keeping for the bill that comes due here: comprehension debt, the accumulating cost of shipping code nobody fully understands. You take it on quietly, at speed. You repay it all at once, in production, on the worst possible day.
The Apprenticeship Needs a New Curriculum
The cost moved. The work moved to higher ground. What about the people?
Stanford found that employment for early-career developers, the ones aged 22 to 25, is down about 20% since AI went standard (Stanford, 2025). Not redistributed. Down. The roles that contracted fastest were the ones closest to items 1 through 10: take a clear ticket, write the obvious implementation, hand it back. That is exactly what an agent now does for nothing.
76% of developers cite incomplete or unclear requirements as the primary blocker when working with AI coding tools, making requirements-gathering the constraint on whether AI delivers any value at all (Stack Overflow Developer Survey, 2024).
But look at what is expanding. The roles companies cannot fill fast enough are the ones requiring items 11 through 15: the technical lead who can run a requirements session, the senior engineer who can debug production without a map, the architect who makes good decisions under real ambiguity.
The apprenticeship is not being automated away. It is being rebuilt around a different curriculum. The path forward for early-career engineers is not to compete with AI on items 1 through 10. It is to use those tools to compress the routine work and invest the hours saved into items 11 through 15. The engineers who do that now, before the transition arrives everywhere, are not behind the curve. They are the curve.
Items 11 through 15 are where the work now lives. If you run a team that ships software, four moves follow.
1
Make the spec the deliverable
A spec is a written description of what software should do and why, before anyone writes a line of code. Stop treating it as paperwork on the way to the real thing. The spec is now the real thing, the artefact you track and protect. GitHub’s Spec Kit (a tool for writing structured specifications before any code is written) makes this literal, and once the spec is solid the agents underneath become interchangeable.
Product & eng
2
Move people from author to verifier
The old senior wrote the hard code; the new senior reads everything and decides what is true. That is a different muscle, and most teams have let it atrophy. Train for it on purpose, because the scarce skill is now judging a diff (the specific lines of code that changed) a machine wrote, fast, and knowing whether it is right.
Engineering
3
Move people up, not out
Snap moved people out, and the market clapped, but that answer eats your own future. The harder, better one is to move people up the stack faster than the machine eats the bottom of it: into problem definition, into taste, into judgement.
Leadership
4
Make distribution and taste the moat
When anyone can produce working software in an afternoon, the software is not the moat. Knowing what to build, building it with taste, and getting it to the right people are the three things the model still cannot do. Spend your scarce human attention there.
A dedicated piece will go deep on what taste is at the craft level, and why it compounds the cheaper the tools get.
Strategy
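The first move, making the spec the deliverable, gets easier to picture when the acceptance criteria are written so a machine can check them. A minimal sketch, assuming a hypothetical discount feature; the `Criterion` class, the spec itself, and the candidate implementation are all made up for illustration and are not how Spec Kit or any particular tool represents a spec.

```python
from dataclasses import dataclass
from typing import Callable

# A pricing function takes (price, code) and returns the price to charge.
PricingFn = Callable[[float, str], float]


@dataclass
class Criterion:
    description: str                    # the "what and why", in plain words
    check: Callable[[PricingFn], bool]  # how a reviewer verifies it


# Hypothetical spec: "SAVE10 takes 10% off; nothing else changes the price."
SPEC = [
    Criterion("SAVE10 takes 10% off", lambda f: f(100.0, "SAVE10") == 90.0),
    Criterion("Unknown codes leave the price unchanged", lambda f: f(100.0, "BOGUS") == 100.0),
    Criterion("Price never goes negative", lambda f: f(0.0, "SAVE10") >= 0.0),
]


def apply_discount(price: float, code: str) -> float:
    """Candidate implementation, as an agent might produce it."""
    return round(price * 0.9, 2) if code == "SAVE10" else price


def review_against_spec(implementation: PricingFn) -> list[str]:
    """Return the descriptions of any criteria the implementation fails."""
    return [c.description for c in SPEC if not c.check(implementation)]


failures = review_against_spec(apply_discount)
print(failures or "All acceptance criteria met")
```

The review step never looks at the lines of `apply_discount`. It asks whether the behaviour matches the stated intent, which is the shift from author to verifier described in the second move.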
Those four moves assume you believe the shift is real and the direction is set. Not everyone does. Three people are arguing about all of this, and they are all partly right, and you should hear them before you decide which side to stand on.
Three Roles Worth Understanding Now
The measurer
“The gains are an illusion.”
A controlled trial found experienced developers 19% slower with AI on code they knew well, and they believed they were faster the whole time. The tool did not just cost them time. It cost them the ability to notice. A productivity revolution you cannot measure is a story, not a result, so measure your real cycle time (how long it actually takes from starting a feature to shipping it) before you believe the headline.
The maintainer
“You are confusing writing with owning.”
AI pull requests carry about 1.7 times more issues, security problems multiply, and the bill is deferred, not cancelled. The code is cheap to produce and expensive to live with, and comprehension debt compounds in the dark.
The realist
“AI is an amplifier, not an engine.”
It does not fix a team, it magnifies what is already there: strong teams pull ahead, weak ones get worse, and stability degrades without discipline. Real, and good, but only for teams disciplined enough to deserve it. For everyone else, a faster way to be exactly what you already were.
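The measurer's advice, to check real cycle time before believing the headline, needs nothing more than two timestamps per feature. A minimal sketch, assuming hypothetical feature records with a start time and a ship time; in practice these would come from your issue tracker or deployment log.

```python
from datetime import datetime
from statistics import median

# Hypothetical records: (feature, work started, shipped to production).
FEATURES = [
    ("saved-carts", "2026-03-02T09:00", "2026-03-06T17:00"),
    ("gift-notes", "2026-03-09T09:00", "2026-03-11T12:00"),
    ("bulk-export", "2026-03-10T10:00", "2026-03-20T10:00"),
]


def cycle_time_days(started: str, shipped: str) -> float:
    """Elapsed time from starting a feature to shipping it, in days."""
    delta = datetime.fromisoformat(shipped) - datetime.fromisoformat(started)
    return delta.total_seconds() / 86400


durations = [cycle_time_days(start, ship) for _, start, ship in FEATURES]
print(f"Median cycle time: {median(durations):.1f} days")
```

Tracking the median before and after adopting an AI tool gives you a number of your own to set against the perceived speedup.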
The Take
Technology Changed. Engineering Did Not.
Technology changed. Engineering did not. That is good news, if you know what engineering actually is.
The $8 trillion going into AI data centres is the largest public signal the industry has ever sent about where the scarce value sits. Every dollar of it is chasing items 1 through 10. Not one of those dollars makes items 11 through 15 easier to do. Which means the investment, read correctly, is pointing directly at engineers who understand requirements, who can debug production, who make good decisions, who communicate with humans, who build systems that actually hold.
The companies reading the Snap memo as a headcount story are reading it wrong. The right reading: when the commodity layer gets cheap, what remains is what was always underneath it. And what was always underneath it was engineering. The $8 trillion is not a threat to the craft. It is a searchlight illuminating where the craft lives.
One thing to do this week: take the next feature on your list and write what it needs to do and why, before anyone writes a line of code. Then review the result against that intent, not against the lines themselves. That single habit is the gate between an engineer who feeds the machine and one who decides what the machine builds. It has always been the gate. Now it is the only one that matters.
If you are an individual engineer rather than a lead, the same signal runs one level down. Items 11 through 15 are not team-level abstractions; they are what you do every day when you are doing engineering well. Invest in them now, explicitly, before everyone else realises they are scarce.
If you are early-career: this is the best career news you have had. The path to compound value runs through items 11 through 15. Use the tools to handle items 1 through 10. Invest the hours you save into learning to understand requirements nobody else can articulate, to debug systems nobody else can read, to make calls nobody else wants to own. Those skills do not get automated. They get scarcer, and more valuable, as every new model makes items 1 through 10 cheaper.
Where to start
Write the next feature as a spec. What and why, with acceptance criteria, before any code.
Review against the spec, not the diff. Judge whether it does what you meant, not whether the lines look plausible.
Reinvest the saved hours in verification. Testing, version control, small batches, real review.
Move juniors into specifying, not out the door. That is where the next seniors come from.
If you are early-career: use items 1 through 10 to compress your learning time, then invest every hour saved into items 11 through 15. The engineers who compound from here are the ones who get to requirements before anyone writes a ticket. That is not a new job description. It is the original one.
The $8 trillion going into AI infrastructure is not a threat to engineers who know their craft. It is a searchlight pointing at exactly where the craft lives, and always lived.
Written by Nadim A. Massih, AI & Tech Strategist
Common questions
Questions, answered first
Why isn't AI investment paying off yet?
Goldman Sachs estimates that sustaining current investor return expectations would require $1 trillion+ in annual AI profit, but 2026 consensus is ~$450 billion, roughly $0.45 back for every $1 of capex. DORA's 2026 research found the organisations that do see real AI returns are those with strong engineering foundations: mature requirements gathering, sound architecture, and reliable systems. The tool works. The return depends entirely on who is wielding it (Goldman Sachs; DORA, 2024-2026).
Is code really mostly written by AI now?
At some of the largest engineering organisations it is now the majority of new code: Google says 75% and Snap over 65%, while Microsoft reports 20 to 30%. The caveat is “AI-generated and approved by engineers”: humans still gate it (Google; Snap; CNBC, 2025-2026).
Does AI-generated code actually make teams faster?
Mixed. One industry study found higher throughput overall, while a controlled trial found experienced developers 19% slower on code they knew well. Real gains on new work, real risks on deep maintenance (DORA; METR, 2025).
If code is cheap, what becomes the scarce skill?
Problem definition, architecture, taste, and distribution. The value moves up to that layer, and the early-career roles doing commodity coding are the ones already shrinking (Stanford, 2025).
Is this actually good news for engineers?
Yes, if you understand what engineering actually is. Every dollar invested in AI data centres is going into items 1 through 10 (code, tests, docs, SQL, agents). Items 11 through 15 (requirements, production debugging, good decisions, human communication, reliable systems) are not getting easier. Capital chases scarcity: the $8 trillion points directly at where the scarce value sits. Engineers who invest in 11 through 15 now are building skills that compound as every new AI model makes 1 through 10 cheaper (IBM; Stack Overflow, 2024-2025).
What is the catch nobody mentions?
Cheap to write is not cheap to own. AI pull requests carry about 1.7 times more issues, security problems multiply, and the cost relocates to review, maintenance, and the comprehension debt of code nobody fully understands (CodeRabbit, 2025).
Receipts
Sources & references
Goldman Sachs, 2024-2026
GS Research report “Gen AI: Too Much Spend, Too Little Benefit?” (June 2024) estimated ~$1 trillion in AI capex with little measurable return so far. January 2026 update found companies need $1T+ annual profit to justify current capex; 2026 consensus estimates show ~$450 billion, roughly $0.45 for every $1 spent. GS also found hyperscalers are consuming 94% of operating cash flow on AI infrastructure, forcing $108B+ in debt financing during 2025 alone.
IBM / Futurism, 2025
IBM CEO Arvind Krishna described the "$8 trillion math problem" on the Decoder podcast: 100 gigawatts of planned AI compute at $80bn/GW equals $8 trillion capex, requiring ~$800bn profit just to cover interest. The investment is entirely in the commodity layer of software (items 1-10); none of it reduces the difficulty of requirements, debugging, decisions, communication, or reliability.
Stack Overflow Developer Survey, 2024
76% of 90,000+ developers surveyed cite incomplete or unclear requirements as the most common blocker when working with AI coding tools, making requirements-gathering (item 11) the primary bottleneck in AI-assisted development.
CNBC / Snap, 2026
Snap cut ~16% of staff in April 2026; the CEO said AI agents generate over 65% of new code and small squads now do the work of larger teams; the stock rose ~8% in regular trading (some pre-market reports cited higher).
Google / Fast Company, 2026
Google said 75% of new code is AI-generated and approved by engineers; a leader said engineers are becoming product engineers and architects.
CodeRabbit, 2025
Across pull requests, AI-co-authored ones carried about 1.7x more issues than human-only ones, with security issues up to 2.7x higher.
METR / DORA, 2025-2026
A randomised controlled trial (16 developers, 246 tasks) found experienced developers 19% slower with AI on code they knew well; developers expected a 24% speedup and still believed AI had helped them after. METR published a 2026 update tracking productivity with later AI models. DORA found AI raises throughput but worsens delivery stability and amplifies existing discipline.
Stanford / JetBrains, 2025-2026
Employment for early-career developers (ages 22 to 25) is down about 20% since AI went standard; about 90% of developers now use an AI tool at work.
Patient Comet · Strategy · 4 June 2026 · 9 min read
The Taste Problem
On New Year’s Day 2026, Instagram head Adam Mosseri posted a 20-slide memo and buried the lede: “Authenticity is becoming infinitely reproducible.” He meant it as a warning. It is also the most hopeful thing anyone in technology has said in years. When AI can fake effort, polish, and sincerity on demand, the proof of human presence migrates to the one thing no tool can replicate: a specific, opinionated perspective that only one person could hold. This is the taste problem. It turns out it is not a problem at all.
When everyone rents the same intelligence, taste becomes the moat.
Nadim A. Massih
When Effort Stopped Signalling Quality
Slide nine of Adam Mosseri’s New Year’s memo is easy to miss. He is not, on the surface, talking about Instagram. He is talking about cameras.
“The camera companies are betting on the wrong aesthetic. They’re competing to make everyone look like a pro photographer from 2015. But in a world where AI can generate flawless imagery, the professional look becomes the tell.”
For twenty years, the logic of creative credibility ran on a single premise: visual quality was expensive. A polished carousel took hours of real work. A well-lit photograph required equipment, training, or both. A written piece with genuine rhythm required someone who knew what they were doing. The polish itself was the proof. If something looked considered, it was because someone had considered it.
AI dissolved that logic in approximately eighteen months.
It is not simply that anyone can now generate something that looks like effort. It is that the effort signal itself, the cue audiences used to calibrate trust, has been severed from the product. The carousel that took four hours of genuine thought looks identical to the one assembled in eleven seconds. The written piece with real rhythm is, in form, indistinguishable from the piece a machine produced by pattern-matching against a million others.
When a signal breaks, the market searches for a new one. Here is what makes this moment unusual: the new signal is something most working creative people have been building for years without knowing it had a name.
Mosseri named it in slide twelve. After explaining that AI will soon replicate even the raw, imperfect aesthetic that some creators are leaning into as proof of humanity, he wrote: “At that point we’ll need to shift our focus to who says something instead of what is being said.”
This is not a platform strategy note. It is a description of what authenticity has always been, once everything else it used to cosplay as has been stripped away.
As AI tools commoditise technical execution, the competitive value of taste and judgment rises. The two lines crossed somewhere around 2022-23. In 2026 the gap is measurable: the person who can select and judge at speed is worth considerably more than the person who can produce at speed. (Patient Comet analysis, 2026)
“AI dissolved the effort signal in approximately eighteen months. The carousel that took four hours of genuine thought looks identical to the one assembled in eleven seconds.”
What Taste Actually Is
Start with what it is not.
Taste is not aesthetic preference. It is not having opinions about fonts. It is not knowing what looks good. All of those things can be faked, borrowed, or generated from a reference library. They are outputs of taste, not taste itself.
Taste is the operating system that runs at every decision point in a creative process. It evaluates options not on the question “is this competent?” but on the question “is this right for this?” The two questions look similar. They are not.
AI is very good at “could.” Ask it for options and it will produce fifty of them, all technically competent. What it cannot do is tell you which of those fifty is right for this audience, this moment, this brand, this specific argument. That knowing, which is not reasoning from first principles but feels like recognising something you have seen before in a different form, is taste.
The radio producer Ira Glass (host of the long-running American public radio programme This American Life) described the beginner’s version of this gap in a 2009 interview that has circulated among creative professionals ever since. Beginners get into creative work, he said, because their taste is already strong. The problem is that their taste exceeds their ability. They can sense the work is wrong but cannot fix it yet. The gap closes slowly, through volume.
AI inverts this exactly. The machine’s output capability now exceeds most people’s capacity to judge it. The new gap is not “can I execute?” It is “can I sense which execution is right?”
That is good news. The sensing, built slowly through years of having opinions, defending them, being wrong about them, and having them again, is what no tool generates from a prompt. You have been building it for longer than you think.
Ira Glass describes the gap between taste and ability at the start of a creative career. AI inverts this gap: the machine’s output capability now exceeds most people’s judgment. The direction has reversed, which means the constraint is now the thing you have been building all along. (David Shiyang Liu / YouTube, 2009)
Every Democratisation Is a Taste Test
This has happened before. It just tends to take longer.
In 1985, PageMaker and the Apple LaserWriter put a print shop inside every office. Typesetting, which had required training, equipment, and the quiet expertise of people who understood leading and kerning and when not to mix typefaces, was suddenly available to anyone with a Macintosh and an afternoon. The first generation of desktop-published newsletters used approximately every font available simultaneously. By 1989, offices were drowning in justified text, clip art, and drop shadows rendered at 72 dots per inch.
The typographers and designers who survived that flood were not the ones who learned PageMaker fastest. They were the ones with enough taste to know which of its capabilities to leave unused.
They did not just survive. They became more valuable.
In 2010, the iPhone 4 camera marked the moment when photography stopped being a technical bottleneck for most people. The image-making floor rose sharply. Professional photographers were widely declared redundant. What actually happened was that photographers who had built genuine visual judgment over years, who could see something worth capturing before lifting the camera, became more valuable, not less.
The technical floor rose. The taste ceiling rose further.
The pattern holds across every access event: the work that survives is the work that was never just about technical access. Desktop publishing, affordable studio equipment, the smartphone camera, and now generative AI, each one raises the floor. Each one makes the question of what is worth making more important than the question of whether you can make it.
The difference this time is speed and scope. AI has raised the floor not on one skill but on dozens simultaneously: writing, design, music, code, photography, video, strategy documents, analysis. The taste test is happening across every discipline at once. And the people who had been quietly building judgment in their field are, right now, sitting on the most valuable professional asset in the room.
“The typographers who survived the desktop publishing flood were not the ones who learned PageMaker fastest. They were the ones with enough taste to know which capabilities to leave unused.”
The Three New Signals
After the old signal breaks, what does the market use instead?
Mosseri named three in his memo, though not in those terms. Specificity. Voice. The visible presence of a real person’s perspective.
Specificity is the observation that only someone in that room could make. Not “AI is changing creative work” but “on the Thursday morning after our studio read the Mosseri memo, we retired our standard carousel template for the first time in two years.” The machine can approximate general. It cannot approximate yours. It does not know what your Thursday morning felt like, what specific decision preceded this one, what two things were rejected before you landed here. Specific detail is a proof of origin. It costs nothing to add and is impossible to replicate.
Voice is not consistent tone across a brand. It is the grain of how one person processes a problem: the rhythm, the things they notice before anyone else notices them, the comparisons they reach for that no one else would reach for. Voice takes years to develop because it is the residue of thousands of choices made with conviction. You cannot synthesise residue.
The visible presence is sometimes the smallest thing. The Instagram educator Rishi Shine (creator of the widely followed @therishishine carousel format) signs every cover with “by Rishi” in handwriting. One handwritten line in an otherwise designed system. The machine perfection breaks exactly once, and the reader’s nervous system registers it: a person made this.
Not as a disclaimer. As a presence.
This logic extends far beyond Instagram. The same signal applies to written pieces, product decisions, design systems, and the work of anyone who communicates publicly. The cost of the signal is approximately zero. A script font signature on a cover. The admission in a piece of writing that you changed your mind mid-argument. The one specific detail no one else would have included.
For Patient Comet, the equivalent of “by Rishi” is “by Nadim” in a script font on every cover. Not because it looks good. Because it is the one element in an otherwise pixel-perfect system that a machine, left to its own devices, would never think to add.
Taste Gets Stronger With Every Decision
What makes taste unusual as an asset is that it appreciates through use rather than depreciating.
Every creative choice you make (which headline, which case study, where exactly to end a paragraph) builds a judgment model that becomes the filter for the next choice. The reps are not visible on the outside. The accumulation is.
This is the most hopeful part of the picture. Every creative professional who has ever stopped at a bad layout and asked themselves why it is wrong, everyone who has ever rewritten a headline three times to find the sharper version, everyone who has ever cut a paragraph they loved because it was not serving the argument, has been doing compound-interest work without calling it that.
This is why experienced creative directors are not faster processors of options. They are more refined refusers of options. The ability to scan forty competent layouts and immediately identify the three worth discussing is not speed. It is a dense, compressed library of pattern-and-reject, assembled through sustained, opinionated exposure to work that was trying and mostly failing to be excellent.
In an AI-accelerated workflow, this matters more than it ever has. When a tool generates forty competent options in the time it once took to produce one, the bottleneck shifts entirely to the human doing the selecting. The judgment has to be faster, sharper, and more accurate than the tool is prolific. The person with a compounding taste library just became the most valuable person in that room.
The Cannes Lions international festival of creativity (the industry’s benchmark awards event, running since 1954) made this concrete in 2026. For the first time in the festival’s history, all submissions are subject to integrity standards that evaluate provenance and human authorship alongside craft. What Cannes is acknowledging, in its characteristically deliberate institutional way, is that the creative industry has moved from valuing who can make it to valuing who can judge it.
The question, then, is not whether to build taste. It is how.
“Experienced creative directors are not faster processors of options. They are more refined refusers of options. That refusal is built from years of reps, and no prompt generates it.”
How It Gets Built
Taste is not received. It is built slowly, through a specific kind of practice.
The mechanism is not consuming more. It is having more opinions about what you consume. The designer who can identify a wrong spacing decision from across a room developed that ability not by reading about spacing but by stopping at bad spacing, naming what is wrong, articulating exactly why, and filing the specific quality of its wrongness somewhere she will access automatically for the next twenty years.
Three practices matter more than any tool or tutorial.
The first is cross-domain exposure. The sharpest taste in any field tends to be built on fluency in adjacent ones. The designer who reads novels. The strategist who studies architecture. The engineer who has spent real time with music. Not because the domains share technique but because they share the underlying logic of what it means to decide what to include and what to leave out. Paul Rand (the American graphic designer, creator of the IBM and ABC logos) read widely outside graphic design. Dieter Rams (the German industrial designer whose work became the direct inspiration for Apple’s design language) drew from music and architecture as reference disciplines. Taste built in one room tends to come from inputs gathered in several.
The second is defended positions. You do not have taste unless you have opinions you will argue for. The ability to say “this is right and this is wrong,” precisely, in the face of technically competent alternatives, and with a clear account of why, is the practice that sharpens taste into something you can actually use when fifty options are sitting in front of you.
The third is the mistake archive. Every call you got wrong is tuition paid. The headline that underperformed. The design direction you championed that confused the audience. The feature you shipped that nobody used. Taste only compounds if you stay in the room after the wrong call and understand specifically what you missed. Not “it didn’t land” but “I misjudged this because I prioritised x over y.” The specificity is the compound interest.
Who Becomes Most Valuable Now
The work that AI cannot do well is not creative work in general. It is the specific work of choosing.
A creative tool can produce a hundred options. It cannot tell you which of those options is right for this brand, this audience, this specific tension in the market at this specific moment. That knowing lives in accumulated human judgment, and it does not transfer to a prompt.
This changes what you should be building in yourself and in the people around you. In a pre-AI talent model, the expensive thing to develop was technical skill: how to produce something at the required standard, how to operate the tools, how to meet the spec. In the model we are now in, the expensive thing to develop is judgment. How to survey a space of options and know, immediately and defensibly, which three are worth discussing. How to say “not that” in a way that moves the work forward. How to recognise the version that is almost right and to fix it, rather than accepting the version that is merely technically complete.
This also changes who you want in the room. The person who produces the most options fastest is not the most valuable person at this stage. The person who can look at fifty options and immediately identify which three deserve attention, and can explain why in terms that others can use, is worth considerably more than they were five years ago.
Mosseri ended his memo with this: “In a world of infinite abundance and infinite doubt, the creators who can maintain trust and signal authenticity, by being real, transparent, and consistent, will stand out.”
Strip out the platform-speak. What he is describing is judgment: applied publicly, maintained over time, and expressed with enough specificity to be recognisably yours.
“The tools produce competent. Taste produces excellent. And taste, as it happens, was already being built by everyone who ever cared enough to have an opinion about their work.”
The Take
The Asset You Already Have
The floor has risen for everyone. That is the news you have been reading for two years, and it is accurate. Here is what is less reported: when the floor rises, the ceiling rises with it.
The distance between competent and excellent has never been larger. The tools produce competent. Taste produces excellent. And taste was already being built by everyone who ever cared enough to have an opinion about their work, by everyone who ever rewrote a headline three times, by everyone who ever cut a paragraph they loved because it was not serving the argument.
You could object that “taste” is simply the word we use for whatever worked. That the people we call tasteful were just lucky, and the compounding mechanism is survivorship bias. That is a reasonable position. Here is the problem with it: it cannot explain why the same people keep winning across changing conditions, different tools, and categories they have never worked in before. Luck is not portable. Taste is. The machine does not compound. You do.
1. The Choice Audit
Review your last ten creative decisions: briefs, headlines, designs, arguments, pitches. Of those ten, how many were genuine choices you made, and how many were defaults you accepted because they were generated or available? The audit shows where your judgment is operating and where you have quietly outsourced it.
Anyone making creative decisions
2. The Defended Position
Name one opinion about your field that you would argue for in a room full of people who disagreed. Not a preference. A position, with a reason behind it. If you cannot name one immediately, you are accumulating exposure without accumulating judgment. That is the thing to fix first.
All disciplines
3. The Cross-Domain Rep
Spend an hour with one discipline adjacent to yours: a novel, a building, an album, a film. Write two sentences about what the best thing in it did (not what it was about). File them. The practice builds the library that judgment draws from, and cross-domain fluency is where the sharpest taste consistently comes from.
Designers, writers, strategists, engineers
4. The Signature
On your next public creative output, whatever form it takes, add one element that is specifically and unmistakably yours. Not a brand watermark. A presence. The one thing in an otherwise produced system that tells the reader a person was here, and this person had a point of view. It costs nothing. It is the signal the new environment rewards.
Sources & references
Adobe, 2024
87% of consumers across the US, UK, France, and Germany said the rise of generative AI has made it harder to discern fact from fiction in online content. 6,000-person primary research study. Published May 2024: blog.adobe.com. A separate Adobe study found 67% of consumers expect brands to disclose AI-generated content.
Cannes Lions 2026 Global Integrity Standards
For the first time in the festival’s 72-year history (founded 1954), all competition entries are subject to global integrity standards requiring authenticity verification and provenance review. Statement: canneslions.com. Coverage: “Cannes Lions 2026: The AI Hype Era Is Over, Proof Is the New Flex,” Ad Pulse, 2026.
Ira Glass, on the taste-ability gap, 2009
Interview animated by David Shiyang Liu. The foundational description of the creative taste gap: beginners enter creative work because their taste already exceeds their ability to execute. The gap closes through volume of work. Widely cited across design and editorial disciplines. youtube.com/watch?v=91FQKciKfHI.
Further reading, 2025-2026
“Taste Is the New Bottleneck: Design, Strategy, and Judgment in the Age of Agents and Vibe-Coding,” Designative.info, February 2026. “Why creative taste, not AI, is the true advantage for brands,” Ad Age, 2025. “After an oversaturation of AI-generated content, creators’ authenticity and ‘messiness’ are in high demand,” Digiday, 2025. “Taste will be the new creative superpower in 2026,” Creative Bloq, 2025.