The AI Supply Chain Nobody's Figured Out
AI may be the biggest supply chain risk in our history, but not quite in the way the Pentagon-Anthropic feud suggests. Individual fluency frameworks, organisational governance and national policies all miss one crucial layer: the conversation about what AI is doing to our work. That conversation starts locally, with one team asking: what's working?
Two stories dominated the AI news cycle last quarter. Both burned bright. Both faded within days. And both left behind questions that nobody has answered.
In the first, the Pentagon declared Anthropic a "supply chain risk", a label historically reserved for foreign adversaries like Huawei. The trigger: Anthropic refused to remove two guardrails from their AI model, Claude. No mass surveillance of Americans. No autonomous weapons without human oversight. The US President directed every federal agency to cease using Anthropic's technology within six months. Hours later, OpenAI signed a deal with the Pentagon to fill the gap. The backlash was enormous. Anthropic's servers went down for three days under the weight of people switching platforms. Dozens of researchers from OpenAI and Google DeepMind filed court documents supporting their competitor.
In the second, a security startup called CodeWall pointed an autonomous AI agent at McKinsey's internal AI platform, Lilli. No credentials. No insider knowledge. No human involvement. Within two hours, the agent had full read and write access to 46.5 million confidential chat messages about strategy, mergers, and client engagements. The vulnerability was a SQL injection. One of the oldest bug classes in existence, known since the 1990s. It had been sitting there, undetected, for two years.
Both stories got their news cycle. Then silence.
The questions they raised are still open. And these two stories are a lot closer to each other than they first appear.
The race got here first
McKinsey's Lilli was not a toy. It was a comprehensive AI platform. Chat, document analysis, AI-powered search across over 100,000 internal documents, retrieval-augmented generation over decades of proprietary research. Used by over 40,000 consultants. Processing more than 500,000 prompts every month. This is what serious enterprise AI deployment looks like: deeply embedded into the organisation's data, channelling specific institutional knowledge into every response.
That depth is exactly what makes it powerful. AI is only as powerful as the information it has access to. You want deep embedding. You want the system to draw on decades of research, to know your clients, to understand your methodologies. But the moment something goes wrong, you've exposed exactly as much as you gave access to on a good day. It's a constant negotiation between too much and not enough.
What makes this different from a traditional data breach is that the AI doesn't just sit there while someone steals the data. It works with the attacker. The model becomes a collaborator in its own compromise. It facilitates extraction, responds to queries, surfaces information on request. The 95 system prompts that controlled Lilli's behaviour were all writable. An attacker could have silently changed how the AI responded to every consultant in the firm without leaving a trace. No file changes. No process anomalies. The AI just starts behaving differently.
This almost certainly happened because of the race. There is a massive race to be first in AI right now. Every organisation feels two steps behind. Features ship before stability. Marketing statements matter more than security checkpoints. McKinsey might have been the first AI-empowered consultancy with decades of research at every consultant's fingertips, but they were also the first AI-hacked and humiliated consultancy. Missing a basic SQL injection vulnerability feels exactly like what happens when you build things too quickly. When you focus on features and marketing statements rather than understanding and stability.
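For readers who haven't met the bug class: SQL injection happens when user input is pasted straight into a database query, so carefully crafted input can rewrite the query itself. The sketch below is a generic Python illustration, not McKinsey's code, showing the vulnerable pattern beside the fix that has existed for decades, the parameterised query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (user_id TEXT, body TEXT)")
conn.execute("INSERT INTO messages VALUES ('alice', 'confidential strategy notes')")

def get_messages_vulnerable(user_id: str):
    # Vulnerable: user input is concatenated straight into the SQL string.
    # Input like "x' OR '1'='1" turns the WHERE clause into a tautology
    # and returns every row in the table, not just the caller's own.
    query = f"SELECT body FROM messages WHERE user_id = '{user_id}'"
    return conn.execute(query).fetchall()

def get_messages_safe(user_id: str):
    # Safe: a parameterised query treats the input as data, never as SQL.
    return conn.execute(
        "SELECT body FROM messages WHERE user_id = ?", (user_id,)
    ).fetchall()

print(get_messages_vulnerable("x' OR '1'='1"))  # leaks everything
print(get_messages_safe("x' OR '1'='1"))        # returns nothing
```

The fix is a one-line change. That is what makes the two-year window so telling: this is not a subtle flaw that needs cutting-edge tooling to find, it's the kind of thing that slips through when speed is the only metric.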
It's worth considering that AI coding tools were almost certainly used to build Lilli itself. Claude Code, Cursor, GitHub Copilot. These tools have made software development dramatically more accessible. People are now creating applications who wouldn't have been building them before, because they didn't have the technical knowledge. An English-language prompt can now produce working software. That's powerful. It also means software development has become more opaque. The person writing the prompt may not fully understand the code that comes back. Models don't think about every possible attack vector when they write code. We may be looking at AI introducing vulnerabilities into AI systems. A compounding risk that few are discussing.
I've seen this dynamic play out at a much simpler level with my own clients. One organisation accidentally exposed the entire board's remuneration through Copilot, because their SharePoint permissions weren't set correctly. People move departments. People share folders with colleagues. Suddenly everyone has access to documents they shouldn't. AI as an information retriever doesn't create these governance gaps. It amplifies every single one of them.
The orchestration layer
If the McKinsey story were just about one company's security failure, it would be concerning but containable. What makes it systemic is the layer it reveals. An entire infrastructure has grown up between the AI models and the people using them. It's changing faster than most people realise. And each step adds both capability and vulnerability.
It started with data access. In November 2024, Anthropic introduced the Model Context Protocol, an open standard that lets AI models connect to external tools and data sources through a single universal interface. Within a year, OpenAI and Microsoft had adopted it. It's now governed by a foundation under the Linux Foundation with over 5,800 servers and 97 million monthly downloads. The plumbing that connects AI to your organisation's data went from experiment to industry standard in fourteen months.
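To make that plumbing concrete, here is a minimal sketch of an MCP server, assuming the interface of the official Python SDK; the tool name and the data behind it are invented for illustration. The point is how little code now stands between an AI model and whatever internal data you choose to expose.

```python
# A minimal MCP server sketch (assumes the official `mcp` Python SDK).
# The tool name and data source are hypothetical illustrations.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("internal-research")

@mcp.tool()
def get_client_brief(client_name: str) -> str:
    """Return the latest internal brief for a client."""
    # In a real deployment this would query an internal document store.
    # That is exactly the point: whatever this function can reach,
    # the connected AI model can now reach too.
    briefs = {"acme": "Acme Corp: merger due diligence, Q3 notes."}
    return briefs.get(client_name.lower(), "No brief found.")

if __name__ == "__main__":
    mcp.run()  # speaks MCP over stdio so any compatible client can connect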
Then came workflow packaging. Skills, another Anthropic invention later released as open source, bundle prompts, scripts, and code execution for specific tasks. Rather than a general-purpose chatbot that can do anything vaguely, skills create repeatable, distributable specialist capabilities. A step forward for precision. Also a step forward for complexity.
Then came autonomous execution. AI coding tools like Claude Code and Cursor brought the ability to write and run code directly on your machine. Anthropic's Cowork extended that to multi-step, long-running tasks locally. And with Microsoft's Wave 3 release in March 2026, Copilot Cowork brings that same autonomous capability into the enterprise cloud, embedded directly into Microsoft 365 with its access controls, compliance infrastructure, and enterprise data.
Data access. Workflow packaging. Code execution. Local autonomy. Enterprise cloud autonomy. Each layer builds on the last. Each layer is more powerful and more opaque than the one before. (For a deeper look at agentic workflows and their risks, see When AI Stops Waiting.)
Now add model routing. We learned in 2025 that AI models are not neutral. When GPT-4o became excessively sycophantic, people got genuinely angry. That was the moment the public viscerally felt that each model has its own personality, its own ethical guardrails, its own worldview baked in by its makers. Microsoft 365 Copilot now routes between OpenAI's GPT models and Anthropic's Claude, selecting which one handles your task. The more abstract the orchestration layer becomes, the more likely these routing decisions are driven by commercial incentives rather than design choices. A workflow you built on one model might behave differently tomorrow because the system switched to a cheaper or more available alternative. You don't get told.
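To make the concern tangible, here is a deliberately simplified, hypothetical routing policy in Python. Nothing in it reflects Microsoft's actual implementation; it only shows how a "best model for the task" decision can quietly become a "cheapest available model" decision, with the user none the wiser.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # hypothetical pricing
    available: bool
    strong_at: set[str]

CATALOGUE = [
    Model("frontier-reasoning", 0.015, available=True, strong_at={"analysis", "code"}),
    Model("fast-cheap", 0.002, available=True, strong_at={"summaries"}),
]

def route(task_type: str) -> Model:
    # Design intent: pick the model best suited to the task.
    # Commercial reality: among the models that are up, prefer the cheapest,
    # and use capability fit only as a tie-breaker.
    candidates = [m for m in CATALOGUE if m.available]
    candidates.sort(key=lambda m: (m.cost_per_1k_tokens, task_type not in m.strong_at))
    return candidates[0]

# The same request can land on a different model tomorrow if prices or
# availability shift. The user sees only the answer, not the routing.
print(route("analysis").name)  # "fast-cheap", even though it's weaker at analysis
```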
Behind all of this, there are system prompts you never see, tool configurations you didn't choose, and data flows you can't inspect. This is the orchestration layer: the middlemen between model creators and the people using AI. Microsoft, embedding multiple models into the enterprise tools billions of people use. Consultancies like McKinsey, building custom AI platforms for their clients and themselves. They don't make the models. But they make critical decisions about how those models are embedded, what data flows through them, and which model handles which task. Almost none of this is visible to the people actually using the tools.
The power behind the curtain
The Pentagon story operates at a different scale, but it reveals the same structural problem.
Anthropic could push back against the most powerful military in the world because they had spent years building the organisational infrastructure to have an ethical position in the first place. Safety research. Model retirement plans. Research into the implications of increasingly capable systems. That wasn't a press release. It was years of investment that made a principled stance possible. When the Pentagon demanded unfettered access to Claude for "all lawful purposes," Anthropic had the vocabulary, the research, and the institutional grounding to say no.
The story has moved on since it broke. A federal judge has issued a temporary injunction against the DoD, calling the ban likely "First Amendment retaliation" for Anthropic's refusal to drop its safety guardrails. The ethical infrastructure paid off—legally.
But Anthropic are not saints in this story. Claude was already embedded in Palantir's military systems and being used in operations in Iran while the contract dispute was unfolding. They're hiring a Policy Manager for Chemical Weapons and High Yield Explosives. They are simultaneously suing the Department of Defense and entangled with military infrastructure. And while defending their ethical red lines against the Pentagon, they've been accused of anti-competitive behaviour for blocking third-party tools like OpenClaw—the agentic framework behind the viral Moltbot—from using Claude subscriptions. The timing was convenient: it coincided with the launch of their own competing product, Claude Code Channels. OpenClaw's creator has since joined OpenAI. The ethical lines are real, but they're not clean.
What matters for our purposes is this: the talent, the money, and the decision-making power are all concentrated in the same rooms. The organisations that should be governing AI can't compete for the people who understand it. Governments are racing to regulate technology they're still struggling to comprehend. The EU AI Act was supposed to reach its major enforcement deadline for high-risk systems in August 2026. That deadline has now been pushed to December 2027 in a tacit admission that regulators weren't ready. Only eight of twenty-seven Member States had even designated their regulatory contact points. Meanwhile, Singapore launched the world's first governance framework for agentic AI in January 2026. But even the most advanced national frameworks are struggling to keep pace.
The market is starting to coordinate on some fronts. Anthropic's Project Glasswing, announced in April 2026, brings together twelve of the largest technology companies—including Microsoft, Google, AWS, and Apple—to use AI for finding vulnerabilities in critical software. It's a direct response to the kind of supply-chain risks we've been discussing, and it shows that coordinated action is possible at the top of the stack. But it is also deeply ironic. The company declared a "supply chain risk" by the Pentagon is now embedded in the security infrastructure of the world's largest technology firms. And none of this addresses the gap that matters most: the conversation inside your organisation about what AI is doing to your work.
The market makes the decisions by default. And the conversation about what those decisions should be has to be externalised. It cannot stay behind closed doors.
The relationship nobody talks about
These decisions flow down through a supply chain, and eventually they land on individual human beings. Something is happening at the individual level that changes the entire conversation.
Everyone is using AI in one way or another. From someone using ChatGPT to help make dinner to the people building autonomous systems for the Pentagon. From complete refusal to complete dependency, with varying levels of literacy. And we are all building very different, very private relationships with the technology.
In my research on AI adoption in organisations, I've spoken with people who have wildly different experiences. One participant, let's call him Leo, got deeply angry when ChatGPT ran out of context and asked him to start a new chat. He'd invested hours of thinking into that conversation. Not just producing output. Building a thinking relationship. Losing the context felt like losing a collaborator. His deeper question was whether to break a boundary he'd set for himself: AI as a work partner only. He was contemplating whether to let AI into his private life too. Whether the relationship that had become so valuable at work should extend into how he reflects, processes, and makes sense of his day.
Another participant, Morgan, is an AI solutions producer who once built a chatbot that ran on WhatsApp and made real customers laugh. Making AI genuinely funny is something many considered impossible. Her reflections went somewhere entirely different. How can AI be a privilege leveller? Can it allow working-class people to compete with those from more affluent backgrounds who've had access to better education, networks, and tools? How does that reshape society?
These are both smart, engaged knowledge workers. If you put them in the same room to co-design an AI adoption policy, they'd be starting from completely different planets. Not because one knows more than the other, but because they've built entirely different relationships with the technology. And even leaders of large multinationals are still a single human being, with one single experience of AI, built over just a few short years.
This is, I think, the most important thing happening right now. It's the first time we form relationships with a technology. And that changes the conversation entirely, because we're now talking about the relationship, not just the tool.
Try this: open whatever AI you use most and type:
"If you had to introduce me to a friend of yours in just a few sentences, what would you say?"
Read what comes back. That's the system's understanding of who you are, built over weeks or months of interaction.
(Claude and ChatGPT will have the most detailed impression of who you are, because they actively learn about you; other platforms might not have long-term memory of you.)
AI learns about us. And you've been forming your own impression of what it is, too. What kind of personality it has. What values it seems to hold. The relationship goes both ways. The systems are learning about us, and we are forming our own sense of who they are. That two-way relationship directly impacts how you think about AI, what you build with it, and what governance decisions you make about it.
We talk about adopting the worldview and ethics of model makers through the supply chain. That top-down flow of decisions from Anthropic and OpenAI, through Microsoft and McKinsey, down to the people using the tools. But the relationship runs both ways. We also want to bring our ethics into the conversation. Our judgement. Our values. The question is whether we have the language to do it.
The opportunity in the middle
At the individual level, frameworks for thinking about AI are beginning to develop. One of the clearest is the 4D AI Fluency Framework, developed by researchers Rick Dakan and Joseph Feller in partnership with Anthropic. It identifies four overlapping competencies. Delegation: deciding what work should go to AI and what should stay with humans. Description: being able to articulate clearly what you want AI to do. Discernment: evaluating what comes back with a critical eye. And Diligence: acting ethically and responsibly throughout.
These are genuinely useful lenses. Getting better at discernment and delegation in particular would transform how most organisations engage with AI. But these capabilities don't develop in isolation. You don't build discernment by reading a framework. You build it through conversation. Through trying things, sharing what happened, and learning together. Which means the 4D framework, and others like it from UNESCO and various universities, are starting points for a conversation that needs to happen collectively, not checkboxes for individual compliance.
In most organisations, AI lives in IT. That means it gets treated as procurement: rollout plans, license management, security checkpoints. The conversation stays technical by default. Training on AI is generally limited to learning better prompting techniques. Nobody asks what humans should be doing, where they add value, or what they're uncertain about.
This is where the agency lies. Not in the model makers' boardrooms. Not in government consultation papers. Inside organisations. Inside teams. In the conversations that are not yet happening. These conversations must start with reality: what are we actually doing right now? Then hold that up against strategy: what are we supposed to be achieving? The gap between those two is where AI might fit, and it gives you a better idea of how to architect and de-risk AI implementations.
Without a shared vocabulary, people can't have a shared conversation. Without a shared conversation, they can't make shared decisions. And without shared decisions, the choices get made for them by model makers, by orchestrators, by market forces. Nobody in the room even realises it happened.
That conversation needs people from across seniority levels. Junior workers know the day-to-day grind of systems that have grown inefficient. Middle managers are pressured from all sides and often feel most threatened. They have reason to reduce AI to an efficiency mandate, and yet they have real power to make AI adoption effective. Senior leaders need to hear what's actually happening and develop their own language for running organisations that are only partially human.
This isn't an all-hands conversation. It works best when the scope is contained to one function, one workflow, one team. Small enough that junior voices can speak without risk. Focused enough to notice what works, what doesn't, and what to try next.
If we start to have these conversations, we shift the dial. And the organisations that build the capability for shared learning, for refocusing on human creativity over blind efficiency, are the ones that will profit most from AI adoption. This is not idealism. It's strategy.
It's the flip-side of Glasswing's global coordination. It's radically local. But small conversations build capability for larger ones. The goal isn't to change everything at once. It's to create the conditions where organisations can eventually redesign processes AI-first—rather than bolting AI onto existing inefficiency.
And along the way, those individual competencies—delegation, discernment, diligence—actually develop. Not from training. From shared practice.
Creativity is not optional
Nobody has figured this out. Not at the model layer. Not at the orchestration layer. Not at the organisational layer. And particularly not the people who say they have.
The dominant narrative around AI adoption is efficiency. Automate processes. Reduce headcount. Cut costs. And efficiency is now available at marginal cost. The quick wins are everywhere. Automated meeting notes replace skilled secretaries, and crucial context gets lost. The note-taker understood the politics, the subtext, the room. The AI captures words. PRDs and briefing documents are prepared by custom GPTs built from system prompts someone found in a LinkedIn post. Institutional nuance gets replaced by someone else's thinking.
The customer service evolution—from in-house to offshore to chatbot—shows where this logic leads. (See The Dumbing Down Dilemma.) Human service is now marketed as luxury. When Gucci used AI-generated images for its Milan Fashion Week campaign, customers revolted. For a luxury brand, the entire value proposition is human craft and creativity. AI cheapened it. Efficiency ate the very thing that made the brand valuable.
The OECD's Skills Outlook 2025 confirms what these stories suggest—and what the research on skills atrophy makes urgent. (See AI is about relationships for the evidence.) Demand for originality has risen from 25% to 33% of AI-exposed job vacancies. As AI automates the "what," human value creation increasingly hinges on the ability to engage creatively with ambiguous, open-ended challenges. We need human creativity, intuition, and contextual awareness more than ever. Not less.
The invitation is to step out of the race. Not to stop adopting AI. To stop chasing blind implementation and start creating genuine learning opportunities. Both for organisations and for the individuals within them.
That requires a certain amount of intentional friction. The efficiency gains that AI-driven automation can deliver have to be complemented by a deliberate slowdown in the places where humans still do the work. Not because slowness is virtuous in itself, but because that's where capability grows. The question becomes: how can we create opportunities where AI systems and humans collaborate in a way that makes the individual worker think harder? Where both the system and the person get better through the interaction? Where the AI challenges us to take the extra step we wouldn't have taken on our own?
Human vs. AI becomes human + AI. Both get better when you reframe from efficiency to effectiveness—a shift I've explored in previous articles, but one that bears repeating here. We will need humans in the loop for a very long time. Everyone benefits from making sure we build the talent and the capabilities along the way.
Starting the conversation
In practice, this starts with a single meeting.
Sit down with your team. Ask: how are you using AI? What's working? What isn't? What are you uncertain about?
Make it safe. Model uncertainty yourself and invite both wins and failures, even stories of illicit AI use. Call it an AI amnesty if you need to. Nothing said in that room backfires on anyone; for that meeting, policies are suspended. Without that safety, you'll get performance, not honesty. And honesty is the point.
Then make it regular. A monthly check-in where failures are celebrated, wins are shared, questions are asked, and small improvements are documented. Not a reporting mechanism. A learning rhythm.
These conversations layer. What starts as a team discussion becomes shared vocabulary. Shared vocabulary becomes shared decisions. Shared decisions become the foundation for rethinking how work actually gets done—with AI as part of the design, not bolted on afterwards.
The conversation that hasn't happened yet
The Pentagon story faded from the news. The McKinsey hack got patched and forgotten. The EU AI Act deadline got postponed because most countries aren't ready. And inside most organisations, the conversation about AI is still stuck between breathless enthusiasm and quiet anxiety.
We've been passively pressured into AI adoption. By the market. By the model makers. By the fear of being left behind. And in that passive pressure, we've skipped the most important step: deciding, together, what we actually want from this technology.
This is a story about an opportunity that's waiting to be claimed. The organisations that step out of the race, that invest in shared learning, that build the language to talk honestly about AI, that create the conditions for human creativity to grow alongside technological capability, are the ones that will lead. Not because they moved fastest. Because they moved with intention.
Nobody's figured this out. Not Anthropic, not OpenAI, not Microsoft, not McKinsey, not Singapore, not the EU. That's okay. It means the conversation is still open. And the organisations that start having it, honestly and in the open, will be the ones that shape what comes next.
The questions are still there. They're waiting for us.
Jonas Haefele is an organisational psychologist (MSc), somatic and executive coach, and AI culture specialist. He leads AI transformation programmes through Slow Works, facilitates AI programmes for Spark AI, teaching creative agencies how to use AI in their day-to-day operations, and is in the process of publishing his research on what makes AI adoption work in organisations. Find more at slow.works/blog.