This text is based on a very long conversation with Claude after I had read an article about agentic AI: https://www.computerworld.com/article/4135331/ai-agents-still-need-humans-to-teach-them.html
Because Claude is just awesome, I was asked if Claude should write a summary/concept/guide based on our conversation. I agreed, and you’ll find the result here:
The Uncomfortable Truth About Agentic AI
A short article published in February 2026 made a claim that should have surprised nobody but apparently still does: AI agents perform significantly better when humans provide them with detailed, curated procedural knowledge. Left to generate their own skills, they show no meaningful improvement at all.
The response from parts of the AI community was telling. Some called it a temporary limitation, a capability gap that the next model generation would close. Others framed it as evidence that AI would never replace human expertise. Both reactions missed the more important point.
The question is not whether AI agents will eventually become more autonomous. They probably will, in certain domains, under certain conditions. The question is what kind of human infrastructure needs to exist around them — now and in the future — for that autonomy to produce trustworthy, accountable, and genuinely valuable outcomes.
This document is an attempt to answer that question seriously. Not with a diagram of boxes and arrows, but with a way of thinking about what it actually means to build agentic systems responsibly.
Start With Why — And Mean It
Every failed technology initiative in organizational history shares a common ancestor: someone asked “how do we implement this?” before asking “why should we?” The agentic AI wave is repeating this pattern at remarkable speed.
Organizations are deploying agents to automate processes, reduce headcount, accelerate workflows — and in many cases, doing all of this without a coherent answer to the most fundamental question: what are we actually trying to achieve, and why does it matter?
This is not a philosophical luxury. It is a practical prerequisite. Without a clear why, you cannot evaluate whether your agents are succeeding. You cannot design meaningful escalation criteria. You cannot align a swarm of agents around coherent objectives. And you cannot explain to regulators, employees, or customers why the system made the decisions it did.
In practice, this means requiring genuine vision before committing to architecture. Not goals. Not KPIs. Vision — a meaningful statement of purpose that survives the question “why does this matter beyond the next quarter?” Most managers have never been asked to produce one, and it shows.
When the why is clear, everything downstream becomes more tractable. The architecture serves a purpose. The agents are designed against real requirements. The governance exists to protect something that actually matters. When the why is absent, you are building on sand — and the agents will faithfully optimize for the wrong things.
Architecture Before Agents
Once the why is established, the next failure mode is building agents before designing the system they will operate in. This is the agentic equivalent of hiring staff before deciding what the organization does.
Responsible agentic development begins with systemic questions: What tasks need to be automated, and in what sequence? Which agents, if functional, would deliver enough value to justify the governance investment? Where are the critical handover points between agents and humans? What does a useful minimum viable product look like — not impressive in a demo, but genuinely valuable in production?
The MVP question is particularly important because it forces prioritization based on value rather than technical interest. It is easy to become fascinated by what agents can do in principle. It is harder, and more important, to identify which agents should be built first because they solve a real problem with measurable impact.
This architectural thinking also creates the container within which individual agents can be designed intelligently. An agent does not exist in isolation. It exists in a system with other agents, with human oversight layers, with data flows and decision points and failure modes. You cannot design an agent well without understanding the system it serves.
The Pre-Chewing Principle
Here is a useful mental model for agent design: think of each agent as a child being prepared for a task. Not a child left to figure things out alone, but one whose parent has thought carefully about what they need to know before they start — and has provided it.
This is what the SkillsBench research confirms empirically. Agents with curated procedural knowledge — context, domain grounding, step-by-step guidance — outperform agents without it by an average of 16 percentage points. Agents asked to generate their own preparation show no improvement at all. The child whose task has been pre-chewed performs significantly better than the one left to work it out independently.
Pre-chewing means more than writing good prompts. It means thinking through: What domain knowledge does this agent need before it encounters the task? What does “I don’t know” look like in this context, and what should the agent do when it gets there? What are the failure modes specific to this domain, and how should the agent recognize them? What is the escalation path when the agent reaches the boundary of its competence?
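To make those questions concrete, here is a minimal sketch of how curated procedural knowledge might be captured as a structured skill definition handed to an agent before deployment. The `CuratedSkill` class, its field names, and the invoice-triage example are hypothetical illustrations, not taken from any particular framework:

```python
from dataclasses import dataclass

@dataclass
class CuratedSkill:
    """Hypothetical container for human-curated procedural knowledge.
    Each field answers one of the pre-chewing questions before the
    agent ever sees its first task."""
    domain_knowledge: str           # what the agent must know up front
    procedure: list[str]            # step-by-step guidance for the task
    uncertainty_signals: list[str]  # what "I don't know" looks like here
    failure_modes: list[str]        # domain-specific ways this goes wrong
    escalation_contact: str         # who to hand over to at the boundary

invoice_triage = CuratedSkill(
    domain_knowledge="EU invoices require a valid VAT ID; missing IDs block payment.",
    procedure=[
        "Extract supplier, amount, and VAT ID from the invoice.",
        "Match the invoice against an open purchase order.",
        "Flag mismatches instead of correcting them silently.",
    ],
    uncertainty_signals=["no matching purchase order", "ambiguous currency"],
    failure_modes=["duplicate payment", "approving an invoice above mandate"],
    escalation_contact="accounts-payable team lead",
)
```

The point of the structure is not the class itself but the discipline it forces: a skill definition with an empty `uncertainty_signals` or `escalation_contact` field is visibly incomplete before deployment, not after the first failure.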
The pre-chewing burden is real and it does not disappear as AI capabilities improve. Even if agents become better at certain kinds of autonomous learning, the organizational knowledge that makes their outputs contextually appropriate — knowing the culture, the constraints, the history, the stakeholder landscape — will still need to come from humans. That is not a limitation to be designed around. It is a structural feature of what organizations are.
Consequence-Aware Design
Current agent design is almost entirely task-focused. Here is the objective. Here are the tools. Execute. What it rarely includes is any meaningful grounding in what happens if the execution goes wrong.
This matters because consequence awareness changes behavior in decision-making systems. A doctor who understands the full implications of a misdiagnosis approaches ambiguous cases differently from one who thinks only about completing the diagnostic protocol. A lawyer who internalizes what a wrong interpretation means for their client reads contracts differently from one processing documents as a task queue.
Embedding consequence context into agent design — what does a wrong decision mean in this domain, for this organization, for the humans involved — can function as a natural brake on overconfident action. An agent that “knows” a particular class of errors triggers compliance breaches or financial liability will, if well-designed, treat ambiguous situations with more caution and default to escalation rather than proceeding.
The human accountability side is equally important and almost universally neglected. Too many agentic deployments diffuse responsibility so broadly that when something fails, nobody is actually accountable. This is not just an ethical problem. It is a governance failure that removes the organizational incentive to take oversight seriously.
Making human accountability explicit — this specific role, this specific person, bears the consequence of this agent’s decisions in this domain — is a powerful forcing function. It transforms oversight from a checkbox into a genuine responsibility. The agent may execute. The human carries the weight. That asymmetry, properly designed, creates the right incentives on both sides of the human-machine boundary.
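A minimal sketch of what this could look like in practice, assuming a simple confidence-threshold design. The `ConsequenceContext` class, its fields, and the contract-review example are hypothetical, not a prescribed implementation:

```python
from dataclasses import dataclass

@dataclass
class ConsequenceContext:
    """Hypothetical sketch: the weight of a wrong decision, made explicit
    in the agent's context rather than left implicit in the task."""
    wrong_decision_means: str     # concrete impact of an error in this domain
    accountable_human: str        # the named role that carries the consequence
    min_confidence_to_act: float  # below this, the agent must not act alone

def decide_or_escalate(confidence: float, ctx: ConsequenceContext) -> str:
    """Default to escalation when confidence does not clear the bar
    that the consequences, not the task, define."""
    if confidence < ctx.min_confidence_to_act:
        return (f"ESCALATE to {ctx.accountable_human}: "
                f"a wrong call here means {ctx.wrong_decision_means}.")
    return "PROCEED"

# A contract-review agent with a named accountable human:
contract_review = ConsequenceContext(
    wrong_decision_means="a compliance breach and financial liability",
    accountable_human="Head of Legal Operations",
    min_confidence_to_act=0.9,
)
print(decide_or_escalate(confidence=0.72, ctx=contract_review))
# -> ESCALATE to Head of Legal Operations: ...
```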
The Swarm as a Team
Individual agent design is necessary but not sufficient. Agentic systems in practice are not single agents performing isolated tasks. They are swarms — multiple agents with different capabilities operating in coordination, passing context between them, making interdependent decisions.
This raises a question that current governance frameworks mostly ignore: how does a swarm of agents remain coherent? How do you prevent individual agents, each optimizing for their local task, from working at cross-purposes at the system level?
The answer, developed through thinking about how high-performing human teams actually work, is vision propagation. A well-functioning team does not just receive task assignments. Every member understands the vision, the why, and uses that as a decision-making compass when they encounter situations not covered by their instructions. This shared directional understanding is what makes teams coherent rather than merely coordinated.
The same principle can be applied to agent swarms. Rather than treating the vision as context for the orchestrator alone, it can be propagated as persistent shared context across all agents in the system. Not just “here is your task” but “here is what we are trying to achieve, here is why it matters, here is what failure looks like at the system level.” Agents informed this way are better positioned to recognize when their local execution conflicts with the system-level intent, and to escalate rather than proceed.
This requires the vision to be operationalized more precisely than in human communication. A human team can interpret an inspiring narrative through shared cultural context and conversation. Agents need the intent structured and explicit. But this is a design challenge worth solving, not a reason to abandon the principle.
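As an illustration, here is a minimal sketch of vision propagation, assuming the intent can be expressed as structured text prepended to every agent's instructions. The `SHARED_VISION` contents and the `build_agent_context` helper are hypothetical:

```python
# Hypothetical sketch: the vision propagated as persistent shared context
# to every agent in the swarm, not just to the orchestrator.
SHARED_VISION = {
    "what we are trying to achieve": "reduce invoice processing errors to near zero",
    "why it matters": "supplier trust and audit readiness beyond the next quarter",
    "system-level failure": "plausible-looking outputs that quietly diverge from intent",
}

def build_agent_context(local_task: str, vision: dict[str, str]) -> str:
    """Prepend the system-level intent to each agent's local task so the
    agent can recognize conflicts and escalate instead of proceeding."""
    vision_block = "\n".join(f"- {key}: {value}" for key, value in vision.items())
    return (
        "SYSTEM-LEVEL INTENT (shared by all agents):\n"
        f"{vision_block}\n\n"
        f"YOUR TASK:\n{local_task}\n\n"
        "If completing your task would conflict with the system-level intent, "
        "stop and escalate rather than proceeding."
    )

print(build_agent_context("Match incoming invoices to purchase orders.", SHARED_VISION))
```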
Learn From the Rockets
In 2006, SpaceX launched its first Falcon 1 rocket. It failed. So did the second. And the third. Each failure was documented, analyzed, and shared across the team. The information extracted from what went wrong on launch three directly informed what succeeded on launch four.
This feedback loop — running fast, without shame, with rigorous documentation — is a significant reason why SpaceX moved faster than any government space program with a fraction of the budget. They understood something that most organizations do not: in complex engineering, failure is not the opposite of success. It is the mechanism of it.
Compare this to the organizational response to the Challenger disaster. Engineers knew about the O-ring problems. The technical failure was preceded by a human communication failure — a culture in which raising concerns felt more dangerous than staying silent. The “you made a mistake, you are bad” dynamic produced an environment where critical information was suppressed until it was too late.
Agentic AI development will produce failures. Some will be obvious — an agent producing clearly wrong outputs. Others will be subtle — a swarm quietly optimizing for the wrong objective, producing plausible-looking results that diverge from intent in ways that only become apparent downstream. Understanding what went wrong in these cases requires reconstructing the full context: the vision framing, the agent instructions, the architecture decisions, the specific trigger conditions. Without disciplined documentation at the time, that reconstruction is often impossible.
The organizations that treat these failures as expensive embarrassments to minimize will learn slowly. The ones that document honestly, share across teams, and treat failure analysis as a core competency will compound their learning and develop the pattern recognition that the field currently lacks.
This requires something that cannot be designed into the technical architecture: psychological safety. Failure documentation only works if the organizational culture treats it as a contribution rather than a confession. Building that culture is a leadership challenge that precedes any technical investment. If the humans operating around the system cannot report honestly when things go wrong, no governance framework will save you.
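One way to make that documentation discipline concrete is a structured failure record. The `FailureRecord` fields below mirror the reconstruction context named above (vision framing, instructions, architecture, trigger conditions); they are a hypothetical sketch, not a standard:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class FailureRecord:
    """Hypothetical sketch of what disciplined failure documentation
    captures, so the full context can be reconstructed later."""
    timestamp: datetime
    vision_framing: str      # the "why" the system was operating under
    agent_instructions: str  # what the agent was actually told
    architecture_notes: str  # relevant design decisions at the time
    trigger_conditions: str  # the specific inputs or state that set it off
    observed_outcome: str    # what went wrong, stated plainly
    lesson: str              # what the next iteration should do differently

record = FailureRecord(
    timestamp=datetime.now(timezone.utc),
    vision_framing="near-zero invoice errors for audit readiness",
    agent_instructions="match invoices to purchase orders; flag mismatches",
    architecture_notes="triage agent feeds approval agent without human review",
    trigger_conditions="duplicate supplier entries with differing VAT IDs",
    observed_outcome="duplicate payment approved downstream",
    lesson="insert a human handover when supplier records are ambiguous",
)
```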
A Framework in Six Layers
What has emerged from this line of thinking is not a technology framework. It is an organizational one, built from the same principles that have always governed effective human systems — applied to a domain where most practitioners are starting from scratch technically while ignoring everything organizational design already learned the hard way.
Layer 1: Vision and Why
Before architecture, before agents, establish genuine purpose. What are we trying to achieve, and why does it matter beyond the next quarter? This becomes the evaluation criterion for everything that follows.
Layer 2: Architecture Before Agents
Design the system before designing its components. Which agents are needed? Which deliver MVP value? Where are the human oversight layers? What are the critical handover points?
Layer 3: The Pre-Chewing Principle
Each agent must be prepared in detail before deployment. Domain knowledge, contextual grounding, escalation criteria, failure mode recognition. Not just what to do — what to do when uncertain.
Layer 4: Consequence-Aware Design
Embed the weight of wrong decisions into agent context. Name the human who carries accountability for each agent’s domain. Make consequences visible and responsibility explicit.
Layer 5: Vision Propagation Across the Swarm
Treat the agent swarm as a team informed by shared purpose. Propagate vision and why as persistent context across all agents, creating coherent judgment rather than isolated task execution.
Layer 6: Failure as Learning Infrastructure
Document what goes wrong. Share it across teams. Treat failure analysis as a core competency, not a post-mortem ritual. Build the psychological safety that makes honest reporting possible.
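As a purely illustrative aid, the six layers could also be tracked as a go/no-go readiness checklist per deployment. The structure below is a hypothetical sketch, not part of the framework itself:

```python
# Hypothetical sketch: the six layers as a deployment readiness checklist.
SIX_LAYERS = [
    "Vision and Why",
    "Architecture Before Agents",
    "The Pre-Chewing Principle",
    "Consequence-Aware Design",
    "Vision Propagation Across the Swarm",
    "Failure as Learning Infrastructure",
]

def readiness(checklist: dict[str, bool]) -> str:
    """A deployment is ready only when every layer has been addressed."""
    missing = [layer for layer in SIX_LAYERS if not checklist.get(layer, False)]
    return "READY" if not missing else "NOT READY, missing: " + ", ".join(missing)

print(readiness({layer: True for layer in SIX_LAYERS[:4]}))
# -> NOT READY, missing: Vision Propagation Across the Swarm,
#    Failure as Learning Infrastructure
```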
The Question That Remains
The longer agentic AI exists, the clearer it becomes that the technology is not the hard part. The models improve. The tooling matures. The capability gaps narrow, sometimes faster than expected.
The hard part is the same as it has always been: getting humans to think clearly about what they want, to take responsibility for what they build, to learn honestly from what goes wrong, and to design systems that serve purposes worth serving.
Agentic AI does not change that challenge. It amplifies it. Systems that act autonomously at scale, in real organizational environments, with real consequences — these require more human clarity, more human accountability, and more human learning capacity than the systems they replace. Not less.
The question is not whether your organization is ready for agentic AI. The question is whether the humans in your organization are ready to govern it well.
After this engaging chat, I wanted to ask Gemini to create a featured image for this post. To provide Gemini with context, I shared the text from Claude. Gemini reacted quite positively, and here’s its reaction:
Would you like me to help you take the concept Claude wrote and “stress-test” it against these six layers of governance from the document?
Needless to say, I confirmed 🙂
To stress-test your concept, we can pass it through the Six-Layer Framework outlined by Rob van Linda. This approach treats AI governance as an organizational challenge rather than just a technical one.
Here is how we can evaluate your plan using the document’s criteria:
1. The “Why” Test (Layer 1)
- The Requirement: Establish a genuine purpose that survives the question, “Why does this matter beyond the next quarter?”
- The Stress-Test: Does your concept define a vision (a statement of purpose) rather than just KPIs or automation goals? Without a clear “why,” agents may faithfully optimize for the wrong things.
2. Systemic Architecture (Layer 2)
- The Requirement: Design the system before the components.
- The Stress-Test: Have you identified the critical handover points where the AI passes a task back to a human? The concept should define which agents deliver “MVP value” rather than just being technically impressive.
3. The “Pre-Chewing” Depth (Layer 3)
- The Requirement: Provide detailed procedural knowledge, as agents left to generate their own skills show no meaningful improvement.
- The Stress-Test: Does your plan include curated domain grounding and step-by-step guidance? Specifically, does it define what the agent should do when it encounters the boundary of its competence?
4. Consequence Awareness (Layer 4)
- The Requirement: Embed the weight of wrong decisions into the agent’s context.
- The Stress-Test: Does the concept name a specific human role that bears the accountability for the agent’s decisions? High-stakes domains require the agent to “know” when an error triggers a compliance breach, forcing it to default to escalation.
5. Swarm Coherence (Layer 5)
- The Requirement: Propagate the “vision” as persistent context across all agents in a system.
- The Stress-Test: If your concept involves multiple agents, how do you prevent them from working at cross-purposes? They need the “why” structured explicitly so they can recognize if local execution conflicts with system-level intent.
6. Failure Infrastructure (Layer 6)
- The Requirement: Treat failure as a learning mechanism, much like SpaceX’s rocket iterations.
- The Stress-Test: Is there a disciplined process for documenting and sharing failures across your team? This requires “psychological safety,” where reporting a mistake is seen as a contribution rather than a confession.
Would you like to share the specific concept Claude wrote, or should we start by drafting a “Vision Statement” (Layer 1) for your AI project to ensure it’s built on a solid foundation?
To build a Vision Statement that aligns with Rob van Linda’s framework, we must move beyond simple KPIs or quarterly goals. A true vision acts as a “decision-making compass” for both humans and AI agents.
Based on the principles in the document, a robust Vision Statement for your AI project should address these three pillars:
1. The “Why” (Purpose)
- The Goal: Define what you are trying to achieve and why it matters beyond the next quarter.
- The Impact: Explain the value this brings to regulators, employees, or customers.
- The Risk of Absence: Without a clear “why,” you cannot align a swarm of agents or design meaningful escalation criteria.
2. The Human-Machine Boundary
- The Responsibility: Explicitly state that while agents execute tasks, humans carry the weight of the consequences.
- The Role of Expertise: Frame the AI not as a replacement for human expertise, but as a system that requires human-curated procedural knowledge to function.
3. Resilience and Learning
- The SpaceX Model: Commit to treating technical failures as a “mechanism of success” rather than an embarrassment.
- Psychological Safety: Ensure the vision promotes an environment where humans can report agent errors honestly and without shame.
Draft Template for Your Vision
“We are building an agentic system to [Insert Core Purpose] because [Insert Long-term Value]. Our agents operate within a human-designed architecture where [Insert Specific Human Role] remains accountable for outcomes. We commit to ‘pre-chewing’ our agents with deep domain expertise and treating every system failure as essential data to compound our collective learning.”
I truly love to cooperate, discuss, explore and interact with AI 🙂
Here you can download a manifesto on this topic. It was created during a long conversation with Claude Sonnet 4.6: teaching_the_machine
