Leading the AI Orchestra

The boundary between using software and building software is dissolving. Coding agents make it possible to direct systems that construct themselves according to your intent—whether you're a developer, manager, researcher, or knowledge worker.

Workshop: Late February 2026

Join Fausto Albers (AUAS Digital Twins Lab), Prof. Jurjen Helmus (AUAS), and Demetrios Brinkmann (MLOps Community) for a structured session on agent orchestration and practical implementation patterns. In-person (Amsterdam) and online participation available.

Register your interest
Fausto Albers · 10 min read

Every Chatbot Is Becoming a Coding Agent

Why that matters even if you never write a line of code


Last week, Anthropic released Claude Code 2.1 and launched Claude Cowork as a research preview. Claude Code is a terminal-based coding agent that has become one of the fastest-growing AI products of 2025. Cowork brings the same capabilities to people who have never opened a terminal.

This pairing reveals something important. The most powerful AI systems are not the ones that answer questions. They are the ones that execute processes. And the substrate for execution is code.

I studied to be a sociologist, became a restaurant owner, and then transitioned to AI research. I now lead GenAI R&D at the Industrial Digital Twins Lab of Amsterdam University of Applied Sciences. I have spent three years building systems where language models do real work—not chat, not summarize, but act. What I have learned contradicts much of what people assume about AI and coding.

You do not need to write code to benefit from coding agents. But you do need to understand why code is the best substrate for AI to execute your intent.


The Accessibility Gradient

Anthropic offers three ways to use Claude. At the base sits Claude Code, a command-line interface for developers who think in systems. In the middle sits Cowork, a graphical interface for professionals who think in workflows. At the top sits Claude Chat, for those who think in questions.

All three tiers run on the same underlying model. But they differ in what that model can touch. Claude Code operates with full system and terminal access. Cowork operates in a protected container, able to see only what you explicitly share. The tradeoff is capability versus safety: more access means more leverage, but also more risk.

Claude Code gives you access to a file system, a command-line terminal, a browser, and the ability to generate code at runtime. You can install tools, spawn sub-agents—independent AI processes that handle subtasks in parallel—and define hooks that trigger before and after actions. The January 7 release introduced asynchronous sub-agents that can work simultaneously on different parts of a problem.

Cowork removes the terminal but keeps the leverage. You designate a folder on your computer. Claude can read, edit, and create files within it. It uses the same sandboxed environment as Claude Code—a protected space isolated from the rest of your system for security. When it needs to accomplish a task, it writes code, executes it, and discards it. You see progress updates and approve significant actions. The code is temporary—generated on demand, run immediately, then forgotten. But the power is permanent.

The Anthropic team built Cowork in approximately ten days. Claude Code wrote all the code. The tool that builds complex systems was itself built by the same system.

This creates an accessibility gradient: the same core capability becomes usable at every skill level.


Why Code Matters When You Are Not Coding

If you do not need to produce software, why would you want an AI that writes code?

The answer lies in understanding what code actually does. Code solves four universal problems that every AI agent faces, regardless of whether the end goal involves software.

Code enables precise reasoning. Language models struggle with numerical manipulation because of how they process text. Ask an LLM to calculate compound interest over thirty years, and it will often produce errors. But ask it to write Python code that performs the calculation, execute that code, and report the result—and it becomes reliable. Code provides a deterministic execution environment for any task requiring mathematical precision.
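To make this concrete, here is the kind of trivial script an agent might generate and run instead of doing the arithmetic token by token. The figures are illustrative, not drawn from any real statement:

```python
# Minimal sketch: the model emits this, executes it, and reports the output.
# Principal, rate, and horizon are made-up example values.
principal = 10_000.00   # starting amount
rate = 0.05             # 5% annual interest
years = 30

balance = principal * (1 + rate) ** years
print(f"Balance after {years} years: {balance:,.2f}")
# Prints: Balance after 30 years: 43,219.42
```

The model never has to get the exponentiation right in its head; the Python interpreter does.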

Code enables efficient tool-calling. Traditional AI agent architectures follow a pattern: the model thinks, calls a tool, reads the result, thinks again, calls another tool, and repeats. This is slow and expensive. A coding agent can write a script that performs an entire sequence of operations, execute it once, and process the results. What takes a traditional agent multiple API calls can happen in one.
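As a sketch of the difference, assume a folder of hypothetical CSV reports. A traditional agent would make one round trip per file; a coding agent writes one script, runs it once, and reads back a single aggregated result:

```python
# One generated script replaces many think-call-read round trips.
# "reports/" and its CSV layout (category,amount) are hypothetical.
from pathlib import Path

totals: dict[str, float] = {}
for path in Path("reports").glob("*.csv"):          # all files in one pass
    for line in path.read_text().splitlines()[1:]:  # skip the header row
        category, amount = line.rsplit(",", 1)
        totals[category] = totals.get(category, 0.0) + float(amount)

print(totals)  # the agent processes one consolidated answer
```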

Code enables context management. Language models have limited context windows—their working memory for text. Load too many documents or tools, and performance degrades. A coding agent can store information in files, query it selectively, and load only what it needs when it needs it. The file system becomes an extension of memory.
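A minimal sketch of that pattern, assuming a notes/ folder the agent uses as scratch memory; only matching notes are loaded back into context:

```python
# File-backed memory: write findings out, read back only what matches.
from pathlib import Path

def remember(topic: str, summary: str) -> None:
    Path("notes").mkdir(exist_ok=True)
    (Path("notes") / f"{topic}.md").write_text(summary)

def recall(keyword: str) -> list[str]:
    # Selective retrieval keeps the context window small.
    return [p.read_text() for p in Path("notes").glob("*.md")
            if keyword.lower() in p.read_text().lower()]

remember("vendors", "Amex statements arrive quarterly; invoices live in Gmail.")
print(recall("invoices"))
```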

Code enables universal interoperability. You cannot pre-build integrations for every system your users might need to access. But you can give an agent the ability to write code that interacts with any API, parses any file format, or automates any software with a scripting interface. Code generation provides infinite flexibility in how systems connect.
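The glue code involved can be almost embarrassingly simple. A sketch, assuming a vendor with a JSON endpoint at the placeholder URL below; the agent writes this, runs it once, and throws it away:

```python
# On-demand integration: no pre-built connector required.
# The endpoint and field names are placeholders, not a real API.
import json
import urllib.request

url = "https://api.example-vendor.com/v1/invoices"
with urllib.request.urlopen(url) as response:
    invoices = json.load(response)

for invoice in invoices:
    print(invoice.get("id"), invoice.get("amount"))
```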

This is why giving an LLM access to a file system, browser, and terminal is so powerful. You are not just giving it tools. You are giving it the ability to create tools.


What This Looks Like in Practice

Let me make this concrete with two workflows from my own work.

The expense report that wrote itself. I use an American Express card for both personal and business expenses. Every quarter, I need to classify each transaction and locate the corresponding invoice for reimbursement.

The traditional workflow is tedious. Download the transaction history. Open each line item. Determine whether it was business or personal. For business expenses, find the original invoice—which might be buried in Gmail, locked behind a vendor's login portal, or stored in a random folder with an inconsistent naming convention. Match invoices to transactions. Export everything to the accounting platform.

I have done this manually for years. I have also built software to automate parts of it—last year I released an open-source tool using LLMs for structured extraction. But the setup cost was substantial, and every new vendor required new code.

With Claude Code, I approached it differently. I gave the agent access to a folder containing my transaction files, connected it to Gmail and browser tools via MCP (Model Context Protocol—a standard way to give AI agents access to external services), and described the end goal: classify each transaction, find each invoice, produce a spreadsheet with the mappings.

The agent worked for about fifteen minutes and built a small application. It processed the transactions, accessed my Gmail, navigated vendor portals using the browser, and returned a completed spreadsheet with every invoice attached.

I did not write any code. I specified a problem and watched it being solved. The code it generated was temporary—written at runtime, executed, and discarded.
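For illustration only, here is a hypothetical reconstruction of the kind of throwaway script it produced. The real code is gone, and where this sketch uses keyword rules, the agent applied its own judgment per transaction:

```python
# Toy reconstruction: classify transactions and write an annotated copy.
# File names, column names, and the keyword heuristic are all invented.
import csv

BUSINESS_HINTS = ("aws", "openai", "conference", "saas")

with open("transactions.csv") as src, \
     open("classified.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=[*reader.fieldnames, "class"])
    writer.writeheader()
    for row in reader:
        kind = ("business" if any(h in row["description"].lower()
                                  for h in BUSINESS_HINTS) else "personal")
        writer.writerow({**row, "class": kind})
```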

Making 17,000 files navigable. The second example is larger in scope. I lead GenAI R&D for the Virtual Service Mechanic (VSM) project at Amsterdam University of Applied Sciences—an AI system that helps industrial technicians diagnose and repair equipment. The project was presented to Queen Máxima last year and has been running for over a year, accumulating materials across Teams folders, shared drives, code repositories, documentation wikis, and research papers. By my last count: approximately 17,000 files.

I needed to prepare a knowledge session for manufacturing companies considering partnership with the lab. That meant understanding what we had, organizing it coherently, and creating onboarding materials.

No language model can read 17,000 files at once. The strategy had to be different.

I spent four hours writing a detailed PRD—a Product Requirements Document, a format commonly used to specify complex deliverables for coding agents. The goal: structure, organize, and index the files so that both humans and AI agents could navigate them efficiently. The constraint was explicit: the agent cannot read everything simultaneously.

The strategy involved spawning teams of sub-agents. Each sub-agent tackled a unit of work—analyzing a folder, summarizing a document set, identifying themes. They wrote their findings into a shared artifact, building from high-level categories down to specific file annotations. A main agent coordinated their efforts and synthesized the results.
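In outline, the pattern looks like the sketch below. The spawn_subagent helper is hypothetical, standing in for Claude Code's own sub-agent mechanism, and the folder names are placeholders:

```python
# Fan-out/fan-in: sub-agents work in parallel, one artifact collects results.
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def spawn_subagent(folder: Path) -> str:
    # Placeholder: in reality this launches an agent with a fresh context
    # and instructions like "summarize and annotate every file here".
    return f"## {folder.name}\n(summary of {len(list(folder.iterdir()))} items)\n"

folders = [p for p in Path("vsm_archive").iterdir() if p.is_dir()]
with ThreadPoolExecutor(max_workers=8) as pool:
    sections = list(pool.map(spawn_subagent, folders))

# The coordinating agent merges the findings into one shared artifact.
Path("INDEX.md").write_text("\n".join(sections))
```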

The system ran overnight. In the morning, I had a reorganized folder structure, a canonical entry point that describes the project for future agents, an onboarding canvas for partner companies, presentation materials, and guidance documents for facilitators and students.

The work required review. I spent a few hours making adjustments. But the bulk of an otherwise week-long task was compressed into a single evening.

The benefits compound over time. I uploaded the organized files to our internal GitHub, so as we continue working on the VSM code, Claude Code can find the right information quickly. The agent's work made every subsequent use of AI on the project simpler and more reliable.


The Mechanics: Tools, MCPs, and Sub-Agents

Think of an agent as a player in a video game. Its tools are the skills and abilities it can use to solve problems. An agent might have a tool for reading files, another for writing them, another for executing code, another for searching the web. More tools means more capability—but loading too many degrades performance by filling the context window.
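Stripped to its essence, a tool is just a named function the agent can invoke. A toy sketch, not any particular framework's API:

```python
# Tools as skills: the model names a tool and arguments; the runtime dispatches.
from pathlib import Path

TOOLS = {
    "read_file":  lambda path: Path(path).read_text(),
    "write_file": lambda path, text: Path(path).write_text(text),
    "list_dir":   lambda path: [p.name for p in Path(path).iterdir()],
}

def call_tool(name: str, *args):
    return TOOLS[name](*args)

call_tool("write_file", "note.txt", "hello")
print(call_tool("read_file", "note.txt"))
```

Every tool description also occupies context, which is why the set has to stay small.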

This is where MCPs come in. The Model Context Protocol provides a standard interface that allows tools to be connected dynamically rather than loaded constantly. Instead of carrying every capability at all times, the agent connects to specific tools when it needs them. The Gmail access in my bookkeeping example came through an MCP server—it was not always present, only available when relevant.

Sub-agents extend this further. When a task is too large or requires specialized expertise, the main agent can spawn a sub-agent with a fresh context window and specific instructions. The sub-agent performs its work and reports back without cluttering the main agent's memory. My VSM indexing project used dozens of sub-agents, each focused on a specific folder or document type.

The composition of these elements creates something qualitatively different from a chatbot that answers questions. It creates a system that can decompose problems, allocate resources, manage state, recover from errors, and build the tools it needs at runtime.


Why This Should Matter to You

I understand if this sounds like developer territory. The terminology is dense. The workflows feel technical.

But the implications extend far beyond software engineering.

Consider what happens when AI can not only answer questions but also execute multi-step processes autonomously. The administrative tasks that currently consume knowledge worker hours—filing, categorizing, extracting, formatting, reporting—become candidates for automation. This automation does not require engineers to build custom software. Anyone can invoke it by describing what they want.

Consider the bookkeeping example. That task required reading emails, logging into websites, understanding document formats, classifying transactions, and assembling results. A traditional automation approach would require building integrations for each step. A coding agent generates those integrations on demand.

Now multiply that across every administrative task in a knowledge worker's day. Email triage. Report generation. Data extraction. Calendar coordination. Research synthesis. Each involves repetitive cognitive work that can be specified but not easily scripted in advance.

Coding agents close the gap between specification and execution. You describe what you want. They figure out how to build it.


The Professional Landscape

I have been working with generative AI since 2022, when I discovered GPT-3 while building a startup that predicted consumer flavor preferences. I realized then that the ability to bridge human intent with system execution would eventually reshape how all digital work gets done.

I have studied, practiced, and built community around this field. Let me be direct: I am miles behind. Progress moves faster than any individual can track.

Andrej Karpathy—one of the founders of OpenAI and a researcher whose work has shaped modern AI—recently posted that he has "never felt this much behind as a programmer." He described the profession as being "dramatically refactored," with programmers contributing "increasingly sparse" amounts of code as AI handles more of the implementation. If someone at his level feels this way, the signal is clear.

This does not eliminate jobs in any simple sense. It shifts where human judgment adds value. The bottleneck moves from execution to specification—from knowing how to do something to knowing what should be done. If you can hold a coherent mental model of a desired outcome and articulate its logic clearly, you can direct systems that build themselves in front of you.

You do not need to outrun AI. You do not need to keep up with every development. But maintaining a relative position matters. Understanding what these tools can do, where their limits lie, and how to work alongside them—that knowledge compounds.


What To Do About It

The path forward involves three capabilities:

1. Recognize the Problem

Identify which of your workflows involve repetitive cognitive work that can be specified but not easily scripted. These are candidates for agent automation.

2. Understand the Solution Shape

Learn enough about agent architectures—tools, context management, sub-agents, feedback loops—to communicate what you want. You do not need to implement these systems. You need to describe them clearly.

3. Develop Systems Thinking

Effective agent orchestration requires reasoning about states, sequences, dependencies, and failure modes. This is not coding knowledge. It is design knowledge—the same knowledge that makes someone effective at project management, process improvement, or organizational design.


Join the Workshop

If you want to develop these capabilities in a structured way, I'm organizing a workshop series—"Leading the AI Orchestra"—with Professor Jurjen Helmus of Amsterdam University of Applied Sciences and Demetrios Brinkmann, founder of the global MLOps community.

Professional boundaries are blurring. Developers need to understand product and strategy. Managers need to understand enough about AI systems to direct them effectively. Everyone who operates a screen is becoming, in some sense, a conductor—someone who coordinates capabilities rather than executing every task directly.

The first event is planned for late February 2026 in Amsterdam, with online participation available. If you are interested, you can register at wonderwhy.ai.


The Shift Underway

Building complex AI systems still requires substantial work. Syntax was never the goal—it was always a means to solve a problem. A codebase is a story written in the vocabulary needed to describe what should happen and when. As with any storytelling, the combination of timing, information flow, and knowing what belongs where gives words meaning beyond their sum.

What is changing is who can tell those stories.

Anthropic has built a gradient of accessibility that lets anyone—from the systems thinker to the person who just wants their emails sorted—tap into the same underlying power. Cowork provides the same capability as Claude Code, made accessible to a different audience.

The chatbots are becoming coding agents. The coding agents are becoming accessible to everyone. And the boundary between "using software" and "building software" is dissolving into something new: directing systems that construct themselves according to your intent.

The question is not whether this will affect how you work. It is whether you will be ready when it does.


Fausto Albers leads GenAI R&D at the Industrial Digital Twins Lab of Amsterdam University of Applied Sciences and co-founded the AI Builders Club. For workshop details, visit wonderwhy.ai.

This post reflects personal analysis and does not represent official positions of Amsterdam University of Applied Sciences or the Digital Twins Lab.

Register for Leading the AI Orchestra

We'll confirm dates and format details soon. Leave your information to receive updates. No obligation.


We use your information only to contact you about this workshop. We don't sell data. Request deletion anytime: admin@stepintoliquid.nl. Retained for 12 months unless you request earlier deletion.