
Building AI Agents with Anthropic's 6 Composable Patterns

Cem Dilmegani
updated on Oct 24, 2025

We spent three days experimenting with workflows and agent pipelines in n8n, following Anthropic’s and OpenAI’s guides on building effective AI agents. In this guide we distill everything we learned so you can build functional AI agents in your LLM projects.1 2

  • First, we introduce the crucial components that make up an AI agent: what each component does and how to choose the right component or tool. 
  • Then we build agent workflows based on Anthropic’s simple, composable patterns: prompt chaining, routing, parallelization, orchestrator–workers, and evaluator–optimizer, and we provide an example of an autonomous agent from our AI code editor benchmark.

Building blocks of automation: Workflows vs agents

An AI agent is a system that perceives its environment, processes information, and autonomously takes actions to achieve specific goals. Examples include coding agents such as Cursor or Windsurf, AI-powered code editors with “agent modes” that can autonomously perform coding tasks using models such as Claude 3.7 Sonnet. Another common example is customer service agents, which many companies use to handle inquiries.

There are many different ways to design and deploy these agents, depending on the complexity of the workflow and the degree of autonomy required.

To give a quick preview, an AI agent is often a collection of sub-agents, each performing specific tasks. Together, these sub-agents coordinate within multi-agent systems to deliver what we perceive as a single AI agent.

These are fundamentally different from workflows. Workflows are orchestrated sequences of predefined steps, like a recipe that always follows the same order.

When to use AI agents

Before diving into workflow implementation examples for building AI agents, it’s worth pausing for a quick reality check: AI agents aren’t the right fit for every problem. Many teams find that traditional workflows perform well, even in scenarios where agents could, in theory, be applied.

One of the clearest ways to think about this, described in Anthropic’s blog, is to start with the simplest solution possible and only add agentic complexity when the task demonstrably requires it.

That said, there are real situations where agents outperform traditional workflows in tasks that demand flexibility, reasoning, and adaptability:

Dynamic conversations that require adaptations:

Some interactions, like basic refund or password reset requests, fit neatly into workflows. But others, such as personalized recommendations, require nuanced judgment and depend heavily on context and back-and-forth reasoning.

High-value, low-volume decision-making:

Agents can be expensive to run, but in some cases, the decisions they support are far more costly if made incorrectly.

For instance, BCG reported that a leading energy provider in Germany used a GenAI-driven agentic tool to automate payment reviews.3

If you’re planning large-scale infrastructure, like optimizing engineering designs, the cost of compute is negligible compared to the stakes. In these high-stakes cases, agents add value because the cost of being wrong far exceeds the cost of running the model.

Multi-step, unpredictable workflows:

Some workflows are too complex to enumerate: writing endless “if this, then that” rules becomes its own project.

In these cases, agentic loops simplify the chaos. Instead of hardcoding every possible path, the model dynamically decides the next step based on real-time context and reasoning.

This approach works well for diagnostic systems or tools that handle dozens of shifting variables.

When workflows are better 

High-frequency, low-complexity scenarios:

Some tasks depend more on speed and scale than on reasoning, like:

  • Retrieving information from a database
  • Parsing structured messages or emails
  • Responding to FAQ-style queries

A workflow can process thousands of these requests with more predictable cost and latency than an agent could.

Understanding components of AI agents

Building agents involves connecting components across several domains: models, tools, knowledge and memory, and guardrails. OpenAI provides composable primitives for each:

Source: OpenAI4

OpenAI naturally lists its own offerings first, but there’s a wide ecosystem of alternatives. Depending on your use case, you can build agents using frameworks such as LangChain, LlamaIndex, CrewAI, or even custom-built orchestration layers.

I will go into more detail about each of these components:

Models

First, you have the models component: the large language models that serve as the core intelligence, capable of reasoning, making decisions, and processing different modalities. The examples OpenAI gives are GPT-5 and its variants.

Depending on the specific type of agent that you’re building, you want to choose a different kind of model within the OpenAI ecosystem. 

GPT-5 is your flagship model: a reasoning (“thinking”) model that is strong at multi-step problem-solving and particularly effective at answering most questions. 

You also have o3-mini, which has reasoning capabilities but is faster, and o3-mini-high, which is particularly good for coding and logic.

Outside of the OpenAI ecosystem, Claude 3.7 Sonnet is usually the go-to model for people who do a lot of coding, reasoning, and STEM-based tasks, although Gemini 2.5 Pro is challenging it right now. 

Overall, if cost matters most, you probably want to go with an open-source model and host it yourself. If speed matters most, go for smaller models. 

We have benchmarked and compared the top AI models to help you understand how each performs in terms of reasoning, speed, and cost so you can choose the one that suits your goals best.

Tools

Next up are tools, which extend the model’s capabilities, such as enabling it to search the web or interact with other systems.

Almost any app can become a tool for your AI. You can connect it to Gmail, Calendar, your drive, or apps like Slack, Discord, YouTube, Salesforce, and Zapier. You can even create your own custom tools.

With OpenAI’s Agents SDK (which requires some coding), you can define tools or use built-in ones like web search, file search, and computer use.
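To make this concrete, here is a minimal sketch of a custom tool defined with the Python Agents SDK. It follows the SDK’s documented pattern (Agent, Runner, function_tool), but verify the exact names against the current docs; get_order_status is a hypothetical helper invented for illustration.

```python
# Minimal sketch of a custom tool with OpenAI's Python Agents SDK.
# Pattern per the SDK docs; confirm names against current documentation.
from agents import Agent, Runner, function_tool

@function_tool
def get_order_status(order_id: str) -> str:
    """Look up an order's shipping status (hypothetical helper)."""
    return f"Order {order_id} has shipped."

support_agent = Agent(
    name="Support agent",
    instructions="Answer order questions using the available tools.",
    tools=[get_order_status],
)

result = Runner.run_sync(support_agent, "Where is order 42?")
print(result.final_output)  # the model decides when to call the tool
```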

MCP (Model Context Protocol) by Anthropic also simplifies tool integration by standardizing how models access them.

If you’re not into coding, no-code platforms such as n8n let you drag and drop tools to link them with your model.

Knowledge and memory

There are two main types of memory: knowledge base (static memory) and persistent memory.

  • Knowledge base gives your AI access to static facts, policies, and documents that remain relatively unchanged. This is essential for agents performing policy-driven or company-specific tasks where reference materials must stay consistent.
  • Persistent memory enables the AI to remember past interactions across sessions. This is crucial for chatbots or personal assistants that need to recall previous conversations.

OpenAI provides hosted services like vector stores, file search, and embeddings to handle memory. 
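As a rough illustration of what a knowledge base does under the hood, here is a sketch using OpenAI’s embeddings endpoint with a brute-force cosine-similarity lookup. The model name and documents are illustrative; a real system would use a vector store rather than in-memory lists.

```python
# Sketch: embed a tiny knowledge base, then retrieve the closest document.
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> list[list[float]]:
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [item.embedding for item in response.data]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / ((sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5)

docs = [
    "Refunds are processed within 5 business days.",
    "Support is available 9am-5pm on weekdays.",
]
doc_vectors = embed(docs)  # embed the static knowledge once

query_vector = embed(["How long do refunds take?"])[0]
best = max(range(len(docs)), key=lambda i: cosine(query_vector, doc_vectors[i]))
print(docs[best])  # the retrieved fact the agent would ground its answer on
```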

Outside of OpenAI’s stack, Pinecone (a managed, cloud-native vector database) and the open-source Weaviate are popular options. 

For those using no-code tools, memory management is usually built into platforms like n8n.

For more see: Building an AI research agent with memory

Guardrails

Guardrails ensure your agent behaves as intended, avoiding irrelevant, harmful, or inappropriate responses. For example, a customer service bot should stay focused on service-related topics, not drift into unrelated ones.

Outside of OpenAI’s ecosystem, popular tools include Guardrails AI and LangChain Guardrails. Many no-code platforms already have guardrail features built in, but it’s still important to understand how they work to maintain control and compliance in your agents.
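A guardrail can be as simple as a cheap pre-check call that gates the main agent. Below is a minimal sketch assuming an OpenAI chat model; the prompt, model name, and refusal message are illustrative.

```python
# Sketch: a topical guardrail that runs before the main agent answers.
from openai import OpenAI

client = OpenAI()

def on_topic(query: str) -> bool:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Is this a customer-service question about our product? "
                       f"Answer yes or no only.\n\n{query}",
        }],
    )
    return response.choices[0].message.content.strip().lower().startswith("yes")

query = "Can you give me stock tips?"
if on_topic(query):
    print("...hand off to the main agent...")
else:
    print("Sorry, I can only help with questions about our service.")
```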

Orchestration

The final component is orchestration. This involves managing how multiple sub-agents work together, deploying them into production, and monitoring their performance over time.

Once deployed, agents need ongoing supervision. Models, data, and behaviors change, so continuous updates and improvements are key.

Several platforms/frameworks support orchestration, like:

  • Low-code/no-code platforms:
    • Stack AI
    • Microsoft Copilot Studio Agent Builder
    • Relevance AI, etc
  • Open source frameworks:
    • CrewAI, designed for multi-agent systems
    • LangChain, widely used for managing and deploying agent interactions
    • LlamaIndex, particularly strong for document-heavy agents and knowledge-base applications.

Introduction to AI agent workflows and implementations

Now that we’ve discussed AI agent workflows and the components that make them up, let’s move on to the actual implementations. As mentioned earlier, AI agents aren’t typically a single entity. Instead, they are made up of various sub-agents that interact with each other. One of the best resources I found on common workflows and agent systems is the Building Effective Agents guide by Anthropic.5 Let’s break it down.

At the heart of agentic systems is what Anthropic calls the augmented LLM. This structure consists of three key elements:

  • the input,
  • the large language model (LLM), 
  • and the output. 

Source: Anthropic6

The augmented LLM is capable of generating its own search queries, selecting relevant tools, and deciding what information to store in memory. 

You may notice some similarities with OpenAI’s components (outlined below). This version is simpler and lacks elements like guardrails and orchestration, but the core structure remains the same, which is perfectly acceptable. For concerns such as testing and deployment, it’s best to refer to OpenAI’s components.

OpenAI’s list of AI agent components7

These augmented LLM building blocks are also known as sub-agents. Now, let’s explore how these sub-agents fit together and interact to form a larger AI agent. We’ll begin with the simpler workflows and gradually move toward more complex, fully autonomous systems:

1. Simple agentic workflows (prompt chaining)

The simplest agentic workflow is called prompt chaining. In this process, a task is broken down into a series of steps, where each sub-agent handles the output from the previous one.

At its core, it functions like an assembly line, but you can introduce decision points to redirect the flow if necessary. The general pattern remains the same: an input is processed by a sub-agent, which passes the result to another sub-agent for further processing, and so on, until the final output is produced. This method is particularly useful for tasks that can be easily split into smaller, sequential subtasks.
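In code, the pattern is just sequential calls where each output feeds the next prompt. Here is a minimal sketch with the OpenAI Python client; the model name and prompts are illustrative.

```python
# Sketch: a three-step prompt chain (outline -> critique -> draft).
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

topic = "AI agents in customer service"
outline = ask(f"Write a blog post outline about: {topic}")                 # step 1
review = ask(f"Critique this outline and list concrete fixes:\n{outline}") # step 2 (gate)
draft = ask(
    "Write the post from this outline, applying the feedback.\n\n"
    f"Outline:\n{outline}\n\nFeedback:\n{review}"                          # step 3
)
print(draft)
```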

The prompt chaining workflow8

Real-world example:9

Prompt chaining in n8n (outline, evaluate & publish to sheets)

In the above example, the user enters a topic in the n8n chat window. Each LLM node utilizes the Azure OpenAI model.

The first LLM generates a structured outline for a blog post. The prompt for the Outline Writer is as follows:

Screenshot of the prompt for the outline generator LLM

Where {{ $json.chatInput }} refers to the topic that was entered by the user in the chat window.

The variable {{ $json.chatInput }} is gray because the workflow has not been run yet. If we had already run or tested the node, it would be green or red, depending on the validity of the variable.

Then, the following LLM will evaluate the outline based on key criteria in the system message section. The prompt can be found below:

The final Blog Writer LLM then writes the post from the outline created by the previous LLM and appends it as a row in a Google Sheet.

Screenshot of the prompt for the Blog Writer LLM

When to use prompt chaining:

  • Tasks can be naturally decomposed into fixed, sequential subtasks
  • Each step contributes meaningfully to the final output
  • Step-by-step reasoning enhances accuracy over direct processing
  • Quality control checkpoints are needed throughout the process

2. Routing workflow

Routing is another type of workflow where an input is received, and a sub-agent is responsible for directing that input to the appropriate follow-up task. Each task is then handled by a sub-agent specialized in that area, and once the tasks are completed, the final output is generated.

A classic example of routing is seen in customer service bots. The bot may receive various types of queries, such as general inquiries, refund requests, or technical support issues. The first sub-agent identifies the nature of the query and routes it to the sub-agent that specializes in handling that particular issue. 

For instance, if the query is about a refund, it would be routed to the refund specialist sub-agent, while a technical support question would be directed to the technical support sub-agent. 

Another example is routing questions to different models based on their strengths. For more complex STEM-related questions, you might route the input to a model like Claude 3.7 Sonnet. For simpler, faster queries, you might route it to a model like Gemini Flash, which is optimized for speed. 
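Stripped to its essentials, routing is one cheap classification call followed by a dispatch table. A minimal sketch follows; the categories, model name, and handlers are illustrative stand-ins for specialized sub-agents.

```python
# Sketch: classify the query once, then dispatch to a specialized handler.
from openai import OpenAI

client = OpenAI()

def classify(query: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Classify this query as exactly one of: "
                       f"refund, technical, general.\n\nQuery: {query}",
        }],
    )
    return response.choices[0].message.content.strip().lower()

handlers = {
    "refund": lambda q: f"[refund specialist handles] {q}",
    "technical": lambda q: f"[tech support handles] {q}",
    "general": lambda q: f"[general assistant handles] {q}",
}

query = "I was charged twice for my subscription."
category = classify(query)
print(handlers.get(category, handlers["general"])(query))  # fallback to general
```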

Real-world example:10

In the above example, the agent routes user input to specialized agents (like a Reminder Agent, Email Agent, etc.) using structured output from a language model.

The router is connected to GPT-4o mini. The prompt and the categories are as follows:

Screenshot of the parameters of the AI agent node

Use case examples:

You can enter a query in the n8n chat window. For example:

  • User says: “Remind me to call my mom tomorrow.”
    → Routed to Reminder Agent
  • User says: “Send an email to the HR team.”
    → Routed to Email Agent
  • User says: “Schedule a meeting with John next week.”
    → Routed to Meeting Agent

When to use routing:

  • Diverse input types: Your system receives various types of queries that benefit from specialized handling
  • Resource optimization: You want to assign simple queries to cost-effective processors while routing complex requests to advanced systems
  • Domain specialization: Different categories of inputs require domain-specific expertise or processing logic
  • Performance optimization: You need to balance load and ensure optimal response times across different query types

3. Parallelization workflow

The next workflow is parallelization, in which multiple sub-agents work on a task simultaneously and their outputs are then combined. It typically has two main variations: 

  • The first variation is called sectioning, where a task is broken down into independent subtasks that run in parallel. 
  • The second variation is voting, where the same task is performed multiple times by different sub-agents to produce diverse outputs, which are then aggregated.

This helps achieve faster, modular automation, especially in large workflows.
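For the sectioning variant, plain asyncio is enough: independent subtasks become concurrent LLM calls whose results are combined at the end. A sketch assuming the async OpenAI client; the prompts and input are illustrative.

```python
# Sketch: sectioning via asyncio.gather, three concurrent LLM calls.
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def ask(prompt: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

async def main() -> None:
    document = "(text under review)"  # illustrative input
    summary, claims, tone = await asyncio.gather(  # all three run concurrently
        ask(f"Summarize this document:\n{document}"),
        ask(f"List factual claims to verify:\n{document}"),
        ask(f"Flag any tone or policy issues:\n{document}"),
    )
    print(summary, claims, tone, sep="\n---\n")  # combine the sections

asyncio.run(main())
```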

Sequential workflow vs. parallel workflow timing comparison11

Real-world example:12

Screenshot of the parallelization workflow example in n8n

The parallel execution in n8n example demonstrates a task where the workflow queries Google search using the SERP API to retrieve LinkedIn URLs and store them in a Google Sheet. In the initial setup, the workflow processes each task sequentially, one website at a time:

  1. The workflow is triggered.
  2. The Get tool retrieves the website from the Google Sheet.
  3. The AI agent uses the SERP API to search Google and fetch the LinkedIn URL.
  4. The LinkedIn URL is then updated in the Google Sheet.

At this point, tasks are processed one after another, which can be slow when dealing with large datasets.

n8n has a feature that lets you select a group of nodes and convert them into a sub-workflow. When you click the button, n8n names the new workflow; once you confirm, the selected nodes become a sub-workflow that is already linked into the parent workflow and called from it.

The sub-workflow created

So n8n turned this into a sub-workflow, but you don’t yet have parallelization: items would still flow through it sequentially.

To make this actually run in parallel, each item should run as an individual execution. Click into the node and choose “Run once for each item”, which makes n8n call the sub-workflow separately for every item.

Once you have changed that, open the sub-workflow and click on Executions: you will see all three items running at the same time.

When to use parallelization: Parallelization is most effective when tasks can be divided into smaller, independent subtasks that can run simultaneously, improving both speed and efficiency.

It is also valuable when multiple perspectives or repeated attempts are required to achieve higher confidence in the results. For complex problems involving several dimensions or evaluation criteria, large language models tend to perform better when each specific aspect is handled by a separate model call, enabling more focused and accurate reasoning for each part of the task.

4. Orchestrator workers workflow

The next workflow, which becomes more complex, is the orchestrator–worker pattern.

The orchestrator–worker architecture makes your n8n workflows modular, scalable, and adaptive, turning what would be a single rigid automation into a composable system of cooperating agents.

At first glance, it might look similar to parallelization since multiple sub-agents can be active, but the key distinction is flexibility. Unlike parallelization, the orchestrator–worker setup does not rely on a fixed list of subtasks. Instead, the orchestrator dynamically decides which tasks need to be performed, assigns them to worker agents, and manages their coordination throughout the process.
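The distinguishing move, the orchestrator planning subtasks at runtime, can be sketched in a few lines. In this hedged example, the planner call is assumed to return bare JSON (a structured-output call would be more robust), and each planned task is delegated to a worker call; the goal and prompts are illustrative.

```python
# Sketch: orchestrator plans tasks at runtime, workers execute, output merged.
import json
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

goal = "Create a launch plan for a new budgeting app"
plan = ask(
    "Break this goal into 2-4 worker tasks. Reply with only a JSON array of "
    f'objects shaped like {{"role": "...", "task": "..."}}.\n\nGoal: {goal}'
)
tasks = json.loads(plan)  # assumes bare JSON; structured output would be safer

# Workers: one focused call per task the orchestrator decided on.
results = [
    ask(f"You are the {t['role']} worker. Complete this task:\n{t['task']}")
    for t in tasks
]

# Synthesizer: merge worker outputs into one deliverable.
print(ask("Merge these worker outputs into one plan:\n\n" + "\n\n".join(results)))
```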

Real-world example:13

Screenshot of the orchestrator-workers workflow example in n8n

In the above example, the brief is collected once and an orchestrator routes work to multiple specialist agents.

The CEO Agent acts as the orchestrator LLM. It processes the input brief, refines it for each department, selects which worker agents to activate, and determines how their outputs will be integrated. It can decide to call one, two, or all workers depending on context and constraints.


Screenshot of the CEO Agent node

Below, three worker agents, Marketing, Operations, and Finance, each run their own OpenAI Chat Model with separate memory and tool configurations. This allows department-specific prompts and JSON schemas for structured output.


Screenshot of the three worker agent nodes

Once the orchestrator has prepared department-specific instructions, it invokes each worker as a tool to generate outputs based on inputs.

For example, the Marketing Agent creates campaigns (name, channel, KPI).

AI tool node (Marketing Agent)

After the worker outputs are generated, the CEO Agent compiles and merges the department responses into a single cohesive plan. The workflow then writes the plan to a Google Doc, adds metadata, converts it to PDF, and uploads it automatically for sharing or review.


Screenshot of document creation, conversion, and upload nodes

When executed, the orchestrator determines which agents to activate, coordinates their collaboration, and combines their outputs into one comprehensive report, demonstrating how orchestrator–worker workflows enable flexible, modular, and composable AI systems.

When to use orchestrator workers workflow: This approach is especially valuable for solving open-ended or evolving problems where the required steps cannot be known in advance.

Examples where the orchestrator–worker workflow is useful:

  • Coding tasks: When developing or debugging complex software products that require coordinated changes across multiple files, where the exact files and edits can only be determined during execution.
  • Research and information gathering: In tasks that involve searching, collecting, and analyzing data from multiple sources, where relevant information cannot be fully identified ahead of time and must be discovered dynamically.

5. Evaluator optimizer workflow

Even more complex is the evaluator–optimizer workflow. This setup moves toward more autonomous behavior, giving the sub-agent or AI agent greater freedom to decide what actions to take and how to improve its own outputs over time.

You start with an input, and the first sub-agent generates a proposed solution. That output is then passed to an evaluator sub-agent, which reviews the result. If the evaluator finds it satisfactory, the output is finalized. But if it determines that the result isn’t good enough, it sends it back to the first sub-agent with specific feedback for improvement.

This creates a continuous feedback loop in which the optimizer iteratively refines its output until the evaluator determines it meets the required quality standards.

Real-world example:14

For this example, I walked through a Python simulation rather than a no-code tool, to directly show evaluation schemas, custom logic, and iterative loops.

This is not a full setup: to run the evaluator–optimizer workflow end to end, you’ll need proper environment configuration, model initialization, schema setup, and so on before the code will execute correctly. The full implementation example is here.

You can also implement an evaluator–optimizer loop using workflow automation tools that support evaluation nodes; here is an example from n8n.

Evaluator–optimizer workflow with Python:

An example of an Evaluator–Optimizer loop, a common pattern in self-reflective AI systems or agentic workflows

This workflow represents an automated content generation and evaluation loop where two components collaborate: one creates, and the other reviews. It ensures that outputs meet quality standards before finalization.

Step-by-step explanation (a minimal runnable sketch follows the list):

  • Initialize input: Create initial_state = {“content_topic”: topic}.
  • Run the loop: Call evaluator_optimizer_workflow.invoke(initial_state) which iteratively:
    • generates/refines content,
    • evaluates quality,
    • repeats until approved or a max iteration limit.
  • Log outcome: Print completion message and the approved generated_content.
  • Return results: final_state dict (e.g., content_topic, generated_content, quality_assessment).
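Putting those steps together, here is a library-free sketch of the loop. generate_content and evaluate_quality stand in for the real LLM calls and are stubbed so the control flow runs as-is; the state keys mirror those listed above.

```python
# Sketch: evaluator-optimizer loop with stubbed LLM calls.
MAX_ITERATIONS = 3

def generate_content(topic: str, feedback: str) -> str:
    # Real version: call the optimizer LLM, revising per the feedback.
    return f"Draft on '{topic}'" + (f" (revised: {feedback})" if feedback else "")

def evaluate_quality(content: str) -> tuple[bool, str]:
    # Real version: call the evaluator LLM against explicit criteria.
    approved = "revised" in content  # toy criterion: one revision pass
    return approved, "" if approved else "Add concrete examples."

def invoke(initial_state: dict) -> dict:
    state = dict(initial_state)
    feedback = ""
    for _ in range(MAX_ITERATIONS):  # cap iterations to avoid endless loops
        state["generated_content"] = generate_content(state["content_topic"], feedback)
        approved, feedback = evaluate_quality(state["generated_content"])
        state["quality_assessment"] = "approved" if approved else feedback
        if approved:
            break
    return state

final_state = invoke({"content_topic": "AI agents in customer service"})
print(final_state["generated_content"], "->", final_state["quality_assessment"])
```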

Workflow visualization:

Evaluator–Optimizer loop with Python results: Each cycle uses previous feedback to improve the content. The loop eventually produces content that meets the quality standard:

When to use evaluator optimizer workflow: This workflow is especially useful when there are clear evaluation criteria and when iterative refinement can lead to meaningful improvements in quality over time.

Examples where the evaluator–optimizer workflow is useful:

  • For example, in a literary translation task, the first attempt might miss certain linguistic nuances or emotional tones. The evaluator would provide feedback and ask for revisions until the translation fully captures the intended meaning and subtleties of the original text. 
  • Another example is in complex research aggregation, where the optimizer gathers and summarizes information while the evaluator checks for depth, completeness, and accuracy. If the evaluator finds the research insufficient, it sends it back for further work until the final report meets all requirements and effectively synthesizes the necessary information.

6. Truly autonomous agent implementation

And finally, there is the truly autonomous agent implementation. This type of system is conceptually straightforward but can produce highly diverse and complex behaviors in practice.

The agent begins its operation with minimal human input, usually a single instruction or goal. Once the task is defined, it functions independently, taking actions and observing their effects on the environment.

A key characteristic of this approach is self-evaluation: the agent must determine, based on environmental feedback, whether its actions are moving it closer to the goal. For instance, if it executes code or uses external tools, it must assess whether those actions contribute to progress or if adjustments are required. This feedback-driven cycle continues until the agent determines that the objective has been achieved or that no further progress is possible.
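The control flow of such an agent reduces to a single feedback loop. Below is a minimal stubbed sketch; decide_next_action, goal_reached, and run_tool are hypothetical stand-ins for model calls and real tools, kept trivial so the loop itself runs.

```python
# Sketch: the feedback-driven loop of an autonomous agent (stubbed).
def run_tool(action: str) -> str:
    # Real version: execute code, call an API, run a shell command, etc.
    return f"observation after '{action}'"

def decide_next_action(goal: str, history: list[str]) -> str:
    # Real version: ask the model for the next action given goal + history.
    return "deploy_api" if not history else "verify_deployment"

def goal_reached(goal: str, history: list[str]) -> bool:
    # Real version: ask the model whether the observations satisfy the goal.
    return len(history) >= 2  # toy stopping rule

goal = "Deploy the API and confirm it is healthy"
history: list[str] = []
while not goal_reached(goal, history):  # loop until the agent self-assesses done
    action = decide_next_action(goal, history)
    history.append(run_tool(action))
print(history)
```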

Real-world example:

In our benchmark of AI coding tools, we observed that Windsurf and Cursor demonstrated agentic capabilities by autonomously creating file structures, editing multiple files, and running terminal commands to deploy APIs on Heroku.

Windsurf even adapted to recent platform changes: when it discovered that the PostgreSQL Hobby Dev add-on was deprecated, it correctly reconfigured the deployment to use PostgreSQL Essential 0.

Summary

Building AI agents is less about achieving full autonomy and more about creating systems that are purposeful, transparent, and dependable. From our experiments in n8n and insights gained from Anthropic’s and OpenAI’s guides, we found that effective agents come from deliberate design choices rather than from maximizing autonomy.

When implementing agents, we focus on three guiding principles:

  • Keep the architecture simple. Start small, build modularly, and only introduce complexity when it clearly improves performance or flexibility.
  • Make the reasoning process visible. Allow users and developers to see how the agent plans and makes decisions, improving interpretability and control.
  • Ensure reliable tool interactions. Design tools that are clearly scoped, well-documented, and tested so agents can act consistently in real-world environments.
Cem Dilmegani
Principal Analyst
Cem has been the principal analyst at AIMultiple since 2017. AIMultiple informs hundreds of thousands of businesses (as per similarWeb) including 55% of Fortune 500 every month.

Cem's work has been cited by leading global publications including Business Insider, Forbes, Washington Post, global firms like Deloitte, HPE and NGOs like World Economic Forum and supranational organizations like European Commission. You can see more reputable companies and resources that referenced AIMultiple.

Throughout his career, Cem served as a tech consultant, tech buyer and tech entrepreneur. He advised enterprises on their technology decisions at McKinsey & Company and Altman Solon for more than a decade. He also published a McKinsey report on digitalization.

He led technology strategy and procurement of a telco while reporting to the CEO. He has also led commercial growth of deep tech company Hypatos that reached a 7 digit annual recurring revenue and a 9 digit valuation from 0 within 2 years. Cem's work in Hypatos was covered by leading technology publications like TechCrunch and Business Insider.

Cem regularly speaks at international technology conferences. He graduated from Bogazici University as a computer engineer and holds an MBA from Columbia Business School.
Researched by
Mert Palazoğlu
Industry Analyst
Mert Palazoglu is an industry analyst at AIMultiple focused on customer service and network security with a few years of experience. He holds a bachelor's degree in management.
