What It Creates
Agents, sessions, and the output pipeline
System Architecture — Chapter 2 View
This diagram reveals more of the OpenClaw architecture as you progress through chapters.
What You'll Learn
- ✓ Agent output types
- ✓ Session management
- ✓ Multi-step workflows
- ✓ File and tool interactions
- ✓ Response pipeline
- ✓ Output quality control
In this chapter: 6 sections
The Three-Layer Hierarchy
Examine the project, task, and model layers in action.
OpenClaw organizes all your work into a clean three-layer structure: Projects sit at the top, Tasks nest within Projects, and Models belong to Tasks. A Project might be 'Customer Support Automation'—your major initiative.
Within that, you'd have Tasks like 'Handling Billing Questions', 'Processing Refunds', and 'Escalating Complex Issues'. Finally, each Task routes to specific Models (Claude Haiku for simple queries, Sonnet for complex reasoning, Opus for critical decisions).
This structure isn't just organizational—it's the foundation for dramatic cost optimization. By routing different workloads to appropriately-sized models, you can achieve 90-97% cost reductions compared to sending everything to expensive models. The hierarchy ensures every call is intentional and cost-effective.
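The hierarchy can be sketched as plain data. This is a minimal illustration, not OpenClaw's actual API; the type and field names here are assumptions chosen to mirror the Project → Task → Model description above:

```typescript
// Illustrative sketch of the Project → Task → Model hierarchy.
// Type and field names are assumptions, not OpenClaw's actual interfaces.
type ModelTier = "claude-haiku" | "claude-sonnet" | "claude-opus";

interface Task {
  name: string;
  model: ModelTier; // each Task routes to one model
}

interface Project {
  name: string;
  tasks: Task[]; // Tasks nest within Projects
}

const supportAutomation: Project = {
  name: "Customer Support Automation",
  tasks: [
    { name: "Handling Billing Questions", model: "claude-haiku" },
    { name: "Processing Refunds", model: "claude-sonnet" },
    { name: "Escalating Complex Issues", model: "claude-opus" },
  ],
};

// Resolve which model a given task should call.
function modelFor(project: Project, taskName: string): ModelTier | undefined {
  return project.tasks.find((t) => t.name === taskName)?.model;
}

console.log(modelFor(supportAutomation, "Processing Refunds")); // claude-sonnet
```

Because every model call is resolved through a Task, there is no "anonymous" LLM usage: each call is attributable to a Project and a Task, which is what makes the cost accounting in the next section possible.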
Cost Optimization Through Selective Routing
Track cumulative cost savings from intelligent model routing.
Not every task requires your most powerful (and most expensive) model. OpenClaw's hierarchy enables selective routing: simple classification tasks use Haiku (roughly 60x cheaper per million tokens than Opus at published Claude 3 pricing), while nuanced decision-making uses Sonnet, and only the most critical operations use Opus.
Imagine processing 10,000 customer messages monthly: 90% might be straightforward queries answerable by Haiku, 9% need Sonnet's reasoning, and just 1% require Opus-level understanding. A split along these lines (90/9/1, or in practice often closer to 85/13/2) can reduce monthly costs from $1,500+ to $30-50.
The key insight: default to the cheapest model that can do the job. You're not cutting corners; you're matching the tool to the task, which is how experts in any field have always worked.
Not every request needs an expensive frontier model. Route simple queries to faster, cheaper models and reserve expensive LLMs for complex reasoning tasks.
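The savings arithmetic is worth making concrete. The per-message costs below are illustrative assumptions (real costs depend on token counts and current pricing), but the structure of the calculation is exactly the 90/9/1 routing described above:

```typescript
// Sketch of the savings math behind selective routing.
// Per-message dollar costs are illustrative assumptions, not real pricing.
const costPerMessage = { haiku: 0.001, sonnet: 0.01, opus: 0.15 };

const messages = 10_000; // monthly volume from the example above

// Baseline: route everything to the most expensive model.
const allOpus = messages * costPerMessage.opus; // 10,000 × $0.15 = $1,500

// Selective routing with a 90/9/1 split across tiers.
const routed =
  messages * 0.9 * costPerMessage.haiku + // 9,000 Haiku messages → $9
  messages * 0.09 * costPerMessage.sonnet + // 900 Sonnet messages → $9
  messages * 0.01 * costPerMessage.opus; // 100 Opus messages → $15

console.log(allOpus.toFixed(2)); // "1500.00"
console.log(routed.toFixed(2)); // "33.00"
```

Under these assumed prices, routed cost is about 2% of the all-Opus baseline, which is where headline figures like "90-97% cost reduction" come from.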
Agent Output Types and Workflows
OpenClaw doesn't lock you into a single interaction pattern. Agents can range from simple reactive chat responses to complex multi-step autonomous workflows. A simple output might be your agent responding to a customer question in Slack.
A moderate output involves tool use—the agent analyzing the question, fetching data, and synthesizing an answer. Complex outputs run full workflows: an agent might orchestrate multiple sub-agents, each handling a piece of a larger problem, with coordination happening through a Mission Control dashboard.
The beauty is that your infrastructure supports all these patterns without modification. You start simple and layer on complexity as needed, with the same OpenClaw instance handling both the 'hello world' chatbot and sophisticated autonomous systems.
The sequence of reasoning, tool calls, and context updates that produce an agent's output. Each workflow is logged and can be replayed for debugging, auditing, or teaching.
Inside the Memory System
OpenClaw's memory system is meticulously designed for both retention and efficiency. The system maintains five core memory files in markdown format: SOUL.md contains the agent's core identity and mission (rarely changes), USER.md tracks individual user context and preferences, IDENTITY.md holds the agent's personality and communication style, MEMORY.md stores the evolving knowledge base, and daily logs maintain session-specific context.
Additionally, HEARTBEAT.md records timestamps and status from proactive monitoring. These files aren't arbitrary—each serves a specific purpose in the agent's reasoning process.
The system intelligently caches frequently-accessed content (like SOUL.md) while re-reading volatile files (like daily logs) on each cycle. This balance keeps context windows reasonable while maintaining complete historical awareness.
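The stable-vs-volatile split can be expressed as a small caching policy. The file names match the chapter; the loader itself is a sketch of the idea, not OpenClaw's implementation:

```typescript
// Sketch of the caching policy described above: stable files are read
// once and cached; volatile files are re-read on every cycle.
// The division into stable/volatile follows the chapter; the code is illustrative.
const STABLE = new Set(["SOUL.md", "IDENTITY.md"]); // rarely change

function makeLoader(read: (name: string) => string) {
  const cache = new Map<string, string>();
  return (name: string): string => {
    if (!STABLE.has(name)) return read(name); // volatile: fresh read each cycle
    if (!cache.has(name)) cache.set(name, read(name)); // stable: read once
    return cache.get(name)!;
  };
}
```

Injecting the `read` function keeps the policy testable and makes the trade-off explicit: caching saves file I/O and keeps prompts stable, while re-reading guarantees the agent never reasons over a stale daily log.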
Memory Files Deep Dive
See which memory files are loaded and cached each cycle.
Let's examine each memory file and how agents use them:
SOUL.md acts as the agent's constitution—mission statement, core values, decision-making framework—rarely modified but always consulted.
USER.md evolves with each user interaction, tracking preferences, history, and relationship context. IDENTITY.md defines how the agent communicates: tone, style, personality quirks.
MEMORY.md is the working memory, containing recent insights and lessons learned. Daily logs provide session isolation while maintaining continuity across days. HEARTBEAT.md tracks when the agent woke and what it observed.
During the context assembly phase (before each LLM call), the system loads SOUL.md, IDENTITY.md, relevant USER.md entries, and the current session's logs. Heavier files like MEMORY.md might be summarized or loaded selectively. This selective loading keeps token usage low while maintaining rich context.
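A simple way to picture context assembly is a function that concatenates light files whole and truncates heavy ones. This is a sketch of the selective-loading idea under assumed names and thresholds (real summarization would use an LLM pass, not a character cutoff):

```typescript
// Sketch of selective context assembly: light files are included whole,
// heavy files (like MEMORY.md) are summarized. The character-based
// truncation stands in for real summarization; names/limits are assumptions.
interface MemoryFile {
  name: string;
  content: string;
  heavy?: boolean; // candidates for summarization
}

function assembleContext(files: MemoryFile[], maxCharsPerHeavyFile = 500): string {
  return files
    .map((f) =>
      f.heavy && f.content.length > maxCharsPerHeavyFile
        ? `## ${f.name} (summary)\n${f.content.slice(0, maxCharsPerHeavyFile)}…`
        : `## ${f.name}\n${f.content}`,
    )
    .join("\n\n");
}
```

The payoff is predictable prompt size: SOUL.md and IDENTITY.md arrive intact every call, while the unbounded files contribute a bounded slice.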
How Outputs Flow Through the System
When an agent generates output, the flow follows a predictable path: the LLM produces raw text, the Agent Loop parses it (extracting any tool calls or special directives), the system executes any requested tools, the response gets formatted for the target channel, and crucially, the interaction gets logged to memory. If the agent called tools, those results feed back into another reasoning cycle (a feedback loop).
The formatted response then flows to the Channel Adapter, which translates it to platform-specific format (emoji reactions for Slack, formatting for WhatsApp, etc.) and sends it outbound. Throughout this process, the Agent Loop updates memory files, marking significant interactions. This means every interaction shapes the agent's future behavior—learning happens continuously, not in batch training cycles.
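The parse → format → log path can be condensed into a few functions. The `<tool:name>` directive syntax and the adapter shapes below are invented for illustration; only the pipeline order mirrors the description above:

```typescript
// Sketch of the output pipeline: parse raw LLM text for tool directives,
// format for the target channel, then log to memory. The <tool:name>
// syntax and adapter details are illustrative assumptions.
interface ParsedOutput {
  text: string;
  toolCalls: string[];
}

function parseOutput(raw: string): ParsedOutput {
  const toolCalls: string[] = [];
  const text = raw
    .replace(/<tool:([\w-]+)>/g, (_match: string, name: string) => {
      toolCalls.push(name); // extracted for execution by the Agent Loop
      return "";
    })
    .trim();
  return { text, toolCalls };
}

type ChannelAdapter = (text: string) => string;

const adapters: Record<string, ChannelAdapter> = {
  slack: (t) => `:speech_balloon: ${t}`, // platform-specific formatting
  whatsapp: (t) => `*Agent:* ${t}`,
};

const memoryLog: string[] = [];

function deliver(raw: string, channel: string): string {
  const parsed = parseOutput(raw);
  // (tool execution and the feedback cycle would happen here)
  const formatted = adapters[channel](parsed.text);
  memoryLog.push(formatted); // every interaction is logged to memory
  return formatted;
}
```

Note that logging sits inside `deliver`, not as an afterthought: because the log write is on the main path, no interaction can reach a channel without also shaping the agent's future context.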