What It Creates
Agents, sessions, and the output pipeline
System Architecture — Chapter 2 View
This diagram reveals more of the OpenClaw architecture as you progress through chapters.
What You'll Learn
- ✓ Agent output types
- ✓ Session management
- ✓ Multi-step workflows
- ✓ File and tool interactions
- ✓ Response pipeline
- ✓ Output quality control
In this chapter: 6 sections
The Three-Layer Hierarchy
Examine the project, task, and model layers in action.
OpenClaw organizes all your work into a clean three-layer structure: Projects sit at the top, Tasks nest within Projects, and Models belong to Tasks. A Project might be 'Customer Support Automation'—your major initiative.
Within that, you'd have Tasks like 'Handling Billing Questions', 'Processing Refunds', and 'Escalating Complex Issues'. Finally, each Task routes to specific Models (Claude Haiku for simple queries, Sonnet for complex reasoning, Opus for critical decisions).
This structure isn't just organizational—it's the foundation for dramatic cost optimization. By routing different workloads to appropriately-sized models, you can achieve 90-97% cost reductions compared to sending everything to expensive models. The hierarchy ensures every call is intentional and cost-effective.
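The hierarchy can be sketched as plain data. This is a minimal illustration, not OpenClaw's actual API; the type and field names here are assumptions chosen to mirror the Project → Task → Model description above:

```typescript
// Illustrative sketch of the Project → Task → Model hierarchy.
// Type and field names are assumptions, not OpenClaw's actual interfaces.
type ModelTier = "claude-haiku" | "claude-sonnet" | "claude-opus";

interface Task {
  name: string;
  model: ModelTier; // each Task routes to one model
}

interface Project {
  name: string;
  tasks: Task[]; // Tasks nest within Projects
}

const supportAutomation: Project = {
  name: "Customer Support Automation",
  tasks: [
    { name: "Handling Billing Questions", model: "claude-haiku" },
    { name: "Processing Refunds", model: "claude-sonnet" },
    { name: "Escalating Complex Issues", model: "claude-opus" },
  ],
};

// Resolve which model a given task should call.
function modelFor(project: Project, taskName: string): ModelTier | undefined {
  return project.tasks.find((t) => t.name === taskName)?.model;
}

console.log(modelFor(supportAutomation, "Processing Refunds")); // claude-sonnet
```

Because every model call is resolved through a Task, there is no "anonymous" LLM usage: each call is attributable to a Project and a Task, which is what makes the cost accounting in the next section possible.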
Cost Optimization Through Selective Routing
Track cumulative cost savings from intelligent model routing.
Not every task requires your most powerful (and most expensive) model. OpenClaw's hierarchy enables selective routing: simple classification tasks use Haiku (roughly 60x cheaper per million tokens than Opus at published Claude 3 pricing), while nuanced decision-making uses Sonnet, and only the most critical operations use Opus.
Imagine processing 10,000 customer messages monthly: 90% might be straightforward queries answerable by Haiku, 9% need Sonnet's reasoning, and just 1% require Opus-level understanding. A split along these lines (90/9/1, or in practice often closer to 85/13/2) can reduce monthly costs from $1,500+ to $30-50.
The key insight: default to the cheapest model that can do the job. You're not cutting corners; you're matching the tool to the task, which is how experts in any field have always worked.
Not every request needs an expensive frontier model. Route simple queries to faster, cheaper models and reserve expensive LLMs for complex reasoning tasks.
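The savings arithmetic is worth making concrete. The per-message costs below are illustrative assumptions (real costs depend on token counts and current pricing), but the structure of the calculation is exactly the 90/9/1 routing described above:

```typescript
// Sketch of the savings math behind selective routing.
// Per-message dollar costs are illustrative assumptions, not real pricing.
const costPerMessage = { haiku: 0.001, sonnet: 0.01, opus: 0.15 };

const messages = 10_000; // monthly volume from the example above

// Baseline: route everything to the most expensive model.
const allOpus = messages * costPerMessage.opus; // 10,000 × $0.15 = $1,500

// Selective routing with a 90/9/1 split across tiers.
const routed =
  messages * 0.9 * costPerMessage.haiku + // 9,000 Haiku messages → $9
  messages * 0.09 * costPerMessage.sonnet + // 900 Sonnet messages → $9
  messages * 0.01 * costPerMessage.opus; // 100 Opus messages → $15

console.log(allOpus.toFixed(2)); // "1500.00"
console.log(routed.toFixed(2)); // "33.00"
```

Under these assumed prices, routed cost is about 2% of the all-Opus baseline, which is where headline figures like "90-97% cost reduction" come from.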
Agent Output Types and Workflows
OpenClaw doesn't lock you into a single interaction pattern. Agents can range from simple reactive chat responses to complex multi-step autonomous workflows. A simple output might be your agent responding to a customer question in Slack.
A moderate output involves tool use—the agent analyzing the question, fetching data, and synthesizing an answer. Complex outputs run full workflows: an agent might orchestrate multiple sub-agents, each handling a piece of a larger problem, with coordination happening through a Mission Control dashboard.
The beauty is that your infrastructure supports all these patterns without modification. You start simple and layer on complexity as needed, with the same OpenClaw instance handling both the 'hello world' chatbot and sophisticated autonomous systems.
The sequence of reasoning, tool calls, and context updates that produce an agent's output. Each workflow is logged and can be replayed for debugging, auditing, or teaching.
Inside the Memory System
OpenClaw's memory system is meticulously designed for both retention and efficiency. The system maintains five core memory files in markdown format: SOUL.md contains the agent's core identity and mission (rarely changes), USER.md tracks individual user context and preferences, IDENTITY.md holds the agent's personality and communication style, MEMORY.md stores the evolving knowledge base, and daily logs maintain session-specific context.
Additionally, HEARTBEAT.md records timestamps and status from proactive monitoring. These files aren't arbitrary—each serves a specific purpose in the agent's reasoning process.
The system intelligently caches frequently-accessed content (like SOUL.md) while re-reading volatile files (like daily logs) on each cycle. This balance keeps context windows reasonable while maintaining complete historical awareness.
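The stable-vs-volatile split can be expressed as a small caching policy. The file names match the chapter; the loader itself is a sketch of the idea, not OpenClaw's implementation:

```typescript
// Sketch of the caching policy described above: stable files are read
// once and cached; volatile files are re-read on every cycle.
// The division into stable/volatile follows the chapter; the code is illustrative.
const STABLE = new Set(["SOUL.md", "IDENTITY.md"]); // rarely change

function makeLoader(read: (name: string) => string) {
  const cache = new Map<string, string>();
  return (name: string): string => {
    if (!STABLE.has(name)) return read(name); // volatile: fresh read each cycle
    if (!cache.has(name)) cache.set(name, read(name)); // stable: read once
    return cache.get(name)!;
  };
}
```

Injecting the `read` function keeps the policy testable and makes the trade-off explicit: caching saves file I/O and keeps prompts stable, while re-reading guarantees the agent never reasons over a stale daily log.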
Memory Files Deep Dive
See which memory files are loaded and cached each cycle.
Let's examine each memory file and how agents use them:
SOUL.md acts as the agent's constitution—mission statement, core values, decision-making framework—rarely modified but always consulted.
USER.md evolves with each user interaction, tracking preferences, history, and relationship context. IDENTITY.md defines how the agent communicates: tone, style, personality quirks.
MEMORY.md is the working memory, containing recent insights and lessons learned. Daily logs provide session isolation while maintaining continuity across days. HEARTBEAT.md tracks when the agent woke and what it observed.
During the context assembly phase (before each LLM call), the system loads SOUL.md, IDENTITY.md, relevant USER.md entries, and the current session's logs. Heavier files like MEMORY.md might be summarized or loaded selectively. This selective loading keeps token usage low while maintaining rich context.
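A simple way to picture context assembly is a function that concatenates light files whole and truncates heavy ones. This is a sketch of the selective-loading idea under assumed names and thresholds (real summarization would use an LLM pass, not a character cutoff):

```typescript
// Sketch of selective context assembly: light files are included whole,
// heavy files (like MEMORY.md) are summarized. The character-based
// truncation stands in for real summarization; names/limits are assumptions.
interface MemoryFile {
  name: string;
  content: string;
  heavy?: boolean; // candidates for summarization
}

function assembleContext(files: MemoryFile[], maxCharsPerHeavyFile = 500): string {
  return files
    .map((f) =>
      f.heavy && f.content.length > maxCharsPerHeavyFile
        ? `## ${f.name} (summary)\n${f.content.slice(0, maxCharsPerHeavyFile)}…`
        : `## ${f.name}\n${f.content}`,
    )
    .join("\n\n");
}
```

The payoff is predictable prompt size: SOUL.md and IDENTITY.md arrive intact every call, while the unbounded files contribute a bounded slice.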
How Outputs Flow Through the System
When an agent generates output, the flow follows a predictable path: the LLM produces raw text, the Agent Loop parses it (extracting any tool calls or special directives), the system executes any requested tools, the response gets formatted for the target channel, and crucially, the interaction gets logged to memory. If the agent called tools, those results feed back into another reasoning cycle (a feedback loop).
The formatted response then flows to the Channel Adapter, which translates it to platform-specific format (emoji reactions for Slack, formatting for WhatsApp, etc.) and sends it outbound. Throughout this process, the Agent Loop updates memory files, marking significant interactions. This means every interaction shapes the agent's future behavior—learning happens continuously, not in batch training cycles.
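The parse → format → log path can be condensed into a few functions. The `<tool:name>` directive syntax and the adapter shapes below are invented for illustration; only the pipeline order mirrors the description above:

```typescript
// Sketch of the output pipeline: parse raw LLM text for tool directives,
// format for the target channel, then log to memory. The <tool:name>
// syntax and adapter details are illustrative assumptions.
interface ParsedOutput {
  text: string;
  toolCalls: string[];
}

function parseOutput(raw: string): ParsedOutput {
  const toolCalls: string[] = [];
  const text = raw
    .replace(/<tool:([\w-]+)>/g, (_match: string, name: string) => {
      toolCalls.push(name); // extracted for execution by the Agent Loop
      return "";
    })
    .trim();
  return { text, toolCalls };
}

type ChannelAdapter = (text: string) => string;

const adapters: Record<string, ChannelAdapter> = {
  slack: (t) => `:speech_balloon: ${t}`, // platform-specific formatting
  whatsapp: (t) => `*Agent:* ${t}`,
};

const memoryLog: string[] = [];

function deliver(raw: string, channel: string): string {
  const parsed = parseOutput(raw);
  // (tool execution and the feedback cycle would happen here)
  const formatted = adapters[channel](parsed.text);
  memoryLog.push(formatted); // every interaction is logged to memory
  return formatted;
}
```

Note that logging sits inside `deliver`, not as an afterthought: because the log write is on the main path, no interaction can reach a channel without also shaping the agent's future context.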