
LESSON

RAG, Agents, and LLM Production

Lesson 006 · 30 min · Intermediate

Day 326: Advanced Agent Patterns - Planning, Memory, and Multi-Agent Systems

The core idea: advanced agent systems become reliable only when planning, memory, and delegation are treated as explicit runtime components with contracts, budgets, and observability.


Today's "Aha!" Moment

The insight: 21/05.md defined an agent as a bounded control loop over state, actions, and stop conditions. This lesson adds the next layer: once tasks get longer, context spans multiple sessions, or responsibilities need to be split, the "one loop, one prompt, one tool list" design stops scaling cleanly.

Why this matters: Teams often reach for planning, memory, or multi-agent frameworks as if they were feature upgrades. In production they are architecture choices. Each one changes where control lives, how errors accumulate, and what has to be logged, tested, and governed.

Concrete anchor: Consider an enterprise IT operations assistant handling "My laptop was stolen at the airport, disable access and get me a replacement." A useful system may need to:

  • verify the user's identity before taking any action
  • revoke device and account access
  • notify security and open an incident ticket
  • confirm whether the device is corporate-managed
  • trigger replacement procurement under policy
  • track the case across sessions until it is resolved

That is no longer just a single agent step repeated a few times. It needs decomposition, durable state, and sometimes specialist roles.

Keep this mental hook in view: Advanced agent patterns work when reasoning artifacts, memory records, and handoffs are explicit system objects instead of hidden prompt behavior.


Why This Matters

The single-agent loop from 21/05.md is enough for short tasks: search, read, maybe call one or two tools, then stop. It starts to fail when the job has one or more of these properties:

  • the task horizon is long, with many dependent steps
  • context must survive across sessions or days
  • responsibilities need to be split across different tools, permissions, or domains

Without advanced patterns:

  • the model improvises control flow from scratch on every turn, so behavior is hard to predict and audit
  • facts are lost or re-derived between sessions, so tool calls repeat
  • one prompt accumulates every tool and permission, widening the blast radius of any mistake

With advanced patterns:

  • plans, memory records, and handoffs are explicit, inspectable system objects
  • durable state survives across sessions with provenance and expiry
  • specialist roles keep tools and permissions narrowly scoped

Real-world impact: Better task completion on long-running work, fewer repeated tool calls, safer permission boundaries, and clearer debugging when a complex task goes off course.

This lesson prepares for 21/07.md, where those extra moving parts force stronger safety controls, monitoring, and observability.


Learning Objectives

By the end of this session, you should be able to:

  1. Design explicit planning layers for agent systems so task decomposition is inspectable instead of hidden inside free-form model output.
  2. Choose the right memory strategy for an agent by separating working state from durable memory and reasoning about staleness, privacy, and retrieval quality.
  3. Decide when multi-agent coordination is worth the complexity and define role, tool, and handoff boundaries that make it operable in production.

Core Concepts Explained

Concept 1: Planning Externalizes Control Flow

For example, the stolen-laptop assistant cannot safely jump straight to "disable device and order replacement." It has to verify the user's identity, confirm whether the device is corporate-managed, notify security, and only then trigger replacement procurement. If the model improvises that sequence from scratch on every turn, the system becomes hard to predict and harder to audit.

At a high level, planning is about moving part of the control logic out of hidden token-by-token reasoning and into a visible intermediate artifact. A good plan does not just list steps. It captures dependencies, success criteria, and when to re-plan.

Mechanically: In production, explicit planning often looks like:

  1. Task intake
    • classify the request
    • estimate risk and expected horizon
    • decide whether the task even needs a plan
  2. Plan generation
    • produce a structured plan with steps, required tools, and completion checks
    • attach budgets such as max_steps, max_cost, or required approvals
  3. Execution
    • run one step at a time through the same validated loop from 21/05.md
    • record outputs and mark steps complete, blocked, or failed
  4. Re-planning
    • revise the remaining plan when a step fails, new evidence arrives, or a dependency changes

def execute_task(task, tools):
    # planner and executor are assumed components; the plan is a structured
    # object with steps, dependencies, and completion checks
    plan = planner.create(task)

    while not plan.done():
        step = plan.next_ready_step()             # respects step dependencies
        result = executor.run(step, tools=tools)  # the validated loop from 21/05.md
        plan.record(step_id=step.id, result=result)  # complete, blocked, or failed

        # revise the remaining plan when a step fails or new evidence arrives
        if result.requires_replan:
            plan = planner.revise(task=task, history=plan.history())

    return plan.final_output()
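
The plan object that loop relies on can be made concrete. Below is a minimal sketch; the field names, the `max_steps` budget, and the dependency logic are illustrative assumptions, not a specific framework's API:

```python
from dataclasses import dataclass, field

@dataclass
class PlanStep:
    id: str
    action: str
    depends_on: list = field(default_factory=list)  # step ids that must finish first
    success_check: str = ""                         # how the executor knows it worked
    status: str = "pending"                         # pending | done | blocked | failed

@dataclass
class Plan:
    steps: list
    max_steps: int = 20    # budget: hard cap enforced by the orchestrator
    executed: int = 0

    def next_ready_step(self):
        # a step is ready when it is pending and all dependencies are done
        done = {s.id for s in self.steps if s.status == "done"}
        for s in self.steps:
            if s.status == "pending" and all(d in done for d in s.depends_on):
                return s
        return None

    def record(self, step_id, status):
        self.executed += 1
        if self.executed > self.max_steps:
            raise RuntimeError("step budget exceeded")  # force re-plan or abort
        for s in self.steps:
            if s.id == step_id:
                s.status = status

    def done(self):
        return all(s.status == "done" for s in self.steps)

# stolen-laptop example: replacement cannot be ordered before identity is verified
plan = Plan(steps=[
    PlanStep("verify", "verify user identity", success_check="identity confirmed"),
    PlanStep("revoke", "disable device access", depends_on=["verify"]),
    PlanStep("order", "order replacement laptop", depends_on=["verify", "revoke"]),
])
step = plan.next_ready_step()  # -> the "verify" step, the only one with no unmet deps
```

Because dependencies and status live in a typed object rather than free-form text, every step transition can be logged and replayed during debugging.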

In practice:

  • plans are stored as structured objects (often JSON) so every step and revision can be logged
  • budgets such as max_steps and max_cost are enforced by the orchestrator, not trusted to the model
  • simple requests skip planning entirely and run through the base loop

The trade-off is clear: Explicit planning improves traceability and long-horizon task quality, but it adds latency, another model decision point, and more orchestration code.

A useful mental model is: Treat the planner like a dispatcher that writes a work order, not like a genius narrator improvising the whole mission.

Use this lens when:

  • tasks span many dependent steps or require approvals before high-impact actions
  • you need to audit why the agent chose a particular sequence of actions
  • failures should trigger targeted re-planning instead of starting over

Concept 2: Memory Is Selective Persistence, Not Transcript Hoarding

For example, an employee returns the next day and asks, "Did security approve the replacement yet?" The agent should remember the ticket ID, device serial number, and approved shipping address. It should not blindly trust a stale summary from an old chat or replay an entire transcript to recover those facts.

At a high level, memory is useful only if the system can decide what to retain, how long to trust it, and how to retrieve it without overwhelming the model. Dumping old conversations back into the prompt is not memory architecture. It is context stuffing.

Mechanically: Useful agent memory usually separates at least three layers:

  1. Working memory
    • the current run state: latest messages, retrieved documents, tool outputs, current plan
    • short-lived and usually discarded after the task ends
  2. Episodic memory
    • summaries of past interactions or completed cases
    • useful for resuming work and avoiding repeated questions
  3. Semantic memory
    • stable facts such as user preferences, system mappings, or policy-linked attributes
    • often stored as typed records rather than natural-language summaries

The write path matters as much as retrieval:

  • decide explicitly what is worth persisting when a task ends
  • store provenance with every record: source, timestamp, and confidence
  • set expiration or invalidation rules so stale facts cannot masquerade as fresh ones
  • gate sensitive attributes behind privacy and compliance policy before they are written
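
A minimal sketch of such a write path and a staleness-aware read path, assuming an in-process store (the record fields, the seven-day `max_age_s` default, and the confidence threshold are illustrative assumptions):

```python
import time
from dataclasses import dataclass, field

@dataclass
class MemoryRecord:
    key: str
    value: str
    kind: str                  # "episodic" or "semantic"
    source: str                # provenance: which run or tool produced it
    written_at: float = field(default_factory=time.time)
    confidence: float = 1.0

class MemoryStore:
    def __init__(self):
        self.records = []

    def remember(self, record):
        # write path: only typed records with provenance get persisted
        self.records.append(record)

    def recall(self, key, max_age_s=7 * 24 * 3600, min_confidence=0.5):
        # retrieval policy: filter out stale or low-confidence facts
        # before anything reaches the model's context
        now = time.time()
        return [
            r for r in self.records
            if r.key == key
            and now - r.written_at <= max_age_s
            and r.confidence >= min_confidence
        ]

store = MemoryStore()
store.remember(MemoryRecord("ticket_id", "IT-4821", kind="episodic", source="run-001"))
fresh = store.recall("ticket_id")  # stale or low-confidence entries never surface
```

The point of the sketch is the asymmetry: `remember` is deliberate and typed, while `recall` applies policy at read time instead of trusting whatever was written.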

In practice:

  • durable facts are stored as typed records with provenance and age, not free-text summaries
  • retrieval filters by recency and confidence before anything reaches the prompt
  • high-impact writes require confirming old memories with the user before acting on them

The trade-off is clear: Memory improves continuity and personalization, but it can also amplify stale facts, hidden bias, and compliance risk if stored carelessly.

A useful mental model is: A good agent memory system looks more like a small database with retrieval policy than like a diary.

Use this lens when:

  • users return across sessions and expect the agent to resume where it left off
  • the same facts keep being re-asked or re-derived at cost
  • stored personal data carries privacy or compliance obligations

Concept 3: Multi-Agent Systems Need Role Boundaries and Handoff Contracts

For example, in the stolen-laptop case, one specialist agent may handle identity and access revocation, another may handle procurement policy, and a coordinator may decide which task comes next. That split can reduce prompt overload and isolate permissions. It can also create new failure modes if agents keep asking each other for context that was never structured clearly.

At a high level, a multi-agent system is not automatically better because multiple models are talking. It is useful when roles have meaning: different tools, different authority levels, different evaluation criteria, or different context windows.

Mechanically: A production-safe multi-agent pattern usually includes:

  1. Coordinator
    • owns the global task state
    • assigns work to specialists based on plan state or classification
  2. Specialist agents
    • each gets a narrow toolset and prompt focused on one domain
    • returns structured outputs instead of open-ended conversation
  3. Handoff contract
    • shared task ID, objective, allowed actions, expected output schema, and timeout budget
  4. Shared memory or state store
    • keeps the canonical record so agents are not relying on each other's paraphrases

The simplest useful pattern is often not "many agents debating." It is "one coordinator routing structured sub-tasks to specialists with clear contracts."
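
The handoff contract in step 3 can be sketched as a typed payload plus a coordinator check that enforces the specialist's tool boundary. The agent names, tool names, and fields below are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Handoff:
    task_id: str               # shared task ID across all agents
    objective: str
    allowed_actions: tuple     # the only tools the specialist may use
    output_schema: str         # name of the expected structured result
    timeout_s: int             # budget for the sub-task

# each specialist gets a narrow, pre-declared toolset
SPECIALISTS = {
    "access": {"revoke_device", "revoke_sessions"},
    "procurement": {"create_order", "check_policy"},
}

def route(handoff, domain):
    # coordinator check: refuse handoffs that exceed the specialist's toolset,
    # so permission isolation holds even if the planner made a mistake
    allowed = SPECIALISTS[domain]
    illegal = set(handoff.allowed_actions) - allowed
    if illegal:
        raise PermissionError(f"{domain} agent may not perform {sorted(illegal)}")
    return {"assigned_to": domain, "task_id": handoff.task_id}

h = Handoff(
    task_id="IT-4821",
    objective="disable access for stolen laptop",
    allowed_actions=("revoke_device", "revoke_sessions"),
    output_schema="AccessRevocationResult",
    timeout_s=120,
)
assignment = route(h, "access")
```

Because the contract is data rather than conversation, the receiver can act without re-deriving the whole problem, and every handoff can be logged against the shared task ID.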

In practice:

  • a coordinator routes structured sub-tasks; specialists rarely talk to each other directly
  • every handoff carries the shared task ID and an expected output schema
  • the canonical task record lives in shared state, not in any one agent's context window

The trade-off is clear: Multi-agent designs can improve specialization and permission isolation, but they add latency, orchestration failure modes, and more surfaces to monitor.

A useful mental model is: Design it like a service architecture. Each agent is a bounded component, not a free-floating personality.

Use this lens when:

  • different steps require different tools, permissions, or authority levels
  • a single prompt would have to hold too many instructions and tools at once
  • you can define handoff contracts precise enough that a receiver can act without re-deriving context


Troubleshooting

Issue: The planner generates long plans full of speculative steps that the executor never needs.

Why it happens / is confusing: The planner is rewarded for sounding comprehensive instead of producing the minimum executable plan, or the schema does not force clear completion criteria.

Clarification / Fix: Keep plan objects small, require each step to name its dependency and success condition, and re-plan incrementally instead of asking for a perfect end-to-end plan upfront.
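
One way to enforce that fix mechanically is a small validator run before execution. This is a sketch assuming plan steps are dicts; the field names and the step cap are illustrative assumptions:

```python
def validate_plan(steps, max_len=8):
    # reject bloated plans and steps that lack a dependency or success condition,
    # returning a list of problems for the planner to fix via incremental re-plan
    problems = []
    if len(steps) > max_len:
        problems.append(
            f"plan has {len(steps)} steps; cap is {max_len}, re-plan incrementally"
        )
    for i, step in enumerate(steps):
        if not step.get("success_condition"):
            problems.append(f"step {i} has no success condition")
        if i > 0 and not step.get("depends_on"):
            problems.append(f"step {i} does not name its dependency")
    return problems

good = [
    {"action": "verify identity", "success_condition": "identity confirmed"},
    {"action": "revoke access", "success_condition": "device disabled",
     "depends_on": ["0"]},
]
bad = good + [{"action": "maybe email the CEO"}]  # speculative step, no criteria
print(validate_plan(good))  # -> []
```

Rejecting the plan object before execution keeps speculative steps out of the loop instead of hoping the executor skips them.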

Issue: Memory retrieval keeps surfacing stale facts, so the agent acts on outdated information.

Why it happens / is confusing: The system stores natural-language summaries without timestamps, confidence, or invalidation rules, so old conclusions look as trustworthy as fresh facts.

Clarification / Fix: Store memory with provenance and age, prefer typed records for durable facts, and require confirmation before using old memories for high-impact writes.

Issue: Multiple agents bounce a task back and forth or duplicate the same tool calls.

Why it happens / is confusing: Role boundaries are vague, shared state is incomplete, or handoff payloads omit what the next agent actually needs.

Clarification / Fix: Give each agent a narrow mission, maintain a canonical shared task record, and make handoffs structured enough that the receiver can act without re-deriving the whole problem.


Advanced Connections

Connection 1: Advanced Agent Patterns <-> Agent Fundamentals

21/05.md introduced the bounded observe-decide-act loop. This lesson does not replace that loop. It composes it:

  • the planner decides which iteration of the loop runs next and under what budget
  • memory decides what state enters and leaves each iteration
  • multi-agent coordination runs several bounded loops side by side, joined by handoff contracts

If the base loop is not typed, budgeted, and observable, advanced patterns only spread instability across more components.

Connection 2: Advanced Agent Patterns <-> Production Agent Systems

21/07.md is the operational consequence of this lesson. Once you add plans, memory stores, and handoffs, you also add:

  • new failure modes: runaway re-planning, stale memory writes, handoff loops
  • new artifacts to log and audit: plan revisions, memory records, inter-agent messages
  • new permission boundaries that need explicit safety controls and monitoring

Advanced patterns increase capability, but they also expand the safety and observability surface.



Key Insights

  1. Planning is an execution artifact, not just a prompt trick - the value comes from inspectable steps, budgets, and re-planning rules.
  2. Memory quality depends on the write path - durable memory needs provenance, structure, and expiration, not just retrieval.
  3. Multi-agent systems are architecture, not aesthetics - they pay off when roles, permissions, and context boundaries are real enough to justify orchestration overhead.
