
Video 23.6: Multi-Agent in the SDK

Course: Claude Code - Parallel Agent Development (Course 4)
Section: 23: Orchestration and Best Practices
Video Length: 4–5 minutes
Presenter: Daniel Treasure


Opening Hook

So far, you've been orchestrating agents through the CLI and config files. But what if you want to build a custom UI for agent management? Or programmatically create teams, assign tasks, and synthesize results? That's where the Claude Code SDK comes in—Python and TypeScript libraries that let you control multi-agent workflows from your own code.


Key Talking Points

What to say:

What the SDK Provides

  • Programmatic control over team creation, agent assignment, and task management.
  • Support for multiple composition patterns: sequential (agents work one after another), concurrent (agents work in parallel), handoff (agents pass work between them), and group chat (agents discuss in a shared context).
  • Permission callbacks: decide at runtime what each agent is allowed to do.
  • Result aggregation: combine findings from multiple agents into a single output.

SDK Languages

  • Python SDK: Most mature, full feature support, good for data processing and backend orchestration.
  • TypeScript SDK: Runs on Node.js, good for web-based UIs and integrations.
  • The two APIs are nearly identical, so you can choose based on your ecosystem.

Composition Patterns

  1. Sequential: Agent A finishes → Agent B starts with Agent A's results.
     • Use case: Feature → Tests → Documentation (chain of work).
     • Can cost less than running in parallel, because each agent works from the previous agent's summary and keeps a smaller context.

  2. Concurrent: All agents start at the same time and work independently; results are collected at the end.
     • Use case: Build 4 features in parallel, merge results.
     • Higher cost (4 agents) but faster wall-clock time.

  3. Handoff: Agent A completes a task → explicitly passes work to Agent B.
     • Use case: Agent A builds feature → handoff to Agent B for review/hardening.
     • Agents can reject a handoff if prerequisites aren't met.

  4. Group Chat: Agents discuss in a shared context, taking turns speaking.
     • Use case: Architecture discussion (Agent A proposes, Agent B critiques, Agent C synthesizes).
     • Can cost less than concurrent (one shared context), but slower because agents take turns.
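
Handoff and group chat are the two patterns the demo does not cover, so here is a minimal sketch of their control flow using only the standard library. Everything in it (`Handoff`, `review_agent`, `run_group_chat`, and the three speaker functions) is hypothetical and stands in for real agent calls; none of it is the SDK's API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Handoff:
    """Work passed from one agent to the next (hypothetical structure)."""
    task: str
    files: List[str] = field(default_factory=list)
    tests_passed: bool = False


def review_agent(handoff: Handoff) -> str:
    """Stand-in reviewer that rejects a handoff whose prerequisites are unmet."""
    if not handoff.files:
        return "rejected: no files to review"
    if not handoff.tests_passed:
        return "rejected: tests must pass before review"
    return f"accepted: reviewing {len(handoff.files)} file(s)"


# Each group-chat participant sees the whole transcript and returns one message.
Speaker = Callable[[List[str]], str]


def run_group_chat(speakers: List[Speaker], topic: str, rounds: int = 1) -> List[str]:
    """Round-robin discussion over a single shared transcript."""
    transcript = [f"topic: {topic}"]
    for _ in range(rounds):
        for speak in speakers:
            transcript.append(speak(transcript))
    return transcript


def proposer(transcript: List[str]) -> str:
    return "proposer: use a message queue"


def critic(transcript: List[str]) -> str:
    return f"critic: responding to '{transcript[-1]}'"


def synthesizer(transcript: List[str]) -> str:
    return f"synthesizer: summary of {len(transcript)} messages"


handoff_result = review_agent(Handoff(task="payment retry", files=["src/retry.py"]))
chat = run_group_chat([proposer, critic, synthesizer], topic="architecture")
```

The handoff is rejected here because `tests_passed` is still false. The group chat grows one shared transcript, which is why the discussion is sequential even though several agents take part.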

Permission Callbacks

  • The SDK lets you define a callback function: when an agent tries to do X, your callback decides yes or no.
  • Example: "Before this agent can delete a file, ask me (the user) for permission."
  • Callbacks run in your code, not the agent's, so you control the gate.

Building Custom Orchestration UIs

  • With the SDK, you can build a web app that manages agents:
     • Dashboard showing running agents and their progress.
     • Task queue UI: users submit work, agents claim tasks.
     • Results UI: view synthesized output from multi-agent work.
  • Useful for organizations where agents are a shared resource.
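
The "users submit work, agents claim tasks" idea can be sketched with a shared queue and a few worker threads. This is only an illustration of the claiming mechanism: `run_workers` and the worker names are hypothetical, and a real worker would call an agent where this one merely records the claim.

```python
import queue
import threading
from typing import Dict, List


def run_workers(subjects: List[str], worker_names: List[str]) -> Dict[str, str]:
    """Let workers claim tasks from a shared queue; return {subject: worker}."""
    pending = queue.Queue()
    for subject in subjects:
        pending.put(subject)

    claimed: Dict[str, str] = {}
    lock = threading.Lock()

    def worker(name: str) -> None:
        while True:
            try:
                subject = pending.get_nowait()  # claim the next task, if any
            except queue.Empty:
                return  # nothing left to claim
            # A real worker would run the agent here; we only record the claim.
            with lock:
                claimed[subject] = name
            pending.task_done()

    threads = [threading.Thread(target=worker, args=(n,)) for n in worker_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return claimed


claims = run_workers(["auth", "database", "api"], ["agent_1", "agent_2"])
```

Because the queue hands each item to exactly one consumer, no task is claimed twice. Which worker gets which task depends on thread scheduling.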


What to show on screen:

  1. SDK installation and imports
     • Show pip/npm install commands.
     • Show importing the SDK into a Python/TS file.

  2. Sequential composition example
     • Show code creating two agents, chaining their work.
     • Show the first agent's output becoming the second agent's input.

  3. Concurrent composition example
     • Show code creating 4 agents, starting them all at once.
     • Show result aggregation: combining outputs.

  4. Permission callback example
     • Show a callback function that intercepts an agent's action.
     • Show the callback either allowing or blocking the action.

  5. Custom UI mockup (or live demo)
     • Show a dashboard with running agents, task queue, results.
     • Explain how it's powered by the SDK.

Demo Plan

Scenario: You're building a task distribution system where teams can submit coding tasks, agents pick them up, and results are synthesized. You'll use the Python SDK to orchestrate the agents and expose a simple web API.

Timing: ~4 minutes

Step 1: Install and Set Up (30 seconds)

  • Show installation:

```bash
pip install anthropic-claude-code-sdk
```

  • Show a basic setup:

```python
from claude_code_sdk import ClaudeCodeClient, Team, Agent, Task

client = ClaudeCodeClient(api_key="your_api_key")
```

  • Explain: "The SDK is just a Python library. Import it, instantiate a client, and you can orchestrate agents programmatically."

Step 2: Create a Team Programmatically (45 seconds)

  • Show code to create agents and a team:

```python
team = client.create_team(
    name="backend-team",
    agents=[
        Agent(name="feature_agent", model="claude-sonnet", budget_usd=25),
        Agent(name="test_agent", model="claude-haiku", budget_usd=15),
        Agent(name="docs_agent", model="claude-haiku", budget_usd=10),
    ],
)

print(f"Created team: {team.id}")
print(f"Agents: {[a.name for a in team.agents]}")
```

  • Explain: "Instead of editing config files, you define teams in code. This is useful if you're dynamically creating teams or integrating with other systems."

Step 3: Sequential Composition (60 seconds)

  • Show code for a sequential workflow:

```python
# Step 1: Feature agent implements a feature
feature_task = Task(
    subject="Implement payment retry logic",
    description="Add exponential backoff retry for failed payments",
    assigned_to="feature_agent",
)

feature_result = client.run_task(team_id=team.id, task=feature_task)
print(f"Feature agent completed: {feature_result.status}")
print(f"Files created: {feature_result.files_changed}")

# Step 2: Test agent writes tests for the feature
test_task = Task(
    subject="Write tests for payment retry logic",
    description=f"Write comprehensive tests. Feature agent created: {feature_result.summary}",
    assigned_to="test_agent",
)

test_result = client.run_task(team_id=team.id, task=test_task)
print(f"Test agent completed: {test_result.status}")

# Step 3: Docs agent documents the feature
docs_task = Task(
    subject="Document payment retry logic",
    description=f"Document the new feature. Tests pass: {test_result.passed}",
    assigned_to="docs_agent",
)

docs_result = client.run_task(team_id=team.id, task=docs_task)

# Result: Sequential workflow complete
total_cost = feature_result.cost + test_result.cost + docs_result.cost
print("Sequential workflow complete!")
print(f"Total tasks: 3, Total cost: ${total_cost:.2f}")
```

  • Explain: "This is sequential: feature → tests → docs. Each task gets the previous agent's work as context. There's no parallelization, but each agent only receives a summary of the previous step, so its context stays small and the cost stays lower."
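
The chaining in this step generalizes to any number of stages. The sketch below shows the idea without the SDK: `run_chain` and `fake_agent` are hypothetical names, and `fake_agent` stands in for a real agent call so the flow of summaries is easy to follow.

```python
from typing import Callable, List, Tuple

# (agent name, instruction) pairs, executed in order
Step = Tuple[str, str]


def run_chain(steps: List[Step], run_agent: Callable[[str, str], str]) -> List[str]:
    """Run steps in order, appending the previous summary to each prompt."""
    summaries: List[str] = []
    previous = ""
    for agent, instruction in steps:
        prompt = instruction if not previous else f"{instruction}\nPrevious step: {previous}"
        previous = run_agent(agent, prompt)
        summaries.append(previous)
    return summaries


def fake_agent(agent: str, prompt: str) -> str:
    """Stand-in for a real agent call; reports how many prompt lines it saw."""
    return f"{agent} done ({len(prompt.splitlines())} prompt lines)"


chain = run_chain(
    [
        ("feature_agent", "Implement retry"),
        ("test_agent", "Write tests"),
        ("docs_agent", "Write docs"),
    ],
    fake_agent,
)
```

Each stage sees only its own instruction plus one summary line from the stage before it, which is the "smaller context" the sequential pattern relies on.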

Step 4: Concurrent Composition (60 seconds)

  • Show code for parallel agents:

```python
# Four independent tasks, one per feature agent
# (assumes the team includes feature_agent_1 through feature_agent_4)
tasks = [
    Task(subject="Implement auth module", assigned_to="feature_agent_1"),
    Task(subject="Implement database layer", assigned_to="feature_agent_2"),
    Task(subject="Implement API endpoints", assigned_to="feature_agent_3"),
    Task(subject="Implement frontend", assigned_to="feature_agent_4"),
]

# Run all tasks concurrently
results = client.run_tasks_concurrent(team_id=team.id, tasks=tasks)

# Aggregate results
print("CONCURRENT RESULTS:")
total_cost = 0
all_files = {}

for result in results:
    print(f"✓ {result.task.subject}")
    print(f"  Status: {result.status}, Cost: ${result.cost:.2f}")
    all_files.update(result.files_changed)
    total_cost += result.cost

print(f"\nTotal files created: {len(all_files)}")
print(f"Total cost: ${total_cost:.2f}")
print("Wall-clock time: ~1 hour (vs. 4 hours sequential)")
```

  • Explain: "All 4 tasks start at the same time, and results are collected together. You pay for four agents instead of one, but wall-clock time drops to roughly a quarter when the tasks are similar in size."
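
When several tasks run at once, one failure should not discard the other results. Here is a minimal sketch of that collection step using `asyncio.gather` with `return_exceptions=True`. `fake_agent` and `run_all` are hypothetical stand-ins, and the failing "frontend" task is contrived for illustration.

```python
import asyncio
from typing import List, Tuple


async def fake_agent(subject: str) -> str:
    """Stand-in for a real agent call (hypothetical)."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    if subject == "frontend":
        raise RuntimeError("build failed")
    return f"{subject}: ok"


async def run_all(subjects: List[str]) -> Tuple[List[str], List[str]]:
    """Start every task at once, then separate successes from failures."""
    outcomes = await asyncio.gather(
        *(fake_agent(s) for s in subjects), return_exceptions=True
    )
    done: List[str] = []
    failed: List[str] = []
    # gather returns outcomes in the same order as its inputs
    for subject, outcome in zip(subjects, outcomes):
        if isinstance(outcome, Exception):
            failed.append(f"{subject}: {outcome}")
        else:
            done.append(outcome)
    return done, failed


done, failed = asyncio.run(run_all(["auth", "database", "api", "frontend"]))
```

Three results survive and the fourth is reported as a failure, so aggregation can proceed on partial output or trigger a retry.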

Step 5: Permission Callbacks (60 seconds)

  • Show a permission callback:

```python
def permission_callback(agent_name, action, resource):
    """
    Decide whether to allow an agent action.
    Called before the action is executed.
    """
    # Allow agents to read any file
    if action == "read":
        return True

    # For write operations, require human approval
    if action == "write":
        print(f"\nREQUEST: {agent_name} wants to write to {resource}")
        approval = input("Allow? (y/n): ").strip().lower()
        return approval == "y"

    # Deny other operations
    return False

# Assign the callback to the team
team.set_permission_callback(permission_callback)

# Now when an agent tries to write, your callback will be called
# (feature_task is the Task defined in Step 3)
result = client.run_task(team_id=team.id, task=feature_task)
```

  • Explain: "Callbacks let you intercept agent actions and decide in real-time. Useful for sensitive operations that need human approval."
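
An interactive `input()` prompt works in a terminal demo but not inside a web service. One possible alternative, sketched below, is a small policy table with a pluggable approver. The callback keeps the `(agent_name, action, resource)` shape used above; `POLICY`, `make_permission_callback`, and `approve_outside_payments` are hypothetical names introduced only for this illustration.

```python
from typing import Callable, Dict

# Hypothetical policy table: action -> "allow", "ask", or "deny"
POLICY: Dict[str, str] = {"read": "allow", "write": "ask", "delete": "deny"}

Approver = Callable[[str, str, str], bool]


def make_permission_callback(policy: Dict[str, str], approver: Approver):
    """Build a callback that consults the table, deferring "ask" to the approver."""
    def callback(agent_name: str, action: str, resource: str) -> bool:
        decision = policy.get(action, "deny")  # unknown actions are denied
        if decision == "allow":
            return True
        if decision == "ask":
            return approver(agent_name, action, resource)
        return False
    return callback


def approve_outside_payments(agent_name: str, action: str, resource: str) -> bool:
    """Stand-in for a human approval step (hypothetical rule)."""
    return not resource.startswith("src/payment")


callback = make_permission_callback(POLICY, approve_outside_payments)
```

Swapping the approver for a function that posts to a chat channel or an approval queue changes how a human is asked without touching the policy itself.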

Step 6: Build a Simple Orchestration API (60 seconds)

  • Show a Flask app that uses the SDK:

```python
from flask import Flask, request, jsonify
from claude_code_sdk import ClaudeCodeClient, Task

app = Flask(__name__)
client = ClaudeCodeClient(api_key="your_api_key")
team_id = "your_team_id"

@app.route("/tasks", methods=["POST"])
def submit_task():
    """Submit a coding task to the agent team."""
    data = request.json
    task = Task(
        subject=data["subject"],
        description=data["description"],
        assigned_to=data.get("assigned_to", "auto"),  # Auto-assign if not specified
    )

    result = client.run_task(team_id=team_id, task=task)
    return jsonify({
        "task_id": result.task.id,
        "status": result.status,
        "cost": result.cost,
        "output_files": list(result.files_changed.keys()),
    })

@app.route("/tasks/<task_id>", methods=["GET"])
def get_task_status(task_id):
    """Get the status of a task."""
    result = client.get_task_result(task_id)
    return jsonify({
        "task_id": task_id,
        "status": result.status,
        "cost": result.cost,
        "output": result.summary,
    })

@app.route("/teams/<team_id>/stats", methods=["GET"])
def get_team_stats(team_id):
    """Get team statistics."""
    stats = client.get_team_stats(team_id)
    # Avoid dividing by zero before any task has completed
    average = stats.total_cost / stats.tasks_completed if stats.tasks_completed else 0
    return jsonify({
        "agents": len(stats.agents),
        "total_cost": stats.total_cost,
        "tasks_completed": stats.tasks_completed,
        "average_cost_per_task": average,
    })

if __name__ == "__main__":
    app.run(debug=False, port=5000)
```

  • Explain: "Now you have a REST API for submitting tasks. Organizations can use this to distribute coding work to agents."

Step 7: Show Result Synthesis (45 seconds)

  • Show code for combining agent outputs:

```python
# Get results from multiple agents
results = [feature_result, test_result, docs_result]

# Synthesize into a single report
synthesis_task = Task(
    subject="Synthesize results",
    description=f"""
Feature agent created:
{feature_result.summary}

Test agent created:
{test_result.summary}

Docs agent created:
{docs_result.summary}

Create a single README that documents everything.
""",
    assigned_to="docs_agent",
)

final_result = client.run_task(team_id=team.id, task=synthesis_task)
print("Synthesized output:")
print(final_result.summary)
```

  • Explain: "The last agent takes all the results and synthesizes them into a coherent output. This is how you go from parallel work to a unified deliverable."
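
If the number of contributing agents varies, the synthesis prompt can be assembled from a mapping instead of a hand-written f-string. This is a small sketch of that idea; `build_synthesis_description` is a hypothetical helper, not part of the SDK.

```python
from typing import Dict


def build_synthesis_description(summaries: Dict[str, str], goal: str) -> str:
    """Format one labeled section per agent, followed by the final instruction."""
    sections = [
        f"{agent} produced:\n{summary.strip()}" for agent, summary in summaries.items()
    ]
    return "\n\n".join(sections + [goal])


description = build_synthesis_description(
    {"feature_agent": "Added retry logic\n", "test_agent": "12 tests, all passing"},
    goal="Create a single README that documents everything.",
)
```

The resulting string can be passed as the `description` of the synthesis task, and sections appear in the order the agents were added to the mapping.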

Code Examples & Commands

Example 1: Complete Sequential Workflow (Python)

#!/usr/bin/env python3
"""
Sequential agent workflow: Feature → Tests → Docs
"""

from claude_code_sdk import ClaudeCodeClient, Team, Agent, Task

def run_sequential_workflow():
    # Initialize client
    client = ClaudeCodeClient(api_key="sk-...")

    # Create team with three agents
    team = client.create_team(
        name="documentation-pipeline",
        agents=[
            Agent(name="builder", model="claude-sonnet", budget_usd=30),
            Agent(name="tester", model="claude-sonnet", budget_usd=20),
            Agent(name="documenter", model="claude-haiku", budget_usd=10),
        ]
    )

    print(f"Team created: {team.id}\n")

    # Step 1: Builder creates the feature
    print("Step 1: Builder creates feature...")
    build_task = Task(
        subject="Implement user authentication module",
        description="""
        Create src/auth.py with:
        - login(username, password) -> token
        - logout(token)
        - verify_token(token) -> user
        Use JWT tokens with 1-hour expiration.
        """,
        assigned_to="builder"
    )

    build_result = client.run_task(team_id=team.id, task=build_task)
    print(f"✓ Builder complete. Cost: ${build_result.cost:.2f}\n")

    # Step 2: Tester writes tests
    print("Step 2: Tester writes tests...")
    test_task = Task(
        subject="Write tests for authentication module",
        description=f"""
        Builder created:
        {build_result.summary}

        Write comprehensive tests in tests/test_auth.py:
        - Test successful login
        - Test invalid credentials
        - Test token expiration
        - Test logout
        Aim for >80% coverage.
        """,
        assigned_to="tester"
    )

    test_result = client.run_task(team_id=team.id, task=test_task)
    print(f"✓ Tester complete. Cost: ${test_result.cost:.2f}\n")

    # Step 3: Documenter writes docs
    print("Step 3: Documenter writes documentation...")
    docs_task = Task(
        subject="Document authentication API",
        description=f"""
        Builder created the auth module.
        Tester wrote tests (all passing: {test_result.passed}).

        Create docs/AUTH_API.md documenting:
        - Module overview
        - Function signatures
        - Usage examples
        - Error handling
        - Testing instructions
        """,
        assigned_to="documenter"
    )

    docs_result = client.run_task(team_id=team.id, task=docs_task)
    print(f"✓ Documenter complete. Cost: ${docs_result.cost:.2f}\n")

    # Summary
    total_cost = build_result.cost + test_result.cost + docs_result.cost
    print("="*50)
    print("WORKFLOW COMPLETE")
    print("="*50)
    print(f"Total cost: ${total_cost:.2f}")
    print(f"Tasks: 3, Agents: 3, Time: ~2-3 hours (sequential)")
    print("\nDeliverables:")
    print(f"- src/auth.py (by builder)")
    print(f"- tests/test_auth.py (by tester)")
    print(f"- docs/AUTH_API.md (by documenter)")

if __name__ == "__main__":
    run_sequential_workflow()

Example 2: Concurrent Agents (Python)

#!/usr/bin/env python3
"""
Concurrent agent workflow: 4 agents building features in parallel
"""

from claude_code_sdk import ClaudeCodeClient, Task
import asyncio

async def run_concurrent_workflow():
    client = ClaudeCodeClient(api_key="sk-...")
    team_id = "team_123"

    # Define 4 independent tasks
    tasks = [
        Task(
            subject="Implement authentication module",
            description="Create src/auth.py with login, logout, verify_token",
            assigned_to="auth_agent"
        ),
        Task(
            subject="Implement database layer",
            description="Create src/database.py with User and Post models",
            assigned_to="database_agent"
        ),
        Task(
            subject="Implement API endpoints",
            description="Create src/api.py with FastAPI endpoints",
            assigned_to="api_agent"
        ),
        Task(
            subject="Implement frontend UI",
            description="Create React components for login and dashboard",
            assigned_to="frontend_agent"
        ),
    ]

    print("Starting 4 concurrent agents...\n")

    # Run all tasks concurrently
    results = await client.run_tasks_concurrent_async(
        team_id=team_id,
        tasks=tasks
    )

    # Aggregate results
    print("="*50)
    print("CONCURRENT WORKFLOW RESULTS")
    print("="*50)

    total_cost = 0
    all_files = {}

    for i, result in enumerate(results, 1):
        print(f"\n{i}. {result.task.subject}")
        print(f"   Status: {result.status}")
        print(f"   Cost: ${result.cost:.2f}")
        print(f"   Files created: {len(result.files_changed)}")
        total_cost += result.cost
        all_files.update(result.files_changed)

    print(f"\n{'='*50}")
    print(f"Total files: {len(all_files)}")
    print(f"Total cost: ${total_cost:.2f}")
    print(f"Wall-clock time: ~1 hour (vs. 4 hours sequential)")
    print(f"Cost multiplier: 4x (concurrent agents)")

    # Synthesis: Final agent integrates everything
    print("\nRunning synthesis agent to integrate results...")
    synthesis_task = Task(
        subject="Integrate all components",
        description="""
        Auth, database, API, and frontend teams have completed their work.
        Create a main.py that:
        1. Initializes the database
        2. Starts the API server
        3. Ensures frontend is served
        Run integration tests to verify everything works together.
        """
    )

    synthesis_result = await client.run_task_async(
        team_id=team_id,
        task=synthesis_task
    )

    print(f"✓ Integration complete. Cost: ${synthesis_result.cost:.2f}")
    print(f"TOTAL PROJECT COST: ${total_cost + synthesis_result.cost:.2f}")

if __name__ == "__main__":
    asyncio.run(run_concurrent_workflow())

Example 3: Permission Callback System

#!/usr/bin/env python3
"""
Permission callbacks for controlling agent actions
"""

from claude_code_sdk import ClaudeCodeClient, Team, Agent, Task

class PermissionManager:
    def __init__(self):
        self.audit_log = []

    def check_permission(self, agent_name, action, resource):
        """
        Callback function called by SDK before agent action.
        Returns True (allow) or False (deny).
        """
        # Log the request
        self.audit_log.append({
            "agent": agent_name,
            "action": action,
            "resource": resource
        })

        # Allow reads without approval
        if action == "read":
            print(f"✓ {agent_name} reading {resource}")
            return True

        # For writes, require approval for sensitive files
        if action == "write":
            if self._is_sensitive(resource):
                print(f"\n⚠ REQUEST: {agent_name} wants to write to {resource}")
                print("This is a sensitive file. Require approval.")
                response = input("Allow? (y/n): ").lower()
                if response == "y":
                    print("✓ Approved\n")
                    return True
                else:
                    print("✗ Denied\n")
                    return False
            else:
                return True

        # Deny deletes by default
        if action == "delete":
            print(f"✗ Deny: {agent_name} cannot delete {resource}")
            return False

        return False

    def _is_sensitive(self, resource):
        """Check if resource is sensitive."""
        sensitive_patterns = [
            "config/secrets",
            "src/payment",
            ".env",
            "credentials"
        ]
        return any(pattern in resource for pattern in sensitive_patterns)

# Use it with the SDK
def main():
    client = ClaudeCodeClient(api_key="sk-...")
    perm_manager = PermissionManager()

    team = client.create_team(
        name="controlled-team",
        agents=[Agent(name="dev_agent", model="claude-sonnet")],
        permission_callback=perm_manager.check_permission
    )

    # When dev_agent tries to write to src/payment/stripe.py,
    # the permission callback will be called and require approval

    task = Task(
        subject="Fix payment processing",
        description="Debug and fix the payment processing logic"
    )

    result = client.run_task(team_id=team.id, task=task)

    # Print audit log
    print("\n" + "="*50)
    print("AUDIT LOG")
    print("="*50)
    for entry in perm_manager.audit_log:
        print(f"{entry['agent']} {entry['action']} {entry['resource']}")

if __name__ == "__main__":
    main()

Example 4: Task Distribution API (Flask)

#!/usr/bin/env python3
"""
REST API for distributing coding tasks to agent teams
"""

from flask import Flask, request, jsonify
from claude_code_sdk import ClaudeCodeClient, Task
import uuid
from datetime import datetime

app = Flask(__name__)
client = ClaudeCodeClient(api_key="sk-...")

# In-memory task store (use database in production)
task_store = {}
team_id = "production-team"

@app.route("/api/v1/tasks", methods=["POST"])
def submit_task():
    """
    Submit a new coding task.
    POST /api/v1/tasks
    {
      "subject": "Implement feature X",
      "description": "...",
      "assigned_to": "agent_1"  # optional
    }
    """
    # silent=True yields None (not an exception) for a missing or invalid JSON body
    data = request.get_json(silent=True) or {}

    if not data.get("subject"):
        return jsonify({"error": "subject required"}), 400

    # Create task
    task = Task(
        subject=data["subject"],
        description=data.get("description", ""),
        assigned_to=data.get("assigned_to", "auto")
    )

    task_id = str(uuid.uuid4())
    task_store[task_id] = {
        "task": task,
        "status": "queued",
        "submitted_at": datetime.now().isoformat(),
        "result": None
    }

    # Runs synchronously for simplicity; a production service would enqueue the task instead
    try:
        result = client.run_task(team_id=team_id, task=task)
        task_store[task_id]["status"] = "completed"
        task_store[task_id]["result"] = {
            "status": result.status,
            "cost": result.cost,
            "summary": result.summary
        }
    except Exception as e:
        task_store[task_id]["status"] = "failed"
        task_store[task_id]["error"] = str(e)

    return jsonify({
        "task_id": task_id,
        "status": task_store[task_id]["status"],
        "submitted_at": task_store[task_id]["submitted_at"]
    }), 201

@app.route("/api/v1/tasks/<task_id>", methods=["GET"])
def get_task(task_id):
    """Get task status and results."""
    if task_id not in task_store:
        return jsonify({"error": "task not found"}), 404

    entry = task_store[task_id]
    return jsonify({
        "task_id": task_id,
        "status": entry["status"],
        "submitted_at": entry["submitted_at"],
        "result": entry.get("result"),
        "error": entry.get("error")
    })

@app.route("/api/v1/teams/<team_id>/stats", methods=["GET"])
def get_team_stats(team_id):
    """Get team statistics."""
    completed_tasks = [t for t in task_store.values() if t["status"] == "completed"]
    total_cost = sum(t["result"]["cost"] for t in completed_tasks if t.get("result"))

    return jsonify({
        "team_id": team_id,
        "tasks_completed": len(completed_tasks),
        "total_cost": f"${total_cost:.2f}",
        "average_cost_per_task": f"${total_cost / len(completed_tasks) if completed_tasks else 0:.2f}"
    })

if __name__ == "__main__":
    app.run(debug=False, host="0.0.0.0", port=5000)
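
The handler in Example 4 blocks until the agent finishes. One way to return immediately and let clients poll for status is a thread pool, sketched below without Flask or the SDK. `submit_in_background` and `fake_run` are hypothetical names, and `fake_run` stands in for the blocking agent call.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

executor = ThreadPoolExecutor(max_workers=4)
task_store: Dict[str, dict] = {}


def submit_in_background(subject: str, run: Callable[[str], str]) -> str:
    """Record the task as queued, run it on a worker thread, return its ID at once."""
    task_id = str(uuid.uuid4())
    task_store[task_id] = {"status": "queued", "result": None, "error": None}

    def job() -> None:
        task_store[task_id]["status"] = "running"
        try:
            task_store[task_id]["result"] = run(subject)
            task_store[task_id]["status"] = "completed"
        except Exception as exc:  # keep the failure visible to pollers
            task_store[task_id]["error"] = str(exc)
            task_store[task_id]["status"] = "failed"

    executor.submit(job)
    return task_id


def fake_run(subject: str) -> str:
    """Stand-in for the blocking agent call (hypothetical)."""
    return f"finished {subject}"


task_id = submit_in_background("Implement feature X", fake_run)
executor.shutdown(wait=True)  # only for this demo; a server keeps the pool alive
```

In a web handler, the POST route would return the `task_id` right away, and the GET route would read `task_store[task_id]` to report progress.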

Gotchas & Tips

Gotcha 1: Blocking Callbacks

  • Your permission callback is slow (for example, it makes a database query for each action), so agents are blocked waiting on it.
  • Tip: Keep callbacks fast. Pre-cache permissions or use simple in-memory checks.

Gotcha 2: Mixing Composition Patterns

  • You interleave sequential and concurrent steps in the same phase, and it becomes unclear which results feed which agent.
  • Tip: Use one pattern per phase. If you need both, chain the phases explicitly: concurrent phase 1 → synthesis → concurrent phase 2.
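
The chained structure from the tip can be sketched with `asyncio`: each phase is a single `gather`, and the synthesis step sits between them as one sequential call. `fake_agent`, `concurrent_phase`, and `pipeline` are hypothetical names standing in for real agent runs.

```python
import asyncio
from typing import List


async def fake_agent(prompt: str) -> str:
    """Stand-in for a real agent call (hypothetical)."""
    await asyncio.sleep(0)
    return f"result({prompt})"


async def concurrent_phase(prompts: List[str]) -> List[str]:
    """Run every prompt in this phase at the same time."""
    return list(await asyncio.gather(*(fake_agent(p) for p in prompts)))


async def pipeline() -> List[str]:
    phase1 = await concurrent_phase(["auth", "database"])
    # Synthesis is a single sequential step between the two parallel phases.
    summary = await fake_agent("merge " + " + ".join(phase1))
    return await concurrent_phase([f"tests for {summary}", f"docs for {summary}"])


final = asyncio.run(pipeline())
```

Because phase 2 only starts after the synthesis result exists, there is never any doubt about which outputs feed which agent.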

Gotcha 3: Over-Engineering

  • You build a fancy orchestration UI that looks cool but adds complexity without value.
  • Tip: Start simple. CLI + config files might be enough. Add UIs only when the pain point is real.

Gotcha 4: API Rate Limits

  • You submit 1,000 tasks to the API in parallel. The SDK hits rate limits and tasks fail.
  • Tip: Queue tasks and rate-limit your submissions. Process them in batches.
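
One simple way to cap in-flight submissions is an `asyncio.Semaphore`. The sketch below is an assumption-laden illustration: `fake_submit` stands in for a real submission call, and the `peak` counter exists only to show that the cap holds.

```python
import asyncio
from typing import List

peak = 0       # highest number of submissions observed in flight
in_flight = 0  # submissions currently running


async def fake_submit(subject: str) -> str:
    """Stand-in for a real task submission (hypothetical)."""
    global peak, in_flight
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(0.001)  # simulate network latency
    in_flight -= 1
    return f"{subject}: done"


async def submit_limited(subjects: List[str], max_concurrent: int) -> List[str]:
    """Submit everything, but never more than max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(subject: str) -> str:
        async with semaphore:
            return await fake_submit(subject)

    return list(await asyncio.gather(*(guarded(s) for s in subjects)))


results = asyncio.run(
    submit_limited([f"task_{i}" for i in range(20)], max_concurrent=5)
)
```

All 20 tasks complete, but at most five are in flight at any moment. The right limit depends on the rate limits that apply to your account.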

Gotcha 5: Result Aggregation Failures

  • You're aggregating results from 4 agents, but one agent's output format is unexpected, and aggregation crashes.
  • Tip: Validate agent outputs before aggregation. Or use a synthesis agent to normalize results.
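
A minimal sketch of validating before aggregating, assuming each result arrives as a plain dictionary. The field names mirror the attributes used in the examples (`status`, `cost`, `files_changed`), but `REQUIRED_FIELDS`, `validate_result`, and `aggregate` are hypothetical helpers.

```python
from typing import Any, Dict, List

# Expected field names and types (an assumption for this sketch)
REQUIRED_FIELDS = {"status": str, "cost": (int, float), "files_changed": dict}


def validate_result(result: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the result is usable."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in result:
            problems.append(f"missing {name}")
        elif not isinstance(result[name], expected):
            problems.append(f"{name} has type {type(result[name]).__name__}")
    return problems


def aggregate(results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Sum costs and merge files from valid results; set aside the rest."""
    total_cost = 0.0
    files: Dict[str, Any] = {}
    rejected = []
    for index, result in enumerate(results):
        problems = validate_result(result)
        if problems:
            rejected.append((index, problems))
            continue
        total_cost += result["cost"]
        files.update(result["files_changed"])
    return {"total_cost": total_cost, "files": files, "rejected": rejected}


report = aggregate([
    {"status": "ok", "cost": 1.5, "files_changed": {"a.py": "+10"}},
    {"status": "ok", "cost": "2.00", "files_changed": ["b.py"]},
])
```

The malformed second result is reported in `rejected` instead of crashing the loop, so the caller can decide whether to retry it or hand it to a synthesis agent for normalization.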


Lead-out

You've now seen the full arc: from task decomposition, through hooks and quality gates, cost management, real-world case studies, enterprise patterns, and finally SDK-based orchestration. You have all the tools to build serious multi-agent systems. The final videos in this course are about putting it all together: advanced patterns, troubleshooting, and building for the future.


Reference URLs

  • Anthropic Claude Code SDK (Python): https://github.com/anthropics/claude-code-sdk-python
  • Anthropic Claude Code SDK (TypeScript): https://github.com/anthropics/claude-code-sdk-ts
  • SDK Documentation: https://claude.ai/docs/sdk
  • Flask Documentation: https://flask.palletsprojects.com/
  • FastAPI Documentation: https://fastapi.tiangolo.com/
  • Async Python: https://docs.python.org/3/library/asyncio.html

Prep Reading

  1. SDK documentation: Read through the examples and API reference.
  2. Python async patterns: If using async orchestration.
  3. API design best practices: If building a custom task distribution API.
  4. Composition pattern literature: Papers on agent choreography and orchestration.

Notes for Daniel

This video is technical and code-heavy, so pace it carefully. Show working code, explain what's happening, run it if possible. Viewers should see that SDK usage isn't magic—it's just an abstraction over what you've been doing in the CLI.

The key value of the SDK is automation: you can build tools that other people use. Show how a Flask API turns multi-agent work into a service.

Walk through each composition pattern and explain when to use it. Sequential is simpler and cheaper; concurrent is faster but costs more. That's the trade-off.

If you can, live-code a small example. Even a simple sequential workflow (feature → test → docs) is powerful to watch.

Emphasize that the SDK doesn't replace the CLI. They're complementary. CLI for ad-hoc work, SDK for automation.