1836 words Slides

8.3 Model Configuration

Course: Claude Code - Power User Section: Configuration Video Length: 3-4 minutes Presenter: Daniel Treasure


Opening Hook

Claude comes in three flavors: fast, balanced, and deep-thinking. Picking the right one for the job saves you time and money. Let's talk speed, reasoning, and cost trade-offs.


Key Talking Points

1. Claude's Three Models

  • Haiku (fast): Lightweight, speedy, perfect for quick edits and simple tasks
  • Sonnet (balanced): The sweet spot—fast enough, smart enough, general purpose
  • Opus (deep): Maximum reasoning, best for complex problems and research

What to say: "Think of Haiku as a sprint, Sonnet as a marathon runner, and Opus as a chess grandmaster. Pick the right one for the job."

What to show on screen: A comparison table: | Model | Speed | Reasoning | Cost | Best For | |-------|-------|-----------|------|----------| | Haiku | Fastest | Good | Cheapest | Quick fixes, simple tasks | | Sonnet | Fast | Excellent | Balanced | General coding, most tasks | | Opus | Slower | Maximum | Higher | Complex architecture, research |

2. Model Selection Methods

  • CLI flag: claude code --model haiku (overrides everything)
  • Environment variable: ANTHROPIC_MODEL=claude-3-5-sonnet-20241022
  • settings.json: "model": "claude-3-5-sonnet-20241022" at user or project level
  • Model aliases: fast → haiku, balanced → sonnet, deep → opus (convenient shortcuts)
  • Default behavior: Falls back through settings hierarchy, uses Sonnet if nothing specified

What to say: "CLI flags win. Then env vars. Then settings files. The hierarchy makes it easy to override when you need to."

What to show on screen: Open terminal and show each method: 1. claude code --model fast (CLI flag) 2. export ANTHROPIC_MODEL=claude-3-5-haiku-20241022 (env var) 3. Show settings.json with model field

3. Model Aliases and Shortcuts

  • fast = Haiku (great for quick iterations)
  • balanced = Sonnet (recommended default)
  • deep = Opus (use when you need serious reasoning)
  • Aliases make it easy to remember and switch
  • Can be configured per project

What to say: "Instead of remembering the full model names, use aliases. It's faster to type and clearer to read in your settings files."

What to show on screen: Show .claude/settings.json with "model": "fast" and explain that it maps to Haiku

4. Fallback Models and Reliability

  • --fallback-model flag: If your primary model is overloaded or unavailable, Claude Code tries the fallback
  • Useful for critical workflows (e.g., CI/CD)
  • Example: Use Opus as primary (best reasoning), Sonnet as fallback (always available)
  • Configured in settings or on CLI

What to say: "If you're running important automation, set a fallback. It's like a backup parachute—you probably won't need it, but it's there if the system gets busy."

What to show on screen: Show a CLI example: claude code --model opus --fallback-model sonnet

5. Speed vs. Cost vs. Reasoning Trade-offs

  • Haiku: 1x speed, 0.3x cost, good reasoning (80% of cases don't need more)
  • Sonnet: 2-3x slower than Haiku, 2x cost, excellent reasoning (handles 95% of tasks)
  • Opus: 3-4x slower than Haiku, 3-4x cost, maximum reasoning (for the hard problems)
  • Token cost varies: Haiku cheapest, Sonnet middle, Opus highest

What to say: "Your quota or API costs matter. Haiku is great for refactoring. Sonnet is your daily driver. Opus is for the really hard architectural decisions."

What to show on screen: Show a decision tree: - "Is this a quick code fix?" → Haiku - "Is this general coding work?" → Sonnet - "Do I need deep reasoning/architecture?" → Opus - "Am I using this in CI/CD?" → Sonnet + fallback

6. Fast Mode — Turbocharging Opus

Fast mode is a speed toggle for Opus 4.6 that delivers up to 2.5x faster output at significantly higher cost. Same model, same quality, same weights — just faster inference.

  • Enable: Type /fast in a session to toggle on/off
  • Persistent: Set "fastMode": true in settings.json for always-on
  • Indicator: The ↯ lightning bolt appears next to the prompt when active
  • Auto-switch: Enabling fast mode automatically switches to Opus 4.6 if you're on another model
  • Graceful fallback: When rate limits are hit, fast mode silently drops to standard Opus speed (↯ turns gray), then re-enables when cooldown expires

Cost reality: | | Standard Opus | Fast Mode | Multiplier | |---|---|---|---| | Input (≤200K context) | $5/MTok | $30/MTok | 6x | | Output (≤200K context) | $25/MTok | $150/MTok | 6x | | Input (>200K context) | $10/MTok | $60/MTok | 6x | | Output (>200K context) | $25/MTok | $225/MTok | 9x |

What to say: "Fast mode is like taking an Uber instead of the bus. Same destination, way faster, significantly more expensive. Use it when your time is worth more than the API cost — live debugging sessions, tight deadlines, rapid iteration. Don't leave it on for batch work or CI/CD."

What to show on screen: 1. Type /fast — show the ↯ indicator appear 2. Ask Claude to generate something (notice the speed) 3. Type /fast again to toggle off 4. Show "fastMode": true in settings.json

Combining with effort levels: - Fast mode + default effort = maximum quality at top speed (most expensive) - Fast mode + low effort = blazing fast on simple tasks (speed stacks) - Standard mode + low effort = fast AND cheap for easy stuff

Key caveats to mention: - Opus only — doesn't work with Sonnet or Haiku - Not available on Bedrock, Vertex, or Azure — Anthropic API only - Switching between fast and standard invalidates prompt cache (costs more on long contexts) - Not available for Batch API or Priority Tier - Requires "extra usage" enabled on your plan (Pro/Max/Team/Enterprise)

7. Practical Configuration Scenarios

  • Development machine: Use Sonnet by default, Haiku for quick fixes, Opus for architecture reviews
  • Team project: Set Sonnet in project settings, let individuals override on CLI
  • CI/CD: Use Opus (or Sonnet) with fallback for reliability
  • Cost-conscious: Haiku for routine tasks, Sonnet for important work

  • Fast mode user: Enable /fast for interactive work, disable for batch tasks

  • Budget example: Start with Sonnet, use Haiku for quick fixes, save Opus+fast for architecture decisions

What to say: "Your settings are flexible. Start with Sonnet, optimize from there based on what you actually use. And if you need to go fast, /fast is right there."

What to show on screen: Show 2-3 real project configs with different model choices and explain the reasoning


Demo Plan

  1. Show current model and where it comes from (45 sec)
  2. claude code --show-config (or equivalent) to show active model
  3. Show the hierarchy that determined it
  4. Explain which layer won

  5. Demo changing the model (1 min)

  6. Run a task with --model fast
  7. Show the speed difference
  8. Run the same task with --model deep
  9. Compare output quality and time taken

  10. Configure a fallback model (1 min)

  11. Add fallback to settings.json or show CLI flag
  12. Explain the use case (especially for CI/CD)
  13. Talk about reliability trade-offs

  14. Use aliases for convenience (45 sec)

  15. Show "model": "balanced" in settings
  16. Demonstrate how much easier it is to read vs full model name
  17. Explain you can mix aliases and full names

  18. Preview next video (30 sec)

  19. "Environment variables are how you configure behavior beyond just the model. Let's explore those."

Code Examples & Commands

CLI: Switch to fast model for a single task:

claude code --model fast "refactor this function"

CLI: Use Opus with Sonnet fallback:

claude code --model opus --fallback-model sonnet

Set environment variable (temporary):

export ANTHROPIC_MODEL=claude-3-5-haiku-20241022
claude code "quick task"

Set environment variable (permanent in ~/.bashrc or ~/.zshrc):

export ANTHROPIC_MODEL=claude-3-5-sonnet-20241022
export ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-3-5-haiku-20241022
export ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-20250514

User-level settings with aliases:

{
  "model": "balanced",
  "env": {
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "claude-3-5-haiku-20241022",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "claude-3-5-sonnet-20241022",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "claude-opus-4-20250514"
  }
}

Project-level settings with fallback:

{
  "model": "balanced",
  "fallbackModel": "fast",
  "description": "Sonnet for quality, fallback to Haiku if overloaded"
}

Development machine configuration:

{
  "model": "balanced",
  "env": {
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "claude-3-5-haiku-20241022",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "claude-opus-4-20250514"
  }
}

CI/CD pipeline configuration (strict reliability):

{
  "model": "balanced",
  "fallbackModel": "fast",
  "permissions": {
    "allow": ["bash(npm run test)", "bash(npm run build)"],
    "deny": ["bash(*)", "edit(*)", "WebFetch(*)"]
  }
}

Fast mode in settings.json (persistent):

{
  "model": "opus",
  "fastMode": true
}

Fast mode toggle during session:

/fast
→ Fast mode ON ↯
/fast
→ Fast mode OFF

Gotchas & Tips

  • Model names have versions: claude-3-5-sonnet-20241022 is the full name. Aliases are friendlier.
  • Fallback doesn't auto-retry: It kicks in if the primary is unavailable, not if output quality is bad.
  • Cost adds up: Using Opus for everything will drain your quota fast. Be intentional.
  • Alias resolution happens at runtime: You don't need to know the exact model name if using aliases.
  • Environment variables override settings.json: If you set ANTHROPIC_MODEL, it wins over any JSON config.
  • CLI flags win everything: --model always takes precedence.
  • Reasoning tokens: Opus uses extended thinking by default (if enabled). Haiku doesn't. That affects both speed and cost.
  • Fast mode kills prompt cache: Switching between fast and standard invalidates cached prompts. Enable fast at session start, not mid-conversation, to avoid re-processing your entire context at fast-mode pricing.
  • Fast mode graceful fallback: If you hit rate limits, fast mode automatically drops to standard speed (↯ turns gray). It re-enables when the cooldown expires — no intervention needed.
  • Fast mode requires extra usage: Your plan admin must enable "extra usage" in billing settings. Fast mode tokens are billed directly, not from your included quota.

Lead-out

Model selection is about matching the tool to the task. Most of the time Sonnet is perfect. But now you know when to switch for speed or power — and if you really need to move, /fast is right there. Next, we'll talk environment variables—the broader toolkit for configuring behavior.


Reference URLs

  • Claude models documentation: https://anthropic.com/docs/about-claude/models/latest
  • Model specifications and costs: https://anthropic.com/pricing
  • Claude Code configuration docs: https://claude.ai/docs/claude-code
  • Fast mode docs: https://code.claude.com/docs/en/fast-mode
  • Fast mode API reference: https://platform.claude.com/docs/en/build-with-claude/fast-mode

Prep Reading

  • Review the three Claude models (Haiku, Sonnet, Opus)
  • Understand the speed, cost, and reasoning differences
  • Know how model selection works (CLI > env > settings hierarchy)
  • Think about your own use cases (when would you want each model?)

Notes for Daniel: Model choice is personal and depends on context. Be encouraging about experimentation. Show the actual speed difference in a demo (maybe a complex refactoring with Sonnet vs Haiku). Emphasize that Sonnet is the safe default—most users should start there. The cost/speed trade-off will resonate with developers who hit API rate limits or bills. Fast mode is exciting to demo — the speed difference is visible and impressive. Just be honest about the cost: it's 6x, not free. The Uber analogy lands well.