15.5 Monitoring & Analytics
Course: Claude Code - Enterprise Development
Section: Enterprise Deployment
Video Length: 2-5 minutes
Presenter: Daniel Treasure
Opening Hook
When you're running Claude Code across a team or in automation, you need visibility. How many tokens are you burning? Which agents are failing? Where are the bottlenecks? OpenTelemetry integration gives you traces, metrics, and logs — the same observability stack you already use for your production services.
Key Talking Points
1. Why Monitoring Matters at Scale
- Individual developer usage is self-evident — you see the results in real time
- Team and automation usage is invisible without monitoring
- You need to track: usage volume, cost, error rates, performance bottlenecks
- Same observability principles as any production system
What to say: "When it's just you using Claude Code, you can feel whether it's working. When it's 50 developers and 10 CI pipelines, you need dashboards. That's where OpenTelemetry comes in."
What to show on screen: A monitoring dashboard (Grafana, Datadog, or similar) showing Claude Code metrics.
2. OpenTelemetry Integration
- Claude Code supports OTLP telemetry export
- Enable with CLAUDE_CODE_ENABLE_TELEMETRY=1 and OTEL_METRICS_EXPORTER=otlp
- Collects traces (request flow), metrics (usage counts), and logs
- Feeds into any OTLP-compatible backend: Grafana, Datadog, New Relic, Honeycomb
What to say: "Claude Code speaks OpenTelemetry natively. Set two environment variables and it starts exporting traces and metrics to whatever OTLP backend you already run."
What to show on screen: Show the environment variable setup and a trace view in a monitoring tool.
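If the demo backend needs authentication, you can show a fuller export configuration. Treat this as a sketch: the endpoint and header value are placeholders, and OTEL_LOGS_EXPORTER, OTEL_EXPORTER_OTLP_PROTOCOL, and OTEL_EXPORTER_OTLP_HEADERS are standard OpenTelemetry exporter variables rather than anything verified as Claude Code-specific.
export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=otlp
export OTEL_LOGS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_ENDPOINT=https://otel-gateway.example.com:4317  # placeholder endpoint
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer <token>"  # placeholder credential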
3. Key Metrics to Track
- Request volume: How many prompts per day, per user, per pipeline
- Token consumption: Input and output tokens, by model tier
- Tool invocation rates: Which tools are used most, which fail most
- Response latency: How long requests take, where bottlenecks occur
- Error rates: Failed requests, permission denials, timeout events
What to say: "The metrics that matter most are token consumption and error rates. Token consumption tells you about cost. Error rates tell you about reliability. Everything else is optimization."
What to show on screen: A dashboard with panels for token usage, request volume, and error rates.
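If you want a dashboard-free way to eyeball a few of these numbers on camera, a rough jq roll-up over a local JSONL log works. This sketch assumes the metrics.jsonl format built in the Code Examples section below.
jq -s '{
  requests: length,
  input_tokens: (map(.input_tokens) | add),
  output_tokens: (map(.output_tokens) | add)
}' metrics.jsonl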
4. Alerting and Reporting
- Set alerts on error rate spikes, budget thresholds, unusual usage patterns
- Generate regular reports for cost planning and capacity management
- Detect anomalies: sudden token spikes could indicate runaway automation
- Use existing alerting infrastructure (PagerDuty, Slack notifications, etc.)
What to say: "The alert you definitely want: token spend exceeding your daily budget. A runaway automation loop can burn through tokens fast. Set a threshold and get notified before it gets expensive."
What to show on screen: An alert rule configuration for token budget threshold.
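If you want a concrete artifact beyond the alert rule screenshot, a minimal budget-check script works as a stand-in. This is only a sketch: it assumes the metrics.jsonl log from the Code Examples section, and the Slack webhook URL is a placeholder.
#!/usr/bin/env bash
# Sketch: post to Slack when the logged token total crosses a threshold.
# Assumes metrics.jsonl as written in the Code Examples section; webhook URL is a placeholder.
THRESHOLD=500000
SLACK_WEBHOOK="https://hooks.slack.com/services/XXX/YYY/ZZZ"
TOTAL=$(jq -s '[.[] | (.input_tokens // 0) + (.output_tokens // 0)] | add // 0' metrics.jsonl)
if [ "$TOTAL" -gt "$THRESHOLD" ]; then
  curl -s -X POST -H 'Content-type: application/json' \
    --data "{\"text\": \"Claude Code token usage at ${TOTAL} tokens (threshold ${THRESHOLD})\"}" \
    "$SLACK_WEBHOOK"
fi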
5. Controlling Telemetry
- Disable Anthropic's built-in telemetry: DISABLE_TELEMETRY=1
- Disable error reporting: DISABLE_ERROR_REPORTING=1
- These are separate from YOUR custom OTLP export
- Enterprise deployments may require disabling outbound telemetry for compliance
What to say: "There are two separate things here. Anthropic's built-in telemetry sends anonymized usage data back to Anthropic. YOUR OpenTelemetry export sends data to YOUR monitoring stack. You control both independently."
What to show on screen: Show the environment variables side by side, explaining the difference.
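For org-wide enforcement, one option is to pin these variables in Claude Code's managed settings file so individual developers can't flip them back on. Treat this as a sketch: the Linux path and the env key are from the settings documentation as I recall it, so verify against your platform and version before showing it.
# Sketch: pin telemetry controls centrally (verify path and schema for your platform)
sudo tee /etc/claude-code/managed-settings.json > /dev/null <<'EOF'
{
  "env": {
    "DISABLE_TELEMETRY": "1",
    "DISABLE_ERROR_REPORTING": "1"
  }
}
EOF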
Demo Plan
- Show /usage command — Built-in session and weekly usage tracking (~15 seconds)
- Enable OTLP export — Set environment variables (~15 seconds):
  export CLAUDE_CODE_ENABLE_TELEMETRY=1
  export OTEL_METRICS_EXPORTER=otlp
  export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
- Run some Claude Code commands — Generate telemetry data (~20 seconds)
- Show monitoring dashboard — View traces and metrics in Grafana/Datadog (~30 seconds)
- Show an alert rule — Token budget threshold alert (~15 seconds)
- Mention disabling telemetry — DISABLE_TELEMETRY=1 for compliance (~10 seconds)
Code Examples & Commands
Enable OpenTelemetry export:
export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
Check session usage:
/usage
Disable Anthropic telemetry (enterprise compliance):
export DISABLE_TELEMETRY=1
export DISABLE_ERROR_REPORTING=1
JSON output for custom metrics collection:
claude -p "analyze code" --output-format json 2>/dev/null | jq '{
model: .model,
input_tokens: .usage.input_tokens,
output_tokens: .usage.output_tokens
}' >> metrics.jsonl
Cost tracking in CI:
claude -p "review changes" --max-budget-usd 0.50 --output-format json | \
jq '.usage' >> /var/log/claude-usage.jsonl
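A possible follow-on, assuming the same log format as above: fail the pipeline outright when the accumulated usage log crosses a token ceiling.
# Sketch: hard stop for runaway CI usage (assumes the log written by the command above)
MAX_TOKENS=200000
TOTAL=$(jq -s '[.[] | (.input_tokens // 0) + (.output_tokens // 0)] | add // 0' /var/log/claude-usage.jsonl)
if [ "$TOTAL" -gt "$MAX_TOKENS" ]; then
  echo "Claude Code token ceiling exceeded: ${TOTAL} > ${MAX_TOKENS}" >&2
  exit 1
fi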
Gotchas & Tips
- OTLP endpoint must be reachable. If Claude Code can't reach your collector, it fails silently — you won't see errors, just missing data. Test connectivity first (a quick check is sketched after this list).
- /usage is per-session. It resets when you start a new session. For historical tracking, you need OTLP export to a persistent backend.
- Telemetry vs. error reporting are separate toggles. DISABLE_TELEMETRY stops Statsig analytics; DISABLE_ERROR_REPORTING stops Sentry crash reports. Set both for fully private operation.
- Token counts are in the JSON output. When using --output-format json, the response includes a usage object with input_tokens and output_tokens. You can parse this for your own cost tracking without OTLP.
- Runaway automation detection. Set up an alert for any single session that exceeds a token threshold. A misconfigured automation loop can burn budget fast.
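The connectivity check referenced in the first tip above can be as simple as a plain TCP probe against the default local gRPC port; adjust host and port for your collector.
nc -zv localhost 4317 || echo "OTLP collector not reachable; exported telemetry will be dropped silently"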
Lead-out
"Monitoring isn't glamorous, but it's what separates a toy deployment from a production one. Once you have visibility into how Claude Code is being used across your org, you can optimize costs, catch failures early, and prove the ROI. Next up — cost management, where we'll dig into the strategies for keeping token spend under control."
Prep Reading
- OpenTelemetry basics if unfamiliar (traces, metrics, logs)
- Review your org's existing monitoring stack for demo compatibility
- Test OTLP export with a local collector before recording
- Review /usage command output format
Notes for Daniel: This is an enterprise-audience video. The viewers are team leads and platform engineers, not individual developers. Focus on the "why" and the dashboard view rather than deep technical setup. If you can show a real Grafana dashboard with Claude Code metrics, that's the money shot. A pre-recorded dashboard walkthrough might be easier than live setup.