Q: extending instrument_claude_agent_sdk() — scope review + landing strategy #1908

@oneryalcin

Hi @alexmojaki, I've been using instrument_claude_agent_sdk() for a few weeks and built out coverage in my fork
(oneryalcin/logfire@main) for gaps I hit (see https://github.com/oneryalcin/logfire/pulls?q=is%3Apr+is%3Aclosed). Before opening a PR I want to check scope and landing strategy with you, since you own the integration (#1799 / #1809).

Inventory

12 scoped changes, all under logfire/_internal/integrations/claude_agent_sdk.py plus adjacent semconv / scrubbing / docs / tests:

Stream-source coverage

  • Server-side tool blocks (web_search, web_fetch, code_execution, advisor, ...) as tool_call / tool_call_response parts on the chat span — currently dropped.
  • RateLimitEvent state transitions and MirrorErrorMessage as logs under invoke_agent.
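
To make the first bullet concrete, here is a minimal sketch of the block-to-part mapping. It is an illustration under assumptions, not the fork's code: `server_tool_block_to_part` is a hypothetical helper, blocks are assumed to arrive as plain dicts shaped like Anthropic API content blocks (`server_tool_use` with `id` / `name` / `input`, and `*_tool_result` with `tool_use_id` / `content`), and the part keys follow my reading of the OTel Gen AI message-part shape.

```python
from __future__ import annotations

from typing import Any


def server_tool_block_to_part(block: dict[str, Any]) -> dict[str, Any] | None:
    """Map a server-side tool block to a chat-span message part.

    Hypothetical helper: the input dict shape is an assumption, not the SDK's type.
    Returns None for blocks that are not server-side tool traffic.
    """
    block_type = str(block.get("type", ""))
    if block_type == "server_tool_use":
        # The model asked the server to run a tool (web_search, web_fetch, ...).
        return {
            "type": "tool_call",
            "id": block.get("id"),
            "name": block.get("name"),
            "arguments": block.get("input"),
        }
    if block_type.endswith("_tool_result") and "tool_use_id" in block:
        # The server's answer, correlated back to the call by tool_use_id.
        return {
            "type": "tool_call_response",
            "id": block["tool_use_id"],
            "response": block.get("content"),
        }
    return None
```

Anything the helper does not recognise returns None, so unknown block types keep flowing through whatever handling already exists.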

Root invoke_agent span enrichment

  • ClaudeAgentOptions session config (model / permission_mode / max_turns / effort / cwd / system_prompt / ...).
  • Per-turn identifiers from AssistantMessage on chat spans.
  • ResultMessage agent-run metadata closing the root span (subtype, errors, structured_output, model_usage, tools_used, ...).

Hook coverage — the 5 SDK hooks the integration doesn't surface today

  • UserPromptSubmit, Stop, PreCompact, Notification, PermissionRequest as logfire log records under invoke_agent.
  • can_use_tool outcomes (PermissionResult allow/deny) and PreToolUse.updatedInput mutations reflected on the execute_tool span's gen_ai.tool.call.arguments.
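
A rough sketch of the updatedInput handling in the second bullet, for discussion only. `apply_updated_input` is a hypothetical helper, span attributes are modelled as a plain dict, and the hook output is assumed to have been unwrapped already so that `updatedInput` sits at the top level; the two attribute names are the ones proposed in this issue.

```python
import json
from typing import Any

ARGS_KEY = "gen_ai.tool.call.arguments"
ORIGINAL_ARGS_KEY = "claude.tool_call.arguments.original"


def apply_updated_input(attributes: dict[str, Any], hook_output: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the span attributes reflecting a PreToolUse mutation.

    Hypothetical helper: `hook_output` is assumed to be a dict that may carry
    `updatedInput`; where that key really lives on the wire is not shown here.
    """
    result = dict(attributes)
    updated = hook_output.get("updatedInput")
    if updated is None:
        return result  # the hook did not mutate the tool input
    # Preserve what the model originally requested; only the first mutation sets it.
    result.setdefault(ORIGINAL_ARGS_KEY, result.get(ARGS_KEY))
    # Record what the tool actually saw.
    result[ARGS_KEY] = json.dumps(updated)
    return result
```

Using `setdefault` means that if several hooks rewrite the input in turn, the preserved original stays the model's first request rather than an intermediate value.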

Lifecycle + control methods (none currently traced)

  • connect, disconnect, set_model, set_permission_mode, rewind_files, stop_task, interrupt, mcp.reconnect, mcp.toggle.
  • receive_messages parity with receive_response (receive_messages is currently untraced).

Subagents (Task tool)

  • subagent {agent_type} span keyed by agent_id (opens on SubagentStart, closes on SubagentStop).
  • Task started / Task progress / Task {status} SystemMessage events as logs.
  • AgentDefinition metadata when configured (claude.agent.model, .system_prompt, .tools, ...).
  • Tool calls and per-turn chat spans inside a subagent re-parented under the subagent envelope (via agent_id / parent_tool_use_id).
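
The open/close and re-parenting behaviour above can be sketched as a small registry keyed by agent_id. This is a simplified stand-in rather than the integration's code: `FakeSpan` and `SubagentSpans` are hypothetical names, and a real version would hold actual tracing spans instead of the dataclass used here to keep the sketch runnable on its own.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FakeSpan:
    """Stand-in for a real tracing span so the sketch runs without a tracer."""

    name: str
    ended: bool = False

    def end(self) -> None:
        self.ended = True


@dataclass
class SubagentSpans:
    """Tracks open subagent spans by agent_id (hypothetical helper)."""

    open_spans: dict[str, FakeSpan] = field(default_factory=dict)

    def on_start(self, agent_id: str, agent_type: str) -> FakeSpan:
        # SubagentStart: open the envelope span and remember it by agent_id.
        span = FakeSpan(name=f"subagent {agent_type}")
        self.open_spans[agent_id] = span
        return span

    def on_stop(self, agent_id: str) -> FakeSpan | None:
        # SubagentStop: close the span; tolerate a stop with no matching start.
        span = self.open_spans.pop(agent_id, None)
        if span is not None:
            span.end()
        return span

    def parent_for(self, agent_id: str | None) -> FakeSpan | None:
        # Tool calls and chat turns carrying an agent_id attach under its span.
        return self.open_spans.get(agent_id) if agent_id else None
```

Events without an agent_id get no subagent parent from `parent_for`, so they would stay under the root invoke_agent span.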

Plumbing

  • New claude.* semconv constants in _internal/integrations/llm_providers/semconv.py.
  • Allowlist additions in BaseScrubber.SAFE_KEYS for model-generated and operator-set fields (text / errors / structured_output / usage / etc).
  • claude-agent-sdk>=0.1.62 dev pin + [tool.uv.exclude-newer-package] entry, matching the existing pattern.
  • New + refreshed VCR cassettes; ~3,500 lines of test additions; new section in docs/integrations/llms/claude-agent-sdk.md.

Size

~+6,000 / -400 against pydantic/logfire@main, scoped entirely to
this integration + plumbing. 98 tests pass, make format / make lint
/ make typecheck clean.

Two questions

  1. Scope — any of these you'd reject on principle? Specific ones I'd flag for sanity-check:
    • claude.stop.last_assistant_message / claude.subagent.last_assistant_message are captured via defensive .get() on undocumented wire fields (the SDK type definitions don't declare them; the integration silently skips if a future SDK release renames them).
    • Hook-event surfacing as logs adds noise to traces; some teams may
      want it gated behind a flag rather than always-on.
    • The gen_ai.tool.call.arguments overwrite on PreToolUse
      updatedInput mutation matches OTel Gen AI semconv (arguments =
      what the tool actually saw), and the original is preserved under
      claude.tool_call.arguments.original. I wanted to flag the
      semantic choice.
  2. Cadence — happy to ship as one big PR or as a series of 4–6 PRs
    mapped to the groups above (stream → root-span → hooks → lifecycle
    → subagents → plumbing), each 200–800 lines. Recent feature PRs in
    this repo trend in that range, so the series probably matches your
    review rhythm better, but I'll defer to your preference.
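
On the first scope flag under question 1, the defensive capture amounts to something like the sketch below. It is illustrative only: `optional_wire_attributes` is a hypothetical helper, and the wire key `last_assistant_message` is an assumption inferred from the attribute name, since the SDK type definitions don't declare the field.

```python
from typing import Any, Mapping


def optional_wire_attributes(hook_input: Mapping[str, Any]) -> dict[str, str]:
    """Collect attributes from undocumented wire fields, skipping silently if absent.

    Hypothetical helper; the key name below is an assumption, not a documented field.
    """
    attrs: dict[str, str] = {}
    value = hook_input.get("last_assistant_message")
    # Only record a non-empty string; a renamed or retyped field is ignored.
    if isinstance(value, str) and value:
        attrs["claude.stop.last_assistant_message"] = value
    return attrs
```

If a future SDK release renames or drops the field, the function returns an empty dict and the span simply lacks the attribute.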

No urgency on my side. Thanks for the original integration — solid
foundation to build on.
