release: v0.0.19 #68
Merged
claude_code adapter — fix parse_session_jsonl dropping all assistant
turns from agent*_traj.json.
The parser treated message.role as authoritative and rejected any event
whose role wasn't in {user, assistant, system}. Recent claude-code
session writers emit assistant turns with message.role: None — the role
lives only in event.type — so every LLM turn (text / thinking /
tool_use) got filtered out, leaving traj files that contained only user
tool_result entries. Affected every claude_code run since the
session-format shift, including all 0.0.17 / 0.0.18 trajectories on
disk.
Fix: fall back to event.type when message.role is missing.
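
The fallback described above can be sketched roughly as follows. This is an illustration, not the actual code in parsers.py: the helper name resolve_role, the VALID_ROLES constant, and the exact event shape (a dict with a top-level "type" and a nested "message" dict) are assumptions made for the example.

```python
from typing import Any, Optional

# Roles the parser accepts, as listed in the description above.
VALID_ROLES = {"user", "assistant", "system"}


def resolve_role(event: dict[str, Any]) -> Optional[str]:
    """Return the role for one session event, or None if it has no usable role.

    Hypothetical helper: message.role wins when present; otherwise the
    top-level event type is used as the role.
    """
    message = event.get("message")
    if not isinstance(message, dict):
        message = {}

    role = message.get("role")
    if role is None:
        # Newer session writers leave message.role empty, so fall back
        # to the top-level event type.
        role = event.get("type")

    return role if role in VALID_ROLES else None
```

With this shape, an event such as {"type": "assistant", "message": {"role": None}} resolves to "assistant" instead of being dropped, while an explicit message.role is still preferred over event.type.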
On a representative session (anyhow_task/390/f1_f4/agent2_session.jsonl)
this recovers all 86 assistant events, taking the parsed trajectory
from 43 messages (all user) to 129 (43 user + 86 assistant).
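
One way to reproduce this kind of count is to tally raw events by their top-level type and compare the result with the parsed trajectory. The sketch below assumes each line of the session file is one JSON object with a "type" field; count_event_types is a hypothetical helper, not part of the repository.

```python
import json
from collections import Counter
from typing import Iterable


def count_event_types(lines: Iterable[str]) -> Counter:
    """Count session events by their top-level "type" field.

    Blank lines, lines that are not valid JSON, and non-object values
    are skipped.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict):
            counts[event.get("type", "<missing>")] += 1
    return counts
```

Passing an open session file to count_event_types and comparing counts["assistant"] with the number of assistant messages in the parsed trajectory gives a quick before/after check.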
The underlying *_session.jsonl and *_stream.jsonl files were always
complete — only the derived *_traj.json was wrong, so historical runs
can be re-parsed by calling parse_session_jsonl on the on-disk session
file.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Fixes claude_code adapter dropping every assistant turn from agent*_traj.json.

The bug
parse_session_jsonl (src/cooperbench/agents/claude_code/parsers.py) treated message.role as authoritative and rejected any event whose role wasn't in {user, assistant, system}. Recent claude-code session writers emit assistant turns with message.role: None (the role lives only in the top-level event.type), so every LLM turn (text, thinking, tool_use) got filtered out, leaving traj files that contained only user tool_result entries.

Affected every claude_code run since the session-format shift, including all 0.0.17 / 0.0.18 trajectories on disk.

The fix
One line: fall back to event.type when message.role is missing. Purely additive: existing message.role values still win.

Validation
Re-parsed the broken anyhow_task/390/f1_f4/agent2_session.jsonl (86 raw assistant events, 43 user events) with the fix: the parsed trajectory goes from 43 messages (all user) to 129 (43 user + 86 assistant).

Cross-checked 17 session files across recent CooperBench + CooperData runs:
- tool_use block counts match raw event counts exactly (28/28, 125/125, 44/44, 42/42 on spot-checked agents)
- Edit/Write tool_use calls

Other parsers untouched (parse_stream_json populates result.json fields, unaffected). Audited src/cooperbench/ for similar message.role-strict checks: none found in claude_code or runner/; swe_agent uses a different schema.

Test plan
- uv run ruff check src/cooperbench/
- uv run ruff format --check src/cooperbench/
- uv run python -m mypy src/cooperbench/
- uv run python -m pytest tests/ --tb=short -q (385 passed, 63 skipped)
- [tool_use ...] {...} / text / [tool_result] ...

Migration
Old agent*_traj.json files on disk are still wrong: they were written with the buggy parser. But the underlying *_session.jsonl and *_stream.jsonl are complete; the trajectory can be re-derived from *_session.jsonl by calling the fixed parse_session_jsonl on it.

🤖 Generated with Claude Code
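
A re-derivation pass over historical runs could look something like the sketch below. The signature of parse_session_jsonl is not shown in this PR, so the parser is passed in as a callable and assumed to take a session path and return JSON-serializable data; rederive_trajectories and the agent*_session.jsonl to agent*_traj.json naming rule are likewise assumptions based on the file names mentioned above.

```python
import json
from pathlib import Path
from typing import Any, Callable


def rederive_trajectories(root: Path, parse_fn: Callable[[Path], Any]) -> list[Path]:
    """Rewrite every agent*_traj.json under root from its session file.

    parse_fn stands in for the fixed parser. It is assumed (not confirmed
    by the PR) to accept a path and return JSON-serializable data.
    """
    written: list[Path] = []
    for session in sorted(root.rglob("agent*_session.jsonl")):
        # agent2_session.jsonl -> agent2_traj.json, in the same directory.
        traj = session.with_name(
            session.name.replace("_session.jsonl", "_traj.json")
        )
        traj.write_text(json.dumps(parse_fn(session), indent=2))
        written.append(traj)
    return written
```

Because this overwrites existing traj files in place, it may be worth running it against a copy of the run directory first.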