Skip to content

test: rename test_runner_context.py to test_experiment.py

ef1aa71
Select commit
Loading
Failed to load commit list.
Open

feat(ci): add RunnerContext and RegressionError for experiment GH action #1635

test: rename test_runner_context.py to test_experiment.py
ef1aa71
Select commit
Loading
Failed to load commit list.
Claude / Claude Code Review completed Apr 22, 2026 in 9m 18s

Code review found 1 important issue

Found 6 candidates, confirmed 4. See review comments for details.

Details

Severity Count
πŸ”΄ Important 1
🟑 Nit 2
🟣 Pre-existing 0
Severity File:Line Issue
πŸ”΄ Important langfuse/experiment.py:1147-1150 RunnerContext.run_experiment cannot override run_name or _dataset_version back to None
🟑 Nit langfuse/experiment.py:1128-1130 Mutable default arguments in RunnerContext.run_experiment
🟑 Nit langfuse/experiment.py:1186-1202 RegressionError: missing self.message attribute and misleading partial-metric message

Annotations

Check failure on line 1150 in langfuse/experiment.py

See this annotation in the file changed.

@claude claude / Claude Code Review

RunnerContext.run_experiment cannot override run_name or _dataset_version back to None

In RunnerContext.run_experiment, run_name and _dataset_version use None as both the "not provided" sentinel and a legitimate override value, so calling ctx.run_experiment(run_name=None) when the context has self.run_name="pr-123-sha-abc" silently ignores the caller intent and uses the context default instead. The class docstring guarantee that "users can override any default by passing the corresponding argument explicitly" is violated for these two parameters when the desired override is None. 

Check warning on line 1130 in langfuse/experiment.py

See this annotation in the file changed.

@claude claude / Claude Code Review

Mutable default arguments in RunnerContext.run_experiment

The `evaluators` and `run_evaluators` parameters in `RunnerContext.run_experiment` use mutable list literals (`[]`) as default argument values, which is a Python anti-pattern β€” the same default object is shared across all calls at definition time. In current code no caller mutates these lists (they are only iterated), so this is not actively broken, but it mirrors the pre-existing anti-pattern in `Langfuse.run_experiment` (client.py:2377,2379) and should be changed to `None` with an in-body subs

Check warning on line 1202 in langfuse/experiment.py

See this annotation in the file changed.

@claude claude / Claude Code Review

RegressionError: missing self.message attribute and misleading partial-metric message

Two issues in `RegressionError.__init__`: (1) the `message` parameter is accepted but never stored as `self.message`, so `exc.message` raises `AttributeError` while every other field (`result`, `metric`, `value`, `threshold`) is correctly stored as an instance attribute; (2) when `metric` is provided but `value` and/or `threshold` are `None`, the format string produces `"Regression on `avg_accuracy`: None (threshold None)"` β€” confusingly showing Python's `None` where numeric data is expected. Fi