diff --git a/README.md b/README.md
index f8d67a7..1da00bd 100644
--- a/README.md
+++ b/README.md
@@ -6,7 +6,7 @@ When an AI agent crashes mid-task, what happens on restart? Without effect-log,
 
 ## The Core Idea
 
-Every tool declares its **effect kind** at registration time. This drives all recovery behavior:
+Every tool has an **effect kind** that drives all recovery behavior:
 
 | EffectKind | Recovery (completed) | Recovery (crashed) |
 |---|---|---|
@@ -18,36 +18,117 @@ Every tool declares its **effect kind** at registration time. This drives all re
 
 ## Quick Start
 
+### Auto mode — just pass functions
+
 ```python
-from effect_log import EffectKind, EffectLog, ToolDef
+from effect_log import EffectLog
+
+log = EffectLog.auto("task-001", tools=[search_db, send_email, upsert_record])
+log.execute("search_db", {"query": "Q4 revenue"})  # auto → ReadOnly
+log.execute("send_email", {"to": "ceo@co.com", ...})  # auto → IrreversibleWrite
+log.execute("upsert_record", {"id": "r-001", ...})  # auto → IdempotentWrite
+```
 
-def send_email(args):
-    return smtp.send(args["to"], args["subject"], args["body"])
+### Manual mode — explicit ToolDef for full control
+
+```python
+from effect_log import EffectKind, EffectLog, ToolDef
 
 tools = [
     ToolDef("read_file", EffectKind.ReadOnly, read_file),
     ToolDef("send_email", EffectKind.IrreversibleWrite, send_email),
     ToolDef("upsert", EffectKind.IdempotentWrite, upsert_record),
 ]
+log = EffectLog.manual("task-001", tools=tools, storage="sqlite:///effects.db")
+```
+
+### Recovery — just add `recover=True`
 
-log = EffectLog(execution_id="task-001", tools=tools, storage="sqlite:///effects.db")
-log.execute("read_file", {"path": "/tmp/report.csv"})
-log.execute("send_email", {"to": "ceo@co.com", "subject": "Report", "body": "..."})
-log.execute("upsert", {"id": "report-001", "data": data})
+```python
+log = EffectLog.auto("task-001", tools=[search_db, send_email, upsert_record],
+                     storage="sqlite:///effects.db", recover=True)
+log.execute("search_db", {"query": "Q4 revenue"})  # Replayed (fresh data)
+log.execute("send_email", {"to": "ceo@co.com", ...})  # Sealed — NOT re-sent
+log.execute("upsert_record", {"id": "r-001", ...})  # Replayed (idempotent)
 ```
 
-Recovery — just add `recover=True`, re-run the same steps:
+### Override when needed
+
+If the classifier gets something wrong, override just that tool:
 
 ```python
-log = EffectLog(execution_id="task-001", tools=tools, storage="sqlite:///effects.db", recover=True)
-log.execute("read_file", {"path": "/tmp/report.csv"})  # Replayed (fresh data)
-log.execute("send_email", {"to": "ceo@co.com", ...})  # Sealed — NOT re-sent
-log.execute("upsert", {"id": "report-001", ...})  # Replayed (idempotent)
+from effect_log import EffectKind, EffectLog
+
+log = EffectLog.auto("task-001",
+    tools=[search_db, send_email, process_order],
+    overrides={"process_order": EffectKind.IdempotentWrite}
+)
+```
+
+### Inspect classifications
+
+```python
+from effect_log import EffectKind, EffectLog, classify_tools
+
+report = classify_tools([search_db, send_email, process_order])
+print(report)
+# search_db -> ReadOnly (0.50) name
+# send_email -> IrreversibleWrite (0.50) name
+# process_order -> IrreversibleWrite (0.00) default (no signals)!!!
+
+# Apply with corrections
+tools = report.apply(overrides={"process_order": EffectKind.IdempotentWrite})
+log = EffectLog("task-001", tools=tools)
 ```
 
+### Hybrid mode (default constructor)
+
+The default constructor accepts a mix of callables and ToolDefs:
+
+```python
+from effect_log import EffectLog, ToolDef, EffectKind
+
+log = EffectLog("task-001", tools=[
+    search_db,  # auto-classified
+    ToolDef("send_email", EffectKind.IrreversibleWrite, send_email),  # explicit
+])
+```
+
+### Decorators
+
+```python
+from effect_log import EffectKind, auto_tool, tool
+
+@tool(effect=EffectKind.ReadOnly)  # explicit
+def read_file(args): ...
+
+@tool()  # auto-classified
+def search_db(args): ...
+
+@auto_tool  # shorthand for @tool()
+def fetch_data(args): ...
+```
+
+## Auto-Classification
+
+effect-log classifies tools using a 4-layer weighted heuristic:
+
+| Layer | Signal | Weight | Example |
+|---|---|---|---|
+| **Name prefix** | `func.__name__` matched against prefix→kind map | 0.50 | `search_` → ReadOnly |
+| **Docstring keywords** | `inspect.getdoc()` scanned for keyword families | 0.25 | "irreversible" → IrreversibleWrite |
+| **Parameter names** | `inspect.signature()` parameter names | 0.15 | `to`, `recipient` → IrreversibleWrite |
+| **Source AST** | `inspect.getsource()` for HTTP/SDK patterns | 0.10 | `requests.post()` → IrreversibleWrite |
+
+**Safety guarantees:**
+- Low confidence → defaults to `IrreversibleWrite` (never re-executes ambiguous tools)
+- Compensatable auto-downgrades to `IrreversibleWrite` (requires compensation function)
+- Explicit always wins: `overrides=`, `ToolDef(kind)`, `@tool(EffectKind.X)` bypass classification
+- All classifications logged (`effect_log.classify` logger)
+
 ## Framework Integration
 
-Built-in middleware for major agent frameworks:
+Built-in middleware for major agent frameworks. All middleware now accepts raw callables with auto-classification:
 
 | Framework | Middleware | Entry Point |
 |---|---|---|
@@ -56,7 +137,22 @@ Built-in middleware for major agent frameworks:
 | **CrewAI** | `effect_log.middleware.crewai` | `effect_logged_crew`, `effect_logged_tool` |
 | **Pydantic AI** | `effect_log.middleware.pydantic_ai` | `effect_logged_agent`, `EffectLogToolset` |
 | **Anthropic Claude API** | `effect_log.middleware.anthropic` | `effect_logged_tool_executor`, `process_tool_calls` |
-| **Bub** | `effect_log.middleware.bub` | `effect_logged_registry`, `effect_logged_tool` |
+| **Bub** | `effect_log.middleware.bub` | `EffectLoggedToolExecutor`, `effect_logged_agent` |
+
+Middleware `make_tooldefs()` / `make_tools()` now accepts raw callables alongside spec dicts:
+
+```python
+from effect_log.middleware.anthropic import make_tooldefs
+
+# Before: always needed explicit effect
+make_tooldefs([
+    {"func": search_db, "effect": EffectKind.ReadOnly},
+    {"func": send_email, "effect": EffectKind.IrreversibleWrite},
+])
+
+# After: just pass functions
+make_tooldefs([search_db, send_email])
+```
 
 See [`examples/`](examples/) for runnable demos:
 
@@ -95,11 +191,10 @@ pytest tests/ -v
 
 - [x] Core library — WAL engine, recovery engine, SQLite + in-memory backends
 - [x] Python bindings — PyO3 + maturin
-- [x] Framework middleware — LangGraph, OpenAI Agents SDK, CrewAI, Pydantic AI, Anthropic Claude API
-- [x] Framework middleware — LangGraph, OpenAI Agents SDK, CrewAI, Pydantic AI, Bub
+- [x] Framework middleware — LangGraph, OpenAI Agents SDK, CrewAI, Pydantic AI, Anthropic Claude API, Bub
+- [x] Auto-classification — infer effect kind from function name, docstring, parameters, and source AST
 - [ ] TypeScript bindings — napi-rs, Vercel AI SDK
 - [ ] Additional backends — RocksDB, S3, Restate journal
-- [ ] Auto-classification — infer effect kind from HTTP methods / API metadata
 
 ## Inspiration
 
diff --git a/bindings/python/python/effect_log/__init__.py b/bindings/python/python/effect_log/__init__.py
index 6750bf7..4d622e0 100644
--- a/bindings/python/python/effect_log/__init__.py
+++ b/bindings/python/python/effect_log/__init__.py
@@ -1,27 +1,202 @@
 """effect-log: Semantic side-effect tracking for AI agents."""
 
-from effect_log.effect_log_native import EffectKind, EffectLog, ToolDef
+from enum import Enum
 
-__all__ = ["EffectKind", "EffectLog", "ToolDef", "tool", "middleware"]
+from effect_log.effect_log_native import (
+    EffectKind,
+    EffectLog as _NativeEffectLog,
+    ToolDef,
+)
+from effect_log.classify import (
+    classify_effect_kind,
+    classify_from_name,
+    classify_tools,
+    classify_with_llm,
+)
 
 
-def tool(effect: EffectKind, compensate=None):
+class ClassifyMode(Enum):
+    """Controls how tools are classified in an EffectLog.
+
+    AUTO — all tools are raw callables; effect kinds are inferred automatically.
+    MANUAL — all tools are explicit ToolDef instances; no auto-classification.
+    HYBRID — mixed: callables are auto-classified, ToolDefs pass through as-is.
+    """
+
+    AUTO = "auto"
+    MANUAL = "manual"
+    HYBRID = "hybrid"
+
+
+__all__ = [
+    "ClassifyMode",
+    "EffectKind",
+    "EffectLog",
+    "ToolDef",
+    "tool",
+    "auto_tool",
+    "classify_tools",
+    "classify_effect_kind",
+    "classify_from_name",
+    "classify_with_llm",
+    "middleware",
+]
+
+
+def _wrap_callable(func):
+    """Wrap a bare callable so it receives **kwargs from args dict."""
+
+    def adapted(args, _fn=func):
+        return _fn(**args)
+
+    return adapted
+
+
+class EffectLog:
+    """Python wrapper around _NativeEffectLog with auto-classification.
+
+    Accepts raw callables alongside ToolDef instances. Raw callables are
+    auto-classified using heuristic analysis. Use overrides= to correct
+    any misclassifications.
+
+    Args:
+        execution_id: Unique identifier for this execution.
+        tools: List of ToolDef instances, callables, or a mix.
+        storage: Storage backend ("memory" or "sqlite:///path").
+        recover: Whether to recover from a previous execution.
+        overrides: Optional dict mapping function name -> EffectKind
+            to override auto-classification.
+        mode: ClassifyMode controlling validation (default HYBRID).
+    """
+
+    def __init__(
+        self,
+        execution_id: str,
+        tools: list,
+        storage: str = "memory",
+        recover: bool = False,
+        overrides: dict[str, EffectKind] | None = None,
+        mode: ClassifyMode = ClassifyMode.HYBRID,
+    ):
+        if mode is ClassifyMode.MANUAL and overrides:
+            raise ValueError(
+                "overrides= is not supported in MANUAL mode. "
+                "Remove overrides or use HYBRID mode."
+            )
+
+        overrides = overrides or {}
+        processed = []
+        for t in tools:
+            if isinstance(t, ToolDef):
+                if mode is ClassifyMode.AUTO:
+                    raise TypeError(
+                        "In AUTO mode, pass raw callables instead of ToolDef. "
+                        "Use MANUAL or HYBRID mode for explicit ToolDef."
+                    )
+                processed.append(t)
+            elif callable(t):
+                if mode is ClassifyMode.MANUAL:
+                    raise TypeError(
+                        "In MANUAL mode, all tools must be ToolDef instances. "
+                        "Use EffectLog.auto() or HYBRID mode for raw callables."
+                    )
+                name = getattr(t, "__name__", str(t))
+                kind = overrides.get(name)
+                if kind is None:
+                    kind = classify_effect_kind(t, name).effect_kind
+                processed.append(ToolDef(name, kind, _wrap_callable(t)))
+            else:
+                raise TypeError(f"Expected ToolDef or callable, got {type(t).__name__}")
+        self._inner = _NativeEffectLog(execution_id, processed, storage, recover)
+
+    @classmethod
+    def auto(
+        cls,
+        execution_id: str,
+        tools: list,
+        storage: str = "memory",
+        recover: bool = False,
+        overrides: dict[str, EffectKind] | None = None,
+    ) -> "EffectLog":
+        """Create an EffectLog in AUTO mode — all tools are raw callables."""
+        return cls(
+            execution_id=execution_id,
+            tools=tools,
+            storage=storage,
+            recover=recover,
+            overrides=overrides,
+            mode=ClassifyMode.AUTO,
+        )
+
+    @classmethod
+    def manual(
+        cls,
+        execution_id: str,
+        tools: list,
+        storage: str = "memory",
+        recover: bool = False,
+    ) -> "EffectLog":
+        """Create an EffectLog in MANUAL mode — all tools must be ToolDef."""
+        return cls(
+            execution_id=execution_id,
+            tools=tools,
+            storage=storage,
+            recover=recover,
+            mode=ClassifyMode.MANUAL,
+        )
+
+    def execute(self, tool_name: str, args: dict):
+        """Execute a tool through the effect-log WAL."""
+        return self._inner.execute(tool_name, args)
+
+    def history(self) -> list[dict]:
+        """Get execution history."""
+        return self._inner.history()
+
+
+def tool(effect=None, compensate=None):
     """Decorator to register a function as an effect-logged tool.
 
+    Supports both ``@tool`` (no parens) and ``@tool()`` / ``@tool(EffectKind.X)``.
+
     Args:
-        effect: The EffectKind classification for this tool.
+        effect: The EffectKind classification. If None, auto-classified.
         compensate: Optional compensation function for Compensatable effects.
 
     Returns:
         A ToolDef wrapping the function.
     """
+    # Handle @tool without parens: effect will be the decorated function
+    if callable(effect):
+        return auto_tool(effect)
 
     def decorator(func):
+        kind = effect
+        if kind is None:
+            kind = classify_effect_kind(func).effect_kind
         return ToolDef(
             name=func.__name__,
-            effect_kind=effect,
+            effect_kind=kind,
             func=func,
             compensate=compensate,
         )
 
     return decorator
+
+
+def auto_tool(func):
+    """Convenience decorator: auto-classifies effect kind from function metadata.
+
+    Equivalent to @tool() with no arguments.
+
+    Usage:
+        @auto_tool
+        def search_db(args):
+            return db.query(args["query"])
+    """
+    kind = classify_effect_kind(func).effect_kind
+    return ToolDef(
+        name=func.__name__,
+        effect_kind=kind,
+        func=func,
+    )
diff --git a/bindings/python/python/effect_log/classify.py b/bindings/python/python/effect_log/classify.py
new file mode 100644
index 0000000..01ada8b
--- /dev/null
+++ b/bindings/python/python/effect_log/classify.py
@@ -0,0 +1,749 @@
+"""Auto-classification of tool effect kinds from function metadata.
+
+Classifies callables into EffectKind categories using heuristic analysis
+of function names, docstrings, parameter names, and source AST patterns.
+"""
+
+from __future__ import annotations
+
+import inspect
+import logging
+import os
+import re
+from dataclasses import dataclass, field
+from typing import Callable, Sequence
+
+from effect_log.effect_log_native import EffectKind, ToolDef
+
+logger = logging.getLogger("effect_log.classify")
+
+# ── Name prefix → EffectKind maps (longest match wins) ──────────────────────
+
+_PREFIX_MAP: list[tuple[str, EffectKind]] = [
+    # ReadOnly
+    ("read_", EffectKind.ReadOnly),
+    ("fetch_", EffectKind.ReadOnly),
+    ("get_", EffectKind.ReadOnly),
+    ("search_", EffectKind.ReadOnly),
+    ("query_", EffectKind.ReadOnly),
+    ("list_", EffectKind.ReadOnly),
+    ("check_", EffectKind.ReadOnly),
+    ("validate_", EffectKind.ReadOnly),
+    ("describe_", EffectKind.ReadOnly),
+    ("count_", EffectKind.ReadOnly),
+    ("find_", EffectKind.ReadOnly),
+    ("lookup_", EffectKind.ReadOnly),
+    ("browse_", EffectKind.ReadOnly),
+    ("view_", EffectKind.ReadOnly),
+    ("show_", EffectKind.ReadOnly),
+    ("inspect_", EffectKind.ReadOnly),
+    ("parse_", EffectKind.ReadOnly),
+    ("transform_", EffectKind.ReadOnly),
+    ("format_", EffectKind.ReadOnly),
+    ("log_", EffectKind.ReadOnly),
+    ("trace_", EffectKind.ReadOnly),
+    # IrreversibleWrite
+    ("send_", EffectKind.IrreversibleWrite),
+    ("email_", EffectKind.IrreversibleWrite),
+    ("notify_", EffectKind.IrreversibleWrite),
+    ("broadcast_", EffectKind.IrreversibleWrite),
+    ("publish_", EffectKind.IrreversibleWrite),
+    ("deploy_", EffectKind.IrreversibleWrite),
+    ("delete_", EffectKind.IrreversibleWrite),
+    ("remove_", EffectKind.IrreversibleWrite),
+    ("destroy_", EffectKind.IrreversibleWrite),
+    ("drop_", EffectKind.IrreversibleWrite),
+    ("purge_", EffectKind.IrreversibleWrite),
+    ("revoke_", EffectKind.IrreversibleWrite),
+    ("terminate_", EffectKind.IrreversibleWrite),
+    ("kill_", EffectKind.IrreversibleWrite),
+    ("post_to_", EffectKind.IrreversibleWrite),
+    ("tweet_", EffectKind.IrreversibleWrite),
+    ("sms_", EffectKind.IrreversibleWrite),
+    # IrreversibleWrite (non-idempotent creates)
+    ("create_", EffectKind.IrreversibleWrite),
+    ("add_", EffectKind.IrreversibleWrite),
+    ("insert_", EffectKind.IrreversibleWrite),
+    # IdempotentWrite
+    ("upsert_", EffectKind.IdempotentWrite),
+    ("update_", EffectKind.IdempotentWrite),
+    ("put_", EffectKind.IdempotentWrite),
+    ("save_", EffectKind.IdempotentWrite),
+    ("set_", EffectKind.IdempotentWrite),
+    ("write_", EffectKind.IdempotentWrite),
+    ("store_", EffectKind.IdempotentWrite),
+    ("upload_", EffectKind.IdempotentWrite),
+    ("register_", EffectKind.IdempotentWrite),
+    ("configure_", EffectKind.IdempotentWrite),
+    ("enable_", EffectKind.IdempotentWrite),
+    ("disable_", EffectKind.IdempotentWrite),
+    ("assign_", EffectKind.IdempotentWrite),
+    ("tag_", EffectKind.IdempotentWrite),
+    # Compensatable (auto-downgraded to IrreversibleWrite since compensation
+    # functions can't be auto-detected; kept here so the downgrade message
+    # can tell users to provide compensate= to unlock Compensatable)
+    ("reserve_", EffectKind.Compensatable),
+    ("lock_", EffectKind.Compensatable),
+    ("allocate_", EffectKind.Compensatable),
+    ("book_", EffectKind.Compensatable),
+    ("hold_", EffectKind.Compensatable),
+    ("checkout_", EffectKind.Compensatable),
+    ("claim_", EffectKind.Compensatable),
+    # ReadThenWrite
+    ("transfer_", EffectKind.ReadThenWrite),
+    ("swap_", EffectKind.ReadThenWrite),
+    ("exchange_", EffectKind.ReadThenWrite),
+    ("move_", EffectKind.ReadThenWrite),
+    ("migrate_", EffectKind.ReadThenWrite),
+    ("sync_", EffectKind.ReadThenWrite),
+    ("reconcile_", EffectKind.ReadThenWrite),
+]
+
+# Sort by prefix length descending so longest match wins
+_PREFIX_MAP.sort(key=lambda x: len(x[0]), reverse=True)
+
+# Exact name matches for single-word function names
+_EXACT_NAME_MAP: dict[str, EffectKind] = {
+    "search": EffectKind.ReadOnly,
+    "query": EffectKind.ReadOnly,
+    "fetch": EffectKind.ReadOnly,
+    "get": EffectKind.ReadOnly,
+    "read": EffectKind.ReadOnly,
+    "list": EffectKind.ReadOnly,
+    "find": EffectKind.ReadOnly,
+    "lookup": EffectKind.ReadOnly,
+    "check": EffectKind.ReadOnly,
+    "validate": EffectKind.ReadOnly,
+    "count": EffectKind.ReadOnly,
+    "parse": EffectKind.ReadOnly,
+    "transform": EffectKind.ReadOnly,
+    "format": EffectKind.ReadOnly,
+    "log": EffectKind.ReadOnly,
+    "send": EffectKind.IrreversibleWrite,
+    "email": EffectKind.IrreversibleWrite,
+    "notify": EffectKind.IrreversibleWrite,
+    "publish": EffectKind.IrreversibleWrite,
+    "deploy": EffectKind.IrreversibleWrite,
+    "delete": EffectKind.IrreversibleWrite,
+    "remove": EffectKind.IrreversibleWrite,
+    "destroy": EffectKind.IrreversibleWrite,
+    "purge": EffectKind.IrreversibleWrite,
+    "create": EffectKind.IrreversibleWrite,
+    "upsert": EffectKind.IdempotentWrite,
+    "update": EffectKind.IdempotentWrite,
+    "save": EffectKind.IdempotentWrite,
+    "insert": EffectKind.IrreversibleWrite,
+    "write": EffectKind.IdempotentWrite,
+    "store": EffectKind.IdempotentWrite,
+    "upload": EffectKind.IdempotentWrite,
+    "register": EffectKind.IdempotentWrite,
+    "reserve": EffectKind.Compensatable,
+    "lock": EffectKind.Compensatable,
+    "book": EffectKind.Compensatable,
+    "transfer": EffectKind.ReadThenWrite,
+    "swap": EffectKind.ReadThenWrite,
+    "migrate": EffectKind.ReadThenWrite,
+    "sync": EffectKind.ReadThenWrite,
+}
+
+# ── Docstring keyword families ───────────────────────────────────────────────
+
+_DOCSTRING_KEYWORDS: list[tuple[list[str], EffectKind]] = [
+    # ReadOnly
+    (
+        [
+            "read-only",
+            "readonly",
+            "no side effect",
+            "pure function",
+            "retrieves",
+            "fetches",
+            "queries",
+            "looks up",
+            "returns data",
+            "does not modify",
+            "non-destructive",
+        ],
+        EffectKind.ReadOnly,
+    ),
+    # IrreversibleWrite
+    (
+        [
+            "irreversible",
+            "cannot be undone",
+            "sends email",
+            "sends notification",
+            "permanently delete",
+            "destructive",
+            "cannot undo",
+            "broadcasts",
+            "deploys",
+            "publishes",
+            "posts to",
+            "external service",
+            "side effect",
+            "not idempotent",
+        ],
+        EffectKind.IrreversibleWrite,
+    ),
+    # IdempotentWrite
+    (
+        [
+            "idempotent",
+            "upsert",
+            "safe to retry",
+            "creates or updates",
+            "overwrites",
+            "replaces",
+            "puts",
+            "stores",
+            "saves",
+            "can be retried",
+            "same result if repeated",
+        ],
+        EffectKind.IdempotentWrite,
+    ),
+    # Compensatable
+    (
+        [
+            "compensat",
+            "can be reversed",
+            "can be undone",
+            "rollback",
+            "reservation",
+            "temporary hold",
+            "can cancel",
+        ],
+        EffectKind.Compensatable,
+    ),
+    # ReadThenWrite
+    (
+        [
+            "read then write",
+            "reads and writes",
+            "transfers",
+            "moves data",
+            "migrates",
+            "synchronizes",
+            "reconciles",
+        ],
+        EffectKind.ReadThenWrite,
+    ),
+]
+
+# ── Parameter name signals ───────────────────────────────────────────────────
+
+_PARAM_SIGNALS: list[tuple[list[str], EffectKind]] = [
+    # IrreversibleWrite: messaging/notification params
+    (
+        ["to", "recipient", "recipients", "email", "phone", "channel"],
+        EffectKind.IrreversibleWrite,
+    ),
+    # IdempotentWrite: identity/key params
+    (
+        ["id", "key", "record_id", "entity_id", "pk", "primary_key"],
+        EffectKind.IdempotentWrite,
+    ),
+    # ReadOnly: query/filter params
+    (["query", "search", "filter", "q", "term", "keyword"], EffectKind.ReadOnly),
+]
+
+# ── AST patterns ─────────────────────────────────────────────────────────────
+
+_AST_PATTERNS: list[tuple[str, EffectKind]] = [
+    # HTTP method patterns
+    (r"\.get\(", EffectKind.ReadOnly),
+    (r"\.fetch\(", EffectKind.ReadOnly),
+    (r"requests\.get\(", EffectKind.ReadOnly),
+    (r"\.post\(", EffectKind.IrreversibleWrite),
+    (r"requests\.post\(", EffectKind.IrreversibleWrite),
+    (r"\.delete\(", EffectKind.IrreversibleWrite),
+    (r"requests\.delete\(", EffectKind.IrreversibleWrite),
+    (r"\.put\(", EffectKind.IdempotentWrite),
+    (r"requests\.put\(", EffectKind.IdempotentWrite),
+    (r"\.patch\(", EffectKind.IdempotentWrite),
+    (r"requests\.patch\(", EffectKind.IdempotentWrite),
+    # SDK patterns
+    (r"smtp|sendmail|send_mail", EffectKind.IrreversibleWrite),
+    (r"\.send\(", EffectKind.IrreversibleWrite),
+    (r"\.publish\(", EffectKind.IrreversibleWrite),
+    (r"\.select\(|\.query\(|\.find\(", EffectKind.ReadOnly),
+    (r"\.insert\(|\.upsert\(|\.update\(|\.save\(", EffectKind.IdempotentWrite),
+    (r"\.drop\(|\.truncate\(", EffectKind.IrreversibleWrite),
+]
+
+# ── Layer weights ────────────────────────────────────────────────────────────
+
+_WEIGHT_NAME = 0.50
+_WEIGHT_DOCSTRING = 0.25
+_WEIGHT_PARAMS = 0.15
+_WEIGHT_AST = 0.10
+
+
+# ── Core types ───────────────────────────────────────────────────────────────
+
+
+@dataclass
+class ClassificationResult:
+    """Result of classifying a single tool's effect kind."""
+
+    effect_kind: EffectKind
+    confidence: float  # 0.0 – 1.0
+    reason: str  # human-readable explanation
+    source: str  # "heuristic" | "llm" | "default"
+
+
+@dataclass
+class ClassificationReport:
+    """Batch classification results with printable table and .apply()."""
+
+    results: dict[str, ClassificationResult] = field(default_factory=dict)
+    _funcs: dict[str, Callable] = field(default_factory=dict, repr=False)
+
+    def apply(self, overrides: dict[str, EffectKind] | None = None) -> list[ToolDef]:
+        """Convert results to ToolDef list, applying overrides.
+
+        Args:
+            overrides: Optional dict mapping function name -> EffectKind
+                to override the auto-classified result.
+
+        Returns:
+            List of ToolDef instances ready for EffectLog construction.
+        """
+        overrides = overrides or {}
+        defs = []
+        for name, result in self.results.items():
+            kind = overrides.get(name, result.effect_kind)
+            fn = self._funcs[name]
+
+            def adapted(args, _fn=fn):
+                return _fn(**args)
+
+            defs.append(ToolDef(name, kind, adapted))
+        return defs
+
+    def __str__(self) -> str:
+        lines = []
+        max_name = max((len(n) for n in self.results), default=0)
+        for name, r in self.results.items():
+            kind_str = _kind_name(r.effect_kind)
+            conf_str = f"{r.confidence:.2f}"
+            warn = "" if r.confidence >= 0.6 else " !!!"
+            lines.append(
+                f"{name:<{max_name}} -> {kind_str:<20} ({conf_str}) {r.reason}{warn}"
+            )
+        return "\n".join(lines)
+
+    def __repr__(self) -> str:
+        return f"ClassificationReport({len(self.results)} tools)"
+
+
+# ── Scoring helpers ──────────────────────────────────────────────────────────
+# Note: PyO3 EffectKind is not hashable, so we use int(kind) as dict keys
+# and _kind_from_int() to convert back.
+
+_ALL_KINDS = [
+    EffectKind.ReadOnly,
+    EffectKind.IdempotentWrite,
+    EffectKind.Compensatable,
+    EffectKind.IrreversibleWrite,
+    EffectKind.ReadThenWrite,
+]
+_KIND_BY_INT = {int(k): k for k in _ALL_KINDS}
+
+
+def _kind_from_int(i: int) -> EffectKind:
+    return _KIND_BY_INT[i]
+
+
+def _ki(kind: EffectKind) -> int:
+    """Shorthand: EffectKind -> int key for score dicts."""
+    return int(kind)
+
+
+_KIND_NAMES = {
+    int(EffectKind.ReadOnly): "ReadOnly",
+    int(EffectKind.IdempotentWrite): "IdempotentWrite",
+    int(EffectKind.Compensatable): "Compensatable",
+    int(EffectKind.IrreversibleWrite): "IrreversibleWrite",
+    int(EffectKind.ReadThenWrite): "ReadThenWrite",
+}
+
+
+def _kind_name(kind: EffectKind) -> str:
+    """Get the string name of an EffectKind."""
+    return _KIND_NAMES.get(int(kind), str(kind))
+
+
+def _score_name(name: str) -> dict[int, float]:
+    """Layer 1: Score by function name prefix or exact match."""
+    scores: dict[int, float] = {}
+
+    # Exact match first
+    if name in _EXACT_NAME_MAP:
+        kind = _EXACT_NAME_MAP[name]
+        scores[_ki(kind)] = 1.0
+        return scores
+
+    # Prefix match (longest wins)
+    for prefix, kind in _PREFIX_MAP:
+        if name.startswith(prefix):
+            scores[_ki(kind)] = 1.0
+            return scores
+
+    return scores
+
+
+def _score_docstring(func: Callable) -> dict[int, float]:
+    """Layer 2: Score by docstring keyword analysis."""
+    scores: dict[int, float] = {}
+    doc = inspect.getdoc(func)
+    if not doc:
+        return scores
+
+    doc_lower = doc.lower()
+    for keywords, kind in _DOCSTRING_KEYWORDS:
+        hits = sum(1 for kw in keywords if kw in doc_lower)
+        if hits > 0:
+            # Normalize: more keyword hits = higher confidence, capped at 1.0
+            scores[_ki(kind)] = min(hits / 3.0, 1.0)
+
+    return scores
+
+
+def _score_params(func: Callable) -> dict[int, float]:
+    """Layer 3: Score by parameter name signals."""
+    scores: dict[int, float] = {}
+    try:
+        sig = inspect.signature(func)
+    except (ValueError, TypeError):
+        return scores
+
+    param_names = [p.lower() for p in sig.parameters]
+    for signal_names, kind in _PARAM_SIGNALS:
+        hits = sum(1 for p in param_names if p in signal_names)
+        if hits > 0:
+            scores[_ki(kind)] = min(hits / 2.0, 1.0)
+
+    return scores
+
+
+def _score_ast(func: Callable) -> dict[int, float]:
+    """Layer 4: Score by source code AST patterns."""
+    scores: dict[int, float] = {}
+    try:
+        source = inspect.getsource(func)
+    except (OSError, TypeError):
+        return scores
+
+    for pattern, kind in _AST_PATTERNS:
+        ki = _ki(kind)
+        if re.search(pattern, source):
+            scores[ki] = scores.get(ki, 0.0) + 0.5
+            scores[ki] = min(scores[ki], 1.0)
+
+    return scores
+
+
+def _combine_scores(
+    name_scores: dict[int, float],
+    doc_scores: dict[int, float],
+    param_scores: dict[int, float],
+    ast_scores: dict[int, float],
+) -> tuple[EffectKind, float, str]:
+    """Combine weighted scores from all layers.
+
+    Returns (effect_kind, confidence, reason).
+    """
+    all_kinds = set(name_scores) | set(doc_scores) | set(param_scores) | set(ast_scores)
+
+    if not all_kinds:
+        return EffectKind.IrreversibleWrite, 0.0, "default (no signals)"
+
+    best_kind_int = _ki(EffectKind.IrreversibleWrite)
+    best_score = 0.0
+    reasons = []
+
+    for ki in all_kinds:
+        score = (
+            name_scores.get(ki, 0.0) * _WEIGHT_NAME
+            + doc_scores.get(ki, 0.0) * _WEIGHT_DOCSTRING
+            + param_scores.get(ki, 0.0) * _WEIGHT_PARAMS
+            + ast_scores.get(ki, 0.0) * _WEIGHT_AST
+        )
+        if score > best_score:
+            best_score = score
+            best_kind_int = ki
+
+    # Build reason string from contributing layers
+    if best_kind_int in name_scores:
+        reasons.append("name")
+    if best_kind_int in doc_scores:
+        reasons.append("docstring")
+    if best_kind_int in param_scores:
+        reasons.append("params")
+    if best_kind_int in ast_scores:
+        reasons.append("ast")
+
+    reason = " + ".join(reasons) if reasons else "default"
+    # Clamp confidence to [0, 1]
+    confidence = min(best_score, 1.0)
+
+    return _kind_from_int(best_kind_int), confidence, reason
+
+
+# ── Public API ───────────────────────────────────────────────────────────────
+
+
+def classify_effect_kind(
+    func: Callable, name: str | None = None
+) -> ClassificationResult:
+    """Classify a callable's effect kind using heuristic analysis.
+
+    Uses 4 weighted layers: name prefix (0.50), docstring keywords (0.25),
+    parameter names (0.15), and source AST patterns (0.10).
+
+    Args:
+        func: The callable to classify.
+        name: Optional override for the function name (uses func.__name__ if None).
+
+    Returns:
+        ClassificationResult with effect_kind, confidence, reason, and source.
+    """
+    fname = name or getattr(func, "__name__", "")
+
+    name_scores = _score_name(fname)
+    doc_scores = _score_docstring(func)
+    param_scores = _score_params(func)
+    ast_scores = _score_ast(func)
+
+    kind, confidence, reason = _combine_scores(
+        name_scores, doc_scores, param_scores, ast_scores
+    )
+
+    # Safety: Compensatable requires a compensation function we can't detect,
+    # so downgrade to IrreversibleWrite with a hint
+    if int(kind) == int(EffectKind.Compensatable):
+        reason += " (auto-downgraded from Compensatable — provide compensate= to use Compensatable)"
+        kind = EffectKind.IrreversibleWrite
+
+    # Log the classification
+    if confidence >= 0.6:
+        logger.info("%s -> %s (%.2f, %s)", fname, _kind_name(kind), confidence, reason)
+    else:
+        logger.warning(
+            "%s -> %s (%.2f, %s — consider specifying explicitly)",
+            fname,
+            _kind_name(kind),
+            confidence,
+            reason,
+        )
+
+    return ClassificationResult(
+        effect_kind=kind,
+        confidence=confidence,
+        reason=reason,
+        source="heuristic",
+    )
+
+
+def classify_from_name(name: str) -> ClassificationResult:
+    """Classify effect kind from a function name string only.
+
+    Uses only the name prefix/exact match layer. Useful for middleware
+    wrapping where the original callable isn't available.
+
+    Args:
+        name: The function/tool name to classify.
+
+    Returns:
+        ClassificationResult based on name analysis only.
+    """
+    name_scores = _score_name(name)
+
+    if name_scores:
+        best_ki = max(name_scores, key=name_scores.get)
+        kind = _kind_from_int(best_ki)
+        confidence = name_scores[best_ki]
+        reason = "name"
+    else:
+        kind = EffectKind.IrreversibleWrite
+        confidence = 0.0
+        reason = "default (no name match)"
+
+    if int(kind) == int(EffectKind.Compensatable):
+        reason += " (auto-downgraded from Compensatable)"
+        kind = EffectKind.IrreversibleWrite
+
+    if confidence >= 0.6:
+        logger.info("%s -> %s (%.2f, %s)", name, _kind_name(kind), confidence, reason)
+    else:
+        logger.warning(
+            "%s -> %s (%.2f, %s — consider specifying explicitly)",
+            name,
+            _kind_name(kind),
+            confidence,
+            reason,
+        )
+
+    return ClassificationResult(
+        effect_kind=kind,
+        confidence=confidence,
+        reason=reason,
+        source="heuristic",
+    )
+
+
+def classify_tools(funcs: Sequence[Callable]) -> ClassificationReport:
+    """Batch classify a list of callables.
+
+    Args:
+        funcs: Sequence of callables to classify.
+
+    Returns:
+        ClassificationReport with results for each function.
+        Use report.apply(overrides=) to convert to ToolDef list.
+    """
+    report = ClassificationReport()
+    for func in funcs:
+        name = getattr(func, "__name__", str(func))
+        result = classify_effect_kind(func, name)
+        report.results[name] = result
+        report._funcs[name] = func
+    return report
+
+
+# ── LLM classification ──────────────────────────────────────────────────────
+
+_VALID_KINDS = {
+    "ReadOnly": EffectKind.ReadOnly,
+    "IdempotentWrite": EffectKind.IdempotentWrite,
+    "IrreversibleWrite": EffectKind.IrreversibleWrite,
+    "ReadThenWrite": EffectKind.ReadThenWrite,
+    "Compensatable": EffectKind.IrreversibleWrite,  # safety downgrade
+}
+
+_LLM_PROMPT_TEMPLATE = (
+    "Classify this Python function's side-effect kind.\n"
+    "Function name: {name}\n"
+    "Docstring: {doc}\n"
+    "Source:\n{source}\n\n"
+    "Classify as exactly one of: ReadOnly, IdempotentWrite, "
+    "IrreversibleWrite, ReadThenWrite\n"
+    "Respond with just the classification name."
+)
+
+
+def _call_anthropic(prompt: str, model: str) -> str:
+    """Call the Anthropic API. Raises on failure."""
+    import anthropic
+
+    client = anthropic.Anthropic()
+    response = client.messages.create(
+        model=model,
+        max_tokens=50,
+        messages=[{"role": "user", "content": prompt}],
+    )
+    return response.content[0].text.strip()
+
+
+def _call_openai(prompt: str, model: str) -> str:
+    """Call the OpenAI API. Raises on failure."""
+    import openai
+
+    client = openai.OpenAI()
+    response = client.chat.completions.create(
+        model=model,
+        max_tokens=50,
+        messages=[{"role": "user", "content": prompt}],
+    )
+    return response.choices[0].message.content.strip()
+
+
+def classify_with_llm(
+    func: Callable,
+    *,
+    model: str | None = None,
+    provider: str | None = None,
+) -> ClassificationResult:
+    """Classify effect kind using an LLM.
+
+    Sends the function name, docstring, and source code to an LLM API
+    for classification. Useful as a fallback when heuristic classification
+    has low confidence.
+
+    **Security note**: This sends function source code to an external API.
+    Do not use on functions that contain embedded secrets or sensitive logic.
+
+    Requires either the ``anthropic`` or ``openai`` SDK installed.
+    Opt-in via the ``EFFECT_LOG_LLM_CLASSIFY=1`` environment variable,
+    or pass ``provider`` explicitly.
+
+    Args:
+        func: The callable to classify.
+        model: Model name to use. Defaults to ``EFFECT_LOG_LLM_MODEL`` env var,
+            or ``"claude-haiku-4-5-20251001"`` (Anthropic) /
+            ``"gpt-4o-mini"`` (OpenAI).
+        provider: ``"anthropic"`` or ``"openai"``. Defaults to
+            ``EFFECT_LOG_LLM_PROVIDER`` env var. If unset, tries
+            Anthropic first, then OpenAI.
+
+    Returns:
+        ClassificationResult from LLM analysis.
+
+    Raises:
+        RuntimeError: If ``EFFECT_LOG_LLM_CLASSIFY=1`` is not set and no
+            explicit ``provider`` is given.
+ """ + if provider is None and not os.environ.get("EFFECT_LOG_LLM_CLASSIFY"): + raise RuntimeError( + "LLM classification requires EFFECT_LOG_LLM_CLASSIFY=1 " + "environment variable, or pass provider= explicitly" + ) + + provider = provider or os.environ.get("EFFECT_LOG_LLM_PROVIDER") + model_env = os.environ.get("EFFECT_LOG_LLM_MODEL") + + fname = getattr(func, "__name__", "unknown") + doc = inspect.getdoc(func) or "" + try: + source = inspect.getsource(func) + except (OSError, TypeError): + source = "" + + prompt = _LLM_PROMPT_TEMPLATE.format(name=fname, doc=doc, source=source) + + answer: str | None = None + + if provider == "anthropic": + answer = _call_anthropic( + prompt, model or model_env or "claude-haiku-4-5-20251001" + ) + elif provider == "openai": + answer = _call_openai(prompt, model or model_env or "gpt-4o-mini") + else: + # Auto-detect: try anthropic, fall back to openai + try: + answer = _call_anthropic( + prompt, model or model_env or "claude-haiku-4-5-20251001" + ) + except ImportError: + try: + answer = _call_openai(prompt, model or model_env or "gpt-4o-mini") + except ImportError: + raise ImportError( + "LLM classification requires either the anthropic or openai SDK. " + "Install with: pip install anthropic or pip install openai" + ) + + kind = _VALID_KINDS.get(answer, EffectKind.IrreversibleWrite) + confidence = 0.80 if answer in _VALID_KINDS else 0.0 + + logger.info("%s -> %s (%.2f, llm: %s)", fname, _kind_name(kind), confidence, answer) + + return ClassificationResult( + effect_kind=kind, + confidence=confidence, + reason=f"llm: {answer}", + source="llm", + ) diff --git a/bindings/python/python/effect_log/effect_log_native.pyi b/bindings/python/python/effect_log/effect_log_native.pyi index 98c41a3..0bd9ab3 100644 --- a/bindings/python/python/effect_log/effect_log_native.pyi +++ b/bindings/python/python/effect_log/effect_log_native.pyi @@ -25,6 +25,9 @@ class ToolDef: ) -> None: ... class EffectLog: + """Native EffectLog. 
Prefer using effect_log.EffectLog (Python wrapper) + which adds auto-classification support.""" + def __init__( self, execution_id: str, diff --git a/bindings/python/python/effect_log/middleware/__init__.py b/bindings/python/python/effect_log/middleware/__init__.py index 3362042..8a25717 100644 --- a/bindings/python/python/effect_log/middleware/__init__.py +++ b/bindings/python/python/effect_log/middleware/__init__.py @@ -4,10 +4,14 @@ Framework dependencies are optional — import errors are raised only when you actually try to use the middleware. +All middleware now supports auto-classification: pass raw callables to +make_tooldefs() / make_tools() and effect kinds are inferred automatically. + Available middleware: - effect_log.middleware.langgraph — LangGraph / LangChain - effect_log.middleware.openai_agents — OpenAI Agents SDK - effect_log.middleware.crewai — CrewAI - effect_log.middleware.pydantic_ai — Pydantic AI - effect_log.middleware.anthropic — Anthropic Claude API + - effect_log.middleware.bub — Bub agent framework """ diff --git a/bindings/python/python/effect_log/middleware/anthropic.py b/bindings/python/python/effect_log/middleware/anthropic.py index 0b377b5..a9f869b 100644 --- a/bindings/python/python/effect_log/middleware/anthropic.py +++ b/bindings/python/python/effect_log/middleware/anthropic.py @@ -9,11 +9,14 @@ from effect_log import EffectLog, EffectKind, ToolDef from effect_log.middleware.anthropic import effect_logged_tool_executor, make_tooldefs - tool_specs = [ + # With auto-classification (just pass functions): + tools = make_tooldefs([search_db, send_email]) + + # Or with explicit effects: + tools = make_tooldefs([ {"func": search_db, "effect": EffectKind.ReadOnly}, {"func": send_email, "effect": EffectKind.IrreversibleWrite}, - ] - tools = make_tooldefs(tool_specs) + ]) log = EffectLog(execution_id="task-001", tools=tools, storage="sqlite:///effects.db") executor = effect_logged_tool_executor(log, { @@ -43,25 +46,50 @@ def 
_ensure_anthropic(): ) -def make_tooldefs(tool_specs): +def make_tooldefs(tool_specs, mode=None): """Create ToolDef entries from raw functions for Anthropic tool_use. - Anthropic's tool_use pattern uses raw functions, so this helper takes - the same functions and produces EffectLog ToolDefs. + Accepts raw callables (auto-classified) or dicts with explicit effects. Args: - tool_specs: List of dicts with keys: - - "func": A raw callable - - "effect": The EffectKind for this tool + tool_specs: List of: + - callables (auto-classified), or + - dicts with keys "func" and optional "effect" (EffectKind) + mode: Optional ClassifyMode for validation. Returns: List of ToolDef instances ready for EffectLog construction. """ - from effect_log import ToolDef + from effect_log import ClassifyMode, ToolDef + from effect_log.classify import classify_effect_kind + + if mode is None: + mode = ClassifyMode.HYBRID defs = [] for spec in tool_specs: - fn, effect = spec["func"], spec["effect"] + if callable(spec): + if mode is ClassifyMode.MANUAL: + raise TypeError( + "In MANUAL mode, all specs must include an explicit 'effect' key." + ) + fn, effect = spec, None + elif isinstance(spec, dict): + fn = spec["func"] + effect = spec.get("effect") + if mode is ClassifyMode.AUTO and effect is not None: + raise TypeError( + "In AUTO mode, specs must not include an explicit 'effect' key." + ) + if mode is ClassifyMode.MANUAL and effect is None: + raise TypeError( + "In MANUAL mode, all specs must include an explicit 'effect' key." 
+ ) + else: + raise TypeError(f"Expected callable or dict, got {type(spec).__name__}") + + if effect is None: + effect = classify_effect_kind(fn).effect_kind def adapted(args, _fn=fn): return _fn(**args) diff --git a/bindings/python/python/effect_log/middleware/bub.py b/bindings/python/python/effect_log/middleware/bub.py index 71bb412..f91379d 100644 --- a/bindings/python/python/effect_log/middleware/bub.py +++ b/bindings/python/python/effect_log/middleware/bub.py @@ -94,16 +94,16 @@ def execute_tool_calls(self, tool_calls: list[dict[str, Any]]) -> str: return "\n".join(results) if results else "No tools executed." -def make_tooldefs(tool_specs): +def make_tooldefs(tool_specs, mode=None): """Create ToolDef entries from bub Tool classes. - Eliminates the need to define tool functions twice — once for bub and - once for EffectLog. + Accepts raw bub Tool classes (auto-classified) or dicts with explicit effects. Args: - tool_specs: List of dicts with keys: - - "tool_class": A bub Tool subclass (e.g., RunCommandTool) - - "effect": The EffectKind for this tool + tool_specs: List of: + - bub Tool classes (auto-classified by tool name), or + - dicts with keys "tool_class" and optional "effect" (EffectKind) + mode: Optional ClassifyMode for validation. Returns: List of ToolDef instances ready for EffectLog construction. @@ -111,14 +111,40 @@ def make_tooldefs(tool_specs): _ensure_bub() from bub.agent.tools import Context + from effect_log import ClassifyMode from effect_log import ToolDef as ELToolDef + from effect_log.classify import classify_from_name + + if mode is None: + mode = ClassifyMode.HYBRID defs = [] for spec in tool_specs: - tool_class, effect = spec["tool_class"], spec["effect"] + if isinstance(spec, dict): + tool_class = spec["tool_class"] + effect = spec.get("effect") + if mode is ClassifyMode.AUTO and effect is not None: + raise TypeError( + "In AUTO mode, specs must not include an explicit 'effect' key." 
+ ) + if mode is ClassifyMode.MANUAL and effect is None: + raise TypeError( + "In MANUAL mode, all specs must include an explicit 'effect' key." + ) + else: + if mode is ClassifyMode.MANUAL: + raise TypeError( + "In MANUAL mode, all specs must include an explicit 'effect' key." + ) + tool_class = spec + effect = None + info = tool_class.get_tool_info() name = info["name"] + if effect is None: + effect = classify_from_name(name).effect_kind + def adapted(args, _cls=tool_class): # Instantiate the tool with args, execute with a default context instance = _cls(**args) diff --git a/bindings/python/python/effect_log/middleware/crewai.py b/bindings/python/python/effect_log/middleware/crewai.py index df140e5..8e37e02 100644 --- a/bindings/python/python/effect_log/middleware/crewai.py +++ b/bindings/python/python/effect_log/middleware/crewai.py @@ -72,27 +72,52 @@ def __call__(self, *args, **kwargs): return self.run(*args, **kwargs) -def make_tooldefs(tool_specs): +def make_tooldefs(tool_specs, mode=None): """Create ToolDef entries from CrewAI SDK tool objects. - Eliminates the need to define tool functions twice — once for the SDK - and once for EffectLog. Handles both custom @tool functions (which have - a .func attribute) and pre-built tools (which use ._run()). + Accepts raw LangChain/CrewAI tools (auto-classified) or dicts with explicit effects. Args: - tool_specs: List of dicts with keys: - - "tool": A CrewAI BaseTool or @tool-decorated function - - "effect": The EffectKind for this tool + tool_specs: List of: + - CrewAI BaseTool instances (auto-classified by name), or + - dicts with keys "tool" and optional "effect" (EffectKind) + mode: Optional ClassifyMode for validation. Returns: List of ToolDef instances ready for EffectLog construction. 
""" - from effect_log import ToolDef + from effect_log import ClassifyMode, ToolDef + from effect_log.classify import classify_from_name + + if mode is None: + mode = ClassifyMode.HYBRID defs = [] for spec in tool_specs: - tool, effect = spec["tool"], spec["effect"] + if isinstance(spec, dict): + tool = spec["tool"] + effect = spec.get("effect") + if mode is ClassifyMode.AUTO and effect is not None: + raise TypeError( + "In AUTO mode, specs must not include an explicit 'effect' key." + ) + if mode is ClassifyMode.MANUAL and effect is None: + raise TypeError( + "In MANUAL mode, all specs must include an explicit 'effect' key." + ) + else: + if mode is ClassifyMode.MANUAL: + raise TypeError( + "In MANUAL mode, all specs must include an explicit 'effect' key." + ) + tool = spec + effect = None + name = getattr(tool, "name", getattr(tool, "__name__", str(tool))) + + if effect is None: + effect = classify_from_name(name).effect_kind + fn = getattr(tool, "func", None) if fn is not None: @@ -138,7 +163,7 @@ def effect_logged_crew( Args: log: An initialized EffectLog instance. crew: A CrewAI Crew instance. - tool_effects: Optional dict mapping tool name → EffectKind. + tool_effects: Optional dict mapping tool name -> EffectKind. Returns: The crew with all agent tools replaced by effect-logged wrappers. 
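Review note: the per-spec branching that `make_tooldefs` performs above (dict versus bare tool, then the AUTO/MANUAL checks) is repeated almost verbatim in each middleware module. A minimal sketch of that logic pulled into one function, assuming a stand-in `ClassifyMode` enum and a hypothetical helper name `normalize_spec` (neither is taken from the package itself), might look like this:

```python
from enum import Enum


class ClassifyMode(Enum):
    """Stand-in for effect_log.ClassifyMode; the member values are assumed."""

    AUTO = "auto"
    MANUAL = "manual"
    HYBRID = "hybrid"


def normalize_spec(spec, key="tool", mode=ClassifyMode.HYBRID):
    """Split one spec into (tool, effect), where effect may be None.

    Hypothetical helper, not part of effect_log: the diff inlines this
    logic in every middleware module.
    """
    if isinstance(spec, dict):
        tool = spec[key]
        effect = spec.get("effect")
        if mode is ClassifyMode.AUTO and effect is not None:
            raise TypeError(
                "In AUTO mode, specs must not include an explicit 'effect' key."
            )
        if mode is ClassifyMode.MANUAL and effect is None:
            raise TypeError(
                "In MANUAL mode, all specs must include an explicit 'effect' key."
            )
        return tool, effect

    # A bare tool carries no effect, which MANUAL mode forbids.
    if mode is ClassifyMode.MANUAL:
        raise TypeError(
            "In MANUAL mode, all specs must include an explicit 'effect' key."
        )
    return spec, None


def search_tool(query=""):
    """Placeholder tool used only for this illustration."""
    return {"results": []}


# HYBRID (the default) accepts both shapes; a None effect means
# "classify this tool later".
bare = normalize_spec(search_tool)
explicit = normalize_spec({"tool": search_tool, "effect": "ReadOnly"})
```

The string `"ReadOnly"` stands in for an `EffectKind` value here. A caller would still run the classifier whenever the returned effect is `None`, as the diff does with `classify_from_name(name).effect_kind`.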
diff --git a/bindings/python/python/effect_log/middleware/langgraph.py b/bindings/python/python/effect_log/middleware/langgraph.py index 1ba0360..ef48f4f 100644 --- a/bindings/python/python/effect_log/middleware/langgraph.py +++ b/bindings/python/python/effect_log/middleware/langgraph.py @@ -9,7 +9,8 @@ log = EffectLog(execution_id="task-001", tools=tools, storage="sqlite:///effects.db") - # Option 1: Wrap existing LangChain tools + # Option 1: Wrap existing LangChain tools (auto-classified or explicit) + wrapped = effect_logged_tools(log, [search_tool, send_email_tool]) wrapped = effect_logged_tools(log, [ {"tool": search_tool, "effect": EffectKind.ReadOnly}, {"tool": send_email_tool, "effect": EffectKind.IrreversibleWrite}, @@ -79,48 +80,104 @@ def __call__(self, *args, **kwargs): def effect_logged_tools( log: EffectLog, - tool_specs: Sequence[dict[str, Any]], + tool_specs: Sequence[Any], + mode=None, ) -> list[EffectLoggedTool]: """Wrap a list of LangChain tools with effect-log tracking. + Accepts raw LangChain tools (auto-classified) or dicts with explicit effects. + Args: log: An initialized EffectLog instance (tools must already be registered). - tool_specs: List of dicts with keys: - - "tool": A LangChain BaseTool or @tool-decorated function - - "effect": The EffectKind for this tool + tool_specs: List of: + - LangChain BaseTool instances (auto-classified by name), or + - dicts with keys "tool" and optional "effect" (EffectKind) + mode: Optional ClassifyMode for validation. Returns: List of EffectLoggedTool wrappers compatible with LangGraph's ToolNode. 
""" + from effect_log import ClassifyMode + _ensure_langgraph() + + if mode is None: + mode = ClassifyMode.HYBRID + wrapped = [] for spec in tool_specs: - tool = spec["tool"] - effect = spec["effect"] + if isinstance(spec, dict): + tool = spec["tool"] + effect = spec.get("effect") + if mode is ClassifyMode.AUTO and effect is not None: + raise TypeError( + "In AUTO mode, specs must not include an explicit 'effect' key." + ) + if mode is ClassifyMode.MANUAL and effect is None: + raise TypeError( + "In MANUAL mode, all specs must include an explicit 'effect' key." + ) + else: + if mode is ClassifyMode.MANUAL: + raise TypeError( + "In MANUAL mode, all specs must include an explicit 'effect' key." + ) + tool = spec + effect = None + + if effect is None: + from effect_log.classify import classify_from_name + + effect = classify_from_name(tool.name).effect_kind + wrapped.append(EffectLoggedTool(log, tool, effect_kind_name=str(effect))) return wrapped -def make_tooldefs(tool_specs): +def make_tooldefs(tool_specs, mode=None): """Create ToolDef entries from LangChain SDK tool objects. - Eliminates the need to define tool functions twice — once for the SDK - and once for EffectLog. Handles both custom @tool functions (which have - a .func attribute) and pre-built tools (which use .invoke()). + Accepts raw LangChain tools (auto-classified) or dicts with explicit effects. Args: - tool_specs: List of dicts with keys: - - "tool": A LangChain BaseTool or @tool-decorated function - - "effect": The EffectKind for this tool + tool_specs: List of: + - LangChain BaseTool instances (auto-classified by name), or + - dicts with keys "tool" and optional "effect" (EffectKind) + mode: Optional ClassifyMode for validation. Returns: List of ToolDef instances ready for EffectLog construction. 
""" - from effect_log import ToolDef + from effect_log import ClassifyMode, ToolDef + from effect_log.classify import classify_from_name + + if mode is None: + mode = ClassifyMode.HYBRID defs = [] for spec in tool_specs: - tool, effect = spec["tool"], spec["effect"] + if isinstance(spec, dict): + tool = spec["tool"] + effect = spec.get("effect") + if mode is ClassifyMode.AUTO and effect is not None: + raise TypeError( + "In AUTO mode, specs must not include an explicit 'effect' key." + ) + if mode is ClassifyMode.MANUAL and effect is None: + raise TypeError( + "In MANUAL mode, all specs must include an explicit 'effect' key." + ) + else: + if mode is ClassifyMode.MANUAL: + raise TypeError( + "In MANUAL mode, all specs must include an explicit 'effect' key." + ) + tool = spec + effect = None + + if effect is None: + effect = classify_from_name(tool.name).effect_kind + fn = getattr(tool, "func", None) if fn is not None: diff --git a/bindings/python/python/effect_log/middleware/openai_agents.py b/bindings/python/python/effect_log/middleware/openai_agents.py index 7428acd..35adcf7 100644 --- a/bindings/python/python/effect_log/middleware/openai_agents.py +++ b/bindings/python/python/effect_log/middleware/openai_agents.py @@ -33,32 +33,53 @@ def _ensure_openai_agents(): ) -def make_tools(specs): +def make_tools(specs, mode=None): """Create both FunctionTool and ToolDef entries from raw functions. - OpenAI's @function_tool hides the function in a closure (no .func), - so this helper takes raw functions *before* decoration and produces - both the SDK tool and the EffectLog ToolDef. - - Pre-built OpenAI tools (WebSearchTool, FileSearchTool, etc.) execute - on OpenAI's side and cannot be effect-logged — pass them through - unchanged in effect_logged_agent(). + Accepts raw callables (auto-classified) or dicts with explicit effects. 
Args: - specs: List of dicts with keys: - - "func": A raw callable (not yet decorated with @function_tool) - - "effect": The EffectKind for this tool + specs: List of: + - callables (auto-classified), or + - dicts with keys "func" and optional "effect" (EffectKind) + mode: Optional ClassifyMode for validation. Returns: Tuple of (list[FunctionTool], list[ToolDef]). """ _ensure_openai_agents() from agents import function_tool as ft - from effect_log import ToolDef + from effect_log import ClassifyMode, ToolDef + from effect_log.classify import classify_effect_kind + + if mode is None: + mode = ClassifyMode.HYBRID sdk_tools, tooldefs = [], [] for spec in specs: - fn, effect = spec["func"], spec["effect"] + if callable(spec): + if mode is ClassifyMode.MANUAL: + raise TypeError( + "In MANUAL mode, all specs must include an explicit 'effect' key." + ) + fn, effect = spec, None + elif isinstance(spec, dict): + fn = spec["func"] + effect = spec.get("effect") + if mode is ClassifyMode.AUTO and effect is not None: + raise TypeError( + "In AUTO mode, specs must not include an explicit 'effect' key." + ) + if mode is ClassifyMode.MANUAL and effect is None: + raise TypeError( + "In MANUAL mode, all specs must include an explicit 'effect' key." + ) + else: + raise TypeError(f"Expected callable or dict, got {type(spec).__name__}") + + if effect is None: + effect = classify_effect_kind(fn).effect_kind + sdk_tools.append(ft(fn)) def adapted(args, _fn=fn): @@ -116,7 +137,7 @@ def effect_logged_agent( Args: log: An initialized EffectLog instance. agent: An Agent from the OpenAI Agents SDK. - tool_effects: Optional dict mapping tool name → EffectKind. + tool_effects: Optional dict mapping tool name -> EffectKind. Used for documentation only; tools must already be registered in the EffectLog. 
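Review note: `make_tools` defines `adapted(args, _fn=fn)` inside the loop, and the `_fn=fn` default is what ties each adapter to its own function. The sketch below shows what would likely go wrong without it. It is self-contained, and `make_adapters`, `make_adapters_late_bound`, and the two placeholder tools are hypothetical names used only for illustration:

```python
def make_adapters(funcs):
    """Bind each function at definition time via a default argument."""
    adapters = {}
    for fn in funcs:
        def adapted(args, _fn=fn):
            # _fn was captured when this def ran, so it is this iteration's fn.
            return _fn(**args)

        adapters[fn.__name__] = adapted
    return adapters


def make_adapters_late_bound(funcs):
    """Same loop without the default: every adapter shares one `fn` variable."""
    adapters = {}
    for fn in funcs:
        def adapted(args):
            # fn is looked up when adapted is called, after the loop has ended.
            return fn(**args)

        adapters[fn.__name__] = adapted
    return adapters


def search_db(**kwargs):
    """Placeholder tool; returns its own name so the caller can see who ran."""
    return ("search_db", kwargs)


def send_email(**kwargs):
    """Placeholder tool; returns its own name so the caller can see who ran."""
    return ("send_email", kwargs)


good = make_adapters([search_db, send_email])
bad = make_adapters_late_bound([search_db, send_email])

good_result = good["search_db"]({"query": "Q4"})
bad_result = bad["search_db"]({"query": "Q4"})
```

In this sketch `good_result` comes from `search_db`, while `bad_result` comes from `send_email`, the last function the loop saw. For an effect log that could mean an email tool running under a read-only tool's name, so the default-argument binding is worth keeping in every middleware adapter.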
diff --git a/bindings/python/python/effect_log/middleware/pydantic_ai.py b/bindings/python/python/effect_log/middleware/pydantic_ai.py index 4c91fd1..ccc26c4 100644 --- a/bindings/python/python/effect_log/middleware/pydantic_ai.py +++ b/bindings/python/python/effect_log/middleware/pydantic_ai.py @@ -74,26 +74,50 @@ def toolset(self): return self._wrapper -def make_tooldefs(tool_specs): +def make_tooldefs(tool_specs, mode=None): """Create ToolDef entries from raw functions for pydantic-ai tools. - Pydantic-ai accepts raw functions directly as tools, so this helper - takes the same functions and produces EffectLog ToolDefs — no duplicate - definitions needed. + Accepts raw callables (auto-classified) or dicts with explicit effects. Args: - tool_specs: List of dicts with keys: - - "func": A raw callable (the same function passed to pydantic-ai) - - "effect": The EffectKind for this tool + tool_specs: List of: + - callables (auto-classified), or + - dicts with keys "func" and optional "effect" (EffectKind) + mode: Optional ClassifyMode for validation. Returns: List of ToolDef instances ready for EffectLog construction. """ - from effect_log import ToolDef + from effect_log import ClassifyMode, ToolDef + from effect_log.classify import classify_effect_kind + + if mode is None: + mode = ClassifyMode.HYBRID defs = [] for spec in tool_specs: - fn, effect = spec["func"], spec["effect"] + if callable(spec): + if mode is ClassifyMode.MANUAL: + raise TypeError( + "In MANUAL mode, all specs must include an explicit 'effect' key." + ) + fn, effect = spec, None + elif isinstance(spec, dict): + fn = spec["func"] + effect = spec.get("effect") + if mode is ClassifyMode.AUTO and effect is not None: + raise TypeError( + "In AUTO mode, specs must not include an explicit 'effect' key." + ) + if mode is ClassifyMode.MANUAL and effect is None: + raise TypeError( + "In MANUAL mode, all specs must include an explicit 'effect' key." 
+ ) + else: + raise TypeError(f"Expected callable or dict, got {type(spec).__name__}") + + if effect is None: + effect = classify_effect_kind(fn).effect_kind def adapted(args, _fn=fn): return _fn(**args) @@ -115,7 +139,7 @@ def effect_logged_agent( Args: log: An initialized EffectLog instance. agent: A pydantic-ai Agent instance. - tool_effects: Optional dict mapping tool name → EffectKind. + tool_effects: Optional dict mapping tool name -> EffectKind. Only tools in this mapping go through the WAL; others pass through unchanged. diff --git a/bindings/python/tests/test_basic.py b/bindings/python/tests/test_basic.py index dc9d37b..cc701df 100644 --- a/bindings/python/tests/test_basic.py +++ b/bindings/python/tests/test_basic.py @@ -1,9 +1,11 @@ """Basic tests for the effect-log Python bindings.""" -import json import tempfile import os -from effect_log import EffectKind, EffectLog, ToolDef, tool + +import pytest + +from effect_log import EffectKind, EffectLog, ToolDef, tool, auto_tool def make_read_tool(): @@ -66,6 +68,46 @@ def my_reader(args): assert result["data"] == "hello" +def test_tool_decorator_auto_classify(): + """The @tool() decorator with no effect auto-classifies.""" + + @tool() + def search_db(args): + return {"results": ["a", "b"]} + + assert isinstance(search_db, ToolDef) + # Verify it works as ReadOnly by executing through EffectLog + log = EffectLog(execution_id="test-auto-dec", tools=[search_db]) + result = log.execute("search_db", {}) + assert result["results"] == ["a", "b"] + + +def test_tool_decorator_no_parens(): + """The @tool decorator without parens auto-classifies.""" + + @tool + def search_db(args): + return {"results": ["a", "b"]} + + assert isinstance(search_db, ToolDef) + log = EffectLog(execution_id="test-no-parens", tools=[search_db]) + result = log.execute("search_db", {}) + assert result["results"] == ["a", "b"] + + +def test_auto_tool_decorator(): + """The @auto_tool decorator auto-classifies.""" + + @auto_tool + def 
fetch_data(args): + return {"data": "hello"} + + assert isinstance(fetch_data, ToolDef) + log = EffectLog(execution_id="test-auto-tool", tools=[fetch_data]) + result = log.execute("fetch_data", {}) + assert result["data"] == "hello" + + def test_recovery_sealed_irreversible(): """IrreversibleWrite returns sealed result on recovery.""" with tempfile.TemporaryDirectory() as tmpdir: @@ -93,25 +135,145 @@ def test_recovery_sealed_irreversible(): recover=True, ) - # read_file: ReadOnly → replayed (ReplayFresh policy) + # read_file: ReadOnly -> replayed (ReplayFresh policy) log2.execute("read_file", {"path": "/tmp/a"}) - # send_email: IrreversibleWrite → sealed, NOT re-executed + # send_email: IrreversibleWrite -> sealed, NOT re-executed result = log2.execute("send_email", {"to": "ceo@co.com"}) assert result["sent_to"] == "ceo@co.com" assert call_count2["n"] == 0 # NOT re-executed -def test_human_review_on_crash(): - """Crashed IrreversibleWrite escalates to human review.""" - # We need to manually write an intent without completion to simulate crash. - # Use the in-memory store through normal execution, but this test verifies - # the error path through the Python API. 
+# ── Auto-classification integration tests ──────────────────────────────────── + + +def test_auto_classify_raw_callables(): + """EffectLog accepts raw callables and auto-classifies them.""" + + def search_db(query=""): + return {"results": [f"result for {query}"]} + + def send_email(to="", subject=""): + return {"sent": True, "to": to} + + def upsert_record(id="", data=None): + return {"id": id, "updated": True} + + log = EffectLog( + execution_id="test-auto-1", + tools=[search_db, send_email, upsert_record], + ) + + result = log.execute("search_db", {"query": "Q4"}) + assert result["results"] == ["result for Q4"] + + result = log.execute("send_email", {"to": "ceo@co.com", "subject": "Report"}) + assert result["sent"] is True + + result = log.execute("upsert_record", {"id": "123", "data": {"x": 1}}) + assert result["updated"] is True + + +def test_auto_classify_with_overrides(): + """EffectLog overrides= corrects misclassifications.""" + + def process_order(order_id=""): + return {"processed": order_id} + + def search_db(query=""): + return {"results": []} + + log = EffectLog( + execution_id="test-auto-2", + tools=[search_db, process_order], + overrides={"process_order": EffectKind.IdempotentWrite}, + ) + + result = log.execute("process_order", {"order_id": "ORD-001"}) + assert result["processed"] == "ORD-001" + - # For this test, we use SQLite and manually insert an orphaned intent. - # Simpler approach: create, execute step 1, then "recover" and try step 1 again - # when it was irreversible and crashed (no completion). +def test_mixed_tooldef_and_callable(): + """EffectLog accepts a mix of ToolDef and raw callables.""" - # This is tricky to test from Python without low-level store access. - # The Rust integration tests cover this path thoroughly. 
- pass + def search_db(query=""): + return {"results": [f"result for {query}"]} + + email_tool, call_count = make_email_tool() + + log = EffectLog( + execution_id="test-mixed", + tools=[search_db, email_tool], # callable + ToolDef + ) + + result = log.execute("search_db", {"query": "test"}) + assert "results" in result + + result = log.execute("send_email", {"to": "user@example.com"}) + assert call_count["n"] == 1 + + +def test_auto_classify_recovery(): + """Auto-classified tools work correctly with crash recovery.""" + with tempfile.TemporaryDirectory() as tmpdir: + db_path = os.path.join(tmpdir, "test.db") + storage = f"sqlite:///{db_path}" + + call_counts = {"search": 0, "send": 0} + + def search_db(query=""): + call_counts["search"] += 1 + return {"results": ["a"]} + + def send_email(to=""): + call_counts["send"] += 1 + return {"sent": True, "to": to} + + # First execution + log = EffectLog( + execution_id="task-recovery", + tools=[search_db, send_email], + storage=storage, + ) + log.execute("search_db", {"query": "test"}) + log.execute("send_email", {"to": "ceo@co.com"}) + assert call_counts["search"] == 1 + assert call_counts["send"] == 1 + + # Recovery with fresh call counts + call_counts2 = {"search": 0, "send": 0} + + def search_db2(query=""): + call_counts2["search"] += 1 + return {"results": ["a"]} + + def send_email2(to=""): + call_counts2["send"] += 1 + return {"sent": True, "to": to} + + # Use same function names for recovery + search_db2.__name__ = "search_db" + send_email2.__name__ = "send_email" + + log2 = EffectLog( + execution_id="task-recovery", + tools=[search_db2, send_email2], + storage=storage, + recover=True, + ) + + log2.execute("search_db", {"query": "test"}) + result = log2.execute("send_email", {"to": "ceo@co.com"}) + + # send_email is IrreversibleWrite -> sealed, NOT re-executed + assert call_counts2["send"] == 0 + assert result["sent"] is True + + +def test_invalid_tool_type_raises(): + """EffectLog raises TypeError for invalid tool 
types.""" + with pytest.raises(TypeError, match="Expected ToolDef or callable"): + EffectLog( + execution_id="test-invalid", + tools=["not a callable"], + ) diff --git a/bindings/python/tests/test_classify.py b/bindings/python/tests/test_classify.py new file mode 100644 index 0000000..e60d03f --- /dev/null +++ b/bindings/python/tests/test_classify.py @@ -0,0 +1,533 @@ +"""Tests for the auto-classification system.""" + +import logging +from unittest.mock import patch + +import pytest + +from effect_log import EffectKind +from effect_log.classify import ( + ClassificationReport, + ClassificationResult, + classify_effect_kind, + classify_from_name, + classify_tools, + classify_with_llm, +) + + +# ── Name prefix classification ─────────────────────────────────────────────── + + +class TestNamePrefixClassification: + """Test Layer 1: Name prefix matching.""" + + @pytest.mark.parametrize( + "name, expected", + [ + # ReadOnly prefixes + ("read_file", EffectKind.ReadOnly), + ("fetch_data", EffectKind.ReadOnly), + ("get_user", EffectKind.ReadOnly), + ("search_db", EffectKind.ReadOnly), + ("query_table", EffectKind.ReadOnly), + ("list_items", EffectKind.ReadOnly), + ("check_status", EffectKind.ReadOnly), + ("validate_input", EffectKind.ReadOnly), + ("describe_table", EffectKind.ReadOnly), + ("count_records", EffectKind.ReadOnly), + ("find_user", EffectKind.ReadOnly), + ("lookup_address", EffectKind.ReadOnly), + ("browse_catalog", EffectKind.ReadOnly), + ("view_report", EffectKind.ReadOnly), + ("show_details", EffectKind.ReadOnly), + ("inspect_object", EffectKind.ReadOnly), + ("parse_json", EffectKind.ReadOnly), + ("transform_data", EffectKind.ReadOnly), + ("format_output", EffectKind.ReadOnly), + ("log_event", EffectKind.ReadOnly), + ("trace_request", EffectKind.ReadOnly), + # IrreversibleWrite prefixes + ("send_email", EffectKind.IrreversibleWrite), + ("email_user", EffectKind.IrreversibleWrite), + ("notify_admin", EffectKind.IrreversibleWrite), + ("broadcast_message", 
EffectKind.IrreversibleWrite), + ("publish_event", EffectKind.IrreversibleWrite), + ("deploy_service", EffectKind.IrreversibleWrite), + ("delete_record", EffectKind.IrreversibleWrite), + ("remove_item", EffectKind.IrreversibleWrite), + ("destroy_instance", EffectKind.IrreversibleWrite), + ("drop_table", EffectKind.IrreversibleWrite), + ("purge_cache", EffectKind.IrreversibleWrite), + ("revoke_token", EffectKind.IrreversibleWrite), + ("terminate_process", EffectKind.IrreversibleWrite), + ("kill_session", EffectKind.IrreversibleWrite), + ("post_to_slack", EffectKind.IrreversibleWrite), + ("tweet_update", EffectKind.IrreversibleWrite), + ("sms_alert", EffectKind.IrreversibleWrite), + # IrreversibleWrite (non-idempotent creates) + ("create_user", EffectKind.IrreversibleWrite), + ("add_item", EffectKind.IrreversibleWrite), + ("insert_row", EffectKind.IrreversibleWrite), + # IdempotentWrite prefixes + ("upsert_record", EffectKind.IdempotentWrite), + ("update_profile", EffectKind.IdempotentWrite), + ("put_object", EffectKind.IdempotentWrite), + ("save_document", EffectKind.IdempotentWrite), + ("set_config", EffectKind.IdempotentWrite), + ("write_file", EffectKind.IdempotentWrite), + ("store_data", EffectKind.IdempotentWrite), + ("upload_image", EffectKind.IdempotentWrite), + ("register_webhook", EffectKind.IdempotentWrite), + ("configure_service", EffectKind.IdempotentWrite), + ("enable_feature", EffectKind.IdempotentWrite), + ("disable_feature", EffectKind.IdempotentWrite), + ("assign_role", EffectKind.IdempotentWrite), + ("tag_resource", EffectKind.IdempotentWrite), + # Compensatable -> IrreversibleWrite (auto-downgraded) + ("reserve_seat", EffectKind.IrreversibleWrite), + ("lock_resource", EffectKind.IrreversibleWrite), + ("allocate_memory", EffectKind.IrreversibleWrite), + ("book_flight", EffectKind.IrreversibleWrite), + ("hold_inventory", EffectKind.IrreversibleWrite), + ("checkout_item", EffectKind.IrreversibleWrite), + ("claim_ticket", 
EffectKind.IrreversibleWrite), + # ReadThenWrite prefixes + ("transfer_funds", EffectKind.ReadThenWrite), + ("swap_items", EffectKind.ReadThenWrite), + ("exchange_currency", EffectKind.ReadThenWrite), + ("move_file", EffectKind.ReadThenWrite), + ("migrate_data", EffectKind.ReadThenWrite), + ("sync_databases", EffectKind.ReadThenWrite), + ("reconcile_accounts", EffectKind.ReadThenWrite), + ], + ) + def test_prefix_classification(self, name, expected): + def dummy(args): + pass + + dummy.__name__ = name + result = classify_effect_kind(dummy, name) + assert result.effect_kind == expected, ( + f"{name}: expected {expected}, got {result.effect_kind}" + ) + + @pytest.mark.parametrize( + "name, expected", + [ + ("search", EffectKind.ReadOnly), + ("query", EffectKind.ReadOnly), + ("fetch", EffectKind.ReadOnly), + ("get", EffectKind.ReadOnly), + ("read", EffectKind.ReadOnly), + ("list", EffectKind.ReadOnly), + ("find", EffectKind.ReadOnly), + ("lookup", EffectKind.ReadOnly), + ("check", EffectKind.ReadOnly), + ("validate", EffectKind.ReadOnly), + ("count", EffectKind.ReadOnly), + ("parse", EffectKind.ReadOnly), + ("transform", EffectKind.ReadOnly), + ("format", EffectKind.ReadOnly), + ("log", EffectKind.ReadOnly), + ("send", EffectKind.IrreversibleWrite), + ("email", EffectKind.IrreversibleWrite), + ("notify", EffectKind.IrreversibleWrite), + ("publish", EffectKind.IrreversibleWrite), + ("deploy", EffectKind.IrreversibleWrite), + ("delete", EffectKind.IrreversibleWrite), + ("remove", EffectKind.IrreversibleWrite), + ("destroy", EffectKind.IrreversibleWrite), + ("purge", EffectKind.IrreversibleWrite), + ("create", EffectKind.IrreversibleWrite), + ("upsert", EffectKind.IdempotentWrite), + ("update", EffectKind.IdempotentWrite), + ("save", EffectKind.IdempotentWrite), + ("insert", EffectKind.IrreversibleWrite), + ("write", EffectKind.IdempotentWrite), + ("store", EffectKind.IdempotentWrite), + ("upload", EffectKind.IdempotentWrite), + ("register", EffectKind.IdempotentWrite), 
+            ("transfer", EffectKind.ReadThenWrite),
+            ("swap", EffectKind.ReadThenWrite),
+            ("migrate", EffectKind.ReadThenWrite),
+            ("sync", EffectKind.ReadThenWrite),
+            # Compensatable exact -> downgraded
+            ("reserve", EffectKind.IrreversibleWrite),
+            ("lock", EffectKind.IrreversibleWrite),
+            ("book", EffectKind.IrreversibleWrite),
+        ],
+    )
+    def test_exact_name_classification(self, name, expected):
+        def dummy(args):
+            pass
+
+        dummy.__name__ = name
+        result = classify_effect_kind(dummy, name)
+        assert result.effect_kind == expected
+
+
+class TestDocstringClassification:
+    """Test Layer 2: Docstring keyword analysis."""
+
+    def test_readonly_docstring(self):
+        def my_func(args):
+            """Retrieves data from the database. This is a read-only operation."""
+            pass
+
+        result = classify_effect_kind(my_func, "my_func")
+        assert result.effect_kind == EffectKind.ReadOnly
+
+    def test_irreversible_docstring(self):
+        def my_func(args):
+            """Sends email notification. This is irreversible and cannot be undone."""
+            pass
+
+        result = classify_effect_kind(my_func, "my_func")
+        assert result.effect_kind == EffectKind.IrreversibleWrite
+
+    def test_idempotent_docstring(self):
+        def my_func(args):
+            """Upserts record. This is idempotent and safe to retry."""
+            pass
+
+        result = classify_effect_kind(my_func, "my_func")
+        assert result.effect_kind == EffectKind.IdempotentWrite
+
+    def test_name_overrides_docstring(self):
+        """Name prefix has higher weight than docstring."""
+
+        def search_records(args):
+            """Sends results after searching."""
+            pass
+
+        result = classify_effect_kind(search_records)
+        # "search_" prefix (weight 0.50) > "Sends" docstring keyword (weight 0.25)
+        assert result.effect_kind == EffectKind.ReadOnly
+
+
+class TestParameterClassification:
+    """Test Layer 3: Parameter name signals."""
+
+    def test_recipient_params_suggest_irreversible(self):
+        def my_func(to, subject, body):
+            pass
+
+        result = classify_effect_kind(my_func, "my_func")
+        assert result.effect_kind == EffectKind.IrreversibleWrite
+
+    def test_query_params_suggest_readonly(self):
+        def my_func(query, filter):
+            pass
+
+        result = classify_effect_kind(my_func, "my_func")
+        assert result.effect_kind == EffectKind.ReadOnly
+
+    def test_id_params_suggest_idempotent(self):
+        def my_func(id, key):
+            pass
+
+        result = classify_effect_kind(my_func, "my_func")
+        assert result.effect_kind == EffectKind.IdempotentWrite
+
+
+class TestDefaultClassification:
+    """Test fallback behavior for unrecognized functions."""
+
+    def test_unknown_function_defaults_to_irreversible(self):
+        def do_something(args):
+            pass
+
+        result = classify_effect_kind(do_something)
+        assert result.effect_kind == EffectKind.IrreversibleWrite
+        assert result.confidence < 0.6
+
+    def test_low_confidence_for_unknown(self):
+        def process_order(args):
+            pass
+
+        result = classify_effect_kind(process_order)
+        assert result.confidence < 0.6
+
+
+class TestConfidence:
+    """Test confidence scoring behavior."""
+
+    def test_high_confidence_for_clear_prefix(self):
+        def search_db(args):
+            pass
+
+        result = classify_effect_kind(search_db)
+        assert result.confidence >= 0.5
+
+    def test_higher_confidence_with_multiple_signals(self):
+        def search_db(query):
+            """Retrieves data from the database. Read-only operation."""
+            pass
+
+        result = classify_effect_kind(search_db)
+        # Name prefix (0.50) + docstring (0.25) + params (0.15) = potentially high
+        assert result.confidence >= 0.5
+
+    def test_compensatable_downgraded(self):
+        """Compensatable auto-downgrades to IrreversibleWrite."""
+
+        def reserve_seat(args):
+            pass
+
+        result = classify_effect_kind(reserve_seat)
+        assert result.effect_kind == EffectKind.IrreversibleWrite
+        assert "downgraded" in result.reason
+
+
+# ── classify_from_name ───────────────────────────────────────────────────────
+
+
+class TestClassifyFromName:
+    """Test name-only classification for middleware."""
+
+    def test_prefix_match(self):
+        result = classify_from_name("search_db")
+        assert result.effect_kind == EffectKind.ReadOnly
+
+    def test_exact_match(self):
+        result = classify_from_name("delete")
+        assert result.effect_kind == EffectKind.IrreversibleWrite
+
+    def test_unknown_name(self):
+        result = classify_from_name("foobar")
+        assert result.effect_kind == EffectKind.IrreversibleWrite
+        assert result.confidence == 0.0
+
+
+# ── classify_tools (batch) ───────────────────────────────────────────────────
+
+
+class TestClassifyTools:
+    """Test batch classification."""
+
+    def test_batch_classify(self):
+        def search_db(query):
+            pass
+
+        def send_email(to, subject):
+            pass
+
+        def upsert_record(id, data):
+            pass
+
+        report = classify_tools([search_db, send_email, upsert_record])
+        assert isinstance(report, ClassificationReport)
+        assert len(report.results) == 3
+        assert report.results["search_db"].effect_kind == EffectKind.ReadOnly
+        assert report.results["send_email"].effect_kind == EffectKind.IrreversibleWrite
+        assert report.results["upsert_record"].effect_kind == EffectKind.IdempotentWrite
+
+    def test_report_str(self):
+        def search_db(query):
+            pass
+
+        def send_email(to):
+            pass
+
+        report = classify_tools([search_db, send_email])
+        s = str(report)
+        assert "search_db" in s
+        assert "send_email" in s
+        assert "ReadOnly" in s
+        assert "IrreversibleWrite" in s
+
+    def test_report_apply(self):
+        from effect_log import ToolDef
+
+        def search_db(query=""):
+            return f"results for {query}"
+
+        def send_email(to=""):
+            return f"sent to {to}"
+
+        report = classify_tools([search_db, send_email])
+        defs = report.apply()
+        assert len(defs) == 2
+        assert all(isinstance(d, ToolDef) for d in defs)
+
+    def test_report_apply_with_overrides(self):
+        def search_db(query=""):
+            return f"results for {query}"
+
+        def process_order(order_id=""):
+            return f"processed {order_id}"
+
+        report = classify_tools([search_db, process_order])
+        defs = report.apply(overrides={"process_order": EffectKind.IdempotentWrite})
+        assert len(defs) == 2
+
+        # Verify overrides produce working ToolDefs by constructing EffectLog
+        from effect_log import EffectLog
+
+        log = EffectLog(execution_id="test-override", tools=defs)
+        result = log.execute("process_order", {"order_id": "ORD-1"})
+        assert result == "processed ORD-1"
+        result = log.execute("search_db", {"query": "test"})
+        assert result == "results for test"
+
+
+# ── ClassificationResult ─────────────────────────────────────────────────────
+
+
+class TestClassificationResult:
+    def test_fields(self):
+        r = ClassificationResult(
+            effect_kind=EffectKind.ReadOnly,
+            confidence=0.92,
+            reason="prefix 'search_'",
+            source="heuristic",
+        )
+        assert r.effect_kind == EffectKind.ReadOnly
+        assert r.confidence == 0.92
+        assert r.source == "heuristic"
+
+
+# ── Logging ──────────────────────────────────────────────────────────────────
+
+
+class TestLogging:
+    def test_high_confidence_logs_info(self, caplog):
+        def search_db(args):
+            pass
+
+        with caplog.at_level(logging.INFO, logger="effect_log.classify"):
+            classify_effect_kind(search_db)
+        assert any(
+            "search_db" in r.message and "ReadOnly" in r.message for r in caplog.records
+        )
+
+    def test_low_confidence_logs_warning(self, caplog):
+        def do_something(args):
+            pass
+
+        with caplog.at_level(logging.WARNING, logger="effect_log.classify"):
+            classify_effect_kind(do_something)
+        assert any(
+            "do_something" in r.message and "consider specifying" in r.message
+            for r in caplog.records
+        )
+
+
+# ── classify_with_llm ────────────────────────────────────────────────────────
+
+
+class TestClassifyWithLlm:
+    """Test LLM-based classification (mocked — no real API calls)."""
+
+    def _make_func(self):
+        def send_report(to, subject):
+            """Send a report via email."""
+            pass
+
+        return send_report
+
+    def test_requires_opt_in(self):
+        """Raises without EFFECT_LOG_LLM_CLASSIFY=1 or explicit provider."""
+        with pytest.raises(RuntimeError, match="EFFECT_LOG_LLM_CLASSIFY"):
+            classify_with_llm(self._make_func())
+
+    @patch("effect_log.classify._call_anthropic", return_value="IrreversibleWrite")
+    def test_anthropic_provider(self, mock_call):
+        result = classify_with_llm(
+            self._make_func(), provider="anthropic", model="test-model"
+        )
+        assert result.effect_kind == EffectKind.IrreversibleWrite
+        assert result.confidence == 0.80
+        assert result.source == "llm"
+        assert "IrreversibleWrite" in result.reason
+        mock_call.assert_called_once()
+        assert mock_call.call_args[0][1] == "test-model"
+
+    @patch("effect_log.classify._call_openai", return_value="ReadOnly")
+    def test_openai_provider(self, mock_call):
+        result = classify_with_llm(
+            self._make_func(), provider="openai", model="gpt-4o-mini"
+        )
+        assert result.effect_kind == EffectKind.ReadOnly
+        assert result.confidence == 0.80
+        mock_call.assert_called_once()
+
+    @patch("effect_log.classify._call_anthropic", return_value="IdempotentWrite")
+    def test_auto_detect_anthropic(self, mock_call):
+        """Auto-detect tries anthropic first."""
+        result = classify_with_llm(self._make_func(), provider="anthropic")
+        assert result.effect_kind == EffectKind.IdempotentWrite
+
+    @patch(
+        "effect_log.classify._call_anthropic",
+        side_effect=ImportError("no anthropic"),
+    )
+    @patch("effect_log.classify._call_openai", return_value="ReadThenWrite")
+    def test_auto_detect_falls_back_to_openai(self, mock_openai, mock_anthropic):
+        """Auto-detect falls back to openai when anthropic not installed."""
+        with patch.dict("os.environ", {"EFFECT_LOG_LLM_CLASSIFY": "1"}):
+            result = classify_with_llm(self._make_func())
+        assert result.effect_kind == EffectKind.ReadThenWrite
+        mock_anthropic.assert_called_once()
+        mock_openai.assert_called_once()
+
+    @patch(
+        "effect_log.classify._call_anthropic",
+        side_effect=ImportError("no anthropic"),
+    )
+    @patch(
+        "effect_log.classify._call_openai",
+        side_effect=ImportError("no openai"),
+    )
+    def test_auto_detect_no_sdk_raises(self, mock_openai, mock_anthropic):
+        """Raises ImportError when neither SDK is available."""
+        with patch.dict("os.environ", {"EFFECT_LOG_LLM_CLASSIFY": "1"}):
+            with pytest.raises(ImportError, match="anthropic or openai"):
+                classify_with_llm(self._make_func())
+
+    @patch("effect_log.classify._call_anthropic", return_value="Compensatable")
+    def test_compensatable_downgraded(self, mock_call):
+        """Compensatable is safety-downgraded to IrreversibleWrite."""
+        result = classify_with_llm(self._make_func(), provider="anthropic")
+        assert result.effect_kind == EffectKind.IrreversibleWrite
+        assert result.confidence == 0.80
+
+    @patch("effect_log.classify._call_anthropic", return_value="garbage_response")
+    def test_unknown_response_defaults_to_irreversible(self, mock_call):
+        """Unrecognized LLM output defaults to IrreversibleWrite with 0 confidence."""
+        result = classify_with_llm(self._make_func(), provider="anthropic")
+        assert result.effect_kind == EffectKind.IrreversibleWrite
+        assert result.confidence == 0.0
+
+    @patch("effect_log.classify._call_openai", return_value="ReadOnly")
+    def test_env_model_override(self, mock_call):
+        """EFFECT_LOG_LLM_MODEL env var sets the model."""
+        with patch.dict(
+            "os.environ",
+            {"EFFECT_LOG_LLM_CLASSIFY": "1", "EFFECT_LOG_LLM_MODEL": "custom-model"},
+        ):
+            classify_with_llm(self._make_func(), provider="openai")
+        assert mock_call.call_args[0][1] == "custom-model"
+
+    @patch("effect_log.classify._call_openai", return_value="ReadOnly")
+    def test_explicit_model_overrides_env(self, mock_call):
+        """Explicit model= takes precedence over env var."""
+        with patch.dict(
+            "os.environ",
+            {"EFFECT_LOG_LLM_CLASSIFY": "1", "EFFECT_LOG_LLM_MODEL": "env-model"},
+        ):
+            classify_with_llm(
+                self._make_func(), provider="openai", model="explicit-model"
+            )
+        assert mock_call.call_args[0][1] == "explicit-model"
+
+    @patch("effect_log.classify._call_anthropic", return_value="ReadOnly")
+    def test_env_provider(self, mock_call):
+        """EFFECT_LOG_LLM_PROVIDER env var selects the provider."""
+        with patch.dict(
+            "os.environ",
+            {"EFFECT_LOG_LLM_CLASSIFY": "1", "EFFECT_LOG_LLM_PROVIDER": "anthropic"},
+        ):
+            classify_with_llm(self._make_func())
+        mock_call.assert_called_once()
diff --git a/bindings/python/tests/test_classify_mode.py b/bindings/python/tests/test_classify_mode.py
new file mode 100644
index 0000000..78496de
--- /dev/null
+++ b/bindings/python/tests/test_classify_mode.py
@@ -0,0 +1,130 @@
+"""Tests for ClassifyMode enum and EffectLog.auto() / EffectLog.manual()."""
+
+import pytest
+
+from effect_log import ClassifyMode, EffectKind, EffectLog, ToolDef
+
+
+# -- helpers ------------------------------------------------------------------
+
+
+def search_db(query=""):
+    return {"results": [f"result for {query}"]}
+
+
+def send_email(to=""):
+    return {"sent": True, "to": to}
+
+
+def _make_tooldef():
+    def read_file(args):
+        return {"content": "data"}
+
+    return ToolDef("read_file", EffectKind.ReadOnly, read_file)
+
+
+# -- EffectLog.auto() --------------------------------------------------------
+
+
+def test_auto_with_callables():
+    """EffectLog.auto() succeeds with raw callables."""
+    log = EffectLog.auto("test-auto-ok", tools=[search_db, send_email])
+    result = log.execute("search_db", {"query": "test"})
+    assert result["results"] == ["result for test"]
+
+
+def test_auto_with_tooldef_raises():
+    """EffectLog.auto() rejects ToolDef instances."""
+    with pytest.raises(TypeError, match="AUTO mode"):
+        EffectLog.auto("test-auto-fail", tools=[_make_tooldef()])
+
+
+def test_auto_with_overrides():
+    """EffectLog.auto() allows overrides."""
+    log = EffectLog.auto(
+        "test-auto-override",
+        tools=[search_db],
+        overrides={"search_db": EffectKind.IdempotentWrite},
+    )
+    result = log.execute("search_db", {"query": "x"})
+    assert "results" in result
+
+
+# -- EffectLog.manual() ------------------------------------------------------
+
+
+def test_manual_with_tooldefs():
+    """EffectLog.manual() succeeds with ToolDef instances."""
+    td = _make_tooldef()
+    log = EffectLog.manual("test-manual-ok", tools=[td])
+    result = log.execute("read_file", {})
+    assert result["content"] == "data"
+
+
+def test_manual_with_callable_raises():
+    """EffectLog.manual() rejects raw callables."""
+    with pytest.raises(TypeError, match="MANUAL mode"):
+        EffectLog.manual("test-manual-fail", tools=[search_db])
+
+
+def test_manual_with_overrides_raises():
+    """MANUAL mode + overrides raises ValueError."""
+    with pytest.raises(ValueError, match="overrides.*MANUAL mode"):
+        EffectLog(
+            "test-manual-overrides",
+            tools=[_make_tooldef()],
+            mode=ClassifyMode.MANUAL,
+            overrides={"read_file": EffectKind.IdempotentWrite},
+        )
+
+
+# -- HYBRID (default) --------------------------------------------------------
+
+
+def test_hybrid_mixed_input():
+    """Default HYBRID mode accepts both callables and ToolDefs."""
+    td = _make_tooldef()
+    log = EffectLog("test-hybrid", tools=[search_db, td])
+    assert log.execute("search_db", {"query": "q"})["results"] == ["result for q"]
+    assert log.execute("read_file", {})["content"] == "data"
+
+
+def test_default_mode_is_hybrid():
+    """No mode= argument behaves as HYBRID (backward compat)."""
+    td = _make_tooldef()
+    log = EffectLog("test-default", tools=[search_db, td])
+    result = log.execute("search_db", {"query": "q"})
+    assert "results" in result
+
+
+def test_explicit_hybrid_mode():
+    """Explicitly passing mode=HYBRID works the same."""
+    log = EffectLog(
+        "test-explicit-hybrid",
+        tools=[search_db],
+        mode=ClassifyMode.HYBRID,
+    )
+    result = log.execute("search_db", {"query": "q"})
+    assert "results" in result
+
+
+# -- Edge cases ---------------------------------------------------------------
+
+
+def test_auto_empty_tools():
+    """EffectLog.auto() with empty tools list succeeds."""
+    log = EffectLog.auto("test-auto-empty", tools=[])
+    assert log.history() == []
+
+
+def test_manual_empty_tools():
+    """EffectLog.manual() with empty tools list succeeds."""
+    log = EffectLog.manual("test-manual-empty", tools=[])
+    assert log.history() == []
+
+
+def test_classify_mode_enum_values():
+    """ClassifyMode enum has expected values."""
+    assert ClassifyMode.AUTO.value == "auto"
+    assert ClassifyMode.MANUAL.value == "manual"
+    assert ClassifyMode.HYBRID.value == "hybrid"
diff --git a/bindings/python/tests/test_middleware.py b/bindings/python/tests/test_middleware.py
index a7d4f40..c3fddcf 100644
--- a/bindings/python/tests/test_middleware.py
+++ b/bindings/python/tests/test_middleware.py
@@ -9,7 +9,6 @@ import types
 from unittest.mock import MagicMock, AsyncMock
-import pytest
 from effect_log import EffectKind, EffectLog, ToolDef
@@ -187,7 +186,7 @@ def my_search(query=""):
         assert len(defs) == 1
         log = make_log(defs)
-        result = log.execute("search", {"query": "python"})
+        log.execute("search", {"query": "python"})
         assert call_log == [("search", "python")]
     def test_make_tooldefs_without_func(self):
@@ -201,13 +200,14 @@ def test_make_tooldefs_without_func(self):
         defs = make_tooldefs([{"tool": mock_tool, "effect": EffectKind.ReadOnly}])
         log = make_log(defs)
-        result = log.execute("tavily_search", {"query": "test"})
+        log.execute("tavily_search", {"query": "test"})
         mock_tool.invoke.assert_called_once_with({"query": "test"})
     def test_recovery_through_langgraph(self):
         """Verify sealed results work through the LangGraph middleware."""
-        import tempfile, os
+        import tempfile
+        import os
         from effect_log.middleware.langgraph import effect_logged_tools
         tmpdir = tempfile.mkdtemp()
@@ -516,7 +516,8 @@ def test_make_tooldefs_without_func(self):
     def test_recovery_through_crewai(self):
         """Verify sealed results work through the CrewAI middleware."""
-        import tempfile, os
+        import tempfile
+        import os
         from effect_log.middleware.crewai import effect_logged_tool
         tmpdir = tempfile.mkdtemp()
@@ -698,7 +699,8 @@ def test_effect_logged_agent(self):
     def test_recovery_through_pydantic_ai(self):
         """Verify sealed results work through the pydantic-ai middleware."""
-        import tempfile, os
+        import tempfile
+        import os
         from effect_log.middleware.pydantic_ai import EffectLogToolset
         tmpdir = tempfile.mkdtemp()
@@ -708,9 +710,7 @@ def test_recovery_through_pydantic_ai(self):
         log = make_log(tools, storage=db)
         mock_inner = MagicMock()
-        wrapper = EffectLogToolset(
-            log, mock_inner, {"send_email": EffectKind.IrreversibleWrite}
-        )
+        EffectLogToolset(log, mock_inner, {"send_email": EffectKind.IrreversibleWrite})
         # First execution — goes through WAL synchronously
         result1 = log.execute("send_email", {"to": "ceo@co.com"})
@@ -723,7 +723,7 @@ def test_recovery_through_pydantic_ai(self):
         )
         mock_inner2 = MagicMock()
-        wrapper2 = EffectLogToolset(
+        EffectLogToolset(
             log2, mock_inner2, {"send_email": EffectKind.IrreversibleWrite}
         )
@@ -866,7 +866,8 @@ def test_process_tool_calls(self):
     def test_recovery_through_anthropic(self):
         """Verify sealed results work through the Anthropic middleware."""
-        import tempfile, os
+        import tempfile
+        import os
         from effect_log.middleware.anthropic import effect_logged_tool_executor
         tmpdir = tempfile.mkdtemp()
@@ -895,3 +896,304 @@ def test_recovery_through_anthropic(self):
         assert counts2["send_email"] == 0
         # Same content
         assert result1["content"] == result2["content"]
+
+
+# ── Auto-classification middleware tests ──────────────────────────────────────
+
+
+class TestAnthropicAutoClassify:
+    """Test Anthropic middleware accepts raw callables."""
+
+    def setup_method(self):
+        self.anthropic_mod = _mock_anthropic()
+
+    def teardown_method(self):
+        sys.modules.pop("anthropic", None)
+        sys.modules.pop("effect_log.middleware.anthropic", None)
+
+    def test_make_tooldefs_raw_callables(self):
+        from effect_log.middleware.anthropic import make_tooldefs
+
+        call_log = []
+
+        def search_db(query: str = "") -> str:
+            call_log.append(("search", query))
+            return f"results for {query}"
+
+        def send_email(to: str = "") -> str:
+            call_log.append(("email", to))
+            return f"sent to {to}"
+
+        # Pass raw callables — auto-classified
+        defs = make_tooldefs([search_db, send_email])
+        assert len(defs) == 2
+
+        # Verify they work correctly through EffectLog
+        log = make_log(defs)
+        log.execute("search_db", {"query": "test"})
+        assert call_log == [("search", "test")]
+
+    def test_make_tooldefs_mixed(self):
+        from effect_log.middleware.anthropic import make_tooldefs
+
+        call_log = []
+
+        def search_db(query: str = "") -> str:
+            call_log.append(("search", query))
+            return f"results for {query}"
+
+        def process_order(order_id: str = "") -> str:
+            call_log.append(("order", order_id))
+            return f"processed {order_id}"
+
+        # Mix of raw callable and dict with explicit effect
+        defs = make_tooldefs(
+            [
+                search_db,
+                {"func": process_order, "effect": EffectKind.IdempotentWrite},
+            ]
+        )
+
+        assert len(defs) == 2
+        log = make_log(defs)
+        log.execute("search_db", {"query": "test"})
+        log.execute("process_order", {"order_id": "ORD-1"})
+        assert call_log == [("search", "test"), ("order", "ORD-1")]
+
+    def test_make_tooldefs_dict_without_effect(self):
+        from effect_log.middleware.anthropic import make_tooldefs
+
+        call_log = []
+
+        def search_db(query: str = "") -> str:
+            call_log.append(("search", query))
+            return f"results for {query}"
+
+        # Dict without effect -> auto-classified
+        defs = make_tooldefs([{"func": search_db}])
+        log = make_log(defs)
+        log.execute("search_db", {"query": "test"})
+        assert call_log == [("search", "test")]
+
+
+class TestLangGraphAutoClassify:
+    """Test LangGraph middleware accepts raw tools without explicit effects."""
+
+    def setup_method(self):
+        self.BaseTool, self.ToolMessage = _mock_langchain()
+
+    def teardown_method(self):
+        for mod in [
+            "langchain_core",
+            "langchain_core.tools",
+            "langchain_core.messages",
+        ]:
+            sys.modules.pop(mod, None)
+
+    def test_make_tooldefs_auto_classify(self):
+        from effect_log.middleware.langgraph import make_tooldefs
+
+        call_log = []
+
+        def my_search(query=""):
+            call_log.append(("search", query))
+            return f"results for {query}"
+
+        mock_tool = MagicMock()
+        mock_tool.name = "search_db"
+        mock_tool.func = my_search
+
+        # Pass without effect -> auto-classified by name
+        defs = make_tooldefs([mock_tool])
+        assert len(defs) == 1
+        log = make_log(defs)
+        log.execute("search_db", {"query": "test"})
+        assert call_log == [("search", "test")]
+
+    def test_make_tooldefs_dict_without_effect(self):
+        from effect_log.middleware.langgraph import make_tooldefs
+
+        mock_tool = MagicMock()
+        mock_tool.name = "send_email"
+        del mock_tool.func
+        mock_tool.invoke.return_value = "sent"
+
+        # Dict without explicit effect -> auto-classified
+        defs = make_tooldefs([{"tool": mock_tool}])
+        assert len(defs) == 1
+        log = make_log(defs)
+        log.execute("send_email", {"to": "test@example.com"})
+        mock_tool.invoke.assert_called_once()
+
+    def test_effect_logged_tools_auto_classify(self):
+        from effect_log.middleware.langgraph import effect_logged_tools
+
+        tools, counts = make_tools()
+        log = make_log(tools)
+
+        mock_search = MagicMock()
+        mock_search.name = "search"
+
+        # Pass tool directly without wrapping in dict
+        wrapped = effect_logged_tools(log, [mock_search])
+        assert len(wrapped) == 1
+        assert wrapped[0].name == "search"
+
+
+class TestOpenAIAgentsAutoClassify:
+    """Test OpenAI Agents middleware accepts raw callables."""
+
+    def setup_method(self):
+        self.FunctionTool = _mock_openai_agents()
+
+    def teardown_method(self):
+        sys.modules.pop("agents", None)
+
+    def test_make_tools_raw_callables(self):
+        from effect_log.middleware.openai_agents import make_tools as oa_make_tools
+
+        call_log = []
+
+        def get_weather(city: str) -> str:
+            call_log.append(("weather", city))
+            return f"sunny in {city}"
+
+        def send_alert(message: str) -> str:
+            call_log.append(("alert", message))
+            return f"sent: {message}"
+
+        # Pass raw callables — auto-classified
+        sdk_tools, tooldefs = oa_make_tools([get_weather, send_alert])
+
+        assert len(sdk_tools) == 2
+        assert len(tooldefs) == 2
+
+        log = make_log(tooldefs)
+        log.execute("get_weather", {"city": "SF"})
+        log.execute("send_alert", {"message": "hot"})
+        assert call_log == [("weather", "SF"), ("alert", "hot")]
+
+    def test_make_tools_mixed(self):
+        from effect_log.middleware.openai_agents import make_tools as oa_make_tools
+
+        call_log = []
+
+        def get_weather(city: str) -> str:
+            call_log.append(("weather", city))
+            return f"sunny in {city}"
+
+        def process_order(order_id: str) -> str:
+            call_log.append(("order", order_id))
+            return f"processed {order_id}"
+
+        sdk_tools, tooldefs = oa_make_tools(
+            [
+                get_weather,
+                {"func": process_order, "effect": EffectKind.IdempotentWrite},
+            ]
+        )
+
+        log = make_log(tooldefs)
+        log.execute("get_weather", {"city": "NY"})
+        log.execute("process_order", {"order_id": "ORD-1"})
+        assert call_log == [("weather", "NY"), ("order", "ORD-1")]
+
+
+class TestCrewAIAutoClassify:
+    """Test CrewAI middleware accepts raw tools without explicit effects."""
+
+    def setup_method(self):
+        self.BaseTool = _mock_crewai()
+
+    def teardown_method(self):
+        for mod in ["crewai", "crewai.tools"]:
+            sys.modules.pop(mod, None)
+
+    def test_make_tooldefs_auto_classify(self):
+        from effect_log.middleware.crewai import make_tooldefs
+
+        call_log = []
+
+        def my_search(query=""):
+            call_log.append(("search", query))
+            return f"results for {query}"
+
+        mock_tool = MagicMock()
+        mock_tool.name = "search_db"
+        mock_tool.func = my_search
+
+        defs = make_tooldefs([mock_tool])
+        log = make_log(defs)
+        log.execute("search_db", {"query": "test"})
+        assert call_log == [("search", "test")]
+
+    def test_make_tooldefs_dict_without_effect(self):
+        from effect_log.middleware.crewai import make_tooldefs
+
+        mock_tool = MagicMock()
+        mock_tool.name = "delete_record"
+        del mock_tool.func
+        mock_tool._run.return_value = "deleted"
+
+        defs = make_tooldefs([{"tool": mock_tool}])
+        log = make_log(defs)
+        log.execute("delete_record", {"id": "123"})
+        mock_tool._run.assert_called_once_with(id="123")
+
+
+class TestPydanticAIAutoClassify:
+    """Test Pydantic AI middleware accepts raw callables."""
+
+    def setup_method(self):
+        self.WrapperToolset, self.AbstractToolset = _mock_pydantic_ai()
+
+    def teardown_method(self):
+        for mod in ["pydantic_ai", "pydantic_ai.toolsets"]:
+            sys.modules.pop(mod, None)
+        sys.modules.pop("effect_log.middleware.pydantic_ai", None)
+
+    def test_make_tooldefs_raw_callables(self):
+        from effect_log.middleware.pydantic_ai import make_tooldefs
+
+        call_log = []
+
+        def search_db(query: str = "") -> str:
+            call_log.append(("search", query))
+            return f"results for {query}"
+
+        def send_email(to: str = "") -> str:
+            call_log.append(("email", to))
+            return f"sent to {to}"
+
+        defs = make_tooldefs([search_db, send_email])
+        assert len(defs) == 2
+
+        log = make_log(defs)
+        log.execute("search_db", {"query": "test"})
+        log.execute("send_email", {"to": "ceo@co.com"})
+        assert call_log == [("search", "test"), ("email", "ceo@co.com")]
+
+    def test_make_tooldefs_mixed(self):
+        from effect_log.middleware.pydantic_ai import make_tooldefs
+
+        call_log = []
+
+        def search_db(query: str = "") -> str:
+            call_log.append(("search", query))
+            return f"results for {query}"
+
+        def process_order(order_id: str = "") -> str:
+            call_log.append(("order", order_id))
+            return f"processed {order_id}"
+
+        defs = make_tooldefs(
+            [
+                search_db,
+                {"func": process_order, "effect": EffectKind.IdempotentWrite},
+            ]
+        )
+
+        log = make_log(defs)
+        log.execute("search_db", {"query": "test"})
+        log.execute("process_order", {"order_id": "ORD-1"})
+        assert call_log == [("search", "test"), ("order", "ORD-1")]
diff --git a/examples/README.md b/examples/README.md
index 8727a10..b5433d0 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -41,6 +41,10 @@ python examples/anthropic_integration.py
 | `pydantic_ai_integration.py` | Standalone demo with Pydantic AI-style tools (`search_db`, `send_email`, `upsert_record`). |
 | `anthropic_integration.py` | Standalone demo with Anthropic Claude-style tools (`search_db`, `send_email`, `upsert_record`). |
+> **Note:** All standalone examples use explicit `ToolDef` for clarity. In your own code, you
+> can pass raw callables directly to `EffectLog` — effect kinds are auto-classified from
+> function names, docstrings, and parameters. See the main [README](../README.md) for details.
+
 ## End-to-end examples (real SDK integration)
 These examples import the real SDKs and use the effect-log middleware wrappers.
@@ -89,3 +93,20 @@ python examples/e2e_bub.py
 Each e2e example also demonstrates recovery at the end: after the agent finishes, it replays the same calls with `recover=True` and asserts that `IrreversibleWrite` tools are **not** re-executed.
+ +## Auto-classification + +All middleware `make_tooldefs()` / `make_tools()` functions now accept raw callables: + +```python +from effect_log.middleware.anthropic import make_tooldefs + +# Just pass functions — effect kinds are auto-classified +tools = make_tooldefs([search_db, send_email]) + +# Or mix raw callables with explicit specs +tools = make_tooldefs([ + search_db, # auto-classified + {"func": process_order, "effect": EffectKind.IdempotentWrite}, # explicit +]) +``` diff --git a/examples/crash_recovery.py b/examples/crash_recovery.py index 66751b3..a391d74 100644 --- a/examples/crash_recovery.py +++ b/examples/crash_recovery.py @@ -63,7 +63,7 @@ def main(): # === First execution: steps 1-3, then "crash" === print("\n--- First Execution (steps 1-3, then crash) ---\n") - log = EffectLog(execution_id="demo-001", tools=tools, storage=storage) + log = EffectLog.manual(execution_id="demo-001", tools=tools, storage=storage) r1 = log.execute("fetch_data", {"source": "https://api.example.com/data"}) print(f"Step 1 [fetch_data]: {r1}") @@ -96,7 +96,7 @@ def main(): make_counting_tool("log_result", EffectKind.ReadOnly), ] - log2 = EffectLog( + log2 = EffectLog.manual( execution_id="demo-001", tools=tools2, storage=storage,