fix(codex): use a stable prompt_cache_key instead of a per-request uuid4 to enable Codex prompt-cache reuse by sumleo · Pull Request #2390 · vectorize-io/hindsight

sumleo · 2026-06-25T08:01:41Z

What

In the Codex provider, prompt_cache_key is set to a fresh uuid.uuid4() on every request:

# hindsight-api-slim/hindsight_api/engine/providers/codex_llm.py
"prompt_cache_key": str(uuid.uuid4()),   # in both call() and call_with_tools()

prompt_cache_key is OpenAI/Codex's explicit hint for routing a request to a cached-prefix backend, so it needs to stay constant across calls that share the same instructions + tools[] prefix. A new random value per request means that prefix is never matched, giving a ~100% prompt-cache miss rate on the hottest LLM path (recall/reflect/retain all funnel through these two methods) with no functional upside.

Fix

Derive the key once per provider instance from (account_id, model) so it is stable across requests but still distinct per account/model, and use it at both call sites. Tiny, self-contained change; the now-unused uuid import is dropped.

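import functools
import hashlib

# method on the provider class (CodexLLM)
@functools.cached_property
def _prompt_cache_key(self) -> str:
    # Stable per (account_id, model): computed once, then reused by every request.
    seed = f"{self.account_id}:{self.model}"
    return hashlib.sha256(seed.encode("utf-8")).hexdigest()[:32]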
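For reviewers who want to see the difference in isolation, here is a minimal, self-contained sketch. CodexProviderSketch and build_payload are hypothetical stand-ins, not the real provider class or its request builder; only the key derivation mirrors the change in this PR.

```python
import functools
import hashlib
import uuid


class CodexProviderSketch:
    """Hypothetical stand-in holding only the attributes the key depends on."""

    def __init__(self, account_id: str, model: str) -> None:
        self.account_id = account_id
        self.model = model

    @functools.cached_property
    def _prompt_cache_key(self) -> str:
        # Same derivation as in the PR: hash (account_id, model), keep 32 hex chars.
        seed = f"{self.account_id}:{self.model}"
        return hashlib.sha256(seed.encode("utf-8")).hexdigest()[:32]

    def build_payload(self, instructions: str) -> dict:
        # Hypothetical payload builder; the real request carries more fields.
        return {
            "model": self.model,
            "instructions": instructions,
            "prompt_cache_key": self._prompt_cache_key,
        }


def per_request_key() -> str:
    # The old behaviour: a fresh random key on every call.
    return str(uuid.uuid4())


if __name__ == "__main__":
    provider = CodexProviderSketch("acct-1", "model-a")
    first = provider.build_payload("system prompt")["prompt_cache_key"]
    second = provider.build_payload("system prompt")["prompt_cache_key"]
    print(first == second)                         # stable key: identical across requests
    print(per_request_key() == per_request_key())  # uuid4 keys: different on every call
```

Because the key is a pure function of (account_id, model), two provider instances built with the same values also produce the same key, so the key stays the same across process restarts as well.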

If you'd prefer the key to be scoped more tightly (e.g. per bank/session/reflect-mission) rather than per provider instance, happy to adjust — I kept it to identifiers already in scope to stay minimal.
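As a rough illustration of what tighter scoping could look like, the sketch below folds an optional extra identifier into the seed. The scoped_cache_key helper and its scope argument are hypothetical and not part of this PR; which identifier (bank, session, reflect mission) would be appropriate is left to the maintainers.

```python
import hashlib
from typing import Optional


def scoped_cache_key(account_id: str, model: str, scope: Optional[str] = None) -> str:
    """Hypothetical helper: stable key per (account_id, model[, scope])."""
    parts = [account_id, model]
    if scope is not None:
        # Assumed extra identifier, e.g. a bank or session id.
        parts.append(scope)
    seed = ":".join(parts)
    return hashlib.sha256(seed.encode("utf-8")).hexdigest()[:32]


if __name__ == "__main__":
    base = scoped_cache_key("acct-1", "model-a")
    bank = scoped_cache_key("acct-1", "model-a", scope="bank-42")
    print(base != bank)  # a scoped key differs from the unscoped one
    print(bank == scoped_cache_key("acct-1", "model-a", scope="bank-42"))  # still stable
```

With scope left as None this reduces to the derivation used in the PR, so the default behaviour would be unchanged.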

How this was found

Spotted via static analysis of prompt-cache anti-patterns (a tool I'm experimenting with, CacheLint) and then confirmed by hand on main at 9dafadc (lines ~391 and ~714). I have not been able to run the full integration suite against a live Codex account, so a maintainer sanity-check on the cache-key semantics would be appreciated.

A fresh uuid4 per request makes OpenAI/Codex prompt-cache prefix routing miss on every call, so the stable system-instructions + tools[] prefix is never reused. Derive the key once per provider instance from (account, model) in CodexLLM.call() and call_with_tools().

sumleo · 2026-06-27T08:15:00Z

Hi @nicoloboschi, a gentle follow-up on this one (open since June 25). It swaps the per-request uuid4 prompt_cache_key for a stable sha256(account:model) key so Codex prompt-cache reuse actually kicks in. CI is green and it's a small, self-contained change. Would you have a moment to take a look, or let me know if you'd like anything adjusted? Happy to rebase. Thanks for your time and for the project!

sumleo wants to merge 1 commit into vectorize-io:main from sumleo:cachelint/hindsight-ap3
