OpenAI Responses API content types input_text/input_image/input_file not recognized by instrument_openai #1901

@Mai0313

Description


When using logfire.instrument_openai() with client.responses.create() (the OpenAI Responses API), every content part whose type is input_text, input_image, or input_file falls through to:

  • gen_ai.unknown events on the legacy semconv path (version=1, the default)
  • a generic {**part, 'type': part_type} dict on the latest semconv path (version="latest")

Only the Chat Completions content types (text / output_text / image_url / input_audio) are currently recognized. This makes Logfire traces for any non-trivial Responses API call very noisy: image, file, and text parts all show as unknown even when they are perfectly valid content per the official OpenAI docs.

Reproduction

import logfire
from openai import OpenAI

logfire.configure()
logfire.instrument_openai()

client = OpenAI()
client.responses.create(
    model="gpt-4.1",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "What is in this image?"},
                {
                    "type": "input_image",
                    "image_url": "data:image/jpeg;base64,<...>",
                },
            ],
        }
    ],
)

Console output (simplified):

events=[
  {'event.name': 'gen_ai.unknown', 'role': 'unknown',
   'content': 'input_text\n\nSee JSON for details',
   'data': {'type': 'input_text', 'text': '...'}},
  {'event.name': 'gen_ai.unknown', 'role': 'unknown',
   'content': 'input_image\n\nSee JSON for details',
   'data': {'type': 'input_image', 'image_url': '...'}},
]

The same content parts are accepted by the API and produce the expected response — this is purely an instrumentation gap.

Root cause

In logfire/_internal/integrations/llm_providers/openai.py:

Legacy path (input_to_events):

for content_item in content:
    with contextlib.suppress(KeyError):
        if content_item['type'] == 'output_text':
            events.append({'event.name': event_name, 'content': content_item['text'], 'role': role})
            continue
    events.append(unknown_event(content_item))

Only output_text is recognized; everything else goes to unknown_event.

Latest path (_convert_content_part):

if part_type in ('text', 'output_text'):
    return TextPart(type='text', content=part.get('text', ''))
elif part_type == 'image_url':
    url = part.get('image_url', {}).get('url', '')
    return UriPart(type='uri', uri=url, modality='image')
elif part_type == 'input_audio':
    return BlobPart(...)
else:
    return {**part, 'type': part_type}

Note that the recognized image type (image_url) follows the Chat Completions schema ({"type": "image_url", "image_url": {"url": "..."}}), which differs from the Responses API schema ({"type": "input_image", "image_url": "<string>"}), where image_url is a flat string rather than a nested object.
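To make the schema difference concrete, here is a minimal sketch (with hypothetical data and a hypothetical helper name) showing why the existing nested-object lookup cannot be reused as-is for Responses API parts, and how a single helper could handle both shapes:

```python
# Hypothetical example parts, following the shapes described in the OpenAI docs.
chat_part = {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}}
resp_part = {"type": "input_image", "image_url": "https://example.com/cat.jpg"}

def extract_image_url(part: dict) -> str:
    """Illustrative helper: accept both the nested Chat Completions shape
    and the flat Responses API shape for the image_url field."""
    value = part.get("image_url", "")
    if isinstance(value, dict):
        # Chat Completions: {"image_url": {"url": "..."}}
        return value.get("url", "")
    # Responses API: {"image_url": "..."} (flat string)
    return value

print(extract_image_url(chat_part))  # https://example.com/cat.jpg
print(extract_image_url(resp_part))  # https://example.com/cat.jpg
```

The existing `part.get('image_url', {}).get('url', '')` chain only handles the first shape; on a Responses-style part it would call `.get` on a string. That is another reason a dedicated input_image branch seems cleaner than widening the image_url one.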

Proposed fix

I'd propose adding three cases to _convert_content_part:

elif part_type == 'input_text':
    return TextPart(type='text', content=part.get('text', ''))
elif part_type == 'input_image':
    return UriPart(type='uri', uri=part.get('image_url', ''), modality='image')
elif part_type == 'input_file':
    return BlobPart(type='blob', content=part.get('file_data', ''), modality='file')

…and the corresponding branches in input_to_events for the legacy path.
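For concreteness, the legacy-path branches could look roughly like this. This is a sketch, not the actual patch: the real event_name/role/unknown_event plumbing lives in openai.py, and the event shape for image and file parts is an open design question.

```python
import contextlib

def content_items_to_events(content, event_name, role, unknown_event):
    """Sketch of input_to_events' inner loop with the three new Responses
    API content types handled (mirrors the existing output_text branch)."""
    events = []
    for content_item in content:
        with contextlib.suppress(KeyError):
            if content_item['type'] in ('output_text', 'input_text'):
                events.append({'event.name': event_name, 'content': content_item['text'], 'role': role})
                continue
            if content_item['type'] == 'input_image':
                # image_url is a flat string in the Responses API schema
                events.append({'event.name': event_name, 'content': content_item['image_url'], 'role': role})
                continue
            if content_item['type'] == 'input_file':
                events.append({'event.name': event_name, 'content': content_item['file_data'], 'role': role})
                continue
        events.append(unknown_event(content_item))
    return events
```

Anything with an unrecognized or missing type still falls through to unknown_event, so behavior for genuinely unknown parts is unchanged.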

I'm happy to send the PR myself — just let me know if this is the right shape, or if you'd prefer a different design (e.g. waiting on the upstream opentelemetry-python-contrib work tracked in #1586).
