Support Gemini Image models in RubyLLM.paint #750
Open
danieldenis01 wants to merge 4 commits into
The Gemini provider's image generation was hardcoded to the Imagen :predict endpoint, leaving the Gemini Image family (Nano Banana et al.) unreachable: RubyLLM.paint with gemini-2.5-flash-image, gemini-3.1-flash-image-preview, gemini-3-pro-image-preview, or nano-banana-pro-preview raised "is not supported for predict" even though those models are listed in the registry with image output.

Branch internally on imagen?(model). Imagen keeps its existing :predict/instances payload and predictions[].bytesBase64Encoded parsing unchanged. Everything else routes through :generateContent with contents/parts and parses candidates[].content.parts[].inlineData, matching the protocol Gemini chat already speaks. The fallthrough also covers nano-banana-pro-preview, which doesn't share the gemini- prefix.

Image-to-image editing via with: is supported on the Gemini Image branch by reusing Gemini::Media#format_attachment to build inline_data parts. validate_paint_inputs! is overridden as a no-op so the base class's blanket attachment rejection doesn't fire; the model-aware checks (mask: rejection, Imagen-with-with: rejection) live in render_image_payload after @model is assigned.

size: is translated through SIZE_TO_ASPECT_RATIO for the common DALL-E sizes; unknown sizes default to 1:1 with a debug log. Users override via params:, which deep-merges into the payload so nested generationConfig blocks aren't clobbered.

Tests cover both the original Nano Banana (gemini-2.5-flash-image, paint + edit) and Nano Banana 2 (gemini-3.1-flash-image-preview, the exact model from the bug report). Imagen, OpenAI, and OpenRouter image tests pass unchanged.
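The routing, size translation, and deep-merge behaviour described above can be sketched in plain Ruby. This is a minimal illustration, not the PR's code: the `GeminiImageSketch` module, the prefix check inside `imagen?`, the `sampleCount` and `responseModalities` values, and the aspect ratio assigned to each size are all assumptions. Only the endpoint names, the `instances` and `contents`/`parts` payload keys, the `1:1` default, and the fact that `params:` deep-merges come from the description.

```ruby
# Illustrative stand-in for the provider's routing; not the actual RubyLLM source.
module GeminiImageSketch
  # The sizes are the ones named in the PR; the ratio chosen for each is an assumption.
  SIZE_TO_ASPECT_RATIO = {
    '1024x1024' => '1:1',
    '1792x1024' => '16:9',
    '1024x1792' => '9:16',
    '1408x1024' => '4:3',
    '1024x1408' => '3:4'
  }.freeze

  module_function

  # Assumption: Imagen models are recognised by their id prefix.
  def imagen?(model)
    model.to_s.start_with?('imagen')
  end

  # Imagen goes to :predict; every other model falls through to :generateContent.
  def endpoint(model)
    action = imagen?(model) ? 'predict' : 'generateContent'
    "models/#{model}:#{action}"
  end

  # Unknown sizes fall back to 1:1 (the real code also writes a debug log).
  def aspect_ratio_for(size)
    SIZE_TO_ASPECT_RATIO.fetch(size, '1:1')
  end

  # Recursive merge so a nested override replaces one key, not the whole sub-hash.
  def deep_merge(base, overrides)
    base.merge(overrides) do |_key, current, override|
      current.is_a?(Hash) && override.is_a?(Hash) ? deep_merge(current, override) : override
    end
  end

  def payload(prompt, model:, size: '1024x1024', params: {})
    base =
      if imagen?(model)
        { instances: [{ prompt: prompt }], parameters: { sampleCount: 1 } }
      else
        {
          contents: [{ role: 'user', parts: [{ text: prompt }] }],
          generationConfig: {
            responseModalities: ['IMAGE'],
            imageConfig: { aspectRatio: aspect_ratio_for(size) }
          }
        }
      end
    deep_merge(base, params)
  end
end
```

With this sketch, `GeminiImageSketch.endpoint('nano-banana-pro-preview')` resolves to the `:generateContent` route even though the id lacks the `gemini-` prefix, and passing `params: { generationConfig: { imageConfig: { aspectRatio: '3:2' } } }` changes only the aspect ratio while leaving the rest of `generationConfig` intact.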
Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #750      +/-   ##
==========================================
+ Coverage   87.21%   87.28%   +0.06%
==========================================
  Files         121      121
  Lines        5703     5739      +36
  Branches     1442     1454      +12
==========================================
+ Hits         4974     5009      +35
- Misses        729      730       +1
Author
I'll implement the missing test cases.
Codecov reported four uncovered lines in the Gemini Images branching: the Imagen `with:` rejection, the Imagen response-shape guard, the unknown-attachment-type rejection on the Gemini Image branch, and the unmapped-size default. None of these paths are reachable from the existing VCR-backed integration specs (Imagen rejects `with:` before hitting the wire; Gemini Image cassettes only exercise PNG inputs and the supported `1024x1024` size).

Add a focused unit spec at spec/ruby_llm/providers/gemini/images_spec.rb that extends a bare object with Gemini::Media + Gemini::Images (same pattern used in chat_spec.rb) and exercises each branch directly with stubbed attachments and Faraday::Response doubles. No new cassettes needed.

Brings lib/ruby_llm/providers/gemini/images.rb to 100% line coverage and lifts branch coverage from 66.67% to 79.17%.
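The pattern of extending a bare object with the module under test and handing it a response double can be illustrated without RSpec or Faraday. In the sketch below, the `ImageParsing` module, its method names, its error class, and the `FakeResponse` struct are hypothetical stand-ins rather than the PR's code. Only the two response shapes (`predictions[].bytesBase64Encoded` and `candidates[].content.parts[].inlineData`) are taken from the description above.

```ruby
# Hypothetical stand-in for the parsing half of Gemini::Images.
module ImageParsing
  class ParseError < StandardError; end

  # Imagen shape: predictions[].bytesBase64Encoded, guarded against a missing field.
  def parse_imagen(response)
    data = response.body.dig('predictions', 0, 'bytesBase64Encoded')
    raise ParseError, 'Unexpected response format from Imagen' unless data

    data
  end

  # Gemini Image shape: the first part carrying inlineData holds the image.
  def parse_gemini_image(response)
    parts = response.body.dig('candidates', 0, 'content', 'parts') || []
    inline = parts.filter_map { |part| part['inlineData'] }.first
    raise ParseError, 'No inlineData part in response' unless inline

    { mime_type: inline['mimeType'], data: inline['data'] }
  end
end

# Minimal double exposing only #body, standing in for a Faraday::Response.
FakeResponse = Struct.new(:body)

# Same idea as the spec: extend a bare object instead of building a full provider.
subject = Object.new.extend(ImageParsing)

ok = FakeResponse.new(
  {
    'candidates' => [
      { 'content' => { 'parts' => [
        { 'text' => 'Here you go' },
        { 'inlineData' => { 'mimeType' => 'image/png', 'data' => 'aGVsbG8=' } }
      ] } }
    ]
  }
)
image = subject.parse_gemini_image(ok)

# Exercise the Imagen response-shape guard with a prediction that lacks the bytes.
begin
  subject.parse_imagen(FakeResponse.new({ 'predictions' => [{}] }))
  guard_fired = false
rescue ImageParsing::ParseError
  guard_fired = true
end
```

Because the double only needs `#body`, each guard can be hit directly with a hand-built hash, which is why no new cassettes are required for these branches.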
Closes #473.
What this does
Fixes `RubyLLM.paint` for the Gemini Image model family (`gemini-2.5-flash-image`, `gemini-3.1-flash-image-preview`, `gemini-3-pro-image-preview`, `nano-banana-pro-preview`), which was unreachable because the Gemini provider was hardcoded to the Imagen `:predict` protocol.

The Gemini provider now branches on `imagen?(model)`:
- Imagen: `:predict` with `instances`/`parameters` (byte-for-byte unchanged).
- Everything else: `:generateContent` with `contents`/`parts`, parsing `candidates[].content.parts[].inlineData`, the same protocol Gemini chat uses.

Improvements over the previous Gemini image generation
- Image-to-image editing (`with:`): new capability. Before this PR the Gemini provider had no support for image references (`with:` was an unused method arg, rejected by the base `validate_paint_inputs!`). The Gemini Image branch now accepts one or more local files / URLs / `Attachment` instances via `with:`, reusing `Gemini::Media#format_attachment` to build `inline_data` parts. Imagen still rejects `with:`.
- `size:` is meaningful again on the Gemini Image branch. A small map translates the common DALL-E sizes (`1024x1024`, `1792x1024`, `1024x1792`, `1408x1024`, `1024x1408`) to Gemini `aspectRatio`. Unknown sizes default to `1:1` with a debug log. Imagen continues to ignore `size:`.
- `params:` deep-merges into the payload so users can override any nested `generationConfig` / `imageConfig` field without clobbering the rest.
- `usageMetadata` from Gemini Image responses is passed through to `Image#usage`.

The public signature `RubyLLM.paint(prompt, model:, with:, size:, params:)` is unchanged.

Reproduction (before this PR):
Type of change
Scope check
Quality check
- `overcommit --install` and all hooks pass
- `bundle exec rake vcr:record[gemini]`
- `bundle exec rspec`
- (`models.json`, `aliases.json`)

Tests added
Integration (VCR-backed, via `IMAGE_GENERATION_MODELS`):
- `gemini-2.5-flash-image` paint + image edit with `with:`
- `gemini-3.1-flash-image-preview` paint (the exact model from the bug report)

Unit-level (`spec/ruby_llm/providers/gemini/images_spec.rb`, no network):
- Imagen with `with:` raises `UnsupportedAttachmentError`
- Imagen response without `bytesBase64Encoded` raises `RubyLLM::Error`
- `:unknown` attachment type on the Gemini Image branch raises `UnsupportedAttachmentError`
- Unmapped `size:` defaults `aspectRatio` to `1:1`

Coverage: `lib/ruby_llm/providers/gemini/images.rb` at 100% line / 79% branch.

AI-generated code
API changes
Out of scope
- `images.rb` at all and needs OAuth; separate PR.
- `:streamGenerateContent`).
- `pricing.images`; `Image#total_cost` falls back to `output_price_per_million`).