feat: SeFi-Image support by fszontagh · Pull Request #1707 · leejet/stable-diffusion.cpp

fszontagh · 2026-06-24T20:01:27Z

Summary

Adds inference support for SeFi-Image, a dual-time flow-matching T2I family built on the Flux2 backbone with a Qwen3-VL text encoder. Tech report: arXiv:2606.22568. See docs/sefi_image.md.

What's in:

- VERSION_SEFI_IMAGE + version detection
- Dual-time embedding block (semantic_embedder + texture_embedder, concat)
- Per-stream Euler sampler with alpha-shift + delta_t
- SeFi-aware Qwen3-VL conditioning (chat template, layers 9/18/27)
- VAE BN normalization on packed texture latents
- script/convert_sefi.py for converting a diffusers checkpoint to a single sd.cpp safetensors file
- --extra-sample-args sefi_alpha=0.3 / sefi_delta_t=0.1 overrides
- Filename heuristic: turbo in path => alpha=1.0, else alpha=0.3
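
For readers who want the alpha selection rule from the last two items spelled out, here is a minimal sketch. It assumes an explicit override always wins over the filename heuristic. The function name `resolve_sefi_alpha` and the two constants are hypothetical and only mirror the values listed above; this is not the code in the PR.

```cpp
#include <optional>
#include <string>

// Hypothetical constants mirroring the defaults listed in the summary.
constexpr float kSketchAlphaTurbo = 1.0f;
constexpr float kSketchAlphaBase  = 0.3f;

// Returns the explicit override when one is given. Otherwise falls back to
// the filename heuristic: "turbo" anywhere in the path selects alpha=1.0.
// The match is case-sensitive, so "Turbo" would not be detected.
inline float resolve_sefi_alpha(const std::string& model_path,
                                std::optional<float> override_alpha = std::nullopt) {
    if (override_alpha.has_value()) {
        return *override_alpha;
    }
    const bool is_turbo = model_path.find("turbo") != std::string::npos;
    return is_turbo ? kSketchAlphaTurbo : kSketchAlphaBase;
}
```

A path-based default like this is fragile when a checkpoint has been renamed, which is one reason to keep the explicit override available.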

Related Issue / Discussion

Closes #1702.

Additional Information

Example

./build/bin/sd-cli \
  --model /path/to/sefi_1b_turbo.safetensors \
  --llm   /path/to/qwen3_vl_2b.safetensors \
  -p "a photograph of an orange tabby cat sitting on a couch" \
  --cfg-scale 1.0 --steps 4 -W 1024 -H 1024 -s 42 \
  --diffusion-fa --offload-to-cpu \
  -o out.png

Tested variants (all 7 from huggingface.co/SeFi-Image)

| Variant | Encoder | Baseline (12GB VRAM) | `--max-vram 8 --stream-layers` |
|---|---|---|---|
| 1B-Base | qwen3_vl_2b | ok 109s | ok 172s |
| 1B-turbo | qwen3_vl_2b | ok 14s | ok 17s |
| 2B-Base | qwen3_vl_2b | ok 229s | ok 296s |
| 2B-turbo | qwen3_vl_2b | ok 29s | ok 25s |
| 5B-Base | qwen3_vl_4b | OOM | ok 563s |
| 5B-turbo | qwen3_vl_4b | OOM | ok 170s |
| 5B-RL | qwen3_vl_4b | OOM | ok 587s |

5B variants use Qwen3-VL-4B-Instruct as the text encoder (1B/2B use 2B). 5B needs streaming on 12GB-class GPUs.

Checklist

I have read and confirmed this PR follows the contribution guidelines.

GreenShadows · 2026-06-24T20:07:56Z

The quality seems surprisingly good for such a small model.

leejet · 2026-06-25T13:51:36Z

        bool double_z            = true;
    } dd_config;

+    void init_params(ggml_context* ctx, const String2TensorStorage& tensor_storage_map = {}, std::string prefix = "") override {


bn.running_mean and bn.running_var should be extracted as constants and provided through get_latents_mean_std for calls to vae_to_diffusion_latents and diffusion_to_vae_latents.

It looks like SeFi-Image uses the standard FLUX.2 VAE. If so, no special handling is needed here.
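
To illustrate the suggestion, here is a rough sketch of treating the BN running statistics as per-channel constants and applying them in both directions. All names (`LatentStats`, `make_latent_stats`, `normalize_latents`, `denormalize_latents`) are hypothetical stand-ins, not the repository's `get_latents_mean_std`, `vae_to_diffusion_latents`, or `diffusion_to_vae_latents`. The `eps` default and the channel-major layout are also assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for per-channel constants derived from
// bn.running_mean / bn.running_var.
struct LatentStats {
    std::vector<float> mean;
    std::vector<float> stddev;
};

// Converts running variance to a standard deviation once, up front.
// eps = 1e-5 is an assumed value, not taken from the PR.
inline LatentStats make_latent_stats(const std::vector<float>& running_mean,
                                     const std::vector<float>& running_var,
                                     float eps = 1e-5f) {
    assert(running_mean.size() == running_var.size());
    LatentStats stats;
    stats.mean = running_mean;
    stats.stddev.reserve(running_var.size());
    for (float v : running_var) {
        stats.stddev.push_back(std::sqrt(v + eps));
    }
    return stats;
}

// VAE -> diffusion direction: (x - mean) / std per channel.
// Latents are assumed to be laid out channel-major: [C][spatial].
inline void normalize_latents(std::vector<float>& latents, const LatentStats& stats) {
    const std::size_t channels = stats.mean.size();
    assert(channels > 0 && latents.size() % channels == 0);
    const std::size_t per_channel = latents.size() / channels;
    for (std::size_t c = 0; c < channels; ++c) {
        for (std::size_t i = 0; i < per_channel; ++i) {
            float& x = latents[c * per_channel + i];
            x = (x - stats.mean[c]) / stats.stddev[c];
        }
    }
}

// Diffusion -> VAE direction: the exact inverse, x * std + mean.
inline void denormalize_latents(std::vector<float>& latents, const LatentStats& stats) {
    const std::size_t channels = stats.mean.size();
    assert(channels > 0 && latents.size() % channels == 0);
    const std::size_t per_channel = latents.size() / channels;
    for (std::size_t c = 0; c < channels; ++c) {
        for (std::size_t i = 0; i < per_channel; ++i) {
            float& x = latents[c * per_channel + i];
            x = x * stats.stddev[c] + stats.mean[c];
        }
    }
}
```

With the statistics exposed as constants, the VAE block itself would not need BN-specific parameters or a custom init_params.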

leejet · 2026-06-25T14:04:58Z

                case FLUX2_FLOW_PRED: {
-                    LOG_INFO("running in Flux2 FLOW mode");
-                    denoiser = std::make_shared<Flux2FlowDenoiser>();
+                    if (sd_version_is_sefi_image(version)) {


Specifying FLUX2_FLOW_PRED should always mean using Flux2FlowDenoiser; the denoiser should not be switched inside this case based on the model version.

leejet · 2026-06-25T14:07:06Z

+                            sefi_path = SAFE_STR(sd_ctx_params->model_path);
+                        }
+                        bool is_turbo                       = sefi_path.find("turbo") != std::string::npos;
+                        sefi_denoiser->timestep_shift_alpha = is_turbo ? SefiFlowDenoiser::kAlphaTurbo


It should support specifying alpha directly, rather than adding a parameter like turbo.

leejet · 2026-06-25T14:08:29Z

                                                                  : timesteps_vec;
            adjust_sample_step_scalings(shifted_timestep, scaling_timesteps_vec, c_in, &c_skip, &c_out);

+            if (auto sefi_denoiser = std::dynamic_pointer_cast<SefiFlowDenoiser>(denoiser)) {


This should be placed inside process_timesteps.

leejet · 2026-06-25T14:26:09Z


 ## 🔥Important News

+* **2026/06/26** 🚀 stable-diffusion.cpp now supports **SeFi-Image**


Important News should only list models with high community discussion. Based on the current level of interest, SeFi-Image does not meet that requirement.

leejet · 2026-06-25T14:29:12Z

        prefix_map["te1."] = "text_encoders.clip_l.transformer.";
    }

+    if (sd_version_is_sefi_image(version)) {


The --diffusion-model parameter should be used instead of hardcoding the prefix here.

fszontagh added 18 commits June 23, 2026 19:53

feat: scaffold SeFi-Image prototype (VERSION_SEFI_IMAGE + dual-time embed block)

780d91f

fix: SeFi detection match raw checkpoint keys (no model.diffusion_model. prefix)

bf68811

feat: route SeFi tensors through Flux2 path and instantiate dual_time_embed block

fcbc16b

feat: SeFi-Image denoiser + model/conditioner dispatch (Qwen3-VL + Flux2 flow)

0b438ba

feat: SeFi-Image VAE BN normalization on packed texture latents

5840e42

feat: SeFi Flux forward routes vec through dual_time_embed with delta_t-shifted semantic timestep

2287e91

feat: Python conversion script for SeFi-Image HF -> sd.cpp single-file format

846ea77

feat: SeFi-Image prototype loads end-to-end up to text encoder embedding mismatch

1260aec

feat: end-to-end SeFi-Image generation works (Qwen3-VL-2B head_count + VAE semantic slice)

c854a98

feat: SeFi dual-time schedule with α-shift + delta_t and 2-element timestep tensor

feb2146

feat: SeFi per-stream Euler sampler with proper dual-time integration

f101029

fix: SEFIDualTimestepEmbeddings — timesteps already in [0,1000], drop 1000x rescale

6ea02a2

fix: SeFi LastLayer adaLN uses diffusers (scale,shift) order, not BFL (shift,scale)

83976e6

fix: SeFi dual-time slot 0 = texture timestep, slot 1 = semantic (empirical)

e2708a7

docs: clarify SeFi dual-time swap is required for turbo, equivalent on Base

d2b8f88

fix: SeFi-Image turbo timestep_shift_alpha=1.0 (matches python registry default)

4560627

fix: SeFi-Image text tokens need Flux2 RoPE on axis 3; plumb alpha/delta_t via --extra-sample-args

91547e6

chore: SeFi-Image cleanup — drop narrative comments, fold magic numbers into SefiFlowDenoiser constants

4a597a3

fszontagh added 3 commits June 24, 2026 22:16

Merge upstream/master into feat/sefi-image-prototype

b769aa2

docs(sefi): rename SEFI types to PascalCase, add docs/sefi_image.md, README entry

9813ba9

fix(sefi): split conversion into transformer + VAE files (loads via --diffusion-model + --vae like krea2/flux2)

f4cde38

leejet requested changes Jun 25, 2026

fszontagh wants to merge 21 commits into leejet:master from fszontagh:feat/sefi-image-prototype

3 participants

