
Implement virtio-gpu 2D#133

Open
Mes0903 wants to merge 7 commits into sysprog21:master from Mes0903:vgpu-pr

Conversation

@Mes0903
Collaborator

@Mes0903 Mes0903 commented Apr 27, 2026

This pull request implements a minimal VirtIO-GPU 2D device for semu. The device uses the VirtIO MMIO transport and provides enough DRM/KMS support for Linux to bind the virtio_gpu driver and expose /dev/dri/card0. The software backend handles 2D resource management, scanout updates, and cursor presentation, and presents its output through the SDL window backend.

When the SDL window grabs the pointer, press Ctrl+Alt+G to release the mouse cursor back to the host.

VirtIO-GPU

  • Adds a VirtIO-GPU MMIO device.
  • Implements the basic 2D command path:
    • display information and EDID queries
    • 2D resource creation and destruction
    • backing attachment and detachment
    • guest-to-host framebuffer transfer
    • scanout configuration and resource flushing
    • cursor set, move, and clear operations
  • Adds a software 2D backend.
  • Presents scanout and cursor updates through the SDL window backend.
  • Adds a virtio-gpu CI smoke test that checks Linux driver binding,
    /dev/dri/card0, and the DirectFB2 DRM/KMS path.
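
As a rough illustration of the 2D command path listed above, the sketch below maps the relevant command types to the bullet each belongs to. The numeric values are the command types defined by the VirtIO GPU specification; the helper vgpu_cmd_group() and its group labels are hypothetical and are not taken from this PR's virtio-gpu.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Control-queue command types from the VirtIO GPU specification (2D subset). */
enum {
    VIRTIO_GPU_CMD_GET_DISPLAY_INFO = 0x0100,
    VIRTIO_GPU_CMD_RESOURCE_CREATE_2D = 0x0101,
    VIRTIO_GPU_CMD_RESOURCE_UNREF = 0x0102,
    VIRTIO_GPU_CMD_SET_SCANOUT = 0x0103,
    VIRTIO_GPU_CMD_RESOURCE_FLUSH = 0x0104,
    VIRTIO_GPU_CMD_TRANSFER_TO_HOST_2D = 0x0105,
    VIRTIO_GPU_CMD_RESOURCE_ATTACH_BACKING = 0x0106,
    VIRTIO_GPU_CMD_RESOURCE_DETACH_BACKING = 0x0107,
    VIRTIO_GPU_CMD_GET_EDID = 0x010a,
    /* Cursor-queue command types. */
    VIRTIO_GPU_CMD_UPDATE_CURSOR = 0x0300,
    VIRTIO_GPU_CMD_MOVE_CURSOR = 0x0301,
};

/* Cursor commands live in their own numeric range. */
static_assert(VIRTIO_GPU_CMD_UPDATE_CURSOR > VIRTIO_GPU_CMD_GET_EDID,
              "cursor commands use a separate range");

/* Hypothetical helper: name the feature group a command type belongs to. */
static const char *vgpu_cmd_group(uint32_t type)
{
    switch (type) {
    case VIRTIO_GPU_CMD_GET_DISPLAY_INFO:
    case VIRTIO_GPU_CMD_GET_EDID:
        return "display info / EDID";
    case VIRTIO_GPU_CMD_RESOURCE_CREATE_2D:
    case VIRTIO_GPU_CMD_RESOURCE_UNREF:
        return "resource create / destroy";
    case VIRTIO_GPU_CMD_RESOURCE_ATTACH_BACKING:
    case VIRTIO_GPU_CMD_RESOURCE_DETACH_BACKING:
        return "backing attach / detach";
    case VIRTIO_GPU_CMD_TRANSFER_TO_HOST_2D:
        return "guest-to-host transfer";
    case VIRTIO_GPU_CMD_SET_SCANOUT:
    case VIRTIO_GPU_CMD_RESOURCE_FLUSH:
        return "scanout / flush";
    case VIRTIO_GPU_CMD_UPDATE_CURSOR:
    case VIRTIO_GPU_CMD_MOVE_CURSOR:
        return "cursor";
    default:
        /* 3D, blob, and capset commands are outside the 2D subset. */
        return "unsupported";
    }
}
```

A device limited to this subset would answer anything in the "unsupported" group with an error response rather than silently ignoring it; how the PR actually reports such commands is not shown here.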

Tested workloads

The implementation has been tested with:

  • Linux virtio_gpu
  • libdrm test tools such as modetest
  • DirectFB2 using its DRM/KMS backend
  • X11
  • kmscube
  • glmark2
  • glxgears

Build and test

Install SDL2 development headers if they are not already present:

sudo apt install libsdl2-dev libsdl2-2.0-0

Build the emulator:

make

Run the virtio-gpu CI smoke test:

.ci/test-gpu.sh

To rebuild the default guest artifacts locally:

./scripts/build-image.sh --all

To build the DirectFB2 test tool disk used by the GPU smoke test:

./scripts/build-image.sh --directfb2-test

For the temporary X11-enabled test image used during review and CI testing:

./scripts/build-image.sh --x11 --directfb2-test

For a full manual GPU test run, build or download the kernel and tool disk, then launch semu with the tool disk as the guest root filesystem:

make semu minimal.dtb Image tool-ext4.img
./semu -k Image -c 1 -b minimal.dtb -d tool-ext4.img

Additional test scaffolding

This PR currently carries two DO NOT MERGE commits to make review and CI easier while the final guest artifact flow is being prepared:

  • DO NOT MERGE: Add X11 image support for CI testing adds an optional X11-enabled Buildroot configuration. The normal rootfs.cpio and ext4.img stay on the default Buildroot configuration, while the heavier X11 payload is built into tool-ext4.img for development and CI coverage.
  • DO NOT MERGE: Cache PR-built guest artifacts keeps PR-built Image/rootfs.cpio/tool-ext4.img artifacts cached by the live guest-input hash, then re-uploads them as the normal prebuilt-pr workflow artifact for downstream CI jobs. It also prevents restored PR artifacts from falling back to the upstream release download path before tool-ext4.img exists there.

These commits are test scaffolding only. They hold the temporary CI, cache, and X11 adjustments, are expected to be dropped or replaced before merge, and are not intended to land in master. They should not be treated as part of the final user-facing build flow.

Manual checks

Check DRM/KMS binding

Inside the guest:

ls -l /dev/dri/
modetest -M virtio_gpu

Run modetest

Use the connector ID reported by modetest -M virtio_gpu:

modetest -M virtio_gpu -s <connector-id>:1024x768

A test image should appear in the SDL window.

If you see Invalid argument, re-check the connector ID. If you see Permission denied or the DRM device is busy, stop the process that currently owns /dev/dri/card0:

ps | grep -E "Xorg|weston|kmscube|DirectFB"
kill <PID>

Run DirectFB2

If the DirectFB2 test payload is present in the guest disk:

. /root/local-env.sh
df_drivertest

Other DirectFB2 examples can also be run from the same environment, such as:

df_fire
df_matrix
df_window
df_input
df_layers

The installed DirectFB2 examples come from the upstream DirectFB-examples project and can be listed in the guest with:

ls /usr/local/bin/df_*

If X11 is running and owns /dev/dri/card0, stop Xorg before running DirectFB2:

ps | grep Xorg
kill <PID>

Run kmscube

If the test image includes kmscube, stop X11 first if it is running, then run:

kmscube

Run X11 workloads

If using the temporary X11-enabled test image from the DO NOT MERGE scaffolding, start X11 inside the guest:

startx

After X11 is running, launch the test programs from the guest:

glmark2
glxgears


Collaborator

@jserv jserv left a comment


Please rebase onto the latest 'master' branch, since the deployment and build system have been refined.

@Mes0903
Collaborator Author

Mes0903 commented Apr 28, 2026

@jserv Since master now rebuilds PR prebuilts from source and publishes only Image/rootfs.cpio, should the DirectFB2 GPU test program become part of the normal Buildroot rootfs, or should test-gpu build/inject a separate GPU-only ext4 image during CI?

@jserv
Collaborator

jserv commented Apr 28, 2026

Since master now rebuilds PR prebuilts from source and publishes only Image/rootfs.cpio, should the DirectFB2 GPU test program become part of the normal Buildroot rootfs, or should test-gpu build/inject a separate GPU-only ext4 image during CI?

Both are doable. You can even propose alternatives. The only consideration is to minimize kernel/rootfs images when possible.

Mes0903 and others added 2 commits May 2, 2026 02:10
Enable libdrm in the Buildroot configuration. This is required for
DRM/KMS support used by DirectFB2 and the virtio-gpu test path.

Co-authored-by: Shengwen Cheng <shengwen1997.tw@gmail.com>

Enable DRM/KMS, virtio-gpu, and virtio DMA shared buffer in the Linux
kernel configuration to support the virtio-gpu 2D device.

Co-authored-by: Shengwen Cheng <shengwen1997.tw@gmail.com>
@Mes0903 Mes0903 force-pushed the vgpu-pr branch 3 times, most recently from f7ac88c to 539d553 Compare May 2, 2026 06:31
@Mes0903
Collaborator Author

Mes0903 commented May 2, 2026

I decided to keep the normal kernel/rootfs path small and keep the GPU test payload out of the default Buildroot rootfs.

The current direction is to keep rootfs.cpio and the default ext4.img built from the normal Buildroot config, then build a separate tool-ext4.img only for GPU-oriented CI tests. test-gpu.sh boots that tool disk directly when it needs the DirectFB2 test programs. This keeps the default boot artifacts small, while still letting CI exercise the DRM/KMS path end to end.

For the temporary X11/glxgears validation, the DO NOT MERGE commit extends that same tool-image path with --x11; it does not put X11 into the normal rootfs. That part is only for PR testing and should not go into the mainline series.

@Mes0903
Collaborator Author

Mes0903 commented May 2, 2026

I also made a few CI-only adjustments after testing the PR prebuilt path.

The PR prebuilt job now caches the generated guest artifacts by the live guest-input hash. When the PR changes Buildroot/Linux/rootfs inputs, the first run still rebuilds Image/rootfs.cpio/tool-ext4.img from source, but later pushes with the same guest inputs can restore those files from cache instead of running Buildroot again.

I also kept the downstream jobs artifact-based: pr-prebuilt-build still uploads Image/rootfs.cpio/tool-ext4.img as prebuilt-pr, so the Linux/macOS test jobs do not need to know whether the files came from a fresh build or from cache.

One related Makefile issue showed up during CI: external artifacts depended on prebuilt.sha1 as a normal prerequisite, and prebuilt.sha1 is refreshed via FORCE. That made make check treat already-restored PR artifacts as stale and try to download tool-ext4.img.bz2 from the upstream release, where it does not exist yet. I changed that dependency to order-only, so the manifest is still available when an artifact has to be downloaded, but existing PR-built artifacts are not redownloaded just because the manifest was refreshed.

Finally, test-gpu.sh now checks whether Image/rootfs.cpio/tool-ext4.img are already present before invoking make for them. This lets the workflow artifact path work correctly and avoids accidentally falling back to the release download path.

All of these CI adjustments are currently kept in DO NOT MERGE commits, so they are only for validating this PR and are not intended to land in master.


@Mes0903
Collaborator Author

Mes0903 commented May 2, 2026

For convenience when manually testing the PR, I have also uploaded the prebuilt files to my own blob branch: https://github.com/Mes0903/semu/tree/blob

@Mes0903 Mes0903 requested review from jserv and shengwen-tw May 2, 2026 06:55
@jserv jserv requested a review from visitorckw May 3, 2026 12:15
Comment thread mk/external.mk
  scripts/rootfs_ext4.sh \
- target/init
+ target/init \
+ target/local-env.sh
Collaborator


Is local-env.sh required?

Collaborator Author


It depends on how we want users to invoke the test tools. local-env.sh is the guest-side environment hook for tools that are overlaid through the shared test-tools.img. Currently it adds /usr/local/bin to PATH and /usr/local/lib to LD_LIBRARY_PATH.

The idea is that the default rootfs stays small, while optional test tools are collected in the same test-tools.img. When we add a new test tool later, the image build should stage that tool into test-tools.img, and local-env.sh should be updated if that tool needs an additional binary or library search path.

With that convention, after booting the VM the user only needs to source /root/local-env.sh once, instead of typing full paths such as /usr/local/bin/df_* for the overlaid tools.

@Mes0903 Mes0903 force-pushed the vgpu-pr branch 2 times, most recently from 13eff97 to 2218c08 Compare May 4, 2026 16:08
@shengwen-tw
Collaborator

Aside from the CI/test changes, which I am not familiar enough with to review, the implementation looks legitimate to me.

Mes0903 and others added 5 commits May 5, 2026 14:21
Introduce an optional 'test-tools.img' artifact alongside 'Image',
'rootfs.cpio' and 'ext4.img'. The normal 'ext4.img' remains built
directly from 'rootfs.cpio' so the default '/dev/vda' root stays small.
'--directfb2-test' builds DirectFB2 and DirectFB-examples, stages them
under 'extra_packages' with 'target/local-env.sh', and asks
'rootfs_ext4.sh' to produce 'test-tools.img' from 'rootfs.cpio' plus
that overlay.

At the moment the test tools disk only carries the DirectFB2 test
programs used by the virtio-gpu test path. It is published as
'test-tools.img.bz2' with the other prebuilt artifacts so tests can boot
it directly when the larger payload is needed.

Size 'test-tools.img' at 192 MiB. Local measurement with the current
'rootfs.cpio' plus 'extra_packages' showed 163 MiB fails to populate and
164 MiB succeeds; 192 MiB leaves roughly 28 MiB of headroom without
carrying the previous 1 GiB image size.

Co-authored-by: Shengwen Cheng <shengwen1997.tw@gmail.com>

Add a minimal virtio-gpu implementation sufficient for Linux DRM
dumb-buffer scanout, backed by a software 2D path and SDL presentation.
3D, blob, and virgl support remain explicitly out of scope.

The main constraint here is threading. semu already reserves the main
thread for SDL, while guest execution and device emulation run on the
emulator thread. That makes it a poor fit for the window side to touch
live virtio-gpu resource state directly, or for the emulator side to
depend on SDL-owned objects. This commit therefore keeps the display
path split around thread ownership so scanout updates do not share
mutable GPU state across the two threads.

The new structure is layered as follows:

- 'virtio-gpu.c' provides the guest-visible virtio-mmio device and
  virtqueue transport.
- 'virtio-gpu-sw.c' implements the software 2D backend and owns
  host-side resources, attached backing, and scanout/cursor state.
- 'vgpu-display.c' acts as the display bridge between the GPU backend
  and the window frontend.
- 'window-sw.c' remains the SDL window backend and consumes display
  commands on the frontend thread.

The important part is how work moves through those queues. The device
exposes the usual control and cursor queues, but all virtio-gpu command
processing stays on the emulator thread. Linux probes display
information and EDID there, creates dumb-buffer resources, attaches
guest backing pages, transfers framebuffer contents into host-side 2D
resources, binds those resources to scanouts, and later updates or
moves the cursor through the same device path.

Presentation is deliberately decoupled from that command path. When the
software backend needs to expose a new primary frame or cursor image, it
does not hand the SDL side a live resource pointer or shared mutable GPU
state. Instead it snapshots the relevant image into an immutable CPU
payload and publishes it through 'vgpu-display.c'. The trade-off is that
frame and cursor payloads are deep-copied before publication, which
costs memory bandwidth and temporary host memory, but keeps frontend
presentation independent from live backend resource lifetime.
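
The deep-copy step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: struct frame_snapshot, snapshot_create(), and snapshot_destroy() are hypothetical names, the pixel format is assumed to be 4 bytes per pixel, and the real payload type in 'vgpu-display.c' may look different.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical immutable frame payload; semu's real structures may differ. */
struct frame_snapshot {
    uint32_t width, height;
    uint32_t stride; /* bytes per row in pixels[] */
    uint8_t *pixels; /* owned copy, never aliased to backend memory */
};

/* Deep-copy a width x height region (assumed 4 bytes per pixel) out of a
 * live resource so the frontend never touches backend-owned memory. */
static struct frame_snapshot *snapshot_create(const uint8_t *src,
                                              uint32_t src_stride,
                                              uint32_t width,
                                              uint32_t height)
{
    assert(src != NULL);

    /* Reject empty regions and rows the source stride cannot hold. */
    if (width == 0 || height == 0 || src_stride / 4 < width)
        return NULL;

    struct frame_snapshot *snap = malloc(sizeof(*snap));
    if (!snap)
        return NULL;

    snap->width = width;
    snap->height = height;
    snap->stride = width * 4; /* tightly packed copy */
    snap->pixels = malloc((size_t) snap->stride * height);
    if (!snap->pixels) {
        free(snap);
        return NULL;
    }

    /* Copy row by row because the source may have padding per row. */
    for (uint32_t y = 0; y < height; y++)
        memcpy(snap->pixels + (size_t) y * snap->stride,
               src + (size_t) y * src_stride, snap->stride);
    return snap;
}

/* Release a snapshot once the consumer has finished with it. */
static void snapshot_destroy(struct frame_snapshot *snap)
{
    if (snap) {
        free(snap->pixels);
        free(snap);
    }
}
```

Once the copy exists, later guest writes to the source resource cannot change what the frontend sees, which is the lifetime independence that the bandwidth cost pays for.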

That bridge keeps plane clear/removal events as reliable generation
state, while primary/cursor frame payloads and cursor moves use a
bounded lossy SPSC queue. The window backend drains those commands,
filters queued updates whose generation is stale, updates its retained
textures, and presents them without taking ownership of backend state.
This keeps rendering latency out of the virtio-gpu hot path; if the
frontend falls behind, stale lossy frame and cursor updates are dropped
while clear state remains reliable.

The same split extends the existing window integration. The wake pipe is
available whenever a window-backed device is enabled, so SDL shutdown
can wake a blocked emulator 'poll(-1)' path promptly while SDL event
handling and rendering remain on the main thread.
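
The wake-pipe idea can be sketched with plain POSIX calls as shown below. The struct and function names are hypothetical, and the real emulator presumably polls its device file descriptors alongside the pipe; this sketch polls the pipe alone to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical wake pipe: the main thread writes, the emulator thread polls. */
struct wake_pipe {
    int read_fd;
    int write_fd;
};

static bool wake_pipe_open(struct wake_pipe *wp)
{
    assert(wp != NULL);
    int fds[2];
    if (pipe(fds) != 0)
        return false;

    /* Non-blocking on both ends so signalling and draining never stall. */
    for (int i = 0; i < 2; i++) {
        int flags = fcntl(fds[i], F_GETFL);
        if (flags < 0 || fcntl(fds[i], F_SETFL, flags | O_NONBLOCK) < 0) {
            close(fds[0]);
            close(fds[1]);
            return false;
        }
    }
    wp->read_fd = fds[0];
    wp->write_fd = fds[1];
    return true;
}

/* Main thread, for example on window shutdown: wake the emulator thread. */
static void wake_pipe_signal(const struct wake_pipe *wp)
{
    const char byte = 1;
    /* A full pipe already holds a pending wake-up, so a failed write is fine. */
    ssize_t n = write(wp->write_fd, &byte, 1);
    (void) n;
}

/* Emulator thread: wait until the pipe is readable. A timeout of -1 blocks
 * indefinitely. Returns true when the wait ended because of a wake-up. */
static bool wake_pipe_wait(const struct wake_pipe *wp, int timeout_ms)
{
    struct pollfd pfd = {.fd = wp->read_fd, .events = POLLIN};
    int ret;
    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret <= 0 || !(pfd.revents & POLLIN))
        return false;

    /* Drain every pending byte so the next poll blocks again. */
    char buf[64];
    while (read(wp->read_fd, buf, sizeof(buf)) > 0)
        ;
    return true;
}
```

The pipe turns a cross-thread shutdown request into an ordinary readable file descriptor, so the emulator thread does not need a polling timeout just to notice that the window was closed.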

CI is extended at the same time with a virtio-gpu smoke test and the
DirectFB2 guest payload used to exercise the DRM/KMS path end to end.
After this change, Linux can bind 'virtio_gpu', expose '/dev/dri/card0',
and drive the new software 2D scanout path.

Co-authored-by: Shengwen Cheng <shengwen1997.tw@gmail.com>

virtio-input and virtio-gpu MMIO reads do not modify InterruptStatus or
other PLIC-visible interrupt state. Updating the interrupt line after
those reads only recomputes the same state and makes the read path look
side-effectful.

Keep interrupt synchronization on the write paths, where QueueNotify,
InterruptACK, status changes, or device failure handling can change the
pending interrupt state.

'emu_tick_peripherals()' still has no vgpu interrupt update. This
remains correct for the current vgpu path because vgpu interrupt changes
are raised from MMIO write / queue-notify handling and synchronized by
the write dispatch.

Add a PR-only '--x11' Buildroot config fragment for CI testing
and local virtio-gpu experiments with Xorg, Mesa demos, and
glmark2.

The build script keeps the published 'rootfs.cpio' and default
'ext4.img' on the normal Buildroot config. It rebuilds Buildroot with
the X11 fragment only to feed 'test-tools.img'. When
'--directfb2-test' is also enabled, the DirectFB2 overlay is applied
while creating that test tools disk.

Keep C++ disabled in the default config so the legacy initramfs stays
small. The X11 test-tools image still needs C++, and explicitly stages
'libstdc++.so' from the internal toolchain because glmark2 needs it at
runtime.

Build the PR and prebuilt workflow artifacts with '--x11', and include
'configs/x11.config' in the input fingerprint so drift detection and
cache keys track the X11 test-tools config.

This remains DO NOT MERGE; mainline prebuilt generation should continue
without X11.

Avoid rebuilding the PR guest artifacts on every push when the
guest input hash has not changed.

Expose the live input hash from the drift detector, restore Image,
rootfs.cpio, and test-tools.img from an actions/cache entry keyed by
that hash, and only run build-image on cache misses.

Keep the release manifest as an order-only prerequisite for external
artifact downloads. That lets make check use PR-restored artifacts
instead of treating them as stale after refreshing prebuilt.sha1.

The job still uploads the artifacts for downstream jobs, so their
download path does not need to know whether the files came from cache
or a fresh Buildroot run.