Fix packed-ring completion ordering and buffer ID#50
Merged
Conversation
Per Virtio 1.x packed virtqueue, the device must publish the buffer ID and used.len before flipping the USED flag, with a release barrier separating the body writes from the flag publish. Today virtio-blk and virtio-net XOR the USED bit in desc->flags with a plain store, never write desc->id, and on virtio-blk the desc->len write is also unfenced. A guest CPU can therefore observe USED set with stale len/id, and drivers that demux completions by buffer ID misbehave. Each completion site now writes id (taken from the chain's last descriptor — Virtio 1.x packed places the buffer ID on the descriptor without F_NEXT, matching QEMU's virtqueue_packed_pop) and len, then publishes the new flags via __atomic_store_n with __ATOMIC_RELEASE on a value precomputed from a plain load. The slot is single-writer between AVAIL and USED, so the load is safe non-atomic, and the release store compiles to a plain mov on x86: no lock prefix. virtio-net tx explicitly sets desc->len = 0 since transmit chains have no device-writable bytes. virtq_get_avail now reads desc->flags with __ATOMIC_ACQUIRE so addr/len/id reads after the AVAIL check observe the driver's release write. virtq_handle_avail does the same for guest_event->flags. isr_status writes were a non-atomic read-modify-write shared between virtio-net rx and tx workers; concurrent updates could lose the QUEUE bit. Switch the writers to __atomic_fetch_or with __ATOMIC_RELEASE and the guest-side read-clear in virtio_pci_space_read to __atomic_exchange_n to zero with __ATOMIC_ACQUIRE so a worker's fetch_or that races the guest read is captured rather than dropped.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Per Virtio 1.x packed virtqueue, the device must publish the buffer ID and used.len before flipping the USED flag, with a release barrier separating the body writes from the flag publish. Today virtio-blk and virtio-net XOR the USED bit in desc->flags with a plain store, never write desc->id, and on virtio-blk the desc->len write is also unfenced. A guest CPU can therefore observe USED set with stale len/id, and drivers that demux completions by buffer ID misbehave.
Each completion site now writes id (taken from the chain's last descriptor — Virtio 1.x packed places the buffer ID on the descriptor without F_NEXT, matching QEMU's virtqueue_packed_pop) and len, then publishes the new flags via __atomic_store_n with __ATOMIC_RELEASE on a value precomputed from a plain load. The slot is single-writer between AVAIL and USED, so the load is safe non-atomic, and the release store compiles to a plain mov on x86: no lock prefix. virtio-net tx explicitly sets desc->len = 0 since transmit chains have no device-writable bytes.
virtq_get_avail now reads desc->flags with __ATOMIC_ACQUIRE so addr/len/id reads after the AVAIL check observe the driver's release write. virtq_handle_avail does the same for guest_event->flags.
isr_status writes were a non-atomic read-modify-write shared between virtio-net rx and tx workers; concurrent updates could lose the QUEUE bit. Switch the writers to __atomic_fetch_or with __ATOMIC_RELEASE and the guest-side read-clear in virtio_pci_space_read to __atomic_exchange_n to zero with __ATOMIC_ACQUIRE so a worker's fetch_or that races the guest read is captured rather than dropped.
Summary by cubic
Fixes packed virtqueue completion ordering to publish buffer ID and used.len before flipping USED, per Virtio 1.x. Prevents guests from seeing stale len/id and avoids dropped QUEUE interrupts under concurrency.
used_desc->idfrom the chain tail, setused_desc->len, then release-storeflags(USED bit).desc->len, release-storeflags, and use atomicfetch_orto set QUEUE inisr_status.desc->len = 0, release-storeflags, and use atomicfetch_orforisr_status.desc->flagsinvirtq_get_availandguest_event->flagsinvirtq_handle_avail.isr_statuswith an atomic exchange to avoid losing concurrent updates.Written for commit 4b26e80. Summary will update on new commits. Review in cubic