vfio: add VFIO cdev + iommufd device assignment path #3465
Draft
jstarks wants to merge 6 commits into
This PR modifies files containing For more on why we check whole files, instead of just diffs, check out the Rustonomicon |
jstarks
commented
May 12, 2026
Pull request overview
Adds a new Linux VFIO device-assignment backend based on the modern VFIO cdev + iommufd interfaces, alongside the existing legacy VFIO group/container + Type1v2 path. The launcher CLI is updated to select the backend per device and to support creating named iommufd contexts.
Changes:
- Introduces `vfio_sys` bindings/wrappers for VFIO cdev ioctls and iommufd IOAS map/unmap ioctls.
- Adds a new `VfioCdevDeviceHandle` resource type plus a `VfioCdevDeviceResolver`/binding path that allocates an IOAS, binds/attaches the VFIO cdev device, and registers an iommufd-backed DMA target.
- Updates OpenVMM CLI: adds `--iommu id=<name>` and switches `--vfio` to `host=<bdf>,port=<name>[,iommu=<id>]`, removing the old positional `<port>:<bdf>` syntax.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| vm/devices/user_driver/vfio_sys/src/lib.rs | Exposes new cdev and iommufd modules from vfio_sys. |
| vm/devices/user_driver/vfio_sys/src/iommufd.rs | New iommufd (/dev/iommu) ioctl wrappers for IOAS alloc/map/unmap/destroy. |
| vm/devices/user_driver/vfio_sys/src/cdev.rs | New VFIO cdev (/dev/vfio/devices/vfioN) ioctl wrappers (bind/attach/detach) and device open helper. |
| vm/devices/pci/vfio_assigned_device/src/resolver.rs | Adds an async resource resolver for the new cdev+iommufd device handle. |
| vm/devices/pci/vfio_assigned_device/src/manager.rs | Adds iommufd-based DMA target + binding that wires cdev devices into the region manager’s DMA mapping flow. |
| vm/devices/pci/vfio_assigned_device/src/lib.rs | Generalizes assigned-device lifetime handling to support either legacy binding or cdev binding. |
| vm/devices/pci/vfio_assigned_device_resources/src/lib.rs | Adds VfioCdevDeviceHandle resource ID (vfio-cdev) carrying cdev + iommufd fds. |
| openvmm/openvmm_entry/src/lib.rs | Launcher opens/dups /dev/iommu and opens the VFIO cdev node when iommu= is specified; otherwise keeps legacy group open behavior. |
| openvmm/openvmm_entry/src/cli_args.rs | CLI surface change: new --iommu, new --vfio key-value format, removes old positional parsing. |
| openvmm/openvmm_core/src/worker/dispatch.rs | Registers the new cdev resolver with the VM resource resolver on Linux. |
jstarks
commented
May 13, 2026
Comment on lines +131 to +136
pub fn ioas_alloc(&self) -> anyhow::Result<u32> {
    let mut cmd = IommuIoasAlloc {
        size: size_of::<IommuIoasAlloc>() as u32,
        flags: 0,
        out_ioas_id: 0,
    };
Comment on lines +95 to +101
pub fn bind_iommufd(&self, iommufd_fd: RawFd) -> anyhow::Result<u32> {
    let mut cmd = VfioDeviceBindIommufd {
        argsz: size_of::<VfioDeviceBindIommufd>() as u32,
        flags: 0,
        iommufd: iommufd_fd,
        out_devid: 0,
    };
Comment on lines +503 to +505
self.ctx
    .ioas_unmap(self.ioas_id, range.start(), range.len())
    .context("iommufd IOAS DMA unmap failed")?;
Comment on lines +1848 to 1863
    // Register the VFIO cdev + iommufd resolver for devices opened
    // via the cdev interface. Spawns a VfioCdevManager task that
    // shares IOAS contexts across devices with the same --iommu ID.
    let cdev_resolver = vfio_assigned_device::resolver::VfioCdevDeviceResolver::new(
        driver_source.builder().build("vfio-cdev-mgr"),
        dma_mapper_client,
    );
    resolver.add_async_resolver::<
        vm_resource::kind::PciDeviceHandleKind,
        _,
        vfio_assigned_device_resources::VfioCdevDeviceHandle,
        _,
    >(cdev_resolver);

    Some(handle)
};
Comment on lines +249 to +283
// On aarch64, the physical IOMMU reserves 128MB..129MB for the MSI
// doorbell window. iommufd inherits this and rejects DMA mappings
// that overlap it. Split any RAM range that crosses this window so
// that the region manager never maps it.
//
// TODO: query the actual reserved ranges from iommufd at runtime
// via `IOMMU_IOAS_IOVA_RANGES` instead of hardcoding.
#[cfg(guest_arch = "aarch64")]
{
    const IOMMU_MSI_RESERVED: MemoryRange = MemoryRange::new(0x800_0000..0x810_0000);
    let mut split_ram = Vec::with_capacity(ram.len() + 2);
    for entry in ram {
        if !entry.range.overlaps(&IOMMU_MSI_RESERVED) {
            split_ram.push(entry);
        } else {
            // Part before the reserved window.
            if entry.range.start() < IOMMU_MSI_RESERVED.start() {
                split_ram.push(MemoryRangeWithNode {
                    range: MemoryRange::new(
                        entry.range.start()..IOMMU_MSI_RESERVED.start(),
                    ),
                    vnode: entry.vnode,
                });
            }
            // Part after the reserved window.
            if entry.range.end() > IOMMU_MSI_RESERVED.end() {
                split_ram.push(MemoryRangeWithNode {
                    range: MemoryRange::new(IOMMU_MSI_RESERVED.end()..entry.range.end()),
                    vnode: entry.vnode,
                });
            }
        }
    }
    ram = split_ram;
}
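The splitting logic in the snippet above can be exercised in isolation. The following is a minimal sketch, assuming plain `std::ops::Range<u64>` values in place of `MemoryRange`/`MemoryRangeWithNode` and dropping the `vnode` field; `split_around` and `MSI_RESERVED` are illustrative names, not part of the PR.

```rust
use std::ops::Range;

/// Reserved window taken from the snippet above (128 MiB..129 MiB).
pub const MSI_RESERVED: Range<u64> = 0x800_0000..0x810_0000;

/// Returns the parts of each range in `ram` that lie outside `reserved`,
/// preserving the input order. Ranges that do not overlap are passed through.
pub fn split_around(ram: &[Range<u64>], reserved: &Range<u64>) -> Vec<Range<u64>> {
    let mut out = Vec::new();
    for r in ram {
        let overlaps = r.start < reserved.end && reserved.start < r.end;
        if !overlaps {
            out.push(r.clone());
            continue;
        }
        // Part before the reserved window, if any.
        if r.start < reserved.start {
            out.push(r.start..reserved.start);
        }
        // Part after the reserved window, if any.
        if r.end > reserved.end {
            out.push(reserved.end..r.end);
        }
    }
    out
}
```

A range that straddles the window yields two pieces, and a range fully inside it yields none, so nothing overlapping the window ever reaches the DMA mapper.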
Comment on lines +78 to +84
#[repr(C)]
pub struct IommuIoasAlloc {
    pub size: u32,
    pub flags: u32,
    pub out_ioas_id: u32,
}
Comment on lines +96 to +102
#[repr(C)]
pub struct IommuIoasUnmap {
    pub size: u32,
    pub ioas_id: u32,
    pub iova: u64,
    pub length: u64,
}
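Because these `#[repr(C)]` structs are passed to the kernel with a self-describing `size` field, one cheap safeguard is a compile-time layout check. This is a sketch only: the struct definitions are copied from the snippets above, the expected sizes (12 and 24 bytes) follow from the field types shown, and `new_unmap` is a hypothetical helper rather than anything in the PR.

```rust
use std::mem::size_of;

#[repr(C)]
pub struct IommuIoasAlloc {
    pub size: u32,
    pub flags: u32,
    pub out_ioas_id: u32,
}

#[repr(C)]
pub struct IommuIoasUnmap {
    pub size: u32,
    pub ioas_id: u32,
    pub iova: u64,
    pub length: u64,
}

// Compile-time layout checks: three u32 fields, and two u32 + two u64 fields
// with no padding (the two leading u32s already align the first u64).
const _: () = assert!(size_of::<IommuIoasAlloc>() == 12);
const _: () = assert!(size_of::<IommuIoasUnmap>() == 24);

/// Hypothetical constructor that fills in the `size` field the same way the
/// wrappers above do, so callers cannot forget it.
pub fn new_unmap(ioas_id: u32, iova: u64, length: u64) -> IommuIoasUnmap {
    IommuIoasUnmap {
        size: size_of::<IommuIoasUnmap>() as u32,
        ioas_id,
        iova,
        length,
    }
}
```

If a field is later added or reordered, the build fails instead of the ioctl silently receiving a mismatched struct.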
Comment on lines +504 to +506
self.ctx
    .ioas_unmap(self.ioas_id, range.start(), range.len())
    .context("iommufd IOAS DMA unmap failed")?;
Comment on lines +810 to +835
} else {
    let mut ioas_recv: mesh::Receiver<IoasManagerRpc> = mesh::Receiver::new();
    let sender = ioas_recv.sender();

    let dma_mapper_client = self.dma_mapper_client.clone();
    let iommu_id2 = iommu_id.clone();
    let task = self
        .spawner
        .spawn(format!("vfio-ioas-{iommu_id}"), async move {
            match IoasManager::new(iommu_id2, iommufd, &dma_mapper_client, ioas_recv).await
            {
                Ok(mgr) => mgr.run().await,
                Err(e) => {
                    tracing::error!(
                        error = format!("{e:#}"),
                        "failed to initialize iommufd IOAS manager"
                    );
                    // The recv will be dropped, causing all pending
                    // and future RPCs to fail with channel-closed.
                }
            }
        });
    self._tasks.push(task);
    self.managers.insert(iommu_id, sender.clone());
    sender
};
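The get-or-spawn pattern in the snippet above (one manager task per `--iommu` ID, reached through a cloned sender) can be sketched with the standard library alone. This sketch swaps in `std::sync::mpsc` and OS threads for `mesh` channels and async tasks, and every name in it (`Managers`, `IoasRpc`, `sender_for`) is hypothetical.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Stand-in for the RPC type; the real `IoasManagerRpc` is not shown in the PR excerpt.
pub enum IoasRpc {
    Attach { device: String, reply: Sender<String> },
}

pub struct Managers {
    managers: HashMap<String, Sender<IoasRpc>>,
    tasks: Vec<JoinHandle<()>>,
}

impl Managers {
    pub fn new() -> Self {
        Managers { managers: HashMap::new(), tasks: Vec::new() }
    }

    /// Returns the sender for `iommu_id`, spawning its manager on first use.
    pub fn sender_for(&mut self, iommu_id: &str) -> Sender<IoasRpc> {
        if let Some(sender) = self.managers.get(iommu_id) {
            return sender.clone();
        }
        let (tx, rx) = channel();
        let id = iommu_id.to_string();
        let task = thread::spawn(move || {
            // The loop ends once every sender for this manager is dropped.
            for rpc in rx {
                match rpc {
                    IoasRpc::Attach { device, reply } => {
                        // Ignore a caller that stopped waiting for the reply.
                        let _ = reply.send(format!("{device} attached to {id}"));
                    }
                }
            }
        });
        self.tasks.push(task);
        self.managers.insert(iommu_id.to_string(), tx.clone());
        tx
    }

    pub fn manager_count(&self) -> usize {
        self.managers.len()
    }
}
```

Two devices that name the same ID share one manager, which is what lets them share an IOAS. As in the original, if a manager exits early its receiver is dropped and later sends fail rather than hang.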
Comment on lines +838 to +848
let entry = std::fs::read_dir(&vfio_dev_dir)
    .with_context(|| {
        format!(
            "failed to read {}: is {} bound to vfio-pci?",
            vfio_dev_dir.display(),
            cli_cfg.pci_id
        )
    })?
    .next()
    .context("no vfio-dev entry found")?
    .context("failed to read vfio-dev entry")?;
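The lookup above takes the first entry of the device's `vfio-dev` directory. A standalone sketch of that step follows, using `std::io::Error` instead of `anyhow`. It assumes the entry name (for example `vfio0`) is also the node name under `/dev/vfio/devices/`, which matches the cdev path given in the PR description; both function names are hypothetical.

```rust
use std::ffi::{OsStr, OsString};
use std::io;
use std::path::{Path, PathBuf};

/// Returns the name of the first entry in `vfio_dev_dir` (for example `vfio0`).
/// Fails if the directory is unreadable or empty.
pub fn first_vfio_dev_name(vfio_dev_dir: &Path) -> io::Result<OsString> {
    let entry = std::fs::read_dir(vfio_dev_dir)?
        .next()
        .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no vfio-dev entry found"))??;
    Ok(entry.file_name())
}

/// Maps that name to the cdev node path (assumed layout: /dev/vfio/devices/<name>).
pub fn cdev_node_path(name: &OsStr) -> PathBuf {
    Path::new("/dev/vfio/devices").join(name)
}
```

Keeping the directory scan separate from the path construction makes the "is the device bound to vfio-pci?" failure easy to test against a temporary directory.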
Add support for the modern VFIO cdev + iommufd device assignment interface alongside the existing legacy group/container + Type1v2 path.

The cdev interface (/dev/vfio/devices/vfioN) provides per-device file descriptors instead of the legacy group model, and iommufd (/dev/iommu) replaces the VFIO Type1v2 container for DMA mapping via IOAS objects. Both paths coexist; the user selects the backend per device via the CLI.

The CLI adds two new flags:
- `--iommu id=<name>` creates an iommufd context
- `--vfio host=<bdf>,port=<name>[,iommu=<id>]` assigns a device, optionally referencing an iommufd context for the cdev path

The old `--vfio <port>:<bdf>` positional syntax is removed in favor of the key-value format.

When `iommu=` is specified, the launcher opens the cdev device node and an iommufd fd, producing a `VfioCdevDeviceHandle` resource. A new `VfioCdevDeviceResolver` handles resolution: it allocates an IOAS, binds the cdev device to iommufd, attaches the device to the IOAS, and registers an `IommufdDmaTarget` with the region manager for identity DMA mapping. The resulting `VfioAssignedPciDevice` is identical regardless of which path opened the device. Config space, BAR mapping, and MSI-X emulation are shared.
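To make the key-value `--vfio` format concrete, here is a minimal parsing sketch. It is not the parser from `cli_args.rs`: `VfioArg` and `parse_vfio_arg` are hypothetical names, and the error strings and duplicate-key handling are assumptions.

```rust
/// Parsed form of `--vfio host=<bdf>,port=<name>[,iommu=<id>]`.
#[derive(Debug, PartialEq, Eq)]
pub struct VfioArg {
    pub host: String,
    pub port: String,
    /// `Some` selects the cdev + iommufd path; `None` keeps the legacy path.
    pub iommu: Option<String>,
}

/// Hypothetical parser for the key-value syntax; rejects unknown and duplicate keys.
pub fn parse_vfio_arg(s: &str) -> Result<VfioArg, String> {
    let mut host = None;
    let mut port = None;
    let mut iommu = None;
    for part in s.split(',') {
        let (key, value) = part
            .split_once('=')
            .ok_or_else(|| format!("expected key=value, got `{part}`"))?;
        let slot = match key {
            "host" => &mut host,
            "port" => &mut port,
            "iommu" => &mut iommu,
            other => return Err(format!("unknown key `{other}`")),
        };
        if slot.replace(value.to_string()).is_some() {
            return Err(format!("duplicate key `{key}`"));
        }
    }
    Ok(VfioArg {
        host: host.ok_or("missing host=<bdf>")?,
        port: port.ok_or("missing port=<name>")?,
        iommu,
    })
}
```

In this sketch the removed positional form (`<port>:<bdf>`) fails with an "expected key=value" error because it contains no `=`, and omitting `iommu=` leaves `iommu` as `None`, which corresponds to the legacy group/container path.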
The fields were prefixed with _ as if unused, but the inspected fields appear in inspect output and device_id/sender are explicitly read in the Drop impl. Drop the prefixes so the names reflect actual use.