
Complete parity with wgpu#594

Draft
inner-daemons wants to merge 71 commits into gfx-rs:trunk from inner-daemons:wgpu-native-complete

@inner-daemons inner-daemons commented May 26, 2026

Checklist

  • cargo clippy reports no issues
  • cargo doc reports no issues
  • cargo deny issues have been fixed or added to deny.toml
  • Human-readable change descriptions added to CHANGELOG.md under the "Unreleased" heading.
    • If the change does not affect the user (or is a process change), preface the change with "Internal:"
    • Add credit to yourself for each change: Added new functionality. @githubname

Description

This is the sister PR to gfx-rs/wgpu#9570. It brings wgpu-native into full parity with wgpu, so that all features are supported and all tests pass.

I cannot guarantee that everything is 100% correct, but I plan to review every change myself, and Claude has also reviewed every change. As mentioned above, it passes wgpu's full test suite.

The only real breaking change is the removal of wgpuDeviceCreateShaderModuleSpirV, which has been replaced by wgpu's new passthrough API. Migrating callers should be straightforward.

Related Issues

inner-daemons and others added 16 commits May 14, 2026 23:55
- RenderBundleEncoder::new: add None device arg (new optional param)
- VertexBufferLayout: wrap in Some (now Option<VertexBufferLayout>)
- RequestAdapterOptions: add apply_limit_buckets: false field
- enumerate_adapters: add apply_limit_buckets bool arg
- render_bundle_encoder_finish: pass Box directly, not *encoder
- wgpu_render_bundle_set_vertex_buffer: buffer_id → Some(buffer_id)
- render_pass_set_vertex_buffer: buffer_id → Some(buffer_id)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Exposes per-format GPU texture capabilities (allowed usages + feature
flags) that wgpu-core already queries via the HAL but previously had no
C API surface. Adds WGPUNativeTextureFormatCapabilities struct and
WGPUNativeTextureFormatFeatureFlags bitmask to wgpu.h.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
inner-daemons and others added 13 commits May 26, 2026 19:09
- Add 14 WGPUNativeTextureFormat_Astc*Sfloat codes (0x0003000A..0x00030017)
  starting after R64Uint (0x00030009) to avoid value collision
- Add WGPUNativeFeature_VulkanExternalMemoryFd (0x00030041)
- Extend WGPUNativeLimits with mesh shader and ray tracing limit fields
- to_native_texture_format: expand ASTC HDR channel to 14 native codes
- write_limits_struct: fill all new extended limit fields
- map_required_limits: read all new extended limit fields

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds WGPUNativeFeature_VulkanExternalMemoryDmaBuf (0x00030042) to expose
the Linux DMA-BUF external memory Vulkan feature through the C API.
Maps to wgpu::Features::VULKAN_EXTERNAL_MEMORY_DMA_BUF in both
features_to_native and feature_from_native.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
These formats exist in the standard WebGPU C header (WGPUTextureFormat_*)
but were missing from wgpu-native's Rust conv.rs, causing a panic when
wgpuAdapterGetTextureFormatCapabilities was called with these formats.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ctureArray binding

- ffi/wgpu.h: Add WGPUWgslLanguageFeatures bitflag typedef with 3 constants
  (ReadOnlyAndReadWriteStorageTextures, Packed4x8IntegerDotProduct, PointerCompositeAccess)
- ffi/wgpu.h: Declare wgpuGetWgslLanguageFeatures() and wgpuSurfaceDiscardTexture()
- ffi/wgpu.h: Add tlases/tlasCount array fields to WGPUBindGroupEntryExtras
- src/lib.rs: Implement wgpuGetWgslLanguageFeatures via naga ImplementedLanguageExtension
- src/lib.rs: Implement wgpuSurfaceDiscardTexture with double-discard guard
- src/conv.rs: Handle AccelerationStructureArray in map_bind_group_entry

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- ffi/wgpu.h: Add WGPUDownlevelFlags bitmask, WGPUShaderModel enum,
  WGPUDownlevelCapabilities struct, and wgpuAdapterGetDownlevelCapabilities()
- ffi/wgpu.h: Add WGPUAllocationReport, WGPUMemoryBlockReport, WGPUAllocatorReport
  structs with wgpuDeviceGetAllocatorReport() and wgpuAllocatorReportFreeMembers()
- src/lib.rs: Implement wgpuAdapterGetDownlevelCapabilities via wgpu-core
- src/lib.rs: Implement wgpuDeviceGetAllocatorReport and wgpuAllocatorReportFreeMembers;
  allocations array and name strings are heap-allocated and freed by the caller
- src/conv.rs: Add map_downlevel_capabilities helper

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds NON_POWER_OF_TWO_MIPMAPPED_TEXTURES, INDEPENDENT_BLEND,
DEPTH_TEXTURE_AND_BUFFER_COPIES, WEBGPU_TEXTURE_FORMAT_SUPPORT,
BUFFER_BINDINGS_NOT_16_BYTE_ALIGNED, FULL_DRAW_INDEX_UINT32,
VIEW_FORMATS, SURFACE_VIEW_FORMATS, NONBLOCKING_QUERY_RESOLVE,
SHADER_F16_IN_F32, and MSL2_1 to WGPUDownlevelFlags and updates
map_downlevel_capabilities to include them.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
map_bind_group_layout_entry previously only read the binding array
count from the legacy WGPUBindGroupLayoutEntryExtras.count extension.
The WebGPU spec now puts this value on WGPUBindGroupLayoutEntry.bindingArraySize
directly. Fall through to bindingArraySize if the extension count is 0,
so both old and new callers work correctly.

Without this fix, callers using the standard field got count=None
in the wgt::BindGroupLayoutEntry, causing Metal's bindless path to
be skipped and naga MSL translation to fail with "mapping of
ResourceBinding is missing".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
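The fallback described in the commit above could look roughly like the sketch below. The struct and function names (`LayoutEntry`, `LayoutEntryExtras`, `binding_array_count`) are hypothetical stand-ins, not the actual wgpu-native types; only the "extension count first, then `bindingArraySize`" ordering comes from the commit message, and the `Option<NonZeroU32>` result is an assumption about how the count is represented.

```rust
use std::num::NonZeroU32;

// Hypothetical stand-in for WGPUBindGroupLayoutEntry (only the relevant field).
struct LayoutEntry {
    binding_array_size: u32,
}

// Hypothetical stand-in for the legacy WGPUBindGroupLayoutEntryExtras extension.
struct LayoutEntryExtras {
    count: u32,
}

/// Prefer the legacy extension count when it is nonzero; otherwise fall
/// through to the standard field. Zero in both places yields `None`.
fn binding_array_count(
    entry: &LayoutEntry,
    extras: Option<&LayoutEntryExtras>,
) -> Option<NonZeroU32> {
    let legacy = extras.map_or(0, |e| e.count);
    let raw = if legacy != 0 {
        legacy
    } else {
        entry.binding_array_size
    };
    NonZeroU32::new(raw)
}
```

With this shape, a caller that sets only the standard field gets a `Some` count, which is the case the commit says was previously lost.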
Previously, wgpuBlasPrepareCompactAsync just logged errors on failure
(e.g. CompactionUnsupported) without pushing them to the device's
validation error scope. Callers using push_error_scope/pop_error_scope
to detect validation errors would see no error, causing tests to fail.

Fix by adding error_sink to WGPUBlasImpl (propagated from the device at
creation time) and calling handle_error() in wgpuBlasPrepareCompactAsync
and wgpuQueueCompactBlas so errors reach the active error scope.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ceCreateBlas

- Handle WGPUBlasGeometryKind_AABBs by mapping aabbDescriptors to
  wgt::BlasAABBGeometrySizeDescriptor; previously panicked unconditionally
  on any non-triangle geometry kind.
- Preserve the None/Some distinction for index_count that wgpu-core
  uses to detect mismatched index descriptors:
  - indexFormat=Undefined + indexCount>0  → (None, Some(count))
  - indexFormat!=Undefined + indexCount=0 → (Some(format), None)
  Both cases hit wgpu-core's MissingIndexData validation error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
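A minimal sketch of the index mapping listed in the commit above, assuming the two remaining combinations (both absent, both present) map to `(None, None)` and `(Some, Some)`. `IndexFormat` and `map_index` are hypothetical names for illustration, not the real wgpu-native code.

```rust
// Hypothetical mirror of WGPUIndexFormat.
#[derive(Debug, Clone, Copy, PartialEq)]
enum IndexFormat {
    Undefined,
    Uint16,
    Uint32,
}

/// Map the C-side (format, count) pair to independent Options, so a
/// mismatched pair stays visible to downstream validation instead of
/// being normalised away.
fn map_index(format: IndexFormat, count: u32) -> (Option<IndexFormat>, Option<u32>) {
    let format = (format != IndexFormat::Undefined).then_some(format);
    let count = (count > 0).then_some(count);
    (format, count)
}
```

Mapping each field on its own is what keeps the two mismatched cases distinguishable; collapsing them in the conversion layer would hide the error that wgpu-core is expected to report.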
inner-daemons and others added 5 commits May 28, 2026 02:25
1. Align BlasTriangleGeometrySizeDescriptor index mapping in
   wgpuCommandEncoderBuildAccelerationStructures with wgpuDeviceCreateBlas.
   The build path was using a simpler two-branch check that differed from
   creation in two edge cases:
   - indexFormat=Undefined, indexCount>0: build silently dropped the count
     (gave (None,None)) while creation passed (None,Some(N)) to wgpu-core.
   - indexFormat=defined, indexCount=0: build gave (Some(fmt),Some(0)) while
     creation gave (Some(fmt),None), causing a descriptor mismatch in
     wgpu-core validation at build time.

2. Fix aliased &T / &mut T UB in all three AdapterInfoExtras nextInChain
   blocks (wgpuAdapterGetInfo, wgpuAdapterInfoFreeMembers,
   wgpuDeviceGetAdapterInfo). The previous pattern called
   `ptr.as_ref()` to check sType (creating a live shared reference) and
   then immediately created `&mut *(ptr as *mut WGPUAdapterInfoExtras)` to
   the same allocation — aliased &/&mut is UB under Rust's memory model.
   Fixed by reading sType directly via raw pointer dereference
   (`(*ptr).sType`) so no shared reference is live when the &mut is made.

3. Document that wgpuShaderModuleGetCompilationInfo always invokes its
   callback synchronously regardless of callback_info.mode, consistent with
   the same known limitation in wgpuBufferMapAsync. Changed the inline
   `WGPUFuture { id: 0 }` literal to NULL_FUTURE for consistency.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
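The pattern from item 2 above can be illustrated with a small sketch: the tag is read through the raw pointer, and the `&mut` is created only afterwards. All type names, the field layout, and the `STYPE_INFO_EXTRAS` value are hypothetical; this is not the actual wgpu-native code.

```rust
// Hypothetical chained-struct header, loosely modelled on WGPUChainedStruct.
#[repr(C)]
struct ChainedStruct {
    next: *mut ChainedStruct,
    s_type: u32,
}

// Hypothetical extension struct; the header must be the first field.
#[repr(C)]
struct InfoExtras {
    chain: ChainedStruct,
    value: u32,
}

const STYPE_INFO_EXTRAS: u32 = 0x0003_0001; // made-up tag value

/// Writes `value` into the extension struct if the tag matches.
///
/// # Safety
/// `ptr` must be null or point to a valid, writable chained struct whose
/// full allocation is an `InfoExtras` when the tag matches.
unsafe fn fill_extras(ptr: *mut ChainedStruct, value: u32) -> bool {
    if ptr.is_null() {
        return false;
    }
    // Read the tag through the raw pointer: no `&ChainedStruct` is created.
    let s_type = unsafe { (*ptr).s_type };
    if s_type != STYPE_INFO_EXTRAS {
        return false;
    }
    // Only now create the exclusive reference to the full struct.
    let extras = unsafe { &mut *(ptr as *mut InfoExtras) };
    extras.value = value;
    true
}
```

Because no shared reference exists at any point, the exclusive reference is the only reference to the allocation while it is in use.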
inner-daemons and others added 7 commits May 29, 2026 02:20
- Remove stale FIXME on wgpuDevicePoll (function was already reworked)
- Fix misleading comment in create_shader_module_passthrough: WGSL routes
  through the validated path because wgpu-native has no unvalidated WGSL;
  binary formats use WGPUShaderModuleDescriptorPassthrough
- Move map_cooperative_scalar_type and map_state_to_u32 from lib.rs into
  conv.rs where conversion helpers belong
- Add comment above SetLabel no-ops explaining they are permanent because
  wgpu-core does not expose a set-label API for these resource types

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
device.rs
- Replace 17× four-line label boilerplate (let label + let label_sv) with
  single conv::opt_str_to_string_view(desc.label) call (~51 lines saved)
- Extract FragmentStateStorage struct and build_fragment_state() helper,
  eliminating ~80 lines of identical fragment state build code duplicated
  between create_render_pipeline and create_mesh_pipeline
- Remove #[allow(unused_assignments)] from create_render_pipeline (no longer
  needed); keep it on create_mesh_pipeline for task stage with accurate comment
- Trim Drop for CDevice comment: 5 lines → 2
- Merge two separate AccelerationStructure chain-pointer comments into one
- Collapse "Wire chain pointers" comment blocks: 3 lines → 1 each
- Trim set_device_lost_callback log message and preamble comment
- Trim destroy() comment block: 9 lines → 2 (points to adapter.rs for detail)
- Trim CQueue field doc comments: 3-line prose → 1-line each

adapter.rs
- Remove "Build the feature list" and "Build the limits structs" section
  comments (variable names are self-documenting)
- Trim device_lost_cb block comment: 11 lines → 4 (remove CURRENT STATE /
  FORWARD COMPAT / KNOWN LIMITATION sub-headings, keep the core fact)
- Merge two near-identical SAFETY comments on device_lost_ptr / handler_ptr
  into one comment before the struct literal
- Remove "Capture adapter info before creating device" comment

command.rs
- Remove six obvious section comments: "Color attachments.", "Hole in color
  attachments array.", "Depth stencil attachment.", "Timestamp writes.",
  "Occlusion query set.", "Convert optional timestamp writes." — all restate
  the immediately following variable names
- Remove "None means copy to end" comment on unwrap_or(u64::MAX)
- Trim TLAS build comment: 2 lines → 1
- Replace 3× label boilerplate with conv::opt_str_to_string_view

pass.rs
- Add CBuffer to the use crate::resource import
- Replace 15× crate::resource::CBuffer with CBuffer (CBuffer was imported
  everywhere else in the crate but missing from this file's use statement)

lib.rs (wgpu-c-backend)
- Replace println! with log::debug! (log is already a dependency)
- Trim instance_factory comment: 4 lines → 1
- Trim panic propagation module comment: remove last 2 lines about
  spontaneous callbacks being "inherently racy" (design-doc content)
- Extract not_found() local function to deduplicate the 13-line
  RequestAdapterError::NotFound struct literal constructed twice in
  request_adapter (once in cb callback, once as fallback)

resource.rs
- Trim CBuffer.is_mapped field comment: 4 lines → 1
- Trim map_async spin-poll comment: 6 lines → 2
- Trim get_mapped_range guard comment: 3 lines → 1
- Trim CBufferMappedRange.read_only field comment: 4 lines → 1
- Trim CQueueWriteBuffer module comment: 3 lines → 1
- Replace label boilerplate in CTexture::create_view with opt_str_to_string_view

src/lib.rs
- Fix misplaced doc comments on wgpuDevicePoll: /// lines must precede all
  attributes, not appear between #[no_mangle] and pub fn
- Simplify wgpuDevicePoll wait branch: fold two PollType::Wait arms (differing
  only in submission_index: Some/None) into one using .copied(); replace
  if/else timeout block with (timeout_ns != 0).then(|| ...) (15 lines → 3)
- Trim wgpuShaderModuleGetCompilationInfo TODO comment: 4 lines → 2

src/conv.rs
- Rewrite WGPU_NATIVE_STORAGE_TEXTURE_ACCESS_ATOMIC and
  WGPU_NATIVE_TEXTURE_FORMAT_R64_UINT comments to lead with the non-obvious
  numeric value/range position rather than restating the constant name

CHANGELOG.md
- Tighten feature-parity paragraph: remove vague "and more. Additionally..."
  trailing sentence; replace with "Now tested against wgpu's test suite."

wgpu-c-backend/readme.md + explanation.txt
- Merge explanation.txt content into readme.md under a "Directory structure"
  section; delete explanation.txt (.txt file for doc content is non-standard)
- Trim redundant opening sentence from readme.md description

run-wgpu-tests.py
- Collapse 3 print statements in copy_c_backend (Removing / Copying) into
  one "Syncing" line printed before the operation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
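The wgpuDevicePoll simplification listed under src/lib.rs above (one `Wait` arm using `.copied()`, plus `(timeout_ns != 0).then(|| ...)`) might be sketched as follows. The `PollType` enum and `build_poll` function here are hypothetical, simplified stand-ins, and the types of the submission index and timeout are assumptions.

```rust
use std::time::Duration;

// Hypothetical, simplified stand-in for a poll-type enum.
#[derive(Debug, PartialEq)]
enum PollType {
    Poll,
    Wait {
        submission_index: Option<u64>,
        timeout: Option<Duration>,
    },
}

/// Build the poll request. A single `Wait` arm covers both the
/// Some and None submission-index cases via `.copied()`, and a zero
/// `timeout_ns` becomes `None` via `bool::then`.
fn build_poll(wait: bool, submission_index: Option<&u64>, timeout_ns: u64) -> PollType {
    if !wait {
        return PollType::Poll;
    }
    PollType::Wait {
        submission_index: submission_index.copied(),
        timeout: (timeout_ns != 0).then(|| Duration::from_nanos(timeout_ns)),
    }
}
```

For example, `build_poll(true, None, 0)` produces a `Wait` with neither a submission index nor a timeout, without needing a separate match arm for that case.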
Address 21 doc/comment issues found in audit of PR gfx-rs#594:

Wrong/misleading:
- adapter.rs: rewrite SAFETY comment on callback userdata pointers; the
  previous text claimed the Boxes were "stored in CDevice" at the point
  they are registered, but they are still local stack allocations at that
  point — moved into CDevice only on the success path
- device.rs: s/wgpu-core/wgpu-native/ in create_buffer validation comment;
  the validation happens at the wgpu-native layer, not wgpu-core directly
- resource.rs: mark the null-return behaviour of wgpuBufferGetMappedRange
  for Read buffers as "observed, not a specified API contract"

Redundant/inconsistent:
- device.rs: remove 9 section-label comments (// Vertex state., // Primitive
  state. ×2, // Depth stencil. ×2, // Multisample. ×2, // Task stage
  (optional)., // Mesh stage (required).) that restate variable names
- device.rs: unify the duplicate "Wire chain pointers" comment so both
  occurrences carry the Box stability explanation
- pass.rs: align transition_resources no-op comment with the more
  informative command.rs version ("Metal, Vulkan, etc.")

Missing documentation:
- adapter.rs: expand Box-vs-Arc comment to explain that handler_ptr is
  the userdata1 raw pointer for the C callback
- device.rs: add /// doc comment on set_device_lost_callback explaining
  that GPU-initiated loss never fires it (only Device::destroy() does)
- device.rs: add comment to create_staging_buffer explaining the CPU
  staging model and cross-referencing CQueueWriteBuffer
- device.rs: condense CQueue field docs for shared fields to cross-
  reference CDevice rather than duplicating the explanation
- pass.rs: add End-in-Drop explanation to CComputePass and CRenderPass
  Drop impls (wgpu ends a pass by dropping the pass object)
- resource.rs: expand spin-poll comment in map_async to explain that
  it semi-blocks the caller and what the 50 ms cap means
- surface.rs: document the silent Fifo fallback in configure() when
  present_mode_to_native returns None

Organisation:
- src/lib.rs: move wgpuExternalTextureSetLabel into the grouped SetLabel
  no-op block; update the block comment from "most resource types" to
  "these resource types"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@cwfitzgerald cwfitzgerald marked this pull request as ready for review May 30, 2026 20:42
@inner-daemons inner-daemons marked this pull request as draft May 30, 2026 21:08
@inner-daemons
Author

@Vipitis Do you have a Discord or something? I'd like to be able to communicate with you about this, since I think you will be one of the main users.
