Add CommittedPrimitiveIndex test on a multi-triangle BLAS by MarijnS95 · Pull Request #1272 · llvm/offload-test-suite

MarijnS95 · 2026-06-03T09:12:23Z

Summary

Stacks on top of #1232 / #1245. Adds the first InlineRT test with a non-trivial BLAS layout — three triangles tiled along x at x = -4, 0, +4 — and a 3-lane dispatch that fires one ray per lane straight down at its own triangle. Each lane's CommittedPrimitiveIndex() must equal its lane index. Also exercises divergent rays per thread for free.
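The expected result can be reproduced on the CPU with a small reference model. The sketch below is not the test's shader or YAML: the triangle extents, the z = 0 plane, the ray origin height and the "down" direction of -z are all assumptions chosen for illustration, and `triangleAt` / `committedPrimitiveIndex` are hypothetical helper names. It only shows why each lane should report its own lane index when the triangles do not overlap.

```cpp
#include <array>
#include <cassert>
#include <optional>

struct Vec3 { float x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 cross(Vec3 a, Vec3 b) {
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

using Triangle = std::array<Vec3, 3>;

// ASSUMPTION: each triangle lies in the z = 0 plane, 2 units wide, centred at cx.
Triangle triangleAt(float cx) {
  return Triangle{{Vec3{cx - 1.0f, -1.0f, 0.0f}, Vec3{cx + 1.0f, -1.0f, 0.0f},
                   Vec3{cx, 1.0f, 0.0f}}};
}

// Moller-Trumbore ray/triangle test; returns the hit distance on a hit.
std::optional<float> intersect(Vec3 origin, Vec3 dir, const Triangle &tri) {
  Vec3 e1 = sub(tri[1], tri[0]);
  Vec3 e2 = sub(tri[2], tri[0]);
  Vec3 p = cross(dir, e2);
  float det = dot(e1, p);
  if (det > -1e-7f && det < 1e-7f) return std::nullopt; // ray parallel to plane
  float inv = 1.0f / det;
  Vec3 s = sub(origin, tri[0]);
  float u = dot(s, p) * inv;
  if (u < 0.0f || u > 1.0f) return std::nullopt;
  Vec3 q = cross(s, e1);
  float v = dot(dir, q) * inv;
  if (v < 0.0f || u + v > 1.0f) return std::nullopt;
  float t = dot(e2, q) * inv;
  if (t <= 0.0f) return std::nullopt; // hit is behind the ray origin
  return t;
}

// Index of the closest triangle hit by the given lane's ray, or -1 on a miss.
int committedPrimitiveIndex(int lane) {
  assert(lane >= 0 && lane < 3);
  const float centres[3] = {-4.0f, 0.0f, 4.0f};
  Vec3 origin{centres[lane], 0.0f, 1.0f}; // ASSUMPTION: one unit above the plane
  Vec3 dir{0.0f, 0.0f, -1.0f};            // ASSUMPTION: "down" means -z
  int best = -1;
  float bestT = 0.0f;
  for (int prim = 0; prim < 3; ++prim) {
    std::optional<float> t = intersect(origin, dir, triangleAt(centres[prim]));
    if (t && (best < 0 || *t < bestT)) {
      best = prim;
      bestT = *t;
    }
  }
  return best;
}
```

Calling `committedPrimitiveIndex(lane)` for lanes 0, 1 and 2 returns the lane index under these assumptions, because each ray can only reach the triangle directly below it.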

Seed test for the multi-primitive / multi-geometry BLAS bullets in the inline-RT coverage epic (#1258).

Independent of the other InlineRT test PRs (#1271, #1274) — only adds a new test file.

Marked draft because it depends on #1232 / #1245 landing first.

Test plan

- ninja check-hlsl-mtl-feature-inlinert locally on macOS: passes
- Vulkan run on a separate machine
- D3D12 run on a Windows machine
- Linux Vulkan via the native offloader against an NVIDIA RTX 3060: passes
- D3D12 via Wine + vkd3d-proton + cross-compiled offloader.exe on the same GPU: passes

…rce allocation

Introduce the foundational types for ray tracing acceleration structures: abstract `AccelerationStructure` base class, geometry/instance descriptors, BLAS/TLAS build-request structs with size queries, the `AccelerationStructureBuildFlags` bitmask (using `LLVM_DECLARE_ENUM_AS_BITMASK` since `TextureUsage` already uses the intrusive `LLVM_MARK_AS_BITMASK_ENUM`; `TextureUsage` also gains its previously-missing `LLVM_ENABLE_BITMASK_ENUMS_IN_NAMESPACE()`), and AS resource allocation across DX12, Vulkan, and Metal. Recording build commands lands in a follow-up commit on top of the ComputeEncoder abstraction.

Vulkan device creation switches to a single `vkGetPhysicalDeviceFeatures2` call covering every extension feature struct we care about (atomic-int64, mesh-shader, acceleration-structure, BDA on 1.1): each struct is chained into `pNext` before the query, and post-query we verify the gating bool and clear the sub-features we don't enable (capture-replay, indirect-build, multiview, etc.).

Drive-by: rather than letting `vkCreateDevice` reject the device with a generic `VK_ERROR_FEATURE_NOT_PRESENT`, the code now returns a descriptive `llvm::Error` naming the extension and the bool that came back zero, pinpointing the case where a driver advertises an extension but reports its base feature as `VK_FALSE`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
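The chain-then-verify flow described in this commit message can be illustrated without the Vulkan headers. The following is a rough sketch using mock structs, not the PR's code: `FeatureHeader`, `MockAccelStructFeatures`, `MockMeshShaderFeatures`, `mockQueryFeatures`, `checkGate` and `selectFeatures` are all hypothetical names, the "driver" is simulated, and a plain string stands in for `llvm::Error`. Only the field names mirror the real Vulkan feature structs.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Mock header: like Vulkan's feature structs, each struct begins with a type
// tag and a pNext pointer so the structs can be linked into one chain.
struct FeatureHeader {
  uint32_t sType;
  FeatureHeader *pNext;
};

struct MockAccelStructFeatures {
  FeatureHeader header{1, nullptr};
  uint32_t accelerationStructure = 0;              // gating bool
  uint32_t accelerationStructureCaptureReplay = 0; // sub-feature left disabled
};

struct MockMeshShaderFeatures {
  FeatureHeader header{2, nullptr};
  uint32_t meshShader = 0;          // gating bool
  uint32_t multiviewMeshShader = 0; // sub-feature left disabled
};

// Simulated driver query: walks the chain once and fills in what it supports.
void mockQueryFeatures(FeatureHeader *chain, bool reportsAccelStruct) {
  assert(chain && "chain must contain at least one struct");
  for (FeatureHeader *h = chain; h; h = h->pNext) {
    if (h->sType == 1) {
      auto *f = reinterpret_cast<MockAccelStructFeatures *>(h);
      f->accelerationStructure = reportsAccelStruct ? 1 : 0;
      f->accelerationStructureCaptureReplay = 1;
    } else if (h->sType == 2) {
      auto *f = reinterpret_cast<MockMeshShaderFeatures *>(h);
      f->meshShader = 1;
      f->multiviewMeshShader = 1;
    }
  }
}

// Returns a message naming the extension and the bool when the gate is zero.
std::optional<std::string> checkGate(const char *extension, const char *boolName,
                                     uint32_t value) {
  if (value) return std::nullopt;
  return std::string(extension) + " is advertised but " + boolName +
         " came back VK_FALSE";
}

std::optional<std::string> selectFeatures(bool reportsAccelStruct,
                                          MockAccelStructFeatures &as,
                                          MockMeshShaderFeatures &mesh) {
  // Chain every struct into pNext before the single query.
  as.header.pNext = &mesh.header;
  mesh.header.pNext = nullptr;
  mockQueryFeatures(&as.header, reportsAccelStruct);

  // Post-query: verify each gating bool.
  if (auto err = checkGate("VK_KHR_acceleration_structure",
                           "accelerationStructure", as.accelerationStructure))
    return err;
  if (auto err = checkGate("VK_EXT_mesh_shader", "meshShader", mesh.meshShader))
    return err;

  // Clear the sub-features that will not be enabled at device creation.
  as.accelerationStructureCaptureReplay = 0;
  mesh.multiviewMeshShader = 0;
  return std::nullopt;
}
```

The point of returning a message from `checkGate` is that the failure names the exact extension and bool, instead of surfacing later as a generic device-creation error.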

…lper

Move acceleration-structure build commands behind the abstract ComputeEncoder interface so the orchestration (data upload, build-request creation, AS allocation, build recording) can live in one place rather than splitting across three backends.

ComputeEncoder gains a single batchBuildAS(ArrayRef<ASBuildItem>) method. Each item carries an AccelerationStructure plus a BLAS or TLAS build request via PointerUnion. The caller guarantees no inter-item memory dependencies inside a batch, so backends record the whole batch with one barrier slot and no per-element barriers.

- Vulkan: single vkCmdBuildAccelerationStructuresKHR call covering the whole batch. TLAS items serialize VkAccelerationStructureInstanceKHR into a device-address upload buffer, BLAS items pull addresses from each VulkanBuffer (new getDeviceAddress accessor). Storage buffers transparently gain SHADER_DEVICE_ADDRESS + ACCEL_BUILD_INPUT_READ_ONLY flags when ray tracing is supported, with the matching VkMemoryAllocateFlagsInfo chained on every allocation.
- DX12: loop calling BuildRaytracingAccelerationStructure per item with no intermediate barriers; D3D12_RAYTRACING_INSTANCE_DESC is bit-identical to the Vulkan instance struct.
- Metal: lazy transition to MTL::AccelerationStructureCommandEncoder, deduplicates BLAS handles into the MTL::InstanceAccelerationStructureDescriptor's instancedAccelerationStructures array (Metal references BLASes by index, not GPU address).

Each backend's CommandBuffer now carries a back-pointer to its owning Device so the encoder can reach device-loaded entry points and helpers, plus a keep-alive list for AS scratch and instance buffers.

A shared helper buildPipelineAccelerationStructures in lib/API/Device.cpp walks Pipeline::AccelStructs, uploads vertex/index data via the new createBufferWithData, builds requests, allocates AS objects, and issues two batchBuildAS calls (BLAS batch then TLAS batch; VUID-03403 forbids referencing a sibling dstAccelerationStructure in one command). Each backend's executeProgram calls this helper to build the pipeline's AS objects.

Descriptor binding for AS resources is intentionally still missing: the tests progress past AS-build now and surface only the descriptor-write gap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
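The Metal bullet's "BLASes by index" requirement amounts to a deduplication pass over the per-instance BLAS references. A minimal sketch of that idea follows, using plain pointers in place of Metal objects; `BlasHandle`, `InstanceTable` and `deduplicate` are hypothetical names for illustration and do not come from the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical opaque handle standing in for a built BLAS object.
using BlasHandle = const void *;

struct InstanceTable {
  std::vector<BlasHandle> uniqueBlases;    // plays the role of the instanced-AS array
  std::vector<uint32_t> instanceBlasIndex; // one index into uniqueBlases per instance
};

// Maps each instance's BLAS reference to an index into a duplicate-free array,
// preserving first-seen order.
InstanceTable deduplicate(const std::vector<BlasHandle> &perInstanceBlas) {
  InstanceTable table;
  std::unordered_map<BlasHandle, uint32_t> seen;
  for (BlasHandle blas : perInstanceBlas) {
    assert(blas && "every instance must reference a BLAS");
    auto [it, inserted] = seen.try_emplace(
        blas, static_cast<uint32_t>(table.uniqueBlases.size()));
    if (inserted)
      table.uniqueBlases.push_back(blas);
    table.instanceBlasIndex.push_back(it->second);
  }
  return table;
}
```

With three instances referencing BLASes A, B, A, the table holds two unique entries and the per-instance indices 0, 1, 0. An address-based backend would instead store the BLAS address directly in each instance record.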

Wire up acceleration-structure descriptor binding end-to-end across all three backends so shaders can actually consume the TLAS that buildPipelineAccelerationStructures produced, completing the stack and promoting the three InlineRT tests from XFAIL to passing.

Vulkan: createDescriptorPool counts AS descriptors in a separate scalar (the KHR enum value 1000150000 doesn't fit in the indexed array used for the core types) and emits one VkDescriptorPoolSize for them. createDescriptorSets resolves each AS resource via Resource::TLASPtr, locates the matching VulkanAccelerationStructure in InvocationState::AccelStructs (BLASes-then-TLASes layout, matching the helper's documented declaration order), and writes the handle through a VkWriteDescriptorSetAccelerationStructureKHR chained on the descriptor write's pNext. The dispatch's pre-barrier dst access now includes VK_ACCESS_ACCELERATION_STRUCTURE_READ_BIT_KHR so the prior AS-build's writes are made visible to the shader's RayQuery reads. Device creation also enables VK_KHR_ray_query when supported so the RayQuery shader instructions actually function.

DX12: writes a D3D12_SRV_DIMENSION_RAYTRACING_ACCELERATION_STRUCTURE SRV with the AS GPU virtual address as Location into the heap slot that createBuffers reserved (CreateShaderResourceView with a null resource, since the AS data lives in the buffer pointed to by Location).

Metal: the Metal shader converter doesn't bind the AS directly; the shader reads a buffer containing an IRRaytracingAccelerationStructureGPUHeader that holds the AS's gpuResourceID plus a pointer to an instance-contributions array. createBuffers allocates and fills both buffers per AS-descriptor entry, then points the descriptor at the header buffer's GPU address. The TLAS itself is built with the UserID instance-descriptor variant so HLSL CommittedInstanceID() returns the YAML-specified per-instance ID instead of the array index.

The three InlineRT tests now actually exercise the AS end-to-end: TraceRayInline issues a RayQuery against `Scene` and writes a hit-dependent value into `Output` (the instance ID for multi-instance, 1/0 otherwise). The catch-all `XFAIL: *` is dropped; `XFAIL: Clang` remains. The test shaders gain explicit `[[vk::binding]]` annotations since their `t0`/`u0` registers would otherwise collide under the default dxc HLSL→SPIR-V mapping.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
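The "separate scalar" counting described for the Vulkan descriptor pool can be sketched in isolation. This is an illustrative model rather than the PR's createDescriptorPool: `PoolSize` and `countPoolSizes` are hypothetical names, and only the numeric enum values are taken from Vulkan (core descriptor types occupy 0 to 10, while the KHR acceleration-structure type is 1000150000).

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Numeric values mirror the Vulkan descriptor-type enum.
constexpr uint32_t kStorageBuffer = 7;           // VK_DESCRIPTOR_TYPE_STORAGE_BUFFER
constexpr uint32_t kCoreTypeCount = 11;          // core types are 0..10
constexpr uint32_t kAccelStructKHR = 1000150000; // VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_KHR

// Shaped like VkDescriptorPoolSize: a descriptor type and how many to reserve.
struct PoolSize {
  uint32_t type;
  uint32_t count;
};

std::vector<PoolSize> countPoolSizes(const std::vector<uint32_t> &descriptorTypes) {
  // Core types are small and contiguous, so an array indexed by enum value works.
  std::array<uint32_t, kCoreTypeCount> coreCounts{};
  // The KHR value is far outside that range, so it gets its own counter.
  uint32_t accelStructCount = 0;

  for (uint32_t type : descriptorTypes) {
    if (type == kAccelStructKHR) {
      ++accelStructCount;
    } else {
      assert(type < kCoreTypeCount && "unknown descriptor type in this sketch");
      ++coreCounts[type];
    }
  }

  std::vector<PoolSize> sizes;
  for (uint32_t t = 0; t < kCoreTypeCount; ++t)
    if (coreCounts[t])
      sizes.push_back({t, coreCounts[t]});
  if (accelStructCount)
    sizes.push_back({kAccelStructKHR, accelStructCount});
  return sizes;
}
```

For a pipeline with two storage buffers and one TLAS this yields two pool sizes: one for the storage buffers and one for the acceleration structure.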

Introduces a 3-triangle BLAS (tiled along x at x = -4, 0, +4) and a 3-lane dispatch that fires one ray per lane straight down at its own triangle. Each lane's CommittedPrimitiveIndex() must equal its lane index. Also exercises divergent rays per thread. Part of the inline-RT test coverage epic (llvm#1258). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Four small tests stacked on top of llvm#1275, each isolating one shader-observable PSO raytracing surface. They follow the same shape as the inline-RT batch already in llvm#1271 / llvm#1272 / llvm#1274 / llvm#1276: one .test file per behavior, single-purpose shader, exact buffer comparison.

- `dispatch-rays-index.test`: 4x1x1 dispatch, raygen writes `DispatchRaysIndex().x` into `Output[index]`. Confirms the dispatch grid plumbs through to the per-lane system value with no BLAS / TLAS / hit groups in play (RT-pipeline-only, no AS binding).
- `dispatch-rays-dimensions.test`: 2x3x1 dispatch, raygen packs the constant `DispatchRaysDimensions()` into one uint per lane. Confirms every lane sees the host-side `{W, H, D}` even when only one dimension > 1.
- `miss-shader-index.test`: two miss shaders writing distinct sentinels (0xAA / 0xBB). 2-lane dispatch picks `MissShaderIndex` 0 and 1 respectively; rays start far enough from the geometry that every ray misses. Verifies the SBT miss region's per-record routing.
- `ray-contribution-to-hit-group-index.test`: two hit groups with distinct closest-hit shaders (0xA1 / 0xB2). 2-lane dispatch picks `RayContributionToHitGroupIndex` 0 and 1, every ray hits the same triangle. Verifies the SBT hit-group region's per-record routing.

The first two have no AS / Miss / HitGroup in their pipeline at all (just a raygen + a UAV), which exercises the minimum viable RT pipeline shape (one raygen group, zero-sized miss / hit / callable SBT regions). The latter two reuse the single-triangle BLAS/TLAS from `raygen-roundtrip.test`.

All four tests are `# REQUIRES: raytracing-pipeline` with `# XFAIL: Clang`, since Clang (`clang-dxc`) doesn't yet lower `[shader("…")]` entry points to either DXIL libraries or SPIR-V. With the Metal RT bring-up rebased on top, all four pass natively on Apple Silicon and Metal is dropped from the XFAIL list.

Locally verified end-to-end on the user's Linux box:

- Vulkan via the native offloader against an NVIDIA RTX 3060: all four tests PASS.
- D3D12 via Wine + vkd3d-proton + the cross-compiled offloader.exe on the same GPU: all four tests PASS.

And on macOS 15 / metal-irconverter 3.1.1:

- Metal via the native offloader: all four tests PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
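The per-record routing that the two SBT tests verify follows the record-addressing rules from the DirectX Raytracing specification. The sketch below models those rules on the CPU; `TableRegion`, `missRecordAddress` and `hitGroupRecordAddress` are hypothetical names, and the region layout used in the usage note (two 64-byte records) is an assumption, not the tests' actual SBT.

```cpp
#include <cassert>
#include <cstdint>

// Shaped like a GPU address range with a stride: one region of the shader table.
struct TableRegion {
  uint64_t startAddress;
  uint64_t strideInBytes;
  uint64_t sizeInBytes;
};

// Miss record = region start + stride * MissShaderIndex.
uint64_t missRecordAddress(const TableRegion &miss, uint32_t missShaderIndex) {
  uint64_t offset = miss.strideInBytes * missShaderIndex;
  assert(offset < miss.sizeInBytes && "miss index outside the region");
  return miss.startAddress + offset;
}

// Hit-group record = region start + stride *
//   (RayContributionToHitGroupIndex
//    + MultiplierForGeometryContributionToHitGroupIndex * geometry index
//    + InstanceContributionToHitGroupIndex).
uint64_t hitGroupRecordAddress(const TableRegion &hit, uint32_t rayContribution,
                               uint32_t geometryMultiplier,
                               uint32_t geometryIndex,
                               uint32_t instanceContribution) {
  uint64_t index = uint64_t(rayContribution) +
                   uint64_t(geometryMultiplier) * geometryIndex +
                   instanceContribution;
  uint64_t offset = hit.strideInBytes * index;
  assert(offset < hit.sizeInBytes && "hit-group index outside the region");
  return hit.startAddress + offset;
}
```

With a region starting at 0x1000 and a 64-byte stride, index 0 selects the record at 0x1000 and index 1 the record at 0x1040. That is why a lane passing `MissShaderIndex` or `RayContributionToHitGroupIndex` of 1 is expected to observe the second sentinel.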

MarijnS95 and others added 3 commits June 1, 2026 15:16

This was referenced Jun 3, 2026

Add InstanceInclusionMask filtering test #1274

Draft

[EPIC]: Improve inline raytracing test coverage #1258

Open

MarijnS95 force-pushed the inlinert-primitive-index branch from dda00e6 to 9e5141e Compare June 3, 2026 09:25

MarijnS95 mentioned this pull request Jun 3, 2026

Add RAY_FLAG_CULL_BACK_FACING_TRIANGLES test #1276

Draft

3 tasks

MarijnS95 requested review from EmilioLaiso and manon-traverse June 3, 2026 10:01

MarijnS95 mentioned this pull request Jun 3, 2026

Cover DispatchRays index/dimensions + SBT miss/hit-group routing #1277

Draft

4 tasks

MarijnS95 wants to merge 4 commits into llvm:main from Traverse-Research:inlinert-primitive-index
