Quick Clusters: A GPU-Parallel Partitioning for Efficient Path Tracing of Unstructured Volumetric Grids

IEEE Trans Vis Comput Graph. 2023 Jan;29(1):537-547. doi: 10.1109/TVCG.2022.3209418. Epub 2022 Dec 16.

Abstract

We propose a simple yet effective method for clustering finite elements to improve preprocessing times and rendering performance of unstructured volumetric grids without requiring auxiliary connectivity data. Rather than building bounding volume hierarchies (BVHs) over individual elements, we sort elements along with a Hilbert curve and aggregate neighboring elements together, improving BVH memory consumption by over an order of magnitude. Then to further reduce memory consumption, we cluster the mesh on the fly into sub-meshes with smaller indices using a series of efficient parallel mesh re-indexing operations. These clusters are then passed to a highly optimized ray tracing API for point containment queries and ray-cluster intersection testing. Each cluster is assigned a maximum extinction value for adaptive sampling, which we rasterize into non-overlapping view-aligned bins allocated along the ray. These maximum extinction bins are then used to guide the placement of samples along the ray during visualization, reducing the number of samples required by multiple orders of magnitude (depending on the dataset), thereby improving overall visualization interactivity. Using our approach, we improve rendering performance over a competitive baseline on the NASA Mars Lander dataset from 6× (1 frame per second (fps) and 1.0 M rays per second (rps) up to now 6 fps and 12.4 M rps, now including volumetric shadows) while simultaneously reducing memory consumption by 3×(33 GB down to 11 GB) and avoiding any offline preprocessing steps, enabling high-quality interactive visualization on consumer graphics cards. Then by utilizing the full 48 GB of an RTX 8000, we improve the performance of Lander by 17 × (1 fps up to 17 fps, 1.0 M rps up to 35.6 M rps).