High-Performance Graphics 2021
High Performance Graphics 2021 CGF 40-8: Frontmatter
Binder, Nikolaus; Ritschel, Tobias
High-Performance Rendering
Hardware Adaptive High-Order Interpolation for Real-Time Graphics
Lin, Daqi; Seiler, Larry; Yuksel, Cem
Rendering
ReSTIR GI: Path Resampling for Real-Time Path Tracing
Ouyang, Yaobin; Liu, Shiqiu; Kettunen, Markus; Pharr, Matt; Pantaleoni, Jacopo
Geometry and Optimization
Sampling from Quadric-Based CSG Surfaces
Trettner, Philip; Kobbelt, Leif
Geometry and Optimization
Cooperative Profile Guided Optimizations
Stephenson, Mark; Rangan, Ram; Keckler, Stephen W.
Geometry and Optimization
A Halfedge Refinement Rule for Parallel Catmull-Clark Subdivision
Dupuy, Jonathan; Vanhoey, Kenneth
Recent Submissions
High Performance Graphics 2021 CGF 40-8: Frontmatter
(The Eurographics Association and John Wiley & Sons Ltd., 2021) Binder, Nikolaus; Ritschel, Tobias; Binder, Nikolaus and Ritschel, Tobias (eds.)

Hardware Adaptive High-Order Interpolation for Real-Time Graphics
(The Eurographics Association and John Wiley & Sons Ltd., 2021) Lin, Daqi; Seiler, Larry; Yuksel, Cem; Binder, Nikolaus and Ritschel, Tobias (eds.)
Interpolation is a core operation that has widespread use in computer graphics. Though higher-order interpolation provides better quality, linear interpolation is often preferred due to its simplicity, performance, and hardware support. We present a unified refactoring of quadratic and cubic interpolations as standard linear interpolation plus linear interpolations of higher-order terms and show how they can be applied to regular grids and (triangular/tetrahedral) simplexes. Our formulations can significantly reduce computation cost compared to typical higher-order interpolations and to prior approaches that utilize existing hardware linear interpolation support to achieve higher-order interpolation. In addition, our formulation allows approximating the results by dynamically skipping some higher-order terms with low weights for further savings in both computation and storage. Thus, higher-order interpolation can be performed adaptively, as needed. We also describe how relatively minor modifications to existing GPU hardware could provide hardware support for quadratic and cubic interpolations using our approach for both texture filtering operations and barycentric interpolation. We present a variety of examples using triangular, rectangular, tetrahedral, and cuboidal interpolations, showing the effectiveness of our higher-order interpolations in different applications.

ReSTIR GI: Path Resampling for Real-Time Path Tracing
(The Eurographics Association and John Wiley & Sons Ltd., 2021) Ouyang, Yaobin; Liu, Shiqiu; Kettunen, Markus; Pharr, Matt; Pantaleoni, Jacopo; Binder, Nikolaus and Ritschel, Tobias (eds.)
Even with the advent of hardware-accelerated ray tracing in modern GPUs, only a small number of rays can be traced at each pixel in real-time applications. This presents a significant challenge for path tracing, even when augmented with state-of-the-art denoising algorithms. While the recently developed ReSTIR algorithm [BWP*20] enables high-quality renderings of scenes with millions of light sources using just a few shadow rays at each pixel, there remains a need for effective algorithms to sample indirect illumination. We introduce an effective path sampling algorithm for indirect lighting that is suitable for highly parallel GPU architectures. Building on the screen-space spatio-temporal resampling principles of ReSTIR, our approach resamples multi-bounce indirect lighting paths obtained by path tracing. Doing so allows sharing information about important paths that contribute to lighting both across time and across pixels in the image. The resulting algorithm achieves a substantial error reduction compared to path tracing: at a single sample per pixel every frame, our algorithm achieves MSE improvements ranging from 9.3x to 166x in our test scenes. In conjunction with a denoiser, it leads to high-quality path traced global illumination at real-time frame rates on modern GPUs.

BRDF Importance Sampling for Linear Lights
(The Eurographics Association and John Wiley & Sons Ltd., 2021) Peters, Christoph; Binder, Nikolaus and Ritschel, Tobias (eds.)
We introduce an efficient method to sample linear lights, i.e., infinitesimally thin cylinders, proportional to projected solid angle. Our method uses inverse function sampling with a specialized iterative procedure that converges to high accuracy in only two iterations. It also allows us to sample proportional to a linearly transformed cosine. By combining both sampling techniques through suitable multiple importance sampling heuristics and by using good stratification, we achieve unbiased diffuse and specular real-time shading with low variance outside penumbrae at two samples per pixel. Additionally, we provide a fast method for solid angle sampling.

Sampling from Quadric-Based CSG Surfaces
(The Eurographics Association and John Wiley & Sons Ltd., 2021) Trettner, Philip; Kobbelt, Leif; Binder, Nikolaus and Ritschel, Tobias (eds.)
We present an efficient method to create samples directly on surfaces defined by constructive solid geometry (CSG) trees or graphs. The generated samples can be used for visualization or as an approximation to the actual surface with strong guarantees. We chose to use quadric surfaces as CSG primitives as they can model classical primitives such as planes, cubes, spheres, cylinders, and ellipsoids, but also certain saddle surfaces. More importantly, they are closed under affine transformations, a desirable property for a modeling system. We also propose a rendering method that performs local quadric ray-tracing and clipping to achieve pixel-perfect accuracy and hole-free rendering.

Cooperative Profile Guided Optimizations
(The Eurographics Association and John Wiley & Sons Ltd., 2021) Stephenson, Mark; Rangan, Ram; Keckler, Stephen W.; Binder, Nikolaus and Ritschel, Tobias (eds.)
Existing feedback-driven optimization frameworks are not suitable for video games, which tend to push the limits of performance of gaming platforms and have real-time constraints that preclude all but the simplest execution profiling. While Profile Guided Optimization (PGO) is a well-established optimization approach, existing PGO techniques are ill-suited for games for a number of reasons, particularly because heavyweight profiling makes interactive applications unresponsive. Adaptive optimization frameworks continually collect metrics that guide code specialization optimizations during program execution but have similarly high overheads. We emulate a system, which we call Cooperative PGO, in which the gaming platform collects piecemeal profiles by sampling in both time and space during actual gameplay across many users; stitches the piecemeal profiles together statistically; and creates policies to guide future gameplay. We introduce a three-level hierarchical profiler that is well-suited to graphics APIs, that commonly operates with no overhead and occasionally introduces an average overhead of less than 0.5% during periods of active profiling. This paper examines the practicality of Cooperative PGO using three PGOs as case studies. A PGO that exploits likely zeros is particularly effective, achieving an average speedup of 5%, with a maximum speedup of 15%, over a highly-tuned baseline.

A Halfedge Refinement Rule for Parallel Catmull-Clark Subdivision
(The Eurographics Association and John Wiley & Sons Ltd., 2021) Dupuy, Jonathan; Vanhoey, Kenneth; Binder, Nikolaus and Ritschel, Tobias (eds.)
We show that Catmull-Clark subdivision induces an invariant one-to-four refinement rule for halfedges that reduces to simple algebraic expressions. This has two important consequences. First, it allows the halfedges of the input mesh, which completely describe its topology, to be refined concurrently in breadth-first order. Second, it makes the computation of the vertex points straightforward as the halfedges provide all the information that is needed. We leverage these results to derive a novel parallel implementation of Catmull-Clark subdivision suitable for the GPU. Our implementation supports non-quad faces, extraordinary vertices, boundaries and semi-sharp creases seamlessly. Moreover, we show that its speed scales linearly with the number of processors and that it yields state-of-the-art performance on modern GPUs.