EGGH: SIGGRAPH/Eurographics Workshop on Graphics Hardware
Browsing EGGH: SIGGRAPH/Eurographics Workshop on Graphics Hardware by Title
Item A 2nd generation autostereoscopic 3-D display (The Eurographics Association, 1992)
Lang, S.R.; Travis, A.R.L.; Castle, O.M.; Moore, l.R.; P F Lister

Item A 33MHz 16-Bit Gradient Calculator for Real-Time Volume Imaging (The Eurographics Association, 1994)
Margala, Martin; Durdle, Nelson G.; Raso, V. James; Hill, Doug L.; W. Strasser
This paper describes a gradient calculator which forms an important part of a shading processor being developed for a high-resolution, high-performance, real-time, general-purpose volume imaging system. The proposed architecture overcomes current image resolution and frame-rate limitations through the use of custom high-speed processors. The gradient calculator evaluates three arithmetic operations: a square and add operation, square-root, and three division operations. Input-output delay time is 30 ns with an accuracy of ±0.78%. The algorithms and implementation in silicon are described in detail.

Item 3D Graphics For Consumer Applications - How Realistic Does it Have to Be? (The Eurographics Association, 1987)
Winser, Paul; Fons Kuijk and Wolfgang Strasser
The design of graphics ICs for the consumer market has performance limitations imposed by the need to maintain low cost, and must be driven by consideration of the potential applications. The likely requirements for a consumer-aimed real-time 3D graphics system are stated in terms of performance and rendering techniques, and a research prototype of a 3D display processor is presented. The processor performs polygon drawing with smooth shading, Z buffer, and texture mapping into standard memory components. Limitations of the system and necessary image quality improvements are discussed.

Item 3D Graphics LSI Core for Mobile Phone "Z3D" (The Eurographics Association, 2003)
Kameyama, Masatoshi; Kato, Yoshiyuki; Fujimoto, Hitoshi; Negishi, Hiroyasu; Kodama, Yukio; Inoue, Yoshitsugu; Kawai, Hiroyuki; M. Doggett and W. Heidrich and W. Mark and A.
Schilling
In this paper we describe the architecture of the 3D graphics LSI core for mobile phone "Z3D". The major 3D graphics applications on mobile phones are character animation and games. While a character animation or a game is running, the CPU has to be used for communication with the center machine, and the CPU clock frequency is low. Therefore, Z3D is required to be small, low-power, and CPU-free. The pipeline of Z3D is composed of a geometry engine, a rendering engine, and a pixel engine. On a PC these modules generally run as a pipeline, but pipelined operation raises power consumption. We used gated clock control for low power consumption, and by having the pipeline stages work sequentially we reduced power consumption by 40% at the cost of a 10% decrease in performance. The geometry processing performance of Z3D is up to 185 Kvertex/sec and the pixel performance is 5 Mpixel/sec. This performance is sufficient to run character animations and games. 3D shape and animation data can be defined using a common 3D modeler, and content programs can be written in Java. A low-level rendering interface and an animation rendering interface are provided as a Java API. All content programs can be downloaded from the network.

Item Accelerated Single Ray Tracing for Wide Vector Units (ACM, 2017)
Fuetterling, Valentin; Lojewski, Carsten; Pfreundt, Franz-Josef; Hamann, Bernd; Ebert, Achim; Vlastimil Havran and Karthik Vaiyanathan
Utilizing the vector units of current processors for ray tracing single rays through Bounding Volume Hierarchies has been accomplished by increasing the branching factor of the acceleration structure to match the vector width. A high branching factor allows vectorized bounding box tests but requires a complex control flow for the calculation of a front-to-back traversal order.
We propose a novel algorithm for single rays entirely based on vector operations that performs a complete traversal iteration in constant time, ideally suited for current and future microarchitectures featuring wide vector units. In addition, we use our single ray technique as a building block to construct a fast packet traversal for coherent rays. We validate our algorithms with implementations utilizing the AVX2 and AVX-512 instruction sets and demonstrate significant performance gains over state-of-the-art solutions.

Item Accelerating Polygon Clipping (The Eurographics Association, 1992)
Schneider, Bengt-Olaf; P F Lister
Polygon clipping is a central part of image generation and image visualization systems. In spite of its algorithmic simplicity it consumes a considerable amount of hardware or software resources. Polygon clipping performance is dominated by two processes: intersection calculations and data transfers. The paper analyzes the prevalent Sutherland-Hodgman algorithm for polygon clipping and identifies cases for which this algorithm performs inefficiently. Such cases are characterized by subsequent vertices in the input polygon that share a common region, e.g. a common halfspace. The paper will present new techniques that detect such constellations and simplify the input polygon such that the Sutherland-Hodgman algorithm runs more efficiently. Block diagrams and pseudo-code demonstrate that the new techniques are well suited for both hardware and software implementations. Finally, the paper discusses the results of a prototype implementation of the presented techniques. The analysis compares the performance of the new techniques to the traditional Sutherland-Hodgman algorithm for different test scenes.
The new techniques reduce the number of data transfers by up to 90% and the number of intersection calculations by up to 60%.

Item Accelerating Real-Time Shading with Reverse Reprojection Caching (The Eurographics Association, 2007)
Nehab, Diego; Sander, Pedro V.; Lawrence, Jason; Tatarchuk, Natalya; Isidoro, John R.; Mark Segal and Timo Aila
Evaluating pixel shaders consumes a growing share of the computational budget for real-time applications. However, the significant temporal coherence in visible surface regions, lighting conditions, and camera location allows reusing computationally-intensive shading calculations between frames to achieve significant performance improvements at little degradation in visual quality. This paper investigates a caching scheme based on reverse reprojection which allows pixel shaders to store and reuse calculations performed at visible surface points. We provide guidelines to help programmers select appropriate values to cache and present several policies for keeping cached entries up-to-date. Our results confirm this approach offers substantial performance gains for many common real-time effects, including precomputed global lighting effects, stereoscopic rendering, motion blur, depth of field, and shadow mapping.

Item Accelerating Shadow Rays Using Volumetric Occluders and Modified kd-Tree Traversal (The Eurographics Association, 2009)
Djeu, Peter; Keely, Sean; Hunt, Warren; David Luebke and Philipp Slusallek
Monte Carlo ray tracing remains a simple and elegant method for generating robust shadows. This approach, however, is often hampered by the time needed to evaluate the numerous shadow ray queries required to generate a high-quality image. We propose the use of volumetric occluders stored within a kd-tree in order to accelerate shadow rays cast on a closed, watertight mesh.
Intersection with a volumetric occluder is much cheaper than intersection with mesh geometry, although performing these intersections requires modification to the traversal order through the kd-tree. We propose two such modifications, both of which enable the use of volumetric occluders for cheap shadow ray termination. We also propose using a software-managed cache to store and reuse volumetric occluders for even earlier termination. Our approach provides a performance improvement of up to 2.0x for our test scenes while producing images identical to those produced by the unaccelerated baseline.

Item Accommodating Memory Latency In A Low-cost Rasterizer (The Eurographics Association, 1997)
Anderson, Bruce; MacAulay, Rob; Stewart, Andy; Whitted, Turner; A. Kaufmann and W. Strasser and S. Molnar and B.-O. Schneider
This paper describes design tradeoffs in a very low cost rasterizer circuit targeted for use in a video game console. The greatest single factor affecting such a design is the character of memory to which the image generator is connected. Low costs generally constrain the memory dedicated to image generation to be a single package with a single set of address and data lines. While overall memory bandwidth determines the upper limit of performance in such a small image generator, memory latency has a far greater effect on the design. The use of Rambus memory provides more than enough aggregate bandwidth for a frame buffer as long as blocks of pixels are moved in each transfer, but its high latency can stall any processor not matched to the memory. The design described here utilizes a long pixel pipeline to match its internal processing latency to the external frame buffer memory latency.

Item Accurate Scanconversion of Triangulated Surfaces (The Eurographics Association, 1991)
Rossignac, Jarek R.; A. Kaufman
Scanconverting a planar face produces depth-values for pixels totally or partly covered by the projection of that face.
State-of-the-art hardware-supported scanconversion techniques use subpixel adjustment and extended precision calculations to achieve an acceptable depth-accuracy despite numeric round-off errors. Unfortunately, this depth-accuracy only holds for the interior pixels of the face. During the scanconversion of the boundaries of polyhedral solids or of the tessellations of curved surfaces, significantly larger depth-errors may occur at pixels traversed by the projection of the bounding edges. These errors are due to the use of the wrong surface equations resulting from an erroneous classification of pixels with respect to the projections of faces. They may lead to logical mistakes with serious consequences for hidden-surface removal and for solid-modeling applications. To address this problem, a new scanconversion technique is presented, which exploits surface data and face/face adjacency information to infer face-projections. For simplicity, the exposition is confined to triangular faces of manifolds, where each edge is adjacent to two triangles. At pixels covered by the projection of an edge, the surface depth computed in the standard manner is compared to the depth of the surface supporting the adjacent triangle. Pixel classification is obtained by taking into account the result of this comparison and the orientations of both faces.

Item Active Thread Compaction for GPU Path Tracing (ACM, 2011)
Wald, Ingo; Carsten Dachsbacher and William Mark and Jacopo Pantaleoni
Modern GPUs like NVIDIA's Fermi internally operate in a SIMD manner by ganging multiple (32) scalar threads together into SIMD warps; if a warp's threads diverge, the warp serially executes both branches, temporarily disabling threads that are not on that path. In this paper, we explore and thoroughly analyze the concept of active thread compaction, i.e., the process of taking multiple partially-filled warps and compacting them to fewer but fully utilized warps, in the context of a CUDA path tracer.
Our results show that this technique can indeed lead to significant improvements in SIMD utilization, and corresponding savings in the amount of work performed; however, they also show that certain inadequacies of today's hardware wipe out most of the achieved gains, leaving bottom-up speed-ups of a mere 12-16%. We believe our analysis of why this is the case will provide insight to other researchers experimenting with this technique in different contexts.

Item An Adaptive Acceleration Structure for Screen-space Ray Tracing (ACM Siggraph, 2015)
Widmer, S.; Pajak, D.; Schulz, A.; Pulli, K.; Kautz, J.; Goesele, M.; Luebke, D.; Petrik Clarberg and Elmar Eisemann
We propose an efficient acceleration structure for real-time screen-space ray tracing. The hybrid data structure represents the scene geometry by combining a bounding volume hierarchy with local planar approximations. This enables fast empty space skipping while tracing and yields exact intersection points for the planar approximation. In combination with an occlusion-aware ray traversal, our algorithm is capable of quickly tracing even multiple depth layers. Compared to prior work, our technique improves the accuracy of the results, is more general, and allows for advanced image transformations, as all pixels can cast rays to arbitrary directions. We demonstrate real-time performance for several applications, including depth-of-field rendering, stereo warping, and screen-space ray traced reflections.

Item Adaptive Hierarchical Visibility in a Tiled Architecture (The Eurographics Association, 1999)
Xie, Feng; Shantz, Michael; A. Kaufmann and W. Strasser and S. Molnar and B.-O. Schneider
This paper describes a method for occlusion culling in a tiled 3D graphics hardware architecture. Adaptive hierarchical visibility (AHV) is a simplified method for occlusion culling that is integrated into a tiled architecture for hardware rendering.
AHV constructs a list of polygon bins for each tile where the bins are bucket sorted in order of increasing depth or Z. Polygon bins are rendered starting with the bin closest to the viewer. After some number of bins are rendered, a one-layer hierarchical Z-buffer (HZ) is constructed from the Z-buffer thus far accumulated for the rendered bins. Subsequent bins are rendered by first testing their polygons against the HZ to see if they are hidden. AHV is far simpler to implement in hardware and gives performance that matches or surpasses progressive hierarchical visibility (PHV) methods which update the HZ for each rendered pixel. Results show that AHV is superior on scenes with high depth complexity and small polygons. For tiles of widely ranging statistics, AHV competes surprisingly well with PHV. It offers dramatic performance improvement on low cost hardware for scenes of high depth complexity.

Item Adaptive Image Space Shading for Motion and Defocus Blur (The Eurographics Association, 2012)
Vaidyanathan, Karthik; Toth, Robert; Salvi, Marco; Boulos, Solomon; Lefohn, Aaron; Carsten Dachsbacher and Jacob Munkberg and Jacopo Pantaleoni
We present a novel anisotropic sampling algorithm for image space shading which builds upon recent advancements in decoupled sampling for stochastic rasterization pipelines. First, we analyze the frequency content of a pixel in the presence of motion and defocus blur. We use this analysis to derive bounds for the spectrum of a surface defined over a two-dimensional and motion-aligned shading space.
Second, we present a simple algorithm that uses the new frequency bounds to reduce the number of shaded quads and the size of the decoupling cache by 2X and 16X respectively, while largely preserving image detail and minimizing additional aliasing.

Item Adaptive Sampling for On-The-Fly Ray Casting of Particle-based Fluids (The Eurographics Association, 2016)
Hochstetter, Hendrik; Orthmann, Jens; Kolb, Andreas; Ulf Assarsson and Warren Hunt
We present a fast and accurate ray casting technique for unstructured and dynamic particle sets. Our technique focuses on efficient, high quality volume rendering of fluids for computer animation and scientific applications. Our novel adaptive sampling scheme allows sampling rates to be adjusted locally, both along rays and in the lateral direction, and is driven by a user-controlled screen space error tolerance. In order to determine appropriate local sampling rates, we propose a sampling error analysis framework based on hierarchical interval arithmetic. We show that our approach leads to significant rendering speed-ups with controllable screen space errors. Efficient particle access is achieved using a sparse view-aligned grid which is constructed on-the-fly without any pre-processing.

Item Adaptive Scalable Texture Compression (The Eurographics Association, 2012)
Nystad, Jorn; Lassen, Anders; Pomianowski, Andy; Ellis, Sean; Olson, Tom; Carsten Dachsbacher and Jacob Munkberg and Jacopo Pantaleoni
We describe a fixed-rate, lossy texture compression system that is designed to offer an unusual degree of flexibility and to support a very wide range of use cases, while providing better image quality than most formats in common use today. The system supports both 2D and 3D textures, at both standard and high dynamic range, at bit rates ranging from eight bits per pixel down to less than one bit per pixel in very fine steps. At any bit rate, texels can have from one to four color components.
The system's flexibility results from a number of novel features. Color spaces and weights are represented using an encoding scheme that allows flexible allocation of bits between different types of information. The system uses bilinear interpolation to derive color space coordinates for a texel from sparse samples, and uses a procedural partition function to map texels to color spaces.

Item Adaptive Temporal Antialiasing (ACM, 2018)
Marrs, Adam; Spjut, Josef; Gruen, Holger; Sathe, Rahul; McGuire, Morgan; Patney, Anjul and Niessner, Matthias
We introduce a pragmatic algorithm for real-time adaptive supersampling in games. It extends temporal antialiasing of rasterized images with adaptive ray tracing, and conforms to the constraints of a commercial game engine and today's GPU ray tracing APIs. The algorithm removes blurring and ghosting artifacts associated with standard temporal antialiasing and achieves quality approaching 8× supersampling of geometry, shading, and materials while staying within the 33 ms frame budget required of most games.

Item Adaptive Texture Maps (The Eurographics Association, 2002)
Kraus, Martin; Ertl, Thomas; Thomas Ertl and Wolfgang Heidrich and Michael Doggett
We introduce several new variants of hardware-based adaptive texture maps and present applications in two, three, and four dimensions. In particular, we discuss representations of images and volumes with locally adaptive resolution, lossless compression of light fields, and vector quantization of volume data.
All corresponding texture decoders were successfully integrated into the programmable texturing pipeline of commercial off-the-shelf graphics hardware.

Item Adaptive Transparency (ACM, 2011)
Salvi, Marco; Montgomery, Jefferson; Lefohn, Aaron; Carsten Dachsbacher and William Mark and Jacopo Pantaleoni
Adaptive transparency is a new solution to order-independent transparency that closely approximates the ground-truth results obtained with A-buffer compositing but, like a Z-buffer, operates in bounded memory and exhibits consistent performance. The key contribution of our method is an adaptively compressed visibility representation that can be efficiently constructed and queried while rendering. The algorithm supports a wide range and combination of transparent geometry (e.g., foliage, windows, hair, and smoke). We demonstrate that adaptive transparency is five to forty times faster than real-time A-buffer implementations, closely matches the image quality, and is both higher quality and faster than other approximate order-independent transparency techniques: stochastic transparency, uniform opacity shadow maps, and Fourier opacity mapping.

Item Adaptive View Dependent Tessellation of Displacement Maps (The Eurographics Association, 2000)
Doggett, Michael; Hirche, Johannes; I. Buck and G. Humphreys and P. Hanrahan
Displacement Mapping is an effective technique for encoding the high levels of detail found in today's triangle based surface models. Extending the hardware rendering pipeline to be capable of handling displacement maps as geometric primitives will allow highly detailed models to be constructed without requiring large numbers of triangles to be passed from the CPU to the graphics pipeline. We present a new approach based on recursive tessellation that adapts to the surface complexity described by the displacement map. We also ensure that the resolution of the displaced mesh is tessellated with respect to the current view point.
Our tessellation scheme performs all tests only on triangle edges to avoid generating cracks on the displaced surface. The main decision for vertex insertion is based on two comparisons involving the average height surrounding the vertices and the normals at the vertices. Individually, the tests will fail to tessellate a mesh satisfactorily, but their combination achieves good results. We propose several additions to the typical hardware rendering pipeline in order to achieve displacement map rendering in hardware. The mesh tessellation is placed within the rendering pipeline so that we can take advantage of the pre-existing vertex transformation units to perform the setup calculations for our view dependent test. Our method adds only simple arithmetic and comparison operations to the graphics pipeline and makes use of existing units for calculations wherever possible.
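
As a rough illustration of the last abstract's idea of deciding vertex insertion per edge from two combined comparisons, the following Python sketch may help. It is not the authors' method: the function name `should_split_edge`, both thresholds, and the exact form of the height and normal comparisons are assumptions made for this example, since the abstract only states that one test involves average heights around the vertices and the other the vertex normals.

```python
import math


def _normalize(v):
    """Return v scaled to unit length (assumes v is a non-zero 3-vector)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def should_split_edge(avg_height_a, avg_height_b, normal_a, normal_b,
                      height_tol=0.05, angle_tol_deg=15.0):
    """Hypothetical per-edge test: insert a vertex if either comparison fires.

    avg_height_a / avg_height_b: average displacement-map height sampled
        around each endpoint of the edge (assumed to be precomputed).
    normal_a / normal_b: surface normals at the two endpoints.
    height_tol, angle_tol_deg: illustrative thresholds, not from the paper.
    """
    # Comparison 1: do the average heights around the endpoints differ much?
    height_test = abs(avg_height_a - avg_height_b) > height_tol

    # Comparison 2: do the endpoint normals deviate beyond the angle tolerance?
    na, nb = _normalize(normal_a), _normalize(normal_b)
    cos_angle = sum(x * y for x, y in zip(na, nb))
    normal_test = cos_angle < math.cos(math.radians(angle_tol_deg))

    # Combining the two tests; either one alone misses some cases.
    return height_test or normal_test
```

Because the decision uses only data attached to the edge's two endpoints and is symmetric in them, two triangles sharing an edge reach the same result for that edge, which is the property the abstract relies on to avoid cracks.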