EGGH98: SIGGRAPH/Eurographics Workshop on Graphics Hardware 1998

IMEM: An Intelligent Memory for Bump- and Reflection-Mapping (The Eurographics Association, 1998) Kugler, Anders; S. N. Spencer
Data path simplification in the context of reflection- and bump-mapping hardware opens new solutions in the design of rendering and shading circuits. We are proposing a novel approach to rendering bump- and reflection-mapped surfaces, where the local geometry defining bump-maps is transformed on-the-fly prior to surface shading. Applying angular encoding to normal vectors results in narrower data paths and permits hardware integration of look-up tables of acceptable size. A special-purpose logic-embedded memory architecture is presented, where bump- and reflection-mapping of textured surfaces are executed by an intelligent memory device. High-performance surface shading is achieved by making use of precomputed shading- and reflection-map coordinate generation tables, and considering cache coherence of pixel-to-pixel normal vectors. Such a dedicated memory chip can easily be interfaced to a standard rasterizer, in place of texture memory, to offer bump-, texture- and reflection-mapping hardware support.

Performance Issues of a Distributed Frame Buffer on a Multicomputer (The Eurographics Association, 1998) Wei, Bin; Clark, Douglas W.; Felten, Edward W.; Li, Kai; S. N. Spencer
A multiple-port, distributed frame buffer has been recently proposed to support parallel rendering on multicomputers. This paper describes an implementation of such a distributed frame buffer for the Intel Paragon routing network, and reports its performance results. We have conducted several experiments with the system we have developed. Our results indicate that placing a multiple-port, distributed frame buffer directly on the host internal routing network can provide high throughput to eliminate the bottleneck of merging a final image from multiple processors to a frame buffer.
This architectural approach can also effectively support image composition for sort-last. The synchronization algorithm we have developed requires only one-way communication and minimizes receive overhead for message passing to the frame buffer.

Unsolved Problems and Opportunities for High-quality, High-performance 3D Graphics on a PC Platform (The Eurographics Association, 1998) Kirk, David B.; S. N. Spencer
In the late 1990s, graphics hardware is experiencing a dramatic board-to-chip integration reminiscent of the minicomputer-to-microprocessor revolution of the 1980s. Today, mass-market PCs are beginning to match the 3D polygon and pixel rendering of a 1992 Silicon Graphics Reality Engine™ system. The extreme pace of technology evolution in the PC market is such that within 1 or 2 years the performance of a mainstream PC will be very close to the highest performance 3D workstations. At that time, the quality and performance demands will dictate serious changes in PC architecture as well as changes in rendering pipeline and algorithms. This paper will discuss several potential areas of change.

Neon: A Single-Chip 3D Workstation Graphics Accelerator (The Eurographics Association, 1998) McCormack, Joel; McNamara, Robert; Gianos, Christopher; Seiler, Larry; Jouppi, Norman P.; Correll, Ken; S. N. Spencer
High-performance 3D graphics accelerators traditionally require multiple chips on multiple boards, including geometry, rasterizing, pixel processing, and texture mapping chips. These designs are often scalable: they can increase performance by using more chips. Scalability has obvious costs: a minimal configuration needs several chips, and some configurations must replicate texture maps. A less obvious cost is the almost irresistible temptation to replicate chips to increase performance, rather than to design individual chips for higher performance in the first place. In contrast, Neon is a single chip that performs like a multichip design.
Neon accelerates OpenGL [19] 3D rendering, as well as X11 [20] and Windows/NT 2D rendering. Since our pin budget limited peak memory bandwidth, we designed Neon from the memory system upward in order to reduce bandwidth requirements. Neon has no special-purpose memories; its eight independent 32-bit memory controllers can access color buffers, depth buffers, stencil buffers, and texture data. To fit our gate budget, we shared logic among different operations with similar implementation requirements, and left floating point calculations to Digital's Alpha CPUs. Neon's performance is between HP's Visualize fx⁴ and fx⁶, and is well above SGI's MXE for most operations. Neon-based boards cost much less than these competitors, due to a small part count and use of commodity SDRAMs.

PAVLOV: A Programmable Architecture for Volume Processing (The Eurographics Association, 1998) Kreeger, Kevin; Kaufman, Arie; S. N. Spencer
We present a parallel 2D mesh connected architecture with SIMD processing elements. The design allows for real-time volume rendering as well as interactive 3D segmentation and 1D feature extraction. This is possible because the SIMD processing elements are programmable, a feature which also allows the use of many different rendering algorithms. We present an algorithm which, with the addition of hardware resources, provides conflict-free access to volume slices along any of the three major axes. The volume access conflict has been the main reason why previous similar architectures could not perform real-time volume rendering. We present the performance of preliminary algorithms on a software simulator of the architecture design.

Prefetching in a Texture Cache Architecture (The Eurographics Association, 1998) Igehy, Homan; Eldridge, Matthew; Proudfoot, Kekoa; S. N.
Spencer
Texture mapping has become so ubiquitous in real-time graphics hardware that many systems are able to perform filtered texturing without any penalty in fill rate. The computation rates available in hardware have been outpacing the memory access rates, and texture systems are becoming constrained by memory bandwidth and latency. Caching in conjunction with prefetching can be used to alleviate this problem. In this paper, we introduce a prefetching texture cache architecture designed to take advantage of the access characteristics of texture mapping. The structures needed are relatively simple and are amenable to high clock rates. To quantify the robustness of our architecture, we identify a set of six scenes whose texture locality varies over nearly two orders of magnitude and a set of four memory systems with varying bandwidths and latencies. Through the use of a cycle-accurate simulation, we demonstrate that even in the presence of a high-latency memory system, our architecture can attain at least 97% of the performance of a zero-latency memory system.

An Improved Z-Buffer CSG Rendering Algorithm (The Eurographics Association, 1998) Stewart, Nigel; Leach, Geoff; John, Sabu; S. N. Spencer
We present an improved z-buffer based CSG rendering algorithm, based on previous techniques using z-buffer parity based surface clipping. We show that while this type of algorithm has been reported as requiring O(n²) (where n is the number of primitives), an O(kn) (where k is depth complexity) algorithm may be substituted. For cases where k is less than n this translates into a significant performance gain.

Quadratic Bezier Triangles As Drawing Primitives (The Eurographics Association, 1998) Bruijns, J.; S. N. Spencer
We propose to use quadratic Bezier triangles as additional drawing primitives: quadratic Bezier triangles require much less model data for faithful representation of curved surfaces than planar triangles.
Therefore, they require less storage and/or transmission capacity. Furthermore, they allow automatic level-of-detail. Finally, they result in considerable savings in model-view transformations and lighting calculations. We present two algorithms for rendering these triangles, each of which can be easily incorporated in hardware render systems currently used for planar triangles.

Simple Models of the Impact of Overlap in Bucket Rendering (The Eurographics Association, 1998) Chen, Milton; Stoll, Gordon; Igehy, Homan; Proudfoot, Kekoa; Hanrahan, Pat; S. N. Spencer
Bucket rendering is a technique in which the framebuffer is subdivided into coherent regions that are rendered independently. The primary benefits of this technique are the decrease in the size of the working set of framebuffer memory required during rendering and the possibility of processing multiple regions in parallel. The drawbacks of this technique are the cost of computing the regions overlapped by each triangle and the redundant work required in processing triangles multiple times when they overlap multiple regions. Tile size is a critical parameter in bucket rendering systems: smaller tile sizes allow smaller memory footprints and better parallel load balancing but exacerbate the problem of redundant computation. In this paper, we use mathematical models, instrumentation, and trace-driven simulation to evaluate the impact of overlap and conclude that the problem of overlap is limited in scope. If triangles are small, the overlap factor itself is also small. If triangles are large, overlap is high but pixel work dominates the rendering time. In pipelined rendering systems, the worst-case impact of overlap occurs when the area of an input triangle is equal to the area for which the pipeline is balanced; that is, the triangle-related computation time is equal to the pixel-related computation time.
Thus, as the current trends of exponentially increasing triangle rate, slowly increasing screen resolution, and increasing per-pixel computation continue to push this balance point toward triangles with smaller area, bucket rendering systems will be able to utilize smaller tiles efficiently.

Extending Graphics Hardware For Occlusion Queries In OpenGL (The Eurographics Association, 1998) Bartz, Dirk; Meißner, Michael; Hüttner, Tobias; S. N. Spencer
For interactive rendering of large polygonal objects, fast visibility queries are necessary to quickly decide whether polygonal objects are visible and need to be rendered. None of the numerous published algorithms provide visibility performance for interactive rendering of large models. In this paper, we propose an OpenGL extension for fast occlusion queries. Added after the depth test stage of the OpenGL rendering pipeline, our algorithm provides fast queries to establish the occlusion of polygonal objects. Furthermore, hardware aspects of this proposal are discussed and possible implementations on two different graphics architectures are presented.

High-Quality Volume Rendering Using Texture Mapping Hardware (The Eurographics Association, 1998) Dachille, Frank; Kreeger, Kevin; Chen, Baoquan; Bitter, Ingmar; Kaufman, Arie; S. N. Spencer
We present a method for volume rendering of regular grids which takes advantage of 3D texture mapping hardware currently available on graphics workstations. Our method produces accurate shading for arbitrary and dynamically changing directional lights, viewing parameters, and transfer functions. This is achieved by hardware interpolating the data values and gradients before software classification and shading. The method works equally well for parallel and perspective projections.
We present two approaches for our method: one which takes advantage of software ray casting optimizations and another which takes advantage of hardware blending acceleration.

View-independent Environment Maps (The Eurographics Association, 1998) Heidrich, Wolfgang; Seidel, Hans-Peter; S. N. Spencer
Environment maps are widely used for approximating reflections in hardware-accelerated rendering applications. Unfortunately, the parameterizations for environment maps used in today's graphics hardware severely undersample certain directions, and thus cannot be used from multiple viewing directions. Other parameterizations exist, but require operations that would be too expensive for hardware implementations. In this paper we introduce an inexpensive new parameterization for environment maps that allows us to reuse the environment map for any given viewing direction. We describe how, under certain restrictions, these maps can be used today in standard OpenGL implementations. Furthermore, we explore how OpenGL could be extended to support this kind of environment map more directly.

Texture Tile Visibility Determination For Dynamic Texture Loading (The Eurographics Association, 1998) Goss, Michael E.; Yuasa, Kei; S. N. Spencer
Three-dimensional scenes have become an important form of content deliverable through the Internet. Standard formats such as Virtual Reality Modeling Language (VRML) make it possible to dynamically download complex scenes from a server directly to a web browser. However, limited bandwidth between servers and clients presents an obstacle to the availability of more complex scenes, since geometry and texture maps for a reasonably complex scene may take many minutes to transfer over a typical telephone modem link. This paper addresses one part of the bandwidth bottleneck, texture transmission. Current display methods transmit an entire texture to the client before it can be used for rendering.
We present an alternative method which subdivides each texture into tiles, and dynamically determines on the client which tiles are visible to the user. Texture tiles are requested by the client in an order determined by the number of screen pixels affected by the texture tile, so that texture tiles which affect the greatest number of screen pixels are transmitted first. The client can render images during texture loading using tiles which have already been loaded. The tile visibility calculations take full account of occlusion and multiple texture image resolution levels, and are dynamically recalculated each time a new frame is rendered. We show how a few additions to the standard graphics hardware pipeline can add this capability without radical architecture changes, and with only moderate hardware cost. The addition of this capability makes it practical to use large textures even over relatively slow network connections.

A Breadth-First Approach To Efficient Mesh Traversal (The Eurographics Association, 1998) Mitra, Tulika; Chiueh, Tzi-cker; S. N. Spencer
Complex 3D polygonal models are typically represented as triangular meshes, especially when they are generated procedurally, or created from volumetric data sets through surface extraction. Existing 3D rendering hardware, on the other hand, processes one triangle at a time. Therefore triangle meshes need to be converted to individual triangles when they are fed to the graphics pipeline. The design goal of such conversion algorithms is to minimize the number of vertices that are sent redundantly to the rendering pipeline. This paper proposes a breadth-first approach to traverse triangle meshes that reduces vertex redundancy to very close to the theoretical minimum. With the proposed scheme, no triangle vertices need to be specified multiple times, barring exceptional cases.
In addition, owing to a prefetching technique, the on-chip storage requirement for effective mesh traversal remains small and largely constant regardless of the mesh size. Our experimental results show that, assuming a 64-vertex buffer, the redundant transformation overhead associated with the proposed approach is between 1.00% and 7.33% for a set of 8 triangle meshes whose size ranges from 2,992 to 40,000 triangles.

Gouraud Bump Mapping (The Eurographics Association, 1998) Ernst, I.; Rüsseler, H.; Schulz, H.; Wittig, O.; S. N. Spencer
In this paper a new low-cost bump mapping hardware is presented. The new hardware approach does not rely on per-pixel lighting, but instead uses Gouraud interpolated triangles. The bump mapping effect is applied by blending the calculated per-pixel bump map color onto the fragment's color. This allows real-time animated distant light sources to react on the specified bump map. The paper further investigates a number of different variants of recently proposed bump engines. These variants range from low-end PC solutions to highest-quality high-end solutions.
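
Illustrative note on the entry "A Breadth-First Approach To Efficient Mesh Traversal": the abstract measures how many vertices are sent redundantly when a mesh is fed to the pipeline through a small vertex buffer. The sketch below is not the paper's breadth-first algorithm. It is a minimal Python model, under assumptions of our own, of how such an overhead could be measured: a FIFO vertex buffer (the replacement policy is an assumption), a regular quad grid as test mesh, and overhead defined as extra vertex sends divided by the number of distinct vertices. The names grid_triangles and redundancy_overhead are hypothetical.

```python
from collections import deque


def grid_triangles(cols, rows):
    """Triangles of a cols x rows quad grid, two per quad, in row-major order."""
    tris = []
    stride = cols + 1  # number of vertices in one row of the grid
    for r in range(rows):
        for c in range(cols):
            v0 = r * stride + c   # top-left corner of the quad
            v1 = v0 + 1           # top-right
            v2 = v0 + stride      # bottom-left
            v3 = v2 + 1           # bottom-right
            tris.append((v0, v1, v2))
            tris.append((v1, v3, v2))
    return tris


def redundancy_overhead(triangles, buffer_size=64):
    """Extra vertex sends relative to the number of distinct vertices.

    Assumes a FIFO vertex buffer: a vertex is sent (and transformed) whenever
    it is not resident, and the oldest entry is evicted when the buffer is full.
    """
    fifo = deque()     # insertion order of resident vertices
    resident = set()   # fast membership test for the same vertices
    sends = 0
    for tri in triangles:
        for v in tri:
            if v in resident:
                continue  # buffer hit: no need to send the vertex again
            sends += 1
            fifo.append(v)
            resident.add(v)
            if len(fifo) > buffer_size:
                resident.discard(fifo.popleft())
    distinct = len({v for tri in triangles for v in tri})
    return (sends - distinct) / distinct


if __name__ == "__main__":
    # Narrow strip: two vertex rows fit in the buffer, so rows are reused.
    print(redundancy_overhead(grid_triangles(10, 10)))
    # Wide strip: a vertex row is evicted before the next quad row needs it.
    print(redundancy_overhead(grid_triangles(100, 4)))
```

In this model, a naive row-major ordering reuses vertices well only while two vertex rows fit in the buffer. That gives a rough sense of why the traversal order, which the paper addresses, matters for a fixed buffer size.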