Eurographics Digital Library
This is the DSpace 7 platform of the Eurographics Digital Library.
- The contents of the Eurographics Digital Library Archive are freely accessible. Only access to the full-text documents of the journal Computer Graphics Forum (joint property of Wiley and Eurographics) is restricted to Eurographics members, members of institutions holding an Institutional Membership with Eurographics, or users of the TIB Hannover. On the item pages you will find purchase links to the TIB Hannover.
- As a Eurographics member, you can log in with your email address and password from https://services.eg.org. If you belong to an institutional member and are using a computer within an IP range registered with Eurographics, you can proceed immediately.
- From 2022 onward, all new publications by Eurographics are licensed under Creative Commons. Publishing with Eurographics is Plan-S compliant. Please see the Eurographics Licensing and Open Access Policy for more details.
Recent Submissions
Neural Point-based Rendering for Immersive Novel View Synthesis
(Open FAU, 2025-05-26) Franke, Linus
Recent advances in neural rendering have greatly improved the realism and efficiency of digitizing real-world environments, enabling new possibilities for virtual experiences. However, achieving high-quality digital replicas of physical spaces is challenging due to the need for advanced 3D reconstruction and real-time rendering techniques, with visual outputs often deteriorating in challenging capture conditions. Thus, this thesis explores point-based neural rendering approaches to address key challenges such as geometric inconsistencies, scalability, and perceptual fidelity, ultimately enabling realistic and interactive virtual scene exploration. The vision here is to enable immersive virtual reality (VR) scene exploration and virtual teleportation with the best perceptual quality for the user.

This work introduces techniques to improve point-based Novel View Synthesis (NVS) by refining geometric accuracy and reducing visual artifacts. By detecting and correcting errors in point-cloud-based reconstructions, this approach improves rendering stability and accuracy. Additionally, an efficient rendering pipeline is proposed that combines rasterization with neural refinement to achieve high-quality results at real-time frame rates, ensuring smooth and consistent visual output across diverse scenes. To extend the scalability of neural point representations, a hierarchical structure is presented that efficiently organizes and renders massive point clouds, enabling real-time NVS of city-scale environments. Furthermore, a perceptually optimized foveated rendering technique is developed for VR applications, leveraging the characteristics of the human visual system to balance performance and perceptual quality. Lastly, a real-time neural reconstruction technique is proposed that eliminates preprocessing requirements, allowing for immediate virtual teleportation and interactive scene exploration.

Through these advances, this thesis pushes the boundaries of neural point-based rendering, offering solutions that balance quality, efficiency, and scalability. The findings pave the way for more interactive and immersive virtual experiences, with applications spanning VR, augmented reality (AR), and digital content exploration.
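The abstract above describes a pipeline that rasterizes a point cloud and then refines the result with a neural network. As a minimal illustration of the first stage only, the following Python sketch (with assumed array shapes and a naive per-pixel z-buffer; it is not the thesis implementation) splats a colored point cloud into an image whose holes and aliasing a refinement network would subsequently in-fill:

```python
import numpy as np

def splat_points(points, colors, K, R, t, image_size):
    """Project a colored point cloud with a pinhole camera (K, R, t) and
    splat it into an image using a z-buffer. Illustrative sketch only;
    names, shapes, and the one-pixel splat are assumptions."""
    h, w = image_size
    cam = R @ points.T + t[:, None]        # world -> camera coordinates, shape (3, N)
    visible = cam[2] > 1e-6                # keep points in front of the camera
    uvw = K @ cam[:, visible]
    uv = (uvw[:2] / uvw[2]).T              # perspective divide -> pixel coordinates
    depth = cam[2, visible]
    cols = colors[visible]
    image = np.zeros((h, w, 3), dtype=np.float32)
    zbuf = np.full((h, w), np.inf, dtype=np.float32)
    for (u, v), z, c in zip(uv, depth, cols):
        x, y = int(round(u)), int(round(v))
        if 0 <= x < w and 0 <= y < h and z < zbuf[y, x]:
            zbuf[y, x] = z                 # nearest point wins
            image[y, x] = c
    return image, zbuf                     # sparse color and depth for a refinement network
```

Hierarchical organization of the points and foveation, as described above, would then determine which subsets of the cloud are splatted, and at what resolution.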
Image-based 3D Reconstructions via Differentiable Rendering of Neural Implicit Representations
(2025-02-14) Wu, Tianhao
Modeling objects in 3D is critical for various graphics and metaverse applications and is a fundamental step towards 3D machine reasoning, and the ability to reconstruct objects from RGB images alone significantly broadens these applications. Representing objects in 3D involves learning two distinct aspects of the objects: geometry, which describes where the mass is located, and appearance, which determines the exact pixel colors rendered on the screen. While learning approximate appearance with known geometry is straightforward, obtaining correct geometry, or recovering both simultaneously, from RGB images alone has long been a challenging task. Recent advances in Differentiable Rendering and Neural Implicit Representations have significantly pushed the limits of geometry and appearance reconstruction from RGB images. Utilizing their continuous, differentiable, and less restrictive representations, it is possible to optimize geometry and appearance simultaneously from ground-truth images, leading to much better reconstruction accuracy and re-rendering quality. As one of the most widely adopted neural implicit representations, the Neural Radiance Field (NeRF) achieves clean and straightforward joint reconstruction of volumetric geometry and non-Lambertian appearance from a dense set of RGB images. Various other representations and modifications have also been proposed to handle specific tasks such as smooth surface modeling, sparse-view reconstruction, or dynamic scene reconstruction. However, existing methods still make strict assumptions about the scenes captured and reconstructed, significantly constraining their application scenarios. For instance, current reconstructions typically assume the scene to be perfectly opaque with no semi-transparent effects, assume that no dynamic noise or occluders are included in the capture, or do not optimize rendering efficiency for high-frequency appearance in the scene.

In this dissertation, we present three advancements that push the quality of image-based 3D reconstruction towards robust, reliable, and user-friendly real-world solutions. Our improvements cover the representation, architecture, and optimization of image-based 3D reconstruction approaches. First, we introduce AlphaSurf, a novel implicit representation with decoupled geometry and surface opacity and a grid-based architecture that enables accurate surface reconstruction of intricate or semi-transparent objects. Compared to a traditional image-based 3D reconstruction pipeline that considers only geometry and appearance, it computes the ray-surface intersection and the intersection opacity separately while keeping both naturally differentiable, supporting decoupled optimization from a photometric loss. Specifically, intersections on AlphaSurf are found in closed form via analytical solutions of cubic polynomials, avoiding Monte Carlo sampling, and are therefore fully differentiable by construction, while additional grid-based opacity and radiance fields are incorporated to allow reconstruction from RGB images only. We then consider the dynamic noise and occluders accidentally included in captures intended for static 3D reconstruction, a common challenge in real-world settings. This issue is particularly problematic for street scans or scenes with potential sources of dynamic noise, such as cars, humans, or plants.
We propose D^2NeRF, a method that reconstructs 3D scenes from casual mobile phone videos with all dynamic occluders decoupled from the static scene. This approach models both 3D and 4D objects from RGB images and utilizes freedom constraints to achieve dynamic decoupling without semantic guidance; hence, it can handle uncommon dynamic disturbances such as pouring liquid and moving shadows. Finally, we address the efficiency constraints of 3D reconstruction and rendering, and specifically propose a solution for the lightweight representation of scene components with simple geometry but high-frequency textures. We utilize a sparse set of anchors with correspondences from 3D to 2D texture space, enabling the high-frequency clothing of a forward-facing neural avatar to be modeled as a 2D texture with neural deformation, a simplified and constrained representation. This dissertation provides a comprehensive overview of neural implicit representations and their applications in 3D reconstruction from RGB images, along with several advancements towards more robust and efficient reconstruction in challenging real-world scenarios. We demonstrate that the representation, architecture, and optimization need to be specifically designed to overcome the challenging obstacles of image-based reconstruction, given the severely ill-posed nature of the problem. With the right method design, we can reconstruct translucent surfaces, remove dynamic occluders from the capture, and efficiently model high-frequency appearance from only posed multi-view images or monocular video.
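For readers unfamiliar with the underlying formulation, the volume rendering integral that NeRF-style representations are optimized against, in the notation of the original NeRF paper (not specific to this dissertation), is

\[ C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t),\mathbf{d})\,\mathrm{d}t, \qquad T(t) = \exp\!\left(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))\,\mathrm{d}s\right), \]

where \(\sigma\) is the volume density, \(\mathbf{c}\) the view-dependent color, and \(T(t)\) the accumulated transmittance along the camera ray \(\mathbf{r}(t) = \mathbf{o} + t\mathbf{d}\); photometric losses on \(C(\mathbf{r})\) are what drive the simultaneous optimization of geometry and appearance described above.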
A micrograin formalism for the rendering of porous materials
(2024-12-06) Lucas, Simon
This thesis focuses on the impact of microscopic structures on material appearance, with a particular emphasis on porous materials. We first evaluated existing appearance models by conducting light transport simulations on sphere aggregates representing porous volumes. We found that none of the existing models accurately matched the simulations, with most errors arising from surface effects. This opened the path to the development of a specialized Bidirectional Scattering Distribution Function (BSDF) model for rendering porous layers, such as those found on surfaces covered with dust, rust, or dirt. Our model extends the Trowbridge-Reitz (GGX) distribution to handle pores between elliptical opaque micrograins and introduces a view- and light-dependent filling factor to blend the porous and base layers. By adding height-normal and light-view correlations in the masking and shadowing terms, our model produces realistic effects observed in real-world materials, such as retro-reflection and height-color correlations, that were previously hard to obtain. To improve rendering performance for micrograin materials, we introduce an efficient importance sampling routine for visible Normal Distribution Functions (vNDFs). Through numerical simulations, we validate the accuracy of our model. Finally, our work provides a comprehensive formalism for rendering porous layers and opens many perspectives for future work.
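For context, the isotropic Trowbridge-Reitz (GGX) normal distribution that the micrograin model extends reads, in its common textbook form (the thesis's elliptical-micrograin and filling-factor extensions are not reproduced here),

\[ D_{\mathrm{GGX}}(\mathbf{m}) = \frac{\alpha^2}{\pi \left( (\mathbf{n}\cdot\mathbf{m})^2 (\alpha^2 - 1) + 1 \right)^2}, \qquad \mathbf{n}\cdot\mathbf{m} > 0, \]

where \(\mathbf{n}\) is the macroscopic surface normal, \(\mathbf{m}\) the microfacet normal, and \(\alpha\) the roughness parameter.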
How to Train Your Renderer: Optimized Methods for Learning Path Distributions in Monte Carlo Light Transport
(2025-05-06) Rath, Alexander
Light transport simulation allows us to preview architectural marvels before they break ground, practice complex surgeries without a living subject, and explore alien worlds from the comfort of our homes. Fueled by steady advances in computer hardware, rendering virtual scenes is more accessible than ever and is met by unprecedented demand for such content. Light interacts with our world in various intricate ways; hence, the challenge in realistic rendering lies in tracing all the possible paths that light could take within a given virtual scene. Contemporary approaches predominantly rely on Monte Carlo integration, for which countless sampling procedures have been proposed to handle certain families of effects robustly. Handling all effects holistically through specialized sampling routines, however, remains an unsolved problem.
A promising alternative is to use learning techniques that automatically adapt to the effects present in the scene. However, such approaches require many complex design choices, for which existing works commonly resort to heuristics. In this work, we investigate what constitutes effective learning algorithms for rendering, from the data representation and the quantities to be learned to the fitting process itself. By strategically optimizing these components for desirable goals, such as overall render efficiency, we demonstrate significant improvements over existing approaches.
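As a brief reminder of the quantity these learned sampling methods target, the importance-sampled Monte Carlo estimator of an integral \(I = \int f(x)\,\mathrm{d}x\), in its standard textbook form (not specific to this thesis), is

\[ \langle I \rangle_N = \frac{1}{N} \sum_{i=1}^{N} \frac{f(x_i)}{p(x_i)}, \qquad x_i \sim p, \]

which is unbiased whenever \(p(x) > 0\) wherever \(f(x) \neq 0\); its variance shrinks as \(p\) approaches proportionality to \(f\), which is precisely the target that learned path distributions aim to approximate.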
Semantic understanding of urban scenes from textured meshes
(TU Delft Library, 2024-11-04) Gao, Weixiao
The thesis explores the semantic understanding of urban textured meshes derived from photogrammetric methods. It primarily addresses three aspects of urban textured meshes: 1) semantic annotation and the creation of benchmark datasets, 2) semantic segmentation, and 3) the automation of lightweight 3D city modeling using semantic information.
The first focus of the thesis is the development of a benchmark dataset to evaluate the performance of advanced 3D semantic segmentation methods in urban settings. An interactive 3D annotation framework has been proposed to assign ground-truth labels to the triangle faces and texture pixels of urban meshes. This framework achieves efficient and accurate semi-automatic annotation through segment classification and structure-aware interactive selection. In the center of Helsinki, Finland, object-level annotations were made over approximately 4 km\(^2\) (covering buildings, vegetation, vehicles, and other objects), and part-level annotations over about 2.5 km\(^2\) (covering building parts such as doors and windows, as well as road markings). The design of the annotation tools streamlines user interaction and enables quick annotation of large scenes, while the resulting datasets allow researchers to refine their deep learning models for urban analysis.
Another research focus is on mesh segmentation algorithms. A novel semantic mesh segmentation algorithm has been introduced for large-scale urban environments, employing plane-sensitive over-segmentation combined with graph-based methods for contextual data integration. This approach, which utilizes graph convolutional networks for classification, significantly improves performance over traditional techniques on our proposed benchmark datasets.
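For reference, the standard graph convolution of Kipf and Welling, on which such segment-graph classifiers commonly build (the thesis's exact architecture may differ), propagates features as

\[ H^{(l+1)} = \sigma\!\left( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)} \right), \qquad \tilde{A} = A + I, \]

where \(A\) is the adjacency matrix of the over-segment graph, \(\tilde{D}\) the degree matrix of \(\tilde{A}\), \(H^{(l)}\) the per-segment features at layer \(l\), \(W^{(l)}\) a learned weight matrix, and \(\sigma\) a nonlinearity.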
Finally, leveraging this semantic information, a pipeline for reconstructing lightweight 3D city models has been designed. This facilitates the automated reconstruction of CityGML-based LoD2 and LoD3 city models, ensuring high fidelity in geometric detail and semantic accuracy. The reconstructed large-scale, lightweight, and semantic city models significantly broaden applications in urban spatial intelligence, including automatic geometric measurements, interactive spatial computations, spatial analysis based on external data, and environment simulation using physics engines.
This thesis enhances the practicality of 3D data in real-world applications by utilizing semantic parsing of urban textured meshes to generate lightweight 3D urban semantic models, greatly enriching their usability. It also lays a solid foundation for future progress in understanding, modeling, and analyzing 3D urban scenes.