Point Based Representations for Novel View Synthesis
dc.contributor.author | Georgios Kopanas | |
dc.date.accessioned | 2024-12-20T06:49:37Z | |
dc.date.available | 2024-12-20T06:49:37Z | |
dc.date.issued | 2023-11-03 | |
dc.description.abstract | The primary goal of inverse rendering is to recover 3D information from a set of 2D observations, usually a set of images or videos. Observing a 3D scene from different viewpoints can provide rich information about the underlying geometry, materials, and physical properties of the objects. Having access to this information enables many downstream applications. In this dissertation, we will focus on free-viewpoint navigation and novel view synthesis, which is the task of re-rendering captured 3D scenes from unobserved viewpoints.

The field has gained incredible momentum since the introduction of Neural Radiance Fields (NeRFs). While NeRFs achieve exceptional image quality on novel view synthesis, that is not the only reason they attracted such strong engagement from the community. Another important property is the simplicity of their optimization: they frame 3D reconstruction as a continuous optimization problem over the parameters of a scene representation, driven by a simple photometric objective function.

In this thesis, we will keep these two advantages of NeRFs and propose points as a new way to represent radiance fields, one that not only achieves state-of-the-art image quality but also renders in real time at over 100 frames per second. Our solution also offers fast training with a tractable memory footprint and is easily integrated into graphics engines.

In a traditional image-based rendering context, we propose a point-based representation with a differentiable rasterization pipeline that optimizes geometry and appearance to achieve high visual quality for novel view synthesis. Next, we use points to tackle highly reflective curved objects -- arguably one of the hardest cases of novel view synthesis -- by learning the trajectory of reflections. In our latest work, we show for the first time that points, augmented to become anisotropic 3D Gaussians, can retain the differentiable properties of NeRFs while also recovering higher-frequency signals and representing empty space more efficiently. At the same time, their Lagrangian nature, together with the fact that our most recent method omits neural networks entirely, gives us an explicit and interpretable representation of geometry and appearance.

Finally, this dissertation also briefly touches on two other topics. The first is how to place cameras to efficiently capture a scene for the purpose of 3D reconstruction. For complex, non-object-centric environments, we provide theoretical and practical intuition on camera placements that enable good reconstruction.

The second topic is motivated by the recent success of generative models: we study how point-based representations can be combined with diffusion models. Current methods impose severe limits on the number of points. In exploratory work, we propose an architecture that leverages multi-view information to decouple the number of points from the speed and performance of the model.

We conclude the dissertation by reflecting on the work performed throughout the thesis and sketching some exciting directions for future work. | |
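For context on the representation described in the abstract, the sketch below summarizes, in conventional notation, how an anisotropic 3D Gaussian scene of this kind is rendered and optimized; the symbols (\(\boldsymbol{\mu}\), \(\Sigma\), \(R\), \(S\), \(\alpha_i\), \(c_i\)) are standard choices, not taken from this record. Each point is promoted to a 3D Gaussian

\[
G(\mathbf{x}) = \exp\!\Big(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\top}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\Big),
\qquad
\Sigma = R\,S\,S^{\top}R^{\top},
\]

where the mean \(\boldsymbol{\mu}\), a rotation \(R\) (stored as a quaternion), and a diagonal scale matrix \(S\) are the learned geometry parameters; the factored form keeps \(\Sigma\) a valid covariance throughout gradient descent. A pixel color \(C\) is obtained by sorting the projected Gaussians front to back and alpha-blending their opacities \(\alpha_i\) and colors \(c_i\):

\[
C = \sum_{i \in \mathcal{N}} c_i\,\alpha_i \prod_{j=1}^{i-1} (1-\alpha_j).
\]

Because every step of this rasterization is differentiable, all per-point parameters can be fit directly with the simple photometric objective mentioned above, i.e. by minimizing an image-space loss \(\mathcal{L}(C, C_{\text{gt}})\) between renderings and the captured photographs, with no neural network in the loop.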
dc.identifier.uri | https://diglib.eg.org/handle/10.2312/3607094 | |
dc.title | Point Based Representations for Novel View Synthesis |