38-Issue 7

Permanent URI for this collection

https://diglib.eg.org/handle/10.2312/2632813

Active Scene Understanding via Online Semantic Reconstruction
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Zheng, Lintao; Zhu, Chenyang; Zhang, Jiazhao; Zhao, Hang; Huang, Hui; Niessner, Matthias; Xu, Kai; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
We propose a novel approach to robot-operated active understanding of unknown indoor scenes, based on online RGBD reconstruction with semantic segmentation. In our method, the exploratory robot scanning is both driven by and targeted at the recognition and segmentation of semantic objects in the scene. Our algorithm is built on top of a volumetric depth fusion framework and performs real-time voxel-based semantic labeling over the online reconstructed volume. The robot is guided by an online estimated discrete viewing score field (VSF) parameterized over the 3D space of 2D location and azimuth rotation. For each grid cell, the VSF stores the score of the corresponding view, which measures how much that view reduces the uncertainty (entropy) of both geometric reconstruction and semantic labeling. Based on the VSF, we select the next best view (NBV) as the target for each time step. We then jointly optimize the traverse path and camera trajectory between two adjacent NBVs by maximizing the integral viewing score (information gain) along the path and trajectory. Through extensive evaluation, we show that our method achieves efficient and accurate online scene parsing during exploratory scanning.
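The entropy-driven view selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the dictionary-based score field are assumptions made here for illustration, and the score values are taken as given rather than computed from a reconstruction.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def next_best_view(score_field):
    """Pick the (x, y, azimuth) cell with the highest viewing score.

    score_field is a hypothetical dict mapping a discretized view
    (x, y, azimuth) to its expected uncertainty reduction.
    """
    return max(score_field, key=score_field.get)

# A voxel whose label is undecided carries more entropy than a confident one.
uncertain = entropy([0.5, 0.5])
confident = entropy([0.9, 0.1])

# Toy score field over three candidate views.
field = {(0, 0, 0): 0.10, (1, 0, 90): 0.70, (2, 1, 180): 0.35}
target = next_best_view(field)
```

In this toy setting the view at `(1, 0, 90)` is selected because it promises the largest entropy reduction.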
Anisotropic Surface Remeshing without Obtuse Angles
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Xu, Qun-Ce; Yan, Dong-Ming; Li, Wenbin; Yang, Yong-Liang; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
We present a novel anisotropic surface remeshing method that can efficiently eliminate obtuse angles. Unlike previous work that can only suppress obtuse angles with expensive resampling and Lloyd-type iterations, our method relies on a simple yet efficient connectivity and geometry refinement, which not only removes all obtuse angles but also preserves the original mesh connectivity as much as possible. Our method can be directly used as a post-processing step for anisotropic meshes generated by existing algorithms to improve mesh quality. We evaluate our method by testing it on a variety of meshes with different geometry and topology, and by comparing it with representative prior work. The results demonstrate the effectiveness and efficiency of our approach.
Appearance Flow Completion for Novel View Synthesis
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Le, Hoang; Liu, Feng; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
Novel view synthesis from sparse and unstructured input views faces challenges such as the difficulty of dense 3D reconstruction and large occlusions. This paper addresses these problems by estimating proper appearance flows from the target to input views to warp and blend the input views. Our method first estimates a sparse set of 3D scene points using an off-the-shelf 3D reconstruction method and calculates sparse flows from the target to input views. Our method then performs appearance flow completion to estimate the dense flows from the corresponding sparse ones. Specifically, we design a deep fully convolutional neural network that takes sparse flows and input views as input and outputs the dense flows. Furthermore, we estimate the optical flows between input views as references to guide the estimation of dense flows between the target view and input views. Besides the dense flows, our network also estimates the masks to blend multiple warped inputs to render the target view. Experiments on the KITTI benchmark show that our method can generate high-quality novel views from sparse and unstructured input views.
Automatic Modeling of Cluttered Multi-room Floor Plans From Panoramic Images
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Pintore, Giovanni; Ganovelli, Fabio; Villanueva, Alberto Jaspe; Gobbetti, Enrico; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
We present a novel and light-weight approach to capture and reconstruct structured 3D models of multi-room floor plans. Starting from a small set of registered panoramic images, we automatically generate a 3D layout of the rooms and of all the main objects inside. Such a 3D layout is directly suitable for use in a number of real-world applications, such as guidance, location, routing, or content creation for security and energy management. Our novel pipeline introduces several contributions to indoor reconstruction from purely visual data. In particular, we automatically partition panoramic images into a connectivity graph, according to the visual layout of the rooms, and exploit this graph to support object recovery and room boundary extraction. Moreover, we introduce a plane-sweeping approach to jointly reason about the content of multiple images and solve the problem of object inference in a top-down 2D domain. Finally, we combine these methods in a fully automated pipeline for creating a structured 3D model of a multi-room floor plan and of the location and extent of clutter objects. These contributions make our pipeline able to handle cluttered scenes with complex geometry that are challenging to existing techniques. The effectiveness and performance of our approach are evaluated on both real-world and synthetic models.
A Color-Pair Based Approach for Accurate Color Harmony Estimation
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Yang, Bailin; Wei, Tianxiang; Fang, Xianyong; Deng, Zhigang; Li, Frederick W. B.; Ling, Yun; Wang, Xun; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
Harmonious color combinations can stimulate positive user emotional responses. However, a wide-open research question is: how can we establish a robust and accurate color harmony measure that lets the public and professional designers identify the harmony level of a color theme or color set? Building upon the key discovery that color pairs play an important role in harmony estimation, in this paper we present a novel color-pair based estimation model to accurately measure color harmony. It first uses a two-layer maximum likelihood estimation (MLE) based method to compute an initial prediction of color harmony by statistically modeling the pair-wise color preferences from existing datasets. Then, the initial scores are refined through a back-propagation neural network (BPNN) with a variety of color features extracted in different color spaces, so that an accurate harmony estimation can be obtained at the end. Our extensive experiments, including performance comparisons of harmony estimation applications, show the advantages of our method in comparison with state-of-the-art methods.
Compacting Voxelized Polyhedra via Tree Stacking
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Hao, Yue; Lien, Jyh-Ming; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
Volume compaction is a geometric problem that aims to reduce the volume of a polyhedron via shape transformation. Compactable structures are easier to transport and in some cases easier to manufacture; therefore, they are commonly found in our daily life (e.g., collapsible containers) and in advanced technology industries (e.g., the recent launch of 60 Starlink satellites compacted in a single rocket by SpaceX). It is known in the literature that finding a universal solution to compact an arbitrary 3D shape is computationally challenging. Previous approaches showed that stripifying the mesh surface can lead to optimal compaction, but the resulting structures were often impractical. In this paper, we propose an algorithm that cuts a 3D orthogonal polyhedron, tessellated by thick square panels, into a tree structure that can be transformed into compact piles by folding and stacking. We call this process tree stacking. Our research found that it is possible to decompose the problem into a pipeline of several solvable local optimizations. We also provide an efficient algorithm to check whether a solution exists while avoiding the computational bottleneck of the pipeline. Our results show that tree stacking can efficiently generate stackable structures that have better folding accuracy than, and similar compactness to, the most compact stacking using strips.
Computing Surface PolyCube-Maps by Constrained Voxelization
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Yang, Yang; Fu, Xiao-Ming; Liu, Ligang; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
We present a novel method to compute bijective PolyCube-maps with low isometric distortion. Given a surface and its pre-axis-aligned shape that is not an exact PolyCube shape, the algorithm contains two steps: (i) construct a PolyCube shape to approximate the pre-axis-aligned shape; and (ii) generate a bijective, low isometric distortion mapping between the constructed PolyCube shape and the input surface. The PolyCube construction is formulated as a constrained optimization problem, where the objective is the number of corners in the constructed PolyCube, and the constraint is to bound the approximation error between the constructed PolyCube and the input pre-axis-aligned shape while ensuring topological validity. A novel erasing-and-filling solver is proposed to solve this challenging problem. Central to the algorithm for computing bijective PolyCube-maps is a quad mesh optimization process that projects the constructed PolyCube onto the input surface with high-quality quads. We demonstrate the efficacy of our algorithm on a data set containing 300 closed meshes. Compared to state-of-the-art methods, our method achieves higher practical robustness and lower mapping distortion.
Deep Line Drawing Vectorization via Line Subdivision and Topology Reconstruction
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Guo, Yi; Zhang, Zhuming; Han, Chu; Hu, Wenbo; Li, Chengze; Wong, Tien-Tsin; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
Vectorizing line drawings is necessary for the digital workflows of 2D animation and engineering design, but it is challenging due to the ambiguity of topology, especially at junctions. Existing vectorization methods either suffer from low accuracy or cannot deal with high-resolution images. To deal with a variety of challenging drawings containing different kinds of complex junctions, we propose a two-phase line drawing vectorization method that analyzes the global and local topology. In the first phase, we subdivide the lines into partial curves, and in the second phase, we reconstruct the topology at junctions. With the overall topology estimated in the two phases, we can trace and vectorize the curves. To qualitatively and quantitatively evaluate our method and compare it with existing methods, we conduct extensive experiments not only on existing datasets but also on our newly synthesized dataset, which contains different types of complex and ambiguous junctions. Experimental statistics show that our method greatly outperforms existing methods in terms of computational speed and achieves visually better topology reconstruction accuracy.
Deep Video-Based Performance Synthesis from Sparse Multi-View Capture
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Chen, Mingjia; Wang, Changbo; Liu, Ligang; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
We present a deep learning based technique that enables novel-view videos of human performances to be synthesized from sparse multi-view captures. While performance capture from a sparse set of videos has received significant attention, there has been relatively little progress on non-rigid objects (e.g., human bodies). The rich articulation modes of the human body make it rather challenging to synthesize and interpolate the model well. To address this problem, we propose a novel deep learning based framework that directly predicts novel-view videos of human performances without explicit 3D reconstruction. Our method is a composition of two steps: novel-view prediction and detail enhancement. We first learn a novel deep generative query network for view prediction. We synthesize novel-view performances from a sparse set of just five or fewer camera videos. Then, we use a new generative adversarial network to enhance fine-scale details of the first-step results. This opens up the possibility of high-quality low-cost video-based performance synthesis, which is gaining popularity for VR and AR applications. We demonstrate a variety of promising results, where our method is able to synthesize more robust and accurate performances than existing state-of-the-art approaches when only sparse views are available.
Desertscapes Simulation
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Paris, Axel; Peytavie, Adrien; Guérin, Eric; Argudo, Oscar; Galin, Eric; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
We present an interactive aeolian simulation to author hot desert scenery. Wind is an important erosion agent in deserts which, despite its importance, has been neglected in computer graphics. Our framework overcomes this and allows generating a variety of sand dunes, including barchans, longitudinal and anchored dunes, and simulates abrasion which erodes bedrock and sculpts complex landforms. Given an input time-varying, high-altitude wind field, we compute the wind field at the surface of the terrain according to the relief, and simulate the transport of sand blown by the wind. The user can interactively model complex desert landscapes and control their evolution over time, either by using a variety of interactive brushes or by prescribing events along a user-defined timeline.
Discrete Calabi Flow: A Unified Conformal Parameterization Method
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Su, Kehua; Li, Chenchen; Zhou, Yuming; Xu, Xu; Gu, Xianfeng; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
Conformal parameterization of surfaces into various parameter domains is a fundamental task in computer graphics. Prior research on discrete Ricci flow provided us with promising inspiration from methods derived via Riemannian geometry, which is rigorous in theory and effective in practice. In this paper, we propose a unified conformal parameterization approach for mapping triangle meshes into planar and spherical domains using discrete Calabi flow on a piecewise linear metric. We incorporate edge-flipping surgery to guarantee convergence, as well as other significant improvements including an approximate Newton's method, optimal step lengths, priority embedding and boundary customization, which achieve better performance and functionality with robustness and accuracy.
Distribution Update of Deformable Patches for Texture Synthesis on the Free Surface of Fluids
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Gagnon, Jonathan; Guzmán, Julián E.; Vervondel, Valentin; Dagenais, François; Mould, David; Paquette, Eric; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
We propose an approach for temporally coherent patch-based texture synthesis on the free surface of fluids. Our approach is applied as a post-process, using the surface and velocity field from any fluid simulator. We apply the texture from the exemplar through multiple local mesh patches fitted to the surface and mapped to the exemplar. Our patches are constructed from the fluid free surface by taking a subsection of the free surface mesh. As such, they are initially very well adapted to the fluid's surface, and can later deform according to the free surface velocity field, allowing a greater ability to represent surface motion than rigid or 2D grid-based patches. From one frame to the next, the patch centers and surrounding patch vertices are advected according to the velocity field. We seek to maintain a Poisson disk distribution of patches, and following advection, the Poisson disk criterion determines where to add new patches and which patches should be flagged for removal. The removal considers the local number of patches: in regions containing too many patches, we accelerate the temporal removal. This reduces the number of patches while still meeting the Poisson disk criterion. Reducing areas with too many patches speeds up the computation and avoids patch-blending artifacts. The final step of our approach creates the overall texture in an atlas where each texel is computed from the patches using a contrast-preserving blending function. Our tests show that the approach works well on free surfaces undergoing significant deformation and topological changes. Furthermore, we show that our approach provides good results for many fluid simulation scenarios, and with many texture exemplars. We also confirm that the optical flow from the resulting texture matches the fluid velocity field. Overall, our approach compares favorably against recent work in this area.
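The Poisson disk criterion mentioned in this abstract can be sketched as a minimum-distance test between patch centers. The sketch below is an assumption-laden illustration, not the authors' method: it uses Euclidean distance between 3D points rather than distance on the fluid surface, a single fixed radius, and a greedy pass in place of the accelerated temporal removal the abstract describes. All function names are hypothetical.

```python
import math

def satisfies_poisson_disk(candidate, centers, radius):
    """True if candidate is at least `radius` away from every center."""
    return all(math.dist(candidate, c) >= radius for c in centers)

def patches_to_remove(centers, radius):
    """Greedily flag the indices of patch centers that crowd earlier ones."""
    kept, removed = [], []
    for i, c in enumerate(centers):
        if satisfies_poisson_disk(c, [centers[k] for k in kept], radius):
            kept.append(i)
        else:
            removed.append(i)
    return removed

# After advection, the second patch has drifted too close to the first.
centers = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (1.0, 0.0, 0.0)]
flagged = patches_to_remove(centers, radius=0.5)
can_add = satisfies_poisson_disk((2.0, 0.0, 0.0), centers, radius=0.5)
```

Here the same test serves both purposes named in the abstract: deciding where a new patch may be added and deciding which existing patches are candidates for removal.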
Dual Illumination Estimation for Robust Exposure Correction
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Zhang, Qing; Nie, Yongwei; Zheng, Wei-Shi; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
Exposure correction is one of the fundamental tasks in image processing and computational photography. While various methods have been proposed, they either fail to produce visually pleasing results, or only work well for limited types of images (e.g., underexposed images). In this paper, we present a novel automatic exposure correction method, which is able to robustly produce high-quality results for images of various exposure conditions (e.g., underexposed, overexposed, and partially under- and over-exposed). At the core of our approach is the proposed dual illumination estimation, where we separately cast the under- and over-exposure correction as trivial illumination estimation of the input image and the inverted input image. By performing dual illumination estimation, we obtain two intermediate exposure correction results for the input image, one fixing the underexposed regions and the other restoring the overexposed regions. A multi-exposure image fusion technique is then employed to adaptively blend the visually best exposed parts in the two intermediate exposure correction images and the input image into a globally well-exposed image. Experiments on a number of challenging images demonstrate the effectiveness of the proposed approach and its superiority over state-of-the-art methods and popular automatic exposure correction tools.
Field-aligned Quadrangulation for Image Vectorization
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Wei, Guangshun; Zhou, Yuanfeng; Gao, Xifeng; Ma, Qian; Xin, Shiqing; He, Ying; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
Image vectorization is an important yet challenging problem, especially when the input image has rich content. In this paper, we develop a novel method for automatically vectorizing natural images with feature-aligned quad-dominant meshes. Inspired by the quadrangulation methods in 3D geometry processing, we propose a new directional field optimization technique that encodes the color gradients, sidestepping the explicit computation of salient image features. We further compute the anisotropic scales of the directional field by accommodating the distance among image features. Our method is fully automatic and efficient, taking only a few seconds for a 400x400 image on a normal laptop. We demonstrate the effectiveness of the proposed method on various image editing applications.
Figure Skating Simulation from Video
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Yu, Ri; Park, Hwangpil; Lee, Jehee; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
Figure skating is one of the most popular ice sports at the Winter Olympic Games. The skaters perform several skating skills to express the beauty of the art on ice. Skating involves moving on ice while wearing skate shoes with thin blades; thus, it requires much practice to skate without losing balance. Moreover, figure skating presents dynamic moves, such as jumping, artistically. Therefore, demonstrating figure skating skills is even more difficult to achieve than basic skating, and professional skaters often fall during Winter Olympic performances. We propose a system to demonstrate figure skating motions with a physically simulated human-like character. We simulate skating motions with non-holonomic constraints, which make the skate blade glide on the ice surface. It is difficult to obtain reference motions from figure skaters because figure skating motions are very fast and dynamic. Instead of using motion capture data, we use key poses extracted from videos on YouTube and complete reference motions using trajectory optimization. We demonstrate figure skating skills, such as crossover, three-turn, and even jump. Finally, we use deep reinforcement learning to generate a robust controller for figure skating skills.
FontRNN: Generating Large-scale Chinese Fonts via Recurrent Neural Network
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Tang, Shusen; Xia, Zeqing; Lian, Zhouhui; Tang, Yingmin; Xiao, Jianguo; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
Despite the recent impressive development of deep neural networks, using deep learning based methods to generate large-scale Chinese fonts is still a rather challenging task due to the huge number of intricate Chinese glyphs, e.g., the official standard Chinese charset GB18030-2000 consists of 27,533 Chinese characters. Until now, most existing models for this task have adopted Convolutional Neural Networks (CNNs) to generate bitmap images of Chinese characters due to CNN based models' remarkable success in various applications. However, CNN based models focus more on image-level features while usually ignoring the stroke order information used when writing characters. Instead, we treat Chinese characters as sequences of points (i.e., writing trajectories) and propose to handle this task via an effective Recurrent Neural Network (RNN) model with a monotonic attention mechanism, which can learn from as few as hundreds of training samples and then synthesize glyphs for the remaining thousands of characters in the same style. Experimental results show that our proposed FontRNN can be used for synthesizing large-scale Chinese fonts as well as generating realistic Chinese handwriting efficiently.
A Generalized Cubemap for Encoding 360° VR Videos using Polynomial Approximation
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Xiao, Jianye; Tang, Jingtao; Zhang, Xinyu; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
360° VR videos provide users with an immersive visual experience. To encode 360° VR videos, spherical pixels must be mapped onto a two-dimensional domain to take advantage of the existing video encoding and storage standards. In the VR industry, standard cubemap projection is the most widely used projection method for encoding 360° VR videos. However, it exhibits pixel density variation across regions due to projection distortion. We present a generalized algorithm to improve the efficiency of cubemap projection using polynomial approximation. In our algorithm, standard cubemap projection can be regarded as a special form with a 1st-order polynomial. Our experiments show that the generalized cubemap projection can significantly reduce the projection distortion using higher-order polynomials. As a result, pixel distribution can be well balanced in the resulting 360° VR videos. We use PSNR, S-PSNR and CPP-PSNR to evaluate the visual quality, and the experimental results demonstrate promising performance improvement over standard cubemap projection and Google's equi-angular cubemap.
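The pixel density variation described in this abstract can be made concrete with a small sketch. It considers only a one-dimensional slice through the center of a cube face, where a face coordinate t in [0, 1] is mapped to a position f(t) on the face and then to the viewing angle atan(f(t)). The identity map plays the role of the standard (1st-order) cubemap. The cubic map and its coefficient are hypothetical choices made here for illustration and are not the polynomial from the paper.

```python
import math

def linear_map(t):
    """Standard cubemap along one axis: face coordinate equals texture coordinate."""
    return t

def cubic_map(t, a=math.pi / 4):
    """Hypothetical odd cubic with f(0) = 0 and f(1) = 1 (illustrative only)."""
    return a * t + (1.0 - a) * t ** 3

def density_ratio(face_map, samples=1000):
    """Ratio of the largest to the smallest angular step between neighboring texels.

    A ratio of 1 would mean every texel covers the same viewing angle.
    """
    angles = [math.atan(face_map(i / samples)) for i in range(samples + 1)]
    steps = [b - a for a, b in zip(angles, angles[1:])]
    return max(steps) / min(steps)

standard = density_ratio(linear_map)
generalized = density_ratio(cubic_map)
```

Along this slice the standard mapping spends roughly twice as much angle per texel at the face center as at the face edge, while the illustrative cubic keeps the ratio much closer to 1.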
Generating 3D Faces using Multi-column Graph Convolutional Networks
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Li, Kun; Liu, Jingying; Lai, Yu-Kun; Yang, Jingyu; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
In this work, we introduce multi-column graph convolutional networks (MGCNs), a deep generative model for 3D mesh surfaces that effectively learns a non-linear facial representation. We perform spectral decomposition of meshes and apply convolutions directly in the frequency domain. Our network architecture involves multiple columns of graph convolutional networks (GCNs), namely large GCN (L-GCN), medium GCN (M-GCN) and small GCN (S-GCN), with different filter sizes to extract features at different scales. L-GCN is more useful to extract large-scale features, whereas S-GCN is effective for extracting subtle and fine-grained features, and M-GCN captures information in between. Therefore, to obtain a high-quality representation, we propose a selective fusion method that adaptively integrates these three kinds of information. Spatially non-local relationships are also exploited through a self-attention mechanism to further improve the representation ability in the latent vector space. Through extensive experiments, we demonstrate the superiority of our end-to-end framework in improving the accuracy of 3D face reconstruction. Moreover, with the help of variational inference, our model has excellent generating ability.
Generic Interactive Pixel-level Image Editing
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Liang, Yun; Gan, Yibo; Chen, Mingqin; Gutierrez, Diego; Muñoz Orbañanos, Adolfo; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
Several image editing methods have been proposed in the past decades, achieving brilliant results. The most sophisticated of them, however, require additional per-pixel information. For instance, dehazing requires a specific transmittance value per pixel, and depth of field blurring requires depth or disparity values per pixel. This additional per-pixel value is obtained either through elaborate heuristics or through additional control over the capture hardware, which is very often tailored to the specific editing application. In contrast, we propose a generic editing paradigm that can become the base of several different applications. This paradigm generates both the needed per-pixel values and the resulting edit at interactive rates, with minimal user input that can be iteratively refined. Our key insight for getting per-pixel values at such speed is to cluster them into superpixels, but, instead of a constant value per superpixel (which yields accuracy problems), we have a mathematical expression for pixel values at each superpixel: in our case, an order two multinomial per superpixel. This leads to a linear least-squares system, effectively enabling specific per-pixel values at fast speeds. We illustrate this approach in three applications: depth of field blurring (from depth values), dehazing (from transmittance values) and tone mapping (from brightness and contrast local values), and our approach proves both favorably interactive and accurate in all three. Our technique is also evaluated with a common dataset and compared favorably.
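The per-superpixel fit described in this abstract can be sketched as an ordinary linear least-squares problem over the six monomials of an order-two polynomial in the pixel coordinates. The sketch below is a minimal illustration under stated assumptions: it fits one superpixel in isolation from a handful of (x, y, value) samples, solves the normal equations directly, and ignores any coupling between superpixels or user constraints that the paper may use. All function names are hypothetical.

```python
def quad_features(x, y):
    """Monomials of an order-two polynomial in (x, y)."""
    return [1.0, x, y, x * x, x * y, y * y]

def solve(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_quadratic(samples):
    """Least-squares coefficients from (x, y, value) samples via normal equations."""
    rows = [quad_features(x, y) for x, y, _ in samples]
    vals = [v for _, _, v in samples]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(6)] for i in range(6)]
    atb = [sum(r[i] * v for r, v in zip(rows, vals)) for i in range(6)]
    return solve(ata, atb)

def evaluate(coeffs, x, y):
    """Per-pixel value predicted by the fitted polynomial."""
    return sum(c * f for c, f in zip(coeffs, quad_features(x, y)))

# Samples drawn from a known quadratic on a 3x3 pixel neighborhood.
true_coeffs = [1.0, 2.0, -1.0, 0.5, 0.25, -0.75]
samples = [(float(x), float(y), evaluate(true_coeffs, float(x), float(y)))
           for x in range(3) for y in range(3)]
fitted = fit_quadratic(samples)
```

Once the six coefficients are known, `evaluate` yields a distinct value at every pixel of the superpixel, which is the advantage over a constant value per superpixel noted in the abstract.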
Global Texture Mapping for Dynamic Objects
(The Eurographics Association and John Wiley & Sons Ltd., 2019) Kim, Jungeon; Kim, Hyomin; Park, Jaesik; Lee, Seungyong; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
We propose a novel framework to generate a global texture atlas for a deforming geometry. Our approach differs from prior art in two aspects. First, instead of generating a texture map for each timestamp to color a dynamic scene, our framework reconstructs a global texture atlas that can be consistently mapped to a deforming object. Second, our approach is based on a single RGB-D camera, without the need for a multiple-camera setup surrounding a scene. In our framework, the input is a 3D template model with an RGB-D image sequence, and geometric warping fields are found using a state-of-the-art non-rigid registration method [GXW*15] to align the template mesh to noisy and incomplete input depth images. With these warping fields, our multi-scale approach for texture coordinate optimization generates a sharp and clear texture atlas that is consistent with multiple color observations over time. Our approach is accelerated by graphics hardware and provides a handy configuration to capture a dynamic geometry along with a clean texture atlas. We demonstrate our approach with practical scenarios, particularly human performance capture. We also show that our approach is resilient to misalignment issues caused by imperfect estimation of warping fields and inaccurate camera parameters.
