42-Issue 7
Browsing 42-Issue 7 by Issue Date
Now showing 1 - 20 of 57
Item: Error-bounded Image Triangulation (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Fang, Zhi-Duo; Guo, Jia-Peng; Xiao, Yanyang; Fu, Xiao-Ming; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
We propose a novel image triangulation method that reduces the complexity of image triangulation under a color error-bounded constraint and a triangle quality constraint. At the same time, we realize a variety of visual effects by supporting different types of triangles (e.g., linear or curved) and color approximation functions (e.g., constant, linear, or quadratic). To accommodate these discontinuous and combinatorial objectives and constraints, we formulate triangulation as a constrained optimization problem solved by a series of tailored local remeshing operations. The feasibility and practicability of our method are demonstrated on various types of images, such as organisms, landscapes, portraits, and cartoons. Compared to state-of-the-art methods, our method generates far fewer triangles for the same color error, or much smaller color errors using the same number of triangles.

Item: Interactive Authoring of Terrain using Diffusion Models (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Lochner, Joshua; Gain, James; Perche, Simon; Peytavie, Adrien; Galin, Eric; Guérin, Eric; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
Generating heightfield terrains is a necessary precursor to depicting computer-generated natural scenes in a variety of applications. Authoring such terrains is made challenging by the need for interactive feedback, effective user control, and perceptually realistic output encompassing a range of landforms. We address these challenges by developing a terrain-authoring framework underpinned by an adaptation of diffusion models for conditional image synthesis, trained on real-world elevation data.
This framework supports automated cleaning of the training set; authoring control through style selection and feature sketches; the ability to import and freely edit pre-existing terrains; and resolution amplification up to the limits of the source data. Our framework improves on previous machine-learning approaches by expanding landform variety beyond mountainous terrain to encompass cliffs, canyons, and plains; providing a better balance between terseness and specificity in user control; and improving the fidelity of global terrain structure and perceptual realism. This is demonstrated through drainage simulations and a user study testing perceived realism for different classes of terrain. The full source code, Blender add-on, and pretrained models are available.

Item: Data-Driven Ink Painting Brushstroke Rendering (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Madono, Koki; Simo-Serra, Edgar; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
Although digital painting has advanced greatly in recent years, there is still a significant divide between physically drawn paintings and purely digitally drawn ones. These differences arise from the physical interactions between brush, ink, and paper, which are hard to emulate in the digital domain. Most ink painting approaches have relied on either heuristics or physical simulation to attempt to bridge the gap between digital and analog; however, these approaches are still unable to capture the diversity of painting effects, such as ink fading or blotting, found in the real world. In this work, we propose a data-driven approach to generate ink paintings based on a semi-automatically collected high-quality real-world ink painting dataset. We use a multi-camera robot-based setup to automatically create a diversity of ink paintings, which allows the entire process to be captured in high resolution, including detailed brush motions and drawing results.
To ensure high-quality capture of the painting process, we calibrate the setup and perform occlusion-aware blending to capture all strokes in high resolution in a robust and efficient way. Using our new dataset, we propose a recursive deep-learning-based model that reproduces ink paintings stroke by stroke while capturing complex ink effects such as bleeding and mixing. Our results corroborate the fidelity of the proposed approach to real hand-drawn ink paintings in comparison with existing approaches. We hope the availability of our dataset will encourage new research on realistic digital ink painting techniques.

Item: An Efficient Self-supporting Infill Structure for Computational Fabrication (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Wang, Shengfa; Liu, Zheng; Hu, Jiangbei; Lei, Na; Luo, Zhongxuan; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
Efficiently optimizing the internal structure of 3D-printed models is a critical focus in industrial manufacturing, particularly when designing self-supporting structures that offer high stiffness and light weight. To tackle this challenge, this research introduces a novel self-supporting polyhedral structure and an efficient optimization algorithm. Specifically, the internal space of the model is filled with a combination of self-supporting octahedra and tetrahedra, strategically arranged to maximize structural integrity. Our algorithm optimizes the wall thickness of the polyhedral elements to satisfy specific stiffness requirements, while ensuring efficient alignment of the filled structures in finite element calculations. Our approach results in a considerable decrease in optimization time. The optimization process is stable, converges rapidly, and consistently delivers effective results.
Through a series of experiments, we demonstrate the effectiveness and efficiency of our method in achieving the desired design objectives.

Item: OptCtrlPoints: Finding the Optimal Control Points for Biharmonic 3D Shape Deformation (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Kim, Kunho; Uy, Mikaela Angelina; Paschalidou, Despoina; Jacobson, Alec; Guibas, Leonidas J.; Sung, Minhyuk; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
We propose OptCtrlPoints, a data-driven framework designed to identify the optimal sparse set of control points for reproducing target shapes using biharmonic 3D shape deformation. Control-point-based 3D deformation methods are widely used for interactive shape editing, and their usability is enhanced when the control points are sparse yet strategically distributed across the shape. With this objective in mind, we introduce a data-driven approach that determines the most suitable set of control points, assuming a given set of possible shape variations. The challenges of this task stem primarily from the computationally demanding nature of the problem. Two main factors contribute to this complexity: solving a large linear system for the biharmonic weight computation, and the combinatorial problem of finding the optimal subset of mesh vertices. To overcome these challenges, we propose a reformulation of the biharmonic computation that reduces the matrix size, making it dependent on the number of control points rather than the number of vertices. Additionally, we present an efficient search algorithm that significantly reduces the time complexity while still delivering a nearly optimal solution. Experiments on the SMPL, SMAL, and DeformingThings4D datasets demonstrate the efficacy of our method. Our control points achieve a better template-to-target fit than FPS, random search, and neural-network-based prediction.
We also highlight the significant reduction in computation time, from days to approximately 3 minutes.

Item: Semantics-guided Generative Diffusion Model with a 3DMM Model Condition for Face Swapping (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Liu, Xiyao; Liu, Yang; Zheng, Yuhao; Yang, Ting; Zhang, Jian; Wang, Victoria; Fang, Hui; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
Face swapping is a technique that replaces a face in a target medium with another face of a different identity from a source face image. Currently, research on the effective utilisation of prior knowledge and semantic guidance for photo-realistic face swapping remains limited, despite the impressive synthesis quality achieved by recent generative models. In this paper, we propose a novel conditional Denoising Diffusion Probabilistic Model (DDPM) enforced by two-level face prior guidance. Specifically, it includes (i) an image-level condition generated by a 3D Morphable Model (3DMM), and (ii) high-semantic-level guidance driven by information extracted from several pre-trained attribute classifiers, for high-quality face image synthesis. Although the swapped face image from the 3DMM does not achieve photo-realistic quality on its own, it provides a strong image-level prior, in parallel with high-level face semantics, to guide the DDPM toward high-fidelity image generation.
The experimental results demonstrate that our method outperforms state-of-the-art face swapping methods on benchmark datasets in terms of synthesis quality, preservation of target face attributes, and transfer of the source face identity.

Item: Deep Shape and SVBRDF Estimation using Smartphone Multi-lens Imaging (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Fan, Chongrui; Lin, Yiming; Ghosh, Abhijeet; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
We present a deep neural-network-based method that acquires high-quality shape and spatially varying reflectance of 3D objects using smartphone multi-lens imaging. Our method acquires two images simultaneously, using a smartphone's zoom lens and wide-angle lens, under either natural illumination or phone-flash conditions, effectively functioning as a single-shot method. Unlike traditional multi-view stereo methods, which require sufficient differences in viewpoint and only estimate depth at a coarse scale, our method estimates fine-scale depth by utilising an optical-flow field extracted from the subtle baseline and perspective differences between the two simultaneously captured images, which arise from their different optics.
We further guide the SVBRDF estimation using the estimated depth, yielding superior results compared to existing single-shot methods.

Item: Multi-Level Implicit Function for Detailed Human Reconstruction by Relaxing SMPL Constraints (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Ma, Xikai; Zhao, Jieyu; Teng, Yiqing; Yao, Li; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
Aiming to enhance the rationality and robustness of single-view image-based human reconstruction and to acquire richer surface details, we propose a multi-level reconstruction framework based on implicit functions. This framework first uses the predicted SMPL model (Skinned Multi-Person Linear Model) as a prior to predict consistent 2.5D sketches (a depth map and a normal map), and then obtains a coarse reconstruction through an Implicit Function fitting network (IF-Net). Subsequently, with a pixel-aligned feature extraction module and a fine IF-Net, the strong constraints imposed by SMPL are relaxed to add more surface details to the reconstruction and remove noise. Finally, to address the trade-off between surface details and rationality under complex poses, we propose a novel fusion repair algorithm that reuses existing information: it compensates for the missing parts of the fine reconstruction with the coarse reconstruction, leading to a robust, rational, and richly detailed result. Our experiments prove the effectiveness of the method and demonstrate that it achieves the richest surface details while ensuring rationality.
The project website can be found at https://github.com/MXKKK/2.5D-MLIF.

Item: Enhancing Low-Light Images: A Variation-based Retinex with Modified Bilateral Total Variation and Tensor Sparse Coding (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Yang, Weipeng; Gao, Hongxia; Zou, Wenbin; Huang, Shasha; Chen, Hongsheng; Ma, Jianliang; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
Low-light conditions often result in significant noise and artifacts in captured images, which can be further exacerbated during enhancement, degrading visual quality. This paper presents an effective low-light image enhancement model, based on the variation-based Retinex model, that suppresses noise and artifacts while preserving image details. To achieve this, we propose a modified Bilateral Total Variation to better smooth out fine textures in the illuminance component while maintaining weak structures. Additionally, tensor sparse coding is employed as a regularization term to remove noise and artifacts from the reflectance component. Experimental results on extensive and challenging datasets demonstrate the effectiveness of the proposed method, which exhibits superior or comparable performance relative to state-of-the-art approaches. Code, dataset, and experimental results are available at https://github.com/YangWeipengscut/BTRetinex.

Item: Fabricatable 90° Pop-ups: Interactive Transformation of a 3D Model into a Pop-up Structure (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Fujikawa, Junpei; Ijiri, Takashi; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
Ninety-degree pop-ups are a type of papercraft in which a three-dimensional (3D) structure pops up when the angle of the base fold is 90°. They are fabricated by cutting and creasing a single sheet of paper. Because they are made of paper, traditional 90° pop-ups are limited to 3D shapes composed only of planar pieces.
In this paper, we present fabricatable 90° pop-ups: novel pop-ups that employ the 90° pop-up mechanism, consist of components with curved shapes, and can be fabricated with a 3D printer. We propose a method for converting a 3D model into a fabricatable 90° pop-up: the user first interactively designs a layout of pop-up components, and the system automatically deforms the components using the 3D model. Because the generated pop-ups contain the necessary cuts and folds, no additional assembly is required. To demonstrate the feasibility of the proposed method, we designed and fabricated various 90° pop-ups using a 3D printer.

Item: Neural Shading Fields for Efficient Facial Inverse Rendering (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Rainer, Gilles; Bridgeman, Lewis; Ghosh, Abhijeet; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
Given a set of unstructured photographs of a subject under unknown lighting, 3D geometry reconstruction is relatively easy, but reflectance estimation remains a challenge because it requires disentangling lighting from reflectance in ambiguous observations. Solutions exist that leverage statistical, data-driven priors to output plausible reflectance maps even in the underconstrained single-view, unknown-lighting setting. We propose a very low-cost inverse optimization method that does not rely on data-driven priors, obtaining high-quality diffuse and specular albedo and normal maps in the multi-view, unknown-lighting setting. We introduce compact neural networks that learn the shading of a given scene by efficiently finding correlations in appearance across the face. We jointly optimize the implicit global illumination of the scene in the networks with explicit diffuse and specular reflectance maps that can subsequently be used for physically based rendering.
We analyze the veracity of results on ground-truth data, and demonstrate that our reflectance maps preserve more detail and personal identity than state-of-the-art deep learning and differentiable rendering methods.

Item: Structure Learning for 3D Point Cloud Generation from Single RGB Images (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Charrada, Tarek Ben; Laga, Hamid; Tabia, Hedi; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
3D point clouds can represent complex 3D objects of arbitrary topology with fine-grained details. They are, however, hard to regress from images using convolutional neural networks, making tasks such as 3D reconstruction from monocular RGB images challenging. In fact, unlike images and volumetric grids, point clouds are unstructured and thus lack proper parameterization, which makes them difficult to process with convolutional operations. Existing point-based 3D reconstruction methods that address this problem rely on complex end-to-end architectures with high computational costs. Instead, we propose a novel mechanism that decouples the 3D reconstruction problem from the structure (or parameterization) learning task, making the 3D reconstruction of objects of arbitrary topology tractable and thus easier to learn. We achieve this using a novel Teacher-Student network, where the Teacher learns to structure the point clouds and the Student then harnesses the knowledge learned by the Teacher to efficiently regress accurate 3D point clouds. We train the Teacher network with 3D ground-truth supervision and the Student network with the Teacher's annotations. Finally, we employ a novel refinement network to overcome the upper-bound performance set by the Teacher network.
Our extensive experiments on the ShapeNet and Pix3D benchmarks, and on in-the-wild images, demonstrate that the proposed approach outperforms previous methods in reconstruction accuracy and visual quality.

Item: Dissection Puzzles Composed of Multicolor Polyominoes (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Kita, Naoki; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
Dissection puzzles leverage geometric dissections, wherein a set of puzzle pieces can be reassembled in various configurations to yield distinct geometric figures. Mathematically, a dissection between two 2D polygons can always be established; consequently, researchers and puzzle enthusiasts strive to design dissection puzzles using the fewest pieces feasible. In this study, we introduce novel dissection puzzles crafted from multi-colored polyominoes. Diverging from the traditional aim of establishing a geometric dissection between two 2D polygons with the minimal piece count, we seek a common pool of polyomino pieces with colored faces that can be configured into multiple distinct shapes and appearances. Moreover, we offer a method to identify an optimized sequence for rearranging pieces from one form to another that minimizes the total relocation distance. This approach can guide users in puzzle assembly and lessen their physical exertion when manually reconfiguring pieces; it could also decrease power consumption when pieces are reorganized with robotic assistance. We showcase the efficacy of our approach through a wide range of shapes and appearances.

Item: Robust Distribution-aware Color Correction for Single-shot Images (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Dhillon, Daljit Singh J.; Joshi, Parisha; Baron, Jessica; Patterson, Eric K.; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
Color correction for photographed images is an ill-posed problem.
It is also a crucial initial step towards material acquisition for inverse rendering methods and pipelines. Several state-of-the-art methods devise or optimize their solutions by reducing color differences for imaged reference color-chart blocks of known color values. In this paper, we first establish through simulations the limitation of this minimality criterion, which in principle results in overfitting. Next, we study and propose several spatial distribution measures to augment the evaluation criteria. We then propose a novel patch-based, white-point-centric approach that processes luminance and chrominance information separately to improve the color matching task. We compare our method qualitatively with several state-of-the-art methods using our augmented evaluation criteria, along with quantitative examinations. Finally, we perform rigorous experiments whose results clearly establish the benefits of our proposed method.

Item: Meso-Skeleton Guided Hexahedral Mesh Design (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Viville, Paul; Kraemer, Pierre; Bechmann, Dominique; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
We present a novel approach for generating hexahedral meshes in a volume domain given its meso-skeleton. This compact representation of the topology and geometry, composed of both curve and surface parts, is used to produce a raw decomposition of the domain into hexahedral blocks. Analysis of the different local configurations of the skeleton leads to the construction of a set of connection surfaces that serve as a scaffold onto which the hexahedral blocks are assembled. These local configurations completely determine the singularities of the final mesh, and by following the skeleton, the geometry of the produced mesh naturally follows the geometry of the domain.
Depending on the end user's needs, the obtained mesh can be further adapted, refined, or optimized, for example to better fit the boundary of the domain. Our algorithm does not involve solving any global problem; most decisions are taken locally, making it highly suitable for parallel processing. This efficiency allows the user to stay in the loop to correct or edit the meso-skeleton, a first sketch of which can be provided by an existing automatic extraction algorithm.

Item: Palette-Based and Harmony-Guided Colorization for Vector Icons (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Lin, Miao; Shen, I-Chao; Chin, Hsiao-Yuan; Chen, Ruo-Xi; Chen, Bing-Yu; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
Colorizing icons is a challenging task, even for skillful artists, as it involves balancing aesthetics and practical considerations. Prior works have primarily focused on colorizing pixel-based icons, which do not integrate seamlessly into the current vector-based icon design workflow. In this paper, we propose a palette-based colorization algorithm for vector icons that requires no rasterization. Our algorithm takes a vector icon and a five-color palette as input and generates various colorized results for designers to choose from. Inspired by the common icon design workflow, our algorithm consists of two steps: generating a colorization template and performing a palette-based color transfer. To generate the colorization templates, we introduce a novel vector icon colorization model that employs an MRF-based loss and a color harmony loss; the latter encourages the resulting color template to align with widely used harmony templates. We then map the predicted colorization template to chroma-like palette colors to obtain diverse colorization results.
We compare our results with those generated by previous pixel-based icon colorization methods, and validate the effectiveness of our algorithm through both qualitative and quantitative evaluations. Our method enables icon designers to explore diverse colorization results for a single icon using different color palettes, while also efficiently evaluating the suitability of a color palette for a set of icons.

Item: A Perceptual Shape Loss for Monocular 3D Face Reconstruction (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Otto, Christopher; Chandran, Prashanth; Zoss, Gaspard; Gross, Markus; Gotardo, Paulo; Bradley, Derek; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
Monocular 3D face reconstruction is a widespread topic, and existing approaches tackle the problem either through fast neural-network inference or offline iterative reconstruction of face geometry. In either case, carefully designed energy functions are minimized, commonly including loss terms such as a photometric loss and a landmark reprojection loss. In this work, we propose a new loss function for monocular face capture, inspired by how humans would perceive the quality of a 3D face reconstruction given a particular image. It is widely known that shading provides a strong indicator of 3D shape in the human visual system. As such, our new 'perceptual' shape loss aims to judge the quality of a 3D face estimate using only shading cues. Our loss is implemented as a discriminator-style neural network that takes an input face image and a shaded render of the geometry estimate, and predicts a score that perceptually evaluates how well the shaded render matches the given image. This 'critic' network operates on the RGB image and geometry render alone, without requiring an estimate of the albedo or illumination in the scene. Furthermore, our loss operates entirely in image space and is thus agnostic to mesh topology.
We show how our new perceptual shape loss can be combined with traditional energy terms for monocular 3D face optimization and deep neural-network regression, improving upon current state-of-the-art results.

Item: A Differential Diffusion Theory for Participating Media (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Cen, Yunchi; Li, Chen; Li, Frederick W. B.; Yang, Bailin; Liang, Xiaohui; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
We present a novel approach to differentiable rendering for participating media, addressing the challenge of computing scene-parameter derivatives. While existing methods focus on derivative computation within volumetric path tracing, they fail to significantly improve computational performance due to the expensive computation of multiply scattered light. To overcome this limitation, we propose a differential diffusion theory inspired by the classical diffusion equation. Our theory enables real-time computation of arbitrary derivatives, such as those with respect to optical absorption, scattering coefficients, and anisotropy parameters of phase functions. By solving for derivatives through the differential form of the diffusion equation, our approach achieves remarkable speed gains over Monte Carlo methods. This marks the first differentiable rendering framework to compute scene-parameter derivatives based on the diffusion approximation. Additionally, we derive the discrete form of the diffusion equation's derivatives, facilitating efficient numerical solutions. Our experimental results on synthetic and realistic images demonstrate accurate and efficient estimation of arbitrary scene-parameter derivatives.
Our work represents a significant advancement in differentiable rendering for participating media, offering a practical and efficient way to compute derivatives while addressing the limitations of existing approaches.

Item: Multi-Modal Face Stylization with a Generative Prior (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Li, Mengtian; Dong, Yi; Lin, Minxuan; Huang, Haibin; Wan, Pengfei; Ma, Chongyang; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
In this work, we introduce a new approach to face stylization. Although existing methods achieve impressive results on this task, there is still room for improvement in generating high-quality artistic faces with diverse styles and accurate facial reconstruction. Our proposed framework, MMFS, supports multi-modal face stylization by leveraging the strengths of StyleGAN, integrating it into an encoder-decoder architecture. Specifically, we use the mid-resolution and high-resolution layers of StyleGAN as the decoder to generate high-quality faces, while aligning its low-resolution layer with the encoder to extract and preserve input facial details. We also introduce a two-stage training strategy: in the first stage, we train the encoder to align its feature maps with StyleGAN and enable faithful reconstruction of input faces; in the second stage, the entire network is fine-tuned on artistic data for stylized face generation. To enable the fine-tuned model to be applied to zero-shot and one-shot stylization tasks, we train an additional mapping network from the large-scale Contrastive Language-Image Pre-training (CLIP) space to the latent w+ space of the fine-tuned StyleGAN.
Qualitative and quantitative experiments show that our framework achieves superior performance in both one-shot and zero-shot face stylization, outperforming state-of-the-art methods by a large margin.

Item: Integrating High-Level Features for Consistent Palette-based Multi-image Recoloring (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Authors: Xue, Danna; Corral, Javier Vazquez; Herranz, Luis; Zhang, Yanning; Brown, Michael S.; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
Achieving visually consistent colors across multiple images is important when images are used in photo albums, websites, and brochures. Unfortunately, only a handful of methods address multi-image color consistency, compared to the many one-to-one color transfer techniques. Furthermore, existing methods do not incorporate high-level features that can assist graphic designers in their work. To address these limitations, we introduce a framework that builds upon a previous palette-based color consistency method and incorporates three high-level features: white balance, saliency, and color naming. We show how these features overcome the limitations of the prior multi-consistency workflow, and showcase the user-friendly nature of our framework.