Volume 41 (2022)
Browsing Volume 41 (2022) by Issue Date, showing items 1-20 of 267.
Item: A Stereo Matching Algorithm for High‐Precision Guidance in a Weakly Textured Industrial Robot Environment Dominated by Planar Facets (© 2022 Eurographics ‐ The European Association for Computer Graphics and John Wiley & Sons Ltd, 2022)
Wei, Hui; Meng, Lingjiang; Hauser, Helwig and Alliez, Pierre
Although many algorithms perform very well on certain datasets, existing stereo matching algorithms still fail to obtain high‐precision disparity images in practical robotic applications with weakly textured or untextured objects. This greatly limits the application of binocular vision for robotic arm guidance. Traditional stereo matching algorithms suffer from disparity loss, dilation and other problems, while deep learning algorithms have weak generalization ability, making high‐accuracy results impossible on non‐training images. We propose an algorithm that uses segments and edges as matching units. We find the mapping relationship between two‐dimensional images and three‐dimensional scenes using segments. The algorithm obtains highly accurate results in industrial robotic applications dominated by planar facets. We combine it with a deep learning algorithm to obtain high‐accuracy results in both general scenes and industrial robot applications. The algorithm effectively improves the non‐linear optimization ability of traditional algorithms and the generalization ability of deep learning, and provides an effective method for the binocular vision guidance of industrial robotic scenes. We used the algorithm to guide a robot arm for threading, with a success rate of 70%.

Item: A Second-Order Explicit Pressure Projection Method for Eulerian Fluid Simulation (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Jiang, Junwei; Shen, Xiangda; Gong, Yuning; Fan, Zeng; Liu, Yanli; Xing, Guanyu; Ren, Xiaohua; Zhang, Yanci; Dominik L. Michels; Soeren Pirk
In this paper, we propose a novel second-order explicit midpoint method to address the issue of energy loss and vorticity dissipation in Eulerian fluid simulation. The basic idea is to explicitly compute the pressure gradient at the middle of each time step and apply it to the velocity field after advection. Theoretically, our solver can achieve higher accuracy than first-order solvers at similar computational cost. At the same time, our method is twice as fast as, or even faster than, implicit second-order solvers, at the cost of a small loss of accuracy. We have carried out a large number of 2D and 3D numerical experiments to verify the effectiveness and applicability of our algorithm.
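To make the midpoint scheme above concrete, here is a minimal structural sketch in Python. The advection and pressure-solve routines are hypothetical placeholders (any semi-Lagrangian advector and Poisson-based projection could stand in); only the time-stepping structure follows the abstract, and this is not the authors' implementation.

```python
def advect(u, dt):
    """Hypothetical placeholder: advect the velocity field u over dt
    (e.g., with a semi-Lagrangian scheme)."""
    raise NotImplementedError

def pressure_gradient(u):
    """Hypothetical placeholder: solve the pressure Poisson equation for
    the velocity field u and return the resulting pressure gradient."""
    raise NotImplementedError

def step_midpoint(u, dt):
    # Advect to the middle of the time step and evaluate the pressure
    # gradient there: the explicit second-order midpoint estimate.
    u_mid = advect(u, 0.5 * dt)
    grad_p_mid = pressure_gradient(u_mid)
    # Take the full advection step, then apply the mid-time pressure
    # gradient to the advected velocity, as described in the abstract.
    return advect(u, dt) - dt * grad_p_mid
```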
Item: Modelling Surround‐aware Contrast Sensitivity for HDR Displays (© 2022 Eurographics ‐ The European Association for Computer Graphics and John Wiley & Sons Ltd, 2022)
Yi, Shinyoung; Jeon, Daniel S.; Serrano, Ana; Jeong, Se‐Yoon; Kim, Hui‐Yong; Gutierrez, Diego; Kim, Min H.; Hauser, Helwig and Alliez, Pierre
Despite advances in display technology, many existing applications rely on psychophysical datasets of human perception gathered using older, sometimes outdated displays. As a result, there exists the underlying assumption that such measurements can be carried over to the new viewing conditions of more modern technology. We have conducted a series of psychophysical experiments to explore contrast sensitivity using a state‐of‐the‐art HDR display, taking into account not only the spatial frequency and luminance of the stimuli but also their surrounding luminance levels. From our data, we have derived a novel surround‐aware contrast sensitivity function (CSF), which predicts human contrast sensitivity more accurately. We additionally provide a practical version that retains the benefits of our full model while enabling easy backward compatibility and consistently producing good results across the many existing applications that make use of CSF models. We show examples of effective HDR video compression using a transfer function derived from our CSF, as well as tone mapping and improved accuracy in visual difference prediction.

Item: Shape Transformers: Topology-Independent 3D Shape Models Using Transformers (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Chandran, Prashanth; Zoss, Gaspard; Gross, Markus; Gotardo, Paulo; Bradley, Derek; Chaine, Raphaëlle; Kim, Min H.
Parametric 3D shape models are heavily utilized in computer graphics and vision applications to provide priors on the observed variability of an object's geometry (e.g., for faces). Original models were linear and operated on the entire shape at once. They were later enhanced to provide localized control on different shape parts separately. In deep shape models, nonlinearity was introduced via a sequence of fully-connected layers and activation functions, and locality was introduced in recent models that use mesh convolution networks. As common limitations, these models often dictate, in one way or another, the allowed extent of spatial correlations, and also require that a fixed mesh topology be specified ahead of time. To overcome these limitations, we present Shape Transformers, a new nonlinear parametric 3D shape model based on transformer architectures. A key benefit of this new model comes from using the transformer's self-attention mechanism to automatically learn nonlinear spatial correlations for a class of 3D shapes. This is in contrast to global models that correlate everything and local models that dictate the correlation extent. Our transformer 3D shape autoencoder is a better alternative to mesh convolution models, which require specially-crafted convolution and down/up-sampling operators that can be difficult to design. Our model is also topologically independent: it can be trained once and then evaluated on any mesh topology, unlike most previous methods. We demonstrate the application of our model to different datasets, including 3D faces, 3D hand shapes and full human bodies. Our experiments demonstrate the strong potential of our Shape Transformer model in several applications in computer graphics and vision.

Item: LMFingerprints: Visual Explanations of Language Model Embedding Spaces through Layerwise Contextualization Scores (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Sevastjanova, Rita; Kalouli, Aikaterini-Lida; Beck, Christin; Hauptmann, Hanna; El-Assady, Mennatallah; Borgo, Rita; Marai, G. Elisabeta; Schreck, Tobias
Language models, such as BERT, construct multiple, contextualized embeddings for each word occurrence in a corpus. Understanding how the contextualization propagates through the model's layers is crucial for deciding which layers to use for a specific analysis task. Currently, most embedding spaces are explained by probing classifiers; however, some findings remain inconclusive. In this paper, we present LMFingerprints, a novel scoring-based technique for the explanation of contextualized word embeddings. We introduce two categories of scoring functions, which measure (1) the degree of contextualization, i.e., the layerwise changes in the embedding vectors, and (2) the type of contextualization, i.e., the captured context information. We integrate these scores into an interactive explanation workspace. By combining visual and verbal elements, we provide an overview of contextualization in six popular transformer-based language models. We evaluate hypotheses from the domain of computational linguistics, and our results not only confirm findings from related work but also reveal new aspects of the information captured in the embedding spaces. For instance, we show that while numbers are poorly contextualized, stopwords have unexpectedly high contextualization in the models' upper layers, where their neighborhoods shift from tokens of similar functionality to tokens that contribute to the meaning of the surrounding sentences.
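The "degree of contextualization" category lends itself to a compact illustration. The sketch below reflects our reading of the idea rather than the paper's code: it measures the layerwise change of each token's embedding in BERT as one minus the cosine similarity between consecutive hidden states.

```python
# Layerwise contextualization sketch: how much does each token's embedding
# change from one BERT layer to the next? (Illustrative, not the paper's code.)
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

inputs = tok("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).hidden_states  # embeddings + one tensor per layer

for layer in range(1, len(hidden)):
    prev, curr = hidden[layer - 1][0], hidden[layer][0]  # (tokens, dim)
    change = 1.0 - torch.cosine_similarity(prev, curr, dim=-1)
    print(f"layer {layer:2d}: mean embedding change = {change.mean().item():.3f}")
```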
Item: Semi-MoreGAN: Semi-supervised Generative Adversarial Network for Mixture of Rain Removal (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Shen, Yiyang; Wang, Yongzhen; Wei, Mingqiang; Chen, Honghua; Xie, Haoran; Cheng, Gary; Wang, Fu Lee; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
Real-world rain is a mixture of rain streaks and rainy haze. However, current efforts formulate rain streak removal and rainy haze removal as separate models, worsening the loss of image details. This paper attempts to solve the mixture-of-rain removal problem in a single model by estimating the scene depths of images. To this end, we propose a novel SEMI-supervised Mixture Of rain REmoval Generative Adversarial Network (Semi-MoreGAN). Unlike most existing methods, Semi-MoreGAN is a joint learning paradigm for mixture-of-rain removal and depth estimation, and it effectively integrates image features with depth information for better rain removal. Furthermore, it leverages unpaired real-world rainy and clean images to bridge the gap between synthetic and real-world rain. Extensive experiments show clear improvements of our approach over twenty representative state-of-the-art methods on both synthetic and real-world rainy images. Source code is available at https://github.com/syy-whu/Semi-MoreGAN.

Item: Voice2Face: Audio-driven Facial and Tongue Rig Animations with cVAEs (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Villanueva Aylagas, Monica; Anadon Leon, Hector; Teye, Mattias; Tollmar, Konrad; Dominik L. Michels; Soeren Pirk
We present Voice2Face: a deep learning model that generates face and tongue animations directly from recorded speech. Our approach consists of two steps: a conditional variational autoencoder generates mesh animations from speech, while a separate module maps the animations to rig controller space. Our contributions include an automated method for speech style control, a method to train a model with data from multiple quality levels, and a method for animating the tongue. Unlike previous works, our model generates animations without speaker-dependent characteristics while allowing speech style control. We demonstrate through a user study that Voice2Face significantly outperforms a comparative state-of-the-art model in terms of perceived animation quality, and our quantitative evaluation suggests that Voice2Face yields more accurate lip closure in speech with bilabials through our speech style optimization. Both evaluations also show that our data quality conditioning scheme outperforms both an unconditioned model and a model trained with a smaller high-quality dataset. Finally, the user study shows a preference for animations that include the tongue. Results from our model can be seen at https://go.ea.com/voice2face.

Item: Cognitive Model of Agent Exploration with Vision and Signage Understanding (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Johnson, Colin; Haworth, Brandon; Dominik L. Michels; Soeren Pirk
Signage systems play an essential role in ensuring safe, stress-free, and efficient navigation for the occupants of indoor spaces. Crowd simulations with sufficiently realistic virtual humans provide a convenient and cost-effective approach to evaluating and optimizing signage systems. In this work, we develop an agent model which makes use of image processing on parametric saliency maps to visually identify signage and distractions in the agent's field of view. Information from identified signs is incorporated into a grid-based representation of wayfinding familiarity, which is used to guide informed exploration of the agent's environment using a modified A* algorithm. In areas of low wayfinding familiarity, the agent follows a random exploration behaviour based on sampling a grid of previously observed locations for heuristic values based on space syntax isovist measures. The resulting agent design is evaluated in a variety of test environments and is found to reliably navigate towards a goal location using a combination of signage and random exploration.
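As a sketch of how a familiarity grid can steer goal-directed search, the following modified A* weights step costs by unfamiliarity, so the agent prefers routes through areas it has learned about from signage. The specific cost weighting is an illustrative assumption, not the paper's exact formulation.

```python
# Familiarity-biased A* sketch: unfamiliar cells cost more to traverse.
import heapq

def familiarity_astar(walkable, familiarity, start, goal):
    """walkable: set of (x, y) cells; familiarity: dict cell -> [0, 1]."""
    def h(c):  # Manhattan-distance heuristic to the goal
        return abs(c[0] - goal[0]) + abs(c[1] - goal[1])

    frontier = [(h(start), 0.0, start, [start])]
    best = {}
    while frontier:
        _, g, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return path
        if best.get(cell, float("inf")) <= g:
            continue
        best[cell] = g
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dx, cell[1] + dy)
            if nxt not in walkable:
                continue
            # Steps through unfamiliar territory are penalized (assumed form).
            step = 1.0 + 2.0 * (1.0 - familiarity.get(nxt, 0.0))
            heapq.heappush(frontier, (g + step + h(nxt), g + step, nxt, path + [nxt]))
    return None

cells = {(x, y) for x in range(8) for y in range(8)}
fam = {(x, 4): 1.0 for x in range(8)}  # a corridor the agent knows from signage
print(familiarity_astar(cells, fam, (0, 0), (7, 7)))
```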
Item: Rational Bézier Guarding (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Khanteimouri, Payam; Mandad, Manish; Campen, Marcel; Campen, Marcel; Spagnuolo, Michela
We present a reliable method to generate planar meshes of nonlinear rational triangular elements. The elements are guaranteed to be valid, i.e. defined by injective rational functions. The mesh is guaranteed to conform exactly, without geometric error, to arbitrary rational domain boundary and feature curves. The method generalizes the recent Bézier Guarding technique, which is applicable only to polynomial curves and elements. This generalization enables the accurate handling of practically important cases involving, for instance, circular or elliptic arcs and NURBS curves, which cannot be matched by polynomial elements. Furthermore, although many practical scenarios are concerned with rational functions of quadratic and cubic degree only, our method is fully general and supports arbitrary degree. We demonstrate the method on a variety of test cases.
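For readers unfamiliar with rational elements, the essential difference from the polynomial case is the weighted basis: a degree-n rational Bézier curve is C(t) = (sum_i w_i b_i B_{i,n}(t)) / (sum_i w_i B_{i,n}(t)), which can reproduce circular arcs exactly, something no polynomial element can do. Below is a minimal generic evaluator of our own; the paper's elements are bivariate (triangular) and come with injectivity guarantees on top.

```python
# Evaluate a rational Bézier curve of arbitrary degree at parameter t.
from math import comb
import numpy as np

def rational_bezier(control, weights, t):
    n = len(control) - 1
    bernstein = np.array([comb(n, i) * t**i * (1 - t)**(n - i) for i in range(n + 1)])
    wb = weights * bernstein          # weighted Bernstein basis
    return (wb @ control) / wb.sum()  # projective (rational) combination

# A quarter of the unit circle as a degree-2 rational curve.
pts = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
w = np.array([1.0, 1.0 / np.sqrt(2.0), 1.0])
p = rational_bezier(pts, w, 0.5)
print(p, np.linalg.norm(p))  # ~ [0.707, 0.707], norm 1: exactly on the circle
```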
Item: Leveraging Analysis History for Improved In Situ Visualization Recommendation (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Epperson, Will; Lee, Doris Jung-Lin; Wang, Leijie; Agarwal, Kunal; Parameswaran, Aditya G.; Moritz, Dominik; Perer, Adam; Borgo, Rita; Marai, G. Elisabeta; Schreck, Tobias
Existing visualization recommendation systems commonly rely on a single snapshot of a dataset to suggest visualizations to users. However, exploratory data analysis involves a series of related interactions with a dataset over time rather than one-off analytical steps. We present Solas, a tool that tracks the history of a user's data analysis, models their interest in each column, and uses this information to provide visualization recommendations, all within the user's native analytical environment. Recommending with analysis history improves visualizations in three primary ways: task-specific visualizations use the provenance of data to provide sensible encodings for common analysis functions; aggregated history is used to rank visualizations by our model of a user's interest in each column; and column data types are inferred based on applied operations. We present a usage scenario and a user evaluation demonstrating how leveraging analysis history improves in situ visualization recommendations on real-world analysis tasks.

Item: Ref-ZSSR: Zero-Shot Single Image Superresolution with Reference Image (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Han, Xianjun; Wang, Xue; Wang, Huabin; Li, Xuejun; Yang, Hongyu; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
Single image superresolution (SISR) has achieved substantial progress based on deep learning. Many SISR methods acquire pairs of low-resolution (LR) images from their corresponding high-resolution (HR) counterparts. Being supervised, this kind of method also demands large-scale training data. However, these paired images and a large amount of training data are difficult to obtain. Recently, several internal learning-based methods have been introduced to address this issue. Although the need for large quantities of paired training data is removed, the ability to improve the image resolution is limited if only the information of the LR image itself is used. Therefore, we further extend this kind of approach by using similar HR reference images as prior knowledge to assist the single input image. In this paper, we propose zero-shot single image superresolution with a reference image (Ref-ZSSR). First, we use an unconditional generative model to learn the internal distribution of the HR reference image. Second, a dual-path architecture that contains a downsampler and an upsampler is introduced to learn the mapping between the input image and its downscaled image. Finally, we combine the reference image learning module and the dual-path architecture module to train a new generative model that can generate a superresolution (SR) image with the details of the HR reference image. Such a design encourages a simple and accurate way to transfer relevant textures from the high-definition (HD) reference image to the LR image. Compared with using only the image itself, the HD features of the reference image improve the SR performance. In our experiments, we show that the proposed method outperforms previous image-specific networks and internal learning-based methods.
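The dual-path component can be summarized in a few lines. The sketch below, with toy networks, is an assumption-level illustration rather than the released Ref-ZSSR code: it trains a downsampler and an upsampler on the single input image so that one path inverts the other, and it omits the reference-image learning module described above.

```python
# Zero-shot dual-path sketch: learn downscale/upscale mappings from one image.
import torch
import torch.nn as nn
import torch.nn.functional as F

def tiny_cnn(scale_up: bool) -> nn.Sequential:
    resample = nn.Upsample(scale_factor=2) if scale_up else nn.AvgPool2d(2)
    return nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         resample, nn.Conv2d(16, 3, 3, padding=1))

down, up = tiny_cnn(scale_up=False), tiny_cnn(scale_up=True)
opt = torch.optim.Adam(list(down.parameters()) + list(up.parameters()), lr=1e-4)

x = torch.rand(1, 3, 64, 64)    # the single input image (stand-in data)
for _ in range(100):            # zero-shot training on the image itself
    cycle = up(down(x))         # downscale, then upscale back
    loss = F.l1_loss(cycle, x)  # cycle-consistency between the two paths
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    sr = up(x)                  # apply the upsampler to x itself: 2x SR
```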
Item: Local Scale Adaptation to Hand Shape Model for Accurate and Robust Hand Tracking (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Kalshetti, Pratik; Chaudhuri, Parag; Dominik L. Michels; Soeren Pirk
The accuracy of hand tracking algorithms depends on how closely the geometry of the mesh model resembles the user's hand shape. Most existing methods rely on a learned shape space model; however, this fails to generalize to unseen hand shapes with significant deviations from the training set. We introduce local scale adaptation to augment this data-driven shape model and thus enable modeling hands of substantially different sizes. We also present a framework to calibrate our proposed hand shape model by registering it to depth data and achieve accurate and robust tracking. We demonstrate the capability of our proposed adaptive shape model over the most widely used existing hand model by registering it to subjects from different demographics. We also validate the accuracy and robustness of our tracking framework on challenging public hand datasets, where we improve over state-of-the-art methods. Our adaptive hand shape model and tracking framework offer a significant boost towards generalizing the accuracy of hand tracking.

Item: Synthesizing Get-Up Motions for Physics-based Characters (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Frezzato, Anthony; Tangri, Arsh; Andrews, Sheldon; Dominik L. Michels; Soeren Pirk
We propose a method for synthesizing get-up motions for physics-based humanoid characters. Beginning from a supine or prone state, our objective is not to imitate individual motion clips, but to produce motions that match input curves describing the style of the get-up motion. Our framework uses deep reinforcement learning to learn control policies for the physics-based character. A latent embedding of natural human poses is computed from a motion capture database, and the embedding is furthermore conditioned on the input features. We demonstrate that our approach can synthesize motions that follow the style of user-authored curves, as well as curves extracted from reference motions. In the latter case, motions of the physics-based character resemble the original motion clips. New motions can be synthesized easily by changing only a small number of controllable parameters. We also demonstrate the success of our controllers on rough and inclined terrain.

Item: Once-more Scattered Next Event Estimation for Volume Rendering (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Hanika, Johannes; Weidlich, Andrea; Droske, Marc; Ghosh, Abhijeet; Wei, Li-Yi
We present a Monte Carlo path tracing technique to sample extended next event estimation contributions in participating media: we consider one additional scattering vertex on the way to the next event, accounting for focused blur and resulting in visually interesting image features. Our technique is tailored to thin homogeneous media with strongly forward scattering phase functions, such as water or atmospheric haze. Previous methods put emphasis on sampling transmittances or geometric factors, and are either limited to isotropic scattering or use tabulation or polynomial approximation to account for some specific phase functions. We show how to jointly importance sample the product of an arbitrary phase function and the two reciprocal squared distance terms of the adjacent edges of the transport path, with analytic sampling in the solid angle domain. The technique is fast and simple to implement in an existing rendering system. Our estimator is designed specifically for forward scattering, so the new technique has to be combined with other estimators to cover the backward scattering contributions.
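For context, the standard analytic sampling routine for the Henyey-Greenstein phase function, the kind of strongly forward-scattering lobe (asymmetry g close to 1) this estimator targets, looks as follows. This is only the textbook phase-sampling ingredient, not the paper's joint product sampling with the reciprocal squared distance terms.

```python
# Textbook importance sampling of the Henyey-Greenstein phase function.
import numpy as np

def sample_hg(g, u1, u2):
    """Return a unit direction (z axis = incoming direction) distributed
    according to the Henyey-Greenstein phase function with asymmetry g."""
    if abs(g) < 1e-3:                    # near-isotropic: sample uniformly
        cos_t = 1.0 - 2.0 * u1
    else:
        s = (1.0 - g * g) / (1.0 - g + 2.0 * g * u1)
        cos_t = (1.0 + g * g - s * s) / (2.0 * g)
    sin_t = np.sqrt(max(0.0, 1.0 - cos_t * cos_t))
    phi = 2.0 * np.pi * u2
    return np.array([sin_t * np.cos(phi), sin_t * np.sin(phi), cos_t])

rng = np.random.default_rng(7)
dirs = np.array([sample_hg(0.9, *rng.random(2)) for _ in range(10000)])
print("mean scattering cosine:", dirs[:, 2].mean())  # ~ g = 0.9 for HG
```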
Item: Real-time Deep Radiance Reconstruction from Imperfect Caches (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Huang, Tao; Song, Yadong; Guo, Jie; Tao, Chengzhi; Zong, Zijing; Fu, Xihao; Li, Hongshan; Guo, Yanwen; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
Real-time global illumination is a highly desirable yet challenging task in computer graphics. Existing works that solve this problem well are mostly based on some kind of precomputed data (caches), and the final results depend significantly on the quality of the caches. In this paper, we propose a learning-based pipeline that can reproduce a wide range of complex light transport phenomena, including high-frequency glossy interreflection, at any viewpoint in real time (> 90 frames per second), using information from imperfect caches stored at the barycentre of every triangle in a 3D scene. These caches are generated at a precomputation stage by a physically-based offline renderer at a low sampling rate (e.g., 32 samples per pixel) and a low image resolution (e.g., 64×16). At runtime, a deep radiance reconstruction method based on a dedicated neural network is then employed to reconstruct a high-quality radiance map of full global illumination at any viewpoint from these imperfect caches, without introducing noise or aliasing artifacts. To further improve the reconstruction accuracy, a new feature fusion strategy is designed into the network to better exploit useful contents from cheap G-buffers generated at runtime. The proposed framework ensures high-quality rendering of images for moderate-sized scenes with full global illumination effects, at the cost of reasonable precomputation time. We demonstrate the effectiveness and efficiency of the proposed pipeline by comparing it with alternative strategies, including real-time path tracing and precomputed radiance transfer.

Item: Contrastive Semantic-Guided Image Smoothing Network (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Wang, Jie; Wang, Yongzhen; Feng, Yidan; Gong, Lina; Yan, Xuefeng; Xie, Haoran; Wang, Fu Lee; Wei, Mingqiang; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
Image smoothing is a fundamental low-level vision task that aims to preserve the salient structures of an image while removing insignificant details. Deep learning has been explored in image smoothing to deal with the complex entanglement of semantic structures and trivial details. However, current methods neglect two important facts in smoothing: 1) naive pixel-level regression supervised by the limited number of high-quality smoothing ground-truth images can lead to domain shift and cause generalization problems on real-world images; 2) texture appearance is closely related to object semantics, so image smoothing requires awareness of semantic differences to apply adaptive smoothing strengths. To address these issues, we propose a novel Contrastive Semantic-Guided Image Smoothing Network (CSGIS-Net) that combines both contrastive and semantic priors to facilitate robust image smoothing. The supervision signal is augmented by leveraging undesired smoothing effects as negative teachers, and by incorporating segmentation tasks to encourage semantic distinctiveness. To realize the proposed network, we also enrich the original VOC dataset with texture enhancement and smoothing labels, namely VOC-smooth, which is the first to bridge image smoothing and semantic segmentation. Extensive experiments demonstrate that the proposed CSGIS-Net outperforms state-of-the-art algorithms by a large margin. Code and dataset are available at https://github.com/wangjie6866/CSGIS-Net.

Item: SimilarityNet: A Deep Neural Network for Similarity Analysis Within Spatio-temporal Ensembles (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Huesmann, Karim; Linsen, Lars; Borgo, Rita; Marai, G. Elisabeta; Schreck, Tobias
Latent feature spaces of deep neural networks are frequently used to effectively capture the semantic characteristics of a given dataset. In the context of spatio-temporal ensemble data, the latent space represents a similarity space without the need for an explicit definition of a field similarity measure. Commonly, these networks are trained for specific data within a targeted application. We instead propose a general training strategy, in conjunction with a deep neural network architecture, that is readily applicable to any spatio-temporal ensemble data without re-training. The latent-space visualization allows for a comprehensive visual analysis of patterns and temporal evolution within the ensemble. With the use of SimilarityNet, we are able to perform similarity analyses on large-scale spatio-temporal ensembles in less than a second on commodity consumer hardware. We qualitatively compare our results to visualizations with established field similarity measures to document the interpretability of our latent-space visualizations, and show that they are feasible for an in-depth basic understanding of the underlying temporal evolution of a given ensemble.
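The latent-space workflow above boils down to: encode each ensemble member once, then compare members with plain vector distances, with no field similarity measure required. A toy sketch follows, with a hand-rolled stand-in for the trained network.

```python
# Latent-space similarity sketch; `encoder` is a stand-in for SimilarityNet.
import numpy as np

def encoder(field):
    """Placeholder for the trained network: maps a 2D field (64x64 here)
    to a latent vector via coarse average pooling."""
    h, w = field.shape
    return field.reshape(h // 8, 8, w // 8, 8).mean(axis=(1, 3)).ravel()

rng = np.random.default_rng(0)
# ensemble: member name -> time series of 2D fields (random stand-in data)
ensemble = {f"run{m}": [rng.random((64, 64)) for _ in range(5)] for m in range(3)}

latents = {m: np.stack([encoder(f) for f in fs]) for m, fs in ensemble.items()}
# Per-timestep similarity between two members, computed purely in latent space:
a, b = latents["run0"], latents["run1"]
cos = (a * b).sum(1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
print("per-timestep latent similarity:", np.round(cos, 3))
```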
Item: CorpusVis: Visual Analysis of Digital Sheet Music Collections (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Miller, Matthias; Rauscher, Julius; Keim, Daniel A.; El-Assady, Mennatallah; Borgo, Rita; Marai, G. Elisabeta; Schreck, Tobias
Manually investigating sheet music collections is challenging for music analysts due to the magnitude and complexity of the underlying features, structures, and contextual information. However, applying sophisticated algorithmic methods would require advanced technical expertise that analysts do not necessarily have. Bridging this gap, we contribute CorpusVis, an interactive visual workspace enabling scalable and multi-faceted analysis. Our proposed visual analytics dashboard provides access to computational methods, generating varying perspectives on the same data. The proposed application uses metadata, including composer, type, and epoch, as well as low-level features such as pitch, melody, and rhythm. To evaluate our approach, we conducted a pair-analytics study with nine participants. The qualitative results show that CorpusVis supports users in performing exploratory and confirmatory analysis, leading them to new insights and findings. In addition, based on three exemplary workflows, we demonstrate how to apply our approach to different tasks, such as exploring musical features or comparing composers.

Item: A Survey on Cross‐Virtuality Analytics (© 2022 Eurographics ‐ The European Association for Computer Graphics and John Wiley & Sons Ltd, 2022)
Fröhler, B.; Anthes, C.; Pointecker, F.; Friedl, J.; Schwajda, D.; Riegler, A.; Tripathi, S.; Holzmann, C.; Brunner, M.; Jodlbauer, H.; Jetter, H.‐C.; Heinzl, C.; Hauser, Helwig and Alliez, Pierre
Cross‐virtuality analytics (XVA) is a novel field of research within immersive analytics and visual analytics. A broad range of heterogeneous devices across the reality–virtuality continuum, along with respective visual metaphors and analysis techniques, are currently becoming available. The goal of XVA is to enable visual analytics that use transitional and collaborative interfaces to seamlessly integrate different devices and support multiple users. In this work, we take a closer look at XVA and analyse the existing body of work for an overview of its current state. We classify the related literature regarding ways of establishing cross‐virtuality by interconnecting different stages in the reality–virtuality continuum, as well as techniques for transitioning and collaborating between the different stages. We provide insights into the visualization and interaction techniques employed in current XVA systems. We report on ways of evaluating such systems, and analyse the domains where such systems are becoming available. Finally, we discuss open challenges in XVA, giving directions for future research.
Item: UnderPressure: Deep Learning for Foot Contact Detection, Ground Reaction Force Estimation and Footskate Cleanup (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Mourot, Lucas; Hoyet, Ludovic; Clerc, François Le; Hellier, Pierre; Dominik L. Michels; Soeren Pirk
Human motion synthesis and editing are essential to many applications like video games, virtual reality, and film post-production. However, they often introduce artefacts in motion capture data, which can be detrimental to the perceived realism. In particular, footskating is a frequent and disturbing artefact, which requires knowledge of foot contacts to be cleaned up. Current approaches to obtaining foot contact labels rely either on unreliable threshold-based heuristics or on tedious manual annotation. In this article, we address automatic foot contact label detection from motion capture data with a deep learning based method. To this end, we first publicly release UNDERPRESSURE, a novel motion capture database labelled with pressure insole data serving as reliable knowledge of foot contact with the ground. We then design and train a deep neural network to estimate the ground reaction forces exerted on the feet from motion data, and derive accurate foot contact labels from these estimates. The evaluation of our model shows that we significantly outperform heuristic approaches based on height and velocity thresholds, and that our approach is much more robust when applied to motion sequences suffering from perturbations like noise or footskate. We further propose a fully automatic workflow for footskate cleanup: foot contact labels are first derived from estimated ground reaction forces; footskate is then removed by solving foot constraints through an optimisation-based inverse kinematics (IK) approach that ensures consistency with the estimated ground reaction forces. Beyond footskate cleanup, both the database and the method we propose could help to improve many approaches based on foot contact labels or ground reaction forces, including inverse dynamics problems like motion reconstruction, and the learning of deep motion models in motion synthesis or character animation. Our implementation, pre-trained model and links to the database can be found at github.com/InterDigitalInc/UnderPressure.
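The final step, turning estimated ground reaction forces into binary contact labels, can be illustrated with a simple thresholding scheme. The threshold (a fraction of body weight) and the minimum segment length below are our own illustrative choices, not the parameters used in the paper.

```python
# Derive foot contact labels from estimated vertical ground reaction forces.
import numpy as np

def contact_labels(grf_vertical, body_weight, thresh=0.05, min_len=3):
    """grf_vertical: (frames,) estimated vertical GRF for one foot, in newtons."""
    contact = grf_vertical > thresh * body_weight      # force-threshold labels
    out = contact.copy()
    start = 0
    for i in range(1, len(contact) + 1):               # scan homogeneous runs
        if i == len(contact) or contact[i] != contact[start]:
            if i - start < min_len:                    # drop too-short runs
                out[start:i] = not contact[start]      # flip spurious blips
            start = i
    return out

frames = np.arange(120)
fz = 400.0 * (np.sin(frames / 6.0) > 0)  # toy alternating stance force
print(contact_labels(fz, body_weight=700.0)[:20])
```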