Browsing by Author "Yang, Jimei"
Now showing 1 - 5 of 5
Item: Brush Stroke Synthesis with a Generative Adversarial Network Driven by Physically Based Simulation (ACM, 2018)
Wu, Rundong; Chen, Zhili; Wang, Zhaowen; Yang, Jimei; Marschner, Steve; Aydın, Tunç and Sýkora, Daniel
We introduce a novel approach that uses a generative adversarial network (GAN) to synthesize realistic oil painting brush strokes, where the network is trained with data generated by a high-fidelity simulator. Among approaches to digitally synthesizing natural media painting strokes, methods using physically based simulation by far produce the most realistic visual results and allow the most intuitive control of stroke variations. However, accurate physics simulations are known to be computationally expensive and often cannot meet the performance requirements of painting applications. A few existing simulation-based methods have managed to reach real-time performance at the cost of lower visual quality resulting from simplified models or lower resolution. In our work, we propose to replace the expensive fluid simulation with a neural network generator. The network takes the existing canvas and new brush trajectory information as input and produces the height and color of the paint surface as output. We build a large painting sample training dataset by feeding random strokes from artists' recordings into a high-quality offline simulator. The network is able to produce visual quality comparable to the offline simulator with better performance than the existing real-time oil painting simulator. Finally, we implement a real-time painting system using the trained network with stroke splitting and patch blending and show artworks created with the system by artists. Our neural network approach opens up new opportunities for real-time applications of sophisticated and expensive physically based simulation.

Item: Contact and Human Dynamics from Monocular Video (The Eurographics Association, 2020)
Rempe, Davis; Guibas, Leonidas J.; Hertzmann, Aaron; Russell, Bryan; Villegas, Ruben; Yang, Jimei; Holden, Daniel
Existing methods for human motion from video predict 2D and 3D poses that are approximately accurate but contain visible errors that violate physical constraints, such as feet penetrating the ground and bodies leaning at extreme angles. We present a physics-based method for inferring 3D human motion from video sequences that takes initial 2D and 3D pose estimates as input. We first estimate ground contact timings with a neural network which is trained without hand-labeled data. A physics-based trajectory optimization then solves for a physically plausible motion, based on the inputs. We show this process produces motions that are more realistic than those from purely kinematic methods for character animation from dynamic videos. A detailed report that fully describes our method is available at geometry.stanford.edu/projects/human-dynamics-eccv-2020.
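To make the flavor of the physics-based refinement in the item above concrete, here is a minimal, hypothetical Python sketch of contact-constrained trajectory smoothing. It is not the authors' optimizer: the frame count, weights, and the synthetic "kinematic" foot-height signal are all illustrative assumptions, and the real method optimizes full-body dynamics rather than a single height curve.

```python
# Hypothetical toy illustration of contact-constrained trajectory smoothing.
# It only shows the general idea of pulling noisy per-frame estimates toward
# values that respect ground contact; all numbers here are made up.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
T = 60                                                      # number of frames
true_height = np.clip(np.sin(np.linspace(0, 4 * np.pi, T)), 0, None)
noisy_height = true_height + 0.05 * rng.standard_normal(T)  # "kinematic" estimate
in_contact = true_height < 1e-6                             # assumed contact-classifier output

def objective(h):
    data_term = np.sum((h - noisy_height) ** 2)         # stay close to the input poses
    contact_term = 50.0 * np.sum(h[in_contact] ** 2)    # feet on the ground when in contact
    smooth_term = 5.0 * np.sum(np.diff(h, 2) ** 2)      # penalize jerky motion
    return data_term + contact_term + smooth_term

result = minimize(objective, noisy_height, method="L-BFGS-B")
refined_height = result.x
```

In this toy setup the contact term plays the role of the learned contact estimates: wherever the classifier flags contact, the optimizer pulls the foot back onto the ground plane while the other terms keep the motion close to the input and smooth over time.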
Item: Learning to Trace: Expressive Line Drawing Generation from Photographs (The Eurographics Association and John Wiley & Sons Ltd., 2019)
Inoue, Naoto; Ito, Daichi; Xu, Ning; Yang, Jimei; Price, Brian; Yamasaki, Toshihiko; Lee, Jehee and Theobalt, Christian and Wetzstein, Gordon
In this paper, we present a new computational method for automatically tracing high-resolution photographs to create expressive line drawings. We define expressive lines as those that convey important edges, shape contours, and large-scale texture lines that are necessary to accurately depict the overall structure of objects (similar to those found in technical drawings) while still being sparse and artistically pleasing. Given a photograph, our algorithm extracts expressive edges and creates a clean line drawing using a convolutional neural network (CNN). We employ an end-to-end trainable fully convolutional CNN to learn the model in a data-driven manner. The model consists of two networks that address two sub-tasks: extracting coarse lines and refining them to be cleaner and more expressive. To build a model that is optimal for each domain, we construct two new datasets for face/body and manga background. The experimental results qualitatively and quantitatively demonstrate the effectiveness of our model. We further illustrate two practical applications.

Item: Single-image Full-body Human Relighting (The Eurographics Association, 2021)
Lagunas, Manuel; Sun, Xin; Yang, Jimei; Villegas, Ruben; Zhang, Jianming; Shu, Zhixin; Masia, Belen; Gutierrez, Diego; Bousseau, Adrien and McGuire, Morgan
We present a single-image data-driven method to automatically relight images with full-body humans in them. Our framework is based on a realistic scene decomposition leveraging precomputed radiance transfer (PRT) and spherical harmonics (SH) lighting. In contrast to previous work, we lift the assumption of Lambertian materials and explicitly model diffuse and specular reflectance in our data. Moreover, we introduce an additional light-dependent residual term that accounts for errors in the PRT-based image reconstruction. We propose a new deep learning architecture, tailored to the decomposition performed in PRT, that is trained using a combination of L1, logarithmic, and rendering losses. Our model outperforms the state of the art for full-body human relighting on both synthetic images and photographs.

Item: Statistics-based Motion Synthesis for Social Conversations (The Eurographics Association and John Wiley & Sons Ltd., 2020)
Yang, Yanzhe; Yang, Jimei; Hodgins, Jessica; Bender, Jan and Popa, Tiberiu
Plausible conversations among characters are required to generate the ambiance of social settings such as a restaurant, hotel lobby, or cocktail party. In this paper, we propose a motion synthesis technique that can rapidly generate animated motion for characters engaged in two-party conversations. Our system synthesizes gestures and other body motions for dyadic conversations that synchronize with novel input audio clips. Human conversations feature many different forms of coordination and synchronization. For example, speakers use hand gestures to emphasize important points, and listeners often nod in agreement or acknowledgment. To achieve the desired degree of realism, our method first constructs a motion graph that preserves the statistics of a database of recorded conversations performed by a pair of actors. This graph is then used to search for a motion sequence that respects three forms of audio-motion coordination in human conversations: coordination to phonemic clauses, listener responses, and the partner's hesitation pauses. We assess the quality of the generated animations through a user study that compares them to the originally recorded motion, and we evaluate the effects of each type of audio-motion coordination via ablation studies.
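As a rough illustration of the motion-graph idea in the last item, below is a small, hypothetical Python sketch of a statistics-preserving random walk over clip-to-clip transitions. The clip names and probabilities are invented, and the sketch deliberately omits the audio-driven search and the three coordination constraints that the actual method uses to select a path through the graph.

```python
# Hypothetical sketch: sampling a clip sequence from a motion graph whose
# edges carry transition frequencies observed in recorded conversations.
# Clip names and probabilities are made up for illustration.
import random

transitions = {
    "idle":    [("gesture", 0.4), ("nod", 0.3), ("idle", 0.3)],
    "gesture": [("idle", 0.6), ("gesture", 0.4)],
    "nod":     [("idle", 0.8), ("nod", 0.2)],
}

def sample_sequence(start, length, seed=0):
    """Random walk weighted by the observed transition statistics."""
    rng = random.Random(seed)
    sequence = [start]
    for _ in range(length - 1):
        clips, weights = zip(*transitions[sequence[-1]])
        sequence.append(rng.choices(clips, weights=weights, k=1)[0])
    return sequence

print(sample_sequence("idle", 10))
```

In the actual system, this unconstrained walk would instead be a search scored against the input audio, so that the chosen clips line up with phonemic clauses, listener responses, and the partner's hesitation pauses.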