Fully Controllable Data Generation For Realtime Face Capture
dc.contributor.author | Chandran, Prashanth | |
dc.date.accessioned | 2023-12-06T07:43:08Z | |
dc.date.available | 2023-12-06T07:43:08Z | |
dc.date.issued | 2023-01-31 | |
dc.description.abstract | Data-driven realtime face capture has gained considerable momentum in the last few years, thanks to deep neural networks that leverage specialized datasets to speed up the acquisition of face geometry and appearance. However, generalizing such neural solutions to generic in-the-wild face capture remains a challenge due to the lack of a high-quality in-the-wild face database with all forms of ground truth (geometry, appearance, environment maps, etc.), or a means to generate one. In this thesis, we recognize this data bottleneck and propose a comprehensive framework for controllable, high-quality, in-the-wild data generation that can support present and future applications in face capture. We approach this problem in four stages.
The first stage is the building of a high-quality 3D face database consisting of a few hundred subjects in a studio setting. This database serves as a strong prior for 3D face geometry and appearance for several methods discussed in this thesis. To build this 3D database and to automate the registration of scans to a template mesh, we propose the first deep facial landmark detector capable of operating on 4K-resolution imagery while also achieving state-of-the-art performance on several in-the-wild benchmarks.
Our second stage leverages the proposed 3D face database to build powerful nonlinear 3D morphable models for static geometry modelling and synthesis. We propose the first semantic deep face model, which combines the semantic interpretability of traditional 3D morphable models with the nonlinear expressivity of neural networks. We later extend this semantic deep face model with a novel transformer-based architecture and propose the Shape Transformer for representing and manipulating face shapes irrespective of their mesh connectivity.
The third stage of our data generation pipeline extends the approaches for static geometry synthesis to support facial deformations across time, so as to synthesize dynamic performances. We propose two parallel approaches: one involving performance retargeting, and another based on a data-driven 4D (3D + time) morphable model. We propose a local, anatomically constrained facial performance retargeting technique that uses only a handful of blendshapes (20 shapes) to achieve production-quality results; this technique can readily be used to create novel animations for any given actor via animation transfer. Our second contribution for generating facial performances is a transformer-based 4D autoencoder that encodes a sequence of expression blend weights into a learned performance latent space. Novel performances can then be generated at inference time by sampling this learned latent space.
The fourth and final stage of our data generation pipeline is the creation of photorealistic imagery to accompany the facial geometry and animations synthesized thus far. We propose a hybrid rendering approach that leverages state-of-the-art techniques for ray-traced skin rendering and a pretrained 2D generative model for photorealistic and consistent inpainting of the skin renders. Our hybrid rendering technique allows for the creation of a virtually unlimited number of training samples in which the user has full control over the facial geometry, appearance, lighting, and viewpoint.
The techniques presented in this thesis will serve as the foundation for creating large-scale photorealistic in-the-wild face datasets to support the next generation of realtime face capture. | en_US |
dc.identifier.uri | https://diglib.eg.org:443/handle/10.2312/3543923 | |
dc.language.iso | en_US | en_US |
dc.publisher | ETH Zurich | en_US |
dc.subject | Facial Performance Capture | en_US |
dc.subject | Animation | en_US |
dc.subject | Generative Models | en_US |
dc.subject | Retargeting | en_US |
dc.subject | Neural Rendering | en_US |
dc.subject | Shape Modeling | en_US |
dc.subject | Shape Generation | en_US |
dc.subject | Morphable Model | en_US |
dc.subject | Keypoint Detection | en_US |
dc.subject | Face Tracking | en_US |
dc.subject | Performance Capture | en_US |
dc.subject | Data Driven Animation | en_US |
dc.subject | Motion Capture | en_US |
dc.title | Fully Controllable Data Generation For Realtime Face Capture | en_US |
dc.type | Thesis | en_US |
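Illustrative code sketches
The abstract's first stage relies on a deep facial landmark detector. A common design for such detectors is to regress one heatmap per landmark and decode each landmark as the argmax of its heatmap; the minimal sketch below shows only that decoding step and does not reproduce the thesis's 4K-capable architecture. All tensor shapes are illustrative assumptions.

import torch

def decode_landmarks(heatmaps):
    """Decode (x, y) landmark positions from per-landmark heatmaps.
    heatmaps: (L, H, W) -> (L, 2) pixel coordinates."""
    L, H, W = heatmaps.shape
    flat_idx = heatmaps.view(L, -1).argmax(dim=1)           # peak per landmark
    ys = torch.div(flat_idx, W, rounding_mode="floor").float()
    xs = (flat_idx % W).float()
    return torch.stack([xs, ys], dim=1)

heatmaps = torch.rand(68, 256, 256)      # e.g. 68 landmarks on a 256x256 grid
print(decode_landmarks(heatmaps).shape)  # torch.Size([68, 2])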
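The second stage describes a semantic deep face model that pairs the interpretability of traditional 3D morphable models with a nonlinear neural decoder. The stand-in below sketches only the interface such a model exposes: separate identity and expression codes feeding a decoder that outputs mesh vertices. The MLP architecture and all dimensions are assumptions; the thesis's actual model (and its Shape Transformer extension) differs.

import torch
import torch.nn as nn

class SemanticFaceDecoder(nn.Module):
    """Hypothetical decoder: (identity code, expression code) -> vertices."""
    def __init__(self, id_dim=32, expr_dim=16, num_vertices=1000):
        super().__init__()
        self.num_vertices = num_vertices
        self.net = nn.Sequential(
            nn.Linear(id_dim + expr_dim, 512), nn.ReLU(),
            nn.Linear(512, num_vertices * 3),
        )

    def forward(self, id_code, expr_code):
        # Editing id_code changes identity; editing expr_code changes the
        # expression independently. This disentangled parameterization is
        # what makes the model "semantic".
        z = torch.cat([id_code, expr_code], dim=-1)
        return self.net(z).view(-1, self.num_vertices, 3)

model = SemanticFaceDecoder()
verts = model(torch.randn(1, 32), torch.randn(1, 16))  # (1, 1000, 3) mesh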
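The third stage's retargeting contribution builds on blendshape rigs and animation transfer. As a minimal sketch of the underlying principle only (not the local, anatomically constrained method the thesis proposes), a performance captured as per-frame blend weights on one actor's rig can drive another actor whose rig has semantically matching shapes. All names and array shapes here are hypothetical.

import numpy as np

def evaluate_rig(neutral, deltas, weights):
    """Blendshape rig: neutral (V, 3) plus weighted delta shapes (K, V, 3)."""
    return neutral + np.tensordot(weights, deltas, axes=1)

def transfer_performance(src_weights, tgt_neutral, tgt_deltas):
    """Animation transfer: reuse per-frame weights (T, K) on a target rig."""
    return [evaluate_rig(tgt_neutral, tgt_deltas, w) for w in src_weights]

rng = np.random.default_rng(0)
V, K, T = 1000, 20, 48                    # a ~20-shape rig, as in the abstract
tgt_neutral = rng.standard_normal((V, 3))
tgt_deltas = 0.01 * rng.standard_normal((K, V, 3))
src_weights = rng.uniform(0.0, 1.0, (T, K))
frames = transfer_performance(src_weights, tgt_neutral, tgt_deltas)
print(len(frames), frames[0].shape)       # 48 frames of (1000, 3) geometry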
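The abstract's 4D morphable model generates novel performances by sampling a learned latent space and decoding to expression blend weights. The sketch below illustrates just that inference-time sampling pattern, with a toy MLP decoder standing in for the thesis's transformer-based 4D autoencoder; the Gaussian latent prior, the dimensions, and the sigmoid range constraint are all assumptions.

import torch
import torch.nn as nn

class PerformanceDecoder(nn.Module):
    """Toy decoder: latent code z -> (T, K) blendshape-weight sequence."""
    def __init__(self, latent_dim=64, num_frames=48, num_blendshapes=20):
        super().__init__()
        self.num_frames = num_frames
        self.num_blendshapes = num_blendshapes
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, num_frames * num_blendshapes),
        )

    def forward(self, z):
        w = self.net(z).view(-1, self.num_frames, self.num_blendshapes)
        return torch.sigmoid(w)  # keep blend weights in [0, 1]

decoder = PerformanceDecoder()
z = torch.randn(1, 64)       # sample the (assumed Gaussian) latent space
weights = decoder(z)         # (1, 48, 20) blend-weight sequence
print(weights.shape)

Decoded weight sequences like these could then drive a blendshape rig such as the one sketched above.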
Files
Original bundle
- Name: Fully_Controllable_Data_Generation_For_Realtime_Face_Capture_PrashanthChandran_2023.pdf
- Size: 84.19 MB
- Format: Adobe Portable Document Format
- Description: Main Article
License bundle
- Name: license.txt
- Size: 1.79 KB
- Description: Item-specific license agreed upon to submission