
Browsing by Author "Endo, Yuki"

Now showing 1 - 3 of 3
  • Diversifying Semantic Image Synthesis and Editing via Class- and Layer-wise VAEs
    (The Eurographics Association and John Wiley & Sons Ltd., 2020) Endo, Yuki; Kanamori, Yoshihiro; Eisemann, Elmar and Jacobson, Alec and Zhang, Fang-Lue
    Semantic image synthesis is a process for generating photorealistic images from a single semantic mask. To enrich the diversity of multimodal image synthesis, previous methods have controlled the global appearance of an output image by learning a single latent space. However, a single latent code is often insufficient for capturing various object styles because object appearance depends on multiple factors. To handle individual factors that determine object styles, we propose a class- and layer-wise extension to the variational autoencoder (VAE) framework that allows flexible control over each object class at the local to global levels by learning multiple latent spaces. Furthermore, we demonstrate that our method generates images that are both plausible and more diverse compared to state-of-the-art methods via extensive experiments with real and synthetic datasets in three different domains. We also show that our method enables a wide range of applications in image synthesis and editing tasks.
  • Relighting Humans in the Wild: Monocular Full-Body Human Relighting with Domain Adaptation
    (The Eurographics Association and John Wiley & Sons Ltd., 2021) Tajima, Daichi; Kanamori, Yoshihiro; Endo, Yuki; Zhang, Fang-Lue and Eisemann, Elmar and Singh, Karan
    Modern supervised approaches for human image relighting rely on training data generated from 3D human models. However, such datasets are often small (e.g., Light Stage data with a small number of individuals) or limited to diffuse materials (e.g., commercial 3D scanned human models). Thus, human relighting techniques suffer from poor generalization capability and a synthetic-to-real domain gap. In this paper, we propose a two-stage method for single-image human relighting with domain adaptation. In the first stage, we train a neural network for diffuse-only relighting. In the second stage, we train another network for enhancing non-diffuse reflection by learning residuals between real photos and images reconstructed by the diffuse-only network. Thanks to the second stage, we can achieve higher generalization capability against various cloth textures, while reducing the domain gap. Furthermore, to handle input videos, we integrate an illumination-aware deep video prior to greatly reduce flickering artifacts, even in challenging settings with dynamic illumination.
  • User-Controllable Latent Transformer for StyleGAN Image Layout Editing
    (The Eurographics Association and John Wiley & Sons Ltd., 2022) Endo, Yuki; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
    Latent space exploration is a technique that discovers interpretable latent directions and manipulates latent codes to edit various attributes in images generated by generative adversarial networks (GANs). However, in previous work, spatial control is limited to simple transformations (e.g., translation and rotation), and it is laborious to identify appropriate latent directions and adjust their parameters. In this paper, we tackle the problem of editing the StyleGAN image layout by annotating the image directly. To do so, we propose an interactive framework for manipulating latent codes in accordance with user inputs. In our framework, the user annotates a StyleGAN image with locations they want to move or keep fixed and specifies a movement direction by mouse dragging. From these user inputs and initial latent codes, our latent transformer, based on a transformer encoder-decoder architecture, estimates the output latent codes, which are fed to the StyleGAN generator to obtain a result image. To train our latent transformer, we utilize synthetic data and pseudo-user inputs generated by off-the-shelf StyleGAN and optical flow models, without manual supervision. Quantitative and qualitative evaluations demonstrate the effectiveness of our method over existing methods.

Eurographics Association © 2013-2025  |  System hosted at Graz University of Technology      
DSpace software copyright © 2002-2025 LYRASIS
