Browsing by Author "Andreou, Nefeli"
Now showing 1 - 3 of 3
Item
LEAD: Latent Realignment for Human Motion Diffusion (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Andreou, Nefeli; Wang, Xi; Fernández Abrevaya, Victoria; Cani, Marie-Paule; Chrysanthou, Yiorgos; Kalogeiton, Vicky; Wimmer, Michael; Alliez, Pierre; Westermann, Rüdiger
Our goal is to generate realistic human motion from natural language. Modern methods often face a trade-off between model expressiveness and text-to-motion (T2M) alignment: some align the text and motion latent spaces but sacrifice expressiveness, while others rely on diffusion models that produce impressive motions but lack semantic meaning in their latent space. Either compromise can hurt realism, diversity and applicability. Here, we address this by combining latent diffusion with a realignment mechanism, producing a novel, semantically structured space that encodes the semantics of language. Leveraging this capability, we introduce the task of textual motion inversion to capture novel motion concepts from a few examples. For motion synthesis, we evaluate LEAD on HumanML3D and KIT-ML and show performance comparable to the state of the art in terms of realism, diversity and text-motion consistency. Our qualitative analysis and user study reveal that our synthesised motions are sharper, more human-like and comply better with the text than those of modern methods. For motion textual inversion (MTI), our method demonstrates improvements in capturing out-of-distribution characteristics compared with traditional VAEs.

Item
LexiCrowd: A Learning Paradigm towards Text to Behaviour Parameters for Crowds (The Eurographics Association, 2024)
Lemonari, Marilena; Andreou, Nefeli; Pelechano, Nuria; Charalambous, Panayiotis; Chrysanthou, Yiorgos; Pettré, Julien
Creating believable virtual crowds that are controllable by high-level prompts is essential for creators to trade off authoring freedom against simulation quality. The flexibility and familiarity of natural language in particular motivate the use of text to guide the generation process. Capturing the essence of textually described crowd movements in the form of meaningful and usable parameters is challenging, due to the lack of paired ground-truth data and the inherent ambiguity between the two modalities. In this work, we leverage a pre-trained Large Language Model (LLM) to create pseudo-pairs of text and behaviour labels. We train a variational auto-encoder (VAE) on the synthetic dataset, constraining the latent space to interpretable behaviour parameters by incorporating a latent label loss. To showcase our model's capabilities, we deploy a survey in which humans provide textual descriptions of real crowd datasets. We demonstrate that our model can parameterise unseen sentences and produce novel behaviours that capture the essence of the given sentence; our behaviour space is compatible with simulator parameters, enabling the generation of plausible crowds (text-to-crowds). We also conduct feasibility experiments showing the potential of the output text embeddings for generating full sentences from a behaviour profile.
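Returning to the LEAD entry above: its motion textual inversion amounts to optimising a new pseudo-word embedding against a frozen motion latent diffusion model, given only a few example motions. Below is a minimal sketch of what such an inversion loop could look like. Everything here is an assumption for illustration: the module names (motion_encoder, denoiser), the shapes, the noise schedule and the placeholder data are not taken from the paper.

```python
# Hedged sketch: textual motion inversion against a frozen latent diffusion
# model, in the spirit of the LEAD abstract. All modules and data below are
# illustrative placeholders, not the authors' implementation.
import torch
import torch.nn as nn

LATENT_DIM, TEXT_DIM, T_STEPS = 256, 256, 1000

# Stand-ins for a pretrained motion encoder and a conditional denoiser.
motion_encoder = nn.Linear(63, LATENT_DIM)   # pose features -> latent (placeholder)
denoiser = nn.Sequential(nn.Linear(LATENT_DIM + TEXT_DIM + 1, 512),
                         nn.SiLU(), nn.Linear(512, LATENT_DIM))
for p in list(motion_encoder.parameters()) + list(denoiser.parameters()):
    p.requires_grad_(False)                  # the generative backbone stays frozen

# The only trainable parameter: a new "pseudo-word" embedding for the concept.
concept_token = nn.Parameter(torch.randn(TEXT_DIM) * 0.01)
opt = torch.optim.Adam([concept_token], lr=5e-3)

betas = torch.linspace(1e-4, 2e-2, T_STEPS)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

examples = torch.randn(4, 63)                # a few example motions (placeholder data)
for step in range(1000):
    z0 = motion_encoder(examples)            # clean motion latents
    t = torch.randint(0, T_STEPS, (z0.size(0),))
    noise = torch.randn_like(z0)
    ab = alpha_bar[t].unsqueeze(-1)
    zt = ab.sqrt() * z0 + (1 - ab).sqrt() * noise          # forward diffusion
    cond = concept_token.expand(z0.size(0), -1)
    pred = denoiser(torch.cat([zt, cond, t.unsqueeze(-1).float() / T_STEPS], dim=-1))
    loss = (pred - noise).pow(2).mean()      # standard epsilon-prediction loss
    opt.zero_grad(); loss.backward(); opt.step()
```

At inference, the learned concept_token would presumably be injected into the text conditioning so that the captured concept can be synthesised in new prompts.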
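The LexiCrowd entry above trains a VAE whose latent space is tied to interpretable behaviour parameters through a latent label loss. Here is a rough sketch of such an objective, assuming sentence embeddings paired with LLM-derived behaviour labels; the dimensions, the KL weight and the MSE form of the label loss are assumptions, not details from the paper.

```python
# Hedged sketch: a VAE with a latent label loss that pulls the latent code
# toward interpretable behaviour parameters, per the LexiCrowd abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

TEXT_DIM, LATENT_DIM = 384, 8   # e.g. sentence-embedding input, 8 behaviour params (assumed)

class BehaviourVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(TEXT_DIM, 128), nn.ReLU())
        self.mu, self.logvar = nn.Linear(128, LATENT_DIM), nn.Linear(128, LATENT_DIM)
        self.dec = nn.Sequential(nn.Linear(LATENT_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, TEXT_DIM))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterisation
        return self.dec(z), mu, logvar

model = BehaviourVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Pseudo-pairs: text embeddings with LLM-derived behaviour labels (placeholder data).
text_emb = torch.randn(32, TEXT_DIM)
behaviour_labels = torch.rand(32, LATENT_DIM)   # e.g. speed, grouping, ... (assumed)

recon, mu, logvar = model(text_emb)
kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
# Latent label loss: pushes each latent dimension toward its behaviour parameter,
# which is what makes the latent space readable as simulator inputs.
latent_label = F.mse_loss(mu, behaviour_labels)
loss = F.mse_loss(recon, text_emb) + 0.1 * kl + latent_label
opt.zero_grad(); loss.backward(); opt.step()
```

Because each latent dimension is anchored to a named behaviour parameter, the encoder's output for an unseen sentence can be handed directly to a crowd simulator, which is the text-to-crowds pipeline the abstract describes.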
Item
Pose Representations for Deep Skeletal Animation (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Andreou, Nefeli; Aristidou, Andreas; Chrysanthou, Yiorgos; Dominik L. Michels; Soeren Pirk
Data-driven skeletal animation relies on the existence of a suitable learning scheme, one that can capture the rich context of motion. However, commonly used motion representations often fail to accurately encode the full articulation of motion, or they introduce artifacts. In this work, we address the fundamental problem of finding a robust pose representation for motion, suitable for deep skeletal animation, that can better constrain poses and faithfully capture nuances correlated with skeletal characteristics. Our representation is based on dual quaternions, mathematical abstractions with well-defined operations that simultaneously encode rotational and positional information, enabling a rich encoding centred around the root. We demonstrate that our representation overcomes common motion artifacts, and we assess its performance against other popular representations. We conduct an ablation study to evaluate the impact of the various losses that can be incorporated during learning. Leveraging the fact that our representation implicitly encodes skeletal motion attributes, we train a network on a dataset comprising skeletons with different proportions, without first retargeting them to a universal skeleton, a step that causes subtle motion elements to be missed. Qualitative results demonstrate the usefulness of the parameterization in skeleton-specific synthesis.
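To make the dual-quaternion representation above concrete: a unit dual quaternion q_r + eps*q_d with q_d = 0.5*(0, t)*q_r packs a joint's rotation and translation into eight numbers, and the translation is recoverable exactly as the vector part of 2*q_d*conj(q_r). Below is a small NumPy illustration of just this encoding, not the paper's network code.

```python
# Hedged sketch: a joint's rotation + translation as one dual quaternion,
# as in the pose-representation abstract. Minimal numpy illustration only.
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def to_dual_quat(rot_q, t):
    """Real part = rotation; dual part = 0.5 * (0, t) * rotation."""
    dual = 0.5 * qmul(np.array([0.0, *t]), rot_q)
    return rot_q, dual

def translation_of(real, dual):
    """Recover translation: vector part of 2 * dual * conj(real)."""
    conj = real * np.array([1.0, -1.0, -1.0, -1.0])
    return 2.0 * qmul(dual, conj)[1:]

# A 90-degree rotation about Z plus a translation, packed into 8 numbers.
theta = np.pi / 2
rot = np.array([np.cos(theta / 2), 0.0, 0.0, np.sin(theta / 2)])
real, dual = to_dual_quat(rot, [1.0, 2.0, 0.0])
print(translation_of(real, dual))   # -> [1. 2. 0.]
```

Because rotation and position live in a single algebraic object, composing per-joint transforms is just dual-quaternion multiplication, which is part of what makes the representation convenient as a learning target.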