2025
Advancing Machine Learning Algorithms for Object Localization in Data-Limited Scenarios : Techniques for 6DoF Pose Estimation and 2D Localization with limited Data
Pöllabauer, Thomas Jürgen
Computational Design of Deployable Gridshells with Curved Elastic Beams
Becker, Quentin
Efficient Computational Models for Forward and Inverse Elasticity Problems
Li, Yue
Deep High Dynamic Range Imaging: Reconstruction, Generation and Display
Chao Wang
Recent Submissions
Item Advancing Machine Learning Algorithms for Object Localization in Data-Limited Scenarios : Techniques for 6DoF Pose Estimation and 2D Localization with limited Data (2025-01-20) Pöllabauer, Thomas Jürgen
Recent successes of Machine Learning (ML) algorithms have profoundly influenced many fields, particularly Computer Vision (CV). One longstanding problem in CV is determining the 3D position and orientation of an object depicted in an image, relative to the recording camera sensor. Accurate pose estimation is essential for domains such as robotics, augmented reality, autonomous driving, quality inspection in manufacturing, and many more. Current state-of-the-art pose estimation algorithms are dominated by Deep Learning-based approaches. However, adoption of these best-in-class algorithms for real-world tasks is often constrained by data limitations: too little training data, data of insufficient quality, missing or noisy annotations, or no directly suitable training data at all. This thesis presents contributions both to 6D object pose estimation and to alleviating data limitations, for pose estimation and for related CV problems such as classification, segmentation, and 2D object detection. It offers a range of solutions to enhance the quality and efficiency of these tasks under different kinds of data limitations. The first contribution enhances a state-of-the-art pose estimation algorithm to predict a probability distribution of poses instead of a single pose estimate. This approach allows sampling multiple plausible poses for further refinement and outperforms the baseline algorithm even when sampling only the most likely pose.
In our second contribution, we drastically improve runtime and reduce resource requirements to bring state-of-the-art pose estimation to low-power edge devices, such as modern augmented and extended reality devices. Finally, we extend a pose estimator based on dense-feature prediction to incorporate additional views and illustrate its performance benefits in the stereo use case. The second set of two contributions focuses on data generation for ML-based CV tasks. High-quality training data is a crucial component for best performance. We introduce a novel yet simple setup to record physical objects and generate all necessary annotations in a fully automated way. Evaluated on the 2D object detection use case, training on our data compares favourably with more complex data generation processes, such as real-world recordings and physically-based rendering. In a follow-up paper, we further improve upon the results by introducing a novel postprocessing step based on denoising diffusion probabilistic models (DDPM). At the intersection of 6D pose estimation and data generation methods, a final group of three contributions focuses on solving or circumventing the data problem with a range of different approaches. First, we demonstrate the use of physically-based, photorealistic, and non-photorealistic rendering to localize objects on Microsoft HoloLens 2, without needing any real-world images for training. Second, we extend a zero-shot pose estimation method by predicting geometric features, thereby improving estimation quality with almost no additional runtime. Third, we demonstrate pose estimation of objects with unseen appearances based on a 3D scene representation, allowing robust mesh-free pose estimation. In summary, this thesis advances the field of 6D object pose estimation and alleviates some common data limitations for pose estimation and similar Machine Learning algorithms in Computer Vision problems, such as 2D detection and segmentation.
The solutions proposed include several extensions to state-of-the-art 6D pose estimators and address the challenges of limited or poor-quality training data, paving the way for more accurate, efficient, and accessible pose estimation technologies across various industries and fields.
Item Computational Design of Deployable Gridshells with Curved Elastic Beams (EPFL, 2025) Becker, Quentin
Deployable gridshells are lightweight structures made of interconnected elastic beams. They can be actuated from a compact state to a freeform and volume-enclosing deployed shape. This thesis introduces C-shells, a novel class of deployable gridshells that employs curved elastic rods connected at single-axis rotational joints. As opposed to their straight counterparts, C-shells are guaranteed to be assembled in a planar and stress-free configuration while showing a wide diversity in their deployed shapes. They may serve as temporary shelters, pavilions, or on a smaller scale, as deployable furniture or decorative elements. This thesis presents a comprehensive framework for the forward exploration of C-shell designs, enabling designers to interactively search the shape space and generate deployable structures with diverse appearances and topologies. The framework combines human-interpretable manipulations of a reference linkage with an efficient physics-based simulation to predict the deployed shape and mechanical behavior of the structure. Preservation of the linkage deployability and smoothness of the edits are ensured through the use of conformal maps as design handles. The framework is implemented as a Rhino-Grasshopper plugin, providing visual and quantitative real-time feedback on the deployed state. The inverse design of C-shells is also addressed, where the deployed shape is given, and the flat state of the structure is computed. This thesis introduces a two-step pipeline composed of a flattening method and a design optimization algorithm.
The flattening algorithm is based on kinetic considerations underlying the deployment of C-shells. The method harmonizes a flat and a hypothetical deployed state constrained on a user-prescribed target surface. The flat beam layout is further adjusted to minimize the deviation of the deployed shape from the target surface while ensuring a low-elastic-energy deployed state, under some beam smoothness regularization. The proposed method is validated through scanned small-scale prototypes. C-shells are made of curved rods, which entails additional material waste compared to straight beams. To address this issue, this thesis presents a rationalization method that splits the curved beams into smaller straight elements which can be grouped into a sparse kit of parts, while preserving user-provided designs. The original combinatorial problem of jointly assigning parts to elements and adapting the parts’ geometry is relaxed into a two-step optimization process incorporating our physics-based simulation, making it tractable using continuous optimization techniques. The proposed method applies more generally to bending-active structures and is further demonstrated on orthogonal gridshells and umbrella meshes. Part reuse is assessed in a study of the trade-off between the number of parts and fidelity to the input designs.
Item Efficient Computational Models for Forward and Inverse Elasticity Problems (ETH Zurich, 2025-05-28) Li, Yue
Elasticity is at the core of many scientific and engineering applications, including the design of resilient structures and advanced materials, and the modeling of biological tissues. Simulating elastic systems poses significant computational challenges due to the inherent nonlinearity of the governing equations, which calls for efficient optimization methods to determine equilibrium states. Second-order methods are particularly attractive because of their superior convergence properties relative to first-order techniques.
However, the effective use of second-order solvers requires that the underlying functions and their derivatives are sufficiently smooth and available in closed form. This smoothness can easily degrade when generalizing standard computational models to a broader set of design tasks. This thesis proposes efficient computational models that enable robust and effective simulations for physics-based modeling and the design of complex elastic systems. First, we propose a novel fabric-like metamaterial that features persisting contacts between 3D-printed yarns. To avoid the complexities of explicit contact modeling, we adopt an Eulerian-on-Lagrangian simulation paradigm; however, current methods remain limited to straight rods. We leverage a $C^2$-continuous representation to allow for Newton-type minimization on naturally curved rods. Next, we present a computational paradigm for intrinsic minimization of distance-based objectives defined on triangle meshes. Although Euclidean distances meet the $C^2$-continuity requirement, geodesic distances on triangle meshes do not. To permit efficient second-order optimization of embedded elasticity problems, we provide analytical derivatives as well as suitable mollifiers to recover $C^2$-continuity. Finally, we address non-smoothness issues that arise in nonlinear material design, where changes in geometry parameters can lead to discontinuous changes in simulation meshes. We employ neural networks with tailored nonlinearities as $C^\infty$-continuous and differentiable representations to characterize the elastic properties of families of mechanical metamaterials.
The resulting smooth representation enables gradient-based inverse design for various high-level design goals.
Item Deep High Dynamic Range Imaging: Reconstruction, Generation and Display (2025-07-04) Chao Wang
High Dynamic Range (HDR) images offer significant advantages over Low Dynamic Range (LDR) images, including greater bit depth, a wider color gamut, and a higher dynamic range. These features not only provide users with an enhanced visual experience but also facilitate post-production processes in photography and filmmaking. Despite the considerable advancements in HDR technology over the years, significant challenges persist in the acquisition and display of HDR content. This thesis systematically explores the potential of leveraging deep learning techniques combined with physical prior knowledge to address these challenges. First, it investigates how implicit neural representations can be utilized to reconstruct all-in-focus HDR images from sparse, defocused LDR inputs, enabling flexible refocusing and re-exposure. Additionally, it extends the scope to the 3D domain by employing 3D Gaussian Splatting to reconstruct HDR all-in-focus fields from multi-view LDR defocused images, supporting novel view synthesis with refocusing and re-exposure capabilities. Expanding further, the thesis investigates strategies for generating HDR content from in-the-wild LDR data or limited HDR datasets, and subsequently uses the resulting HDR generative models as priors to transform LDR images into HDR. Finally, it proposes a feature contrast masking loss inspired by visual masking theory, enabling a self-supervised tone mapper to display HDR content on LDR devices.