Browsing by Author "Moreland, Kenneth"
Now showing 1 - 3 of 3
Item
An Accelerated Clip Algorithm for Unstructured Meshes: A Batch-Driven Approach
(The Eurographics Association, 2024)
Tsalikis, Spiros; Schroeder, Will; Szafir, Daniel; Moreland, Kenneth; Reina, Guido; Rizzi, Silvio
The clip technique is a popular method for visualizing complex structures and phenomena within 3D unstructured meshes. Meshes can be clipped by specifying a scalar isovalue to produce an output unstructured mesh whose external surface lies at that isovalue. Similar to isocontouring, the clipping process relies on scalar data associated with the mesh points, including scalar data generated by implicit functions such as planes, boxes, and spheres, which facilitates the visualization of results interior to the grid. In this paper, we introduce a novel batch-driven parallel algorithm based on a sequential clip algorithm designed for high-quality results in partial volume extraction. Our algorithm comprises five passes, each progressively processing data to generate the resulting clipped unstructured mesh. The novelty lies in the use of fixed-size batches of points and cells, which enable rapid workload trimming and parallel processing, leading to a significantly improved memory footprint and run-time performance compared to the original version. On a 32-core CPU, the proposed batch-driven parallel algorithm demonstrates a run-time speed-up of up to 32.6x and a memory footprint reduction of up to 4.37x compared to the existing sequential algorithm. The software is currently available under an open-source license in the VTK visualization system.
(A hedged clipping sketch appears after the listing.)

Item
In Situ Workload Estimation for Block Assignment and Duplication in Parallelization-Over-Data Particle Advection
(The Eurographics Association and John Wiley & Sons Ltd., 2025)
Wang, Zhe; Moreland, Kenneth; Larsen, Matthew; Kress, James; Childs, Hank; Li, Guan; Shan, Guihua; Pugmire, David; Aigner, Wolfgang; Andrienko, Natalia; Wang, Bei
Particle advection is a foundational algorithm for analyzing a flow field. The commonly used Parallelization-Over-Data (POD) strategy for particle advection can become slow and inefficient when there are unbalanced workloads, which are particularly prevalent in in situ workflows. In this work, we present an in situ workflow containing workload estimation for block assignment and duplication in a parallelization-over-data algorithm. With a tightly coupled workload estimation and load-balanced block assignment strategy, our workflow offers a considerable improvement over the traditional round-robin block assignment strategy. Our experiments demonstrate that particle advection is up to 3x faster and that the associated workflow saves approximately 30% of execution time after adopting the strategies presented in this work.
(A hedged block-assignment sketch appears after the listing.)

Item
Scalable In Situ Computation of Lagrangian Representations via Local Flow Maps
(The Eurographics Association, 2021)
Sane, Sudhanshu; Yenpure, Abhishek; Bujack, Roxana; Larsen, Matthew; Moreland, Kenneth; Garth, Christoph; Johnson, Chris R.; Childs, Hank; Larsen, Matthew and Sadlo, Filip
In situ computation of Lagrangian flow maps to enable post hoc time-varying vector field analysis has recently become an active area of research. However, the current literature is largely limited to theoretical settings and lacks a solution to address the scalability of the technique in distributed memory. To improve scalability, we propose and evaluate the benefits and limitations of a simple, yet novel, performance optimization. Our proposed optimization is a communication-free model resulting in local Lagrangian flow maps, requiring no message passing or synchronization between processes. This intrinsically improves scalability, thereby reducing overall execution time and alleviating the encumbrance that communication overheads place on simulation codes. To evaluate our approach, we computed Lagrangian flow maps for four time-varying simulation vector fields and investigated how execution time and reconstruction accuracy are impacted by the number of GPUs per compute node, the total number of compute nodes, particles per rank, and storage intervals. Our study consisted of experiments computing Lagrangian flow maps with up to 67M particle trajectories over 500 cycles and used as many as 2048 GPUs across 512 compute nodes. In all, our study contributes an evaluation of a communication-free model as well as a scalability study of computing distributed Lagrangian flow maps at scale using in situ infrastructure on a modern supercomputer.
(A hedged local flow map sketch appears after the listing.)
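
Illustrative sketch for the clip algorithm item. The abstract states that the batch-driven algorithm ships with VTK but does not name the filter class, so the Python sketch below uses VTK's generic vtkClipDataSet with a vtkPlane implicit function purely to show the clipping workflow the abstract describes: the clip function supplies per-point scalars, and the zero isovalue becomes the new external surface of the output unstructured mesh. The file name is a placeholder.

    # Hedged sketch: clipping an unstructured grid with an implicit plane in VTK.
    # The specific filter hosting the batch-driven algorithm is an assumption;
    # vtkClipDataSet is VTK's generic scalar/implicit-function clip filter.
    import vtk

    # Load an unstructured mesh (placeholder file name).
    reader = vtk.vtkXMLUnstructuredGridReader()
    reader.SetFileName("mesh.vtu")

    # Implicit plane used as the clip function; its signed distance supplies
    # the per-point scalars whose zero isovalue becomes the clip surface.
    plane = vtk.vtkPlane()
    plane.SetOrigin(0.0, 0.0, 0.0)
    plane.SetNormal(1.0, 0.0, 0.0)

    # Clip the mesh; one side of the plane is kept (SetInsideOut selects the other).
    clipper = vtk.vtkClipDataSet()
    clipper.SetInputConnection(reader.GetOutputPort())
    clipper.SetClipFunction(plane)
    clipper.Update()

    clipped = clipper.GetOutput()  # vtkUnstructuredGrid
    print(clipped.GetNumberOfCells(), "cells in the clipped mesh")

Swapping the plane for a vtkSphere or vtkBox, or omitting the clip function and calling SetValue() with point scalars attached to the mesh, exercises the other clip inputs mentioned in the abstract.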
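
Illustrative sketch for the workload-estimation item. The paper's estimator and in situ workflow are not reproduced here; the sketch only contrasts the traditional round-robin block assignment with a generic workload-aware greedy (longest-processing-time) assignment to show the kind of load balancing the abstract describes. Block costs and rank counts are made-up example values.

    # Hedged sketch of workload-aware block assignment versus round robin.
    # This is NOT the paper's algorithm, just a simple load-balancing baseline.
    import heapq


    def round_robin(block_costs, num_ranks):
        """Assign block i to rank i % num_ranks (the traditional baseline)."""
        return {b: b % num_ranks for b in range(len(block_costs))}


    def workload_aware(block_costs, num_ranks):
        """Greedy LPT: heaviest block goes to the currently least-loaded rank."""
        heap = [(0.0, rank) for rank in range(num_ranks)]  # (load, rank)
        heapq.heapify(heap)
        assignment = {}
        for block in sorted(range(len(block_costs)),
                            key=lambda b: block_costs[b], reverse=True):
            load, rank = heapq.heappop(heap)
            assignment[block] = rank
            heapq.heappush(heap, (load + block_costs[block], rank))
        return assignment


    if __name__ == "__main__":
        # Made-up per-block advection cost estimates.
        costs = [9.0, 1.0, 8.0, 2.0, 7.0, 3.0, 6.0, 4.0]
        print("round robin   :", round_robin(costs, 4))
        print("workload aware:", workload_aware(costs, 4))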
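
Illustrative sketch for the local flow maps item. The communication-free model stores rank-local Lagrangian flow maps: each rank seeds particles in its own block, advects them using only local data, and records start/end positions per storage interval without exchanging particles. The toy NumPy sketch below assumes an analytic 2D velocity field, a fixed-step RK4 integrator, and a single axis-aligned block; none of these specifics come from the paper.

    # Hedged sketch: a rank-local Lagrangian flow map. Particles stop at the
    # block boundary instead of being communicated to another rank.
    import numpy as np


    def velocity(p):
        """Toy steady 2D rotational field; stands in for simulation data."""
        x, y = p
        return np.array([-y, x])


    def rk4_step(p, dt):
        k1 = velocity(p)
        k2 = velocity(p + 0.5 * dt * k1)
        k3 = velocity(p + 0.5 * dt * k2)
        k4 = velocity(p + dt * k3)
        return p + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)


    def local_flow_map(seeds, block_min, block_max, dt=0.01, steps=100):
        """Return (start, end) pairs; advection stops at the block boundary."""
        flow_map = []
        for start in seeds:
            p = start.copy()
            for _ in range(steps):
                nxt = rk4_step(p, dt)
                if np.any(nxt < block_min) or np.any(nxt > block_max):
                    break  # left the local block: freeze, do not communicate
                p = nxt
            flow_map.append((start, p))
        return flow_map


    if __name__ == "__main__":
        block_min, block_max = np.array([0.0, 0.0]), np.array([1.0, 1.0])
        seeds = [np.array([x, y]) for x in np.linspace(0.1, 0.9, 3)
                                  for y in np.linspace(0.1, 0.9, 3)]
        for start, end in local_flow_map(seeds, block_min, block_max):
            print(start, "->", end)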