Visual Agentic System for Spatial Metric Query Answering in Remote Sensing Images
dc.contributor.author | Wang, Yinghao | en_US |
dc.contributor.author | Wang, Cheng | en_US |
dc.contributor.editor | Günther, Tobias | en_US |
dc.contributor.editor | Montazeri, Zahra | en_US |
dc.date.accessioned | 2025-05-09T09:31:46Z | |
dc.date.available | 2025-05-09T09:31:46Z | |
dc.date.issued | 2025 | |
dc.description.abstract | Accurately measuring real-world object dimensions from Remote Sensing (RS) images is crucial for applications in geospatial analysis and urban planning. Traditional Vision-Language Models (VLMs) struggle with spatial reasoning, while end-to-end remote sensing VLMs are often limited to predefined tasks such as image captioning. In this paper, we propose a visual agentic system for spatial metric query answering that dynamically integrates code-generation agents with a grounded remote sensing VLM and a Vision Specialist. Our system autonomously identifies reference objects, infers scale factors, and performs spatial measurements through structured subroutines. Experiments demonstrate that our approach achieves higher accuracy in footprint area estimation than state-of-the-art large language models with vision capabilities. | en_US |
dc.description.sectionheaders | Posters | |
dc.description.seriesinformation | Eurographics 2025 - Posters | |
dc.identifier.doi | 10.2312/egp.20251028 | |
dc.identifier.isbn | 978-3-03868-269-1 | |
dc.identifier.issn | 1017-4656 | |
dc.identifier.pages | 2 pages | |
dc.identifier.uri | https://doi.org/10.2312/egp.20251028 | |
dc.identifier.uri | https://diglib.eg.org/handle/10.2312/egp20251028 | |
dc.publisher | The Eurographics Association | en_US |
dc.rights | Attribution 4.0 International License | |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
dc.subject | CCS Concepts: Computing methodologies → Scene Understanding; Image Segmentation; Object Identification | |
dc.subject | Computing methodologies → Scene Understanding | |
dc.subject | Image Segmentation | |
dc.subject | Object Identification | |
dc.title | Visual Agentic System for Spatial Metric Query Answering in Remote Sensing Images | en_US |
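
The abstract describes inferring a scale factor from a reference object and then using it for spatial measurement. The arithmetic behind that idea can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the reference length, and the pixel counts are all hypothetical, and the sketch assumes a nadir view with uniform scale across the image.

```python
def scale_from_reference(known_length_m: float, length_px: float) -> float:
    """Return metres per pixel, given a reference object of known real-world length.

    Hypothetical helper: assumes the reference object's extent in pixels
    has already been measured (e.g. from a detection or segmentation).
    """
    if known_length_m <= 0 or length_px <= 0:
        raise ValueError("lengths must be positive")
    return known_length_m / length_px


def footprint_area_m2(pixel_count: int, metres_per_pixel: float) -> float:
    """Convert a footprint mask's pixel count into square metres.

    Each pixel covers metres_per_pixel ** 2 square metres under the
    uniform-scale assumption.
    """
    if pixel_count < 0:
        raise ValueError("pixel_count must be non-negative")
    return pixel_count * metres_per_pixel ** 2


# Illustrative numbers only: a reference object assumed to be 4.5 m long
# that spans 15 pixels, and a target footprint mask of 2000 pixels.
metres_per_pixel = scale_from_reference(known_length_m=4.5, length_px=15.0)
area = footprint_area_m2(pixel_count=2000, metres_per_pixel=metres_per_pixel)
print(f"scale: {metres_per_pixel:.3f} m/px, footprint: {area:.1f} m^2")
```

Because area scales with the square of the scale factor, a small error in the inferred metres-per-pixel value roughly doubles in relative terms in the area estimate, which is one reason the choice of reference object matters.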