InterChat: Enhancing Generative Visual Analytics using Multimodal Interactions

dc.contributor.author: Chen, Juntong (en_US)
dc.contributor.author: Wu, Jiang (en_US)
dc.contributor.author: Guo, Jiajing (en_US)
dc.contributor.author: Mohanty, Vikram (en_US)
dc.contributor.author: Li, Xueming (en_US)
dc.contributor.author: Ono, Jorge Piazentin (en_US)
dc.contributor.author: He, Wenbin (en_US)
dc.contributor.author: Ren, Liu (en_US)
dc.contributor.author: Liu, Dongyu (en_US)
dc.contributor.editor: Aigner, Wolfgang (en_US)
dc.contributor.editor: Andrienko, Natalia (en_US)
dc.contributor.editor: Wang, Bei (en_US)
dc.date.accessioned: 2025-05-26T06:37:06Z
dc.date.available: 2025-05-26T06:37:06Z
dc.date.issued: 2025
dc.description.abstract: The rise of Large Language Models (LLMs) and generative visual analytics systems has transformed data-driven insights, yet significant challenges persist in accurately interpreting users' analytical and interaction intents. While language inputs offer flexibility, they often lack precision, making the expression of complex intents inefficient, error-prone, and time-intensive. To address these limitations, we investigate the design space of multimodal interactions for generative visual analytics through a literature review and pilot brainstorming sessions. Building on these insights, we introduce a highly extensible workflow that integrates multiple LLM agents for intent inference and visualization generation. We develop InterChat, a generative visual analytics system that combines direct manipulation of visual elements with natural language inputs. This integration enables precise intent communication and supports progressive, visually driven exploratory data analyses. By employing effective prompt engineering and contextual interaction linking, alongside intuitive visualization and interaction designs, InterChat bridges the gap between user interactions and LLM-driven visualizations, enhancing both interpretability and usability. Extensive evaluations, including two usage scenarios, a user study, and expert feedback, demonstrate the effectiveness of InterChat. Results show significant improvements in the accuracy and efficiency of handling complex visual analytics tasks, highlighting the potential of multimodal interactions to redefine user engagement and analytical depth in generative visual analytics. (en_US)
dc.description.sectionheaders: Explainable and Generative AI
dc.description.seriesinformation: Computer Graphics Forum
dc.identifier.doi: 10.1111/cgf.70112
dc.identifier.issn: 1467-8659
dc.identifier.pages: 12 pages
dc.identifier.uri: https://doi.org/10.1111/cgf.70112
dc.identifier.uri: https://diglib.eg.org/handle/10.1111/cgf70112
dc.publisher: The Eurographics Association and John Wiley & Sons Ltd. (en_US)
dc.rights: Attribution 4.0 International License
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: CCS Concepts: Human-centered computing → Interactive systems and tools; Visual analytics; Computing methodologies → Natural language processing
dc.subject: Human-centered computing → Interactive systems and tools
dc.subject: Visual analytics
dc.subject: Computing methodologies → Natural language processing
dc.title: InterChat: Enhancing Generative Visual Analytics using Multimodal Interactions (en_US)
Files
Original bundle (3 files)
Name: cgf70112.pdf; Size: 4.23 MB; Format: Adobe Portable Document Format
Name: 1138-file-i8.pdf; Size: 3.38 MB; Format: Adobe Portable Document Format
Name: 1138-file-i9.mp4; Size: 25.05 MB; Format: Video MP4