A Multimodal Personality Prediction Framework based on Adaptive Graph Transformer Network and Multi-task Learning

dc.contributor.authorWang, Rongquanen_US
dc.contributor.authorZhao, Xileen_US
dc.contributor.authorXu, Xianyuen_US
dc.contributor.authorHao, Yangen_US
dc.contributor.editorBousseau, Adrienen_US
dc.contributor.editorDay, Angelaen_US
dc.date.accessioned2025-05-09T09:11:45Z
dc.date.available2025-05-09T09:11:45Z
dc.date.issued2025
dc.description.abstractMultimodal personality analysis aims to accurately detect personality traits by incorporating related multimodal information. However, existing methods focus on unimodal features while overlooking the bimodal association features crucial for this interdisciplinary task. Therefore, we propose a multimodal personality prediction framework based on an adaptive graph transformer network and multi-task learning. First, we utilize pre-trained models to learn specific representations from different modalities. Here, we employ the encoders of pre-trained multimodal models as the backbones of the modality-specific extraction methods to mine unimodal features. Specifically, we introduce a novel adaptive graph transformer network to mine personality-related bimodal association features. This network effectively learns higher-order temporal dependencies based on relational graphs and emphasizes more significant features. Furthermore, we utilize a multimodal channel attention residual fusion module to obtain the fused features, and we propose a multimodal and unimodal joint learning regression head to learn and predict scores for personality traits. We design a multi-task loss function to enhance the robustness and accuracy of personality prediction. Experimental results on two benchmark datasets demonstrate the effectiveness of our framework, which outperforms state-of-the-art methods. The code is available at https://github.com/RongquanWang/PPF-AGTNMTL.en_US
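The abstract describes a multi-task loss that couples a multimodal regression head with unimodal heads for trait-score prediction. The sketch below is a minimal, hypothetical illustration of such a joint loss in PyTorch; the class name, weighting scheme, and hyperparameters are assumptions for illustration and not the authors' implementation (see the repository linked above for the actual code).

```python
# Hypothetical sketch (not the authors' code): a multi-task regression loss that
# combines a fused multimodal head with per-modality unimodal heads.
import torch
import torch.nn as nn

class JointTraitLoss(nn.Module):
    """Weighted sum of multimodal and unimodal regression losses over trait scores."""
    def __init__(self, unimodal_weight: float = 0.5):
        super().__init__()
        self.unimodal_weight = unimodal_weight  # assumed hyperparameter
        self.mse = nn.MSELoss()

    def forward(self, fused_pred, unimodal_preds, target):
        # fused_pred:     (batch, num_traits) scores from the fused multimodal features
        # unimodal_preds: list of (batch, num_traits) scores, one per modality
        # target:         (batch, num_traits) ground-truth trait scores
        loss = self.mse(fused_pred, target)
        for pred in unimodal_preds:
            loss = loss + self.unimodal_weight * self.mse(pred, target)
        return loss

# Example usage (hypothetical tensor names):
# loss = JointTraitLoss()(fused_scores, [audio_scores, visual_scores, text_scores], labels)
```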
dc.description.number2
dc.description.sectionheadersFix it in Post: Image and Video Synthesis and Analysis
dc.description.seriesinformationComputer Graphics Forum
dc.description.volume44
dc.identifier.doi10.1111/cgf.70030
dc.identifier.issn1467-8659
dc.identifier.pages10 pages
dc.identifier.urihttps://doi.org/10.1111/cgf.70030
dc.identifier.urihttps://diglib.eg.org/handle/10.1111/cgf70030
dc.publisherThe Eurographics Association and John Wiley & Sons Ltd.en_US
dc.subjectCCS Concepts: Imaging/Video → Image/Video Processing; Interaction → Multimodal/Cross-modal Interaction; Methods/Applications → Artificial Intelligence/Machine Learning
dc.subjectImaging/Video → Image/Video Processing
dc.subjectInteraction → Multimodal/Cross-modal Interaction
dc.subjectMethods/Applications → Artificial Intelligence/Machine Learning
dc.titleA Multimodal Personality Prediction Framework based on Adaptive Graph Transformer Network and Multi-task Learningen_US
Files
Original bundle (2 files)
Name: cgf70030.pdf
Size: 1.42 MB
Format: Adobe Portable Document Format
Name: paper1128_1.pdf
Size: 127.84 KB
Format: Adobe Portable Document Format