PG2025 Conference Papers, Posters, and Demos
Browsing PG2025 Conference Papers, Posters, and Demos by Subject "based indexing and retrieval"
PF-UCDR: A Local-Aware RGB-Phase Fusion Network with Adaptive Prompts for Universal Cross-Domain Retrieval (The Eurographics Association, 2025)

Authors: Wu, Yiqi; Hu, Ronglei; Wu, Huachao; He, Fazhi; Zhang, Dejun; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene

Abstract: Universal Cross-Domain Retrieval (UCDR) aims to match semantically related images across domains and categories not seen during training. While vision-language pre-trained models offer strong global alignment, we are inspired by the observation that local structures, such as shapes, contours, and textures, often remain stable across domains, and thus propose to model them explicitly at the patch level. We present PF-UCDR, a framework built upon frozen vision-language backbones that performs patch-wise fusion of RGB and phase representations. Central to our design is a Fusing Vision Encoder, which applies masked cross-attention to spatially aligned RGB and phase patches, enabling fine-grained integration of complementary appearance and structural cues. Additionally, we incorporate adaptive visual prompts that condition image encoding based on domain and class context. Local and global fusion modules aggregate these enriched features, and a two-stage training strategy progressively optimizes alignment and retrieval objectives. Experiments on standard UCDR benchmarks demonstrate that PF-UCDR significantly outperforms existing methods, validating the effectiveness of structure-aware local fusion grounded in multimodal pretraining. Our code is publicly available at https://github.com/djzgroup/PF-UCDR.
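The abstract's central idea — extracting a phase representation of an image and fusing it with RGB features patch by patch via cross-attention — can be illustrated with a toy sketch. Everything below is an illustrative assumption, not the paper's architecture: the actual method uses masked cross-attention over frozen vision-language backbone features, whereas this sketch uses a Fourier phase-only reconstruction and a single-head attention over raw pixel patches.

```python
import numpy as np

def phase_image(img):
    # Phase-only reconstruction: per-channel 2D FFT, discard magnitude
    # (set to 1), keep the phase angle, then inverse-transform.
    # Phase preserves structural cues (edges, contours) while dropping
    # domain-specific appearance statistics -- the intuition behind
    # using phase as a domain-stable signal.
    F = np.fft.fft2(img, axes=(0, 1))
    return np.real(np.fft.ifft2(np.exp(1j * np.angle(F)), axes=(0, 1)))

def patchify(img, p=8):
    # Split an (H, W, C) image into non-overlapping p x p patches,
    # flattened to rows of a (num_patches, p*p*C) matrix.
    h, w, c = img.shape
    return (img.reshape(h // p, p, w // p, p, c)
               .transpose(0, 2, 1, 3, 4)
               .reshape(-1, p * p * c))

def cross_attention(q, kv):
    # Single-head cross-attention: each query patch (one modality)
    # attends over all key/value patches (the other modality).
    d = q.shape[-1]
    scores = q @ kv.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ kv

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))           # stand-in for an input image
rgb_patches = patchify(img)             # (16, 192)
phase_patches = patchify(phase_image(img))

# RGB queries attend to phase keys/values; a residual keeps appearance.
fused = rgb_patches + cross_attention(rgb_patches, phase_patches)
print(fused.shape)                      # (16, 192)
```

Spatial alignment comes for free here because both modalities are patchified on the same grid; the paper's masking would additionally restrict which phase patches each RGB patch may attend to.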