Modeling Sketches both Semantically and Structurally for Zero-Shot Sketch-Based Image Retrieval is Better

Date
2024
Publisher
The Eurographics Association
Abstract
A sketch, as a representation of human thought, is abstract, yet it is also structured because it is presented as a two-dimensional image. Modeling it from both semantic and structural perspectives is therefore reasonable and effective. In this paper, to capture semantics, we compare the performance of two mainstream pre-trained models on the Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) task and propose a new model, Semantic Net (SNET), built on Contrastive Language-Image Pre-training (CLIP) with a more effective fine-tuning strategy and a Semantic Preservation Module. Furthermore, we propose three lightweight modules, Channels Fusion (CF), Layers Fusion (LF), and Semantic Structure Fusion (SSF), to endow SNET with a stronger ability to capture structure. Finally, we supervise the entire training process with a classification loss based on contrastive learning and a bidirectional triplet loss based on a cosine distance metric. We call the final model Semantic Structure Net (SSNET). Quantitative experiments show that both the proposed SNET and its enhanced version SSNET achieve a new state of the art (a 16% retrieval boost on the most difficult QuickDraw Ext dataset). Visualization experiments further corroborate our view of sketch modeling.
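
For concreteness, the following is a minimal PyTorch-style sketch of a bidirectional triplet loss over a cosine distance metric, as named in the abstract. The batch-hard mining scheme, the margin value, and all function and variable names are illustrative assumptions and do not reproduce the authors' released implementation.

import torch
import torch.nn.functional as F

def bidirectional_triplet_loss(sketch_feat, image_feat, labels, margin=0.2):
    # Hypothetical sketch: cosine-distance triplet loss applied in both the
    # sketch-to-image and image-to-sketch directions (assumed mining strategy).
    s = F.normalize(sketch_feat, dim=-1)
    i = F.normalize(image_feat, dim=-1)
    dist = 1.0 - s @ i.t()                              # (B, B) pairwise cosine distance

    same = labels.unsqueeze(0) == labels.unsqueeze(1)   # positive-pair mask

    # Sketch -> image: hardest positive vs. hardest negative for each sketch.
    pos_s2i = (dist * same).max(dim=1).values
    neg_s2i = (dist + same.float() * 1e6).min(dim=1).values
    loss_s2i = F.relu(pos_s2i - neg_s2i + margin).mean()

    # Image -> sketch: same construction along the other axis of the distance matrix.
    pos_i2s = (dist * same).max(dim=0).values
    neg_i2s = (dist + same.float() * 1e6).min(dim=0).values
    loss_i2s = F.relu(pos_i2s - neg_i2s + margin).mean()

    return loss_s2i + loss_i2s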
CCS Concepts: Computing methodologies → Visual content-based indexing and retrieval

        
Citation

@inproceedings{10.2312:pg.20241309,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor    = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title     = {{Modeling Sketches both Semantically and Structurally for Zero-Shot Sketch-Based Image Retrieval is Better}},
  author    = {Jing, Jiansen and Liu, Yujie and Li, Mingyue and Xiao, Qian and Chai, Shijie},
  year      = {2024},
  publisher = {The Eurographics Association},
  ISBN      = {978-3-03868-250-9},
  DOI       = {10.2312/pg.20241309}
}