StyleBlend: Enhancing Style-Specific Content Creation in Text-to-Image Diffusion Models

Chen, Zichong; Wang, Shijin; Zhou, Yang

Files

cgf70034.pdf (144.21 MB)

Date

2025

Authors

Chen, Zichong
Wang, Shijin
Zhou, Yang

Publisher

The Eurographics Association and John Wiley & Sons Ltd.

Abstract

Synthesizing visually impressive images that align seamlessly with both text prompts and specific artistic styles remains a significant challenge in Text-to-Image (T2I) diffusion models. This paper introduces StyleBlend, a method designed to learn and apply style representations from a limited set of reference images, enabling the synthesis of content that is both text-aligned and stylistically coherent. Our approach decomposes style into two components, composition and texture, each learned through a different strategy. We then leverage two synthesis branches, each focusing on one of the style components, to blend styles effectively through shared features without affecting content generation. StyleBlend addresses the text misalignment and weak style representation that previous methods have struggled with. Extensive qualitative and quantitative comparisons demonstrate the superiority of our approach.

CCS Concepts: Computing methodologies → Image processing; Image representations

@article{10.1111:cgf.70034,
  journal = {Computer Graphics Forum},
  title = {{StyleBlend: Enhancing Style-Specific Content Creation in Text-to-Image Diffusion Models}},
  author = {Chen, Zichong and Wang, Shijin and Zhou, Yang},
  year = {2025},
  publisher = {The Eurographics Association and John Wiley & Sons Ltd.},
  ISSN = {1467-8659},
  DOI = {10.1111/cgf.70034}
}

URI

https://doi.org/10.1111/cgf.70034
https://diglib.eg.org/handle/10.1111/cgf70034

Collections

EG 2025 - Full Papers - CGF 44-Issue 2
44-Issue 2
