Text-Guided Interactive Scene Synthesis with Scene Prior Guidance

Date
2025
Journal Title
Computer Graphics Forum
Journal ISSN
1467-8659
Publisher
The Eurographics Association and John Wiley & Sons Ltd.
Abstract
3D scene synthesis from natural language instructions has become a popular direction in computer graphics, with data-driven generative models making significant progress in recent years. However, previous methods have mainly focused on one-time scene generation and lack the interactive capability to generate, update, or correct scenes according to user instructions. To overcome this limitation, this paper focuses on text-guided interactive scene synthesis. First, we introduce the SceneMod dataset, which comprises 168k paired scenes with textual descriptions of the modifications. To support the interactive scene synthesis task, we propose a two-stage diffusion generative model that integrates scene-prior guidance into the denoising process to explicitly enforce physical constraints and foster more realistic scenes. Experimental results demonstrate that our approach outperforms baseline methods on text-guided scene synthesis tasks. Our system expands the scope of data-driven scene synthesis and provides a novel, more flexible tool for users and designers in 3D scene generation. Code and dataset are available at https://github.com/bshfang/SceneMod.
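
As a rough illustration of what "scene-prior guidance in the denoising process" can mean (this is not the authors' implementation; `model.denoise_step`, `overlap_penalty`, and the box parameterization below are hypothetical placeholders, with the actual two-stage model described in the paper and released code), the following PyTorch sketch nudges each reverse-diffusion step down the gradient of a simple non-overlap energy so that physically implausible layouts are discouraged during sampling:

```python
# Minimal sketch of gradient-based scene-prior guidance in a diffusion sampler.
# All names here (denoise_step, overlap_penalty, guidance_scale) are illustrative
# assumptions, not the SceneMod API.
import torch


def overlap_penalty(boxes: torch.Tensor) -> torch.Tensor:
    """Toy scene prior: penalize pairwise overlap of axis-aligned 2D boxes.

    boxes: (N, 4) tensor of (cx, cy, w, h) for N objects.
    Returns a scalar energy that is zero when no objects overlap.
    """
    cx, cy, w, h = boxes.unbind(-1)
    # Pairwise overlap extent along each axis, clamped to be non-negative.
    dx = (0.5 * (w[:, None] + w[None, :]) - (cx[:, None] - cx[None, :]).abs()).clamp(min=0)
    dy = (0.5 * (h[:, None] + h[None, :]) - (cy[:, None] - cy[None, :]).abs()).clamp(min=0)
    area = dx * dy
    # Subtract self-overlap on the diagonal and halve to count each pair once.
    return (area.sum() - torch.diagonal(area).sum()) / 2


@torch.no_grad()
def guided_denoise(model, x_t, timesteps, guidance_scale=1.0):
    """Reverse diffusion with scene-prior guidance on object boxes."""
    for t in timesteps:
        # One ordinary reverse step from the (hypothetical) diffusion model.
        x_t = model.denoise_step(x_t, t)

        # Guidance: compute d(energy)/d(x_t) and take a small corrective step.
        with torch.enable_grad():
            x_in = x_t.detach().requires_grad_(True)
            energy = overlap_penalty(x_in)
            grad = torch.autograd.grad(energy, x_in)[0]
        x_t = x_t - guidance_scale * grad
    return x_t
```

The same pattern extends to other physical constraints (e.g., keeping objects inside room bounds) by adding further energy terms to the guidance objective.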
Description

CCS Concepts: Computing methodologies → Computer graphics; Natural language processing; Computer systems organization → Neural networks

        
@article{10.1111:cgf.70039,
  journal   = {Computer Graphics Forum},
  title     = {{Text-Guided Interactive Scene Synthesis with Scene Prior Guidance}},
  author    = {Fang, Shaoheng and Yang, Haitao and Mooney, Raymond and Huang, Qixing},
  year      = {2025},
  publisher = {The Eurographics Association and John Wiley & Sons Ltd.},
  ISSN      = {1467-8659},
  DOI       = {10.1111/cgf.70039}
}