Text-Guided Interactive Scene Synthesis with Scene Prior Guidance
dc.contributor.author | Fang, Shaoheng | en_US |
dc.contributor.author | Yang, Haitao | en_US |
dc.contributor.author | Mooney, Raymond | en_US |
dc.contributor.author | Huang, Qixing | en_US |
dc.contributor.editor | Bousseau, Adrien | en_US |
dc.contributor.editor | Dai, Angela | en_US |
dc.date.accessioned | 2025-05-09T09:12:43Z | |
dc.date.available | 2025-05-09T09:12:43Z | |
dc.date.issued | 2025 | |
dc.description.abstract | 3D scene synthesis using natural language instructions has become a popular direction in computer graphics, with data-driven generative models making significant recent progress. However, previous methods have mainly focused on one-time scene generation and lack the interactive capability to generate, update, or correct scenes according to user instructions. To overcome this limitation, this paper focuses on text-guided interactive scene synthesis. First, we introduce the SceneMod dataset, which comprises 168k paired scenes with textual descriptions of the modifications. Second, to support the interactive scene synthesis task, we propose a two-stage diffusion generative model that integrates scene-prior guidance into the denoising process to explicitly enforce physical constraints and foster more realistic scenes. Experimental results demonstrate that our approach outperforms baseline methods in text-guided scene synthesis tasks. Our system expands the scope of data-driven scene synthesis tasks and provides a novel, more flexible tool for users and designers in 3D scene generation. Code and dataset are available at https://github.com/bshfang/SceneMod. | en_US |
dc.description.number | 2 | |
dc.description.sectionheaders | Shape It Til You Make It: Programs for 3D Synthesis | |
dc.description.seriesinformation | Computer Graphics Forum | |
dc.description.volume | 44 | |
dc.identifier.doi | 10.1111/cgf.70039 | |
dc.identifier.issn | 1467-8659 | |
dc.identifier.pages | 12 pages | |
dc.identifier.uri | https://doi.org/10.1111/cgf.70039 | |
dc.identifier.uri | https://diglib.eg.org/handle/10.1111/cgf70039 | |
dc.publisher | The Eurographics Association and John Wiley & Sons Ltd. | en_US |
dc.rights | Attribution 4.0 International License | |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
dc.subject | CCS Concepts: Computing methodologies → Computer graphics; Natural language processing; Computer systems organization → Neural networks | |
dc.subject | Computing methodologies → Computer graphics | |
dc.subject | Natural language processing | |
dc.subject | Computer systems organization → Neural networks | |
dc.title | Text-Guided Interactive Scene Synthesis with Scene Prior Guidance | en_US |