Text2Autochrome: Text guided autochrome synthesis using generative models

Kühn, Paul Julius; Sinha, Saptarshi Neil; Nguyen, Duc Anh; Horst, Robin; Kuijper, Arjan; Fellner, Dieter W.

dc.contributor.author	Kühn, Paul Julius	en_US
dc.contributor.author	Sinha, Saptarshi Neil	en_US
dc.contributor.author	Nguyen, Duc Anh	en_US
dc.contributor.author	Horst, Robin	en_US
dc.contributor.author	Kuijper, Arjan	en_US
dc.contributor.author	Fellner, Dieter W.	en_US
dc.contributor.editor	Campana, Stefano	en_US
dc.contributor.editor	Ferdani, Daniele	en_US
dc.contributor.editor	Graf, Holger	en_US
dc.contributor.editor	Guidi, Gabriele	en_US
dc.contributor.editor	Hegarty, Zackary	en_US
dc.contributor.editor	Pescarin, Sofia	en_US
dc.contributor.editor	Remondino, Fabio	en_US
dc.date.accessioned	2025-09-05T08:39:07Z
dc.date.available	2025-09-05T08:39:07Z
dc.date.issued	2025
dc.description.abstract	Autochromes are early color photographs that are highly sensitive and prone to deterioration, which limits their public display. Only a limited collection of digitized autochromes exists, and many show defects due to their fragile nature. We applied generative AI methods, specifically Low-Rank Adaptation (LoRA), to fine-tune diffusion models while making efficient use of computational resources. Our curated dataset of vintage digitized autochromes covers various styles and served as the basis for training the LoRA model, which generates images that preserve the original color filter effects and characteristic granularity of autochromes. Generative AI also lets us use the multi-modal capabilities of the model, so that each user can generate images through concept-based prompts. This approach allows users to interact creatively with the model and produce personalized images while maintaining the historical color fidelity and structure of autochromes. It also enables us to generate defect-free autochromes, which can be used as synthetic training data for autochrome restoration. We evaluated our approach using the CLIPScore metric for quantitative similarity and conducted a user study for qualitative feedback on the generated images. Our results show that the fine-tuned LoRA model effectively captures the essence of autochromes, producing visually appealing images that respect the historical aesthetic. Considering the potential for misinterpretation and the ethical concerns surrounding deep-learning-based text-to-image methods applied to historical photographs, we are committed to enhancing transparency by releasing our model weights and training datasets, so that the community can better understand, evaluate, and address these issues. Furthermore, we release an interactive demo together with the fine-tuned weights via Hugging Face.	en_US
dc.description.sectionheaders	PERCEIVE: Exhibiting the "Unexhibitable"
dc.description.seriesinformation	Digital Heritage
dc.identifier.doi	10.2312/dh.20253061
dc.identifier.isbn	978-3-03868-277-6
dc.identifier.pages	10 pages
dc.identifier.uri	https://doi.org/10.2312/dh.20253061
dc.identifier.uri	https://diglib.eg.org/handle/10.2312/dh20253061
dc.publisher	The Eurographics Association	en_US
dc.rights	Attribution 4.0 International License
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	CCS Concepts: Computing methodologies → Neural networks
dc.subject	Computing methodologies → Neural networks
dc.title	Text2Autochrome: Text guided autochrome synthesis using generative models	en_US

Files

Original bundle

Name: dh20253061.pdf
Size: 20.24 MB
Format: Adobe Portable Document Format

Collections

Track 08 – Digital Technologies for Colour