Text2Autochrome: Text guided autochrome synthesis using generative models
dc.contributor.author | Kühn, Paul Julius | en_US |
dc.contributor.author | Sinha, Saptarshi Neil | en_US |
dc.contributor.author | Nguyen, Duc Anh | en_US |
dc.contributor.author | Horst, Robin | en_US |
dc.contributor.author | Kuijper, Arjan | en_US |
dc.contributor.author | Fellner, Dieter W. | en_US |
dc.contributor.editor | Campana, Stefano | en_US |
dc.contributor.editor | Ferdani, Daniele | en_US |
dc.contributor.editor | Graf, Holger | en_US |
dc.contributor.editor | Guidi, Gabriele | en_US |
dc.contributor.editor | Hegarty, Zackary | en_US |
dc.contributor.editor | Pescarin, Sofia | en_US |
dc.contributor.editor | Remondino, Fabio | en_US |
dc.date.accessioned | 2025-09-05T08:39:07Z | |
dc.date.available | 2025-09-05T08:39:07Z | |
dc.date.issued | 2025 | |
dc.description.abstract | Autochromes, products of an early color photography technique, are highly sensitive and prone to deterioration, which limits their public display. A limited collection of digitized autochromes exists, often with defects due to their fragile nature. We applied generative AI methods, specifically Low-Rank Adaptation (LoRA), to fine-tune diffusion models, enabling efficient use of computational resources. Our curated dataset of vintage digitized autochromes showcased various styles and served as the basis for training the LoRA model, resulting in the generation of digitized autochromes that preserved the original color filter effects and characteristic granularity. By leveraging generative AI, we can utilize the multi-modal capabilities of the model, allowing each user to generate images through concept-based prompts. This approach empowers users to creatively interact with the model, producing personalized images while maintaining the historical color fidelity and structure of autochromes. This capability also enables us to generate defect-free autochromes, which can be utilized for synthetic training in autochrome restoration efforts. We evaluated our approach using the CLIPScore metric for quantitative similarity and conducted a user study for qualitative feedback on the generated images. Our results show that the fine-tuned LoRA model effectively captures the essence of autochromes, producing visually appealing images that respect the historical aesthetic. Considering the potential for misinterpretation and ethical concerns surrounding text-to-image methods using deep learning with historical photographs, we are committed to enhancing transparency by releasing our model weights and training datasets, thereby empowering the community to better understand, evaluate, and address these important issues. Furthermore, we release an interactive demo together with the fine-tuned weights, available via Hugging Face. | en_US |
dc.description.sectionheaders | PERCEIVE: Exhibiting the "Unexhibitable" | |
dc.description.seriesinformation | Digital Heritage | |
dc.identifier.doi | 10.2312/dh.20253061 | |
dc.identifier.isbn | 978-3-03868-277-6 | |
dc.identifier.pages | 10 pages | |
dc.identifier.uri | https://doi.org/10.2312/dh.20253061 | |
dc.identifier.uri | https://diglib.eg.org/handle/10.2312/dh20253061 | |
dc.publisher | The Eurographics Association | en_US |
dc.rights | Attribution 4.0 International License | |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
dc.subject | CCS Concepts: Computing methodologies → Neural networks | |
dc.subject | Computing methodologies → Neural networks | |
dc.title | Text2Autochrome: Text guided autochrome synthesis using generative models | en_US |