Structuring and Embedding Image Captions: the V.I.F. Multi-modal System

Vasconcelos, Cristina N.; Sá, Asla M.; Sá, Marcio I.; Carvalho, Paulo Cezar P.

dc.contributor.author	Vasconcelos, Cristina N.	en_US
dc.contributor.author	Sá, Asla M.	en_US
dc.contributor.author	Sá, Marcio I.	en_US
dc.contributor.author	Carvalho, Paulo Cezar P.	en_US
dc.contributor.editor	David Arnold and Jaime Kaminski and Franco Niccolucci and Andre Stork	en_US
dc.date.accessioned	2013-11-08T10:32:38Z
dc.date.available	2013-11-08T10:32:38Z
dc.date.issued	2012	en_US
dc.description.abstract	In annotated historical photographic collections, we observe that certain subsets of important characters occur frequently and are usually described in captions. For many years, image captions were written as natural language texts intended to be read by humans. Today, retrieval of structured information is appealing, and migrating natural language captions to structured information is desirable in a variety of photographic collections. In this paper, we describe the Very Important Faces (V.I.F.) system, which is designed to graphically document the occurrence of distinguished characters within photographic collections and to store this information in a structured format useful for retrieval purposes. The V.I.F. system performs face detection on the image data and detects proper names in previously inserted captions, if any are present. The user matches names to faces through the software interface to produce a photo annotation that is stored according to structured information principles. Once the matching is done, an efficient verification tool is proposed, which helps the expert review the annotation by taking advantage of such multi-modal databases. The concept of annotation maturity level is also introduced.	en_US
dc.description.seriesinformation	VAST: International Symposium on Virtual Reality, Archaeology and Intelligent Cultural Heritage	en_US
dc.identifier.doi	10.2312/VAST/VAST12/025-032
dc.identifier.isbn	978-3-905674-39-2	en_US
dc.identifier.issn	1811-864X	en_US
dc.identifier.uri	https://doi.org/10.2312/VAST/VAST12/025-032	en_US
dc.publisher	The Eurographics Association	en_US
dc.subject	H.2.8 [Database Applications]	en_US
dc.subject	Image databases	en_US
dc.subject	Data mining	en_US
dc.subject	H.5.2 [User Interfaces]	en_US
dc.subject	Training, help, and documentation	en_US
dc.subject	I.5.4 [Applications]	en_US
dc.subject	Computer vision	en_US
dc.subject	Text processing	en_US
dc.subject	I.3.4 [Computer Graphics]	en_US
dc.subject	Graphics Utilities	en_US
dc.title	Structuring and Embedding Image Captions: the V.I.F. Multi-modal System	en_US

Files

Original bundle

Name: 025-032.pdf
Size: 569.6 KB
Format: Adobe Portable Document Format

Collections

2012: The 13th International Symposium on Virtual Reality, Archaeology and Intelligent Cultural Heritage