|
Heuristic method for automatic image annotation in HTML documents |
Published in: | Proceedings of the 13th Latin American and Caribbean Conference for Engineering and Technology: Engineering Education Facing the Grand Challenges, What Are We Doing? | |
Date of Conference: | July 29 - 31, 2015 |
Location of Conference: | Santo Domingo, Dominican Republic |
Authors: | Jorge Luis Betancourt González, Adisleydis Rodríguez Alvarez
|
Refereed Paper: | #57 |
|
Abstract: |
An automatic heuristic method for annotating images
embedded in HTML documents is presented. The method exploits
the tree structure of HTML documents to identify
the nodes nearest to an embedded image that contain relevant
information about it, and then uses the text in these nodes to expand the
information collected about the image, increasing the recall of a
Web search engine. The proposed heuristic was evaluated using
the Agreement Index: for each image, the text contained in the identified
nodes was assessed and assigned a category indicating
how well it related to (i.e. described) the image. In our
test cases the calculated Agreement Index was over 85%, validating
the proposed method.
Keywords-- image annotation, HTML, information retrieval,
search engine, web
|
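
The idea summarized in the abstract, walking the HTML tree outward from an image and collecting text from the nearest nodes, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify which nodes the heuristic selects, so the rule used here (take the image's own `alt`/`title` attributes, then the text of the closest ancestor that contains any text, looking at most `max_levels` levels up) is an assumption made for illustration. The names `Node`, `TreeBuilder`, `subtree_text`, `annotate_images` and `max_levels` are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not become "current".
VOID_TAGS = {"img", "br", "hr", "meta", "link", "input", "area",
             "base", "col", "embed", "source", "track", "wbr"}


class Node:
    """Minimal tree node; children are Node objects or text strings."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a simple tree and remembers every <img> node it sees."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root
        self.images = []

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag == "img":
            self.images.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag; ignore stray closers.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        # Skip whitespace-only runs and the contents of script/style elements.
        if data.strip() and self.current.tag not in ("script", "style"):
            self.current.children.append(data.strip())


def subtree_text(node):
    """Return all text fragments below a node, in document order."""
    parts = []
    for child in node.children:
        if isinstance(child, str):
            parts.append(child)
        else:
            parts.extend(subtree_text(child))
    return parts


def annotate_images(html, max_levels=2):
    """Map each image src to text gathered from its nearest text-bearing ancestor.

    Assumed heuristic (not taken from the paper): use alt/title first, then
    the closest ancestor, at most max_levels up, whose subtree contains text.
    """
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    annotations = {}
    for img in builder.images:
        parts = [img.attrs[k] for k in ("alt", "title") if img.attrs.get(k)]
        node = img.parent
        for _ in range(max_levels):
            if node is None or node is builder.root:
                break
            text = subtree_text(node)
            if text:
                parts.extend(text)
                break
            node = node.parent
        annotations[img.attrs.get("src") or ""] = " ".join(parts)
    return annotations


page = ('<div><p>Unrelated intro.</p><figure><img src="cat.jpg" alt="A cat">'
        '<figcaption>Tabby cat asleep on a sofa</figcaption></figure></div>')
print(annotate_images(page))
```

In this example the caption inside the enclosing `figure` is picked up, while the paragraph in the outer `div` is left out because the search stops at the first ancestor that yields text. A real system would likely need additional rules (for instance, limits on how much text a node may contribute), which the abstract does not describe.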