Del píxel a las resonancias visuales: la imagen con voz propia

Pilar Rosado, Eva Figueras y Ferran Reverter

AusArt Journal for Research in Art, Vol. 4, No. 1 (2016): Visualidad, Energía, Conectividad, pp. 19-28

DOI: 10.1387/ausart.16670



The objective of our research is to develop a series of computer vision programs to search for analogies in large datasets (in this case, collections of images of abstract paintings) based solely on their visual content, without textual annotation. We have programmed an algorithm based on a specific image-description model used in computer vision. This approach involves placing a regular grid over the image and selecting a pixel region around each node. Dense features computed over this regular grid with overlapping patches are used to represent the images. By analysing the distances between the whole set of image descriptors, we group them according to their similarity; each resulting group determines what we call a "visual word". This model is known as the Bag-of-Words representation. Given the frequency with which each visual word occurs in each image, we apply pLSA (Probabilistic Latent Semantic Analysis), a statistical model that classifies images fully automatically, without any textual annotation, according to their formal patterns. In this way, we hope to develop a tool both for producing and for analysing works of art.
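The pipeline outlined above (dense patches on a regular grid, clustering into visual words, Bag-of-Words histograms, and pLSA fitted by expectation-maximisation) can be sketched roughly as follows. The patch size, grid step, vocabulary size, and the plain k-means and EM implementations are illustrative assumptions for a minimal self-contained example, not the authors' actual configuration; a practical system would use richer local descriptors and much larger vocabularies.

```python
import numpy as np

def dense_patches(image, patch=8, step=4):
    """Flattened pixel patches sampled on a regular grid with overlap."""
    h, w = image.shape
    feats = [image[y:y + patch, x:x + patch].ravel()
             for y in range(0, h - patch + 1, step)
             for x in range(0, w - patch + 1, step)]
    return np.array(feats, dtype=float)

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means: cluster centres define the visual vocabulary."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)
    return centers

def bow_histogram(image, centers, patch=8, step=4):
    """Bag-of-Words: frequency of each visual word in one image."""
    X = dense_patches(image, patch, step)
    labels = ((X[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / hist.sum()

def plsa(counts, n_topics, iters=50, seed=0):
    """pLSA via EM on a document-word count matrix (images x visual words)."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))            # P(z | d)
    p_z_d /= p_z_d.sum(1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))           # P(w | z)
    p_w_z /= p_w_z.sum(1, keepdims=True)
    for _ in range(iters):
        # E-step: posterior P(z | d, w) for every document-word pair
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]  # docs x topics x words
        post = joint / joint.sum(1, keepdims=True).clip(1e-12)
        # M-step: re-estimate P(w | z) and P(z | d) from weighted counts
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(0)
        p_w_z /= p_w_z.sum(1, keepdims=True).clip(1e-12)
        p_z_d = weighted.sum(2)
        p_z_d /= p_z_d.sum(1, keepdims=True).clip(1e-12)
    return p_z_d, p_w_z
```

After fitting, each row of `p_z_d` gives an image's mixture over latent "aspects", so images can be grouped by their dominant aspect with no textual annotation at any stage, which is the property the abstract emphasises.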