Volume 8, Number 14, Article 18, Pages 1-26
doi:10.1167/8.14.18
http://journalofvision.org/8/14/18/
ISSN 1534-7362
Objects predict fixations better than early saliency
Wolfgang Einhäuser
Division of Biology, California Institute of Technology, Pasadena, CA, USA, & Department of Neurophysics, Philipps-University Marburg, Marburg, Germany

Merrielle Spain
Computation and Neural Systems, California Institute of Technology, Pasadena, CA, USA

Pietro Perona
Computation and Neural Systems, California Institute of Technology, Pasadena, CA, USA

Abstract
Humans move their eyes while looking at scenes and pictures. Eye movements correlate with shifts in attention and are thought to be a consequence of optimal resource allocation for high-level tasks such as visual recognition. Models of attention, such as “saliency maps,” are often built on the assumption that “early” features (color, contrast, orientation, motion, and so forth) drive attention directly. We explore an alternative hypothesis: Observers attend to “interesting” objects. To test this hypothesis, we measure the eye position of human observers while they inspect photographs of common natural scenes. Our observers perform different tasks: artistic evaluation, analysis of content, and search. Immediately after each presentation, our observers are asked to name objects they saw. Weighted with recall frequency, these objects predict fixations in individual images better than early saliency, irrespective of task. Also, saliency combined with object positions predicts which objects are frequently named. This suggests that early saliency has only an indirect effect on attention, acting through recognized objects. Consequently, rather than treating attention as a mere preprocessing step for object recognition, models of both need to be integrated.