Meaning, Not Salience, Draws Visual Attention

Your visual attention is drawn to parts of a scene that have meaning, rather than to those that are salient or “stick out,” new research from the Center for Mind and Brain at the University of California, Davis reveals. The findings overturn the widely held salience model of visual attention.

Our eyes perceive a wide field of view in front of us, but we focus our attention on only a small part of this field. How do we decide where to direct our attention, without thinking about it?

The dominant theory in attention studies is “visual salience,” Professor John Henderson, who led the research, said. Salient things are those that “stick out” from the background, like colorful berries on a background of leaves or a brightly lit object in a room.

The Magpie Theory

Professor Henderson said:

“A lot of people will have to rethink things. The saliency hypothesis really is the dominant view.”

Saliency is relatively easy to measure. You can map the amount of saliency in different areas of a picture by measuring relative contrast or brightness, for example.
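As a rough illustration of how contrast can be mapped, here is a minimal Python sketch. It is not the saliency model used in the research: the function name `contrast_saliency`, the 3x3 neighbourhood, and the toy image are all assumptions made for this example. Each pixel is scored by how far its brightness departs from the average brightness around it.

```python
def contrast_saliency(image):
    """Crude saliency map for a grayscale image given as a list of rows.

    Assumption for illustration: saliency is the absolute difference between
    a pixel and the mean of its 3x3 neighbourhood (the pixel itself included),
    scaled so that the largest value in the map is 1.0.
    """
    rows, cols = len(image), len(image[0])
    raw = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the neighbourhood, clipped at the image border.
            neighbours = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            raw[r][c] = abs(image[r][c] - sum(neighbours) / len(neighbours))
    peak = max(max(row) for row in raw)
    if peak == 0:
        return raw  # a perfectly uniform image has no contrast anywhere
    return [[value / peak for value in row] for row in raw]


# Toy 5x5 scene: one bright pixel on a dark background.
scene = [[10] * 5 for _ in range(5)]
scene[2][2] = 200
saliency = contrast_saliency(scene)
```

In this toy scene the bright pixel receives the highest score and the uniform corners receive none, which is the “stick out” idea in its simplest form.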

Henderson called this the “magpie theory”: our attention is drawn to bright and shiny objects.

“It becomes obvious, though, that it can’t be right,” he said. Otherwise, we would constantly be distracted.

Mapping Meaning

Henderson and postdoctoral researcher Taylor Hayes set out to test whether attention is guided instead by how “meaningful” we find an area within our view. They first had to construct “meaning maps” of test scenes, where different parts of the scene had different levels of meaning to an observer.

To make their meaning map, Henderson and Hayes took images of scenes, broke them up into overlapping circular tiles, and submitted the individual tiles to the online crowdsourcing service Mechanical Turk, asking users to rate the tiles for meaning.

By tallying the votes of Mechanical Turk users, they were able to assign levels of meaning to different areas of an image and create a meaning map comparable to a saliency map of the same scene.
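The tile-and-tally idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not give the tile sizes, the rating scale, or how overlapping tiles were combined, so the function `meaning_map`, the tile tuples, and the ratings below are all hypothetical. Here each pixel simply takes the average score of every circular tile that covers it.

```python
def meaning_map(width, height, tiles):
    """Combine rated circular tiles into a per-pixel meaning map.

    tiles: list of (center_x, center_y, radius, ratings), where ratings is a
    list of scores from different raters (hypothetical scale).
    Assumption for illustration: a pixel's meaning is the average of the mean
    ratings of all tiles covering it; uncovered pixels get 0.0.
    """
    totals = [[0.0] * width for _ in range(height)]
    counts = [[0] * width for _ in range(height)]
    for cx, cy, radius, ratings in tiles:
        tile_score = sum(ratings) / len(ratings)  # tally the raters' votes
        for y in range(height):
            for x in range(width):
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                    totals[y][x] += tile_score
                    counts[y][x] += 1
    return [
        [totals[y][x] / counts[y][x] if counts[y][x] else 0.0
         for x in range(width)]
        for y in range(height)
    ]


# Two overlapping tiles on a 6x4 scene, each rated by three raters.
tiles = [
    (1, 1, 2, [1, 2, 3]),  # left tile, judged low in meaning
    (4, 1, 2, [4, 6, 5]),  # right tile, judged high in meaning
]
meaning = meaning_map(6, 4, tiles)
```

Where the two tiles overlap, the pixel score falls between the two tile scores, so the map changes gradually across the scene instead of jumping at tile edges.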

Next, they tracked the eye movements of volunteers as they looked at the scene. Those eye tracks gave them a map of which parts of the image attracted the most attention.

This “attention map” was closer to the meaning map than to the salience map, Henderson said.
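One way to make “closer” concrete is to count fixations per location and correlate the resulting attention map with each candidate map. The Python sketch below does this with a Pearson correlation. The article does not say which comparison measure the researchers used, so the measure, the function names, and the toy numbers are all assumptions made for illustration, not results from the study.

```python
from math import sqrt


def fixation_counts(width, height, fixations):
    """Build an attention map by counting fixations at each (x, y) location."""
    counts = [[0] * width for _ in range(height)]
    for x, y in fixations:
        counts[y][x] += 1
    return counts


def pearson(map_a, map_b):
    """Pearson correlation between two maps of the same shape."""
    a = [value for row in map_a for value in row]
    b = [value for row in map_b for value in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = sum((x - mean_a) ** 2 for x in a)
    spread_b = sum((y - mean_b) ** 2 for y in b)
    return covariance / sqrt(spread_a * spread_b)


# Invented toy data on a 3x3 grid: viewers mostly look at the top-left corner.
fixations = [(0, 0), (0, 0), (0, 0), (1, 0), (2, 2)]
attention = fixation_counts(3, 3, fixations)

toy_meaning = [[5, 4, 1], [2, 1, 1], [1, 1, 2]]   # meaningful region top-left
toy_saliency = [[1, 1, 1], [1, 1, 1], [1, 1, 9]]  # bright spot bottom-right

r_meaning = pearson(attention, toy_meaning)
r_saliency = pearson(attention, toy_saliency)
```

With these invented numbers the attention map correlates more strongly with the toy meaning map than with the toy saliency map, mirroring the pattern Henderson describes.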

Building A Taxonomy

Henderson and Hayes don’t yet have firm data on what makes part of a scene meaningful, although they have some ideas. For example, a cluttered table or shelf attracted more attention than a highly salient splash of sunlight on a wall.

With further work, they hope to develop a “taxonomy of meaning,” Henderson said.

Although the research is aimed at a fundamental understanding of how visual attention works, there could be some near-term applications, Henderson said. Examples include automated visual systems that allow computers to scan security footage or to automatically identify or caption images online.

John M. Henderson & Taylor R. Hayes
Meaning-based guidance of attention in scenes as revealed by meaning maps
Nature Human Behaviour (2017) doi:10.1038/s41562-017-0208-0

Image: John Henderson and Taylor Hayes, UC Davis