## Data, Maps, and Scene Images for [Where the action could be: Speakers look at graspable objects and meaningful scene regions when describing potential actions][1]
**Citation:** Rehrig, G., Peacock, C. E., Hayes, T. R., Henderson, J. M., & Ferreira, F. (2020). Where the action could be: Speakers look at graspable objects and meaningful scene regions when describing potential actions. *Journal of Experimental Psychology: Learning, Memory, and Cognition, 46*(9), 1659-1681. https://doi.org/10.1037/xlm0000837
*Note:* The MATLAB code for crowdsourced feature map generation is available on the OSF: https://osf.io/654uh/
### Experiment 1
- Data
- Experiment1_Scene-Level_Analysis.txt
Data corresponding to the Experiment 1 scene-level analyses in the manuscript. Each row in the tab-delimited file corresponds to one scene. Variables are defined as follows (a MATLAB sketch showing how these values can be recomputed from the map files appears below, after the Experiment 1 listing):
- **scene:** The name of the scene image (file extension omitted).
- **CCR2_Salience:** Squared correlation (*R*^2) between the saliency map and the attention map for the current scene.
- **SPCCR2_Salience:** Squared semipartial correlation (*R*^2) between the saliency map and the attention map for the current scene, controlling for the variance the saliency map shares with the meaning map.
- **CCR2_Meaning:** Squared correlation (*R*^2) between the meaning map and the attention map for the current scene.
- **SPCCR2_Meaning:** Squared semipartial correlation (*R*^2) between the meaning map and the attention map for the current scene, controlling for the variance the meaning map shares with the saliency map.
- **SalienceMeaning_CCR2:** Squared correlation (*R*^2) between the meaning map and the saliency map for the current scene.
- **SalienceMeaning_CCR2_NP:** Squared correlation (*R*^2) between the meaning map and the saliency map for the current scene, excluding the scene's periphery (which is identical across maps due to the peripheral downweighting applied to both).
- Experiment1_Fixation_Analysis.txt
Data corresponding to the Experiment 1 fixation analyses in the manuscript. Each row in the tab-delimited file corresponds to one fixation in a scene. Variables are the same as in the scene-level file, except that the correlations between the meaning and saliency maps are omitted and an additional variable, **Fixation**, indicates the current fixation step.
- Maps
- **Attention:** Attention maps (.mat files) derived from viewer fixations.
- Scene-level analysis
Attention maps derived from fixations made within the entire viewing period (12 s).
- Fixation analysis
Attention maps derived from fixations at each time step in the fixation analysis (3 fixation steps were analyzed; 40 are shown in the figure). In the interest of space, only attention maps for the first three fixations in each scene are included.
- **Meaning:** Contextualized meaning maps (.mat files) for each scene.
- **Grasp:** Contextualized graspability maps (.mat files) for each scene.
- **Saliency:** Saliency maps (.mat files) for each scene, generated using [GBVS][3].
- **Scenes:** The 15 scene images presented in the experiment.
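
As a rough illustration of how the scene-level values relate to the map files, the MATLAB sketch below loads an attention, meaning, and saliency map for one scene and computes the squared correlations and squared semipartial correlations. The folder layout, the variable name assumed inside each .mat file (`map`), and the scene name are assumptions, not verified against the archive; adjust them to match the actual files before running.

```matlab
% Minimal sketch: recompute CCR2_* and SPCCR2_* for one scene from the maps.
% Assumptions: each .mat file stores a single 2-D matrix in a variable named
% 'map', and map files are named after the scene. Adjust paths as needed.
scene = 'example_scene';                                    % hypothetical name
att = load(fullfile('Maps', 'Attention', [scene '.mat']));
mea = load(fullfile('Maps', 'Meaning',   [scene '.mat']));
sal = load(fullfile('Maps', 'Saliency',  [scene '.mat']));
a = att.map(:);  m = mea.map(:);  s = sal.map(:);           % vectorize maps

% Squared correlations (CCR2_Meaning, CCR2_Salience)
r_am = corrcoef(a, m);  CCR2_Meaning  = r_am(1,2)^2;
r_as = corrcoef(a, s);  CCR2_Salience = r_as(1,2)^2;

% Squared semipartial correlations (SPCCR2_*): correlate attention with the
% part of one predictor map not explained by the other predictor map.
X_s = [ones(numel(s),1) s];  res_m = m - X_s * (X_s \ m);   % meaning | salience
X_m = [ones(numel(m),1) m];  res_s = s - X_m * (X_m \ s);   % salience | meaning
r_m = corrcoef(a, res_m);  SPCCR2_Meaning  = r_m(1,2)^2;
r_s = corrcoef(a, res_s);  SPCCR2_Salience = r_s(1,2)^2;

% Values reported in the data file can be read in for comparison:
T = readtable('Experiment1_Scene-Level_Analysis.txt', 'Delimiter', '\t');
```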
### Experiment 2
- Data
- Experiment2_Scene-Level_Analysis.txt
Data corresponding to the Experiment 2 scene-level analyses in the manuscript. Each row in the tab-delimited file corresponds to one scene in one condition (*articulatory suppression* or *control*). Variable definitions are identical to those for the Experiment 1 file; a short loading sketch appears below, after the Experiment 2 listing.
- Experiment2_Fixation_Analysis.txt
Data corresponding to the Experiment 2 fixation analyses in the manuscript. Each row in the tab-delimited file corresponds to one fixation in a scene. Variables are the same as in the scene-level file, except that the correlations between the meaning and saliency maps are omitted and an additional variable, **Fixation**, indicates the current fixation step.
- Maps
- **Attention:** Attention maps (.mat files) derived from viewer fixations.
- Scene-level analysis
Attention maps derived from fixations made within the entire viewing period (12 s).
- Fixation analysis
Attention maps derived from fixations at each time step in the fixation analysis (3 fixation steps were analyzed; 40 are shown in the figure). In the interest of space, only attention maps for the first three fixations in each scene are included.
- **Meaning:** Contextualized meaning maps (.mat files) for each scene.
- **Grasp:** Contextualized graspability maps (.mat files) for each scene.
- **Saliency:** Saliency maps (.mat files) for each scene, generated using [GBVS][3].
- **Scenes:** The 20 scene images presented in the main experiment.
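
For the two-condition data files, a short loading sketch may be helpful. The column name used for condition below (`Condition`) is an assumption, since the listing above does not give it; check the file header and substitute the actual name.

```matlab
% Minimal sketch: load the Experiment 2 scene-level file and summarize the
% squared semipartial correlations by condition. The condition column name
% ('Condition') is assumed, not taken from the archive.
T = readtable('Experiment2_Scene-Level_Analysis.txt', 'Delimiter', '\t');
disp(T.Properties.VariableNames)      % check the actual column names first

% Mean unique variance explained by salience vs. meaning in each condition
bySal = varfun(@mean, T, 'InputVariables', 'SPCCR2_Salience', ...
               'GroupingVariables', 'Condition');
byMea = varfun(@mean, T, 'InputVariables', 'SPCCR2_Meaning', ...
               'GroupingVariables', 'Condition');
disp(bySal); disp(byMea)
```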
### Experiment 3
- Data
- Experiment3_Scene-Level_Analysis.txt
Data corresponding to the Experiment 3 scene-level analyses in the manuscript. Each row in the tab-delimited file corresponds to one scene in one condition (*articulatory suppression* or *control*). Variable definitions are identical to those for the Experiment 1 file.
- Experiment3_Fixation_Analysis.txt
Data corresponding to the Experiment 3 fixation analyses in the manuscript. Each row in the tab-delimited file corresponds to one fixation in a scene. Variables are the same as in the scene-level file, except that the correlations between the meaning and saliency maps are omitted and an additional variable, **Fixation**, indicates the current fixation step.
- Maps
- **Attention:** Attention maps (.mat files) derived from viewer fixations.
- Scene-level analysis
Attention maps derived from fixations made within the entire viewing period (12 s).
- Fixation analysis
Attention maps derived from fixations at each time step in the fixation analysis (3 fixation steps were analyzed; 40 are shown in the figure). In the interest of space, only attention maps for the first three fixations in each scene are included.
- **Meaning:** Contextualized meaning maps (.mat files) for each scene.
- **Grasp:** Contextualized graspability maps (.mat files) for each scene.
- **Reach-weighted grasp:** Contextualized graspability maps weighted by reachability maps (.mat files) for each scene (an illustrative weighting sketch appears below, after the Experiment 3 listing).
- **Saliency:** Saliency maps (.mat files) for each scene, generated using [GBVS][3].
- **Scenes:** The 20 scene images presented in the main experiment.
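
The reach-weighted grasp maps are provided ready-made, so nothing needs to be recomputed. The sketch below only illustrates one plausible way a graspability map could be weighted by a reachability map (element-wise multiplication followed by rescaling); it is an assumption for illustration, not the procedure used to build the distributed maps. Consult the manuscript and the MATLAB code linked above for the actual method. The file paths and the in-file variable name are hypothetical.

```matlab
% Illustration only: one plausible way to weight a graspability map by a
% reachability map (element-wise product, rescaled to [0, 1]). This is an
% assumed scheme, not necessarily the one used to create the provided maps.
g = load(fullfile('Maps', 'Grasp',        'example_scene.mat'));  % hypothetical path
r = load(fullfile('Maps', 'Reachability', 'example_scene.mat'));  % hypothetical path
grasp = g.map;  reach = r.map;                                    % assumed variable name

weighted = grasp .* reach;                                        % element-wise weighting
weighted = (weighted - min(weighted(:))) ./ ...
           (max(weighted(:)) - min(weighted(:)));                 % rescale to [0, 1]
```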
### V-VSS 2020
- **Poster file:** GLR_VSS2020.pdf
- **Video presentation:** GLR V-VSS 2020.mp4
Additional details for V-VSS 2020 Poster #540
- **Authors:** Gwendolyn Rehrig, Candace E. Peacock, Taylor R. Hayes, John M. Henderson, & Fernanda Ferreira
- **Abstract:** The world around us is visually complex, yet we can efficiently describe it by extracting the information that is most relevant to convey. How do the properties of a real-world scene help us decide where to look and what to say about it? Image salience has been the dominant explanation for what drives visual attention and production as we describe what we see, but new evidence shows scene meaning predicts attention better than image salience. Another potentially important property is graspability, or the possible grasping interactions objects in the scene afford, given that affordances have been implicated in both visual and language processing. We quantified image salience, meaning, and graspability for real-world scenes. In three eyetracking experiments (N=30,40,40), native speakers described possible actions that could be carried out in a scene. We hypothesized that graspability would be task-relevant and therefore would preferentially guide attention. In two experiments using stimuli from a previous study (Henderson & Hayes, 2017) that were not controlled for camera angle or reachability, meaning explained visual attention better than either graspability or image salience did, and graspability explained attention better than salience. In a third experiment we quantified salience, meaning, graspability, and reachability for a new set of scenes that were explicitly controlled for reachability (i.e., reachable spaces containing graspable objects). In contrast with our results using previous stimuli, we found that graspability and meaning explained attention equally well, and both explained attention better than image salience. We conclude that speakers use object graspability to allocate attention to plan descriptions when scenes depict graspable objects that are within reach, and otherwise rely more on general meaning. Taken as a whole, the three experiments shed light on what aspects of meaning guide attention during scene viewing in language production tasks.
- **Video presentation:** Alternative link to the video presentation on Vimeo: https://vimeo.com/430815103
[1]: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7483632/