Action Planning Renders Objects in Working Memory More Attentionally Salient

Abstract A rapidly growing body of work suggests that visual working memory (VWM) is fundamentally action oriented. Consistent with this, we recently showed that attention is more strongly biased by VWM representations of objects when we plan to act on those objects in the future. Using EEG and eye tracking, here, we investigated neurophysiological correlates of the interactions between VWM and action. Participants (n = 36) memorized a shape for a subsequent VWM test. At test, a probe was presented along with a secondary object. In the action condition, participants gripped the actual probe if it matched the memorized shape, whereas in the control condition, they gripped the secondary object. Crucially, during the VWM delay, participants engaged in a visual selection task, in which they located a target as fast as possible. The memorized shape could either encircle the target (congruent trials) or a distractor (incongruent trials). Replicating previous findings, we found that eye gaze was biased toward the VWM-matching shape and, importantly, more so when the shape was directly associated with an action plan. Moreover, the ERP results revealed that during the selection task, future action-relevant VWM-matching shapes elicited (1) a stronger Ppc (posterior positivity contralateral), signaling greater attentional saliency; (2) an earlier PD (distractor positivity) component, suggesting faster suppression; (3) a larger inverse (i.e., positive) sustained posterior contralateral negativity in incongruent trials, consistent with stronger suppression of action-associated distractors; and (4) an enhanced response-locked positivity over left motor regions, possibly indicating enhanced inhibition of the response associated with the memorized item during the interim task. Overall, these results suggest that action planning renders objects in VWM more attentionally salient, supporting the notion of selection-for-action in working memory.


INTRODUCTION
In recent years, it has become clear that visual working memory (VWM) representations are more flexible than traditionally assumed. For example, depending on the specific task, the same object in working memory can be maintained by different brain areas (Henderson, Rademaker, & Serences, 2022) or be differentially represented within the same brain region (e.g., van Driel, Gunseli, Meeter, & Olivers, 2017). Working memories can also be dynamically updated. For instance, if participants receive a cue after encoding an array of visual stimuli, indicating which item is going to be probed in a subsequent memory test, the content of their VWM is adjusted accordingly, by discarding irrelevant items from memory and enhancing the representation of the relevant item (Poch, Campo, & Barnes, 2014; Kuo, Stokes, & Nobre, 2012). Furthermore, studies have uncovered that representations of items concurrently held in VWM can either repel or attract each other, depending on whether the task is to report all items (Bae & Luck, 2017) or only one item (Zerr, Gayet, & Van der Stigchel, 2024). This flexibility in how information is represented in VWM depending on anticipated task demands has led Nobre and Stokes (2019) to coin the term "premembering." This concept underscores an essential but often overlooked aspect of VWM: Its primary role is to optimally guide future behavior rather than solely to store past information.
In their least abstract form, task goals coincide with concrete motor plans, and indeed, there is evidence showing that even the planning of simple actions can alter how information is represented in VWM. For example, a study by Heuer, Crawford, and Schubö (2017) suggests that memory for certain features is strengthened depending on their relevance for a specific action. They showed that memory for size (but not color) improved when participants planned a grasping movement during the memory delay (consistent with size being relevant for grasping), whereas memory for color (but not size) improved when participants planned a pointing movement (consistent with color aiding localization). This phenomenon is consistent with what has been called "intentional weighting," which denotes that depending on one's action intentions, different aspects of a sensory representation are upweighted (Memelink & Hommel, 2013). Intentional weighting is thought to result from top-down influences of action planning on sensory processes, which might take place thanks to the instantiation of recurrent feedback connections from motor to visual areas (Olivers & Roelfsema, 2020). Such top-down modulation of activity in early visual regions would then emerge as visual attention; by strengthening the representation of sensory features that are action-relevant, these features are consequently prioritized.
An interesting aspect of intentional weighting is that it is automatic and occurs even when it brings no advantage to task performance. Indeed, it has mostly been observed in dual-task contexts in which participants performed both an action and a memory task, which were unrelated to each other. For example, in Heuer et al. (2017), participants memorized the colors of a memory array for a subsequent change detection task; during the memory delay, they were cued to plan and execute a pointing movement toward one of the placeholders previously occupied by one of the memory items. Results showed a memory advantage for items formerly located at the placeholder that coincided with the movement end goal, although the action cue did not bear any predictive value for the memory test. Most of the studies conducted so far have examined automatic action-induced modulations of information in VWM by means of similar dual-task setups, that is, in contexts in which the action was planned in parallel to, but not directly relevant for, or aimed at, an object held in VWM. However, in everyday life, items held in VWM are typically informative for our actions; in fact, they often are the direct target of our actions. It is therefore important to understand how action plans influence the memory representations of visual items when they are functionally linked to them. Yet, how intentional weighting may affect the representation of memoranda that are the direct target of an action is still unclear. It is conceivable that these memoranda benefit from a privileged, prioritized sensory status compared with memoranda that are equally relevant, but with which we do not plan to directly interact.
In a previous eye tracking study (Trentin, Slagter, & Olivers, 2023), we indeed found initial evidence that the VWM representation of an object that we plan to act on is prioritized (i.e., more attended to) compared with the VWM representation of the same object when it is not the direct target of an action plan. In that study, we capitalized on the phenomenon of VWM-induced attentional guidance (Soto, Hodsoll, Rotshtein, & Humphreys, 2008; Olivers, Meijer, & Theeuwes, 2006; Downing, 2000): the fact that visual attention is automatically drawn toward sensory features in the environment that match VWM content. The procedure was similar to the one illustrated in Figure 1. Participants were asked to memorize a geometric shape that was centrally presented on a touch screen. After a delay, their memory of the shape was tested, with a probe that could either match or not match the memorized shape. In the action condition, if the probe matched the item held in VWM, participants gripped the actual probe shape, whereas in the control condition, they performed a grip action toward a secondary object that always appeared on the screen together with the probe. In case of a no-match, participants withheld their planned action. During the memory delay, participants performed an intermediate visual selection task (VST), in which they were asked to detect and fixate a target letter, while ignoring a distractor character. Critically, the memorized shape was always present in the VST as an irrelevant stimulus, which either surrounded the target in congruent trials, or the distractor in incongruent trials, whereas the other item was surrounded by a nonmemorized circle. Although participants retained a shape in memory and planned an action in both the action and the control conditions, we found that planning an action directly on the shape held in VWM biased attention more toward this shape in the VST, affecting both preselection and postselection attentional processes. Specifically, in the action condition, (1) the target was fixated more slowly when the VWM-matching shape surrounded the distractor, and (2) initial eye movements were directed more toward the VWM-matching item in both congruent and incongruent trials. Moreover, participants fixated significantly longer on VWM-matching shapes in incongruent trials. Given that VWM-based attentional guidance increases with memory prioritization/strength (Williams, Brady, & Störmer, 2022; van Moorselaar, Theeuwes, & Olivers, 2014), these findings suggest that VWM representations were further prioritized or strengthened by their close association with the action. One's action intention can thus modulate the internal representation of information in VWM, and the stronger the sensorimotor link is, the more the VWM representation is prioritized (see also Trentin et al., 2024).
Figure 1. Schematic presentation of the task. Participants pressed the O key to initiate the trial and held it down either until the memory test (match trial) or for the whole trial duration (no-match trial). Participants then memorized a geometric shape. At test, a geometrical shape (the memory probe) was shown, which either matched the one kept in working memory (match trial) or not (no-match trial), together with a square button. In the action condition, in case of a match, participants performed an actual grip movement on the memory probe, as was registered by the touch screen. In the control condition, in case of a match, they performed the grip movement on the square button instead. In case of a no-match, no action was performed at all (participants instead kept holding the key). Thus, in all conditions, participants memorized a geometric shape and planned a grip movement, but only in the action condition could they plan the grip movement on the object held in mind. During the delay, participants performed a VST, in which they covertly located a target (i.e., the letter N) and reported its location (left or right) by means of a keypress with their left hand. To induce an action-readiness state in participants already early in the trial, we inserted 25% catch trials, in which the VST was skipped and immediately replaced by the working memory test. A different set of to-be-memorized shapes was used in each condition. The two sets were counterbalanced across participants.
In the current study, we aimed to determine the neural correlates underlying this action-based VWM prioritization effect using EEG. For this purpose, we adapted the task design used in Trentin and colleagues (2023) to be more suitable for EEG measurements, that is, by converting the overt VST into a covert attention task in which participants indicated the location of the target by means of a keypress rather than an eye movement, while keeping central fixation (as monitored with an eye-tracker). The task is illustrated in Figure 1. We expected to replicate our previous behavioral findings, namely, greater attentional biases toward VWM-matching items for VWM representations that are the deliberate target of an upcoming action, as compared with VWM representations that are equally task relevant, but not the target of a planned action. We predicted that we would find evidence for this prioritization in both manual RTs and gaze bias. Indeed, although participants were required to maintain fixation, eye position can still provide a sensitive measure of how attention is spatially allocated over time (e.g., van Ede, Chekroud, & Nobre, 2019). For this reason, we also looked at gaze biases, as a secondary measure of attentional biases.
At the neural level, we expected that when a memorized item serves as the target of a future action, its representation is strengthened, and this should render sensory-matching information more salient, as reflected in enhanced early visual processing of the VWM-matching shape during the intermittent VST in the action condition. We specifically predicted that our action manipulation would influence early lateralized ERP components such as the Ppc (posterior positivity contralateral), which is thought to reflect the sensory saliency of a visual item (Abbasi, Kadel, Hickey, & Schubö, 2022). In addition, we expected to find stronger attentional orienting toward items matching the geometric shape held in VWM when this was the direct target of an action, as reflected in the N2 posterior contralateral (N2pc), an ERP index of early allocation of spatial attention (Luck & Hillyard, 1994). Earlier studies have shown the N2pc to be responsive to memory-matching stimuli per se (Carlisle & Woodman, 2011; Mazza, Dallabona, Chelazzi, & Turatto, 2011; Kumar, Soto, & Humphreys, 2009), and we also expected to observe this here. However, in addition, we predicted that the N2pc would occur earlier and be larger in response to items matching VWM representations that are the target of an upcoming action compared with items matching VWM representations not directly linked to an action plan. Finally, given that, in our previous study (Trentin et al., 2023), we also found clear evidence for differences in postselection processes between the two conditions, as reflected in delayed disengagement of fixation from action-linked distractors, we also hypothesized that a lateralized ERP component following the N2pc would be affected by whether the VWM-matching shape was functionally linked to an action plan. In particular, we expected that the sustained posterior contralateral negativity (SPCN; Jolicoeur, Sessa, Dell'Acqua, & Robitaille, 2006), which is thought to index not only the gating of information into working memory but also the sustained attention following the visual selection of an item (Heuer, Mennig, Schubö, & Barke, 2021), would be earlier and larger (i.e., more negative) in response to shapes matching the action-relevant VWM shape.
Given the novelty of our question, we did not have further predictions about other possible ERP modulations driven by the coupling of the memory item to an action plan. However, we also ran an exploratory analysis across the whole scalp to identify additional nonlateralized ERP effects that may signal the prioritization of VWM representations that are the target of an action plan.

Participants
Forty-four participants took part in our study in exchange for course credits or money (10 euros per hour). The protocol was approved by the Scientific and Ethics Review Board of the Faculty of Behavior and Movement Sciences at the Vrije Universiteit Amsterdam. Participants who scored overall less than 75% in the memory test (n = 1), as well as those with an average RT in the VST that was more than 2.5 SDs above or below the group mean (n = 1), were excluded from further analyses. Moreover, participants for whom, after preprocessing, fewer than 150 usable trials remained (see Behavioral Responses section for trial exclusion criteria; n = 4) or with too many (>6) bad EEG electrodes (n = 2) were removed from the data set. This led to the exclusion of eight participants in total, leaving a final sample of 36 participants (age range: 18-28, 28 female participants). On the basis of the sample sizes used in previous studies that investigated action-induced modulations of ERP components (Wamain, Corveleyn, Ott, & Coello, 2019; Job, van Velzen, & de Fockert, 2017), we deemed our sample size adequate for the effects of interest.

Apparatus
Participants sat 40 cm from a DELL P2418HT Touch Screen (1080 × 1920 resolution, 60-Hz refresh rate) on which the task was presented. Their right eye was tracked with a sample frequency of 1000 Hz (Eyelink), and their brain activity was measured using 64 scalp EEG electrodes (BioSemi; ActiveTwo system, 10/20 placement; biosemi.com). Two external electrodes were placed on the mastoids and used for later rereferencing. Four external electrodes were positioned on the forearm (i.e., two over the lateral epicondyles and the other two over the flexor digitorum superficialis, similar to Echeverria-Altuna et al., 2022), and two additional external electrodes were placed on the face to record eye movement potentials. The EEG signal was sampled with a frequency of 1024 Hz. The experimental task was coded using OpenSesame 3.3.8 (Mathôt, Schreij, & Theeuwes, 2012).

Stimuli and Task
The task was virtually the same as the one described in Trentin and colleagues (2023), with a few changes to make it suitable for EEG recording. Participants were asked to keep their eyes at a central fixation cross (red, green, blue: 0,0,0; 0.86 × 0.86 dva) displayed throughout the trial. To start a trial, they had to press the O key with their right index finger and keep this key pressed down until they used the right hand to respond to the working memory test (see below). This prevented participants from starting to reach toward the screen too early in the trial. After a subsequent fixation period (1100-1300 msec, randomly jittered), a shape outline was presented at the center of the screen for 1500 msec, and participants were instructed to encode it into working memory. Each shape belonged to one of six possible geometric categories: a quadrilateral, a hexagon, a cross, a triangle, a heart, or a star. The shapes were white, and they were rendered on a gray background (red, green, blue: 128,128,128). All shapes subtended between 5.0° and 9.3° of visual angle. This was followed by a fixation period of 1600 msec, which in 75% of trials was followed by an intermittent VST. In this task, two shapes were presented at 6 visual degrees to the left and right of the fixation cross: a circle (with a diameter of 6.18°) and the currently memorized geometric shape. These geometric shapes were irrelevant for the VST, but within each shape, a white element was presented: either the letter N or a horizontal "hourglass" symbol. Both elements subtended an area of 1.49° × 1.24° (length × height). Participants were told to covertly find the letter N and to indicate its location by pressing the A key with their left middle finger if the target was on the left, and pressing the S key with their left index finger if the target was on the right. Participants were encouraged to find the letter N as fast as possible, while also being accurate. Once their response was registered, another fixation display was shown for 1600 msec. Finally, a working memory test was presented. Participants saw both a memory probe and a button-like shape, at a distance of 6 visual degrees above and below the fixation cross (position randomized across trials). If the probe matched the shape held in working memory, participants had to lift their right index finger from the O key and either grip the memory probe (in the action condition) or grip the button-like shape (in the control condition) with their right hand. Thus, in both the action and the control conditions, participants had to keep a geometric shape in working memory and plan a grip movement, but only in the action condition was the grip movement planned on the object held in mind. Here too, we encouraged participants to respond as fast as possible, while also being accurate. If the probe did not match the memorized shape, in both conditions, participants were instructed not to respond, but to keep holding down the earlier-mentioned O key. For each original memorized shape, we created two possible nonmatching probes, which were visually very similar to the original memorized shape. This was done to ensure that participants kept the geometric shape in a visual rather than a semantic format in working memory. Independently of whether it was a match or a no-match trial, the next trial began 2200 msec after working memory test onset. To induce an action-ready state in the participants, 25% of trials were catch trials, in which the VST was skipped and participants went directly from the shape encoding to a retention period of 1600 msec and the working memory test.
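The stimulus sizes and eccentricities above are given in degrees of visual angle. As a rough illustration of what these correspond to on the screen at the reported 40 cm viewing distance, the following sketch converts degrees to centimeters. It assumes a flat screen viewed perpendicularly; the function names are ours and are not taken from the experiment code.

```python
import math

VIEWING_DISTANCE_CM = 40.0  # viewing distance reported in the Apparatus section


def size_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Physical extent of a stimulus that is centered on the line of sight.

    Assumption: flat screen, perpendicular viewing, stimulus centered at fixation.
    """
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def eccentricity_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Distance on the screen between fixation and a point at the given eccentricity."""
    return distance_cm * math.tan(math.radians(angle_deg))


if __name__ == "__main__":
    # Offset of the VST shapes (6 degrees left/right of fixation).
    print(f"6 deg eccentricity: {eccentricity_to_cm(6.0):.2f} cm")
    # Range of sizes of the centrally presented memory shapes (5.0 to 9.3 degrees).
    print(f"memory shape: {size_to_cm(5.0):.2f} to {size_to_cm(9.3):.2f} cm")
```

Converting these values to pixels would additionally require the physical dimensions of the display, which are not reported here.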

Design and Procedure
The experiment consisted of an action condition and a control condition block. Block order was counterbalanced across participants. Each experimental block was made up of three sub-blocks of 64 trials each (192 trials per block), for 384 trials in total. A different set of shapes was used in each experimental block (counterbalanced across participants; see Figure 1). Each set consisted of six geometric shapes and two no-match versions of each shape, which were used throughout the whole block. Participants first practiced the task before each experimental block. They did so for two mini-blocks of 16 trials before the first experimental block, and for one mini-block of 16 trials before the second experimental block. More practice was given initially so that participants could familiarize themselves with the overall task structure. In each sub-block, participants performed the VST in 48 trials (75% of trials). Of these, 50% (i.e., 24 trials per sub-block) were congruent trials, in which the target letter N was located within the memorized geometric shape, whereas in the other 50% of trials, the memorized shape surrounded the nontarget element and the letter N was surrounded by the circle (incongruent trials). In incongruent trials, the memorized item was therefore a distractor in the VST. In each sub-block, there were 16 catch trials (25%) in which the VST was skipped altogether. In VST-present trials, the memory probe matched the memorized item on 28 trials (58.33%). In catch trials, the memory probe always matched the shape held in working memory, so as to promote action planning during encoding. Across all trials, there were ∼68.75% match trials and ∼31.25% no-match trials.
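The trial counts and percentages in this section follow from a few per-sub-block numbers. The short sketch below recomputes them as a consistency check; the variable names are ours, and the only inputs are figures stated in the text.

```python
# Design parameters stated in the text.
N_BLOCKS = 2                 # action block and control block
SUBBLOCKS_PER_BLOCK = 3
TRIALS_PER_SUBBLOCK = 64
VST_TRIALS = 48              # VST-present trials per sub-block
VST_MATCH_TRIALS = 28        # VST-present trials with a matching probe

# Derived quantities.
catch_trials = TRIALS_PER_SUBBLOCK - VST_TRIALS      # catch trials per sub-block
congruent_trials = VST_TRIALS // 2                   # half of the VST trials
match_trials = VST_MATCH_TRIALS + catch_trials       # catch trials always match
match_proportion = match_trials / TRIALS_PER_SUBBLOCK
trials_per_block = SUBBLOCKS_PER_BLOCK * TRIALS_PER_SUBBLOCK
total_trials = N_BLOCKS * trials_per_block

if __name__ == "__main__":
    print(f"catch trials per sub-block: {catch_trials}")
    print(f"VST proportion: {VST_TRIALS / TRIALS_PER_SUBBLOCK:.2%}")
    print(f"match proportion within VST trials: {VST_MATCH_TRIALS / VST_TRIALS:.2%}")
    print(f"overall match proportion: {match_proportion:.2%}")
    print(f"trials per block: {trials_per_block}, total: {total_trials}")
```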

Behavioral Responses
Manual responses during the VST. Practice blocks were excluded from the analysis. For each participant, only trials with an accurate response in both the VST and the working memory test were retained (on average 61.45% of the trials). Trials with excessively fast or slow RTs were also removed from the analysis independently for each participant, for each combination of factors and levels (i.e., action condition [action/control] × shape congruency [congruent/incongruent]). Outliers were defined as those data points falling above or below 2.5 SDs from the participant's mean response. Moreover, trials in which a saccade greater than 1 dva in a time window from −50 to 650 msec around VST onset was detected were dropped, to minimize the possibility of eye movements contaminating our ERP analyses. Following the application of these exclusion criteria, on average, ∼86% (203.38 ± 25.95 trials; range: 151-241 trials) of the correct trials were retained for the behavioral and ERP analyses. We also checked for outlier participants at the group level: participants with manual RTs falling above or below 2.5 SDs from the group RT mean in the VST. No outlier participants were detected.
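The RT trimming rule can be sketched as follows. This is a minimal illustration and not the analysis code used in the study: it assumes that the mean and SD are computed within each participant × action condition × shape congruency cell, and the trial dictionary keys (`participant`, `condition`, `congruency`, `rt`) are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_rt_outliers(trials, n_sd=2.5):
    """Drop trials whose RT lies more than n_sd SDs from the mean of their cell.

    A cell is one participant x action condition x shape congruency combination
    (assumption: the cutoff is computed separately within each cell).
    """
    cells = defaultdict(list)
    for trial in trials:
        key = (trial["participant"], trial["condition"], trial["congruency"])
        cells[key].append(trial)

    kept = []
    for cell_trials in cells.values():
        rts = [t["rt"] for t in cell_trials]
        if len(rts) < 2:
            # An SD cannot be estimated from a single trial; keep it as is.
            kept.extend(cell_trials)
            continue
        cell_mean, cell_sd = mean(rts), stdev(rts)
        kept.extend(t for t in cell_trials if abs(t["rt"] - cell_mean) <= n_sd * cell_sd)
    return kept


# Toy data: nine ordinary trials and one very slow trial in a single cell.
demo_trials = [
    {"participant": 1, "condition": "action", "congruency": "congruent", "rt": rt}
    for rt in [500] * 9 + [1500]
]
demo_kept = trim_rt_outliers(demo_trials)
```

In the toy data, the cell mean is 600 msec and the SD is about 316 msec, so the 1500-msec trial lies beyond the 2.5 SD cutoff and is removed, whereas the nine 500-msec trials are kept.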
Manual RTs in the VST were analyzed by means of a linear mixed-effects model in R (nlme package). To determine the best fitting model, we followed the protocol proposed by Zuur, Ieno, Walker, Saveliev, and Smith (2009). First, we took the model with the most complex combination of fixed effects, that is, the categorical factors action condition, shape congruency, and their interaction. Then, we varied the random part of the model to identify the best combination and structure for the random effects. The variable participant was entered as a random effect, and for each fixed effect, we modeled both random intercept and slope (i.e., 1 + action condition * shape congruency | participant). Once the best random part of the model was identified, we evaluated the winning model by varying the fixed effects. The winning model coincided with the most complex model initially considered (i.e., fixed effects: action condition, shape congruency, and their interaction; random slopes: action condition, shape congruency, and their interaction).
Gaze bias analysis. Previous studies show that even when observers are instructed to fixate, eye position provides a sensitive measure of how attention is spatially allocated over time (van Ede et al., 2019). We therefore looked at gaze bias as a secondary measure of attentional biases. For each trial with a correct response in both the VST and the working memory test, we extracted the x coordinates of the right eye, as output by the Eyelink. We then computed the gaze bias with respect to the location of the memorized shape and averaged it across conditions and, using a temporal cluster permutation, identified a time window during which the signal was significantly different from zero (from 236 msec to 650 msec after VST onset). We used the mne function mne.stats.permutation_cluster_1samp_test for this analysis (Gramfort et al., 2013). We then computed the gaze bias for each combination of factors separately and computed the mean amplitude and onset latency of the bias to test whether participants displayed a greater bias toward the shape held in working memory in the action compared with the control condition. Differences in mean amplitude were tested with a repeated-measures ANOVA, with Condition (action, control), Shape Congruency (congruent, incongruent), and their interaction as within-subject factors. Differences in onset latencies (50% peak latencies) were tested by using a jack-knife approach developed for factorial experiments (Ulrich & Miller, 2001) with the same predictors as in the mean amplitude analysis.
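The jack-knife estimation of 50% peak latencies can be illustrated with a small sketch. It is not the code used in the study and rests on several simplifying assumptions: the deflection of interest is positive-going, the onset is taken as the first sample that reaches the criterion (no interpolation between samples), and all function names are ours.

```python
from statistics import mean


def fractional_peak_latency(times, wave, fraction=0.5):
    """Return the first time point at which `wave` reaches `fraction` of its peak.

    Assumption: the deflection is positive-going; no interpolation is applied.
    """
    threshold = fraction * max(wave)
    for time_point, value in zip(times, wave):
        if value >= threshold:
            return time_point
    raise ValueError("waveform never reaches the threshold")


def jackknife_latencies(times, subject_waves, fraction=0.5):
    """Leave-one-out latencies: one value per omitted participant.

    Each value is measured on the grand average of the remaining participants.
    """
    latencies = []
    for left_out in range(len(subject_waves)):
        kept = [w for i, w in enumerate(subject_waves) if i != left_out]
        grand_average = [mean(samples) for samples in zip(*kept)]
        latencies.append(fractional_peak_latency(times, grand_average, fraction))
    return latencies


# Toy data: three participants, five time points (msec).
demo_times = [0, 100, 200, 300, 400]
demo_waves = [
    [0, 3, 4, 2, 0],
    [0, 1, 4, 2, 0],
    [0, 1, 4, 4, 0],
]
demo_latencies = jackknife_latencies(demo_times, demo_waves)
```

Because leave-one-out averages vary far less than single-participant waveforms, test statistics computed on these values must be corrected before evaluation; for an ANOVA, the F value is divided by (n − 1)², where n is the number of participants (Ulrich & Miller, 2001).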
Performance on the working memory test. We assigned a score to each trial, depending on whether participants provided a correct response in the working memory test (score = 1) or not (score = 0). We considered only those trials in which participants provided a correct response in the VST. We then looked for differences in memory accuracy between the action and control condition by means of a generalized linear mixed model, assuming the data would follow a binomial distribution. We used the R package lme4 to run this analysis. To determine the best fitting model, we followed again the protocol proposed by Zuur and colleagues (2009). First, we considered the model with the most complex combination of fixed effects: the categorical factors Action Condition, Probe Match (match/no-match), and Trial Type (catch/VST), and their interactions. Second, we modeled the random part of the model and identified the best combination of random effects. After choosing the best random part of the model, we varied the fixed effects to identify the winning model. The best model included as fixed effects the categorical variables Action Condition, Probe Match (match/no-match), and their interaction, and as random slopes the variables Action Condition, Probe Match, and Trial Type (catch/VST), but not their interactions (i.e., 1 + action condition + probe match + trial type | participant).

EEG Data
Both the EEG preprocessing and the ERP analysis were conducted in Python 3.8.8 (van Rossum & Drake, 2009). For most of the functions, we relied on the mne package (Gramfort et al., 2014) and on the DvM GitHub repository (van Moorselaar, 2018).
EEG preprocessing. EEG data were rereferenced to the average signal of the right and left mastoid channels, and band-pass filtered between 0.1 and 200 Hz with a finite impulse response filter. We epoched the data from −0.4 to 7 sec relative to shape encoding onset. Bad channels were identified by means of a random sample consensus analysis and temporarily removed. Additional bad electrodes were identified upon visual inspection by the experimenter. Moreover, epochs with excessive noise across multiple channels were manually removed from the analysis (mean = 9.5, SD = 6.7 epochs). An independent component analysis (ICA) was then run on a copy of the cleaned original epochs after (1) applying a higher high-pass filter of 1 Hz, as it was found in previous work that this greatly benefits the ICA (Winkler, Debener, Müller, & Tangermann, 2015), and (2) baselining the epochs, by using the whole epoch as baseline, which has also been shown to improve the ICA (Groppe, Makeig, & Kutas, 2009). Components linked to eye-blinks were detected by correlating the EEG signal of the different ICA components to that of the VEOG electrodes. Additional components capturing other artifacts (e.g., line-noise artifacts) were identified. We removed the identified ICA components from the original data set and further cleaned the data by means of the mne Autoreject function (Jas, Engemann, Bekhti, Raimondo, & Gramfort, 2017). As indicated in the Behavioral Responses section, we controlled for horizontal eye movements by dropping those epochs in which participants performed an eye movement larger than 1 dva in the time window from −50 msec to 650 msec relative to VST onset, as measured using eye tracking. We then interpolated the bad electrodes initially excluded from the analysis, and finally applied a surface Laplacian on the epoched data to improve spatial resolution. The surface Laplacian (Perrin, Pernier, Bertrand, & Echallier, 1989) is a spatial high-pass filter that reduces volume conduction artifacts by attenuating low spatial frequencies in the data, which are typically associated with deep sources. This sharpens the EEG topography and reduces the correlation between close-by electrodes (Kayser & Tenke, 2015). We used a 20th-order Legendre polynomial and a smoothing factor (lambda) of 10⁻⁵ to estimate the surface Laplacian. These preprocessing steps led to the exclusion of, on average, 7% of trials (minimum-maximum across subjects: 3‰-23%).
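The blink-component detection step, in which ICA component time courses are correlated with the VEOG signal, can be sketched in a few lines. This is an illustration only: the correlation threshold of 0.8 is a hypothetical value (no cutoff is reported above), the function names are ours, and the actual analysis relied on mne and the DvM repository rather than on this code.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def flag_blink_components(component_timecourses, veog, threshold=0.8):
    """Indices of components whose absolute correlation with VEOG meets the threshold.

    The threshold is a hypothetical choice for this sketch. The absolute value is
    used because the polarity of an ICA component is arbitrary.
    """
    return [
        index
        for index, timecourse in enumerate(component_timecourses)
        if abs(pearson_r(timecourse, veog)) >= threshold
    ]


# Toy data: a VEOG trace with two blinks and two components.
demo_veog = [0, 0, 5, 0, 0, 6, 0]
demo_components = [
    [-2 * v for v in demo_veog],  # inverted, scaled copy of the blink signal
    [1, 2, 1, 2, 1, 2, 1],        # oscillation largely unrelated to the blinks
]
demo_flagged = flag_blink_components(demo_components, demo_veog)
```

In practice, flagged components would still be inspected visually before removal, in line with the manual identification of additional artifact components described above.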
ERP analysis. After down-sampling the EEG data to 256 Hz, we analyzed both stimulus- and response-locked ERPs following the VST onset. For the stimulus-locked ERPs, we focused our analysis on a period up to 650 msec after the VST onset, which contained our lateralized ERP components of interest (e.g., the N2pc, SPCN). We selected a priori the electrodes P7/P8, P9/P10, and PO7/PO8, which are canonically associated with ERP components signaling sensory processing and visual attention allocation (e.g., Forschack, Gundlach, Hillyard, & Müller, 2022; Papaioannou & Luck, 2020). First, we computed the averaged contralateral and ipsilateral signal with respect to the VWM-matching item (n.b.: independent of the target location in the VST) for each electrode pair across all trials and computed their difference. We then ran a temporal cluster permutation analysis, to determine when the thus created lateralized ERP signal was different from zero. This identified four windows of interest: t1 = (122 msec, 138 msec), t2 = (162 msec, 205 msec), t3 = (220 msec, 240 msec), t4 = (275 msec, 650 msec). These time windows roughly correspond to the latencies associated with well-known ERP components: the Ppc (Corriveau et al., 2012), N2pc (Luck & Hillyard, 1994), PD (Hickey, Di Lollo, & McDonald, 2009), and SPCN (Klaver, Talsma, Wijers, Heinze, & Mulder, 1999). Because, upon visual inspection, we noticed that, at a later stage, the SPCN voltage returned to zero only in incongruent trials, we split the SPCN time window into two parts (early, i.e., t4 = [275 msec, 400 msec], and late, i.e., t4b = [400 msec, 650 msec]) to better capture this change of pattern, and analyzed the two time windows separately. For each of the identified time intervals, we computed mean voltage amplitude and 50% peak latency (with the jack-knife approach) for each combination of factors and levels. We then fed the data into separate repeated-measures ANOVAs (run in R), with the within-subject factors Action Condition and Shape Congruency. We did not have a priori hypotheses about other potential differences in nonlateralized ERPs between conditions. We therefore also ran an exploratory spatio-temporal cluster permutation repeated-measures ANOVA while considering the signal recorded at all electrodes and with again Action Condition and Shape Congruency as within-subject factors. We did so for both stimulus-locked and response-locked ERPs. This allowed us to identify spatio-temporal clusters that differentiated between the two action conditions and/or between congruent and incongruent trials.
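The computation of the lateralized signal and of the mean amplitude within each window can be sketched as follows. The window boundaries are those reported above; everything else (function names, the per-trial data layout, the toy values) is assumed for illustration. The sketch subtracts ipsilateral from contralateral activity per trial and then averages, which is arithmetically equivalent to averaging first and subtracting afterward when the same trials enter both averages.

```python
from statistics import mean

# Time windows (msec after VST onset) reported in the text.
WINDOWS_MS = {
    "Ppc": (122, 138),
    "N2pc": (162, 205),
    "PD": (220, 240),
    "SPCN_early": (275, 400),
    "SPCN_late": (400, 650),
}


def contra_minus_ipsi(left_site, right_site, item_side):
    """Contralateral minus ipsilateral voltage for one electrode pair (e.g., PO7/PO8).

    `item_side` is the side of the VWM-matching shape, not of the VST target.
    """
    if item_side == "left":
        contra, ipsi = right_site, left_site
    elif item_side == "right":
        contra, ipsi = left_site, right_site
    else:
        raise ValueError("item_side must be 'left' or 'right'")
    return [c - i for c, i in zip(contra, ipsi)]


def average_waves(waves):
    """Point-by-point average across trials (or participants)."""
    return [mean(samples) for samples in zip(*waves)]


def mean_amplitude(times_ms, wave, window):
    """Mean voltage of `wave` within an inclusive time window."""
    start, stop = window
    samples = [v for t, v in zip(times_ms, wave) if start <= t <= stop]
    if not samples:
        raise ValueError("no samples fall inside the window")
    return mean(samples)


# Toy data: two trials, five time points, one electrode pair.
demo_times = [100, 130, 180, 230, 300]
demo_trial_a = contra_minus_ipsi([0, 0, 0, 0, 0], [0, 2, -4, 1, -2], "left")
demo_trial_b = contra_minus_ipsi([0, 4, -2, 3, -4], [0, 0, 0, 0, 0], "right")
demo_lateralized = average_waves([demo_trial_a, demo_trial_b])
```

With a real 256-Hz signal, each window would contain several samples per electrode pair, and the resulting amplitudes would be computed per participant and condition before entering the ANOVA.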

Behavioral Results
Manual Responses and Gaze Bias in the VST
Figure 2A depicts participants' RTs during the VST. As can be seen in this figure, numerically, we observed a normal congruency effect in the action condition (i.e., faster RTs when the VWM-matching shape surrounded the target vs. the distractor), but a reverse congruency effect in the control condition. This was reflected in a significant interaction between the variables Action Condition and Shape Congruency (log-ratio = 7.358, p = .007). However, post hoc pairwise comparisons revealed that neither of the two conditions showed a significant difference between congruent and incongruent trials (action condition: t-ratio = 1.274, p = .203; control condition: t-ratio = 1.305, p = .192). There was no significant main effect of Action Condition (log-ratio = 1.142, p = .285) nor of Shape Congruency (log-ratio ∼ 0, p = .987). There were significant effects of the covariates Target Position (log-ratio = 5.157, p = .0232) and Target Repetition (log-ratio = 109.7, p < .001). The first effect reflected that participants were somewhat faster (about 17 msec) in responding to targets located on the left of the fixation cross, whereas the second effect indicated that participants were faster (about 40 msec) in detecting the target when it appeared at the same location as in the previous trial.
As a further measure of attention, we analyzed gaze bias over time during the VST. As can be seen in Figure 2C, participants were more likely to look toward the shape held in working memory in both congruent and incongruent trials in both action conditions (all curves were found to be significantly different from zero between 273 msec and 490 msec after display onset, all ps < .001). Moreover, as predicted, we found that when the memorized shape was the direct target of a planned action, it biased participants' gaze even more than in the control condition, resulting in a larger mean amplitude toward the memory-matching shape in the action than in the control condition (F = 4.035, p = .047, ηp² = .04), regardless of whether the shape surrounded the target or not (there was no significant interaction between the factors Shape Congruency and Condition: F = 2.081, p = .152, ηp² = .02). As expected, we also found that in both conditions, participants' eyes were attracted more by the memorized shape when it surrounded the target (main effect of Shape Congruency; F = 8.111, p = .005, ηp² = .07). We observed no significant differences in the latencies of the gaze biases (all ps > .26).
To summarize, the manual RT analysis indicates that when a memorized stimulus is the target of a prospective action, it interferes with an intermediate perceptual task differently, and in fact in the opposite direction, than when the memorized stimulus is not the direct recipient of the planned action. The gaze bias analysis further reveals that this difference in VWM-guidance of attention coincides with a prioritization of VWM items that are more tightly coupled with an action, corroborating our previous findings (Trentin et al., 2023).

ERPs
We next examined how, at the neural level, action planning on an object in VWM may bias perception and attention (during the VST), focusing first on several lateralized ERP components that are typically linked to preselection and postselection attentional processes, and which are shown in Figure 3. Note again that lateralized here denotes lateralized with respect to the shape matching the one in VWM, not to the search target.
SPCN-like component. The last observable lateralized ERP component following the onset of the VST was a SPCN-like component, which is typically associated with postselection processes such as target encoding into working memory (Hilimire, Mounts, Parks, & Corballis, 2011), and sustained attention toward/in-depth analysis of the search target (Heuer et al., 2021). As explained in the Methods section (ERP analysis), this component was analyzed in two different time bins to isolate a clear change of pattern that we observed in the ERP signal in the later part of the SPCN (see Figure 3B). First, we conducted mean amplitude and latency analyses on the time window between 275 msec and 400 msec after the onset of the VST. We found a main effect of Shape Congruency on the ERP mean amplitude, reflecting a larger negative deflection in congruent than in incongruent trials (F = 65.078, p < .001, CI [−4.094, −2.478], ηp² = .38) (μ_Cong ± SE_Cong = −6.257 ± 0.862 μV/cm²; μ_Incong ± SE_Incong = −0.837 ± 0.961 μV/cm²). This finding confirms that the SPCN mainly reflects target processing (Wang, Yang, Jin, Zhang, & Li, 2019). We did not observe an effect of Action Condition (F = 1.062, p = .305, CI [−0.388, 1.123], ηp² = .01) (μ_A ± SE_A = −2.060 ± 1.105 μV/cm²; μ_C ± SE_C = −3.360 ± 1.063 μV/cm²) nor an interaction between Shape Congruency and Action Condition (F = 0.826, p = .366, CI [−1.178, 0.436], ηp² = .01) in this early SPCN time window. In the later time window, from 400 msec to 650 msec after the onset of the VST, we again observed a main effect of Shape Congruency (F = 55.288, p < .001, CI [−4.498, −2.604], ηp² = .34), but this time, we also observed a significant main effect of Action Condition (F = 6.480, p = .0124, CI [−0.112, 1.417], ηp² = .06) in the presence of a significant interaction between Shape Congruency and Action Condition (F = 4.226, p = .042, CI [−1.147, −0.021], ηp² = .04). Post hoc comparisons indicated that in incongruent trials, in the action condition, the SPCN was significantly more positive than in the control condition (t-ratio = 2.581, p = .022, CI [0.57, 4.37], d = 0.25), whereas in congruent trials, there was no difference between the two conditions (t-ratio = 0.143, p = .987, CI [−1.75, 2.04], d = 0.01). This condition difference in incongruent trials may reflect stronger inhibition of the memorized shape in the action condition to reduce interference between perceptual and memory information (Feldmann-Wüstefeld & Vogel, 2019) because of the initial larger attentional VWM-guidance. Alternatively, it may reflect a more negative SPCN relative to the target letter N instead, as perhaps more effort was needed to process it when presented in competition with the memory-matching shape (Maheux & Jolicoeur, 2017). We found no significant effects in onset latencies in the early SPCN time window (all ps > .206).
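As a reading aid for the effect sizes reported above, partial eta squared can be recovered from an F value and its degrees of freedom using the standard identity ηp² = (F · df_effect) / (F · df_effect + df_error). The sketch below is illustrative only. The degrees of freedom are not reported in this section, so `df_effect = 1` and `df_error = 105` are assumptions; they happen to reproduce the rounded values given in the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# ASSUMPTION: the degrees of freedom are not stated in the text; 1 and 105
# are illustrative values chosen because they match the reported effect sizes.
DF_EFFECT = 1
DF_ERROR = 105

early_congruency = partial_eta_squared(65.078, DF_EFFECT, DF_ERROR)
late_congruency = partial_eta_squared(55.288, DF_EFFECT, DF_ERROR)
late_action = partial_eta_squared(6.480, DF_EFFECT, DF_ERROR)
late_interaction = partial_eta_squared(4.226, DF_EFFECT, DF_ERROR)
```

Under these assumed degrees of freedom, the four values round to .38, .34, .06, and .04, in line with the early and late SPCN results reported above.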
Summary. To summarize, our ERP findings indicate that the shape kept in working memory was more attentionally salient in the VST when it was the future target of an action plan, as reflected in a larger Ppc in the action condition than in the control condition. This may then actually have led to a greater need for inhibition of this shape in the action condition, to reduce interference, and/or stronger attentional re-orienting to, and encoding of, the contralateral target in working memory, as reflected in modulations of both the PD- and SPCN-like components.

Nonlateralized Stimulus-locked and Response-locked ERP Potentials
We did not have a priori assumptions about other nonlateralized ERP differences between action conditions. We therefore ran an exploratory repeated-measures ANOVA on stimulus-locked (from 0 msec to 650 msec) and response-locked (−358 msec to 0 msec) data while controlling for multiple comparisons by means of a spatio-temporal cluster permutation, to determine whether we had missed other interesting differences. Although the stimulus-locked signal did not reveal any significant cluster differentiating between the action and the control conditions (see Appendix), the response-locked analysis identified one significant cluster (p = .03) for which the EEG signal differed depending on whether the memorized shape was the direct target of an action or not (Figure 4A). This cluster was topographically confined to electrode C5, and it was significant for a short time interval (between −137 msec and −117 msec relative to the response). Electrode C5 is located over the left motor cortex, which controls right-hand movements, and as action planning on the shape in memory involved the right hand (whereas the VST was performed with the left hand), this effect may reflect differential activation of the corresponding motor cortex between the action conditions. In the response-locked analysis, we further located two clusters of electrodes differentiating between congruent and incongruent trials (p = .019; p = .013; Figure 4B and C). These clusters included many electrodes. Whereas the earliest cluster (−359 msec to −184 msec relative to the response) involved frontal, central, and visual electrodes, the second cluster comprised mainly right prefrontal, left fronto/temporal, and centro-parietal electrodes (−246 msec to 0 msec). We did not observe any cluster of electrodes for which the interaction between condition and shape congruency was significant.
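The logic of cluster-based permutation testing used here can be illustrated with a deliberately simplified Python sketch. It is not the spatio-temporal repeated-measures ANOVA described above. Instead, it assumes a single electrode, a two-level within-subject contrast expressed as per-participant difference waves, a one-sample t statistic per time point, an arbitrary cluster-forming threshold of |t| > 2, and sign-flip permutations. All function names and parameter values are hypothetical.

```python
import numpy as np


def t_values(diff):
    """One-sample t statistic per time point; diff is (n_subjects, n_times)."""
    n = diff.shape[0]
    return diff.mean(axis=0) / (diff.std(axis=0, ddof=1) / np.sqrt(n))


def cluster_masses(t, threshold):
    """Return (start, stop, mass) for each run of same-sign supra-threshold samples.

    `stop` is exclusive; `mass` is the summed absolute t value of the run.
    """
    clusters = []
    start, sign = None, 0
    for i, value in enumerate(t):
        s = int(np.sign(value)) if abs(value) > threshold else 0
        if s != sign:
            if sign != 0:
                clusters.append((start, i, float(np.abs(t[start:i]).sum())))
            start, sign = i, s
    if sign != 0:
        clusters.append((start, len(t), float(np.abs(t[start:]).sum())))
    return clusters


def cluster_permutation_test(diff, threshold=2.0, n_perm=1000, seed=0):
    """Sign-flip permutation test on cluster mass (simplified, temporal only).

    Returns (start, stop, mass, p) per observed cluster, where p is the share
    of permutations whose largest cluster mass is at least as large.
    """
    diff = np.asarray(diff, dtype=float)
    rng = np.random.default_rng(seed)
    observed = cluster_masses(t_values(diff), threshold)
    null_max = np.zeros(n_perm)
    for p in range(n_perm):
        # Flip the sign of each participant's whole difference wave at random.
        flips = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        masses = [m for _, _, m in cluster_masses(t_values(diff * flips), threshold)]
        null_max[p] = max(masses, default=0.0)
    return [(a, b, m, float((null_max >= m).mean())) for a, b, m in observed]
```

Comparing each observed cluster against the distribution of the maximum cluster mass is what controls the family-wise error rate across time points. A spatio-temporal version additionally groups neighbouring electrodes into the same cluster.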

Eye Movement Control Analyses
Although we excluded from the main ERP analysis those trials where participants deviated more than 1 dva from central fixation during the VST, we conducted additional control analyses to ensure that our ERP findings were not influenced by underlying eye movement biases. To verify this, for each participant, we correlated the ERP amplitude measures with gaze bias and tested whether the observed correlations at the group level were different from zero (i.e., one-sample t test). None of the ERP components of interest showed significant correlations with eye movements (Ppc: t = .166, p = .869; N2pc: t = 1.779, p = .084; PD: t = .295, p = .769; SPCN1: t = .834, p = .410; SPCN2: t = .698, p = .490).
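The two-step control analysis described above (a correlation per participant, followed by a group-level test of those correlations against zero) can be sketched in Python as follows. The text does not specify which observations enter each participant's correlation, so the per-participant input arrays, the function names, and the optional Fisher z-transform are assumptions made for illustration.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation between two 1-D arrays of equal length."""
    return float(np.corrcoef(x, y)[0, 1])


def group_correlation_test(erp_by_subject, gaze_by_subject, fisher_z=False):
    """Correlate ERP amplitude with gaze bias per participant, then test vs. zero.

    Both arguments are sequences with one 1-D array per participant
    (hypothetical layout). Returns the per-participant correlations, the
    one-sample t statistic of those correlations against zero, and its
    degrees of freedom.
    """
    r = np.array(
        [pearson_r(e, g) for e, g in zip(erp_by_subject, gaze_by_subject)]
    )
    # Optional variance-stabilizing transform; not mentioned in the text.
    values = np.arctanh(r) if fisher_z else r
    n = len(values)
    t = float(values.mean() / (values.std(ddof=1) / np.sqrt(n)))
    return r, t, n - 1
```

A t statistic near zero, as reported for all five components, indicates that the per-participant correlations scatter around zero rather than pointing consistently in one direction.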

DISCUSSION
In recent years, it has become clear that VWM does not simply represent past events, but is more concerned with guiding future behavior than traditionally assumed (Nasrawi & van Ede, 2022; van Ede, 2020; Nobre & Stokes, 2019). Yet, although in daily life we often plan to act on information held in VWM (e.g., we search for our keys to grab them), very little is known about how planning to act on an object held in VWM may affect its representation in VWM to facilitate the guidance of future action. The current EEG study addressed this question, examining how action planning may prioritize the processing of and the guidance of attention to future action-relevant sensory information. Replicating our previous behavioral findings (Trentin et al., 2023; see also Trentin et al., 2024), we found a difference in VWM-induced attentional biases between the condition in which the object held in VWM was the direct target of an action plan (action condition) and the condition in which this object, despite being informative for the execution of a prospective action, was not its direct target (control condition). Specifically, when looking at gaze position during the VST, we observed that participants showed a significantly greater gaze bias toward the memorized shape when it was the direct target of an action, both when the memorandum surrounded the designated search target (congruent trials) and when it surrounded a distractor (incongruent trials). The manual RTs also suggested a significant difference in attentional bias between the two conditions, but the observed interaction effect was mainly driven by longer RTs in incongruent compared with congruent trials in the action condition, and by faster RTs in incongruent than in congruent trials in the control condition, that is, a seemingly reversed attentional capture effect in the control condition (although neither of these individual pairwise effects was significant). It is possible that in the control condition, observers actually used the memorized information to ignore the matching stimulus to some extent (see Woodman & Luck, 2007, and Zhang, Liu, Doro, & Galfano, 2018, for earlier evidence). If so, then the direct association in the action condition negated this inhibitory tendency and turned it into a bias toward the memory-matching stimulus. Extending these results, our ERP findings revealed that the prioritization of a VWM item that is the target of an action plan affected early sensory and attentional processes, as well as later postselection ERP components linked to attentional engagement with a stimulus. These ERP findings corroborate the notion that VWM is oriented toward guiding actions (Olivers & Roelfsema, 2020; van Ede, 2020). In the following paragraphs, we will further unpack these findings.
Although the gaze bias results indicated enhanced attentional capture by the memorized shape, which was also, as predicted, more pronounced in the action condition, the manual RT results did not show this typical attentional capture (or congruency) effect in either condition.
This may reflect the fact that, in our study, the search display only contained two stimuli, so that once one stimulus was identified, it was immediately clear which stimulus was the target: either the selected one or the one on the opposite side of the display. That is, in case of capture by the memorized shape in incongruent trials, it was not necessary to redirect one's attention to the target, given that its position could be inferred from the position of the distractor. This might have introduced noise in our RT measurements, and rendered the attentional capture effect less apparent, even if it was clearly present in both the gaze bias data and the ERP results. Manual RTs also provide a less direct measure of attentional capture than gaze bias or early lateralized ERP components, as they reflect the summation of all processes leading up to the response.
As to our lateralized ERP results, differences between the action and control conditions in the VST emerged already at early stages of visual processing, in the time window of the visually evoked P1. Specifically, we found that the VWM-matching shape evoked a significantly larger Ppc-like component when this shape was the future target of an action. Some studies have related the Ppc to stimulus saliency, independently of attentional processes (Jannati, Gaspar, & McDonald, 2013; Sawaki & Luck, 2010), whereas other studies suggest that this early lateralized posterior positivity is subject to attentional modulation (van Moorselaar, Huang, & Theeuwes, 2023; Barras & Kerzel, 2017; Weaver, Hickey, & van Zoest, 2017; Fortier-Gauthier, Moffat, Dell'Acqua, McDonald, & Jolicoeur, 2012). In our study, although the VST display contained a physical imbalance between the left and right visual fields, this was the case in both the action and the control conditions. Therefore, our finding of a larger Ppc in the action compared with the control condition cannot simply be explained in terms of a difference in raw, physical saliency between conditions. Rather, it suggests that directly planning an action on the geometric shape in VWM may have rendered this shape more attentionally salient in the VST, enhancing its sensory processing (van Moorselaar, Daneshtalab, & Slagter, 2021). In line with this, a recent study by Abbasi and colleagues (2022) found that the greater the Ppc elicited by a distractor stimulus, the greater the saliency of this stimulus. Interestingly, a previous EEG study investigating the effects of action planning on the deployment of visual attention in a dual-task setup reported an action-induced modulation of early sensory processing in congruent action-perception trials. In this study, participants first planned to either grasp or point toward a cup and then performed a visual search task in which the target was either defined by size (i.e., the largest dot) or luminance (i.e., the brightest dot). Finally, participants executed the planned movement on the cup. It was found that action planning facilitated the visual search of action-relevant features and that this effect was accompanied by the modulation of the early visually evoked P1. Our Ppc results extend these findings by showing that the effects of action planning on sensory processing are even more pronounced when the action is planned on the object itself, revealing a tight link between action planning and how information is prioritized in VWM. It could be argued that the Ppc in our study may rather be an N1pc toward (and hence signal attentional selection of) what is on the opposite side of the fixation cross compared with the VWM-matching shape. However, this is unlikely, as we found that the modulation of this component was independent of VST target location (i.e., it was larger in the action condition in both congruent and incongruent trials), and participants had no reason to prioritize the other, neutral shape, which was not predictive of target location during the VST. Thus, our data suggest that planning an action on an object in VWM renders this object, when visually presented, more attentionally salient at the sensory level, as reflected in a larger Ppc.
Contrary to our initial hypothesis, however, we found neither a larger nor an earlier N2pc component in the action condition. If anything, numerically, the N2pc tended to be smaller when the item held in memory was the direct target of an action. Yet, the N2pc was significantly larger in congruent compared with incongruent trials, suggesting that at this later time point, attention was more strongly driven by what information was relevant for the VST (i.e., the target letter) and that the selection of the target was further aided by the memory-congruent shape. Previous ERP studies using a combined VWM-visual search task paradigm have similarly observed a larger N2pc when the irrelevant VWM shape surrounded the target versus the distractor (Carlisle & Woodman, 2011, 2013; Mazza et al., 2011).
Congruency also modulated the amplitude of the PD-like component, which followed the N2pc. The late PD is usually linked to reactive suppression of distracting information (see Gaspelin et al., 2023, for a more comprehensive account of the PD), and, in line with this notion, this lateralized positivity was larger in incongruent than in congruent trials, that is, when the VWM-matching shape surrounded the distractor (the hourglass) in the VST. This result replicates prior work showing that salient VWM-matching items elicited a clear PD component in a visual search task, when irrelevant to search (De Vito, Al-Aidroos, & Fenske, 2017; Sawaki & Luck, 2011). Furthermore, the N2pc-PD switch relative to the memorized shape that we found in our study was also previously observed (Feldmann-Wüstefeld & Schubö, 2013) and is generally explained as signaling the inhibition of a highly distracting stimulus after it initially captured attention (Lu et al., 2017). Attentional suppression is expected to increase with saliency of a distractor (Drisdelle & Eimer, 2023; Stilwell, Egeth, & Gaspelin, 2022; but see Forschack et al., 2022), but here we found that, although the PD-like component elicited in the action condition by the VWM-matching item was numerically larger than in the control condition in incongruent trials, this difference was not significant.
Rather, planning an action on an object in VWM seemed to modulate the onset of the PD-like component.
Specifically, this component was evoked earlier in the action compared with the control condition. Earlier onsets of the PD component have previously been associated with more predictable contexts, in which distractors are more easily and efficiently suppressed (Gaspelin et al., 2023; van Moorselaar et al., 2021). In our study, the faster emergence of the PD may therefore point toward a greater accessibility of the perceptual representation when this matched the memorized item in the action condition, which might have led to the quicker initiation of inhibitory processes to suppress the more salient, but (both in congruent and incongruent trials) irrelevant, geometric shape. Findings from a recent study (Chen et al., 2023), which showed that more attentionally salient distractors elicit an earlier PD component, seem to support this idea.
Finally, our action manipulation also affected postselection attentional processes, as reflected in the late SPCN, a component linked to encoding of an item in VWM but also to sustained attention (Heuer et al., 2021). Specifically, we observed a sign reversal of the signal amplitude, which led to a more positive SPCN in the action condition compared with the control condition solely in incongruent trials. To our knowledge, only one study has previously reported a reversal of the SPCN (or Contralateral Delay Activity) component (Feldmann-Wüstefeld & Vogel, 2019). In this study, the researchers investigated ERP markers of VWM filtering by asking participants to encode a visual stimulus in working memory while ignoring distracting information. They observed that distractors elicited a positive SPCN, in a time window overlapping with the one in which we observed the SPCN reversal in incongruent trials (i.e., between ∼400 and 650 msec after stimulus onset). The authors interpreted this finding as the brain preventing irrelevant visual information from accessing VWM, and indeed, participants with higher VWM capacity (and therefore more efficient filtering) exhibited a larger SPCN reversal. The more positive SPCN in incongruent trials in our study may thus similarly reflect filtering out of the irrelevant geometric shape, which may have been more necessary in the action condition, as reflected by the more pronounced SPCN reversal in the action condition, because the shape was more attentionally salient in this condition (as suggested by our Ppc finding). Because of our bilateral display, an alternative possibility, which cannot be excluded, is that the reversed SPCN signals the employment of more resources toward the ipsilateral target in incongruent trials, rather than increased suppression of the contralateral irrelevant memorized shape. In either case, the more positive SPCN in incongruent trials in the action condition signifies a greater demand on postselection attentional processing, and future studies should dissociate between these alternative interpretations.
We further found that both the early and late SPCN were significantly more negative in congruent than in incongruent trials. This is the first time such a result has been reported, and it suggests that the discrimination of the target and its encoding into VWM may have been more challenging when it was surrounded by the memorized shape, and/or that participants may have more strongly engaged with it until response selection.
To summarize, our lateralized-ERP results suggest that the geometric shape kept in working memory may have been more attentionally salient in the VST in the action compared with the control condition, that is, when it was the future target of an action plan, as reflected in a larger Ppc. This may have subsequently led to a greater need for inhibition of this shape in the action condition, to reduce interference, and/or stronger attentional re-orienting to the search target, as reflected in modulations of the PD- and SPCN-like components.
In addition to testing our primary hypothesis that planning an action on an object in VWM would more strongly guide (lateralized) sensory and attentional processing, we ran several exploratory analyses to further investigate action-induced effects on VWM prioritization. Following a response-locked analysis, we found one cluster that differentiated between the action and the control conditions. This cluster was localized over left motor cortex, which controls right-hand movements. This is notable as participants had to plan a right-hand grip action on the memory shape in VWM, whereas during the VST, they responded to the search target with a left-hand key press. Although speculative, as our effects were measured at the scalp, it is thus possible that in the action condition, more effort was required to inhibit the right-hand response associated with the memorized item in the VST. Participants indeed often reported feeling like "grabbing" the shape during the VST in the action condition. An increase in response inhibition in the action condition would dovetail with an increase in inhibition of the irrelevant geometric shape, as observed in lateralized visual ERPs.
The effects of congruency obtained in the response-locked analyses were much less localized but comprised mostly bilateral prefrontal and central electrodes. The congruency effect over central electrodes reflected greater sustained positivity in incongruent compared with congruent trials. This positivity very much resembles the response-locked Centroparietal Positive Potential component, which is thought to reflect an accumulation-to-bound signal and is larger in more difficult decision trials (O'Connell, Dockree, & Kelly, 2012). This central effect may hence reflect the fact that in incongruent trials in the VST, when the geometric shape did not surround the target, it was more difficult to come to a perceptual decision.
To summarize, our behavioral and electrophysiological results provide evidence in favor of a greater prioritization of VWM representations that are the target of an action plan. Adding to our previous behavioral findings (Trentin et al., 2023), we show here that objects matching VWM representations that are more tightly linked to an action elicit (1) a larger Ppc component, signaling greater sensory saliency; (2) a faster PD, suggesting greater accessibility; (3) a more positive SPCN, when they are incongruent with the target location, possibly indicating greater demands on VWM filter processes; and (4) stronger activation over right prefrontal and left motor scalp regions, possibly reflecting that more inhibition is necessary to prevent objects that we plan to grasp in the future from interfering with the task at hand. Together, these findings support the notion that items in VWM that we plan to act on are prioritized compared with those that share fewer features with the planned action, further demonstrating the action-oriented nature of working memory.

APPENDIX: NONLATERALIZED STIMULUS-LOCKED ERP POTENTIALS
Following an exploratory whole-brain cluster permutation analysis on the nonlateralized stimulus-locked ERP data (0-650 msec), we observed a nearly significant Action Condition effect, three Shape Congruency effects, but no interaction between the two factors (Figure A1). The nearly significant cluster that distinguished between the action and control conditions (p = .06) showed a more sustained (220-619 msec) positivity over right prefrontal scalp regions in the action compared with the control condition (Figure A1A). This difference in frontal neural activity resembles that observed in studies investigating inhibitory control in go/no-go tasks (Johari & Berger, 2023; Hege, Preissl, & Stingl, 2014; Swann et al., 2009), in which increased brain activity over right prefrontal regions has been associated with effective motor inhibition.
As mentioned above, we further observed three clusters distinguishing between congruent and incongruent trials (p = .034, p = .025, p = .015; Figure A1B, C, and D). The first cluster, which was significant between 177 msec and 338 msec after the onset of the VST, comprised central and right frontal electrodes. At right frontal electrodes, we observed both (1) a greater anterior P2 component for congruent than incongruent trials, likely indicating improved target selection (Martens, Munneke, Smid, & Johnson, 2006) when the target letter was surrounded by the memorized geometric shape, and (2) a larger anterior N2 component for incongruent than congruent trials, consonant with previous research reporting a larger anterior N2 in the presence of conflicting visual information (Folstein & Van Petten, 2008; Figure A1B). The second cluster captured a larger positivity in incongruent versus congruent trials, which most likely reflects processes underlying the P3b component, as its timing (365-615 msec) and scalp topography correspond to the latency and scalp topography of the P3b (Kok, 2001; Figure A1C). Later in time as well, starting from 439 msec after the onset of the VST, the activity at frontal and left temporal/central electrodes distinguished between congruent and incongruent trials (Figure A1D).

pattern of gender imbalance: Although the proportions of authorship teams (categorized by estimated gender identification of first author/last author) publishing in the Journal of Cognitive Neuroscience (JoCN) during this period were M(an)/M = .407, W(oman)/M = .32, M/W = .115, and W/W = .159, the comparable proportions for the articles that these authorship teams cited were M/M = .549, W/M = .257, M/W = .109, and W/W = .085 (Postle and Fulvio, JoCN, 34:1, pp. 1-3). Consequently, JoCN encourages all authors to consider gender balance explicitly when selecting which articles to cite and gives them the opportunity to report their article's gender citation balance. The authors of this paper report its proportions of citations by gender category to be: M/M = .567; W/M = .269; M/W = .119; W/W = .045.

Figure 2. (A) Manual RTs during the VST. Error bars represent within-subject standard errors; (B) the direction of the interaction effect shown in A, but for each participant. Positive values indicate that the difference in manual RT between incongruent and congruent trials was larger in the action condition compared with the control condition; (C) gaze bias during the VST. Time 0 indicates the onset of the VST. As can be seen, participants' gaze was biased toward the VWM-matching shape in all conditions, and particularly so in the action condition and in congruent trials.

Figure 3. ERPs following the onset of the VST at electrodes P7/P8, P9/P10, and PO7/PO8. (A) Contralateral (solid line) and ipsilateral (dashed line) ERPs with respect to the VWM-matching item for each condition (action/control) and each trial type (congruent/incongruent). (B) Difference between contralateral and ipsilateral signals for each combination of factors. Time windows of interest are highlighted in gray and correspond to well-known lateralized visual ERP components: the Ppc, the N2pc, the PD, and the SPCN.

Figure 4. Response-locked spatio-temporal clusters differentiating between the action and the control conditions (A), and between congruent and incongruent trials (B and C) during the VST. RO = response offset.