


A review of rapid serial visual presentation-based brain–computer interfaces

Stephanie Lees 1, Natalie Dayan 1, Hubert Cecotti 2, Paul McCullagh 1, Liam Maguire 1, Fabien Lotte 3 and Damien Coyle 1

Published 24 January 2018 • © 2018 IOP Publishing Ltd
Journal of Neural Engineering, Volume 15, Number 2
Citation: Stephanie Lees et al 2018 J. Neural Eng. 15 021001
DOI: 10.1088/1741-2552/aa9817


Author affiliations

1 Faculty of Computing and Engineering, Ulster University, Belfast, United Kingdom

2 Department of Computer Science, College of Science and Mathematics, California State University, Fresno, 2576 E. San Ramon MS ST 109 Fresno, CA 93740–8039, United States of America

3 Inria Bordeaux Sud-Ouest/LaBRI/CNRS/Université de Bordeaux/IPB, 200 Avenue de la Vieille Tour, 33405 Talence, France

Stephanie Lees https://orcid.org/0000-0003-1036-5639

  • Received 5 September 2017
  • Accepted 3 November 2017
  • Published 24 January 2018

Peer review information

Method: Single-anonymous. Revisions: 2. Screened for originality? No.


Rapid serial visual presentation (RSVP) combined with the detection of event-related brain responses facilitates the selection of relevant information contained in a stream of images presented rapidly to a human. Event-related potentials (ERPs) measured non-invasively with electroencephalography (EEG) can be associated with infrequent targets amongst a stream of images. Human–machine symbiosis may be augmented by enabling human interaction with a computer without overt movement, and/or by optimizing image/information-sorting processes involving humans. Features of the human visual system affect the success of the RSVP paradigm, but pre-attentive processing supports the identification of target information after its presentation, by assessing EEG potentials time-locked to the stimuli. This paper presents a comprehensive review and evaluation of the limited, but significant, literature on research in RSVP-based brain–computer interfaces (BCIs). Applications that use RSVP-based BCIs are categorized based on display mode and protocol design, whilst a range of factors influencing ERP evocation and detection are analyzed. Guidelines for using the RSVP-based BCI paradigms are recommended, with a view to further standardizing methods and enhancing the comparability of experimental designs, to support future research and the use of RSVP-based BCIs in practice.


Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence . Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Rapid serial visual presentation (RSVP) is the process of sequentially displaying images at the same spatial location at high presentation rates, with multiple images per second, e.g. with a stimulus onset asynchrony no greater than 500 ms and often below 100 ms, i.e. >10 stimuli presented per second. Brain–computer interfaces (BCIs) are communication and control systems that enable a user to execute a task via the electrical activity of the user's brain alone (Vidal 1973 ). RSVP-based BCIs are a specific type of BCI used to detect target stimuli, e.g. letters or images, presented sequentially in a stream, by detecting brain responses to such targets. RSVP-based BCIs are considered a viable approach to enhance human–machine symbiosis and offer potential for human enhancement.

To date, the literature on RSVP-BCIs has not been comprehensively evaluated, therefore it is timely to review the literature and provide guidelines for others considering research in this area. In this review we: (1) identify and contextualize key parameters of different RSVP-BCI applications to aid research development; (2) document the growth of RSVP-based BCI research; (3) provide an overview of key current advancements and challenges; (4) provide design recommendations for researchers interested in further developing the RSVP-BCI paradigm.

This review is organized as follows: section 2 presents background information on the fundamental operating protocol of RSVP-BCIs. Section 3 details results of a bibliometric analysis of the key terms 'rapid serial visual presentation', 'RSVP', 'electroencephalography', 'EEG', 'brain–computer interface', 'BCI', 'event-related potentials', 'ERP' and 'oddball' found within authoritative bibliographic resources. Section 4 provides an overview of performance measures. Section 5 outlines existing RSVP-based BCI applications and analyzes their design parameters, with inter-application study comparisons. Section 6 provides a summary, a discussion of findings and ongoing challenges.

2. Background

RSVP-based BCIs have been used to detect and recognize objects, scenes, people, pieces of relevant information and events in static images and videos. Many applications would benefit from an optimization of this paradigm, for instance counter-intelligence, policing and healthcare, where large numbers of images/information are reviewed by professionals on a daily basis. Computers are unable to analyze and understand imagery as successfully as humans, and manual analysis tools are slow (Gerson et al 2005 , Mathan et al 2008 ). In studies carried out by Sajda et al ( 2010 ), Poolman et al ( 2008 ) and Bigdely-Shamlo et al ( 2008 ), a trend of using RSVP-based BCIs for identifying targets within different image types has emerged. Research studies show the ability of RSVP-based BCIs to drive a variety of visual search tasks, including, in some circumstances, tasks drawing on learned visual-recognition skills. Although the combination of RSVP and BCI has proven successful on several image sets, other research has attempted to establish whether or not greater efficiencies can be reached by combining RSVP-based BCIs with behavioral responses (Huang et al 2007 ).

2.1. Event related potentials and their use in RSVP-based BCIs

Event-related potentials (ERPs) are amplitude variations in the electroencephalography (EEG) signal associated with the onset of a stimulus (usually auditory or visual) presented to a person. ERPs are typically smaller in amplitude (<10 µV) than the ongoing EEG activity (~50–100 µV) they are embedded within (Huang et al 2008 , Acqualagna and Blankertz 2011 ). As ERPs are locked in phase and time to specific events, they can be measured by averaging epochs over repeated trials (Huang et al 2011 , Cecotti et al 2012, 2014 ): shared EEG signal features are accentuated and noise is attenuated (Luck 2005 , Cohen 2014 ). The outcome is represented by a temporal waveform with a sequence of positive and negative voltage deflections labeled as ERP components. ERPs are representative of summated cortical neural processing and behavioral counterparts, such as attentional orientation (Wolpaw and Wolpaw 2012 , Cohen 2014 ).
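The averaging step described above can be sketched numerically. The following is a minimal, self-contained simulation (all amplitudes, rates and trial counts are illustrative choices of ours, not values from the reviewed studies) showing that a phase-locked ERP survives averaging while the ongoing EEG, modeled as zero-mean noise, is attenuated by roughly a factor of sqrt(N):

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative simulation: a small P300-like ERP (~5 uV) embedded in
# large ongoing EEG activity (~50 uV), as per the amplitudes quoted above.
fs = 250                                   # sampling rate (Hz)
t = np.arange(0, 0.8, 1 / fs)              # 800 ms epoch, time-locked to stimulus onset
erp = 5.0 * np.exp(-((t - 0.3) ** 2) / (2 * 0.05 ** 2))  # deflection peaking at 300 ms

def single_trial():
    # Ongoing EEG is modeled as zero-mean noise; only the ERP is phase-locked.
    return erp + rng.normal(0.0, 50.0, size=t.shape)

# Averaging N time-locked epochs preserves the ERP while attenuating the
# (uncorrelated) background activity by ~sqrt(N): 50 uV / sqrt(500) ~ 2.2 uV.
n_trials = 500
avg = np.mean([single_trial() for _ in range(n_trials)], axis=0)

erp_window_mean = avg[(t >= 0.25) & (t <= 0.35)].mean()  # around the simulated peak
baseline_mean = avg[t <= 0.1].mean()                     # early, ERP-free baseline
```

On a single trial the ERP is invisible against the noise; in the average, the 300 ms deflection stands clearly above the baseline.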

The stream of images presented within an RSVP paradigm comprise frequent non-target images and infrequent target images; different ERP components are associated with target and non-target stimuli (Bigdely-Shamlo et al 2008 , Cohen 2014 , Sajda et al 2014 ). BCI signal processing algorithms are used to recognize spatio-temporal electrophysiological responses and link them to target image identification, ideally on a single trial basis (Manor et al 2016 ).

The most commonly exploited ERP in RSVP-based BCI applications is the P300, which appears at approximately 250–750 ms post target stimulus (Polich and Donchin 1988 , Leutgeb et al 2009 , Ming et al 2010 , Zhang et al 2012 ). As specified by Polich and Donchin ( 1988 ), during the P300 experiment (commonly referred to as the 'oddball' paradigm), participants must classify a series of stimuli that fall into one of two classes: targets and non-targets. Targets appear more infrequently than non-targets (typically ~5–10% of total stimuli in the RSVP paradigm) and should be recognizably different. P300 responses can be suppressed in an RSVP task if the time between two targets is <0.5 s, a phenomenon known as the attentional blink (Raymond et al 1992 , Kranczioch et al 2003 ). The amplitude and latency of the P300 are influenced by target discriminability and the target-to-target interval in the sequence, and its latency is also affected by stimulus complexity (McCarthy and Donchin 1981 , Luck et al 2000 ). The P300 amplitude can vary as a result of multiple factors (Johnson 1986 ), such as:

  • subjective probability—the expectedness of an event;
  • stimulus meaning—comprised of task complexity, stimulus complexity and stimulus value;
  • information transmission—the amount of stimulus information a participant registers in relation to the information contained within a stimulus.
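An oddball stimulus schedule of the kind described above can be sketched as follows. The function and its parameters (sequence length, target probability, minimum gap) are illustrative, not a protocol from any reviewed study; targets are kept rare (~10%) and spaced so that successive targets do not fall within the attentional-blink window discussed in section 2.4:

```python
import random

def make_rsvp_sequence(n_stimuli=100, target_prob=0.10, min_gap=5, seed=1):
    """Illustrative oddball sequence: 1 = target, 0 = non-target.

    Places targets at jittered, roughly even positions, enforcing at least
    `min_gap` non-targets between consecutive targets to avoid P300
    suppression from the attentional blink.
    """
    rng = random.Random(seed)
    n_targets = int(n_stimuli * target_prob)
    stride = n_stimuli // n_targets                  # average spacing between targets
    jitter = max(0, (stride - min_gap - 1) // 2)     # jitter that cannot violate the gap
    seq = [0] * n_stimuli
    for k in range(n_targets):
        base = k * stride + stride // 2              # evenly spaced anchor position
        seq[base + rng.randint(-jitter, jitter)] = 1
    return seq
```

With the defaults, 10 of 100 stimuli are targets and any two target onsets are at least six positions apart, so a second target never lands inside the blink window of the first.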

2.2. RSVP-based BCI amongst the BCI classes

BCIs can be of three different types: active, reactive or passive (Zander et al 2010 ). An active BCI is purposefully controlled by the user through intentional modulation of neural activity, often independently of external events. Contrastingly, reactive BCIs generate outputs from neural activity evoked in response to external events, enabling indirect control by the user. Passive BCIs make use of implicit information, generating outputs from neural activity without purposeful control by the user. Active and reactive BCIs are commonly aimed at users with restricted movement abilities who intentionally try to control brain activity, whereas implicit or passive BCIs are more commonly targeted towards applications that are also of interest to able-bodied users (Zander and Kothe 2011 , Sasane and Schwabe 2012 ).

2.3. RSVP-based BCI presentation modes

RSVP-based BCIs have two presentation modes: static mode, in which images appear and disappear without moving, and moving mode, in which targets within short moving clips have to be identified (Sajda et al 2010 , Cecotti et al 2012, Weiden et al 2012 ). Both presentation modes can be used with or without a button press, whereby users indicate manually, by pressing a button, when they observe a target stimulus. A button press is used to establish baseline performance and reaction time and/or to enhance performance (discussed further in section 5.1 ).

2.3.1. Static.

In 'static mode', images displayed have identical entry and exit points—the images are transiently presented on screen (typically for 100–500 ms) and then disappear. One benefit of static mode is that images occupy the majority of the display and, therefore, identification of targets is likely even if they are only presented briefly. There are a number of different possible instructions a participant may be given.

  • Prior to presentation, a target image may be shown to participants, who are asked to identify this image in the sequence of subsequent images. High target recognition rates can be achieved at presentation rates as high as 10 s −1 (Cecotti et al 2012).
  • Participants may be asked to identify a type of target, e.g. an animal, within a collection of images. In this mode, the rate of presentation should be slowed (e.g. to 4 s −1 ) (Wang et al 2009 ).
  • Immediately after image sequence presentation, the participant may be shown an image and asked: 'did this image appear in the sequence you have just seen?' (Potter et al 2002 ).

2.3.2. Moving.

There has been relatively little research regarding neural signatures of a target and/or anomalies in real world or simulated videos. In 'moving mode', short video clips are shown to participants, and within one video clip participants may be asked to identify one or more targets. It is important that these targets are temporally 'spread out' to avoid P300 suppression. There are different possible instructions a participant may be given:

  • Prior to presentation, participants may be given a description of a target, i.e. asked to identify, say a 'person' or 'vehicle' in a moving scene (Weiden et al 2012 ).
  • Participants can be asked to identify a target event; in this case, the target is identified across space and time. The participant is required to integrate features from both motion and form to decide whether a behavior constitutes a target, for example, Rosenthal et al ( 2014 ) defined the target as a person leaving a suspicious package in a train station.

2.4. Cognitive blindness

When designing an RSVP-based BCI, three different types of cognitive blindness should be considered, namely the attentional blink, change blindness and saccadic blindness. RSVP is the paradigm generally used to study the attentional blink, a phenomenon in which a participant's attention is captured by an initial target image such that a further target image may not be detectable for up to 500 ms after the first (Raymond et al 1992 ). The duration of stimulus presentation therefore constrains the ratio of target images to total images: for example, if images are presented for 100 ms each, there must be a minimum of five images between two successive targets, so a sequence of 100 images can contain a maximum of 20 target images, whereas a 200 ms presentation duration limits the maximum to 10 targets per 100 images.
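The spacing arithmetic in the example above can be expressed as a small helper. This is one way to count the limit (spacing target onsets by the blink window); conventions differ across studies, so treat it as a sketch rather than a rule:

```python
import math

def min_gap_images(image_duration_ms, blink_ms=500):
    """Number of image slots that must separate two target onsets so that
    the second target falls outside the ~500 ms attentional-blink window."""
    return math.ceil(blink_ms / image_duration_ms)

def max_targets(n_images, image_duration_ms, blink_ms=500):
    """Upper bound on targets in a sequence when target onsets are spaced
    by the blink window, e.g. with 100 ms images every 5th slot may be a
    target, giving at most 20 targets in 100 images."""
    return n_images // min_gap_images(image_duration_ms, blink_ms)
```

For 100 ms images this reproduces the worked example above (gap of five slots, 20 targets per 100 images).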

Change blindness occurs when a participant views two images that vary in a non-trivial fashion and has to identify the differences between them. Change blindness can occur with still images, motion pictures and real-world interactions. Humans have the capacity to get the gist of a scene quickly but are unable to identify particular within-scene features (Simons and Levin 1997 , Oliva 2005 ). For example, when two images are presented for 100 ms each and participants are required to identify a non-trivial variation as the images are interchangeably presented, participants can take between 10–20 s to identify the variation. This latency in identifying non-trivial variations in imagery can be lengthened by distractors or motion pictures (Rensink 2000 ). Change blindness is of interest when designing an RSVP paradigm because it will take longer for a user to identify a target within an image if the target does not pop out from the rest of the image. Distractors within the image, or cluttered images, will increase the time it takes a user to recognize a target, reducing the performance of the RSVP paradigm.

Saccadic blindness is a form of change blindness described by Chahine and Krekelberg ( 2009 ), where ' humans move their eyes about three times each second. Those rapid eye movements called saccades help to increase our perceptual resolution by placing different parts of the world on the high-resolution fovea. As these eye movements are performed, the image is swept across the retina, yet we perceive a stable world with no apparent blurring or motion '. Saccadic blindness thus refers to the loss of visual input while a person saccades between two locations. Evidence shows that saccadic blindness can occur from 50 ms before a saccade until up to 50 ms after it (Diamond et al 2000 ). It is therefore important that stimuli last longer than 50 ms to bypass saccadic blindness, unless participants are instructed to attend to a fixation point and the task is gaze independent, and thus does not demand saccades (as in the canonical RSVP paradigm (section 5.4 )).

Having considered some of the factors influencing RSVP-based BCI designs, the remainder of the paper focuses on a bibliometric study of the RSVP literature highlighting the key methodological parameters and study trends. Studies are compared and contrasted on an intra- and inter-application basis. Later sections focus on study design parameters and provide contextualized recommendations for researchers in the field.

3. Bibliometric study of the RSVP related literature

A bibliometric review of RSVP-based BCIs was conducted. The inclusion criteria were studies in which EEG data were recorded while users performed visual search tasks using an RSVP paradigm. The studies involved various stimulus types presented using the RSVP paradigm in which participants had to identify target stimuli. All included studies were not simply theoretical and had at least one participant. One or more of the keywords BCI, RSVP, EEG or ERP appeared in the title, abstract or keyword list, and only papers published in English were included. The literature was searched, evaluated and categorized up until August 2017. The databases searched were Web of Science, IEEE, Scopus, Google Scholar and PubMed. The search terms used were: 'rapid serial visual presentation', 'RSVP', 'electroencephalography', 'EEG', 'brain–computer interface', 'BCI', 'event-related potentials', 'ERP' and 'oddball'.

Papers were excluded for the following reasons: (1) the research protocol had insufficient detail; (2) key aspects needed to draw conclusive results were missing; (3) the spectrum of BCI research reported was too wide (i.e. review papers not specific to RSVP); (4) a 'possible' research application was described but the study was not actually carried out; (5) the study was a repeat of an earlier study by the original authors with only minor changes. Due to the immaturity of RSVP-based BCI as a research topic, conference papers were not excluded; their inclusion was considered important in order to provide a comprehensive overview of the state of the art and trends in the field. Fifty-four papers passed initial abstract/title screening; these were then refined to the 45 most relevant papers through analysis of the full paper contents. The dates of the included publications ranged from 2003 to 2017.

The relevant RSVP-based BCI papers are presented in table 1 when a button press was required, and in table 2 when no button press was used. RSVP-based BCIs were evaluated in terms of interface design. Tables 1 and 2 show that there is considerable variation across studies in the RSVP-BCI acquisition paradigm, including the total number of stimuli employed, the percentage of target stimuli, the size of on-screen stimuli, the visual angle, the stimulus presentation duration and the number of study participants. Performance was measured using a number of metrics: the area under the receiver operating characteristic (ROC) curve (Fawcett 2006 ), classification accuracy (%) and information transfer rate. ROC curves are used when applications have an unbalanced class distribution, which is typically the case with RSVP-BCI, where the number of target stimuli is much smaller than the number of non-target stimuli. Many studies report different experimental parameters, and some aspects of the studies have not been comprehensively reported. From tables 1 and 2 , it can be seen that the majority of applications using a button press as a baseline may be classified as surveillance applications, while applications that do not use a button press are more varied. This may be because surveillance applications often have an industry focus, where quantified improvement relative to manual labeling alone is crucial for acceptance. In the majority of the applications where a button press was used, participants undertook trials with and without a button press, and the difference in response latency between the two was calculated to compare neural and behavioral response times. The results of the bibliometric analysis are further discussed in sections 4 – 6 , following the analysis of key papers identified in the following section.

Table 1.  Design parameters reviewed, mode: button press  =  yes. Table acronyms: SVM (support vector machine), SFFS (sequential forward feature selection), N/A (not available), BLDA (Bayesian linear discriminant analysis), CSP (common spatial pattern), BCSP (bilinear CSP), CCSP (composite CSP), LDA (linear discriminant analysis), C (EEG channel), FDA (Fisher discriminant analysis), FDM (finite difference model), LLC (linear logistic classifier), RBF SVM (radial basis function SVM), PCA (principal component analysis), LP (Laplacian classifier), LN (linear logistic regression), SP (spectral maximum mutual information projection), ACSP (analytic CSP), HT (human target), NHT (non-human target), ST (single trial), DT (dual trial), BDA (bilinear discriminant analysis), ABDA (analytic BDA), DCA (directed components analysis), HDCA (hierarchical discriminant component analysis), TO (only background distractors), TN (non-target distractor stimuli and background and target stimuli), TvB (target versus background distractor), Tv[B  +  NT] (target versus both background distractor and non-target).

Table 2.  Design parameters reviewed, mode: button press  =  no. Table acronyms: FDA (Fisher discriminant analysis), N/A (not available), SWFP (spatially weighted Fisher linear discriminant—principal component analysis), CNN (convolutional neural network), HDPCA (hierarchical discriminant principal component analysis algorithm), HDCA (hierarchical discriminant component analysis), SVM (support vector machine), RBF (radial basis function) kernel, RDA (regularized discriminant analysis), HMM (hidden Markov model), PCA (principal component analysis), BDCA (bilinear discriminant component analysis), BFBD (bilinear feature-based discriminants), BLDA (Bayesian linear discriminant analysis), SWLDA (step-wise linear discriminant analysis), MLP (multilayer perceptron), LIS (locked-in syndrome), CV (computer vision), STIG (spectral transfer with information geometry), MSS (max subject-specific classifier), L1 ( ℓ 1 -regularized cross-validation), MV (majority vote), PMDRM (pooled Riemannian mean classification algorithm), AWE (accuracy weighted ensemble), MT (multi-task learning), CALIB (within-subject calibration), RF (random forest), BHCR (Bayesian human vision-computer vision retrieval).

4. Validating inter-study comparison through performance measures

When comparing RSVP studies it is important to acknowledge that researchers use different measures of performance. Before going into depth about signal processing techniques (section 5.7 ), it is important to discuss the variations in approaches used to measure performance. To enable valid inter-study comparison within and across RSVP application types, we emphasize that we are, on the whole, reporting classification accuracy calculated as the proportion of correctly classified trials. Classification accuracy can be inflated by the imbalance between target and non-target classes: with targets infrequently presented, e.g. at 10% prevalence, classifying all trials as non-targets would yield a 90% correct classification rate. Hence, ROC values are also reported in this review where the relevant information was provided in the publications reviewed.
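The class-imbalance effect described above is easy to demonstrate: a trivial classifier that labels every trial 'non-target' reaches 90% accuracy at 10% target prevalence while detecting no targets at all. The sketch below is our own illustration, and also shows how a balanced measure exposes the failure:

```python
import numpy as np

# Our own illustration: 10% target prevalence, as is typical in RSVP oddball tasks.
y_true = np.array([1] * 10 + [0] * 90)   # 1 = target, 0 = non-target
y_pred = np.zeros_like(y_true)           # trivial classifier: everything "non-target"

accuracy = float((y_pred == y_true).mean())    # 0.90 despite finding nothing
hit_rate = float(y_pred[y_true == 1].mean())   # 0.0: no targets detected

# Balanced accuracy averages the true-positive and true-negative rates,
# so the trivial classifier scores only chance level (0.5).
tnr = float((y_pred[y_true == 0] == 0).mean())
balanced_accuracy = (hit_rate + tnr) / 2
```

This is why ROC-based measures, which are insensitive to class prevalence, are preferred for RSVP-BCI evaluation.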

In the literature, there are many variations in how performance is estimated and reported. The studies cited in this section provide examples of such variations. The intention of Files and Marathe ( 2016 ) was to develop a regression-based method to predict hit rates and error rates whilst correcting for expected mistakes. There is a need for such methods due to the uncertainty and difficulty in correctly identifying target stimuli. The regression method developed by Files and Marathe ( 2016 ) achieved relatively high hit rates, spanning 78.4%–90.5% across participants. Contrastingly, as a measure of accuracy, Sajda et al ( 2010 ) used hit rates expressed as the fraction of total targets detected per minute, and discussed an additional experiment that employed ROC values as an outcome measure. In Alpert et al ( 2014 ), where the RSVP application was categorization based, accuracy was defined as the number of trials in which the classifier provided the correct target/non-target response divided by the total number of available trials. Yazdani et al ( 2010 ) were concerned with surveillance applications of RSVP-based BCI and used the F-measure to evaluate the accuracy of the binary classifier in use; precision (the fraction of flagged occurrences that are relevant) and recall (the fraction of relevant occurrences that are flagged) were reported, as the F-measure combines both values.
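For reference, the F-measure mentioned above combines precision and recall as follows. The helper below is a generic textbook formulation, not code from Yazdani et al (2010):

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-measure from counts of true positives (tp), false positives (fp)
    and false negatives (fn). beta = 1 gives the usual F1 score, the
    harmonic mean of precision and recall."""
    precision = tp / (tp + fp)   # fraction of flagged occurrences that are relevant
    recall = tp / (tp + fn)      # fraction of relevant occurrences that are flagged
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

For example, 8 true positives with 2 false positives and 2 misses gives precision = recall = 0.8, hence F1 = 0.8.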

Different variations in ROC value calculations were also discovered across the studies evaluated. The distribution of accuracy outcome measures also depends on whether the data are treated non-parametrically, e.g. the median AUC is reported as opposed to the mean AUC (Matran-Fernandez and Poli 2014 ). As a measure of accuracy, Rosenthal et al ( 2014 ) conducted a bootstrap analysis to obtain the sampled distribution of AUC values for HDCA classifiers: 1000 times over, labels were randomized, classifiers were trained and AUC values were calculated using leave-one-out cross-validation. Cecotti et al (2012) presented a comparison of three-class classifiers in a 'one versus all' strategy, comparing the AUC to the volume under the ROC hyper-surface; the authors found an AUC of 0.878, suggesting that more than two types of ERPs can be discriminated using single-trial detection. Huang et al ( 2006 ) reported the AUC for session one of two experiments during button press trials, demonstrating that the three classifiers produce similar performance, with AUCs above 0.8 across the board. Moreover, accuracy reportedly increases when collating evidence from two BCI users: collaborative BCIs yielded a 7.7% increase in AUC compared to a single BCI user (Matran-Fernandez and Poli 2014 ). This process was repeated 20 times to achieve an average accuracy measurement that is not directly relatable to other studies in the bibliometric analysis, which report average performance over single-trial tests. Cecotti et al ( 2011 ) carried out a study comparing varying target stimulus probabilities; target probability had a significant effect on both behavioral performance and target detection. The best mean AUC (0.82) was achieved with a target probability of 0.10, while the best detection accuracy (78.7%) was obtained with a target probability of 5%.
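AUC itself has a simple rank-based interpretation: it is the probability that a randomly chosen target trial receives a higher classifier score than a randomly chosen non-target trial (the Mann-Whitney statistic). A minimal, illustration-only implementation (O(n²); ties count as half):

```python
def auc_from_scores(target_scores, nontarget_scores):
    """AUC as the probability that a random target trial outscores a
    random non-target trial. Purely illustrative; production code would
    use a rank-based O(n log n) formulation instead."""
    wins = 0.0
    for s_t in target_scores:
        for s_n in nontarget_scores:
            if s_t > s_n:
                wins += 1.0      # target correctly ranked above non-target
            elif s_t == s_n:
                wins += 0.5      # ties split evenly
    return wins / (len(target_scores) * len(nontarget_scores))
```

A perfectly separating classifier scores 1.0, and chance-level ranking scores 0.5, which is why AUC is robust to the heavy class imbalance of RSVP data.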

The review above exemplifies how performance measures are used. The variability of accuracy analytics limits the extent to which inter-study comparison is feasible; nonetheless, a high proportion of studies use AUC values and percentage accuracy as outcome measures, and these therefore provide the basis for comparisons in section 5 . In the RSVP-based BCI application sections that follow, we provide additional information about the values reported in tables 1 and 2 , the intention being to explain why these performance metrics were selected when a given study reports several different results, and to highlight inter-study idiosyncrasies that may need to be considered when comparing findings. In the next section, the different design parameters for the studies identified in tables 1 and 2 are reviewed, and a number of recommendations are made regarding the parameters that should be considered for RSVP-based BCI applications.

5. Design parameters

RSVP-based BCI applications to date can be grouped into surveillance, data categorization, RSVP speller, face recognition and medical image analysis applications. Often EEG-based RSVP-BCI system studies are multifactorial by design and report numerous results in the form of different outcome measures. In the RSVP-based BCI application section that follows, we provide examples of the different application types and examples of their design parameters.

When designing an RSVP paradigm, there are eight criteria that we recommend be taken into consideration.

  • (1)   The type of target images and how rapidly these can be detected, e.g. pictures, numbers or words.
  • (2)   The differences between target and non-target images and how these influence the discrimination in the RSVP paradigm.
  • (3)   The display mode—static or moving stimuli and the background the images are presented on, e.g. single color white, mixed, textured.
  • (4)   The response mode—consideration should be given as to whether a button press is used or not to confirm if a person has identified a target.
  • (5)   The number of stimuli/the percentage of target stimuli—how many are presented throughout the duration of a session and the effect this could have on the ERP.
  • (6)   The rate at which stimuli are presented on screen throughout the duration of a session and the effect this has on the ERP.
  • (7)   The area (height   ×   width), visual angle and the overt or covert attention requirement of the stimuli.
  • (8)   The signal processing pipeline—determine the features, channels, filters, and classifiers to use.
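The eight criteria above can be gathered into a single record when planning a study. The container below is purely our own illustration; the field names and the example pipeline label are hypothetical, not terms taken from the reviewed papers:

```python
from dataclasses import dataclass

@dataclass
class RSVPDesign:
    """Hypothetical record of the eight recommended design criteria.
    Field names are ours, mapped to the numbered criteria in comments."""
    target_type: str          # (1) e.g. "picture", "word"
    target_distinctness: str  # (2) how targets differ from non-targets
    display_mode: str         # (3) "static" or "moving", plus background
    button_press: bool        # (4) behavioral confirmation used?
    n_stimuli: int            # (5) stimuli presented per session
    target_fraction: float    # (5) proportion of targets, e.g. 0.05-0.10
    rate_hz: float            # (6) presentation rate on screen
    visual_angle_deg: float   # (7) stimulus size at the eye
    pipeline: str             # (8) features, channels, filters, classifier

    def stimulus_duration_ms(self) -> float:
        # Criterion (6) implies the per-stimulus duration directly.
        return 1000.0 / self.rate_hz
```

Recording a design this way makes the parameters that tables 1 and 2 compare (stimulus counts, target percentage, visual angle, duration) explicit and easy to report.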

5.1. Display and response modes

A button press may be used in conjunction with either of the aforementioned presentation modes (section 2.3 ), and entails users clicking a button when they see a target. This mode is used as a baseline to estimate behavioral performance and the difficulty of the task. In most research studies, participants undergo an experimental trial without a button press and a follow-on trial with a button press.

A button press can be used in RSVP-based BCI research in combination with the participant's EEG responses in order to monitor attention (Marathe et al 2014 ), and the combination of EEG and button press can increase RSVP-based BCI performance. Tasks that require sustained attention can cause participants to suffer lapses in vigilance due to fatigue, workload or visual distractors (Boksem et al 2005 ). The button press can be used to determine whether there is a tipping point during presentation at which participants are unable to consciously detect target stimuli while targets are still identifiable via EEG recordings (Potter et al 2014 ). However, the core advantage of RSVP-based BCIs is the enhanced speed of using a neural signature instead of a behavioral response to determine whether a user has detected an image of interest.

Forty of the studies reviewed used static mode as the method of presentation; six of these also used moving mode in conjunction with static mode, while one study exclusively used moving mode. Moving mode is more complex than static mode, as participants have to take in an entire scene rather than specific images. Moving mode uses motion onset in conjunction with the P300 for scenes in which the targets are moving, yielding a more realistic setting in which to validate RSVP-based BCIs (Weiden et al 2012 ). All papers employing moving mode were found within the surveillance application category; this is unsurprising, as moving mode offers the opportunity to detect targets in realistic surveillance situations where the movements of people or vehicles are of interest. For the other application areas, i.e. medical, categorization, etc, static mode is likely to be the most appropriate.

Won et al ( 2017 ) compared motion-type RSVP to standard (static) RSVP. In the motion-type variant, letters of the alphabet, the numbers 1–9 and a hyphen '-' (used to separate words) were rapidly presented in six different color groups, each moving in one of six directions in line with the hands of a clock (i.e. 2, 4, 6, 8, 10 and 12 o'clock), whilst participants focused on a central point. An increase in performance accuracy with motion-type RSVP versus static-type was demonstrated, which could be accounted for by the shorter latencies and greater amplitudes of ERP components in the motion-type variation (Won et al 2017 ).

Out of the studies found, 22 used a button press while 23 did not. 70% of surveillance applications used a button press, and in categorization and face recognition studies the majority of applications also used one. 89% of RSVP-speller applications did not use a button press: BCI studies that involve spellers typically focus on movement-free communication and high information transfer rates, so button-press confirmation of targets is not standard practice in such applications (Orhan et al 2012, Oken et al 2014 ). In many of the studies that did not utilize a button press, researchers were focused on aspects of the RSVP paradigm other than reaction time, for example the comparison of two classification methods, or image duration (Sajda et al 2010 , Cecotti et al 2014 ). Combining EEG responses with a button press can improve accuracy, although more signal processing is required to remove noise that occurs as a result of participant movement (Healy and Smeaton 2011 ). Button-press confirmation is unnecessary unless an assessment of physical reaction time is an important aspect of the study.

Maguire and Howe ( 2016 ) instructed participants to use a button press following image blocks to indicate whether a target was consciously perceived as present or absent. Such an approach is useful when studying RSVP-based parameters and the limits of perception. However, button-press responses might be less useful than EEG responses during RSVP for data labeling or image sorting, where the focus is on labeling individual images within the burst. Nonetheless, Bigdely-Shamlo et al ( 2008 ) applied an image-burst approach where a button press at the end of the image burst was used to determine whether the participant saw a target image or not. The authors showed that airplanes could be detected in aerial shots with image bursts lasting 4100 ms and images presented at 12 Hz. The button press served well in determining correct and incorrect responses. In practice, however, a button press may be superfluous or infeasible.

A body of researchers is of the opinion that RSVP-related EEG accuracy must surpass button-press accuracy in order to be useful. However, this need not be the case, as Gerson et al ( 2006 ) report no significant differences in triage performance based on EEG recordings or button presses. Nevertheless, button-based triage performance is superior for participants who correctly respond to a high percentage of target images. Conversely, EEG-based triage alone is shown to be ideal for the subset of participants who responded correctly to fewer images (Gerson et al 2006 ). Hence, the most reliable strategy for image triaging in an RSVP-based paradigm may be to react to the target image with real-time button presses in conjunction with an EEG-based detection method. Target identification reflected in EEG responses can be confirmed by a button press, and through signal processing techniques both reported and missed targets can be identified.

Studies such as Marathe et al ( 2014 ) proposed methods for integrating button-press information with EEG-based RSVP classifiers to improve overall target detection performance. However, challenges arise when overlaying ERP and behavioral responses, such as issues concerning stimulus presentation speed and behavioral latency (Files and Marathe 2016 ). Crucially, Files and Marathe ( 2016 ) demonstrated that techniques for measuring real-time button-press accuracy start to fail at higher presentation rates. Given evidence of human capacity for semantic processing during 20 Hz image streams (approximately 50 ms per image), and response times (RTs) often being an order of magnitude greater than EEG responses, button presses may be unsuitable for faster RSVP-based image triaging.

Pending further studies investigating the reliability of fast detection of neural correlates, EEG-based responses have the potential to exceed button-press performance. However, it is not necessary for EEG-based RSVP paradigms to surpass button-press performance, and evidence suggests that a complement of both modalities at comfortable, lower presentation rates may indeed be the best approach. Ideally, studies would contain an EEG-only block and an EEG-plus-button-press block, where the button press follows the target and not the image burst. This would facilitate more accurate evaluation of differences and correlations between behavioral and neural response times. Interestingly, Bohannon et al ( 2017 ) presented a heterogeneous multi-agent system comprising computer vision, human and BCI agents, and showed that heterogeneous multi-agent image systems may achieve human-level accuracies in significantly less time than a single human agent by balancing the trade-off between time-cost and accuracy. In such cases a human–computer interaction may occur in the form of a button press if the confidence in the response of other, more rapid agents, such as RSVP-BCI agents or computer vision algorithms, is low for a particular sequence of stimuli.
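The kind of EEG/button-press integration discussed above can be illustrated with a simple late-fusion scheme. This is a hypothetical sketch, not the method of Marathe et al ( 2014 ): the function name, reaction-time window and weighting are illustrative assumptions.

```python
import numpy as np

def fuse_scores(eeg_scores, press_times, stim_times,
                rt_window=(0.2, 1.0), w_eeg=0.7):
    """Late fusion of per-stimulus EEG classifier scores with a binary
    button-press feature: a press counts towards a stimulus if it falls
    within a plausible reaction-time window after that stimulus onset."""
    press_times = np.asarray(press_times, dtype=float)
    fused = []
    for t, s in zip(stim_times, eeg_scores):
        lags = press_times - t
        pressed = np.any((lags >= rt_window[0]) & (lags <= rt_window[1]))
        fused.append(w_eeg * s + (1 - w_eeg) * float(pressed))
    return np.array(fused)

# Three stimuli at 0, 1 and 2 s; one press at 0.5 s boosts the first score.
fused = fuse_scores([0.9, 0.1, 0.2], press_times=[0.5], stim_times=[0.0, 1.0, 2.0])
```

Note that at high presentation rates a single press falls inside the window of several consecutive stimuli, which is precisely the assignment ambiguity described by Files and Marathe ( 2016 ).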

5.2. Type of stimuli

Surveillance is the largest RSVP-based BCI application area reported in this review, as reflected by the length of this subsection's discussion (Sajda et al 2003 , Erdogmus et al 2006 , Gerson et al 2006 , Poolman et al 2008 , Bigdely-Shamlo et al 2008 , Sajda et al 2010 , Huang et al 2011 , Cecotti et al 2012, Weiden et al 2012 , Matran-Fernandez and Poli 2014 , Marathe et al 2014 , Rosenthal et al 2014 , Yu et al 2014 , Marathe et al 2015, Barngrover et al 2016 , Cecotti 2016 , Files and Marathe 2016 ).

In a surveillance application study carried out by Huang et al ( 2011 ), the targets were surface-to-air missile sites. Target and non-target images shared low-level features such as local textures, which enhanced complexity; nonetheless, target images were set apart by large-scale features such as unambiguous road layouts. Another example of surveillance targets, described by Bigdely-Shamlo et al ( 2008 ), involved overlapping clips of London satellite images superimposed with small target airplane images, which could vary in location and angle within an elliptical focal area. Correspondingly, in Barngrover et al ( 2016 ), the prime goal was to correctly identify sonar images of mine-like objects on the seabed. Accordingly, a three-stage BCI system was developed whereby the initial stages entailed computer vision procedures, e.g. Haar-like feature classification (in which pixel intensities of adjacent regions are summed and the difference between regions computed), in order to segregate images into image chips. These image chips were then fed into an RSVP-type paradigm exposed to human judgment, followed by a final classification using a support vector machine (SVM).

In the categorization application type, images were sorted into different groups (Cecotti et al 2011 ). Alpert et al ( 2014 ) conducted a study whereby five image categories were presented: cars, painted eggs, faces, planes, and clock faces (Sajda et al 2014 ). A second study by Alpert et al ( 2014 ), containing target (cars) and non-target (scrambled images of the same car) categories, was also conducted. In both RSVP experiments, the proposed spatially weighted Fisher linear discriminant–principal component analysis (SWFP) classifier correctly classified a significantly higher number of images than the hierarchical discriminant component analysis (HDCA) algorithm. In terms of categorization, these studies provided empirical grounds for the intuitive claim that target categorization is more efficient when there is only one target image type, or when distractors are scrambled variations of the target image as opposed to different images altogether (Sajda et al 2014 ).

Face recognition applications have been used to investigate whether a recognition response can be delineated from an uninterrupted stream of faces in which no face can be independently recognized (Touryan et al 2011 ). Two of the three studies evaluated utilized face recognition RSVP paradigm spin-offs with celebrity/familiar faces as targets and novel, or other familiar or celebrity, faces as distractors (Touryan et al 2011 , Cai et al 2013 ). Cecotti et al ( 2011 ) utilized novel faces as targets amongst cars, with both stimulus types presented with and without noise. Utilizing the RSVP paradigm for face recognition applications is an unconventional approach; nonetheless, the ERP itself has been used extensively to study neural correlates of recognition and declarative memory (Yovel and Paller 2004 , Guo et al 2005 , MacKenzie and Donaldson 2007 , Dias and Parra 2011 ). Specifically, early and later components of the ERP have been associated with the psychological constructs of familiarity and recollection, respectively (Smith 1993 , Rugg et al 1998 ). There is thus substantial potential for the utility of the RSVP-based BCI paradigm for applications in facial recognition. In the future, RSVP-based BCI face recognition may be apposite in a real-world setting in conjunction with security-based identity applications to recognize people of interest. Furthermore, Touryan et al ( 2011 ) claimed that, based on the success of their study, RSVP paradigm-based EEG classification methods could potentially be applied to the neural substrates of memory. Indeed, some studies show augmentation in the posterior positivity of ERP components for faces that are later remembered (Paller and Wagner 2002 , Yovel and Paller 2004 ). That is to say, components of ERPs triggered by an initial stimulus may provide an indication of whether memory consolidation of the said stimulus will take place, which provides an interesting avenue for utilizing RSVP-based BCI systems to enhance human performance. Based on these studies, it is clear that relatively novel face recognition paradigms have achieved success when used in RSVP-based BCIs.

RSVP-based BCIs that assist with finding targets within images to support clinical diagnosis have received attention (Stoica et al 2013 ), for example in the development of more efficient breast cancer screening methods (Hope et al 2013 ). Hope et al ( 2013 ) is the only paper evaluated from the field of medical image analysis and hence is described in detail. During an initial sub-study, participants were shown mammogram images in which target lesions were present or absent. In a subsequent study, target red or green stimuli were displayed among a set of random non-target blobs. These studies facilitated comparison between 'masses' and 'no masses' in mammograms, and strong color-based images versus random distractors. Images were presented against a grey background in three-second bursts of 30 images (100 ms per image). A difference in the amplitude of the P300 potential was observed across studies, with a larger amplitude difference between target and non-target images in the mammogram study. The researchers attributed this to the semantic association with mammogram images, in contrast to the lack thereof in the colored-image-based study.

5.3. Total stimuli number and prevalence of target stimuli

The number of stimuli refers to the total number of stimuli presented, i.e. the same stimulus can be shown several times. An exception to this is RSVP-speller studies, where researchers only report the number of symbols used, i.e. 28 symbols (26 letters of the alphabet, space and backspace) (Hild et al 2011 ). In the RSVP-speller studies reviewed, the number of times each symbol was shown was not made explicit. RSVP-speller applications are likely to have significantly fewer stimuli than the other aforementioned applications, as participants are spelling out a specific word or sentence, which has only a small number of target letters/words. The integration of language models into RSVP-speller applications enables ERP classifiers to utilize the abundance of sequential dependencies embedded in language to minimize the number of trials required to classify letters as targets or non-targets (Orhan et al 2011 , Kindermans et al 2014 ). Some systems, such as the RSVP keyboard (described in Hild et al ( 2011 ), Orhan et al ( 2012a ), Oken et al ( 2014 )), display only a subset of available characters in each sequence. The sequence length can be defined automatically or be a pre-defined parameter chosen by the researcher. The next letter in a sequence becomes highly predictable in specific contexts, so it is not necessary to display every character in the RSVP speller. Studies show that target characters are generally displayed more than once before the character is selected. The length of a sequence and the ratio of target to non-target stimuli can affect the typing rate/performance. In an online study by Acqualagna et al ( 2011 ), participants were shown 30 symbols that were randomly shuffled ten times before a symbol was selected through classification and presented on screen. Orhan et al (2012) carried out an offline study whereby two healthy participants were shown three sequences (each consisting of 26 randomly ordered letters of the alphabet). The results of this study showed that the number of correctly identified symbols more than doubled when three sequences, rather than one, were used to identify targets.

Task complexity is enhanced by the multiplicity of target categories. In Poolman et al ( 2008 ) there were two blocks of target presentations: a helipad block with a 4% target prevalence; and a surface-to-air missile and anti-aircraft artillery block with a 1% target prevalence. Additionally, in Cecotti et al (2012) the targets were 50% vehicles and 50% people, with 50% stationary and 50% moving. Further to this, Weiden et al ( 2012 ) demonstrated that presenting kinetic images during the RSVP paradigm, as opposed to stationary images, increased the performance of EEG-based detection, and that this is negatively correlated with the cognitive load associated with the presented stimuli. In RSVP-speller applications task complexity varies based on the instructions participants are given, e.g. (1) participants may be asked to 'spell dog'; (2) 'type a word related to weather'; (3) participants can be given a word bank containing 20 words and asked to 'spell a word found within this word bank'. Half of the RSVP-speller-based BCI studies evaluated involved user-defined sequence lengths (instructions 2 and 3) (Acqualagna et al 2010 , Hild et al 2011 , Orhan et al 2012, Oken et al 2014 ), while the other half involved users being given a target word/sentence to spell (instruction 1). If a participant has to remember the sentence, or how to spell a long or unfamiliar word, this can increase the complexity of a task (i.e. dog is much easier to spell than idiosyncrasy) (Primativo et al 2016 ). Note, however, that these different complexities in instructions are only present in evaluation/training tasks with RSVP-BCI spellers; in real use, participants choose themselves what they want to spell. The RSVP-based text application allows the number of stimuli shown before a target stimulus to be reduced (i.e. letters such as ' z ' that are less commonly used can be shown less frequently).

Excluding RSVP-speller applications, as it is already known that they do not require the same number of stimuli as the other applications, the number of stimuli used typically varied between studies from approximately 800 in the surveillance application study by Sajda et al ( 2010 ) to 26 100 in a categorization application study by Sajda et al ( 2014 ). The most common target-stimuli percentage range was 1–10%, found in 61% of the studies reviewed, followed by 11–20% and then  >20%. A number of studies focus specifically on the percentage of target stimuli. In a study by Cecotti et al ( 2011 ), researchers investigated the influence of target probability when categorizing face and car images, using spatially filtered EEG signals as the input to a Bayesian classifier. With eight healthy participants, this method was evaluated using four target-probability conditions, i.e. 0.05, 0.10, 0.25 and 0.50. Target probability was found to affect both the participants' ability to detect targets and their behavioral performance, with the best mean AUC (0.82) achieved in the 0.10 condition. As the number and percentage of target stimuli used can affect the complexity of a task, it is important to keep the percentage of targets below 10% to evoke the P300 and maximize detection rates. This is in line with well-established P3 findings, whereby bigger gaps between target trials reduce peak latency and increase amplitude (Gonsalvez and Polich 2002 ).

5.4. Duration of stimuli presentation

A key factor of the RSVP paradigm is the rate of presentation, as the focus of the paradigm is presenting data rapidly so that large datasets can be analyzed in short periods. The duration for which stimuli were presented varied from 50 to 500 ms (Sajda et al 2003 , Touryan et al 2011 , Cai et al 2013 ). The upper limit for stimulus presentation time during the RSVP paradigm is ill-defined in the literature; however, we found 500 ms per image to be the maximum RSVP duration used across all RSVP studies. The duration of stimuli typically differs between applications. Table 3 shows that the most common stimulus duration was 100–199 ms per image. The shortest duration of 50 ms per image was used in a study by Sajda et al ( 2003 ) in which two participants were asked to identify people in natural scenes. In each trial, the duration of the stimulus presentation was decreased from 200 to 100 to 50 ms per image. Both participants showed reduced performance at the fastest presentation rate, i.e. 50 ms. This suggests that the most suitable duration for RSVP-based BCI applications is 100–200 ms, balancing the trade-off between accuracy and speed.

Table 3.  Variation of image duration in RSVP studies.

Overall, these limited findings suggest that presentation rates of  >10 Hz are infeasible for identification of neural correlates that allow successful identification of targets. Despite the low participant number in Sajda et al ( 2003 ), validation for this upper cut-off presentation rate may be provided by Raymond et al ( 1992 ), where the attentional blink was first described. An RSVP paradigm was undertaken whereby the participant had to register a target white letter in a stream of black letters, and a second target 'X' amongst this stream. It was found that if the 'X' appeared within ~100–500 ms of the initial target, errors in indicating whether the 'X' was present or not were likely to be made even when the first target was correctly identified (Raymond et al 1992 ). This is not to say that humans cannot correctly process information presented at  >10 Hz. Forster ( 1970 ) showed that participants can process words presented in a sentence at 16 Hz (16 words per second); however, sentence structure may have influenced the correct detection rate, which averaged four words per second for simple sentence structures and three words per second for complex sentences. Detection rates improve when words are presented at a slower pace, e.g. four relevant words per second, with masks (non-relevant words) presented between relevant words. Additionally, Fine and Peli ( 1995 ) showed that humans can process words at 20 Hz in an RSVP paradigm.

Potter et al ( 2014 ) assessed the minimum viewing time needed for visual comprehension using RSVP of a series of 6 or 12 pictures presented at between 13 and 80 ms per picture, with no inter-stimulus interval. They found that observers could determine the presence or absence of a specific picture even when the pictures in the sequence were presented for just 13 ms each, suggesting that humans are capable of detecting meaning in RSVP at 13 ms per picture. However, this finding challenges established feedback theories of visual perception. Specifically, researchers assert that neural activity needs to propagate from the primary visual cortex to higher cortical areas and back to the primary visual cortex before recognition can occur at the level of detail required for an individual picture to be detected (Maguire and Howe 2016 ). Maguire and Howe ( 2016 ) supported Potter et al ( 2014 ) in that the duration of this feedback process is likely  ⩾50 ms, and suggested that this is feasible based on work by Lamme and Roelfsema ( 2000 ), who estimated that response latencies at any hierarchical level of the visual system are ~10 ms. Therefore, assuming that a minimum of five levels must be traversed as activity propagates from V1 to higher cortical areas and back again, this feedback process is unlikely to occur in  <50 ms. However, Maguire and Howe ( 2016 ) identified a potential confound in Potter et al ( 2014 ): pictures in the RSVP sequence on occasion contained areas with no high-contrast edges and hence may not have adequately masked the preceding pictures. Consequently, Maguire and Howe ( 2016 ) replicated the study, ensuring that high-contrast edges covered the entire image. They were unable to find any evidence that meaning can be detected in an RSVP stream at 13 ms, or even 27 ms, per image, but found that at 53 and 80 ms this is possible.
On this basis, the upper limit of RSVP processing could be taken as ~20 Hz. Nonetheless, further study is needed to investigate the limits of human capability to rapidly distinguish target from non-target information, in comparison with the limit in detecting target-related versus non-target ERPs at 20 Hz presentation rates.

In all three face recognition studies, each face image was displayed for 500 ms (Cecotti et al 2011 , Touryan et al 2011 , Cai et al 2013 ). In two of the studies there was no ISI (Cecotti et al 2011 , Touryan et al 2011 ), and in the other an ISI of 500 ms was given to ensure ample time for image processing (Cai et al 2013 ). The speed at which face images were shown was thus reduced in comparison to the other RSVP applications. RSVP spellers most commonly use a duration of 400 ms; RSVP-spellers can benefit from slower stimulus durations with the incorporation of a language model to enable the prediction of relevant letters. The estimation of performance can be challenging in the RSVP paradigm when the ISI is small, as assigning a behavioral response (i.e. button press) to the correct image cannot be done with certainty. A solution to this problem is to assign behavioral responses to each image, so that researchers are able to establish hits or false alarms (Touryan et al 2011 ). When two targets are temporally adjacent with an SOA of 80 ms, participants are able to identify one of the two targets but not both; the SOA should therefore be at least 400 ms, and target images should not be shown straight after each other (Raymond et al 1992 ). Acqualagna et al ( 2010 ) used a four-factorial design examining classification accuracy when letters were presented with or without color at either 83 or 133 ms, with an ISI of 33 ms. The number of sequences was increased to enhance the accuracy of selecting the letter of choice. After ten sequences, ~90% mean accuracy was reached in the 133 ms color presentation mode (100% for 6/9 participants), while ~80% mean accuracy was reached in the 133 ms no-color mode (100% for 3/9 participants). At a presentation rate of 83 ms, mean accuracy was ~70% and there was no significant effect of color (chance rate 3.33%, i.e. 1 in 30).
This implies that colored letters enhance performance accuracy, but not beyond a certain speed of stimulus presentation.

There is likely a significant interaction between the difficulty of target identification and presentation rate: the optimal presentation rate for a given stimulus set is highly dependent on the difficulty of identifying targets within that set (Ward et al 1997 ). Image sets with low clutter, high contrast, no occlusion and large target size are likely amenable to faster presentation rates, while image sets with high clutter, low contrast, high levels of occlusion and small target sizes will require slower presentation rates (Rousselet et al 2004 , Serre et al 2007 , Hart et al 2013 , Liu and Kwon 2016 ). A more conclusive analysis of the effect of stimulus presentation duration for each application type could be derived by varying the presentation duration between 100, 200 and 500 ms whilst other parameters remain fixed. With regard to the temporal proximity of target images, 500 ms should be taken as the minimum to maximize performance.

5.5. Image size/visual angle

Another RSVP design aspect to be considered is stimulus size. There is a large variation in image sizes, ranging from 256 × 256 pixels in a categorization application to 960 × 600 pixels in a surveillance application. In general, surveillance applications use larger images than the other applications described. The most common image size used is 500 × 500 pixels. This size is only used in static surveillance applications, and all surveillance studies using it achieved a high accuracy (>80%). The other applications used smaller image sizes, such as 360 × 360 pixels, and achieved high accuracies (e.g. 91% and 89.7%). It can therefore be concluded that for surveillance studies image size should be at least 500 × 500 pixels, although for all other applications the image size may be smaller. A more complex task is one where a target stimulus is presented in the background of a larger image, eliciting the N2 ERP. Early components such as the P1 and N2 are sensitive to the spatial location of the stimuli (Saavedra and Bougrain 2012 ).

One issue with reporting only image size is that it is only meaningful relative to the viewing distance from the screen and the image's location on the screen with respect to the viewer, i.e. the visual angle. The visual angle is the angle an image subtends at the eye, reported in degrees of arc. In a study by Dias and Parra ( 2011 ) it was shown that participants performed best (90%) when the target stimulus was centered. Performance consistently decreased to 50% in all participants as the target stimulus was placed further from the center (4° of visual angle), and dropped further when the target stimulus was placed at 8° of visual angle. Although performance drops significantly, participants are still able to detect target stimuli shown in their peripheral visual field even at such rapid paces. Many papers report that the visual angle of the stimuli can have an effect on performance. As a general principle, targets must appear larger or be more distinct for detection at the outer edge of the visual field. The visual angle can thus be deemed the most important measure, as it accounts for distance from the screen, image location on the screen and image size. Authors are therefore encouraged to report visual angle, as reporting image size alone is not useful without the viewing distance. For RSVP-speller studies, none of the papers found reported the size of the image or font; however, some reported the visual angle.
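Since visual angle is the recommended measure, the conversion is worth stating explicitly: a stimulus of physical size s viewed at distance d subtends 2·arctan(s/2d) degrees. A short sketch follows; the screen dimensions and viewing distance in the example are illustrative, not taken from any reviewed study.

```python
import math

def stimulus_size_cm(pixels, screen_px, screen_cm):
    """Convert a stimulus extent in pixels to centimetres on screen."""
    return pixels * screen_cm / screen_px

def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees of arc) subtended by a stimulus:
    theta = 2 * atan(size / (2 * distance))."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

# Example: a 500-pixel-wide image on a 1920-pixel, 53 cm wide display
# viewed from 60 cm is ~13.8 cm wide and subtends ~13 degrees.
size = stimulus_size_cm(500, 1920, 53)
angle = visual_angle_deg(size, 60)
```

Reporting the angle directly removes the ambiguity of image size without viewing distance.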

5.6. Target versus non-target stimuli

Many different types of target images have been identified within this review. The majority of research focuses on a two-class problem, i.e. detecting target images in sequences of non-target images that are completely different from each other. However, in real-life situations, non-target images are likely to share some of the same characteristics as target images (Marathe et al 2015), and such presentation sequences appear more like moving images than static images. In Marathe et al (2015) a more complex surveillance task was carried out: in the first task, participants were required to detect targets when targets were the only infrequent image, whilst in the second task targets were presented alongside infrequent non-targets (i.e. the target image could be found in the background of a larger image). Participants were required to ignore everything else in the image, a much more difficult task, and consequently the amplitude of the P300 was reduced. The results of this study showed that the introduction of infrequent non-target stimuli in the scene yielded a substantial slowing of the reaction time. Surveillance applications commonly use more complex stimuli, for which trained participants, such as intelligence analysts, outperform novice participants, as they are able to give meaning to the stimuli. RSVP-speller applications present their letters as images one at a time on screen (Hild et al 2011 ). Due to the nature of the RSVP paradigm, it is important that these letters are shown in a random order, as participants pre-empting a target can have an effect on ERP responses (Oken et al 2014 ). Data categorization applications had the most variance between the different types of stimuli presented to a participant; however, these stimuli tend to be everyday items that participants can easily recognize.

5.7. Signal processing

All applications have certain requirements in terms of speed and the type of images displayed which, as outlined above, can influence the ERP and therefore also variations in performance as measured by detection accuracy. The signal processing framework plays an important role in coping with variations in the ERP and maximizing performance. There is a likely trade-off between the design parameters described above and the level of sophistication built into the signal processing framework, which often varies across studies. Here we review some of the approaches applied.

5.7.1. Pre-processing.

To extract the relevant features, data is first pre-processed to improve the signal-to-noise ratio (SNR). The signal is band-pass filtered, with cut-offs varying by application, in order to remove high-frequency noise and artifacts (such as muscle activity). Generally, lower and upper cut-off frequencies of around 0.1 Hz and 30–40 Hz are used, respectively. The data is then often downsampled and, for offline analyses, electrodes with substantial noise are removed through visual inspection of the EEG data or automated approaches based on thresholding, or on correlating artifacts in EEG channels with simultaneously recorded electrooculography or electromyography. Data is then epoched into segments typically lasting ~600 ms, from 100 ms prior to stimulus onset to 500 ms post-stimulus onset. The starting point and duration of the epochs selected for further analysis vary from study to study.
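A minimal version of this pipeline (band-pass filter, then epoch from 100 ms before to 500 ms after each stimulus onset) can be sketched with SciPy. The cut-offs, filter order and function name below are illustrative assumptions, not a prescription from the reviewed studies.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def preprocess(eeg, fs, events, lo=0.1, hi=30.0, tmin=-0.1, tmax=0.5):
    """Band-pass filter continuous EEG (n_channels x n_samples), then
    cut epochs around each stimulus-onset sample index in `events`."""
    # Zero-phase band-pass (0.1-30 Hz by default), second-order sections
    # for numerical stability at the very low high-pass edge.
    sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, eeg, axis=1)
    pre, post = int(-tmin * fs), int(tmax * fs)
    epochs = [filtered[:, e - pre:e + post] for e in events
              if e - pre >= 0 and e + post <= eeg.shape[1]]
    return np.stack(epochs)  # (n_epochs, n_channels, n_times)
```

Downsampling and artifact-channel rejection, mentioned above, would slot in before and after the filtering step respectively.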

5.7.2. Feature extraction.

Feature extraction is applied to the data for dimensionality reduction and to extract discriminant, non-redundant features. Feature extraction can be difficult due to the low SNR of single-trial analysis; conventionally, averaging over multiple repeated trials is used to overcome this. Many studies employ spatial filtering to extract ERPs from EEG. Spatial filtering methods used include principal component analysis (PCA) (Sajda et al 2003 , Alpert et al 2014 ), independent component analysis (Bigdely-Shamlo et al 2008 , Blankertz et al 2011 , Kumar and Sahin 2013 ) and the xDAWN algorithm, which maximizes the SNR between target and non-target stimulus classes (Rivet et al 2009 , Rivet and Souloumiac 2013 , Cecotti et al 2014 ). In the case of image triage, where the intention is to classify single-trial ERPs, spatial filters are used to enhance the SNR and exploit spatial redundancy (e.g. Parra et al ( 2005 )). Yu et al ( 2011 ) went a step further by utilizing a methodology that considers both spatial and temporal features to improve single-trial detection accuracy, suggesting that a bilinear common spatial pattern (BCSP) outperforms common spatial pattern (CSP) filters (composite and common spatial pattern filters). It should be noted, however, that CSP spatial filters were not designed to classify ERPs but to classify oscillatory EEG activity; CSPs ignore the EEG time course (i.e. the ERP) and are thus suboptimal for RSVP-BCI. We would recommend using spatial filters dedicated to ERP classification, such as xDAWN, which have been used successfully in many RSVP-BCIs. Spatial filtering is normally only performed on high-density EEG data, which might be impractical in certain real-life applications (Parra et al 2005 ), although high-density EEG data has been reported to increase accuracy (Ušćumlić et al 2013 ). Table 4 shows the most common method used for different application types.
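The idea behind ERP-dedicated filters such as xDAWN (maximize evoked-response power relative to overall EEG power) can be illustrated with a generalized eigendecomposition. This is a deliberately simplified sketch, not Rivet et al's full algorithm, which additionally models overlapping evoked responses; all names and shapes here are our assumptions.

```python
import numpy as np
from scipy.linalg import eigh

def erp_spatial_filters(epochs, labels, n_filters=2):
    """xDAWN-style spatial filters via a generalized eigenproblem.
    epochs: (n_epochs, n_channels, n_times); labels: 1 = target."""
    labels = np.asarray(labels)
    erp = epochs[labels == 1].mean(axis=0)       # average target ERP
    Cs = erp @ erp.T / erp.shape[1]              # evoked-signal covariance
    X = np.concatenate(list(epochs), axis=1)     # channels x (all samples)
    Cn = X @ X.T / X.shape[1]                    # overall (noise) covariance
    _, vecs = eigh(Cs, Cn)                       # eigenvalues ascending
    return vecs[:, ::-1][:, :n_filters]          # filters with largest ratio

# Apply to one epoch: virtual_channels = W.T @ epoch -> (n_filters, n_times)
```

Projecting each epoch onto the leading filters yields a few "virtual channels" in which the target ERP dominates, which is what downstream classifiers then consume.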

Table 4.  Parameters and recommendations for RSVP-based BCIs.

Face recognition applications differ from other applications because face images evoke additional ERPs beyond the P300. Faces typically evoke an N170 component that differs between targets and non-targets (Maurer et al 2008, Luo et al 2010). The vertex positive potential is also associated with face recognition (Zhang et al 2012). The midfrontal FN400 and the later parietal FP600 components have been associated with familiarity and recollection, respectively (MacKenzie and Donaldson 2007). Specifically, the amplitude of the FP600 (a positive deflection >500 ms post-stimulus) was found to correlate significantly with the extent of face familiarity (Touryan et al 2011). Spatial filters that exploit both spatial and temporal features may offer an advantage over conventional spatial filters that exploit only spatial redundancy, e.g. Yu et al (2011). However, such spatial filtering is normally performed only on high-density EEG data, which might be impractical in certain real-life applications (Parra et al 2005).

5.7.3. Classification.

This review found that many different classification methods have been used in the included studies; however, some conclusions can be drawn. Linear classifiers are the most prevalent in RSVP-based BCIs. EEG can often contain information that enables correct classification of the stimuli even when a participant's behavioral response is incorrect (Sajda et al 2003, Bigdely-Shamlo et al 2008). The two most commonly used classifiers were linear discriminant analysis (LDA) and SVM, or variations of the two, such as Bayesian LDA (BLDA) and radial basis function SVM, respectively. Parra et al (2008) presented an RSVP framework that projects the EEG data matrix bi-linearly onto temporal and spatial axes. This framework is versatile: for example, it has been applied to classify target natural scenes and satellite missile images (Gerson et al 2006, Sajda et al 2010). Alpert et al (2014) presented a two-step linear classifier, which achieved classification accuracy suited to real-world applications (Sajda et al 2014), whilst Sajda et al (2010) proposed a two-step system applying computer vision and then EEG to optimize classification. The performance of an ensemble LDA classifier diminished when eight centro-parietal EEG channels were utilized as opposed to the full 41 EEG channels (Ušćumlić et al 2013). In contrast, Healy and Smeaton (2011) claimed that additional channels might introduce noise rather than add categorical information, as indicated by results from one study participant.
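A minimal sketch of the common LDA pipeline on windowed ERP amplitude features follows, using synthetic single-trial data. The window lengths, shrinkage regularization and signal model are illustrative assumptions, not parameters taken from any specific reviewed study.

```python
# Hedged sketch: single-trial ERP classification with shrinkage LDA on
# temporally averaged window features (all parameters illustrative).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_trials, n_channels, n_samples = 200, 8, 100
y = (rng.random(n_trials) < 0.2).astype(int)    # ~20% targets (oddball rate)

# Synthetic trials: noise everywhere, plus a P300-like bump on target trials
X = rng.normal(size=(n_trials, n_channels, n_samples))
p300 = np.exp(-0.5 * ((np.arange(n_samples) - 40) / 8.0) ** 2)
X[y == 1] += 0.8 * p300                          # broadcast over channels

# Feature extraction: mean amplitude in 10 consecutive windows per channel
windows = X.reshape(n_trials, n_channels, 10, 10).mean(axis=3)
features = windows.reshape(n_trials, -1)         # trials x (channels*windows)

# Shrinkage-regularised LDA, evaluated by cross-validated AUC
clf = LinearDiscriminantAnalysis(solver='lsqr', shrinkage='auto')
auc = cross_val_score(clf, features, y, cv=5, scoring='roc_auc').mean()
```

Averaging within windows trades temporal resolution for noise reduction, which is why windowed amplitudes remain a popular ERP feature set.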

For surveillance applications, SVM achieved the highest percentage accuracies (Huang et al 2011, Weiden et al 2012). For the RSVP-speller application, the most common classification method was regularized discriminant analysis (RDA), which achieved AUC performances of 0.948–0.973 (Orhan et al 2011). Step-wise LDA (SWLDA) was also used in RSVP-speller applications with high AUC performances and accuracies (0.82, 0.84, 86%, 89%) (Hope et al 2013). In face recognition applications, the best AUC performance was produced using an SVM classifier (Cai et al 2013). Within this review, only one medical application was identified (Hope et al 2013); the researchers achieved high accuracy using Fisher discriminant analysis. BLDA classifiers were also used, achieving high levels of accuracy (79%). The SWFP algorithm outperformed the HDCA algorithm by 10% in categorization applications. Touryan et al (2011) demonstrated that EEG classification methods applied to categorization procedures can be adapted to rapid face recognition procedures. Windows of 128, 256 and 512 ms post stimulus onset were fed into the classifiers. High AUC values (average AUC = 0.945) were reported for the customized PCA models used to describe the changes in ERPs between familiar (famous and personal) and novel faces displayed for 500 ms at a time. It is the customized versions of these models, i.e. the models developed for each participant using only that participant's data, that were shown to improve classification performance by accounting for individual variability in the windowed ERP components.

Many of the BCI algorithms presented in tables 1 and 2 are linear, enabling simple and fast training with resilience to the overfitting often caused by noise, making them well suited to single-trial EEG classification. Nonetheless, linear methods can limit feature extraction and classification, and non-linear methods, e.g. neural networks, are more versatile in modeling data of greater variability, which also suits single-trial EEG classification (Erdogmus et al 2006, Huang et al 2006, Lotte et al 2007). The use of neural networks, in particular deep neural networks, in the RSVP-based BCI framework is an attractive direction and has shown promise over standard linear methods (Manor et al 2016, Huang et al 2017). A convolutional neural network was shown to outperform a two-step linear classifier on the same dataset (Sajda et al 2014, Manor and Geva 2015).

The majority of the studies reviewed investigated the effectiveness of classifiers in identifying single-trial EEG correlates of target stimuli presented through an RSVP-type paradigm. However, the spatial filtering technique, as well as the type of classifier used, affects single-trial EEG detection performance (Bigdely-Shamlo et al 2008, Cecotti et al 2014). For example, independent component analysis reportedly identifies and separates multiple classes of non-brain artifacts associated with eye and head movements, which is useful for EEG de-noising in real-world applications where operators are mobile (Bigdely-Shamlo et al 2008).

Additionally, Cecotti et al (2014) evaluated three classifiers with three different spatial filtering methods, so twelve techniques in total were compared across three different RSVP paradigms. Addressing the real-world applicability of RSVP-based BCI systems, Marathe et al (2015) built on work tackling the thorough recalibration required for real-time BCI system optimization, utilizing an active learning technique to reduce the number of training samples required to calibrate the classifier. Active learning is a partially supervised, iterative learning technique that reduces the amount of labeled data required for training. The need for recalibration depends on factors such as user attentiveness, the physical surroundings and task-specific factors.
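The general idea of pool-based active learning can be sketched as follows. This is a generic uncertainty-sampling illustration on synthetic data, not the specific method of Marathe et al (2015); the classifier, query budget and data model are all assumptions made for the example.

```python
# Hedged sketch of pool-based active learning: iteratively label the trials
# the current classifier is least certain about, to cut calibration labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n_pool, n_features = 500, 20
y_pool = (rng.random(n_pool) < 0.5).astype(int)
X_pool = rng.normal(size=(n_pool, n_features))
X_pool[y_pool == 1, :3] += 1.0                       # class signal on 3 features

# Seed set: a few labelled trials from each class
labelled = list(np.where(y_pool == 0)[0][:5]) + list(np.where(y_pool == 1)[0][:5])

for _ in range(5):                                   # five query rounds
    clf = LogisticRegression(max_iter=1000).fit(X_pool[labelled], y_pool[labelled])
    proba = clf.predict_proba(X_pool)[:, 1]
    uncertainty = -np.abs(proba - 0.5)               # near 0.5 = least certain
    uncertainty[labelled] = -np.inf                  # never re-query a labelled trial
    labelled += list(np.argsort(uncertainty)[-5:])   # query the 5 most uncertain

final = LogisticRegression(max_iter=1000).fit(X_pool[labelled], y_pool[labelled])
accuracy = final.score(X_pool, y_pool)               # trained on 35 of 500 labels
```

The classifier reaches useful accuracy after labelling only a small fraction of the pool, which is the motivation for active-learning calibration in BCIs.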

There is growing interest in the use of transfer learning (TL) to reduce or eliminate calibration and thereby encourage the real-world applicability of BCIs (Wang et al 2011). With TL, EEG data or classifiers from a given domain are transformed so that they can be applied to another domain, possibly increasing the amount of data available for the target domain (Wang et al 2011). For RSVP-BCIs, this typically consists of combining EEG data or classifiers from different participants in order to classify EEG data from another participant for whom very little, or even no, calibration EEG data is available. An unsupervised transfer method, spectral transfer with information geometry (STIG), ranks and combines unlabeled predictions from an ensemble of information geometry classifiers, each trained on an individual participant (Waytowich et al 2016). Waytowich et al (2016) showed that STIG can be used for single-trial detection in ERP-based BCIs, eliminating the requirement for taxing data collection for training. With access to limited data, STIG outperformed alternative zero-calibration and calibration-reduction algorithms (Waytowich et al 2016). Within the BCI community, conventional TL approaches still necessitate training for each condition; however, methodologies have been applied to eradicate the need for subject-specific calibration data by leveraging large-scale data from other participants (Wei et al 2016). This demonstrates the potential for single-trial classification via unsupervised TL and user-independent BCI deployment.
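The zero-calibration idea behind such TL approaches can be sketched, in simplified form, as pooling labelled trials from source participants and applying the resulting classifier to an unseen participant. This illustrates only the general principle, not the STIG algorithm; the subject model and gains are invented for the example.

```python
# Hedged sketch of zero-calibration transfer: train on pooled source
# participants, test on a new participant with no calibration data.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)

def simulate_subject(erp_gain, n_trials=150, n_features=40):
    """Toy subject: targets shift a subset of features by a subject-specific gain."""
    y = (rng.random(n_trials) < 0.25).astype(int)
    X = rng.normal(size=(n_trials, n_features))
    X[y == 1, :10] += erp_gain                 # ERP-like shift on first 10 features
    return X, y

# Source pool (data from other participants) and a new, unseen participant
sources = [simulate_subject(g) for g in (0.8, 1.0, 1.2)]
X_pool = np.vstack([X for X, _ in sources])
y_pool = np.hstack([y for _, y in sources])
X_new, y_new = simulate_subject(0.9)

clf = LinearDiscriminantAnalysis(solver='lsqr', shrinkage='auto').fit(X_pool, y_pool)
transfer_auc = roc_auc_score(y_new, clf.decision_function(X_new))
```

Because the toy subjects share a common discriminative structure, the pooled classifier transfers well; real EEG exhibits larger inter-subject variability, which is precisely what methods such as STIG are designed to handle.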

5.8. Suggested parameters

The parameters reviewed here have been selected because they affect one or more of the following aspects of the RSVP paradigm: task complexity, stimulus complexity, stimulus saliency and information transmission. Performance in RSVP-based BCIs is measured as the ability to correctly identify oddball images in a sequence, using two different measures: accuracy (the percentage of targets correctly identified using EEG) and ROC curves. 10% of the papers assessed in this review did not report at least one of these performance measures (ROC/percentage accuracy). The accuracies of the different studies need to be put in context, as all the reviewed parameters and other observed parameters, e.g. the number of trials and participants, will influence study accuracy. Table 4 provides parameter recommendations for designing RSVP-based BCIs for the different application types; these have been discussed throughout section 5. In particular, table 4 suggests the parameters to use for each application, according to those leading to the best detection performances (accuracy or AUC) in comparative studies. Where no formal comparisons between parameters were available for a specific application or parameter, the most popular parameter values that yield good performances are given.
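The two performance measures can be computed directly from single-trial classifier scores; the scores and labels below are invented solely to illustrate why accuracy alone can mislead under the low target probability typical of RSVP oddball tasks.

```python
# Hedged sketch: the two standard RSVP-BCI performance measures, on
# hypothetical single-trial classifier scores (all numbers invented).
import numpy as np
from sklearn.metrics import roc_auc_score, accuracy_score

y_true = np.array([0, 0, 0, 0, 1, 0, 0, 0, 0, 1])   # ~20% target probability
scores = np.array([.1, .3, .2, .1, .9, .4, .2, .1, .3, .7])
y_pred = (scores >= 0.5).astype(int)                  # threshold at 0.5

acc = accuracy_score(y_true, y_pred)     # fraction of trials labelled correctly
auc = roc_auc_score(y_true, scores)      # threshold-free ranking quality

# With rare targets, even a classifier that always says "non-target" scores
# high accuracy, which is why AUC (or both measures) should be reported.
trivial_acc = accuracy_score(y_true, np.zeros_like(y_true))
```

Here the real classifier reaches perfect accuracy and AUC, but the trivial "always non-target" baseline already scores 80% accuracy, illustrating the class-imbalance caveat.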

Applying BCI systems commercially and outside the lab in real-world scenarios will require the system to be robust during the execution of tasks of increasing difficulty. Section 5 summarized the five application areas that have been studied most extensively in the context of RSVP-based BCIs, and tackled intra-application comparisons of various aspects of the papers that met the inclusion/exclusion criteria. A few of the papers found in this review carried out more than one study in different application types. The most common type of application was surveillance, followed by RSVP-speller and categorization applications, then face recognition and lastly medical applications. Although there are relatively few studies, the design parameters and the focal points of the different applications vary widely.

6. Discussion and conclusions

With the increasing intensity of RSVP-based BCI research, there is a need for further standardization of experimental protocols to compare and contrast developments across the different applications described in this review. This will aid the realization of a platform that researchers can use to develop RSVP paradigms, compare their results and determine the optimal RSVP-based BCI paradigm for their application type. This paper presents a review of the available research, the defining elements of that research, and a categorization approach that will facilitate coordination among researchers in the field. Research has revealed that combining RSVP with BCI technology allows the detection of targets at an expedited rate without detriment to accuracy.

Understanding the neural correlates of visual information processing can create symbiotic interaction between human and machine through BCIs. Further development of RSVP-based BCIs will depend on both basic and applied research. Within the last five years there have been advancements in how studies are reported, and a sufficient body of evidence exists in support of the development and application of RSVP BCIs. However, the research needs to be developed further, with standardized protocols applied, so that comparative studies can be conducted. Many ERP reviews have been carried out; this paper, however, focuses on RSVP visual search tasks with high variability in targets and in the parameters used. It provides guidelines on which parameters impact performance and on which parameters should be reported so that studies can be compared. It is important that the design aspects shown in tables 1 and 2 are reported and described within each research study. It has been shown that RSVP-based BCIs can process target images with low target probability in multiple application types, but inconsistent reporting methods make it difficult to truly compare one paradigm, or one parameter setup, to another.

Percentage accuracy and area-under-the-ROC-curve values have been widely reported; nonetheless, there is room for more studies to adopt this unofficial standardization across RSVP-based BCI research.

To maximize comparability with the pre-existing literature, by keeping one feature that contributes to cognitive load constant, it is recommended that studies utilizing more than one category type as targets also conduct the same study with just one target category in the first instance.

For all applications it is necessary to choose an epoch for single-trial ERP classification corresponding to the temporal evolution of the most robust ERP components, which are, on the whole, established in the literature as associated with the task at hand, i.e. target stimuli identified by their infrequency, recognizability, relevancy or content. Whether the duration of stimulus presentation must extend beyond the latency of the ERP components relative to stimulus presentation, however, remains questionable.

This review found a single medical application. More research applying the RSVP-based BCI paradigm to high-throughput screening within medicine is highly encouraged, on the basis that similarly complex imagery has been categorized relatively successfully in other applications, e.g. side-scan sonar imagery of mines, or aircraft amongst bird's-eye views of maps in surveillance (Bigdely-Shamlo et al 2008, Barngrover et al 2016). The medical application of RSVP-based BCIs has immense potential in diagnostics and prognostics, through the recognition and tracking of established disease biomarkers and the acceleration of high-throughput health image screening.

Studies utilized varying image sizes, visual angles and participant distances from the screen. Researchers are encouraged to report visual angle, as it accounts for both image size and the distance of the participant from the screen. A potential way to facilitate uniformity of these variables is to utilize a head-mounted display (HMD) or virtual reality (VR) headset such as an Oculus Rift (Foerster et al 2016). Rapid visual information processing capacity is heavily dependent on visual parameters, and use of an HMD would enable standardization of viewing distance, room lighting and visual angle (Foerster et al 2016). Use of a VR headset could distort electrode positions; nonetheless, this effect could be easily mitigated. BCIs employing motion-onset visual evoked potentials (mVEP) have been shown to be feasible with VR headsets in neurogaming (Beveridge et al 2016), where mVEP responses were evaluated in relation to mobile, complex and varying graphics within game distractors (Beveridge et al 2016). Foerster et al (2016) used the Oculus Rift VR device for neuropsychological assessment of visual processing capabilities. This device is head-mounted and covers the entire visual field, thereby shielding and standardizing the visual stimulation, and may therefore improve test–retest reliability. Compared to performance with a CRT screen, visual processing speed, the threshold of conscious perception and the capacity of visual working memory did not differ significantly with the VR headset. VR headsets may therefore be applicable for standardized and reliable assessment and diagnosis of elementary cognitive functions in laboratory and clinical settings, maximizing the opportunity to compare visual processing components between individuals and institutions and to establish statistical norm distributions.
Recently, a VR–EEG combined headset with electrodes embedded over occipital areas for ERP detection has been reported for neurogaming (www.neurable.com). RSVP-based BCI paradigms may therefore benefit from head-mounted visual displays; however, a vision-obscuring headset may not be appropriate in some contexts, as it could limit the ability of users, e.g. people with disabilities, to communicate with their peers and environment. Such a headset may prevent the expressive or receptive use of non-verbal communication, such as eye movements and facial expressions, which is vital for users who rely on non-verbal communication.

Advances towards RSVP of targets within moving sequences have shown promising results, although movie clips are more difficult to study because the stimulus onset event is less clearly defined. A remaining challenge in this area is to design signal processing tools that can deal with imprecise stimulus beginnings and ends (Cecotti 2015). An advantage of moving mode, however, is that the target stimulus remains on screen for longer than in static mode, giving participants the opportunity to confirm a target stimulus. Moving-stimulus studies to date have been limited to surveillance applications, so further investigation is needed in this area. Just over half of the papers used the button-press mode in conjunction with one of the other modes, as not all of the studies were concerned with comparing EEG responses to motor responses. It is important to develop a scale to rank the difficulty of tasks; this would enable the comparison of paradigms at the same level. The key outcomes of this study are shown in table 4, provided as suggested guidelines: parameters that may be useful to researchers when designing RSVP-based BCI paradigms for the different application types. From this review, we conclude that using these parameters will enable more consistent performance across the different application types and improved comparison with new studies.

In acknowledgment of the need for standardization of parameters for RSVP-based BCI protocols, Cecotti et al (2011) raise an interesting proposal: other parameters, such as the optimal ISI length, classifier and spatial filter, could be automatically prescribed in accordance with the chosen target likelihood (Cecotti et al 2011). Such an infrastructure for parameter choices does not currently exist, as studies have focused on the impact of individual parameters.

Future studies would benefit from iterative changes in design parameters. This would allow a comparative study of the different design parameters and enable identification of the parameters that most affect the experimental paradigm. A study increasing the rate of presentation until classification begins to deteriorate significantly, for various types of stimulus categories, may indicate the maximum possible speed of RSVP-BCIs. Additionally, a future development for RSVP-based BCIs might be to use real-life imagery with numerous distractor stimuli amongst the target stimuli; this is a more difficult task, but it would enhance the relatability of the paradigm to real-life applications. Hybridizing RSVP BCIs with other BCI paradigms has also started to receive more attention. In one such system (Kumar and Sahin 2013), users navigate using motor imagery movements (left, right, up and down), search queries are spelt using the Hex-O-Speller, and results retrieved from a web search engine may be fed back to the user using RSVP. This shows the potential of the RSVP paradigm to aid physically impaired users. Eye tracking can be used as an outcome measure to assess and enhance RSVP stimuli and presentation modes: researchers can establish where the participant's gaze is focused during erroneous trials and explore correlations between gaze variability and performance. With the RSVP-based BCI paradigm there is much scope to evaluate different data types and imagery. This is a fast-growing field with a promising future, with multiple opportunities and a large array of potential RSVP-BCI paradigm setups. Researchers in the field are therefore recommended to consider the literature to date and the comparative framework proposed in this paper.


  • Open access
  • Published: 23 January 2020

Neural dynamics of the attentional blink revealed by encoding orientation selectivity during rapid visual presentation

  • Matthew F. Tang   ORCID: orcid.org/0000-0001-5858-5126 1 , 2 , 3 ,
  • Lucy Ford   ORCID: orcid.org/0000-0002-7665-8904 1 ,
  • Ehsan Arabzadeh   ORCID: orcid.org/0000-0001-9632-0735 2 , 3 ,
  • James T. Enns   ORCID: orcid.org/0000-0002-3676-8316 4 , 5 ,
  • Troy A. W. Visser 5 &
  • Jason B. Mattingley   ORCID: orcid.org/0000-0003-0929-9216 1 , 2 , 6 , 7  

Nature Communications volume 11, Article number: 434 (2020)


  • Human behaviour
  • Sensory processing

The human brain is inherently limited in the information it can make consciously accessible. When people monitor a rapid stream of visual items for two targets, they typically fail to see the second target if it occurs within 200–500 ms of the first, a phenomenon called the attentional blink (AB). The neural basis for the AB is poorly understood, partly because conventional neuroimaging techniques cannot resolve visual events displayed close together in time. Here we introduce an approach that characterises the precise effect of the AB on behaviour and neural activity. We employ multivariate encoding analyses to extract feature-selective information carried by randomly-oriented gratings. We show that feature selectivity is enhanced for correctly reported targets and suppressed when the same items are missed, whereas irrelevant distractor items are unaffected. The findings suggest that the AB involves both short- and long-range neural interactions between visual representations competing for access to consciousness.


Despite the remarkable capacity of the human brain, it is found wanting when undertaking multiple tasks concurrently, or when several goal-relevant items must be dealt with in rapid succession. These limitations are particularly evident when individuals are required to execute responses to multiple items under time pressure 1 , 2 , or when they must report relevant target items that appear briefly and in rapid succession 3 , 4 , 5 . Elucidating the source of these limitations has been a persistently difficult challenge in neuroscience and psychology. While the neural bases for these processing limits are not fully understood, it is widely assumed that they are adaptive because they provide a mechanism by which selected sensory events can gain exclusive control over the motor systems responsible for goal-directed action.

Here we address a long-standing question concerning the neural basis of the widely studied attentional blink (AB) phenomenon, where observers often fail to report the second of two target items (referred to as T2) when presented within 200–500 ms of the first target (T1) in a rapid stream of distractors 3 , 4 , 5 . Functional magnetic resonance imaging (fMRI) lacks the temporal resolution to accurately characterise neural activity associated with the rapid serial visual presentation (RSVP) tasks presented at rates of 8–12 Hz, which are commonly used to elicit the AB 6 , 7 . Even electroencephalography (EEG), which has relatively good temporal resolution, produces smeared responses to items in an RSVP stream 8 . Furthermore, mass-univariate approaches applied to fMRI or EEG data only measure overall neural activity while providing no information about how neural activity represents featural information carried by single items (e.g., their orientation).

Here we overcome these limitations by combining recently developed multivariate modelling techniques for neuroimaging 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 with an RSVP task designed to determine the neural and behavioural basis for the AB. Forward (or inverted) encoding modelling determines the neural representation of feature-selective information contained within patterns of brain activity, using multivariate linear regression. This approach allowed us to explicitly measure the neural representation of specific features—in this case, orientation-selective information elicited by grating stimuli—separately for each item within an entire RSVP stream.
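The logic of forward (inverted) encoding can be sketched as follows, using a hypothetical basis of rectified-cosine orientation channels and simulated sensor data. The basis shape, channel count and noise level are illustrative assumptions, not the parameters used in this study.

```python
# Hedged sketch of a forward/inverted encoding model for orientation:
# fit channel->sensor weights by least squares, then invert them to
# recover channel responses from held-out sensor data (all data simulated).
import numpy as np

rng = np.random.default_rng(3)
n_chan_basis, n_sensors, n_trials = 6, 32, 400

def channel_responses(thetas_deg):
    """Idealised tuning curves: rectified cosines, centres every 30 deg."""
    centres = np.arange(n_chan_basis) * 180 / n_chan_basis
    d = np.deg2rad(thetas_deg[:, None] - centres[None, :])
    return np.maximum(np.cos(2 * d), 0) ** 5     # orientation is 180-deg periodic

# Simulate sensor data as a fixed linear mixture of channel responses + noise
thetas = rng.uniform(0, 180, n_trials)
C = channel_responses(thetas)                    # trials x channels
mixing = rng.normal(size=(n_chan_basis, n_sensors))
B = C @ mixing + 0.1 * rng.normal(size=(n_trials, n_sensors))

# Forward model: estimate weights on training trials, invert on test trials
train, test = slice(0, 300), slice(300, None)
W, *_ = np.linalg.lstsq(C[train], B[train], rcond=None)   # channels -> sensors
C_hat, *_ = np.linalg.lstsq(W.T, B[test].T, rcond=None)   # sensors -> channels
C_hat = C_hat.T

# Recovered channel responses should correlate with the true ones
r = np.corrcoef(C_hat.ravel(), C[test].ravel())[0, 1]
```

Because each trial's orientation is uncorrelated with the others, the regression cleanly attributes sensor activity to the orientation channels, which is the property the experimental design above exploits.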

We use this approach to address two central theoretical questions. First, does selection of a target from within an RSVP stream increase the gain or the precision of its neural representation? Previous efforts to answer this question in the domain of spatial attention have come from single cell recordings in non-human primates 17 , 18 , as well as whole-brain activity measured using fMRI and EEG in humans 15 , 19 . With few exceptions, these studies have found that spatial attention increases the gain of feature-selective processing of attended items. By contrast, feature-based manipulations of attention, in which specific characteristics of an item such as its colour or motion are cued for selective report, typically result in a sharpening of neural selectivity 20 , 21 . To date, it remains unknown whether the limits of temporal attention in the AB are associated with changes in neural tuning to targets, distractors, or both classes of items. The neural response in human primary visual cortex 6 and macaque lateral intraparietal area 22 to the second target is reduced overall on AB trials compared with non-AB trials, while subtraction-based EEG designs have shown that a late-stage component of the ERP (the N400) is reduced 200–400 ms after target presentation 8 . Critically, however, these measures cannot determine how the AB affects the neural representation of visual information, which could conceivably reflect a reduction in gain, an increase in tuning sharpness, or both.

A second, unresolved theoretical question concerns the source of the AB. Existing theories have often attributed the AB to either extended processing of the first target, or to inadvertent distractor processing. In the first class of theories, it is assumed that all items generate representations in early visual areas, but that the system inhibits items after T1 detection to avoid contamination by distractors 23 , 24 , 25 , 26 . On other accounts (so-called ‘distractor-based’ theories), the AB is assumed to reflect a cost associated with switching between target and distractor processing 27 . Finally, a third class of theories argues that the representation of the second target can become merged with either the first target or the distractors 28 , 29 . This class of theories is motivated by the finding that the perceived order of targets is often reversed (i.e., T2 is reported as appearing before T1).

Our RSVP task consists of a stream of randomly oriented Gabor gratings, with two higher spatial-frequency targets set amongst lower-spatial-frequency distractors (Fig.  1a and Supplementary Movie  1 ). At the end of the stream, participants are asked to reproduce the orientations of the two targets (Fig.  1b ). Critically, the orientation of each item in the stream is uncorrelated with the orientation of all other items (Fig.  1c ), thus permitting the use of linear regression analyses to separately extract the influence of each item in the stream on neural activity measured by EEG, and on behavioural reports of the orientations of the two targets. These aspects of the experimental design allowed us to quantify the influence of both targets and distractors on participants’ perceptual reports and on their associated neural representations.

figure 1

a An illustration of a typical trial in the RSVP task, which consisted of 20 sequentially presented Gabor patches at fixation. Each of the twenty items within a single RSVP stream was presented for 40 ms, with an 80 ms blank interval between items (120 ms inter-stimulus interval), yielding an 8.33 Hz presentation rate. The number of items (Lag) between the first (T1) and second (T2) targets was varied to measure the temporal duration of the AB. At the end of each RSVP stream, participants reproduced the orientations of T1 and T2 (higher spatial-frequency gratings) in the order in which they were presented by adjusting an on-screen cursor at the end of the trial. They were asked to determine the orientations as accurately as possible and were not given any time restriction to do this. Visual feedback was provided following the response. b A schematic of the feedback screen for responses. c The correlation values between orientations of the RSVP items over trials in Experiment 1. As Gabor orientations were randomly drawn (without replacement) on each trial, across all trials the orientation of any given item in the stream was uncorrelated with the orientation of any other item. This permitted the use of regression-based approaches to isolate the behavioural and neural processing of individual items independently of surrounding items within the stream. The correlations were calculated for each participant and are displayed as averaged across participants.

To preview the results, the behavioural target task replicated the hallmarks of the AB effect: the orientation of T1 was reported with a relatively high degree of accuracy, whereas orientation judgements for T2 were degraded when T2 appeared 200–400 ms after T1. Forward encoding analyses of EEG activity showed that targets evoked greater orientation-selective information than distractors when T2 was accurately reported (i.e., in non-AB trials), and that orientation information evoked by T2 was suppressed, relative to the distractors, when T2 was missed (i.e., in AB trials). Critical to our first question of whether focused attention influences the gain or precision of feature-specific representations, only the gain of the encoded EEG response was affected by T2 response accuracy.

With respect to our second question—whether accuracy in registering the second target is linked to the processing of T1 or to the intervening distractors—the evidence was in favour of T1-based theories of the AB. We found no evidence to suggest that neural representations of the distractors are affected by the AB. Finally, we describe an unexpected observation—one not predicted by any theory of the AB—namely, a significant interaction between the specific features of T1 and T2, implying a previously unknown long-range temporal integration of target representations within rapid sequential visual streams.

Experiment 1—behavioural hallmarks of the AB

Participants’ ( N  = 22) response errors (i.e., the difference between the presented and reported orientation for each target) were centred around 0°, verifying that they were able to perform the task as instructed. Figure  2a captures the temporal dynamics of the AB, such that accuracy was affected by target position (T1 or T2) and Lag. Specifically, at Lag 1 accuracy for both T1 and T2 was degraded relative to accuracy at the other lags (2, 3, 5 and 7). Moreover, at Lags 2 and 3, T1 accuracy was high whereas T2 accuracy was relatively poor. This was largely due to an increase in the baseline guessing rates (where errors occurred evenly across all orientations). Finally, at longer temporal separations (Lags 5 and 7), target accuracy was similar for both items.

Figure 2

a The distribution of response errors (difference between presented and reported orientation) across participants ( N  = 22) for the first (T1, blue lines) and second (T2, red lines) target for each Lag condition. Lines show the fitted four-parameter Gaussian functions. b Quantified behavioural responses for the four parameters of the fitted Gaussian function (see Supplementary Fig.  1 ) for each participant. Gain shows the amplitude, width shows the standard deviation of the function, centre orientation is the mean (which should be centred around 0° for unbiased estimates), and baseline is a constant parameter accounting for non-orientation-selective responses, which indicates guessing. Asterisks indicate Bonferroni-corrected t -tests showing significant differences at p  < 0.05. c Regression results for the influence of distractors and targets on participants’ responses. Higher regression weights indicate that a given item’s orientation was more influential in determining the reported orientation. The dotted vertical lines indicate the position of the other target (colour matched). Consider, for example, the panel depicting Lag 2 results. For T1 report, T2 occurred at item plus 2, as indicated by the dotted blue line, whereas for T2 report, T1 occurred at item minus 2, as indicated by the dotted red line. Across all panels, error bars indicate ± 1 standard error of the mean.

Experiment 1—modelling the AB using behavioural data

We fitted Gaussian functions to each individual’s data to quantify how the AB affected target perception (Fig.  2b ; see Methods and Supplementary Fig.  1 ). The accuracy reduction for T2 at Lags 2 and 3 was primarily linked to a reduction in gain. A 2 (Target; T1,T2) × 5 (Lag; 1,2,3,5,7) within-subjects ANOVA showed the gain parameter was affected by Target ( F (1,21) = 10.00, p  = 0.005, η p 2  = 0.32) and Lag ( F (4,84) = 11.66, p  < 0.0001, η p 2  = 0.36), and the interaction between these factors ( F (4,84) = 7.10, p  < 0.0001, η p 2  = 0.25). Critically for our first theoretical question, the spread (width) of orientation errors was unaffected by the factors of Target ( F (1,21) = 0.10, p  = 0.76, η p 2  = 0.005) or Lag ( F (4,84) = 0.55, p  = 0.70, η p 2  = 0.03), or by the interaction between these factors ( F (4,84) = 0.19, p  = 0.94, η p 2  = 0.01). The baseline parameter, which reflects guessing of random orientations, was also significantly affected by the factors of Target ( F (1,21) = 12.72, p  = 0.002, η p 2  = 0.38) and Lag ( F (4,84) = 4.82, p  = 0.002, η p 2  = 0.19), and by the interaction between them ( F (4,84) = 5.04, p  = 0.001, η p 2  = 0.19). These same effects were also evident when the data were not normalised (Supplementary Fig.  2 ), and with a wide range of parameters to specify the orientation errors (Supplementary Fig.  3 ).
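The four-parameter Gaussian decomposition described above (gain, width, centre, baseline) can be sketched as follows. The data here are simulated for illustration: a mixture of orientation-tuned reports and uniform random guesses, standing in for one participant's error distribution.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss4(x, gain, width, centre, baseline):
    """Four-parameter Gaussian: amplitude (gain), SD (width), mean (centre),
    plus a flat baseline capturing non-orientation-selective guessing."""
    return baseline + gain * np.exp(-0.5 * ((x - centre) / width) ** 2)

# Hypothetical data: report errors in degrees, mixing tuned responses
# (Gaussian around 0 deg) with a smaller proportion of random guesses.
rng = np.random.default_rng(1)
errors = np.concatenate([rng.normal(0.0, 12.0, 800),       # tuned reports
                         rng.uniform(-90.0, 90.0, 200)])   # random guesses
edges = np.arange(-90, 91, 10)
counts, _ = np.histogram(errors, edges, density=True)
centres = edges[:-1] + 5.0

p0 = [counts.max(), 15.0, 0.0, counts.min()]
(gain, width, centre, baseline), _ = curve_fit(gauss4, centres, counts, p0=p0)
```

A drop in the fitted gain with unchanged width, and a raised baseline, is the pattern the paper reports for T2 inside the AB window: fewer precise reports and more guessing, rather than noisier reports.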

Taken together, these results are consistent with a previous AB study using similar analysis methods 30 . They also lend weight to the global workspace theory of consciousness in the AB 31 , which argues that participants either see the target and have full awareness of it (allowing them to respond precisely), or they have no awareness (and so simply guess randomly). By contrast, the results are inconsistent with the opposing view that the AB involves a noisier (i.e., weaker precision) signal for the target that is inaccurately reported 32 .

Experiment 1—targets, not distractors, influence orientation judgements

To evaluate the influence of distractors on participants’ reports, we aligned the orientations of the items relative to target position within the RSVP stream (−4 to +7 items) and constructed a regression matrix to predict the behavioural response for each target. If the orientation of an item is influential in determining the reported orientation, the regression weight will be relatively high (Fig.  2c ). As expected, for all lags, each reported target orientation was influenced principally by its own orientation. The one exception was the item at Lag 1, where the reported orientation of T1 was as strongly influenced by the orientation of T2 as by the orientation of T1. This observation is in line with numerous studies which have suggested that temporal order information can be lost for consecutive targets 29 , 33 . This phenomenon, also known as Lag 1 switching, where the perceived order of the targets is reversed, explains why the accuracy of orientation judgements on both T1 and T2 was reduced at Lag 1 (see also Supplementary Fig.  4 ). By contrast, for items at Lags 2 and 3, orientation judgements on T1 were only marginally influenced by the orientation of T2 (i.e., for items at positions +2 and +3, respectively, in the RSVP stream). However, at these same lags (where the AB was maximal) T2 reports were significantly influenced by T1 orientation (i.e., for items at positions −2 and −3, respectively). Importantly, there was no reliable influence of distractors on reported target orientation at any lag, suggesting distractors played little or no role in target orientation errors.
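The regression logic described above can be sketched with simulated data. The generative weights below are invented for illustration (a report driven mostly by the target itself with a small pull from the other target); orientation circularity is ignored for simplicity.

```python
import numpy as np

rng = np.random.default_rng(2)
n_trials, n_items = 600, 12   # items aligned -4..+7 around the target

# Hypothetical generative model: the reported orientation mostly reflects
# the target's own orientation (index 4 = position 0 here), with a small
# contribution from the other target two items later (index 6).
X = rng.uniform(-90.0, 90.0, (n_trials, n_items))
true_w = np.zeros(n_items)
true_w[4] = 0.9     # the reported target itself
true_w[6] = 0.1     # e.g. the other target at position +2
y = X @ true_w + rng.normal(0.0, 5.0, n_trials)

# Least-squares regression recovers which items drove the report:
# high weights flag influential items, near-zero weights flag distractors.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Because the item orientations are mutually uncorrelated, the recovered weights cleanly separate each item's influence, mirroring the logic of Fig. 2c where only the targets (and not the distractors) carry appreciable weight.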

Experiment 1—long-range integration of target orientations

One account of the AB 28 , 29 has suggested that successive targets presented at short lags are integrated into a single episodic trace, which accounts for Lag 1 switching. With the present task, we can directly quantify how targets are integrated by looking for systematic biases in the reported orientation of a given target based on its orientation difference with respect to the other target. Figure  3a shows orientation judgement errors as a function of the difference between the two target orientations. While the average orientation error is centred on 0°, the perceived orientation of either target (T1 or T2) was significantly biased toward the orientation of the other target within the RSVP stream at early Lags. Furthermore, these biases were orientation-tuned, such that the largest bias occurred when targets differed by approximately 45°, somewhat analogous to serial dependency effects 34 , 35 . This profile of biases suggests response integration, rather than replacement, as the latter would predict that only the orientation of T2 should drive the reported orientation of T1. Instead, and consistent with our linear regression analysis (see Fig.  2c ), the bias reflected the difference between target orientations, which supports the idea that the critical features of the two targets are assimilated over time 28 , 29 .

Figure 3

a Orientation error (the difference between presented and reported orientation) plotted against the difference between T1 (blue lines) and T2 (red lines) orientations (divided into 30° bins, for clarity of presentation). Positive values on the X -axis indicate that a given target was rotated clockwise relative to the other target. For instance, when examining T1, a positive value indicates that T2 was rotated clockwise relative to T1, whereas a negative value indicates that T2 was rotated anti-clockwise relative to T1. For T1, the plotted values reflect the calculation of T1 minus T2, and vice versa for T2, to ensure values were equivalent for the comparison of interest. The same convention applies to orientation error, shown on the Y -axis. The fitted line is the first derivative of a Gaussian (D1) function showing the orientation-tuned gain and width of the response. b Bias magnitude was quantified across participants by fitting the D1 function to each participant’s (non-binned) data, with the gain showing bias magnitude. A positive gain on the Y -axis indicates that the perceived orientation was biased toward, rather than away from, the other target. c Bias magnitude as a function of the orientation difference between each target and the other items (targets and distractors). For both T1 and T2, the difference between the target and the item was calculated in the same manner as in ( a ). We fit the D1 function to find the magnitude of bias induced by each item for each participant. The dotted coloured lines indicate the temporal position of the other target (T1 = blue, T2 = red). For all panels, the asterisks indicate, for each target (colour matched), at which lags the bias was significantly greater than zero (Bonferroni-corrected one-sample t -tests, p  < 0.05). Across all panels, error bars indicate ± 1 standard error of the mean.

We fit first derivative of Gaussian (D1) functions 36 , 37 , 38 to quantify the amount of orientation-selective bias for both targets at each Lag for each participant. A 2 (Target; T1, T2) × 5 (Lag; 1,2,3,5,7) within-subjects ANOVA revealed significant main effects of Target ( F (1,21) = 5.04, p  = 0.04, η p 2  = 0.19) and Lag ( F (4,84) = 6.54, p  < 0.0001, η p 2  = 0.24), and a significant interaction ( F (4,84) = 6.14, p  < 0.0001, η p 2  = 0.27). For T1 reporting, the bias was significantly greater than chance at all intervals, whereas for T2, there was a significant bias at Lags 2 and 3 only (Bonferroni-corrected one-sample t -test, all ps  < 0.05). As might be expected 28 , 29 , the ‘attraction’ bias in target reports was strongest when the two targets were presented with no intervening distractors between them (i.e., at Lag 1). An entirely unexpected finding, however, is that there was an equally strong attraction bias between targets presented at Lags 2 and 3 (see Fig.  3b ), even though participants were not explicitly aware of the orientation of T2 on AB trials.
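The D1 (first derivative of a Gaussian) fit used to quantify the attraction bias can be sketched as below, on simulated data with an invented bias profile. A useful property of the D1 function is that its peak sits at x = width, so a fitted width near 45° reproduces the "largest bias at ~45° difference" pattern reported above.

```python
import numpy as np
from scipy.optimize import curve_fit

def d1_gauss(x, gain, width):
    """First derivative of a Gaussian: an odd-symmetric, orientation-tuned
    bias curve. Positive gain = attraction toward the other target."""
    return gain * x * np.exp(-0.5 * (x / width) ** 2)

# Hypothetical data: report error as a function of the signed orientation
# difference between the two targets, with an injected attraction bias.
rng = np.random.default_rng(3)
diff = rng.uniform(-90.0, 90.0, 500)
err = d1_gauss(diff, 0.10, 40.0) + rng.normal(0.0, 2.0, 500)

(gain, width), _ = curve_fit(d1_gauss, diff, err, p0=[0.05, 30.0])
# The fitted gain quantifies bias magnitude; the peak bias occurs at
# a target-difference of x = width degrees.
```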

Experiment 1—biased perception of targets by immediately following distractors

Previous work suggests that distractor processing can significantly interfere with target processing 39 , 40 , 41 , particularly for the immediate post-target item, which can be integrated into the target representation 28 , 29 , 33 . To determine whether this was the case in our data, we repeated the previous analysis but used the difference in orientation between the target and each of the other items in the RSVP stream (Fig.  3c ). For most lags, the reported target orientation was significantly attracted toward the immediately following distractor, but was not reliably influenced by any other distractor. A 2 (Target; T1, T2) × 5 (Lag; 1,2,3,5,7) × 5 (Item position; −1,1,2,3,4) within-subjects ANOVA confirmed a significant three-way interaction between the factors ( F (16,336) = 4.11, p  < 0.0001, η p 2  = 0.16). At Lag 1, there was no influence of distractors on reported orientations for either T1 or T2. Taken with the previous result, this suggests that the representation of a given target is influenced by both the other target and the post-target item. The results suggest that when the visual system detects a target, it automatically integrates features from the immediately subsequent item. This is consistent with previous studies that have highlighted the importance of masking by the item immediately following the target in eliciting the AB 42 .

Experiment 2—electrophysiological recording of the AB

We next characterised the neural activity elicited by individual RSVP items, and determined how this was affected by the AB. In Experiment 2, a group of 23 new participants undertook the RSVP task introduced in Experiment 1 while neural activity was concurrently measured using EEG. The method was identical in all respects, except that we now included targets only at Lags 3 and 7 (i.e., a single target inside and outside the AB, respectively) to increase the within-subject power for the EEG analyses.

Experiment 2—behavioural results

The behavioural results replicated, in all important respects, those found in Experiment 1. As shown in Fig.  4a , participants performed well overall, and their orientation judgements for T1 and T2 were centred on the presented orientations. As in Experiment 1, we fit Gaussian functions to quantify the results (Fig.  4b ). For the gain parameter, a 2 (Target; T1, T2) × 2 (Lag; 3, 7) within-subjects ANOVA revealed significant main effects of Target ( F (1,22) = 11.63, p  = 0.003, η p 2  = 0.35) and Lag ( F (1,22) = 18.70, p  < 0.0001, η p 2  = 0.46), and a significant interaction ( F (1,22) = 40.19, p  < 0.0001, η p 2  = 0.65). Likewise, for the baseline parameter there were significant effects of Target ( F (1,22) = 8.96, p  = 0.007, η p 2  = 0.30) and Lag ( F (1,22) = 12.21, p  = 0.002, η p 2  = 0.36), and a significant interaction ( F (1,22) = 7.91, p  = 0.01, η p 2  = 0.26). By contrast, there were no significant main effects and no interaction for the width parameter (Target: F (1,22) = 1.19, p  = 0.29, η p 2  = 0.05; Lag: F (1,22) = 3.90, p  = 0.06, η p 2  = 0.15; interaction: F (1,22) = 0.14, p  = 0.71, η p 2  = 0.006).

Figure 4

a Aggregate response accuracy across participants (difference between presented and reported orientations) for T1 (blue lines) and T2 (red lines), shown separately for Lag 3 and Lag 7 trials. Lines are fitted Gaussian functions. b Quantified behavioural responses for the four parameters of the fitted Gaussian functions (gain, width, centre orientation and baseline) for each participant’s data. Asterisks indicate Bonferroni-corrected t -tests significant at p  < 0.05. c Regression results for the influence of distractors and targets on participants’ responses. The dotted vertical lines indicate the position of the other target (colour matched). Consider, for example, the panel depicting Lag 3 results. For T1 report, T2 occurred at item plus 3, as indicated by the dotted blue line, whereas for T2 report, T1 occurred at item minus 3, as indicated by the dotted red line. Across all panels, error bars indicate ± 1 standard error of the mean.

Experiment 2—orientation selectivity of RSVP items

We next applied forward modelling to the EEG data recorded during the task to quantify orientation information contained within multivariate patterns of neural activity. Because the orientations of successive items were uncorrelated, we were able to quantify orientation selectivity for each grating without contamination from adjacent items. Forward encoding uses a linear regression-based approach to find multivariate patterns of EEG activity that are selective for features of interest—in this case orientation. As no previous study has used forward encoding in conjunction with rapid visual presentations, we first verified that orientation selectivity for each of the 20 RSVP items could be extracted separately using this approach, and at what time point any such response was evident. To do this, we constructed 20 encoding models, one for each of the item positions within the 20-item RSVP stream, based on the orientations presented for that item across trials.
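The forward encoding approach can be sketched with a toy simulation. The raised-cosine channel basis, channel count, sensor count and noise level below are all illustrative assumptions, not the paper's exact pipeline: the point is only the two-stage logic of estimating a channel-to-sensor weight matrix on training data and inverting it on held-out data.

```python
import numpy as np

rng = np.random.default_rng(4)
n_trials, n_sensors, n_chan = 400, 32, 6

# Illustrative orientation-channel basis: raised cosines tiling 0-180 deg.
phi = np.linspace(0.0, 180.0, n_chan, endpoint=False)

def channel_resp(theta):
    # Doubling the angle makes the tuning 180-deg periodic (orientation space);
    # half-wave rectification and the exponent narrow the tuning curves.
    d = np.deg2rad(2.0 * (theta[:, None] - phi[None, :]))
    return np.maximum(np.cos(d), 0.0) ** 5

# Simulated experiment: random grating orientations drive channel responses,
# which project to sensors through an unknown linear map plus noise.
ori = rng.uniform(0.0, 180.0, n_trials)
C = channel_resp(ori)                                  # trials x channels
W = rng.normal(0.0, 1.0, (n_chan, n_sensors))          # true channel->sensor map
eeg = C @ W + rng.normal(0.0, 0.3, (n_trials, n_sensors))

# Forward encoding: fit W on training trials, invert on held-out trials to
# recover channel responses, then read out the best-matching orientation.
train, test = slice(0, 300), slice(300, 400)
W_hat = np.linalg.lstsq(C[train], eeg[train], rcond=None)[0]
C_hat = eeg[test] @ np.linalg.pinv(W_hat)
decoded = phi[np.argmax(C_hat, axis=1)]
```

In the paper's analysis an analogous reconstruction, repeated at each timepoint, yields the feature-selectivity time courses shown in Fig. 5.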

As shown in Fig.  5 , the forward encoding revealed robust and reliable feature selectivity derived from patterns of EEG activity for each of the gratings presented during the RSVP. Each item’s orientation was successfully decoded over a time window that extended from 74 to 398 ms after the item was presented. Examination of the neural responses to each of the 20 items within the RSVP stream (Fig.  5c ) shows that feature selectivity was evident as a series of regularly spaced, short-lived impulse responses, each with a delay of around 50 ms from grating onset and lasting approximately 300 ms. To quantify these observations, we fit Gaussian functions to the forward encoding results for each item separately for each participant and at each time point. There was significant feature selectivity (given by the gain of the Gaussian) for each item immediately after it was presented but not before (Fig.  5d ). These representations were temporally overlapping, such that multiple orientation-selective responses (~3) were detectable at the same time. Taken together, the forward encoding analyses verify that it is possible to reliably recover the presented orientation of every RSVP item from the multivariate pattern of neural activity recorded using EEG.

Figure 5

a Forward encoding results aligned to the time of item onset and the presented orientation across all participants. All representations have been re-aligned so that the presented orientation is equivalent to 0°. b Forward encoding results averaged over 50 ms bins (shown by the corresponding colour in ( a )) following each item. Feature selectivity peaks around 50–120 ms after the onset of each item and persists for ~200 ms. c Forward encoding results for each item in the RSVP stream. Vertical black lines indicate the presentation time of each of the 20 items within the RSVP stream. The dotted horizontal line indicates the presented orientation. The colour scale is the same as in panel ( a ). d Gaussian distributions were fitted to each participant’s data for each item in the stream, with the gain showing feature selectivity. The red horizontal line segments underneath each trace indicate timepoints at which feature selectivity was significantly different from zero (i.e., where feature selectivity was greater than would be expected by chance; two-tailed, sign-flipping cluster permutation, alpha p  < 0.05, cluster alpha p  < 0.05, N permutations = 20,000), which occurs immediately following item presentation. Across all panels, shading indicates ± 1 standard error of the mean across participants. a.u. = arbitrary units.

Experiment 2—reduced feature-selective information for T2 during the AB

We next examined how neural representations of the target items were affected by the AB. To increase the signal-to-noise ratio for training the encoding model, we aligned the EEG data to the presentation time of each item in the RSVP stream and applied the same forward encoding procedure. This meant that the model was trained and tested across 12,000 presentations (600 trials by 20 RSVP items; see Fig.  6 ). To determine the effect of the AB on orientation selectivity, we separated the forward encoding results by target (T1, T2) and T2 accuracy (correct, incorrect). For the purposes of the analyses, trials were scored as correct if the reported orientation was within ±30 degrees of the presented orientation, a criterion which yielded roughly equal numbers of correct and incorrect trials at Lag 3. In line with the AB literature, for all the EEG analyses we only included trials in which participants correctly identified T1. Applying these criteria yielded the classic AB effect (Supplementary Fig.  5 ). A 2 (Lag; 3, 7) × 2 (Target; T1, T2) within-subjects ANOVA applied to these scores revealed significant main effects of Lag ( F (1,22) = 19.05, p  < 0.0001, η p 2  = 0.46) and Target ( F (1,22) = 18.00, p  < 0.0001, η p 2  = 0.45), and a significant interaction ( F (1,22) = 31.91, p  < 0.0001, η p 2  = 0.59). Follow-up t-tests showed that Lag 3 accuracy was significantly lower than Lag 7 accuracy for T2 items ( t (22) = 5.20, Bonferroni p  = 0.0001, d  = 0.44) but not for T1 items ( t (22) = 2.11, Bonferroni p  = 0.09, d  = 0.44). In addition, T2 accuracy was significantly lower than T1 accuracy at Lag 3 ( t (22) = 5.94, Bonferroni p  < 0.0001, d  = 1.08), but there was no such difference at Lag 7 ( t (22) = 1.20, Bonferroni p  = 0.48, d  = 0.25).
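The ±30° scoring criterion requires care because orientation lives in a 180°-circular space: a 175° report of a 5° grating is only 10° off, not 170°. A minimal sketch of this wrap-around scoring (helper names are illustrative):

```python
import numpy as np

def orientation_error(reported, presented):
    """Signed report error in a 180-deg circular orientation space,
    mapped into the range (-90, 90]."""
    d = (reported - presented) % 180.0
    return np.where(d > 90.0, d - 180.0, d)

def is_correct(reported, presented, criterion=30.0):
    # A trial counts as correct if the report lands within
    # +/- criterion degrees of the presented orientation.
    return np.abs(orientation_error(reported, presented)) <= criterion
```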

Figure 6

a Time course of measured feature selectivity for T1 and T2, given by the gain of the fitted Gaussian parameter. Trials were scored as correct if the participant’s response was within 30° of the presented orientation. Only trials in which participants responded accurately to T1 were included in the analysis. The thick black horizontal line in the upper right panel indicates a period of significant difference between Incorrect (blue lines) and Correct (red lines) trials (two-tailed sign-flipping cluster permutation, alpha p  < 0.05, cluster alpha p  < 0.05, N permutations = 20,000). Note that the difference in magnitude from the encoding results shown in Fig.  5 is due to the increased number of training presentations used in this analysis (12,000 vs 600). b Forward encoding results were averaged across the significant timepoints for T2 Lag 3 shown in ( a ) (upper right panel) to reconstruct the full representation of orientation. Reliable changes in the gain of orientation representations for T2 were present at Lag 3 (upper panel) but not at Lag 7 (lower panel). There was no difference in the width for either Lag. Shading indicates ± 1 standard error of the mean.

We again fitted Gaussians to each time point to quantify the amount of feature-selective information evoked by the targets. For both T1 and T2, there was significant feature-selective activity shortly after each item appeared (Fig.  6a ). For Lags 3 and 7, there was no difference between correct and incorrect trials for the T1 representation. For T2, however, incorrect trials resulted in a significantly decreased feature-selective response (cluster p  = 0.02) relative to correct trials shortly after each item appeared (100–150 ms) at Lag 3, although the response was not completely suppressed. There were no significant differences in the orientation-selective response between correct and incorrect trials for T2 at Lag 7, suggesting the suppression is caused by the AB rather than general target detection. This was expected because the AB typically lasts less than 500 ms, and is consistent with the current behavioural results showing an AB at Lag 3 but not at Lag 7. Performing the same analysis on the other parameters of the Gaussian (width, centre, baseline) showed no effect of the AB (Supplementary Fig.  6 ).
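The sign-flipping cluster permutation procedure used for these time-course comparisons can be sketched as below. This is a deliberately simplified version (for instance, positive and negative clusters are pooled rather than handled separately, and permutation counts are reduced); full implementations in packages such as MNE-Python or FieldTrip differ in detail.

```python
import numpy as np
from scipy.stats import t as tdist

def cluster_sign_flip_test(data, n_perm=2000, alpha=0.05, seed=0):
    """Simplified one-sample cluster-based sign-flipping permutation test.

    data: (n_subjects, n_times) array of condition differences.
    Returns the observed maximum cluster mass and its permutation p-value.
    """
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    crit = tdist.ppf(1 - alpha / 2, n - 1)   # cluster-forming t threshold

    def tvals(x):
        return x.mean(0) / (x.std(0, ddof=1) / np.sqrt(n))

    def max_cluster_mass(ts):
        # Sum |t| over runs of consecutive supra-threshold timepoints.
        best = run = 0.0
        for v in np.abs(ts):
            run = run + v if v > crit else 0.0
            best = max(best, run)
        return best

    obs = max_cluster_mass(tvals(data))
    # Under H0 (zero mean difference), each subject's sign is exchangeable.
    flips = rng.choice([-1.0, 1.0], size=(n_perm, n, 1))
    null = np.array([max_cluster_mass(tvals(data * f)) for f in flips])
    return obs, (null >= obs).mean()

# Simulated example: 20 subjects x 60 timepoints, with an effect of one
# standard deviation injected at timepoints 25-35.
rng = np.random.default_rng(1)
demo = rng.normal(0.0, 1.0, (20, 60))
demo[:, 25:35] += 1.0
mass, p = cluster_sign_flip_test(demo)
```

Scoring clusters by summed t-values controls the family-wise error rate across timepoints while remaining sensitive to temporally extended effects, which is why it suits the 100–150 ms windows of difference reported here.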

To ensure we did not miss any small but consistent effects, we averaged the forward encoding results (Orientation × Time) over the early (100–150 ms) timepoints to increase the signal-to-noise ratio, and recovered the orientation tuning curve (Fig.  6b ). Fitting Gaussians to these values confirmed that the AB was associated with a change in the gain of feature selectivity for T2 at Lag 3, such that correct trials showed significantly greater gain than incorrect trials ( t (22) = 3.12, p  = 0.01, d  = 0.65; Fig.  6b , upper panel). By contrast, the width of the representation was again unaffected by the AB ( t (22) = 1.66, p  = 0.11, d  = 0.35) for the same item. For Lag 7 items, neither the gain ( t (22) = 0.12, p  = 0.90, d  = 0.03; Fig.  6b , lower panel) nor the width ( t (22) = 0.04, p  = 0.96, d  = 0.01) of the neural representation of T2 was affected by behavioural performance (correct vs incorrect trials).

The reduction in T2 selectivity for incorrect trials at Lag 3 was not driven by an arbitrary split of trials into correct and incorrect categories. To verify this, we sorted the evoked T2 forward encoding results by the amount of orientation error (in 15° error bins to allow sufficient signal-to-noise ratios for fitting). There was significantly greater feature selectivity when the orientation error was small, and this selectivity gradually decreased with larger errors (one-way within-subjects ANOVA, F (1,22) = 2.76, p  = 0.02, η p 2  = 0.11; Fig.  7 ). Note that this finding is inconsistent with a graded model of the AB, and instead supports the idea that response variability during the AB is associated with both a decrease in feature-selective gain and an increase in the rate of guessing. This finding is consistent with the behavioural results, which suggest a discrete model of the AB. Overall, these results indicate that the AB is associated with a reduction in the gain, but not the width, of feature-selective information for the second target item (T2), and that this effect occurs soon after the target appears within the RSVP stream.

Figure 7

Forward encoding results were averaged across early timepoints (100–150 ms) and binned by the absolute difference between the presented and reported orientations (in 15° increments). Each bin is labelled by its starting value (e.g., 0° incorporates errors from 0° to 15°). Gaussians were fitted to quantify selectivity, with the gain parameter shown here. Feature selectivity was highest when participants reported the orientation to within 30° of the presented orientation, and declined significantly with larger reporting errors. Error bars indicate ± 1 standard error of the mean.

Experiment 2—the AB affects targets, but not distractors

We next examined the neural representations of both targets and distractors to test the different predictions made by T1-based 25 , 26 versus distractor-based 27 accounts of the AB. T1-based accounts argue that the second-target deficit is caused by extended processing of the first target, whereas distractor-based accounts argue that deleterious processing of the distractors, mainly those between T1 and T2, causes the second target to be missed. The theories thus make distinct predictions about the neural representation of target and distractor items. According to T1-based accounts, target representations should be enhanced relative to those of distractors, and missed T2 items on AB trials should be more poorly represented than correctly reported T2 items. By contrast, distractor-based accounts predict that neural representations of distractor items should be stronger on AB trials than on non-AB trials and weaker following T1 presentation.

As before, we averaged the forward encoding modelling representations (Orientation × Time) across an early time window (100 to 150 ms), and fit Gaussians to each participant’s data to quantify feature selectivity (Fig.  8a ). For correct trials (i.e., orientation responses to T2 were within 30° of the presented orientation), the two targets resulted in significantly higher feature selectivity (gain) than the immediately adjacent distractors (−2, −1, +1 and +2 items) for both T1 and T2 representations (all ps  < 0.04). On incorrect trials, feature selectivity for T1 was not significantly greater than selectivity for the surrounding distractors ( t (22) = 0.15, p  = 0.88, d  = 0.03), even though we included only trials in which T1 was correctly reported. Most interestingly, on incorrect trials the representations of T2 items were significantly lower than those of the immediately adjacent distractors ( t (22) = 2.09, p  = 0.04, d  = 0.44), suggesting that the featural information carried by T2 was suppressed, while distractors were unaffected. To directly test the distractor model of the AB, we compared distractor representations before T1 with distractor representations during the AB (i.e., between T1 and T2). This account predicts that distractors presented during the AB should elicit a stronger neural representation, as they are likely to be incorrectly selected as targets. Instead, we found that distractors were represented similarly before and during the AB for both correct trials ( t (22) = 0.85, p  = 0.40, d  = 0.18) and incorrect trials ( t (22) = 1.83, p  = 0.08, d  = 0.38). Taken together, these results suggest that for trials where participants accurately report target orientation, the neural representations of targets are boosted relative to those of distractors. By contrast, when the second target is missed, as occurs during the AB, there is a significant suppression of the target’s featural information.

Figure 8

a Neural feature selectivity (gain of the Gaussian) of target and distractor representations for Lag 3. Blue lines show incorrect trials and red lines show correct trials. Gaussians were fit to the averaged neural representation from 100 to 150 ms. To aid comparison, the grey bar indicates the average distractor representation (± 1 standard error of the mean). Note that all distractors and targets have gain values significantly above 0 arbitrary units (a.u.), indicating robust feature selectivity. Error bars indicate ± 1 standard error of the mean. b Headmaps showing univariate orientation selectivity over time, plotted separately for targets and distractors. Plus symbols indicate positive cluster-permuted differences between targets and distractors (two-tailed cluster permutation, alpha p  < 0.05, cluster alpha p  < 0.025, N permutations = 1500).

Experiment 2—localisation of feature selectivity for targets and distractors

In a final step, we performed a univariate sensor-level analysis for feature selectivity 10 to find the topographies associated with target and distractor processing. To do this, we trained a simplified model of feature selectivity on each type of item (targets and distractors) separately for each EEG sensor. Orientation information for both targets and distractors was evident most strongly over occipital and parietal areas, and target items generated significantly greater selectivity over these areas than distractors (Fig.  8b ). These findings suggest that while target and distractor items are processed in overlapping brain regions, targets generate significantly greater orientation-selective information than distractors.

We developed an RSVP paradigm to determine the neural and behavioural bases of the limits of temporal attention. The behavioural results replicated the hallmark of the AB: response accuracy was significantly reduced when T2 was presented within 200–400 ms of T1. We discovered that target representations influenced one another, such that the reported orientation of one target was biased toward the orientation of the other. Results from Experiment 2 revealed that successfully reporting T2 depended on a boost to its neural representation relative to other items in the RSVP stream, whereas missing T2 corresponded to a suppressed neural response relative to the distractors. Notably, there was no evidence for suppression of neural representations of the distractors, suggesting the AB is primarily driven by processing competition between target items. This observation supports theories that have attributed the second-target deficit to first-target processing 4 , 23 , 43 , but is inconsistent with theories that attribute the AB to inadvertent processing of distractor items 24 , 27 .

An important but unexpected result is that target reports were influenced by one another despite being separated by several hundred milliseconds and multiple distractor items. One influential theory argues that the AB is caused by temporal integration of the target with the immediate post-target distractor 28 , 29 . Our RSVP task found evidence for this but also showed that target representations appear to be integrated with each other even when they are separated by multiple distractor items within the stream. This finding is not explicitly predicted by any existing account of the AB. The largest bias was for Lag 1 trials, in which the two targets appear sequentially, a result that is consistent with Lag 1 switching 28 , 29 , 33 . The orientation of the immediate post-target distractor also significantly biased the perceived target orientation, whereas the distractors that appeared between the targets did not bias perceptual judgements. Taken together, our findings across two experiments suggest that the detection of a target in an RSVP sequence starts a period of local integration which involuntarily captures the next item, whether it is a target or a distractor. This is followed by a more global integration of targets, possibly within working memory 4 .

Our first major aim was to determine how the AB affects target representations. The forward encoding modelling of the EEG data adds to previous results 30 by demonstrating that the gain in neural representations of Lag 3 items is significantly reduced in AB trials, compared with non-AB trials. Supporting the behavioural results, there was no effect on the width of EEG-derived feature selectivity during the AB. The neural results also go beyond the behavioural findings by showing that the gain of Lag 3 items is not only suppressed on AB trials, but boosted on non-AB trials compared with those of the distractors. Taken together, these results suggest that temporal attention operates in a similar manner to spatial attention 15 , 17 , 18 , 19 , but not to feature-based attention 20 , 21 , as the former has been found to affect the gain of neural responses whereas the latter tends to affect the sharpness of neural tuning.

The second major aim of our study was to resolve the persistent debate between T1- and distractor-based theories of the AB 4 , 23 , 24 , 25 , 26 , 27 , 43 , 44 . Behaviourally, we found scant evidence that distractors (apart from the immediately subsequent distractor) influence target perception. Consistent with T1-based accounts of the AB 4 , 25 , there were robust neural representations of distractors and no evidence that distractor representations were boosted following initial target detection, as would be predicted by distractor-based accounts. Furthermore, we found no evidence that post-T1 distractors were suppressed, as would be predicted by T1-based inhibition accounts of the AB 4 , 23 . Instead, consistent with T1-based accounts, the representations of both targets were boosted relative to those of the distractors. If the second target was missed, however—as occurs during the AB—then the representation of the second target was significantly suppressed relative to the distractors. Taken together, these results suggest that when the first target is processed rapidly, attention is efficiently redeployed to the second target, causing its representation to be boosted. By contrast, if the second target appears while processing of the first target is ongoing, the visual system actively suppresses the information to avoid the targets interfering with each other.

Suppression of the T2 representation occurred 100–150 ms after the target appeared, suggesting inhibition of the sensory information by ongoing processing of T1. This fits well with previous work showing that the AB is associated with a reduced late-stage response, as indicated by an ERP component associated with working memory consolidation 8 , 45 . Taken together with the current results, it appears that the AB involves an early suppression of sensory information associated with the T2 stimulus. The diminished strength of the T2 sensory signal is in turn expected to exert less influence on later stages of the information processing hierarchy, such as working memory. This could also explain why the T2 representation was only initially affected (100–150 ms), as only its early sensory response needs to be suppressed to prevent interference with T1 processing at a higher stage. These behavioural results may be consistent with sequential working memory consolidation of targets. We found that the precision of reporting T1 was unaffected by lag, even though during the AB often only one item is reported, whereas at longer lags two items are reported. During spatial working memory tasks, where multiple items are simultaneously presented, longer lags should impose a higher memory load and lead to lower precision 46 . Instead, the current results suggest that each target is consolidated into working memory before the store allows a second item to enter.

In summary, the current work adds to our understanding of the neural and behavioural basis of temporal attention. We were able to recover a neural signature for each item within an RSVP stream, something that has not been possible with conventional approaches to EEG and fMRI data. Our methodology indicated that while there is co-modulation of featural information carried by each of the targets, there is no evidence for distractor suppression in this RSVP task. We also document the existence of interactions among targets that are separated by several hundred milliseconds.

Our methodology provides a rich framework for exploring the neural bases of many psychological phenomena, including repetition blindness 47 and contingent attentional capture 48 . The current work was not designed to pinpoint the exact neural locus of the AB, but combining our approach with a technique like fMRI, which has better spatial resolution than EEG, could elucidate some of the key brain areas involved in the phenomenon. It has been suggested that feedback and feedforward processes modulate different aspects of the AB 49 . Future studies might also fruitfully combine our method with invasive recordings across multiple brain sites in animal models, to better understand the neuronal mechanisms underlying the AB effect.


In Experiment 1, 22 participants (13 females, 9 males; median age 22 years; range 19–33 years) were recruited from a paid participant pool and reimbursed at AUD$20/hr. In Experiment 2, 23 participants (14 females, 9 males; median age 23 years; range 19–33 years old) were recruited from the same pool and reimbursed at the same rate. Each person provided written informed consent prior to participation and had normal or corrected-to-normal vision. The study was approved by The University of Queensland Human Research Ethics Committee and was in accordance with the Declaration of Helsinki.

Experimental setup

Both experiments were conducted inside a dimly illuminated room. The items were displayed on a 22-inch LED monitor (resolution 1920 × 1080 pixels, refresh rate 100 Hz) using the PsychToolbox presentation software for MATLAB 50 , 51 . In Experiment 1, participants were seated at a distance of approximately 45 cm from the monitor. In Experiment 2, the same viewing distance was maintained using a chinrest to minimise head motion artefacts in the EEG. At a viewing distance of 45 cm, the monitor subtended 61.18° × 36.87° (one pixel = 2.4′ × 2.4′).
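The viewing-geometry figures above follow from the standard visual-angle relation, 2·atan((size/2)/distance). A minimal sketch (the physical panel dimensions below are assumed from a typical 22-inch 16:9 display; they are not stated in the text):

```python
import math

def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (degrees) subtended by an object of a given physical size
    viewed at a given distance."""
    return math.degrees(2 * math.atan((size_cm / 2) / distance_cm))

# Assumed panel size for a 22-inch 16:9 monitor: roughly 53.1 cm x 29.9 cm.
width_deg = visual_angle_deg(53.1, 45.0)   # close to the reported ~61 degrees
height_deg = visual_angle_deg(29.9, 45.0)  # close to the reported ~37 degrees
```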

A schematic of the task is shown in Fig.  1 . Supplementary Movie  1 shows two example trials. Each trial began with a central fixation point and the RSVP stream commenced after 300 ms. The stream consisted of 20 Gabors (0.71° standard deviation, ~5° diameter, 100% contrast, centred at fixation) on a mid-grey background. On each trial, the orientations of the 20 Gabors in the stream were drawn pseudo-randomly, without replacement, from integer values ranging from 0–179°. Both targets and distractors were drawn from the same random distribution, meaning there was no restriction on the relationship between targets (except that they could not be identical). Note that, because the target orientations were uncorrelated, the design controls for possible repetition blindness effects 52 , since the targets were equally likely to be similar in orientation as they were to be maximally dissimilar (i.e., orthogonal), and thus any potential orientation-specific effects would cancel out across trials.

Each item was presented for 40 ms and was separated from the next item by a blank interval of 80 ms, yielding an 8.33 Hz presentation rate. The participants’ task was to reproduce the orientations of the two high-spatial-frequency Gabors (targets; 2 c/°) while ignoring the items of low spatial frequency (distractors; 1 c/°). Between 4 and 8 distractors, varied pseudo-randomly on each trial, were presented before the first target (T1) to minimise the development of strong temporal expectations, which can reduce the AB 40 , 53 . The position of T2 relative to T1 defined the inter-target lag, with Lag 1 denoting consecutive targets (lags 1, 2, 3, 5 and 7 in Experiment 1; 3 and 7 in Experiment 2). There were 600 trials in each of the two experiments, distributed equally across the lag conditions (120 trials per lag in Experiment 1, 300 per lag in Experiment 2); fewer lags were included in Experiment 2 to increase the signal-to-noise ratio for the regression-based EEG analysis. In Experiment 2, we selected Lag 3 as the test condition for the AB because it yielded a significant reduction in T2 response accuracy compared with T1 in Experiment 1, and because it has been widely used in previous studies of the AB 24 , 39 , 40 , 54 , 55 , 56 , 57 .
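The trial structure described above can be sketched as follows (positions are 0-indexed, and the lag is treated as the positional offset of T2 from T1, so Lag 1 means consecutive targets):

```python
import random

def make_trial(lag, n_items=20, rng=None):
    """Sketch of one RSVP trial: 20 orientations (degrees) drawn pseudo-randomly
    without replacement, with T1 preceded by 4-8 distractors and T2 appearing
    `lag` positions after T1."""
    rng = rng or random.Random()
    orientations = rng.sample(range(180), n_items)  # integer orientations, no repeats
    t1 = rng.randint(4, 8)   # 4-8 distractors occupy positions 0 .. t1-1
    t2 = t1 + lag            # worst case: 8 + 7 = 15 < 20, always within the stream
    return orientations, (t1, t2)
```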

Participants were asked to monitor the central RSVP stream until the presentation of the last Gabor, after which a response screen appeared (see Fig.  1b ). The response screen consisted of a centrally presented black circle (10° diameter) and a yellow line. Participants rotated the line using a computer mouse to match the perceived orientation of the target and clicked to indicate their desired response. They were asked to reproduce the orientations of the two targets (T1, T2) in the order they were presented, and to respond as accurately as possible, with no time limit. After providing their responses, participants were shown a feedback screen which displayed their orientation judgements for T1 and T2, and the actual orientations of both targets (see Fig.  1c ). The feedback was displayed for 500 ms before the next trial began, and participants were given a self-paced rest break every 40 trials. Each experiment took between 50 and 60 min to complete.

EEG acquisition and pre-processing

In Experiment 2, continuous EEG data were recorded using a BioSemi ActiveTwo system (BioSemi, Amsterdam, Netherlands). The signal was digitised at a 1024 Hz sampling rate with 24-bit A/D conversion. The 64 active Ag/AgCl scalp electrodes were arranged according to the international standard 10–20 system for electrode placement 58 using a nylon head cap. As per the BioSemi system design, the common mode sense and driven right leg electrodes served as the ground, and all scalp electrodes were referenced to the common mode sense during recording. Pairs of flat Ag/AgCl electro-oculographic electrodes were placed on the outside of both eyes, and above and below the left eye, to record horizontal and vertical eye movements, respectively.

Offline EEG pre-processing was performed using EEGLAB 59 in accordance with best practice procedures 60 , 61 . The data were initially downsampled to 512 Hz and subjected to a 0.5 Hz high-pass filter to remove slow baseline drifts. Electrical line noise was removed using the clean_line function, and the clean_rawdata function in EEGLAB was used to remove bad channels (identified using Artifact Subspace Reconstruction), which were then interpolated from the neighbouring electrodes. Data were then re-referenced to the common average before being epoched into segments for each trial (−0.5 s to 3.0 s relative to the first Gabor in the RSVP). Systematic artefacts from eye blinks, movements and muscle activity were identified using semi-automated procedures in the SASICA toolbox 62 and regressed out of the signal. The data were then baseline corrected to the mean EEG activity from 500 to 0 ms before the first Gabor in the trial.
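The epoching and baseline-correction steps can be sketched in numpy (a simplified stand-in for the EEGLAB pipeline; filtering, channel rejection and artefact correction are not reproduced here):

```python
import numpy as np

def epoch_and_baseline(data, events, srate=512, tmin=-0.5, tmax=3.0):
    """Cut continuous EEG (channels x samples) into per-trial epochs around
    event samples, then subtract the mean of the pre-stimulus window
    (tmin .. 0 s) from each channel. A minimal sketch of these two steps only."""
    n0, n1 = int(tmin * srate), int(tmax * srate)
    epochs = np.stack([data[:, ev + n0: ev + n1] for ev in events])
    # Baseline = mean over the samples preceding stimulus onset.
    baseline = epochs[:, :, :-n0].mean(axis=2, keepdims=True)
    return epochs - baseline
```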

Behavioural analysis

To determine how the AB affected participants’ perception of targets, for each trial we found the difference between the actual target orientation and the reported orientation (i.e., the orientation error) for T1 and T2. This approach is analogous to one employed in previous work that examined whether the AB is associated with discrete or graded awareness of T2 30 . The continuous nature of the orientation responses given by participants on each trial raises the challenge of distinguishing “correct” and “incorrect” trials. For Experiment 2, we scored trials as correct when the orientation error was less than 30° from the presented orientation; trials were scored as incorrect when the orientation error was greater than 30°. As shown in Supplementary Fig.  5 , this approach to scoring yielded a classic blink effect, suggesting the task captures the important behavioural features of the widely reported AB phenomenon. For each lag condition, we found the proportion of responses (in 15° bins) between −90° and +90° for the orientation errors (see Figs.  2a and 4a ) and fit Gaussian functions with a constant offset (Eq.  1 ) using non-linear least squares regression to quantify these results for each participant (Figs.  2b and 4b ):

$$f(x) = A\exp\!\left(-\frac{(x - \mu)^{2}}{2\sigma^{2}}\right) + C \qquad (1)$$

where A is the gain, reflecting the proportion of responses around the reported orientation, μ is the orientation on which the function is centred (in degrees), σ is the standard deviation (degrees), which provides an index of the precision of participants’ responses, and C is a constant used to account for changes in the guessing rate. Using different bin sizes yields the same pattern of results, suggesting this procedure did not bias the results (Supplementary Fig.  3 ). We used a Gaussian with a constant offset to characterise behavioural performance, as it captures the distribution of errors well (median R² = 0.76, SE = 0.04 in Experiment 1). This model allows the gain, width, bias and guessing rates to vary independently (Supplementary Fig.  1 ), unlike the function used in a previous study using a continuous report measure for the AB 30 . Most importantly, the function we implemented can also be used to characterise the forward encoding results, thus allowing a direct comparison of the AB based upon behavioural and neural measures.
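A sketch of the Gaussian-with-offset model and its fit in Python (numpy only; a coarse grid search over μ and σ stands in for the non-linear least squares routine used in the paper):

```python
import numpy as np

def gaussian_with_offset(x, A, mu, sigma, C):
    """Gaussian over orientation error (degrees) plus a constant offset
    capturing the guessing rate (the model described for Eq. 1)."""
    return A * np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) + C

def fit_gaussian(bin_centres, proportions):
    """Crude grid search over (mu, sigma); for each candidate pair, the gain A
    and offset C follow from linear least squares. Illustrative only."""
    best, best_err = None, np.inf
    for mu in np.arange(-30, 31, 5):
        for sigma in np.arange(5, 61, 5):
            shape = np.exp(-(bin_centres - mu) ** 2 / (2 * sigma ** 2))
            X = np.column_stack([shape, np.ones_like(shape)])
            coef, *_ = np.linalg.lstsq(X, proportions, rcond=None)
            err = np.sum((proportions - X @ coef) ** 2)
            if err < best_err:
                best, best_err = (coef[0], mu, sigma, coef[1]), err
    return best  # (A, mu, sigma, C)
```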

We used a regression-based approach 63 (see Figs.  2d and 4c ) to determine how targets and distractors within each RSVP stream influenced behavioural responses. To do this, we aligned the orientations of both distractor and target items from 4 items prior to the appearance of the target through to 9 items after the appearance of the target to construct a regression matrix of the presented orientations. The regression matrix was converted to complex numbers (to account for circularity of orientations) using Eq.  2 :

where C is the regression matrix (in radians) and 1i is an imaginary unit. Standard linear regression was used to determine how the orientations of the items affected the reported orientation using Eq.  3 :

where R is the reported orientation (in radians). This was done separately for T1 and T2 reports, with a higher regression weight indicating the item was more influential in determining the reported orientation.
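A toy numpy sketch of this regression logic on simulated reports (the stream length, mixing weights and absence of noise here are illustrative, not the paper's data; orientations are doubled onto the unit circle to handle circularity):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated reports: the reported orientation mixes the target (index 1)
# with a weaker trace of the immediately following item (index 2).
n_trials, n_positions = 2000, 6
thetas = rng.uniform(0, np.pi, (n_trials, n_positions))  # orientations in radians
true_w = np.array([0.0, 0.8, 0.2, 0.0, 0.0, 0.0])

# Circularity is handled by doubling each angle onto the unit circle.
Z = np.exp(2j * thetas)
reported = np.angle((Z * true_w).sum(axis=1)) / 2

# Linear regression of the (doubled) report on the (doubled) item orientations;
# the magnitude of each weight indexes that item's influence on the report.
w = np.linalg.lstsq(Z, np.exp(2j * reported), rcond=None)[0]
influence = np.abs(w)
```

Applied to real data, the columns of `Z` would span the items from four positions before to nine positions after the target, as in the analysis above.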

To determine whether the finding that the orientations of T1 and T2 influenced the reported orientation was due to participants integrating the other target or the surrounding distractors (Fig.  3 ), we related the orientation error on each trial to the difference in orientation between the target of interest and the other item (either target or distractor). This showed an orientation-tuned effect characteristic of integration. To quantitatively determine the magnitude of this effect, we fit first-derivative Gaussian functions (D1; Eq.  4 ) to these responses 36 , 37 , 38 :

where A is the gain, μ is the orientation on which the function is centred (in degrees) and σ is the standard deviation (degrees).

Forward encoding modelling

Forward encoding modelling was used to recover orientation-selective responses from the pattern of EEG activity for both target and distractor items in the RSVP stream. This technique has been used previously to reconstruct colour 16 , spatial 15 and orientation 19 selectivity from timeseries data acquired through fMRI. More recently, the same approach has been used to encode orientation 9 , 12 , 13 and spatial 14 information contained within MEG and EEG data, which have better temporal resolution than fMRI.

We used the orientations of the epoched data segments to construct a regression matrix with 9 regression coefficients, one for each of the nine orientation channels (Fig.  9a ). This regression matrix was convolved with a tuned set of nine basis functions (half cosine functions raised to the eighth power 9 , 10 , 13 , Eq.  5 ) centred from 0° to 160° in 20° steps.

$$f(x) = \cos^{8}\!\left(\frac{\pi(x - \mu)}{180}\right) \qquad (5)$$

where μ is the orientation on which the channel is centred, and x are orientations from 0° to 180° in 1° steps.
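One plausible construction of this basis set in numpy (the exact form of Eq. 5 is assumed here to be a cosine raised to the eighth power, with period matched to the 180° orientation space):

```python
import numpy as np

def basis_set(n_channels=9, n_orientations=180, power=8):
    """Half-cosine tuning curves raised to a power: one row per channel,
    centred every 20 degrees across the circular 180-degree orientation space."""
    x = np.arange(n_orientations)                                    # 0..179 in 1-degree steps
    centres = np.arange(n_channels) * (n_orientations / n_channels)  # 0, 20, ..., 160
    return np.cos(np.pi * (x[None, :] - centres[:, None]) / n_orientations) ** power
```

The even power makes every curve non-negative and gives it an effective period of 180°, so a channel centred at 0° also responds to orientations near 180°, as orientation circularity requires.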

Figure 9

a A basis set of the nine channels used to model feature (orientation) selectivity. b The basis set was used to find the expected response (regression coefficients) for each different RSVP item in every trial, for each EEG electrode (three electrodes are shown here for a single example participant). Three trials are shown for the corresponding gratings. c Ordinary least squares regression was used to find regression weights for the orientation channels across trials for each EEG electrode (three electrodes are shown here for a single example participant). d Shrinkage matrix that the weights were divided by to perform regularisation, to account for correlated activity between electrodes. e The regression weights were applied to predict the presented orientation. Neural activity (headmaps) from two trials, with the channel responses for those trials. Dotted lines indicate the presented orientations. f Applying this procedure to each time point gives the time course of feature (orientation) selectivity (for one participant). Trials have been binned in 20° intervals, with the dotted lines representing the presented orientation in those trials. On the y-axis, 0 ms represents the onset of the item within the RSVP stream. Feature selectivity emerged around 75 ms after stimulus presentation. g Modified Gaussian functions (equation) were used to quantify the tuning. The colours of the free parameters in the equation correspond to the relevant components of the tuning curve below.

This tuned regression matrix was used to measure orientation information either across trials or in epoched segments. This was done by solving the linear Eq. ( 6 ):

$$B_1 = WC_1 \qquad (6)$$

where B1 (64 sensors × N training trials) is the electrode data for the training set, C1 (9 channels × N training trials) is the tuned channel response across the training trials and W is the weight matrix for the sensors to be estimated (64 sensors × 9 channels). Following methods recently introduced for M/EEG analysis, we estimated the weights associated with each channel individually 13 , 64 . W was estimated using least squares regression to solve Eq. ( 7 ):

$$\hat{W} = B_1 C_1^{\top}\left(C_1 C_1^{\top}\right)^{-1} \qquad (7)$$

Following this previous work 11 , 13 , 64 , we removed the correlations between sensors, as these add noise to the linear equation. To do this, we first estimated the noise correlation between electrodes and removed this component through regularisation 65 , 66 by dividing the weights by the shrinkage matrix. The channel response in the test set C2 (9 channels × N test trials) was estimated using the weights in (7) and applied to activity in B2 (64 sensors × N test trials), as per Eq.  8 :

$$\hat{C}_2 = \left(\hat{W}^{\top}\hat{W}\right)^{-1}\hat{W}^{\top}B_2 \qquad (8)$$

To avoid overfitting, we used cross-validation (10-fold in the initial whole-trial analysis, and 20-fold when the item presentations were stacked), in which all folds but one were used to train the model, which was then tested on the held-out fold. This process was repeated until every fold had served as both training and test data. We also repeated this procedure for each time point in the epoch to determine time-resolved feature selectivity. To re-align the trials with the exact presented orientation, we reconstructed the item representation 15 by multiplying the channel weights (9 channels × time × trial) against the basis set (180 orientations × 9 channels). This resulted in a 180 (−89° to 90°) Orientations × Trial × Time reconstruction. In order to average across trials, the orientation dimension was shifted so that 0° corresponded to the presented orientation in each trial.
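The train-and-invert logic of Eqs (6)–(8) can be sketched in numpy on synthetic sensor data (the shrinkage regularisation, per-channel weight estimation and cross-validation folds of the full pipeline are omitted; the sensor patterns and noise level below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_sensors, n_channels, n_trials = 64, 9, 1000

def channel_responses(orientations_deg):
    """Tuned channel responses (cos^8 basis, centres every 20 degrees)
    for each trial's presented orientation."""
    centres = np.arange(n_channels) * 20.0
    d = orientations_deg[None, :] - centres[:, None]
    return np.cos(np.pi * d / 180.0) ** 8          # 9 channels x trials

# Synthetic "EEG": one random sensor pattern per channel, plus noise.
train_ori = rng.uniform(0, 180, n_trials)
test_ori = rng.uniform(0, 180, n_trials)
true_W = rng.normal(size=(n_sensors, n_channels))
C1 = channel_responses(train_ori)
B1 = true_W @ C1 + 0.1 * rng.normal(size=(n_sensors, n_trials))   # Eq. (6)

# Training (Eq. 7): least-squares estimate of the sensor weights.
W = B1 @ C1.T @ np.linalg.inv(C1 @ C1.T)

# Testing (Eq. 8): invert the weights to recover channel responses.
B2 = true_W @ channel_responses(test_ori)
C2 = np.linalg.inv(W.T @ W) @ W.T @ B2

# The channel with the largest recovered response should sit near the
# presented orientation on each test trial.
decoded = C2.argmax(axis=0) * 20.0
```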

For the initial encoding analysis (Fig.  5 ), to determine whether feature selectivity could be recovered for each RSVP item we used 20 encoding models (one for each item position in the stream) with 600 trials. We trained and tested each model across the entire 2250 ms of the trial to determine when feature selectivity emerged for that RSVP item. This analysis verified that each RSVP item could be encoded independently. We aligned all RSVP items across trials ( N  = 12,000; 600 trials by 20 items) and used a fixed encoding model for training and testing 67 , 68 (Figs.  6 – 8 ). This meant we trained and tested all encoding models across all items (both targets and distractors) regardless of trial type 12 , 13 .

Aligned item reconstructions were then averaged over the relevant condition (Lag, Accuracy or item position) and smoothed using a Gaussian with a temporal kernel of 6 ms 10 , 13 to quantify feature selectivity. Gaussian functions were fit using least squares regression to quantify different parameters of feature selectivity across timepoints, as per Eq.  1 , where A is the gain representing the amount of feature-selective activity, μ is the orientation on which the function is centred (in degrees), σ is the width (degrees) and C is a constant used to account for non-feature-selective baseline shifts.

Univariate orientation selectivity analysis

We used a univariate selectivity analysis 10 to determine the topography associated with orientation-selective activity for targets and distractors (Fig.  8b ). Data were epoched in the same manner as in the forward encoding model where EEG activity was aligned with each stream item. We separated these epochs into target and distractor presentations to determine whether these two types of stimulus were processed differently. All target presentations were used in training (1200 in total; 600 trials with two targets in each), together with a pseudo-random selection of the same number of distractor items. To determine the topography, we used a general linear model to estimate orientation selectivity for each sensor from the sine and cosine of the presentation orientation, and a constant regressor in each presentation. From the weights of the two orientation coefficients we calculated selectivity using Eq.  9 :

A was derived through permutation testing in which the design matrix was shuffled ( N  = 1000) and the weights recalculated. The non-permuted weights were ranked and compared with the permutation distribution, thus enabling calculation of the z-scored difference. To calculate group-level effects, cluster-based sign-flipping permutation testing ( N  = 1500) across electrodes and time was implemented in Fieldtrip 69 to determine whether the topographies differed between conditions.
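A toy numpy sketch of this univariate analysis on simulated sensors, assuming selectivity is taken as the magnitude of the sine/cosine regression coefficients (the tuning strength and noise below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n_pres = 500
theta = rng.uniform(0, np.pi, n_pres)              # presented orientations (radians)
X = np.column_stack([np.sin(2 * theta), np.cos(2 * theta), np.ones(n_pres)])

# One orientation-tuned sensor and one untuned sensor (synthetic).
tuned = 1.5 * np.sin(2 * theta) + rng.normal(size=n_pres)
untuned = rng.normal(size=n_pres)

def selectivity(y, X):
    """Magnitude of the sine/cosine coefficients from the GLM fit
    (assumed form of the selectivity measure)."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    return np.hypot(b[0], b[1])

def z_from_permutation(y, X, n_perm=1000):
    """z-score the observed selectivity against a shuffled-design null."""
    obs = selectivity(y, X)
    null = np.array([selectivity(y, X[rng.permutation(len(y))])
                     for _ in range(n_perm)])
    return (obs - null.mean()) / null.std()
```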

All statistical tests were two-sided, and Bonferroni adjustments were used to correct for multiple comparisons where noted. Non-parametric sign permutation tests 69 , 70 were used to determine differences in the time courses of feature selectivity (Figs.  5 and 6 ) between conditions. The sign of the data was randomly flipped ( N  = 20,000), with equal probability, to create a null distribution. Cluster-based permutation testing was used to correct for multiple comparisons over the timeseries, with a cluster-forming threshold of p  < 0.05 and a significance threshold of p  < 0.05.
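The sign-flipping cluster permutation logic can be sketched as follows (a simplified one-sample version over a single timecourse; the t threshold of 2.0 is an assumed stand-in for the cluster-forming criterion, and only positive clusters are tracked):

```python
import numpy as np

def sign_flip_cluster_test(data, n_perm=2000, t_thresh=2.0, seed=0):
    """Minimal sketch of a one-sample sign-flipping cluster permutation test
    over time. data: participants x timepoints. Returns the largest observed
    cluster mass (summed t over a contiguous supra-threshold run) and its
    p-value against the sign-flipped null."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]

    def max_cluster_mass(x):
        tvals = x.mean(axis=0) / (x.std(axis=0, ddof=1) / np.sqrt(n))
        run, best = 0.0, 0.0
        for t in tvals:
            run = run + t if t > t_thresh else 0.0  # contiguous cluster sum
            best = max(best, run)
        return best

    observed = max_cluster_mass(data)
    # Null: flip each participant's sign at random and recompute the max cluster.
    null = np.array([max_cluster_mass(data * rng.choice([-1.0, 1.0], (n, 1)))
                     for _ in range(n_perm)])
    return observed, (null >= observed).mean()
```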

Reporting summary

Further information on research design is available in the  Nature Research Reporting Summary linked to this article.

Data availability

The EEG and behavioural data for both experiments are available at https://osf.io/f9g6h. A reporting summary for this Article is available as a Supplementary Information file.

Code availability

The code associated with this paper is available at https://github.com/MatthewFTang/AttentionalBlinkForwardEncoding.

McLeod, P. Parallel processing and the psychological refractory period. Acta Psychologica 41 , 381–396 (1977).

Welford, A. T. The ‘psychological refractory period’ and the timing of high-speed performance—a review and a theory. Br. J. Psychol. Gen. Sect. 43 , 2–19 (1952).

Shapiro, K. L., Arnell, K. M. & Raymond, J. E. The attentional blink. Trends Cogn. Sci. 1 , 291–296 (1997).

Raymond, J. E., Shapiro, K. L. & Arnell, K. M. Temporary suppression of visual processing in an RSVP task: an attentional blink? J. Exp. Psychol.: Hum. Percept. Perform. 18 , 849–860 (1992).

Broadbent, D. E. & Broadbent, M. H. P. From detection to identification: response to multiple targets in rapid serial visual presentation. Percept. Psychophys. 42 , 105–113 (1987).

Williams, M. A., Visser, T. A. W., Cunnington, R. & Mattingley, J. B. Attenuation of neural responses in primary visual cortex during the attentional blink. J. Neurosci. 28 , 9890–9894 (2008).

Marois, R., Yi, D.-J. & Chun, M. M. The neural fate of consciously perceived and missed events in the attentional blink. Neuron 41 , 465–472 (2004).

Vogel, E. K., Luck, S. J. & Shapiro, K. L. Electrophysiological evidence for a postperceptual locus of suppression during the attentional blink. J. Exp. Psychol.: Hum. Percept. Perform. 24 , 1656–1674 (1998).

Garcia, J. O., Srinivasan, R. & Serences, J. T. Near-real-time feature-selective modulations in human cortex. Curr. Biol. 23 , 515–522 (2013).

Myers, N. E. et al. Testing sensory evidence against mnemonic templates. eLife 4 , e09000 (2015).

Wolff, M. J., Jochim, J., Akyürek, E. G. & Stokes, M. G. Dynamic hidden states underlying working-memory-guided behavior. Nat. Neurosci. 20 , 864–871 (2017).

Tang, M. F., Smout, C. A., Arabzadeh, E. & Mattingley, J. B. Prediction error and repetition suppression have distinct effects on neural representations of visual information. eLife 7 , e33123 (2018).

Smout, C. A., Tang, M. F., Garrido, M. I. & Mattingley, J. B. Attention promotes the neural encoding of prediction errors. PLoS Biol. 17 , e2006812 (2019).

Foster, J. J., Sutterer, D. W., Serences, J. T., Vogel, E. K. & Awh, E. The topography of alpha-band activity tracks the content of spatial working memory. J. Neurophysiol. 115 , 168–177 (2016).

Sprague, T. C. & Serences, J. T. Attention modulates spatial priority maps in the human occipital, parietal and frontal cortices. Nat. Neurosci. 16 , 1879–1887 (2013).

Brouwer, G. J. & Heeger, D. J. Decoding and reconstructing color from responses in human visual cortex. J. Neurosci. 29 , 13992–14003 (2009).

McAdams, C. J. & Maunsell, J. H. R. Effects of attention on orientation-tuning functions of single neurons in macaque cortical area V4. J. Neurosci. 19 , 431–441 (1999).

Maunsell, J. H. R. Neuronal mechanisms of visual attention. Annu. Rev. Vis. Sci. 1 , 373–391 (2015).

Ester, E. F., Sutterer, D. W., Serences, J. T. & Awh, E. Feature-selective attentional modulations in human frontoparietal cortex. J. Neurosci. 36 , 8188–8199 (2016).

Treue, S. & Martinez Trujillo, J. C. Feature-based attention influences motion processing gain in macaque visual cortex. Nature 399 , 575–579 (1999).

Maunsell, J. H. R. & Treue, S. Feature-based attention in visual cortex. Trends Neurosci. 29 , 317–322 (2006).

Maloney, R. T., Jayakumar, J., Levichkina, E. V., Pigarev, I. N. & Vidyasagar, T. R. Information processing bottlenecks in macaque posterior parietal cortex: an attentional blink? Exp. Brain Res. 228 , 365–376 (2013).

Raymond, J. E., Shapiro, K. L. & Arnell, K. M. Similarity determines the attentional blink. J. Exp. Psychol.: Hum. Percept. Perform. 21 , 653–662 (1995).

Lagroix, H. E. P., Spalek, T. M., Wyble, B., Jannati, A. & Di Lollo, V. The root cause of the attentional blink: First-target processing or disruption of input control? Atten., Percept., Psychophys. 74 , 1606–1622 (2012).

Chun, M. M. & Potter, M. C. A two-stage model for multiple target detection in rapid serial visual presentation. J. Exp. Psychol.: Hum. Percept. Perform. 21 , 109–127 (1995).

Duncan, J., Ward, R. & Shapiro, K. Direct measurement of attentional dwell time in human vision. Nature 369 , 313–315 (1994).

Di Lollo, V., Kawahara, J. I., Shahab Ghorashi, S. M. & Enns, J. T. The attentional blink: Resource depletion or temporary loss of control? Psychological Res. 69 , 191–200 (2005).

Akyürek, E. G. et al. Temporal target integration underlies performance at lag 1 in the attentional blink. J. Exp. Psychol.: Hum. Percept. Perform. 38 , 1448–1464 (2012).

Akyürek, E. G. & Hommel, B. Target integration and the attentional blink. Acta Psychologica 119 , 305–314 (2005).

Asplund, C. L., Fougnie, D., Zughni, S., Martin, J. W. & Marois, R. The attentional blink reveals the probabilistic nature of discrete conscious perception. Psychological Sci. 25 , 824–831 (2014).

Dehaene, S. & Naccache, L. Towards a cognitive neuroscience of consciousness: basic evidence and a workspace framework. Cognition 79 , 1–37 (2001).

Nieuwenhuis, S. & de Kleijn, R. Consciousness of targets during the attentional blink: a gradual or all-or-none dimension? Atten., Percept., Psychophys. 73 , 364–373 (2011).

Visser, T. A. W. Expectancy-based modulations of lag-1 sparing and extended sparing during the attentional blink. J. Exp. Psychol.: Human Percept. Perform. 41 , 462–478 (2015).

Fischer, J. & Whitney, D. Serial dependence in visual perception. Nat. Neurosci. 17 , 738–743 (2014).

Alais, D., Leung, J. & Van der Burg, E. Linear summation of repulsive and attractive serial dependencies: orientation and motion dependencies sum in motion perception. J. Neurosci. 37 , 4381–4390 (2017).

Dickinson, J. E., Almeida, R. A., Bell, J. & Badcock, D. R. Global shape aftereffects have a local substrate: a tilt aftereffect field. J. Vision 10 , 1–12 (2010).

Dickinson, J. E., Morgan, S. K., Tang, M. F. & Badcock, D. R. Separate banks of information channels encode size and aspect ratio. J. Vision 17 , 27 (2017).

Tang, M. F., Dickinson, J. E., Visser, T. A. W. & Badcock, D. R. The broad orientation dependence of the motion streak aftereffect reveals interactions between form and motion neurons. J. Vision 15 , 4 (2015).

Visser, T. A. W., Bischof, W. F. & Di Lollo, V. Rapid serial visual distraction: task-irrelevant items can produce an attentional blink. Percept. Psychophys. 66 , 1418–1432 (2004).

Visser, T. A. W., Tang, M. F., Badcock, D. R. & Enns, J. T. Temporal cues and the attentional blink: a further examination of the role of expectancy in sequential object perception. Atten. Percept. Psychophys. 76 , 2212–2220 (2014).

Visser, T. A. W. T1 difficulty and the attentional blink: Expectancy versus backward masking. Q. J. Exp. Psychol. 60 , 936–951 (2007).

Brehaut, J. C., Enns, J. T. & Di Lollo, V. Visual masking plays two roles in the attentional blink. Percept. Psychophys. 61 , 1436–1448 (1999).

Wyble, B., Bowman, H. & Nieuwenstein, M. The attentional blink provides episodic distinctiveness: Sparing at a cost. J. Exp. Psychol.: Hum. Percept. Perform. 35 , 787–807 (2009).



The authors would like to thank Brad Wyble for comments on a pre-print of this manuscript. This work was supported by the Australian Research Council (ARC) Centre of Excellence for Integrative Brain Function (CE140100007). MFT was supported by the NVIDIA corporation who donated a TITAN V GPU. EA was supported by an ARC Discovery Project (DP170100908). JTE was supported by a Discovery Grant from the Natural Sciences and Engineering Research Council (Canada). TAWV was supported by an ARC Discovery Project (DP120102313). JBM was supported by an ARC Australian Laureate Fellowship (FL110100103), an ARC Discovery Project (DP140100266), and by the Canadian Institute for Advanced Research (CIFAR).

Author information

Authors and Affiliations

Queensland Brain Institute, The University of Queensland, Brisbane, QLD, Australia

Matthew F. Tang, Lucy Ford & Jason B. Mattingley

Australian Research Council Centre of Excellence for Integrative Brain Function, Victoria, Australia

Matthew F. Tang, Ehsan Arabzadeh & Jason B. Mattingley

Eccles Institute of Neuroscience, John Curtin School of Medical Research, The Australian National University, Canberra, ACT, Australia

Matthew F. Tang & Ehsan Arabzadeh

Department of Psychology, The University of British Columbia, Vancouver, BC, Canada

James T. Enns

School of Psychological Sciences, The University of Western Australia, Perth, WA, Australia

James T. Enns & Troy A. W. Visser

School of Psychology, The University of Queensland, Brisbane, QLD, Australia

Jason B. Mattingley

Canadian Institute for Advanced Research (CIFAR), Toronto, Canada



M.F.T.—conception, data gathering, data analysis, original draft, final approval. L.F.—data gathering, draft editing, final approval. E.A.—draft editing, final approval. J.T.E.—draft editing, final approval. T.A.W.V.—conception, draft editing, final approval. J.B.M.—conception, draft editing, final approval.

Corresponding author

Correspondence to Matthew F. Tang .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Peer review information Nature Communications thanks Edward Ester and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

  • Supplementary Information
  • Peer Review File
  • Reporting Summary
  • Description of Additional Supplementary Files
  • Supplementary Movie 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Tang, M.F., Ford, L., Arabzadeh, E. et al. Neural dynamics of the attentional blink revealed by encoding orientation selectivity during rapid visual presentation. Nat Commun 11, 434 (2020). https://doi.org/10.1038/s41467-019-14107-z

Download citation

Received : 16 April 2019

Accepted : 10 December 2019

Published : 23 January 2020

DOI : https://doi.org/10.1038/s41467-019-14107-z


This article is cited by

Visual temporal attention from perception to computation.

  • Rachel N. Denison

Nature Reviews Psychology (2024)

Expectation violations enhance neuronal encoding of sensory information in mouse primary visual cortex

  • Matthew F. Tang
  • Ehsan Kheradpezhouh
  • Ehsan Arabzadeh

Nature Communications (2023)



A review of rapid serial visual presentation-based brain-computer interfaces


  • 1 Faculty of Computing and Engineering, Ulster University, Belfast, United Kingdom.
  • PMID: 29099388
  • DOI: 10.1088/1741-2552/aa9817





Attention across time rather than space has been studied with a paradigm called the rapid serial visual presentation (RSVP) paradigm. In RSVP, a series of stimuli appear rapidly in time at the same point in visual space. Indeed, the stimuli may appear as fast as 10 items per second. The stimuli are usually letters or photographs. The task of the participant is to determine when a particular stimulus appears and to press a button or key as fast as possible after that stimulus occurs. Thus, the participant might be following a series of letters flashed 10 per second. The participant’s task is to press a response button every time the letter S occurs.

In this activity, you can manipulate some variables and see how that impacts your ability to pick a stimulus out of a rapid serial visual presentation.

Full Screen Mode

To see the illustration in full screen, which is recommended, press the Full Screen button, which appears at the top of the page.

Illustration Tab

Below is a list of the ways that you can alter the illustration. The settings include the following:

  • Font Size: change the font size of the letters in the sequence.
  • Presentation Rate: how fast the stimuli will be presented (in ms).
  • Target Color: whether the target is the same color as the rest of the letters or a different (red) color.
  • Target Probability: the probability that a target will appear in the sequence.

Pressing the reset button restores the settings to their default values and returns the screen to the starting sentence.


# Rapid Serial Visual Presentation (RSVP) Task Walkthrough

The rapid serial visual presentation (RSVP) task is a method of displaying text or images in a quick central stream, one item at a time. Observers are required to read the stream or identify a target in it as quickly as possible. The paradigm is used notably in linguistics, neuroscience, and cognitive psychology, commonly to examine individual reading rates, to assess visual impairments and dyslexia, and in attention research (e.g., attentional blink, repetition blindness). An RSVP task can be set up in many different ways depending on the researcher's question, but the current walkthrough will create a task that requires detecting a word repetition as quickly as possible (see Figure 1 below).


Unlike the previous experiments, which required manually assigning texts and images, we will use the Data Frame feature in Labvanced to let the program reference a CSV datasheet and present the words according to its prepared structure. The Labvanced Data Frame is an actively developed feature that allows more efficient stimulus presentation and helps avoid errors during stimulus setup. Using the Data Frame feature, the current walkthrough proceeds in four parts:

  • Data Frame preparation
  • Frames setup
  • Stimuli setup
  • Events setup

As indicated in the first figure above, the trial display sequence will comprise:

  • Frame 1: 500 ms of the fixation cross
  • Frame 2: 350 ms of the first word presentation (referenced by the Data Frame)
  • Frame 3: 150 ms of a blank screen
  • Frame 4: 350 ms of the second word
  • Frame 5: 150 ms of a blank screen
  • Frame 6: 350 ms of the third word
  • Frame 7: 150 ms of a blank screen
  • Frame 8: 350 ms of the fourth word
  • Frame 9: 150 ms of a blank screen
  • Frame 10: 350 ms of the fifth word
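The timing above is easy to sanity-check. As a sketch only (plain Python, not Labvanced code), the frame schedule and per-trial duration work out as:

```python
# Frame schedule from the walkthrough: (name, duration in ms).
SCHEDULE = [
    ("fixation", 500),
    ("word1", 350), ("blank", 150),
    ("word2", 350), ("blank", 150),
    ("word3", 350), ("blank", 150),
    ("word4", 350), ("blank", 150),
    ("word5", 350),
]

# Onset of each frame relative to trial start, plus total trial length.
onsets = {}
elapsed = 0
for index, (name, duration) in enumerate(SCHEDULE):
    onsets[index] = (name, elapsed)
    elapsed += duration

total_ms = elapsed  # 500 + 5 * 350 + 4 * 150 = 2850 ms per trial
```

At 2850 ms per trial, a 20-trial block occupies 57 s of stimulus time, excluding any between-trial gaps.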

With that context and introduction, let's dive into creating the Data Frame for this study.

# Part I: Data Frame Preparation and variable setup

To prepare the Data Frame in Labvanced, we will first prepare a separate Google Sheet. Here, we list the strings of words to present; the figure below shows the words used for the current task build (see Figure 2 below). It is worth noting that each word is presented an equal number of times as a target. Each row corresponds to one trial, and each column to one frame in which text is displayed. For example, the first trial will show the serial presentation pumpkin-plane-flower-flower-anvil, with the second flower serving as the target.
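The balance in the sheet (each word serving as the target an equal number of times) can be reproduced programmatically. The sketch below is a hypothetical reconstruction: the walkthrough names only four of the words (pumpkin, plane, flower, anvil), so the fifth word, hammer, and the crossing of five targets with four repetition positions to get 20 trials are assumptions:

```python
import itertools
import random

# Hypothetical word pool: the walkthrough names the first four; "hammer" is invented.
WORDS = ["pumpkin", "plane", "flower", "anvil", "hammer"]

def make_trials(words, seed=0):
    """One trial per (target word, repetition position) pair.

    Each trial is five words containing exactly one immediate repetition
    (the target). Five words x four repetition positions = 20 trials,
    and every word serves as the target four times.
    """
    rng = random.Random(seed)
    trials = []
    for target, pos in itertools.product(words, range(4)):
        fillers = [w for w in words if w != target]
        rng.shuffle(fillers)
        filler_iter = iter(fillers)
        row = [target if slot in (pos, pos + 1) else next(filler_iter)
               for slot in range(5)]
        trials.append(row)
    return trials

trials = make_trials(WORDS)
```

Writing these rows out to CSV would yield a sheet with the same shape as the one prepared in Google Sheets.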


Proceeding to the Labvanced Task Editor, we will open a new canvas frame and create some variables. First, click Add Variable on the top-right display and proceed with the options depicted in the figure below (see Figure 3). Afterward, click the green Edit Data Frame button and select Upload 2D CSV Data. This opens the Labvanced File Storage, where the CSV data exported from the Google Sheet is transferred to the repository. After the transfer, select the same data (named DataFrame.csv in this case) to complete the Data Frame preparation. Click Ok and save the Data Frame in the Data Frame variable.


Before proceeding to the next part, we will create a few more variables. Similar to the previous steps, locate and click Add Variable. We will name the variable Word1 with the Data Type set to String (see Figure 4 below), and replicate this step four more times for the variables Word2, Word3, Word4, and Word5. Later in this walkthrough, we will assign each Data Frame column to one of these variables and link them to the frames that display the RSVP texts.


# Part II: Frames setup

The second part of this walkthrough involves creating the frames that observers will view during participation. Again, the current task follows the general procedure shown back in Figure 1, comprising 10 frames. As depicted, a trial begins with a fixation cross (frame 1) for 500 ms, followed by five text screens (e.g., frame 2) of 350 ms each, with a blank interstimulus interval (e.g., frame 3) of 150 ms between them. To create these frames, we will click the Canvas button ten times at the bottom of the Labvanced display (see Figure 5A) and name each frame (see Figure 5B) as follows:

  • Frame 1: fixation
  • Frame 2: word1
  • Frame 3: blank
  • Frame 4: word2
  • Frame 5: blank
  • Frame 6: word3
  • Frame 7: blank
  • Frame 8: word4
  • Frame 9: blank
  • Frame 10: word5

Afterward, we will enter 20 for # Trials under Trials & Conditions (see Figure 5C), since we will present the twenty text-stream trials determined in the Data Frame setup. With this, we now have all the necessary frames to show the trial sequence. In the next part, we will add to each canvas the stimuli that participants will view during their participation.


# Part III: Stimuli setup (fixation cross, text presentation, and frame duration)

With all the frames prepared in the last part, we will now set the individual stimulus in each frame, starting with the fixation cross in the 1st frame. To create a fixation cross, start by clicking Display Text (see Figure 6A) to place a textbox on the canvas. Here, we can type a + in the box with white font at size 36 and position it in the center of the display. We could also type the specific X and Y frame coordinates into the Object Properties on the right side for a precise center position. If we instead want to upload an image containing the fixation cross, or different stimuli altogether, the Media option (see Figure 6B) can present images, videos, audio, etc. Lastly, we will set a 500 ms presentation duration by entering 500 in the duration box below the frame name.


# Frames 2, 4, 6, 8, 10 (text presentation)

In the second frame, we will present the text linked to the Data Frame. Similar to the first frame, we will place the text display at the central position. Delete the default message inside the text box and click the Insert Variable icon (see Figure 7 below). Here, we insert the variable established in the previous part that corresponds to the frame name (e.g., insert the word1 variable in the word1 frame). Lastly, we set 350 in the duration box for the presentation time. For the remaining blank frames (3, 5, 7, 9), we only place 150 in the duration box, as these frames serve as the interstimulus interval between the text presentations.


# Part IV: Events setup

In this part, we give Labvanced a logical sequence of specific actions to execute in each frame (e.g., setting frame durations and assessing responses). Such a sequence of actions is known as an Event in the Labvanced platform. Before proceeding, we will create a new Reaction time variable to measure the keypress response time when the observer detects the text repetition. Click Add Variable on the top-right display and proceed with the options depicted in the figure below (see Figure 8).


Proceeding forward, we will create the first Event, linking the Data Frame to each word variable established in the previous part. To create this Event, click Events on the top right next to Variables and select Frame Event (on this frame only). In the first dialog window, we can name the Event “Data Frame Link” (Figure 9A) and click next to proceed to the Trigger option. Here, the trigger type is Trial and Frame Trigger → Frame Start. With this trigger, we add Action → Variable Actions → Set/Record variable and select the word1 variable on the left side. On the right, select Variable → Select Value from Data Frame → Data Frame. We select the Trial_Nr variable in the row option and place the numeric value 1 in the column option. We then repeat these steps to link the word2, word3, word4, and word5 variables to the Data Frame. Note that each variable should be set with its corresponding column number (e.g., word2 with column 2 in the Data Frame event). Note also that Trial_Nr in the row option presents the word sequences in the exact order of the rows in the Data Frame, whereas Trial_Id randomizes the rows.
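Functionally, the "Data Frame Link" event is a row/column lookup performed at frame start. A minimal Python sketch of that logic, using the example row from Figure 2 plus an invented second row:

```python
# Hypothetical Data Frame contents: row = trial, column = word slot.
# The first row is the example from Figure 2; the second row is invented.
DATA_FRAME = [
    ["pumpkin", "plane", "flower", "flower", "anvil"],
    ["anvil", "flower", "plane", "plane", "pumpkin"],
]

def link_trial(trial_nr):
    """At Frame Start, set word1..word5 from the row for the 1-based trial_nr."""
    row = DATA_FRAME[trial_nr - 1]
    return {f"word{col}": row[col - 1] for col in range(1, 6)}

words_trial1 = link_trial(1)
```

Selecting Trial_Id instead of Trial_Nr would correspond to shuffling the rows of DATA_FRAME before the lookup.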


Afterward, we will create a new Event for the reaction-time recording. For this, proceed with Trial Event (on each frame). In the first dialog window, we can name the Event “Reaction time” (Figure 10A) and click next to proceed to the Trigger option. Here, the trigger type is the Keyboard trigger with Space as the allowable response. Moving to the Action, add Action → Variable Actions → Set/Record variable and select the RT variable on the left side. On the right, proceed with Trigger (Keyboard) → Time From Frame Onset (see Figure 10B below). With this, we ask the program to record the keyboard reaction to the target, measured in milliseconds from the frame onset. In the same Action window, add Action → Variable Actions → Set/Record variable and select the Frame Name variable on the left side. On the right, proceed with Frame/Task/Object → Frame → Frame Name (see Figure 10C below). Click Finish at the bottom of the window to complete the Events setup for this study.
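The two recorded values amount to simple timestamp bookkeeping: the keypress time minus the onset of the frame it occurred in, logged together with that frame's name. A sketch with assumed example timestamps (a 2000 ms onset for word4 follows from the durations in Figure 1: 500 + 3 × 350 + 3 × 150):

```python
def record_response(frame_name, frame_onset_ms, keypress_ms):
    """'Time From Frame Onset': keypress timestamp minus the onset of the
    frame in which the Space press occurred, logged with the frame's name."""
    return {"FrameName": frame_name, "RT": keypress_ms - frame_onset_ms}

# Example: word4 starts 2000 ms into the trial; Space is pressed at 2431 ms.
logged = record_response("word4", 2000, 2431)
```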


With that said, on behalf of the Labvanced team, we hope this walkthrough provides an essential foundation for your study construction.

By Isak Kim, Scientific Support Manager, [email protected]

Rapid Serial Visual Presentation (RSVP)


When we read, we spend a lot of our time moving our eyes from one group of letters to the next. Such eye movements are called saccades (from the French word for jerk) and are typically very rapid. Typically, the eyes are still (fixated) for approximately 250 ms (it could be much less or much more depending on your familiarity with the word and whether the word agrees with your expectations), followed by a saccade which can take another 20 to 30 ms. At first glance, it looks like most of the time is spent reading (~250 ms per fixation) compared to movement (~25 ms per saccade). But the story is not quite so simple. Much of the fixation time is actually spent planning the subsequent saccade and is not involved in extracting meaning from the word.

How fast could you read, and still have good comprehension, if you did not have to plan and make all the saccades? You can answer that question using rapid serial visual presentation, or RSVP. RSVP removes all the saccades by presenting each word in the same location, but at a slightly different time. Because each word appears in the same place, there is no need for the eyes to move. While 250 to 300 words per minute is an average reading speed for relatively easy prose, you can, with virtually no effort, crank that up to over 400 words per minute using RSVP. With a little practice, you can push your reading speed quite a bit higher with RSVP while retaining reasonably good comprehension. JUST DON'T BLINK!
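The relationship between per-word exposure and reading rate is a one-line conversion (60,000 ms per minute divided by the exposure duration):

```python
def wpm(ms_per_word):
    """Reading rate in words per minute for a given per-word RSVP exposure."""
    return 60_000 / ms_per_word

# 250 ms matches a typical fixation; 150 ms and 60 ms are faster RSVP rates.
rates = {ms: wpm(ms) for ms in (250, 150, 60)}
```

So one word every 150 ms is 400 words per minute, and 60 ms exposures correspond to 1,000 words per minute.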

The Activity:

In this activity, you will read using the RSVP method. Look directly at the plus sign below -- each word will be centered there. When you click the RSVP button a new word will be presented every 150 ms -- that is 400 words per minute. Can you read the message? If so, slide the rate control to the left to make the words come even faster. How fast can you read without much practice using RSVP?

Rate of presentation: one word every 50 ms (fastest) to one word every 200 ms (slowest)

Rapid Serial Visual Presentation

The RSVP Training Program

Rapid serial visual presentation (RSVP) is an innovative approach to improving one of the cognitive processes associated with reading fingerspelled words, the rapid processing of serially presented visual information. The sequencing of the RSVP study set allows signers to develop skills incrementally.


What is the RSVP Training Program?

The RSVP training program consists of sequences of letters from the printed Roman alphabet that are flashed successively onscreen at approximately twice the speed of a fingerspelled word. This speed has been experimentally determined to be beneficial to learning fingerspelled word recognition skills.

The training program is designed to be used after reading the RSVP book, and before studying the video word lists and monologues. Before starting the training, read the full description of how this program works here .

Using the RSVP Training Program

Ready to get started.

These Apps Could Triple Your Reading Speed

They show you words one-by-one, at incredible speeds—up to 1,000 words per minute

Colin Schultz



Most Americans spend just under six hours per week reading books. At the average reading speed of around 300 words per minute, and with an average novel having 64,000 words, this comes out to about a book and a half each week. But the makers of a slate of new technologies*, including Spritz and Velocity, think you can do better. By doing little more than reading through one of these technologies, the companies say, you could push your rate up to four, even five books per week.
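The "book and a half" figure follows directly from the quoted numbers; a back-of-the-envelope check:

```python
READING_SPEED_WPM = 300   # average reading speed quoted above
HOURS_PER_WEEK = 6        # "just under six hours", rounded up here
NOVEL_WORDS = 64_000      # average novel length quoted above

words_per_week = READING_SPEED_WPM * 60 * HOURS_PER_WEEK
novels_per_week = words_per_week / NOVEL_WORDS  # about 1.7, i.e. "a book and a half"
```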

The way that we tend to read—top-to-bottom, left-to-right, absorbing the words and sentences on a page—is not ideal, if speed is the goal. There's another way to read, known as rapid serial visual presentation , that speeds up the process. In rapid serial visual presentation, words are shown one-by-one in quick succession, rather than being all on the page in a block of text. This makes us read much more quickly, says 9 to 5 Mac :

Studies have shown that using Rapid Serial Visual Presentation helps increase reader’s reading speed because it forces the reader to stop reading out loud inside their head (subvocalization), and suppresses the tendency for eyes to backtrack the line while reading and searching for the end of the sentence. 

It's easy to be skeptical that something so simple would have such a big effect on reading speed, but playing with these samples from Spritz certainly makes the idea seem convincing:

Pushed up to 500, the process feels less like reading and more like absorbing the text. Spritz and Velocity each offer presentation speeds up to 1,000 words per minute.

Rapid serial visual presentation does have its downsides, though. For one, paying attention to text displayed this way can be tiring. Then, there's a thing called “ attentional blink ,” where if the words are presented too quickly together the brain will skip a beat , missing some of the text.

Maybe software like Spritz and Velocity aren't ideal for absorbing novels or other pleasure reading. And, it seems likely they'd be less useful for challenging texts, like textbooks or papers with lots of complicated ideas or jargon. But we spend more than a quarter of each day dealing with email , on average, and in 2012 Americans wrote roughly 40,000 words of email each . Maybe blinkered text and rapid serial visual presentation could cut down on the email slog, and give us all more time to get things done?

H/T Huffington Post

*The text was edited to reflect the fact that Spritz is not a standalone application, but a technology, says Krystina Puelo on behalf of Spritz: "[Spritz is] a text streaming technology that can be integrated onto mobile devices, wearables, apps, websites, etc."



Colin Schultz is a freelance science writer and editor based in Toronto, Canada. He blogs for Smart News and contributes to the American Geophysical Union. He has a B.Sc. in physical science and philosophy, and a M.A. in journalism.


2.4 Rapid serial visual presentation

It has been known for a long time that backward masking can act in one of two ways: integration and interruption (Turvey, 1973). When the SOA between target and mask is very short, integration occurs; that is, the two items are perceived as one, with the result that the target is difficult to report, just as when one word is written over another. Of more interest is masking by interruption, which is the type we have been considering in the previous section. It occurs at longer SOAs, and interruption masking will be experienced even if the target is presented to one eye and the mask to the other. This dichoptic (two-eyed) interaction must take place after information from the two eyes has been combined in the brain; it could not occur at earlier stages. In contrast, integration masking does not occur dichoptically when target and mask are presented to separate eyes, so presumably occurs quite early in analysis, perhaps even on the retina. On this basis, Turvey (1973) described integration as peripheral masking, and interruption as central masking, meaning that it occurred at a level where more complex information extraction was taking place.

Another early researcher in the field (Kolers, 1968) described the effect of a central (interruption) mask by analogy with the ‘processing’ of a customer in a shop. If the customer (equivalent to the target) comes into the shop alone, then s/he can be fully processed, even to the extent of discussing the weather and asking about family and holidays. However, if a second customer (i.e. a mask) follows the first, then the shopkeeper has to cease the pleasantries, and never learns about the personal information. The analogy was never taken further, and of course it is unwise to push an analogy too far. Nevertheless, one is tempted to point out that the second customer is still kept waiting for a while. Where does that thought take us? It became possible to investigate the fate of following stimuli, in fact whole queues of stimuli, with the development of a procedure popularised by Broadbent (Broadbent and Broadbent, 1987), who, like Treisman, had moved on from auditory research. The procedure was termed Rapid Serial Visual Presentation, in part, one suspects, because that provided the familiar abbreviation RSVP; participants were indeed asked to répondez s'il vous plaît with reports of what they had seen.

Unlike the traditional two-stimulus, target/mask pairing, Rapid Serial Visual Presentation (RSVP) displayed a series of stimuli in rapid succession, so each served as a backward mask for the preceding item. SOAs were such that a few items could be reported, but with difficulty. Typical timings would display each item for 100 ms, with a 20 ms gap between them; the sequence might contain as many as 20 items. Under these conditions stimuli are difficult to identify, and participants are certainly unable to list all 20; they are usually asked to look out for just two. In one variation, every item except one is a single black letter. The odd item is a white letter, and this is the first target; the participant has to say at the end of the sequence what the white letter had been. One or more items later in the sequence (i.e. after the white target), one of the remaining black letters may be an ‘X’. As well as naming the white letter, the participant has to say whether or not X was present in the list. These two targets (white letter and black X) are commonly designated as T1 and T2. Notice that the participant has two slightly different tasks: for T1 (which will certainly be shown) an unknown letter has to be identified, whereas for T2 the task is simply to say whether a previously designated letter was presented. These details, together with a graph of typical results, are shown in Figure 3 .

Figure 3

As can be seen from the graph in Figure 3b , T2 (the X) might be spotted if it is the item immediately following T1, but thereafter it is less likely that it will be detected unless five or six items separate the two. What happens when it is not detected? As you may be coming to expect, the fact that participants do not report T2 does not mean that they have not carried out any semantic analysis upon it. Vogel et al. (1998) conducted an RSVP experiment that used words, rather than single letters. Additionally, before a sequence of stimuli was presented, a clear ‘context’ word was displayed, for a comfortable 1 second. For example, the context word might be shoe , then the item at T2 could be foot . However, on some presentations T2 was not in context; for example, rope . While participants were attempting to report these items, they were also being monitored using EEG (electro-encephalography). The pattern of electrical activity measured via scalp electrodes is known to produce a characteristic ‘signature’, when what might be called a mismatch is encountered. For example, if a participant reads the sentence He went to the café and asked for a cup of tin , the signature appears when tin is reached. The Vogel et al. (1998) participants produced just such an effect with sequences such as shoe – rope , even when they were unable to report seeing rope . This sounds rather like some of the material discussed earlier, where backward masking prevented conscious awareness of material that had clearly been detected. However, the target in the RSVP situation appears to be affected by something that happened earlier (i.e. T1), rather than by a following mask. The difference needs exploring and explaining.

Presumably something is happening as a result of processing the first target (T1), which temporarily makes awareness of the second (T2) very difficult. Measurements show that for about 500 to 700 ms following T1, detection of T2 is lower than usual. It is as if the system requires time to become prepared to process something fresh, a gap that is sometimes known as a refractory period , but that in this context is more often called the attentional blink , abbreviated to AB. While the system is ‘blinking’ it is unable to attend to new information.
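The standard way of quantifying the blink is to plot T2 accuracy, conditional on a correct T1 report, as a function of the lag between the two targets. A minimal sketch follows, with hypothetical trial data shaped only to show the typical dip-and-recovery pattern:

```python
# Conditional T2|T1 accuracy by lag, the usual dependent measure in
# attentional-blink experiments. Trial tuples are (lag, t1_ok, t2_ok).
from collections import defaultdict

def t2_given_t1(trials):
    """Return {lag: proportion of T1-correct trials on which T2 was also
    reported correctly}. T2 is scored only when T1 was correct."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for lag, t1_ok, t2_ok in trials:
        if t1_ok:                      # condition on a correct T1 report
            counts[lag] += 1
            hits[lag] += int(t2_ok)
    return {lag: hits[lag] / counts[lag] for lag in counts}

# Toy data: lag-1 sparing, a dip in the middle lags, recovery by lag 6
# (roughly the 500-700 ms window mentioned above, at a 120 ms SOA).
trials = [(1, True, True), (1, True, True),
          (3, True, False), (3, True, False), (3, True, True),
          (6, True, True), (6, True, True)]
print(t2_given_t1(trials))
```

Conditioning on T1 matters: trials where T1 itself was missed tell us nothing about whether processing T1 impaired T2.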

Time turns out not to be the only factor in the AB effect (‘AB effect’ will be used as a shorthand for the difficulty of reporting T2). Raymond et al. (1992) used a typical sequence of RSVP stimuli, but omitted the item immediately following the first target: there was a 100 ms gap rather than another item. Effectively, this reduced the degree of backward masking, and not surprisingly it produced some improvement in the report rate for T1. Very surprisingly, it also produced a considerable improvement in the reporting of T2; the AB effect had vanished (see Figure 4a ). How did removing the mask for one target lead to an even larger improvement for another target that was yet to be presented? To return to our earlier analogy, if the shopkeeper is having some trouble in dealing with the first customer, then the second is kept waiting and suffers. That doesn't explain how the waiting queue suffers (if it were me, I should probably chat to the person behind and forget what I had come for), but that question was also addressed by removing items from the sequence.

Giesbrecht and Di Lollo (1998) removed the items following T2, so that it was the last in the list; again, the AB effect disappeared (see Figure 4b ). So, no matter what was going on with T1, T2 could be seen if it was not itself masked. To explain this result, together with the fact that making T1 easier to see also helps T2, Giesbrecht and Di Lollo developed a two-stage model of visual processing. At Stage 1, a range of information about target characteristics is captured in parallel: identity, size, colour, position and so on. At Stage 2, they proposed, serial processes act upon that information, preparing it for awareness and report. While Stage 2 is engaged, later information cannot be processed, so it has to remain at Stage 1. Any kind of disruption to T1, such as masking, makes it harder to process, so information from T2 is kept waiting longer. This has little detrimental impact upon T2 unless it too is masked by a following stimulus (I don't forget what I came to buy if there is no-one else in the queue to chat with). When T2 is kept waiting it can be overwritten by the following stimulus. The overwriting will be damaging principally to the episodic information: an item cannot be both white and black, for example. Semantic information, however, may be better able to survive; there is no reason why shoe and rope should not both become activated. Consequently, even when there is insufficient information for Stage 2 to yield a fully processed target, the target may nevertheless reveal its presence through priming or EEG effects. There is an obvious similarity between this account and Coltheart's (1980) suggestion: both propose the need to join semantic and episodic detail.
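The bottleneck logic of the two-stage account can be caricatured as a tiny simulation. This is a sketch only, with assumed parameter values (a 500 ms Stage 2 occupancy and a 120 ms SOA); it is not Giesbrecht and Di Lollo's own model:

```python
# Toy two-stage bottleneck: Stage 1 holds an item until the serial
# Stage 2 is free; a MASKED item still waiting when the next stimulus
# arrives is overwritten and never reaches awareness/report.

def simulate(t1_onset, t2_onset, t2_masked, stage2_ms=500, soa=120):
    """Return True if T2 reaches Stage 2 (i.e. is reportable)."""
    stage2_free_at = t1_onset + stage2_ms   # Stage 2 busy with T1 until then
    if stage2_free_at <= t2_onset:
        return True                         # Stage 2 already free: no blink
    if not t2_masked:
        return True                         # unmasked T2 can wait safely
    # Masked T2 survives only if Stage 2 frees up before the next item
    # (the mask) overwrites it in Stage 1.
    return stage2_free_at <= t2_onset + soa

# Lag 3 at a 120 ms SOA: T2 appears 360 ms after T1.
print(simulate(0, 360, t2_masked=True))    # False: the blink
print(simulate(0, 360, t2_masked=False))   # True: T2 last in list, AB gone
print(simulate(0, 720, t2_masked=True))    # True: lag 6, outside the blink
```

The three calls reproduce, in cartoon form, the three findings in the text: the blink at short lags, its disappearance when T2 is unmasked, and its disappearance at long lags.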

Figure 4


The attentional blink: A review of data and theory

  • Tutorial Reviews
  • Published: November 2009
  • Volume 71, pages 1683–1700 (2009)


  • Paul E. Dux 1,2
  • René Marois 1


Under conditions of rapid serial visual presentation, subjects display a reduced ability to report the second of two targets (Target 2; T2) in a stream of distractors if it appears within 200–500 msec of Target 1 (T1). This effect, known as the attentional blink (AB), has been central in characterizing the limits of humans’ ability to consciously perceive stimuli distributed across time. Here, we review theoretical accounts of the AB and examine how they explain key findings in the literature. We conclude that the AB arises from attentional demands of T1 for selection, working memory encoding, episodic registration, and response selection, which prevents this high-level central resource from being applied to T2 at short T1–T2 lags. T1 processing also transiently impairs the redeployment of these attentional resources to subsequent targets and the inhibition of distractors that appear in close temporal proximity to T2. Although these findings are consistent with a multifactorial account of the AB, they can also be largely explained by assuming that the activation of these multiple processes depends on a common capacity-limited attentional process for selecting behaviorally relevant events presented among temporally distributed distractors. Thus, at its core, the attentional blink may ultimately reveal the temporal limits of the deployment of selective attention.


Akyürek, E. G., Hommel, B. , & Jolicœur, P. (2007). Direct evidence for a role of working memory in the attentional blink. Memory & Cognition , 35 , 621–627.

Anderson, A. K. , & Phelps, E. A. (2001). Lesions of the human amygdala impair enhanced perception of emotionally salient events. Nature , 411 , 305–309.

Anderson, J. R. (2007). How can the human mind occur in the physical universe? New York: Oxford University Press.

Arend, L., Johnston, S. , & Shapiro, K. (2006). Task-irrelevant visual motion and flicker attenuate the attentional blink. Psychonomic Bulletin & Review , 13 , 600–607.

Arnell, K. M. , & Duncan, J. (2002). Separate and shared sources of dual-task cost in stimulus identification and response selection. Cognitive Psychology , 44 , 105–147.

Arnell, K. M., & Jenkins, R. (2004). Revisiting within-modality and cross-modality attentional blinks: Effects of target-distractor similarity. Perception & Psychophysics , 66 , 1147–1161.

Arnell, K. M. , & Jolicœur, P. (1999). The attentional blink across stimulus modalities: Evidence for central processing limitations. Journal of Experimental Psychology: Human Perception & Performance , 25 , 630–648.

Arnell, K. M., Killman, K. V., & Fijavz, D. (2007). Blinded by emotion: Target misses follow attention capture by arousing distractors in RSVP. Emotion , 7 , 465–477.

Arnell, K. M. , & Larson, J. M. (2002). Cross-modality attentional blinks without preparatory task-set switching. Psychonomic Bulletin & Review , 9 , 497–506.

Awh, E., Serences, J., Laurey, P., Dhaliwal, H., van der Jagt, T. , & Dassonville, P. (2004). Evidence against a central bottleneck during the attentional blink: Multiple channels for configural and featural processing. Cognitive Psychology , 48 , 95–126.

Baars, B. (1989). A cognitive theory of consciousness . New York: Cambridge University Press.

Bachmann, T. , & Hommuk, K. (2005). How backward masking becomes attentional blink: Perception of successive in-stream targets. Psychological Science , 16 , 740–742.

Baddeley, A. D., Thomson, N. , & Buchanan, M. (1975). Word length and the structure of short-term memory. Journal of Verbal Learning & Verbal Behavior , 14 , 575–589.

Berridge, C. W., & Waterhouse, B. D. (2003). The locus coeruleus-noradrenergic system: Modulation of behavioral state and state-dependent cognitive processes. Brain Research Reviews , 42 , 33–84.

Bowman, H. , & Wyble, B. P. (2007). The simultaneous type, serial token model of temporal attention and working memory. Psychological Review , 114 , 38–70.

Broadbent, D. E. (1958). Perception and communication . London: Pergamon.

Broadbent, D. E. , & Broadbent, M. H. P. (1987). From detection to identification: Response to multiple targets in rapid serial visual presentation. Perception & Psychophysics , 42 , 105–113.

Bundesen, C. (1990). A theory of visual attention. Psychological Review , 97 , 523–547.

Chartier, S., Cousineau, D. , & Charbonneau, D. (2004). A connectionist model of the attentional blink effect during a rapid serial visual task. In Proceedings of the 6th International Conference on Cognitive Modelling (pp. 64–69). Mahwah, NJ: Erlbaum.

Chua, F. K., Goh, J. , & Hon, N. (2001). Nature of codes extracted during the attentional blink. Journal of Experimental Psychology: Human Perception & Performance , 27 , 1229–1242.

Chun, M. M. (1997a). Temporal binding errors are redistributed by the attentional blink. Perception & Psychophysics , 59 , 1191–1199.

Chun, M. M. (1997b). Types and tokens in visual processing: A double dissociation between the attentional blink and repetition blindness. Journal of Experimental Psychology: Human Perception & Performance , 23 , 738–755.

Chun, M. M. , & Potter, M. C. (1995). A two-stage model for multiple target detection in rapid serial visual presentation. Journal of Experimental Psychology: Human Perception & Performance , 21 , 109–127.

Chun, M. M. , & Potter, M. C. (2001). The attentional blink and task-switching. In K. Shapiro (Ed.), The limits of attention: Temporal constraints in human information processing (pp. 20–35). Oxford: Oxford University Press.

Chun, M. M. , & Wolfe, J. M. (2001). Visual attention. In B. Goldstein (Ed.), Blackwell handbook of perception (pp. 272–310). Oxford: Blackwell.

Coltheart, V. (Ed.) (1999). Fleeting memories: Cognition of brief visual stimuli . Cambridge, MA: MIT Press.

Coltheart, V. , & Langdon, R. (1998). Recall of short word lists presented visually at fast rates: Effects of phonological similarity and word length. Memory & Cognition , 26 , 330–342.

Coltheart, V., Mondy, S., Dux, P. E. , & Stephenson, L. (2004). Effects of orthographic and phonological word length on memory for lists shown at RSVP and STM rates. Journal of Experimental Psychology: Learning, Memory, & Cognition , 30 , 815–826.

Colzato, L. S., Spapé, M., Pannebakker, M. M. , & Hommel, B. (2007). Working memory and the attentional blink: Blink size is predicted by individual differences in operation span. Psychonomic Bulletin & Review , 14 , 1051–1057.

Dehaene, S., Sergent, C., & Changeux, J. P. (2003). A neuronal network model linking subjective reports and objective physiological data during conscious perception. Proceedings of the National Academy of Sciences , 100 , 8520–8525.

Dell’Acqua, R., Jolicœur, P., Luria, R., & Pluchino, P. (2009). Reevaluating encoding-capacity limitations as a cause of the attentional blink. Journal of Experimental Psychology: Human Perception & Performance , 35 , 338–351.

Dell’Acqua, R., Pascali, A., Jolicœur, P., & Sessa, P. (2003). Four-dot masking produces the attentional blink. Vision Research , 43 , 1907–1913.

Dell’Acqua, R., Sessa, P., Jolicœur, P. , & Robitaille, N. (2006). Spatial attention freezes during the attentional blink. Psychophysiology , 43 , 394–400.

Desimone, R. , & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience , 18 , 193–222.

Di Lollo, V., Kawahara, J., Ghorashi, S. M. S., & Enns, J. T. (2005). The attentional blink: Resource depletion or temporary loss of control? Psychological Research , 69 , 191–200.

Donchin, E. (1981). Surprise!, Surprise? Psychophysiology , 18 , 493–513.

Donchin, E. , & Coles, M. G. H. (1988). Is the P300 component a manifestation of context updating? Behavioural & Brain Sciences , 11 , 357–374.

Drew, T. , & Shapiro, K. (2006). Representational masking and the attentional blink. Visual Cognition , 13 , 513–528.

Duncan, J. (1980). The locus of interference in the perception of simultaneous stimuli. Psychological Review , 87 , 272–300.

Duncan, J. , & Humphreys, G. (1989). Visual search and stimulus similarity. Psychological Review , 96 , 433–458.

Duncan, J., Martens, S. , & Ward, R. (1997). Restricted attentional capacity within but not between sensory modalities. Nature , 387 , 808–810.

Duncan, J., Ward, R., & Shapiro, K. (1994). Direct measurement of attentional dwell time in human vision. Nature , 369 , 313–315.

Dux, P. E., Asplund, C. L. , & Marois, R. (2008). An attentional blink for sequentially presented targets: Evidence in favor of resource depletion accounts. Psychonomic Bulletin & Review , 15 , 809–813.

Dux, P. E., Asplund, C. L. , & Marois, R. (2009). Both exogenous and endogenous target salience manipulations support resource depletion accounts of the attentional blink: A reply to Olivers, Spalek, Kawahara, and Di Lollo (2009). Psychonomic Bulletin & Review , 16 , 219–224.

Dux, P. E. , & Coltheart, V. (2005). The meaning of the mask matters: Evidence of conceptual interference in the attentional blink. Psychological Science , 16 , 775–779.

Dux, P. E., Coltheart, V. , & Harris, I. M. (2006). On the fate of distractor stimuli in rapid serial visual presentation. Cognition , 99 , 355–382.

Dux, P. E. , & Harris, I. M. (2007a). On the failure of distractor inhibition in the attentional blink. Psychonomic Bulletin & Review , 14 , 723–728.

Dux, P. E. , & Harris, I. M. (2007b). Viewpoint costs occur during consolidation: Evidence from the attentional blink. Cognition , 101 , 47–58.

Dux, P. E., & Marois, R. (2007). Repetition blindness is immune to the central bottleneck. Psychonomic Bulletin & Review , 14 , 729–734.

Dux, P. E., & Marois, R. (2008). Distractor inhibition predicts individual differences in the attentional blink. PLoS ONE , 3 , e3330.

Enns, J. T., Visser, T. A. W., Kawahara, J.-I., & Di Lollo, V. (2001). Visual masking and task switching in the attentional blink. In K. Shapiro (Ed.), The limits of attention: Temporal constraints in human information processing (pp. 65–81). Oxford: Oxford University Press.

Folk, C. L., Leber, A. B. , & Egeth, H. E. (2002). Made you blink! Contingent attentional capture produces a spatial blink. Perception & Psychophysics , 64 , 741–753.

Fragopanagos, N., Kockelkoren, S. , & Taylor, J. G. (2005). A neuro-dynamic model of the attentional blink. Cognitive Brain Research , 24 , 568–586.

Giesbrecht, B., Bischof, W. E. , & Kingstone, A. (2003). Visual masking during the attentional blink: Tests of the object substitution hypothesis. Journal of Experimental Psychology: Human Perception & Performance , 29 , 238–255.

Giesbrecht, B. , & Di Lollo, V. (1998). Beyond the attentional blink: Visual masking by object substitution. Journal of Experimental Psychology: Human Perception & Performance , 24 , 1454–1466.

Giesbrecht, B., Sy, J. L. , & Elliott, J. C. (2007). Electrophysiological evidence for both perceptual and post-perceptual selection during the attentional blink. Journal of Cognitive Neuroscience , 19 , 2005–2018.

Grandison, T. D., Ghirardelli, T. G. , & Egeth, H. E. (1997). Beyond similarity: Masking of the target is sufficient to cause the attentional blink. Perception & Psychophysics , 59 , 266–274.

Gross, J., Schmitz, F., Schnitzler, I., Kessler, K., Shapiro, K., Hommel, B., & Schnitzler, A. (2004). Long-range neural synchrony predicts temporal limitations of visual attention in humans. Proceedings of the National Academy of Sciences , 101 , 13050–13055.

Hommel, B., & Akyürek, E. (2005). Lag-1 sparing in the attentional blink: Benefits and costs of integrating two events into a single episode. Quarterly Journal of Experimental Psychology , 58A , 1415–1433.

Hommel, B., Kessler, K., Schmitz, F., Gross, J., Akyürek, E., Shapiro, K. , & Schnitzler, A. (2006). How the brain blinks: Towards a neurocognitive model of the attentional blink. Psychological Research , 70 , 425–435.

Isaak, M. I., Shapiro, K. L. , & Martin, J. (1999). The attentional blink reflects retrieval competition among multiple rapid serial visual presentation items: Tests of an interference model. Journal of Experimental Psychology: Human Perception & Performance , 25 , 1774–1792.

Jackson, M. C. , & Raymond, J. E. (2006). The role of attention and familiarity in face identification. Perception & Psychophysics , 68 , 543–557.

Jolicœur, P. (1985). The time to name disoriented natural objects. Memory & Cognition , 13 , 289–303.

Jolicœur, P. (1998). Modulation of the attentional blink by on-line response selection: Evidence from speeded and unspeeded Task 1 decisions. Memory & Cognition , 26 , 1014–1032.

Jolicœur, P. (1999). Concurrent response-selection demands modulate the attentional blink. Journal of Experimental Psychology: Human Perception & Performance , 25 , 1097–1113.

Jolicœur, P. , & Dell’Acqua, R. (1998). The demonstration of short-term consolidation. Cognitive Psychology , 32 , 138–202.

Jolicœur, P., Dell’Acqua, R., & Crebolder, J. M. (2001). The attentional blink bottleneck. In K. Shapiro (Ed.), The limits of attention: Temporal constraints in human information processing (pp. 82–99). Oxford: Oxford University Press.

Jolicœur, P., Sessa, P., Dell’Acqua, R. , & Robitaille, N. (2006a). Attentional control and capture in the attentional blink paradigm: Evidence from human electrophysiology. European Journal of Cognitive Psychology , 18 , 560–578.

Jolicœur, P., Sessa, P., Dell’Acqua, R., & Robitaille, N. (2006b). On the control of visual spatial attention: Evidence from human electrophysiology. Psychological Research , 70 , 414–424.

Kahneman, D. (1973). Attention and effort . Englewood Cliffs, NJ: Prentice Hall.

Kanwisher, N. (1987). Repetition blindness: Type recognition without token individuation. Cognition , 27 , 117–143.

Kawahara, J. -I., Enns, J. T. , & Di Lollo, V. (2006). The attentional blink is not a unitary phenomenon. Psychological Research , 70 , 405–413.

Kawahara, J. -I., Kumada, T. , & Di Lollo, V. (2006). The attentional blink is governed by a temporary loss of control. Psychonomic Bulletin & Review , 13 , 886–890.

Kawahara, J. -I., Zuvic, S. M., Enns, J. T. , & Di Lollo, V. (2003). Task switching mediates the attentional blink even without backward masking. Perception & Psychophysics , 65 , 339–351.

Kranczioch, C., Debener, S., Schwarzbach, J., Goebel, R., & Engel, A. K. (2005). Neural correlates of conscious perception in the attentional blink. NeuroImage , 24 , 704–714.

Landau, A. N. , & Bentin, S. (2008). Attentional and perceptual factors affecting the attentional blink for faces and objects. Journal of Experimental Psychology: Human Perception & Performance , 34 , 818–830.

Lavie, N. (1995). Perceptual load as a necessary condition for selective attention. Journal of Experimental Psychology: Human Perception & Performance , 21 , 451–468.

Lawrence, D. H. (1971). Two studies of visual search for word targets with controlled rates of presentation. Perception & Psychophysics , 10 , 85–89.

Luck, S. J., Vogel, E. K. , & Shapiro, K. L. (1996). Word meanings can be accessed but not reported during the attentional blink. Nature , 383 , 616–618.

Maki, W. S., Bussard, G., Lopez, K. , & Digby, B. (2003). Sources of interference in the attentional blink: Target-distractor similarity revisited. Perception & Psychophysics , 65 , 188–201.

Maki, W. S., Couture, T., Frigen, K., & Lien, D. (1997). Sources of the attentional blink during rapid serial visual presentation: Perceptual interference and retrieval competition. Journal of Experimental Psychology: Human Perception & Performance , 23 , 1393–1411.

Maki, W. S., Frigen, K. , & Paulson, K. (1997). Associative priming by targets and distractors during rapid serial visual presentation: Does word meaning survive the attentional blink? Journal of Experimental Psychology: Human Perception & Performance , 23 , 1014–1034.

Maki, W. S. , & Mebane, M. W. (2006). Attentional capture triggers an attentional blink. Psychonomic Bulletin & Review , 13 , 125–131.

Maki, W. S. , & Padmanabhan, G. (1994). Transient suppression of processing during rapid serial visual presentation: Acquired distinctiveness of probes modulates the attentional blink. Psychonomic Bulletin & Review , 1 , 499–504.

Marois, R., Chun, M. M., & Gore, J. C. (2004). A common parieto-frontal network is recruited under both low visibility and high perceptual interference conditions. Journal of Neurophysiology , 92 , 2985–2992.

Marois, R., & Ivanoff, J. (2005). Capacity limits of information processing in the brain. Trends in Cognitive Sciences , 9 , 296–305.

Marois, R., Yi, D. J. , & Chun, M. M. (2004). The neural fate of consciously perceived and missed events in the attentional blink. Neuron , 41 , 465–472.

Martens, S., Munneke, J., Smid, H. , & Johnson, A. (2006). Quick minds don’t blink: Electrophysiological correlates of individual differences in attentional selection. Journal of Cognitive Neuroscience , 18 , 1423–1438.

Martin, E. W., & Shapiro, K. L. (2008). Does failure to mask T1 cause lag-1 sparing in the attentional blink? Perception & Psychophysics , 70 , 562–570.

McAuliffe, S. P., & Knowlton, B. J. (2000). Dissociating the effects of featural and conceptual interference on multiple target processing in rapid serial visual presentation. Perception & Psychophysics , 62 , 187–195.

McLaughlin, E. N., Shore, D. I. , & Klein, R. M. (2001). The attentional blink is immune to masking-induced data limits. Quarterly Journal of Experimental Psychology , 54A , 169–196.

Miller, G. A. (2003). The cognitive revolution: A historical perspective. Trends in Cognitive Sciences , 7 , 141–144.

Most, S. B., Chun, M. M., Widders, D. M. , & Zald, D. H. (2005). Attentional rubbernecking: Cognitive control and personality in emotion induced blindness. Psychonomic Bulletin & Review , 12 , 654–661.

Neisser, U. (1967). Cognitive psychology . New York: Appelton-Century Crofts.

Nieuwenhuis, S., Aston-Jones, G. , & Cohen, J. D. (2005). Decision making, the P3, and the locus coeruleus-norepinephrine system. Psychological Bulletin , 131 , 510–532.

Nieuwenhuis, S., Gilzenrat, M. S., Holmes, B. D., & Cohen, J. D. (2005). The role of the locus coeruleus in mediating the attentional blink: A neurocomputational theory. Journal of Experimental Psychology: General , 134 , 291–307.

Nieuwenstein, M. R. (2006). Top-down controlled, delayed selection in the attentional blink. Journal of Experimental Psychology: Human Perception & Performance , 32 , 973–985.

Nieuwenstein, M. R., Chun, M. M., van der Lubbe, R. H. J. , & Hooge, I. T. C. (2005). Delayed attentional engagement in the attentional blink. Journal of Experimental Psychology: Human Perception & Performance , 31 , 1463–1475.

Nieuwenstein, M. R., & Potter, M. C. (2006). Temporal limits of selection and memory encoding: A comparison of whole versus partial report in rapid serial visual presentation. Psychological Science , 17 , 471–475.

Nieuwenstein, M. R., Potter, M. C., & Theeuwes, J. (2009). Unmasking the attentional blink. Journal of Experimental Psychology: Human Perception & Performance , 35 , 159–169.

Olivers, C. N. L. , & Meeter, M. (2008). A boost and bounce theory of temporal attention. Psychological Review , 115 , 836–863.

Olivers, C. N. L. , & Nieuwenhuis, S. (2005). The beneficial effect of concurrent task-irrelevant mental activity on temporal attention. Psychological Science , 16 , 265–269.

Olivers, C. N. L. , & Nieuwenhuis, S. (2006). The beneficial effects of additional task load, positive affect, and instruction on the attentional blink. Journal of Experimental Psychology: Human Perception & Performance , 32 , 364–379.

Olivers, C. N. L., Spalek, T. M., Kawahara, J. -I. , & Di Lollo, V. (2009). The attentional blink: Increasing target salience provides no evidence for resource depletion. A commentary on Dux, Asplund, and Marois (2008). Psychonomic Bulletin & Review , 16 , 214–218.

Olivers, C. N. L., van der Stigchel, S. , & Hulleman, J. (2007). Spreading the sparing: Against a limited-capacity account of the attentional blink. Psychological Research , 71 , 126–139.

Olivers, C. N. L., & Watson, D. G. (2006). Input control processes in rapid serial visual presentations: Target selection and distractor inhibition. Journal of Experimental Psychology: Human Perception & Performance , 32 , 1083–1092.

Olson, I. R., Chun, M. M. , & Anderson, A. K. (2001). Effects of phonological length on the attentional blink for words. Journal of Experimental Psychology: Human Perception & Performance , 27 , 1116–1123.

Ouimet, C., & Jolicœur, P. (2007). Beyond Task 1 difficulty: The duration of T1 encoding modulates the attentional blink. Visual Cognition , 15 , 290–304.

Pashler, H. (1994). Dual-task interference in simple tasks: Data and theory. Psychological Bulletin , 116 , 220–244.

Pashler, H. (1998). The psychology of attention . Cambridge, MA: MIT Press.

Potter, M. C. (1975). Meaning in visual search. Science , 187 , 965–966.

Potter, M. C. (1976). Short-term conceptual memory for pictures. Journal of Experimental Psychology: Human Learning & Memory , 2 , 509–522.

Potter, M. C. (1993). Very short-term conceptual memory. Memory & Cognition , 21 , 156–161.

Potter, M. C., Chun, M. M., Banks, B. S. , & Muckenhoupt, M. (1998). Two attentional deficits in serial target search: The visual attentional blink and an amodal task-switch deficit. Journal of Experimental Psychology: Learning, Memory, & Cognition , 24 , 979–992.

Potter, M. C., Dell’Acqua, R., Pesciarelli, F., Job, R., Peressotti, F., & O’Connor, D. H. (2005). Bidirectional semantic priming in the attentional blink. Psychonomic Bulletin & Review , 12 , 460–465.

Potter, M. C., & Faulconer, B. A. (1975). Time to understand pictures and words. Nature , 253 , 437–438.

Potter, M. C., & Levy, E. I. (1969). Recognition memory for a rapid sequence of pictures. Journal of Experimental Psychology , 81 , 10–15.

Potter, M. C., Staub, A. , & O’Connor, D. H. (2002). The time course of competition for attention: Attention is initially labile. Journal of Experimental Psychology: Human Perception & Performance , 28 , 1149–1162.

Raymond, J. E., Shapiro, K. L. , & Arnell, K. M. (1992). Temporary suppression of visual processing in an RSVP task: An attentional blink? Journal of Experimental Psychology: Human Perception & Performance , 18 , 849–860.

Raymond, J. E., Shapiro, K. L. , & Arnell, K. M. (1995). Similarity determines the attentional blink. Journal of Experimental Psychology: Human Perception & Performance , 21 , 653–662.

Reeves, A. , & Sperling, G. (1986). Attention gating in short-term visual memory. Psychological Review , 93 , 180–206.

Ruthruff, E., & Pashler, H. E. (2001). Perceptual and central interference in dual-task performance. In K. Shapiro (Ed.), The limits of attention: Temporal constraints in human information processing (pp. 100–123). Oxford: Oxford University Press.

Seiffert, A. , & Di Lollo, V. (1997). Low-level masking in the attentional blink. Journal of Experimental Psychology: Human Perception & Performance , 23 , 1061–1073.

Sergent, C., Baillet, S. , & Dehaene, S. (2005). Timing of the brain events underlying access to consciousness during the attentional blink. Nature Neuroscience , 8 , 1391–1400.

Sergent, C. , & Dehaene, S. (2004). Is consciousness a gradual phenomenon? Evidence for an all-or-none bifurcation during the attentional blink. Psychological Science , 15 , 720–728.

Shapiro, K. L., Arnell, K. M. , & Raymond, J. E. (1997). The attentional blink. Trends in Cognitive Sciences , 1 , 291–296.

Shapiro, K. L., Caldwell, J. , & Sorensen, R. E. (1997). Personal names and the attentional blink: A visual “cocktail party” effect. Journal of Experimental Psychology: Human Perception & Performance , 23 , 504–514.

Shapiro, K. L., Driver, J., Ward, R. , & Sorensen, R. E. (1997). Priming from the attentional blink:A failure to extract visual tokens but not visual types. Psychological Science , 8 , 95–100.

Shapiro, K. L. , & Raymond, J. E. (1994). Temporal allocation of visual attention: Inhibition or interference? In D. Dagenbach & T. H. Carr (Eds.), Inhibitory mechanisms in attention, memory and language (pp. 151–188). Boston: Academic Press.

Shapiro, K. L., Raymond, J. E. , & Arnell, K. M. (1994). Attention to visual pattern information produces the attentional blink in rapid serial visual presentation. Journal of Experimental Psychology: Human Perception & Performance , 20 , 357–371.

Shapiro, K. L., Schmitz, F., Martens, S., Hommel, B. , & Schnitzler, A. (2006). Resource sharing in the attentional blink. NeuroReport , 17 , 163–166.

Shiffrin, R. M. , & Schneider, W. (1977). Controlled and automatic human information processing: II. Perceptual learning, automatic attending, and a general theory. Psychological Review , 84 , 127–190.

Shih, S. I. (2008). The attention cascade model and the attentional blink. Cognitive Psychology , 56 , 210–236.

Shore, D. I., McLaughlin, E. N. , & Klein, R. (2001). Modulation of the attentional blink by differential resource allocation. Canadian Journal of Experimental Psychology , 55 , 318–324.

Smith, S. D., Most, S. B., Newsome, L. A. , & Zald, D. H. (2006). An “emotional blink” of attention elicited by aversively conditioned stimuli. Emotion , 6 , 523–527.

Taatgen, N. A., Juvina, I., Schipper, M., Borst, J. P., & Martens, S. (2009). Too much control can hurt: A threaded cognition model of the attentional blink. Cognitive Psychology , 59 , 1–29. doi:10.1016/j.cogpsych.2008.12.002

Taylor, J. G. , & Rogers, M. (2002). A control model of the movement of attention. Neural Networks , 15 , 309–326.

Thorpe, S., Fize, D., & Marlot, C. (1996). Speed of processing in the human visual system. Nature , 381 , 520–522.

Treisman, A. M. (1969). Strategies and models of selective attention. Psychological Review , 76 , 282–299.

Visser, T. A. W. (2007). Masking T1 difficulty: Processing time and the attentional blink. Journal of Experimental Psychology: Human Perception & Performance , 33 , 285–297.

Visser, T. A. W., Bischof, W. F. , & Di Lollo, V. (1999). Attentional switching in spatial and nonspatial domains: Evidence from the attentional blink. Psychological Bulletin , 125 , 458–469.

Vogel, E. K. , & Luck, S. J. (2002). Delayed working memory consolidation during the attentional blink. Psychonomic Bulletin & Review , 9 , 739–743.

Vogel, E. K., Luck, S. J. , & Shapiro, K. L. (1998). Electrophysiological evidence for a postperceptual locus of suppression during the attentional blink. Journal of Experimental Psychology: Human Perception & Performance , 24 , 1656–1674.

Vul, E., Nieuwenstein, M. , & Kanwisher, N. (2008). Temporal selection is suppressed, delayed, and diffused during the attentional blink. Psychological Science , 19 , 55–61.

Ward, R., Duncan, J. , & Shapiro, K. (1996). The slow time-course of visual attention. Cognitive Psychology , 30 , 79–109.

Ward, R., Duncan, J. , & Shapiro, K. (1997). Effects of similarity, difficulty, and nontarget presentation on the time course of visual attention. Perception & Psychophysics , 59 , 593–600.

Wee, S. , & Chua, E K. (2004). Capturing attention when attention blinks. Journal of Experimental Psychology: Human Perception & Performance , 30 , 598–612.

Weichselgartner, E. , & Sperling, G. (1987). Dynamics of automatic and controlled visual attention. Science , 238 , 778–780.

Williams, M. A., Visser, T. A., Cunnington, R. , & Mattingley, J. B. (2008). Attenuation of neural responses in primary visual cortex during the attentional blink. Journal of Neuroscience , 8 , 9890–9894.

Wyble, B., Bowman, H. , & Nieuwenstein, M. (2009). The attentional blink provides episodic distinctiveness: Sparing at a cost. Journal of Experimental Psychology: Human Perception & Performance , 35 , 787–807.

Author information

Authors and Affiliations

Vanderbilt University, Nashville, Tennessee

Paul E. Dux & René Marois

School of Psychology, University of Queensland, 463 McElwain Building, St Lucia, QLD 4072, Brisbane, Queensland, Australia

Paul E. Dux

Corresponding author

Correspondence to Paul E. Dux.

Additional information

This work was supported by an ARC grant (DP0986387) to P.E.D. and NIMH (R01 MH70776) and NSF (0094992) grants to R.M.

About this article

Dux, P. E., Marois, R. The attentional blink: A review of data and theory. Attention, Perception, & Psychophysics 71, 1683–1700 (2009). https://doi.org/10.3758/APP.71.8.1683

Received: 22 October 2008

Accepted: 06 June 2009

Issue Date: November 2009

DOI: https://doi.org/10.3758/APP.71.8.1683

Keywords

  • Stimulus Onset Asynchrony
  • Attentional Blink
  • Rapid Serial Visual Presentation
  • Repetition Blindness
  • Attentional Blink Magnitude



