Marios G. Philiastides, Paul Sajda, Temporal Characterization of the Neural Correlates of Perceptual Decision Making in the Human Brain, Cerebral Cortex, Volume 16, Issue 4, April 2006, Pages 509–518, https://doi.org/10.1093/cercor/bhi130
Abstract
Single and multi-unit recordings in primates have identified spatially localized neuronal activity correlating with an animal's behavioral performance. Due to the invasive nature of these experiments, it has been difficult to identify such correlates in humans. We report the first non-invasive neural measurements of perceptual decision making, via single-trial EEG analysis, that lead to neurometric functions predictive of psychophysical performance for a face versus car categorization task. We identified two major discriminating components. The earliest correlating with psychophysical performance was consistent with the well-known face-selective N170. The second component, which was a better match to the psychometric function, did not occur until at least 130 ms later. As evidence for faces versus cars decreased, onset of the later, but not the earlier, component systematically shifted forward in time. In addition, a choice probability analysis indicated strong correlation between the neural responses of the later component and our subjects' behavioral judgements. These findings demonstrate a temporal evolution of component activity indicative of an evidence accumulation process which begins after early visual perception and has a processing time that depends on the strength of the evidence.
Introduction
Identifying neural activity directly responsible for perceptual decision making is a major challenge for systems and cognitive neuroscience. A number of investigators have studied the neural correlates of decision making in awake behaving animals, in particular primates, where single and multi-unit recordings have been analyzed using signal detection theory (Green and Swets, 1966) and subsequently correlated with the animal's observed behavior/decisions. For instance, Newsome and colleagues (Newsome et al., 1989; Britten et al., 1992, 1996) used visual stimuli consisting of varying amounts of coherent motion and showed that neurometric functions constructed from the activity of specific individual and small populations of neurons were indistinguishable from the animal's psychometric functions. They also computed choice probabilities and showed that the neural responses had a small but significant association with the animal's decisions. This approach of comparing neurometric and psychometric functions and considering choice probabilities enables one to directly address the decision making process since it explicitly relates the variability of the neural activity to the variability observed in the behavioral response. The technique has been applied in a variety of perceptual decision making paradigms, including discrimination of visual objects such as faces (Keysers et al., 2001) and tactile discrimination tasks (Hernandez et al., 2000; Romo et al., 2002). The approach, though powerful, has been limited to animal studies which use invasive recordings of single-trial neural activities. Yet to be demonstrated is whether decision making could be studied in a similar fashion, though under the constraint that single-trial neural activity be measured non-invasively in humans.
Decision making during face processing has been extensively studied in both primates and humans, in particular within the context of face categorization and identification. Human subject studies using trial-averaged electroencephalography (EEG) have identified waveforms at specific times (e.g. N170, N200) that are well-correlated with the presentation of faces compared with nonface objects (Jeffreys, 1996; Halgren et al., 2000; Liu et al., 2000; Rossion et al., 2003). Recent studies using magnetoencephalography (MEG) (Liu et al., 2002) have found earlier trial-averaged activity (M100) which is correlated with face categorization but not identification of individual faces, suggesting a two-stage processing strategy in face perception. However, these experiments do not directly address the detailed temporal aspects of the decision making process during human face perception since they do not consider single-trial variability of the activity relative to the variability of the response.
Keysers et al. (2001), using rapid serial visual presentation (RSVP), identify activity of individual neurons in macaque temporal cortex which predicts whether the monkey responds that it saw a face in the stimulus. These investigators attempt to relate this neural activity to human subject performance by comparing neurometric functions for single neurons in macaque with psychometric functions of human subjects doing a similar task. Their comparison, though indirect and qualitative, indicates that the monkey's neurometric function has a similar shape to the human subject's psychometric function, though the curves themselves are quantitatively very different. This approach is problematic, for activities and decisions are not only being compared across species but also across subjects and different experimental sessions. Thus the inter-subject and intra-subject variability in the decision making process cannot be captured/measured. Rather, this would require simultaneous measurement of the neuronal and psychophysical performance while the human subject performs the task.
In this paper we used single-trial analysis of the EEG to identify the cortical correlates of decision making during face perception in human subjects. We used a machine learning approach to identify linear spatial weightings of the EEG sensors for specific temporal windows which optimally discriminate between target (faces) and non-target (cars) trials during a simple categorization task. From these discriminating components we constructed neurometric curves as a function of stimulus evidence (phase coherence) and compared them to subjects' psychometric functions. We analyzed the temporal characteristics of those components which were strongly correlated with psychophysical performance and considered how their temporal onset was affected by the stimulus evidence for the decision.
Materials and Methods
Subjects
Six people (three females and three males, age range 21–37 years) participated in the study. All had normal or corrected to normal vision and reported no history of neurological problems. Informed consent was obtained from all participants in accordance with the guidelines and approval of the Columbia University Institutional Review Board.
Stimuli
Behavioral Paradigm
Subjects performed a simple categorization task where they had to discriminate between images of faces and cars. Within a block of trials, face and car images over a range of phase coherences were presented in random order. The range of phase coherence levels was chosen to span psychophysical threshold. All subjects performed nearly perfectly at the highest phase coherence but performed near chance for the lowest one. Subjects reported their decision regarding the type of image by pressing one of two mouse buttons — left for faces and right for cars — using their right index and middle fingers respectively. A block of trials consisted of 24 trials of both face and car images at each of six different phase coherence levels, a total of 144 trials. There were a total of four blocks in each experiment. At the beginning of a block of trials subjects fixated at the center of the screen. Images were presented for 30 ms followed by an inter-stimulus-interval (ISI) which was randomized in the range of 1500–2000 ms. Subjects were instructed to respond as soon as they identified the type of image and before the next image was presented. A schematic representation of the behavioral paradigm is given in Figure 1. Trials where subjects failed to respond within the ISI were marked as no-choice trials and were discarded from further analysis.
Data Acquisition
EEG data was acquired simultaneously in an electrostatically shielded room (ETS-Lindgren, Glendale Heights, IL) using a Sensorium EPA-6 Electrophysiological Amplifier (Charlotte, VT) from 60 Ag/AgCl scalp electrodes and from three periocular electrodes placed below the left eye and at the left and right outer canthi. All channels were referenced to the left mastoid with input impedance <15kΩ and chin ground. Data were sampled at 1000 Hz with an analog pass band of 0.01–300 Hz using 12 dB/octave high pass and eighth-order Elliptic low pass filters. Subsequently, a software based 0.5 Hz high pass filter was used to remove DC drifts and 60 and 120 Hz (harmonic) notch filters were applied to minimize line noise artifacts. These filters were designed to be linear-phase to minimize delay distortions. Motor response and stimulus events recorded on separate channels were delayed to match latencies introduced by digitally filtering the EEG.
Movement Artifact Removal
Prior to the main experiment, subjects completed an eye muscle calibration experiment during which they were instructed to blink repeatedly upon the appearance of a white-on-black fixation cross and then to make several horizontal and vertical saccades according to the position of the fixation cross subtending 1° × 1° of visual field. Horizontal saccades subtended 33° and vertical saccades subtended 22°. The timing of these visual cues was recorded with EEG. This enabled us to determine linear components associated with eye blinks and saccades (using principal component analysis) that were subsequently projected out of the EEG recorded during the main experiment (Parra et al., 2003). Trials with strong eye movement or other movement artifacts were manually removed by inspection. There were at least 40 artifact-free trials for any given condition (i.e. at least 80 trials for both sets of face and car trials at each phase coherence level).
Data Analysis
At a given coherence level we also constructed discriminant component maps. We aligned our trials to the onset of visual stimulation as shown in Figure 2a (black vertical line) and sorted them by reaction time (sigmoidal curves). Each row of the discriminant component map represents a single trial across time. Discriminant components are represented by the y vectors. To construct this map we chose a training window, indicated by the white vertical bars (for this example starting at 180 ms post-stimulus), during which we trained our linear discriminator to estimate the weighting vector w for all of our sensors in X, such that y is maximally discriminating between face and car trials. Once the w vector has been estimated using data derived only from the training window, we applied w to the data across all trials and all time. The resultant discriminant component map is shown in Figure 2a. Red represents positive and blue negative activity. For this example the resultant discriminating component appears ∼180 ms after the onset of visual stimulation and it is stimulus-locked. Response-locked components are expected to have a sigmoidal profile similar to the subject's actual response times.
Training the discriminator over several temporal windows allowed us to visualize the temporal evolution of the discriminating component activity. We used a coarse-to-fine approach to identify the two most discriminating time windows at every phase coherence level. Initially we trained the discriminator while sliding our training window in non-overlapping segments of 30 ms from the onset of visual stimulation to the earliest response time. We subsequently re-trained the discriminator by sliding the window in finer steps (10 ms) only around the two most discriminating regions as identified by the coarser search.
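The coarse-to-fine search described above can be sketched as follows. This is a minimal illustration, not the authors' code: the callback `score`, the function name `coarse_to_fine` and the `radius` of the fine search are assumptions introduced here, and `score(onset_ms)` stands in for "train the discriminator on a window starting at `onset_ms` and return its performance".

```python
from typing import Callable, List


def coarse_to_fine(score: Callable[[int], float], t_start: int, t_end: int,
                   coarse_step: int = 30, fine_step: int = 10,
                   radius: int = 30, n_best: int = 2) -> List[int]:
    """Return the n_best window onsets (ms) found by a two-pass search.

    score(onset_ms) is a hypothetical callback that returns the
    discriminator's performance for a window starting at onset_ms.
    """
    # Pass 1: coarse steps from stimulus onset up to (not including) t_end.
    coarse = list(range(t_start, t_end, coarse_step))
    best_coarse = sorted(coarse, key=score, reverse=True)[:n_best]

    # Pass 2: finer steps in a neighbourhood of each coarse winner.
    refined = []
    for centre in best_coarse:
        lo = max(t_start, centre - radius)
        hi = min(t_end, centre + radius)
        refined.append(max(range(lo, hi + 1, fine_step), key=score))
    return sorted(refined)
```

Note that this sketch does not stop the two coarse winners from falling on the same peak; a fuller version would enforce a minimum separation between the two regions.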
Taking advantage of the linearity of our model, we subsequently used the forward model to project this discriminating activity back to the sensors. The resultant scalp projections a are shown in Figure 2b and are used for interpreting the neuroanatomical significance of the resultant discriminating component.
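As a rough illustration of how a linear model maps between sensors and component, the sketch below first applies a weighting vector w to the sensor data X to obtain y, and then computes a scalp projection a by regressing each sensor onto y, i.e. a = Xy / (yᵀy). This least-squares form is one common way to define such a forward model and is an assumption here, since the text does not give the formula; the function names are likewise hypothetical.

```python
from typing import List

Matrix = List[List[float]]  # rows = sensors, columns = time samples


def project(w: List[float], X: Matrix) -> List[float]:
    """Discriminating component y(t) = sum_i w[i] * X[i][t]."""
    n_samples = len(X[0])
    return [sum(w_i * row[t] for w_i, row in zip(w, X)) for t in range(n_samples)]


def forward_model(X: Matrix, y: List[float]) -> List[float]:
    """Scalp projection a = X y / (y . y): one coupling value per sensor."""
    energy = sum(v * v for v in y)
    return [sum(x_t * y_t for x_t, y_t in zip(row, y)) / energy for row in X]
```

Under this definition a sensor that carries the component with opposite polarity receives a negative entry in a, which matches the red/blue (positive/negative) convention used for the scalp maps.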
We quantified the performance of the linear discriminator by the area under the receiver operator characteristic (ROC) curve, referred to as Az, with a leave-one-out approach (Duda et al., 2001). We used the ROC Az metric to characterize the discrimination performance while sliding our training window from stimulus onset to response time (varying τ). Finally in order to assess the significance of the resultant discriminating component we used a bootstrapping technique to compute an Az value leading to a significance level of P = 0.01. Specifically we computed a significance level for Az by performing the leave-one-out test after randomizing the truth labels of our face and car trials. We repeated this randomization process 100 times to produce an Az randomization distribution and compute the Az leading to a significance level of P = 0.01 shown in Figure 2b by the dotted red line.
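The Az metric and the label-randomization threshold can be sketched as below. The sketch makes simplifying assumptions: it scores discriminator outputs that are already computed (it does not repeat the leave-one-out training inside each randomization, as the procedure above does), and the function names, the default seed and the percentile indexing are choices made for this illustration only.

```python
import random
from typing import List, Sequence


def roc_area(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the probability that a
    label-1 trial scores higher than a label-0 trial (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_threshold(scores: Sequence[float], labels: Sequence[int],
                          n_perm: int = 100, alpha: float = 0.01,
                          seed: int = 0) -> float:
    """Az value exceeded by roughly a fraction alpha of label-shuffled runs."""
    rng = random.Random(seed)
    shuffled: List[int] = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between score and label
        null.append(roc_area(scores, shuffled))
    null.sort()
    index = min(n_perm - 1, int((1.0 - alpha) * n_perm))
    return null[index]
```

An observed Az above the returned threshold would then be treated as significant at the chosen level; with only 100 randomizations the P = 0.01 threshold rests on the extreme tail of the null distribution.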
Psychometric and Neurometric Functions
Choice Probabilities
We computed choice probabilities at selected phase coherence levels for all our subjects. We labeled all trials based on the type of response/choice rather than the type of stimulus (i.e. all face responses – correct face and incorrect car trials – versus all car responses – correct car and incorrect face trials). We then repeated the ROC analysis described in Data Analysis and classified between ‘face’ and ‘car’ choice trials such that the new set of Az values now represented choice probabilities.
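The relabeling step can be illustrated with a short sketch. It assumes each trial is described by its stimulus category and whether the response was correct, and that y holds one discriminator output per trial; the function names and data layout are hypothetical.

```python
from typing import List, Sequence


def choice_labels(stimulus: Sequence[str], correct: Sequence[bool]) -> List[str]:
    """Recover the reported category: a correct trial keeps the stimulus
    label, an incorrect trial gets the opposite one."""
    other = {"face": "car", "car": "face"}
    return [s if ok else other[s] for s, ok in zip(stimulus, correct)]


def choice_probability(y: Sequence[float], choices: Sequence[str]) -> float:
    """ROC area between 'face'-choice and 'car'-choice trials (ties count 0.5)."""
    face = [v for v, c in zip(y, choices) if c == "face"]
    car = [v for v, c in zip(y, choices) if c == "car"]
    wins = sum((f > c) + 0.5 * (f == c) for f in face for c in car)
    return wins / (len(face) * len(car))
```

A value of 0.5 means the component output carries no information about the upcoming choice, while 1.0 means it separates 'face' from 'car' choices perfectly.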
Statistical Tests
Results
EEG-derived Neurometric Functions
We measured the psychophysical performance of six subjects on a face versus car categorization task (Fig. 1a) while simultaneously recording neuronal activity using a high-density EEG electrode array. We changed the stimulus evidence by manipulating the phase coherence of our images (Fig. 1b). Within a block of trials, face and car images over a range of phase coherences were presented in random order. All subjects performed nearly perfectly at the highest phase coherence but performed near chance at the lowest coherence.
We compared the EEG activity obtained for the two types of images at each phase coherence level on a single-trial basis in order to capture the response variability which is normally concealed by trial averaging. Using a linear discriminator which integrates EEG over space rather than across trials, we identified components that maximally discriminated between the two experimental conditions. We used ROC analysis to quantify the discriminator's performance. Furthermore, taking advantage of the linearity of our model, we computed sensor projections of the discriminating component activity. These scalp projections can provide a forward model for interpreting the neuroanatomical significance of the resultant discriminating components.
At each phase coherence level we identified the two most discriminating components in the interval between the onset of the visual stimulation and the earliest reaction time. In order to characterize the neuronal performance at these two times, in a manner compatible with the description of the psychophysical sensitivity as captured by the psychometric functions (Green and Swets, 1966), we constructed neurometric functions by plotting the area under the ROC curves (Az values) against the corresponding phase coherence levels and fitting the data with Weibull functions (Quick, 1974). We trained our linear discriminator at these two temporal windows separately and combined. When the discriminator was trained with data integrated across both time windows, we generally observed improved discrimination performance and hence higher Az values. Figure 3 shows comparisons of the psychometric and neurometric functions for all subjects in our dataset.
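A neurometric (or psychometric) curve of this kind can be sketched as below. The exact Weibull parameterization is not given in the text, so the two-parameter form p(c) = 1 − 0.5·exp(−(c/α)^β), which runs from chance (0.5) to perfect performance, is an assumption; the grid-search least-squares fit is a simple stand-in for whatever fitting procedure was actually used, and the grid ranges are arbitrary.

```python
import math
from typing import Sequence, Tuple


def weibull(c: float, alpha: float, beta: float) -> float:
    """Two-alternative Weibull: 0.5 at zero evidence, approaching 1.0."""
    return 1.0 - 0.5 * math.exp(-((c / alpha) ** beta))


def fit_weibull(coherence: Sequence[float],
                performance: Sequence[float]) -> Tuple[float, float]:
    """Least-squares grid search over threshold alpha and slope beta."""
    best = (float("inf"), 0.0, 0.0)
    for a in range(10, 101):           # alpha: 10..100 (% coherence, step 1)
        for b10 in range(5, 101):      # beta: 0.5..10.0 (step 0.1)
            alpha, beta = float(a), b10 / 10.0
            err = sum((weibull(c, alpha, beta) - p) ** 2
                      for c, p in zip(coherence, performance))
            if err < best[0]:
                best = (err, alpha, beta)
    return best[1], best[2]
```

For example, `fit_weibull(levels, az_values)` would return the threshold α (the coherence at which performance reaches about 82%) and the slope β for one subject; the same function could be applied to the behavioral proportions correct.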
In order to quantify the degree of similarity between the psychometric and neurometric functions and demonstrate that our EEG-derived neurometric functions can account for psychophysical performance, we fit the best single Weibull function jointly to the two data sets in addition to the individual fits. We subsequently used a likelihood-ratio test (Hoel et al., 1971) which showed that for all our subjects a single function can fit the behavioral and neuronal data sets as well as the two separate functions. We concluded that these neurometric functions can be used to quantify the relationship between the neuronal signals and behavior by isolating the components that are associated with the perceptual discrimination.
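The logic of the likelihood-ratio comparison can be written out as follows. This is a generic statement of the test, assuming each Weibull fit has the same number of free parameters; the symbols are introduced here for illustration and are not taken from the text.

```latex
% L_joint : likelihood of both data sets under a single shared Weibull fit
% L_psy, L_neuro : likelihoods under separate fits to behavioral and neuronal data
\lambda = -2\left[\ln L_{\mathrm{joint}} - \left(\ln L_{\mathrm{psy}} + \ln L_{\mathrm{neuro}}\right)\right]
% Under the null hypothesis that one function describes both data sets,
% \lambda is approximately \chi^2-distributed with
% k_{\mathrm{separate}} - k_{\mathrm{joint}} degrees of freedom
% (e.g. 4 - 2 = 2 for two-parameter Weibull functions).
```

A small, non-significant λ means the separate fits do not explain the data appreciably better than the joint fit, which is the sense in which the psychometric and neurometric functions are called statistically indistinguishable.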
Early and Late Face-selective Responses
In the time interval between the stimulus onset and the earliest response time we identified two face-selective components that were maximally discriminating between face and car trials. To visualize the temporal evolution of these components we constructed discriminant component maps. Figure 4 summarizes the complete results for one subject. The early component was consistent with the well-known N170 and its temporal onset appeared to be consistent across all subjects. The late component, which was of opposite sign, appeared on average 130 ms after the first at the highest coherence level. Its temporal onset varied across subjects in the range 300–450 ms from stimulus onset across all coherence levels.
As can be seen by the single-trial component projections (Fig. 4) from discrimination of stimulus-locked face versus car trials, both face-selective responses appeared to be more correlated with the onset of visual stimulation (black vertical line) rather than the response (sigmoidal curves). To verify this point with respect to possible bias in our stimulus locked analysis, we reanalyzed the results response locked and found the components remained strongly stimulus locked (not shown). This observation indicated that the discriminating activity is not directly predictive of reaction time; rather, it appears to be related to the stimuli reaching a perceptual level of processing (Super et al., 2001).
In addition we observed that the late face-selective component resulted in a better match to the psychophysical data as shown in Figure 3. In fact, for four of our six subjects (Fig. 3a–d) using the Az values obtained only from the late training window to construct our neurometric function was sufficient to show that the psychometric and neurometric functions were statistically indistinguishable. Interestingly this was never true when neurometric functions were derived solely based on analysis from the early face-selective N170 component.
Evidence Changes Onset of Late Component
We investigated the relationship between the temporal onset of the early and late face-selective responses and task difficulty. The most discriminating time window at each phase coherence level for each one of the two face-selective temporal components appears on the top of each set of projections in Figure 4. We used these times to study the extent to which these face-selective components systematically shifted in time. We used a bootstrapping method to identify all significantly discriminating components (P < 0.01; red dotted line). Only times from these statistically significant components were used for this analysis. We regressed a line through these time points for each one of the face-selective components and computed their corresponding slopes (Fig. 5a). We found that for the early component (N170) there was no significant shift in time, as the slopes did not differ significantly from zero (two-tailed t-test, P > 0.45). On the other hand, the optimal onset of the late face-selective component had a systematic forward shift in time as the task became more difficult and the subjects took longer to respond. The slopes for the late component were statistically less than zero (left-tailed t-test, P < 0.007) and also statistically more negative than the slopes of the earlier component (right-tailed, paired t-test, P < 0.005).
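The slope analysis can be sketched in a few lines. The sketch fits an ordinary least-squares line to component onset time against phase coherence for one subject, then forms a one-sample t statistic across subjects' slopes; the function names are hypothetical, and converting the statistic to a P value (which needs the t distribution) is left out.

```python
import math
from typing import Sequence


def slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Least-squares slope of y (onset time, ms) against x (% coherence)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def t_statistic(values: Sequence[float], mu: float = 0.0) -> float:
    """One-sample t statistic for the mean of values against mu."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    return (mean - mu) / math.sqrt(var / n)
```

A negative slope means the component appears later as coherence decreases, so a strongly negative t statistic across subjects corresponds to the systematic shift reported for the late component.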
These findings seem to suggest an early component related to bottom-up processing of the stimulus and a late component responsible for the evaluation of the evidence (Shadlen and Newsome, 1999; Keysers and Perrett, 2002), occurring later for more ambiguous stimuli. In addition, the timing of the second component (300–450 ms post-stimulus) is consistent with previous findings suggesting recurrent processing of the stimulus/evidence (Super et al., 2001; VanRullen and Koch, 2003), with an average reverberation time of 130 ms. Such reverberatory activity is likely to reflect the integration of information that underlies perception considering that recurrent/feedback connections are shown to mediate processes such as perceptual organization, attention and visual awareness (Lamme et al., 1998; Hupe et al., 1998; Super et al., 2001).
Association between Neuronal Responses and Behavioral Decisions
To address the possibility that the neural responses associated with the discriminating components are correlated with our subjects' choices, we employed a method based on signal detection theory, analogous to the ROC analysis used earlier, to compute choice probabilities as in Britten et al. (1996). Unlike traditional uses of signal detection theory, however, which establish the relationship between the stimulus and neural responses, this alternative formulation quantifies a relationship between neuronal activity and a subject's choices/decisions. A choice probability value of 0.5 represents chance performance and a value of 1.0 represents perfect association between neuronal and behavioral responses. In order for the choice probability metric to be meaningful, neural responses from stimuli near threshold are to be used (so that the subjects make a useful number of errors on the psychophysical task).
We pooled data across two coherence levels (30 and 35%) that were near threshold where subjects made both ‘face’ and ‘car’ choices in response to any particular stimulus, and computed a choice probability value for every subject. We computed three sets of choice probabilities using neuronal data from (i) the early component, (ii) the late component and (iii) the early and late components combined. The results are summarized in Table 1. To assess the significance of these choice probabilities we employed a bootstrap technique where we randomly permuted the trial labels 500 times and computed choice probability distributions for every subject. This permutation test ensured that the association between the neuronal and behavioral responses was abolished, while the distributions of neuronal and behavioral judgements remained untouched. We then checked if the observed choice probability values were outside the 95% confidence intervals of these distributions, in which case we concluded that they were statistically significant.
Table 1. Choice probabilities for each subject
Window | S1 | S2 | S3 | S4 | S5 | S6 |
---|---|---|---|---|---|---|
Early window | 0.58 | 0.65* | 0.64* | 0.52 | 0.73** | 0.61 |
Late window | 0.74** | 0.63** | 0.65** | 0.76** | 0.81** | 0.61* |
Early + late windows | 0.69** | 0.68** | 0.69** | 0.74** | 0.82** | 0.64** |
*Statistically significant values as identified by a permutation test (values outside the 95% confidence interval).
**Choice probabilities outside the 99% confidence interval.
Only three out of six subjects had a choice probability significantly greater than chance when data from only the earlier component were considered. Interestingly, however, the observed choice probabilities for the late component were shown to be statistically significant for all six subjects. These results clearly demonstrate that the neuronal responses, especially of the later component, for all our subjects had a significant positive association with their behavioral choices. Taken together, this finding and the systematic forward shift of the second component as evidence decreased suggest that the second component may be associated with the actual decision making process, or at the very least reflect an intermediate stage of perceptual processing that is situated between purely sensory and decision stages (Super et al., 2001).
Spatial Distribution of Activity and the Importance of Spatial Integration
For both the early (N170) and late face-selective responses, at each phase coherence level, we constructed scalp maps of the discriminating components, and the results for one subject are shown in Figure 4a,b. The Az values which describe the discriminator's performance at each phase coherence level are also shown. For the subject shown in Figure 4, the discriminant activity was statistically significant down to a 30% phase coherence for both temporally distributed components as assessed by our bootstrapping technique (P < 0.01; red dotted line).
The average scalp projections from significantly discriminating components for the early face-selective component (Fig. 5b) indicated significant differences between face versus car trials at occipito-temporal electrode sites in the left and right hemispheres (negative correlation) and a number of centro-frontal sites (positive correlation). These results are consistent with functional neuroimaging studies (Kanwisher et al., 1996, 1997; Puce et al., 1996; Hasson et al., 2002) and several ERP/MEG studies (Botzel et al., 1995; Bentin et al., 1996; Halgren et al., 2000; Liu et al., 2000; Rossion et al., 2003), where face-sensitive activations are always found relative to objects in occipito-temporal cortex (mainly the inferior occipital and fusiform gyri) bilaterally. Some studies have also identified face-selective responses (Jeffreys, 1989, 1996) and target/nontarget responses (VanRullen and Thorpe, 2001), in addition to the occipito-temporal sites, in centro-frontal locations. These are also consistent with recent findings which identified active regions in the dorsal lateral prefrontal cortex (DLPFC) which are thought to be associated with decision making during a face versus house categorization task (Heekeren et al., 2004). The late face-selective component also demonstrated a very similar activation pattern to the early/N170 component (Fig. 5b), though with opposite sign. This was an interesting observation, though only a simultaneous fMRI study could determine definitively which cortical systems contributed to this component.
To emphasize the importance of spatial integration for identifying discriminating components predictive of the psychophysical sensitivity of our subjects, we used an alternative approach to computing a neurometric function. Instead of using our spatial integration algorithm, which weights the activity across all EEG sensors, we repeated the discrimination using only the activity from a single electrode. We chose electrode PO8, over the right face-selective activation area, where the greatest difference in the EEG signal between face and car trials was identified. The neurometric function that we constructed with this approach was not nearly as predictive of the psychophysical performance as the neurometric function which was computed using our spatial integration technique. Figure 6 illustrates this point.
Motor Activity not Predictive of Psychophysical Performance
In some cases we observed small, though significant, differences in reaction time for face versus car responses (e.g. see the sigmoidal reaction time curves in Fig. 4). To test whether the discriminating activity we identified was due to a difference in this reaction time, for example a component associated with motor activity, we generated neurometric functions using temporal windows near the reaction time. Specifically, for each subject we used a training window around the median reaction time at each phase coherence level. A typical curve derived during this period is also shown in Figure 6. It is clear from this plot that discriminating components extracted near reaction times (e.g. components representative of preparatory motor, motor or somatosensory activity) were not predictive of the psychophysical sensitivity of our subjects. Additional evidence that the two components, specifically the second component, are not reflective of response selection/motor programming can be seen by considering the change in the strength of the component as a function of coherence level. If the component were reflective of response selection, then one would expect no difference in the strength of the component at different coherence levels. Finally, the second component is strongly stimulus-locked, providing further evidence that it is not reflective of response selection/motor programming. We can conclude that the earlier and late component activities are not artifacts of reaction time differences but are in fact closely linked to the perception and decision making processes respectively.
Comparison to ERP Component Analysis
To reinforce the main findings of our study, we also present a brief summary of a traditional ERP analysis on our data. ERPs, though the result of trial averaging which compromises single-trial variability, are easier to interpret and can provide useful insights about the time course of the underlying visual processing. In our task we found strong differential activity (i.e. faces – cars) at virtually all electrode locations which emphasizes the magnitude of the effect. Figure 7 summarizes these results. Specifically, we computed difference ERP waveforms at six centro-frontal locations as well as at six occipito-parietal sites for one subject at the highest coherence level (i.e. 45%). Both waveforms indicate significant differential responses that peak at ∼170 ms (early component) and 330 ms (late component) after the onset of visual stimulation. Overlaid are the two 60 ms training windows used to achieve maximum single-trial discrimination for this particular subject (early window, light gray; late window, dark gray).
In addition to these maximally discriminating components, we observed an even earlier component (∼100 ms), the magnitude of which was not as strong as the other two. The presence of this component and its spatial distribution (as indicated by the corresponding scalp maps) are consistent with what was reported by Liu et al. (2002). In all cases, however, the Az values we computed for this component were not sufficiently high to account for the psychophysical performance of our subjects.
Scalp maps for each component, constructed based on the difference ERPs and our sensor projections a, are also shown in Figure 7. Red represents positive correlation between the sensors and the underlying discriminating component and blue represents negative correlation. Note that the sign of the differential ERP activity at the different electrode locations is consistent with the scalp topology. For instance, there is a negative correlation between the N170 component and the occipitoparietal sensors and positive correlations with centrofrontal locations. The signs flip for the later component.
Even though ERP analysis can identify both the early and later components, it could not unequivocally associate, especially the later component, to our subjects' decision making process. Single-trial analysis, on the other hand, provides a more rigorous, and direct, method to compare neuronal responses to psychophysical performance (which is not obtainable using simple correlation of sorted ERP derived amplitudes) and therefore directly addresses the decision making process.
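To make the single-trial comparison concrete, the following sketch computes an area under the ROC curve (the quantity usually denoted Az) from per-trial discriminator outputs. It is an illustration under stated assumptions: the function name and the toy values are hypothetical, the estimate uses the Mann-Whitney formulation of the ROC area, and the authors' own estimation procedure is not reproduced here.

```python
import numpy as np

def roc_area(y_pos, y_neg):
    """Area under the ROC curve via the Mann-Whitney statistic.

    y_pos: single-trial discriminator outputs for one class (e.g. faces).
    y_neg: single-trial discriminator outputs for the other class (e.g. cars).
    Returns the fraction of (pos, neg) pairs ranked correctly, counting
    ties as one half.
    """
    y_pos = np.asarray(y_pos, dtype=float)
    y_neg = np.asarray(y_neg, dtype=float)
    diff = y_pos[:, None] - y_neg[None, :]  # all pairwise differences
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return wins / (y_pos.size * y_neg.size)

# Toy values only: perfectly separated outputs versus identical outputs.
az_separable = roc_area([2.0, 3.0], [0.0, 1.0])
az_chance = roc_area([0.0, 1.0], [0.0, 1.0])
```

Evaluating such a measure at each coherence level yields one point per level, which is the form a neurometric function needs in order to be compared with the psychometric function.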
Discussion
Our results demonstrate that neural correlates of perceptual decision making can be identified using high-spatial density EEG and that the corresponding component activities are temporally distributed. Clearly important to the identification of these neural correlates is the spatial, and to a lesser extent the temporal, integration of the EEG component activities. This approach is complementary to approaches using single and multi-unit recordings since it sacrifices spatial and some temporal resolution (local field potentials versus spike-trains) for a more spatially distributed view of the neural activity during decision making. The fact that we were able to identify neural correlates of perceptual decision making despite the relatively poor spatial resolution of EEG suggests that these neural correlates represent strong activities of neural populations and not the activity of a small number of neurons.
It is interesting to consider the temporal characteristics of the discriminating components we identified relative to models of evidence accumulation in decision making (Kim and Shadlen, 1999; Shadlen and Newsome, 2001; Mazurek et al., 2003). Unlike studies which use dynamic stimuli, such as moving random dot sequences, where evidence (via analysis of spatio-temporal correlation structure) can accumulate across time, our stimuli are static images with no (or little) spatiotemporal correlation from one to the next. Thus, the temporal nature of our neural correlates is directly related to the underlying nature of the internal processing for static object recognition.
Previous work using trial-averaged ERPs has attempted to identify the timing and activity of early and late components in object recognition. Several studies have claimed that a component at 150 ms poststimulus is representative of the speed of visual processing (Thorpe et al., 1996) and is correlated with subject behavior (VanRullen and Thorpe, 2001). Other studies have claimed that a second, later component (∼300 ms) is in fact more directly correlated with the recognition process, with the earlier component corresponding to low-level feature discrimination (Johnson and Olshausen, 2003). Our findings provide additional evidence that the later component is more closely linked to a recognition event/decision, while also providing evidence for a cortical processing strategy that enables a trade-off between processing time and accuracy. Assuming a fast feed-forward recognition process within 150 ms of stimulus onset, the early component appears to represent a quick evaluation of the evidence which, while less accurate, could enable a faster response.
Recent work using fMRI has identified, for a similar categorization task, a region in the posterior portion of the DLPFC yielding a blood-oxygen-level-dependent (BOLD) signal that correlated with a difference signal between the two categories (|face – house|) and subsequently with subject performance (Heekeren et al., 2004). Correlation of performance with face-selective or house-selective regions in ventral temporal cortex was lower, leading to the conclusion that a strong neural correlate of perceptual decision making is localized to DLPFC and is essentially characterized by feed-forward processing. In all cases correlation was rather low and was done for the average BOLD signal.
Our results complement this study by characterizing the temporal evolution of component activities that are correlates of perceptual decision making. In addition, our approach precisely quantifies the relationship between the neural signal and behavior through the comparison of psychometric and neurometric functions. The construction of neurometric functions was enabled by our single-trial EEG analysis methods. We saw component activity predictive of decision making and consistent with signaling between occipito-temporal and frontal networks. Unlike fMRI, however, EEG does not have sufficient spatial resolution to precisely identify the cortical regions responsible for these components. Simultaneous methods for acquiring EEG and fMRI may provide a better picture of these spatio-temporal network dynamics indicative of cortical processing underlying perceptual decision making.
We thank Lucas Parra, Jim Muller, and Robin Goldman for valuable comments on earlier versions of this manuscript. This work was funded by the Office of Naval Research (N00014-01-1-0625) and by the National Institutes of Health (EB004730).
References
Bentin S, Allison T, Puce A, Perez A, McCarthy G (
Botzel K, Schulze S, Stodieck SR (
Britten KH, Shadlen MN, Newsome WT, Movshon JA (
Britten KH, Newsome WT, Shadlen MN, Celebrini S, Movshon JA (
Dakin SC (
Halgren E, Raij T, Marinkovic K, Jousmaki V, Hari R (
Hasson U, Levy I, Behrmann M, Hendler T, Malach R (
Heekeren HR, Marrett S, Bandettini PA, Ungerleider LG (
Hernandez A, Zainos A, Romo R (
Hupe JM, James AC, Payne BR, Lomber SG, Girard P, Bullier J (
Jeffreys DA (
Johnson JS, Olshausen BA (
Jordan MI, Jacobs RA (
Kanwisher N, Chun MM, McDermott J, Leden PJ (
Kanwisher N, McDermott J, Chun MM (
Keysers C, Perrett DL (
Kim JN, Shadlen MN (
Lamme VA, Super H, Spekreijse H (
Liu J, Higuchi M, Marantz A, Kanwisher N (
Liu J, Harris A, Kanwisher N (
Mazurek ME, Roitman JD, Ditterich J, Shadlen MN (
Newsome WT, Britten KH, Movshon JA (
Parra LC, Alvino C, Tang A, Pearlmutter B, Young N, Osman A, Sajda P (
Parra LC, Spence CD, Gerson AD, Sajda P (
Puce A, Allison T, Asgari M, Gore JC, McCarthy G (
Romo R, Hernandez A, Zainos A, Brody C, Salinas E (
Rossion B, Joyce CA, Cottrell GW, Tarr MJ (
Shadlen MN, Newsome WT (
Shadlen MN, Newsome WT (
Super H, Spekreijse H, Lamme VA (
Thorpe S, Fize D, Marlot C (
VanRullen R, Koch C (
VanRullen R, Thorpe S (