
Diagnostic Features of Emotional Expressions Are Processed Preferentially

Abstract

Diagnostic features of emotional expressions are differentially distributed across the face. The current study examined whether these diagnostic features are preferentially attended to even when they are irrelevant for the task at hand or when faces appear at different locations in the visual field. To this aim, fearful, happy and neutral faces were presented to healthy individuals in two experiments while measuring eye movements. In Experiment 1, participants had to accomplish an emotion classification, a gender discrimination or a passive viewing task. To differentiate fast, potentially reflexive, eye movements from a more elaborate scanning of faces, stimuli were presented for either 150 or 2000 ms. In Experiment 2, similar faces were presented at different spatial positions to rule out the possibility that eye movements only reflect a general bias for certain visual field locations. In both experiments, participants fixated the eye region much longer than any other region in the face. Furthermore, the eye region was attended to more strongly when fearful or neutral faces were shown, whereas more attention was directed toward the mouth of happy facial expressions. Since these results were similar across the other experimental manipulations, they indicate that diagnostic features of emotional expressions are preferentially processed irrespective of task demands and spatial locations. Saliency analyses revealed that a computational model of bottom-up visual attention could not explain these results. Furthermore, as these gaze preferences were evident very early after stimulus onset and occurred even when saccades did not allow for extracting further information from these stimuli, they may reflect a preattentive mechanism that automatically detects relevant facial features in the visual field and facilitates the orientation of attention towards them. This mechanism might crucially depend on amygdala functioning and may be impaired in a number of clinical conditions such as autism or social anxiety disorder.

Introduction

Human faces are stimuli we are exposed to every day. Throughout our lives, we have probably seen thousands of faces and certainly looked at some of them more closely to discover that they show various expressions. Human communication does not only consist of spoken language but is disambiguated by gestures and facial expressions. In line with this reasoning, emotionally expressive faces seem to be processed preferentially as compared to neutral ones [1], [2]. Thus, as social beings, it is important for us to be able to understand and interpret facially displayed emotions correctly. But how do we analyze faces to determine which expression they show us? The simple answer is that we seem to use diagnostic facial features. As early as 1944, Hanawalt showed that different facial features are important for distinguishing between specific emotions [3]. For example, he suggested that the mouth is most informative for recognizing happy faces and the eyes are most important for detecting fearful facial expressions.

Recently, these findings were confirmed with the help of technically more sophisticated approaches. In 2001, the Bubbles technique was developed [4] and used to reveal that diagnostic features differ as a function of the task at hand [5]. This latter study revealed that the eye and mouth regions across a wide range of spatial frequencies were diagnostic in an identity recognition task, which may indicate that the relationship between these features is crucial for the identification of a person. By contrast, the left side of the face around the eye region was most diagnostic for gender discrimination, and the mouth region was most relevant for determining whether the depicted face showed a happy or neutral expression. Thus, it already appeared that different sets of facial features are diagnostic for different types of tasks. Using a similar technique, the distribution of diagnostic facial features was determined for each of the six basic emotional expressions [6]. In this study, observers performed best when the eye region was visible for fearful faces, the mouth region for happy faces, and a mixture of these and other facial features for neutral facial stimuli.

An extraction of information from different facial features can also be assessed by monitoring eye movements and analyzing fixation patterns. Such a procedure was adopted by a number of studies focusing on clinical populations. For example, a comparison of the visual scan paths of persons with autism spectrum disorder and a control group in an emotion classification task revealed a strong bias for primarily scanning the eye region in both groups [7]. Comparable results were reported by Hernandez and colleagues, who found a clear preference for fixating the eye region of sad, happy and neutral faces in autistic as well as in control subjects [8]. Additionally, healthy controls showed a trend toward spending relatively less time on the eye region and more time on the mouth of happy faces as compared to the other expressions. This may indicate that observers' scanning behavior is sensitive to the diagnostic features of different emotional expressions. Although it is still debated whether patients with autism spectrum disorder scan (emotional) faces differently than healthy controls [9], [10], these studies reveal that eye tracking can be highly useful for elucidating information extraction processes during face perception.

However, a major drawback of these studies is the use of comparably long exposure times (typically longer than 2 s), which only allow for characterizing explicit face perception mechanisms that presumably are under conscious control. Recent evidence suggests that briefly presented faces also trigger very early, potentially reflexive, eye movements that are sensitive to the distribution of diagnostic facial features [11]. In this particular study, fearful, happy, angry and neutral faces were presented briefly (150 ms) so that observers were able to execute a saccade only after stimulus offset. Furthermore, the initial fixation was manipulated such that subjects fixated on either the eye or the mouth region in one half of all trials each. Surprisingly, although saccades did not allow for extracting further information from the stimuli, observers showed a relatively large number of fixation changes after stimulus offset. Across all facial expressions, reflexive gaze changes toward the eye region occurred much more frequently than fixation changes leaving the eye region. This is consistent with previous findings documenting a clear preference for using information from the eye region already at an early point in time [12]. However, reflexive eye movements after stimulus offset were also sensitive to diagnostically relevant regions in the face [11]. Thus, gaze preferences for the eye region were largest for fearful and neutral faces and substantially reduced for happy facial expressions, which triggered more fixation changes toward the mouth.

These findings indicate that human observers exhibit a tendency to automatically extract information from the eye region of conspecifics. Additionally, they seem to be prone to quickly search for salient facial features that allow them to validly identify the current emotional state of a counterpart. However, as most of the above-mentioned studies explicitly required participants to recognize the depicted emotional expression, it is unclear whether the observed gazing pattern reflects an automatic mechanism or is driven by task demands. Furthermore, previous studies did not examine whether eye movements were only modulated by low-level image features that trigger bottom-up attentional processes [13], [14] or reflected the influence of a top-down mechanism. To clarify these issues, we carried out two eye tracking studies. In Experiment 1, observers had to accomplish an emotion classification, a gender discrimination and a passive viewing (oddball) task using negative (fearful), positive (happy) and neutral facial expressions. To differentiate between early, potentially reflexive, eye movements on the one hand and a more elaborate scanning on the other, faces were presented either briefly (150 ms) or long enough to execute several saccades during stimulus presentation (2000 ms). In this experiment, we manipulated whether participants initially fixated on the eye or mouth region, and we determined whether saccades and fixations showed a preference for the eye region in general [12] and for the varying diagnostic features of the emotional expressions [6], [11]. To rule out that low-level saliency in certain parts of the images was driving these responses and to examine the general influence of the presentation position in the visual field, Experiment 2 was carried out using a specifically tailored set of stimuli with comparable saliency in the eye and mouth region across different emotional facial expressions. These faces were presented either in the upper half, the middle or the lower half of the display screen.

Experiment 1

Materials and Methods

Participants.

This study was approved by the ethics committee of the medical faculty of the University of Rostock and conducted according to the principles expressed in the Declaration of Helsinki. All participants gave written informed consent and were paid for participation. Twenty-five students participated voluntarily in the experiment. One male subject was excluded because of too many invalid eye tracking trials. The final sample consisted of 12 women and 12 men, aged between 19 and 27 years (M = 24.13; SD = 3.62). All had normal or corrected-to-normal vision and were asked beforehand to wear contact lenses instead of glasses and to refrain from using eye make-up.

Design.

The experiment was based on a 3 × 2 × 3 × 2 within-subjects design with the factors task, presentation time, emotional expression and initial fixation. These factors were specified as follows: 1) Subjects had to accomplish three different tasks in separate experimental blocks while eye tracking data was recorded: An emotion classification, a gender discrimination and a (passive) target detection task. 2) Within each block, portrait pictures of faces were either presented for 150 or 2000 ms and 3) these faces either showed fearful, happy, or neutral expressions. 4) Additionally, the initial fixation was systematically varied by unpredictably shifting faces down- or upwards on each trial such that subjects initially fixated either on the eye or mouth region.

Stimuli and tasks.

Stimuli were presented on a 20″ Samsung SyncMaster 204B display (40.64 cm × 30.48 cm) with a resolution of 1600 × 1200 pixels and a refresh rate of 60 Hz. The distance from the participants' eyes to the monitor was 58 cm. The fearful, happy and neutral faces that were shown during the experiment were selected from several picture sets (The Karolinska directed emotional faces, KDEF, [15]; Pictures of facial affect, [16]; NimStim, www.macbrain.org/resources.htm; and the FACES database, [17]). These faces were slightly rotated such that both pupils were always on the same imaginary horizontal line. Subsequently, pictures were converted to grayscale images and cropped with an ellipse in order to hide features that do not carry information about the emotional state of the conspecific (e.g., hair, ears). Finally, cumulative brightness was normalized across pictures. Overall, the pool of stimuli consisted of 126 fearful, 144 happy, and 140 neutral facial expressions. To control for the initial fixation, the stimuli within each emotional expression were shifted either downward or upward on each trial. This resulted in either the eye or the mouth region appearing at the location of the fixation cross.
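The following Python sketch illustrates the kind of preprocessing described above (grayscale conversion, elliptical cropping and brightness normalization). It is not the authors' original pipeline; the image size, ellipse geometry and target brightness are illustrative assumptions, and the rotation step aligning the pupils is omitted.

```python
import numpy as np
from PIL import Image, ImageDraw

def preprocess_face(path, size=(430, 600), target_mean=128.0):
    # Convert to grayscale and resize to a common portrait format (assumed size)
    img = Image.open(path).convert("L").resize(size)
    arr = np.asarray(img, dtype=float)

    # Elliptical mask hiding hair, ears and background
    mask_img = Image.new("L", size, 0)
    ImageDraw.Draw(mask_img).ellipse([0, 0, size[0] - 1, size[1] - 1], fill=255)
    mask = np.asarray(mask_img, dtype=bool)

    # Normalize cumulative brightness inside the ellipse to a common mean
    arr[mask] *= target_mean / arr[mask].mean()
    arr[~mask] = 128.0  # uniform grey background outside the ellipse

    return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
```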

Each trial began with a fixation cross shown for 2000 ms on a uniform grey background. Afterwards, faces were presented for either 150 or 2000 ms. The short duration was chosen to ensure that subjects could reliably identify the facial expression without being able to change their fixation during stimulus presentation. Any "reflexive" saccade related to the stimulus presentation would therefore start only after the picture had already disappeared from the screen (for a similar procedure see [11], [18]). By contrast, the longer presentation time of 2000 ms allowed subjects to execute several saccades during stimulus presentation and to visually scan the picture in detail. To achieve an overall trial length of 5000 ms, a blank screen following picture presentation was shown for either 2850 or 1000 ms, depending on whether the preceding picture had been presented for the short or the long period. The interval between two successive trials varied randomly between 1000 and 3000 ms (see Figure 1). Presentation 13.0 (Neurobehavioral Systems) was used to control stimulus presentation and to record behavioral responses during the tasks.

Figure 1. Illustration of the trial structure (Experiment 1).

https://doi.org/10.1371/journal.pone.0041792.g001

In each experimental task, 36 pictures of different male and female persons were presented in a randomized order. The pictures were selected randomly for each subject from the whole pool of stimuli. Overall, males (M = 51.3%, SD = 2.7%) and females (M = 48.6%, SD = 2.7%) were presented approximately equally often. Each picture was shown twice, and subjects had to respond as accurately and as quickly as possible by pressing the corresponding key on the computer keyboard. In the emotion classification task, subjects had to decide whether the face showed a fearful, happy or neutral expression. In the gender discrimination task, subjects were required to decide whether the displayed person was male or female. The target detection task was largely passive with respect to the evaluation of the face: subjects simply had to press one button whenever a rarely presented color image was shown. To this aim, eight color images (two males and two females with neutral facial expression) were randomly interspersed into the stimulus sequence.

Eye tracking data were recorded during the three tasks at a sampling rate of 1000 Hz using a table-mounted EyeLink 1000 remote infrared pupil-corneal reflection eye imaging system (SR Research Ltd., Ottawa, Canada).

Procedure.

After arriving at the laboratory, subjects completed a brief questionnaire asking for sociodemographic data (age, gender, profession, current medication). Afterwards, participants were verbally instructed before the eye tracking tasks started. In general, they were told to look at the screen during the experiment and to avoid large head and body movements. Whenever a fixation cross was presented during the tasks, they were asked to fixate it continuously. While a picture was shown or when the screen was uniformly gray, they were free to change their gaze and look anywhere on the screen, but they were asked to avoid blinks during these periods. The order of the three tasks involving eye tracking was counterbalanced across participants. Each task started with 12 training trials (plus 2 color images in the passive condition) using a set of different faces. Subsequently, the eye tracking system was calibrated using nine points, the calibration was validated, and the actual task started.

Data reduction and analysis.

From the behavioral data, we calculated the proportion of hits (correct reactions) for each of the three eye tracking tasks. For the emotion classification and gender discrimination tasks, we examined effects of the experimental manipulations on the behavioral data using a 2 × 3 × 2 repeated measures ANOVA with the factors presentation time, emotion and initial fixation.
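As an illustration, a repeated measures ANOVA of this kind could be run in Python as sketched below. The data layout, file name and column names are assumptions, and statsmodels' AnovaRM does not itself apply the Huynh-Feldt correction described further below, which would have to be computed separately.

```python
# Hypothetical sketch of the 2 x 3 x 2 repeated measures ANOVA on hit rates.
# Assumes a long-format table with one mean hit rate per subject and condition.
import pandas as pd
from statsmodels.stats.anova import AnovaRM

df = pd.read_csv("gender_task_hits.csv")  # assumed columns: subject,
                                          # presentation_time, emotion,
                                          # initial_fixation, hits
res = AnovaRM(df, depvar="hits", subject="subject",
              within=["presentation_time", "emotion", "initial_fixation"]).fit()
print(res.anova_table)  # F values, degrees of freedom and uncorrected p values
```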

Two different measures were extracted from the eye tracking data. First, we analyzed the first saccade that was executed after stimulus onset. To this aim, we eliminated trials containing blinks and trials with saccades >1° occurring within a period of −300 to 150 ms relative to stimulus onset. Subsequently, we subtracted the prestimulus baseline from the position data of each valid trial to remove drifts. Afterwards, the first saccade exceeding 1° within 1000 ms after stimulus onset was detected. Furthermore, the saccade was required to occur at least 150 ms after stimulus onset (i.e., after the stimulus offset of the briefly presented faces). This saccade was classified according to whether it was directed towards the other major facial feature. Thus, when the eyes were presented at fixation, we counted the downward fixation changes toward the mouth, and when the mouth appeared at the location of the fixation cross, we counted the corresponding upward saccades toward the eyes. These numbers were divided by the total number of valid trials to obtain proportions of fixation changes as a function of the experimental manipulations. Using a 3 × 2 × 3 × 2 repeated measures ANOVA on these proportions, we tested for effects of task, presentation time, emotion and initial fixation on these "reflexive" saccades.
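A minimal sketch of this first-saccade measure is given below. It assumes gaze samples at 1000 Hz that have already been converted to degrees of visual angle, with time in milliseconds relative to stimulus onset and the vertical axis increasing upward; the thresholds follow the description above, but the function is an illustration rather than the analysis code actually used.

```python
import numpy as np

def first_saccade(t, x, y, thresh=1.0):
    """Detect the first gaze shift > `thresh` degrees between 150 and 1000 ms
    after stimulus onset; returns (latency, 'up'/'down') or None for invalid trials."""
    base = (t >= -300) & (t < 0)
    x = x - x[base].mean()          # remove drift relative to the
    y = y - y[base].mean()          # pre-stimulus baseline
    dist = np.hypot(x, y)

    # Exclude trials with gaze shifts between -300 and 150 ms around onset
    if dist[(t >= -300) & (t < 150)].max() > thresh:
        return None

    win = (t >= 150) & (t <= 1000) & (dist > thresh)
    if not win.any():
        return None                 # no saccade detected within 1000 ms
    i = np.flatnonzero(win)[0]
    return t[i], ("up" if y[i] > 0 else "down")  # e.g. 'up' = toward the eyes
                                                 # when the mouth was at fixation
```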

Second, fixation durations were analyzed for the 2000 ms stimulus duration. Valid trials were identified in the same way as for the saccadic data described above. For these trials, we determined the amount of time subjects spent looking at either the eye region or the mouth region using predefined rectangular regions of interest centered on the respective facial feature. The cumulative fixation time on these regions was divided by the amount of time subjects spent looking at the presented face in general. To determine whether the experimental manipulations affected fixations on the eye and mouth region, we calculated a 3 × 3 × 2 × 2 repeated measures ANOVA on these proportions using the factors task, emotion, initial fixation and the facial feature that was actually fixated.
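The fixation-time measure can be sketched as follows; the fixation list is assumed to come from the eye tracker's event parser with pixel coordinates and durations in milliseconds, and the rectangular ROI boxes are placeholders for the actual regions used in the study.

```python
def roi_fixation_proportions(fixations, face_box, eye_box, mouth_box):
    """fixations: list of dicts with keys 'x', 'y', 'dur' (ms);
    boxes: (x0, y0, x1, y1) rectangles in screen pixel coordinates."""
    def inside(f, box):
        x0, y0, x1, y1 = box
        return x0 <= f["x"] <= x1 and y0 <= f["y"] <= y1

    face_time = sum(f["dur"] for f in fixations if inside(f, face_box))
    if face_time == 0:
        return None  # no fixations on the face in this trial
    eye_time = sum(f["dur"] for f in fixations if inside(f, eye_box))
    mouth_time = sum(f["dur"] for f in fixations if inside(f, mouth_box))
    return eye_time / face_time, mouth_time / face_time
```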

Since it is possible that differential attention toward the eye or mouth region of the presented faces is driven by differences in low-level image features, we performed a second, post-hoc analysis taking into account the distribution of saliency across the face. To this aim, saliency maps were calculated for all faces that were used in the experiment. Using a biologically plausible processing hierarchy [14], the algorithm analyzes, for every pixel, how distinct that location in the image is along the dimensions of luminance, contrast, orientation and spatial scale [13], [19]. The analysis results in a map for every stimulus used in the experiment showing image locations that stand out from the background in terms of their low-level features. For these maps, we computed the mean saliency in the eye and the mouth region, respectively, using the same regions of interest as for the analysis of the fixation data. Finally, we calculated a saliency ratio by dividing the mean saliency of the eye region by that of the mouth region. As can be seen in the histograms depicted in Figure 2, the eye region had a generally higher saliency than the mouth region (ratios are above 1 on average). Furthermore, the distribution of saliency ratios differed between the emotional expressions. To ensure that the latter difference did not affect our results regarding the saccadic and the fixation data, we selected for each participant a subset of six pictures per emotional expression with a similar saliency ratio across facial expressions. Mean ratios were 1.40 (SD = 0.05) for fearful, 1.39 (SD = 0.06) for happy, and 1.41 (SD = 0.05) for neutral facial expressions. These values did not differ significantly, F(2,46) = 2.30, ε = .58, p = .14, partial η2 = .09. In addition to the analyses of the whole stimulus set, we calculated comparable repeated measures ANOVAs on the saccadic and the fixation data for this subset of faces. Due to the small number of stimuli with equal saliency ratios, we collapsed data across tasks and presentation times for these analyses.
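For the saliency analysis, the ratio itself reduces to averaging a precomputed saliency map within the two regions of interest. The sketch below assumes such a map has already been generated with an Itti-Koch-style model (the model itself is not reimplemented here) and uses the same ROI convention as the fixation sketch above.

```python
import numpy as np

def saliency_ratio(saliency_map, eye_box, mouth_box):
    """saliency_map: 2D array; boxes: (x0, y0, x1, y1) in pixel coordinates.
    Ratios > 1 indicate higher low-level saliency in the eye than in the mouth region."""
    def mean_in(box):
        x0, y0, x1, y1 = box
        return float(saliency_map[y0:y1, x0:x1].mean())
    return mean_in(eye_box) / mean_in(mouth_box)
```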

Figure 2. Frequency histograms showing the distribution of saliency ratios (average saliency in the eye region divided by the average saliency in the mouth region) for fearful, happy and neutral facial expressions.

https://doi.org/10.1371/journal.pone.0041792.g002

For all repeated-measures ANOVAs involving more than one degree of freedom in the numerator, the Huynh-Feldt procedure was applied to correct for potential violations of the sphericity assumption. A rejection criterion of p<.05 was used for all statistical tests and partial η2 is reported as an effect size index.

Results and Discussion

Behavioral data.

Overall, participants were very accurate in all three tasks. In the emotion classification task, mean proportions of correct responses were above 93% in all experimental conditions (see Table 1) and the statistical analysis did not reveal any significant effect of the experimental manipulations on the hit rates in this task.

Table 1. Experiment 1: Proportions of correct responses in the emotion classification and the gender discrimination task as a function of presentation time, initial fixation and emotional expression.

https://doi.org/10.1371/journal.pone.0041792.t001

For the gender discrimination task, hit rates were similarly high (Table 1) but we observed larger hit rates when faces were visible for 2000 ms (main effect presentation time: F(1,23) = 13.35, p<.001, partial η2 = .37). Furthermore, gender was identified more accurately for happy as compared to neutral and fearful faces (main effect emotion: F(2,46) = 11.89, ε = 1.00, p<.001, partial η2 = .34). The main effect of the initial fixation as well as all interactions did not reach statistical significance.

In the passive viewing task, subjects reached an overall correct rejection rate of 99.8% for non-targets, i.e., they almost never delivered a keypress when a grayscale stimulus was presented. The hit rate for target detection (i.e., keypresses for colored faces) was 100%.

Eye tracking data: First saccade after stimulus onset.

The average number of valid trials without blinks or fixation changes between −300 and 150 ms relative to stimulus onset was 65.96 of 72 (SD = 5.03) in the emotion classification task, 67.21 of 72 (SD = 4.67) in the gender discrimination task and 68.33 of 72 (SD = 4.46, only trials with non-target stimuli) in the passive viewing condition.

Overall proportions of fixation changes as a function of task, emotional expression and initially fixated feature are depicted in Figure 3. It can be clearly seen that across all conditions there were far fewer saccades leaving the eye region than comparable fixation changes towards the eyes (main effect initial fixation: F(1,23) = 83.80, p<.001, partial η2 = .79). Moreover, the proportion of fixation changes was larger for the longer presentation time (main effect presentation time: F(1,23) = 40.05, p<.001, partial η2 = .64) and more saccades occurred in the emotion classification and the gender discrimination task than in the passive viewing condition (main effect task: F(2,46) = 6.56, ε = .95, p<.01, partial η2 = .22). The proportion of fixation changes was comparable across the different facial expressions and the main effect of emotion did not reach statistical significance (F(2,46) = 0.84, ε = 1.00, p = .44, partial η2 = .04).

Figure 3. Proportions of fixation changes towards the other major facial feature as a function of task, presentation time, facial expression and initial fixation (Experiment 1).

Error bars indicate standard errors of the mean.

https://doi.org/10.1371/journal.pone.0041792.g003

With respect to the interaction effects, we observed a statistically significant interaction between task and initial fixation (F(2,46) = 8.40, ε = .90, p<.01, partial η2 = .27). If subjects initially fixated on the eyes, fixation changes occurred most often in the emotion classification task and least often in the passive task; but if they initially fixated on the mouth, the proportion of saccades was more similar across tasks (Figure 3).

Additionally, we observed an interaction between the factors emotion and initial fixation (F(2,46) = 7.17, ε = 1.00, p<.01, partial η2 = .24; Figure 4A). When viewing fearful or neutral as opposed to happy facial expressions, participants showed a higher preference for the eye region. Thus, they tended to shift their gaze toward the eyes when initially looking at the mouth and made fewer saccades in the opposite direction. By contrast, participants showed a higher preference for the mouth region when viewing happy faces. This interaction effect was similar across tasks and presentation times (non-significant interaction task × emotion × initial fixation: F(4,92) = 1.10, ε = .99, p = .36, partial η2 = .05; non-significant interaction presentation time × emotion × initial fixation: F(2,46) = 0.52, ε = 1.00, p = .60, partial η2 = .02).

Figure 4. Illustration of the modulatory effect of facial expression on fixation changes (A,C) and fixation durations (B,D) across all experimental tasks and presentation times (Experiment 1).

The upper panels (A,B) show the values for the whole stimulus set whereas the lower panels (C,D) only depict the respective values for a subset of faces with a comparable saliency ratio in the eye as compared to the mouth region. Error bars indicate standard errors of the mean.

https://doi.org/10.1371/journal.pone.0041792.g004

Eye tracking data: Fixation duration during the long presentation time (2000 ms).

Figure 5A illustrates the normalized cumulative time subjects spent looking at different regions of fearful, happy, and neutral faces. The overall density of fixations was higher when an emotional face (fearful or happy) was shown but across all expressions, participants fixated the eyes for a large amount of time. Additionally, the mouth of happy expressions was fixated longer than that of fearful or neutral faces. Note that the high density of fixations on the bridge of the nose most likely resulted from the initial fixation on that position in one half of all trials.

Figure 5. Heat maps illustrating the normalized fixation time on different face regions for the long presentation time of Experiment 1 (A) and the normalized distribution of saliency (B) as derived from a computational model of bottom-up visual attention [13], [14] for fearful, happy and neutral facial expressions.

https://doi.org/10.1371/journal.pone.0041792.g005

For a more detailed analysis of the fixation data, proportions of time looking at either the eyes or the mouth relative to the whole time participants spent fixating the presented face were calculated. Because of missing data, results are based on 23 of the 24 subjects. Proportions of time spent fixating on the two major facial features as a function of task, initially fixated location and emotional expression can be seen in Figure 6. Overall, subjects fixated much longer on the eyes than on the mouth (main effect feature: F(1,22) = 93.15, p<.001, partial η2 = .81). For happy facial expressions (main effect emotion: F(2,44) = 17.10, ε = 1.00, p<.001, partial η2 = .44) as well as when initially looking at the eyes (main effect initial fixation: F(1,22) = 7.33, p<.05, partial η2 = .25), the overall proportion of fixations on the eye or mouth region was slightly larger. All these effects were independent of the task at hand. Neither the main effect of task (F(2,44) = 0.24, ε = 1.00, p = .79, partial η2 = .01) nor the interactions with the other three factors reached statistical significance.

Figure 6. Proportion of time spent fixating either the eye or the mouth region in relation to the time subjects spent fixating the overall face in the long presentation time condition (Experiment 1).

Mean proportions for fixations on the eye or mouth region are shown as a function of task, initial fixation and facial expression with error bars indicating standard errors of the mean. The regions of interest that were used to define fixations in the eye or mouth region, respectively, are shown on the right side.

https://doi.org/10.1371/journal.pone.0041792.g006

Furthermore, we observed that the proportion of fixations on the eye or mouth region depended on the initial fixation (interaction between the factors initial fixation and fixated facial feature: F(1,22) = 24.43, p<.001, partial η2 = .53). That is, when subjects initially fixated the eyes, they also showed longer fixations on the eye region across the whole trial. A similar pattern was observed when initially looking at the mouth (see Figure 6).

Corresponding to the saccadic data, subjects spent a larger amount of time fixating on the eyes when neutral or fearful faces were displayed but fixated the mouth longer when faces displayed a happy expression (interaction between fixated facial feature and emotion: F(2,44) = 19.34, ε = .86, p<.001, partial η2 = .47, see Figure 4B). Similar to the saccadic data, this effect was independent of the currently accomplished task (non-significant interaction task × emotion × fixated facial feature: F(4,88) = 2.05, ε = .68, p = .12, partial η2 = .09).

Eye tracking data: The influence of low-level attentional saliency.

Figure 5B shows the average saliency maps for all facial expressions. These maps indicate that the distribution of low-level stimulus characteristics that may drive eye movements differs between facial expressions. To examine whether this differential distribution can account for the observed pattern of saccades and fixations, we repeated the analyses for a subset of facial expressions that had a similar ratio of saliency in the eye and mouth region, respectively. Because of missing data, these analyses are based on 23 of the 24 subjects.

The analysis of the saccades that were triggered by the faces across tasks and presentation times revealed a significant main effect of initial fixation (F(1,22) = 53.19, p<.001, partial η2 = .71) as well as a significant interaction of emotional expression and initial fixation (F(2,44) = 3.43, ε = .87, p<.05, partial η2 = .13). As these effects are comparable to the original analysis for the whole stimulus set (see Figure 4C vs. 4A), they indicate that the distribution of low-level attentional features as a function of the emotional expression does not account for the observed effects.

A comparable analysis of the fixation data for the long presentation time (2000 ms) revealed significant main effects of facial feature (F(1,22) = 106.56, p<.001, partial η2 = .83) and emotional expression (F(2,44) = 5.50, ε = .93, p<.01, partial η2 = .20) as well as a significant interaction between both factors (F(2,44) = 5.52, ε = 1.00, p<.01, partial η2 = .20). Comparable to the saccadic data, subjects spent more time fixating the eye region for fearful and neutral faces whereas a relatively enhanced fixation time was observed for the mouth region of happy facial expressions. As this pattern of results was also highly similar to the analysis of the whole stimulus set (see Figure 4D vs. 4B), it indicates that differences in the viewing pattern as a function of the emotional expression cannot be accounted for by low-level attentional processes.

Taken together, these results show that participants preferentially examine the eye region and show a bias to quickly scan diagnostic features of emotional facial expressions. That is, the eye region was scanned more extensively for fearful and neutral facial expressions, whereas relatively more attention was paid to the mouth region when happy faces were shown. This effect is not related to differences in low-level image features, since a post-hoc analysis of faces with a comparable ratio of saliency in the eye as compared to the mouth region revealed results similar to the analysis of the whole stimulus set. The general preference for scanning the eye region, however, might be a result of the higher saliency in this image region. Moreover, since we manipulated the initial fixation on the faces, this general preference might also be related to the fact that the eyes were presented in the upper part of the visual field in one half of all trials whereas the mouth was never presented above the center of the screen. Thus, the supposed preference for the eye region could in principle result from a general bias to pay more attention to the upper part of the visual field. Such a bias might arise from a different distribution of functional units in the lower and upper visual field [20], [21] and has already been demonstrated for eye movements in a visual search task [22]. To examine to what degree such a bias is reflected in the results of this experiment, we carried out a second experiment in which we manipulated the position at which the faces were presented. Fearful, happy and neutral faces were shown either in the upper half, the middle or the lower half of the display screen. Moreover, stimuli were preselected according to the distribution of low-level image saliency in the eye and mouth region, and we specifically selected a subset of faces with similar values in these facial areas to test whether participants would still show a preferential scanning of the eye region.

Experiment 2

Materials and Methods

Participants.

This study was approved by the ethics committee of the medical association of Hamburg and conducted according to the principles expressed in the Declaration of Helsinki. All participants gave written informed consent and were paid for participation. Twenty-four subjects (12 women, 12 men) participated voluntarily in the experiment. They were aged between 22 and 41 years (M = 27.42; SD = 4.92 years). All had normal or corrected-to-normal vision and were asked beforehand to wear contact lenses instead of glasses when possible and to refrain from using eye make-up.

Design.

The experiment was based on a 3 × 3 within-subjects design with the factors vertical placement and emotional expression. Faces were presented either in the upper half, the middle or the lower half of the display screen. The horizontal displacement was varied randomly from trial to trial. Faces showed either fearful, happy, or neutral expressions.

Stimuli and tasks.

The same stimulation and eye tracking equipment as in Experiment 1 was used. Stimuli were selected from the same pool of faces as in Experiment 1 according to the following criteria: One set of faces (24 per emotional expression) had a comparable eye-to-mouth saliency ratio across emotional expressions. Mean values were M = 1.38 (SD = 0.07) for fearful, M = 1.37 (SD = 0.12) for happy, and M = 1.40 (SD = 0.11) for neutral facial expressions. These values did not differ significantly, F(2,69) = 0.44, p = .65, partial η2 = .01. For the second stimulus set, we selected 12 fearful (saliency ratio M = 1.10, SD = 0.17) and 12 happy faces (M = 1.03, SD = 0.12) with relatively similar saliency in the eye and mouth region; the ratios did not differ significantly between facial expressions, t(22) = 1.26, p = .22, partial η2 = .07. Such faces were not available for neutral facial expressions (see Figure 2), but we added 12 neutral faces with a higher saliency ratio (M = 1.76, SD = 0.14) to the stimulus set to avoid an overall reduced number of neutral faces. Eye tracking and behavioral data for these neutral faces were excluded from data analysis. Both stimulus sets consisted of an equal number of different male and female persons.

In the experimental task, faces of both stimulus sets were presented twice in a randomized order. Within each stimulus set, an equal number of faces was either presented in the upper half, the middle, or the lower half of the display screen by vertically shifting the center of the face (midpoint between the eye and mouth region) by 300, 0, or −300 pixels. The horizontal displacement was determined from trial to trial by drawing a random value from a uniform distribution ranging from −400 to 400 pixels. Participants were instructed to classify the facial expression as accurately and as quickly as possible by pressing the corresponding key on the computer keyboard.
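As an illustration, the per-trial placement could be generated as in the following sketch, which balances the three vertical offsets across trials and draws the horizontal offset from a uniform distribution; the function name and trial count are assumptions rather than the original presentation script.

```python
import random

def make_placements(n_trials):
    """Return (vertical, horizontal) pixel offsets for n_trials trials
    (n_trials is assumed to be divisible by 3)."""
    vertical = [300, 0, -300] * (n_trials // 3)   # balanced vertical placement
    random.shuffle(vertical)
    horizontal = [random.uniform(-400, 400) for _ in vertical]
    return list(zip(vertical, horizontal))

# Example: placements for the 144 presentations of stimulus set 1
placements = make_placements(144)
```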

Each trial began with a fixation cross shown for 1000 ms on a uniform grey background. Afterwards, faces were presented for 2000 ms followed by a 1000 ms blank screen. A fixation cross was again presented in the period between two successive trials that varied randomly between 1000 and 3000 ms. The total number of stimuli amounted to 216 faces (144 for stimulus set 1, 48 for set 2, 24 additional neutral faces) which were assigned to 3 sessions with short breaks in between to allow for a rest period and a recalibration of the eye tracker.

Procedure.

The general procedure and the instructions for measuring eye movements were comparable to Experiment 1. Before starting the experimental task, 9 training trials were accomplished using a set of different faces. Before each session, the eye tracking system was calibrated using nine points and the calibration was validated.

Data reduction and analysis.

All analyses were carried out separately for stimulus set 1 (equal ratio of saliency in the eye as compared to the mouth region across emotional expressions) and set 2 (similar saliency in the eye and mouth region of fearful and happy facial expressions). From the behavioral data, we calculated the proportion of correct emotion classifications and we examined effects of the experimental manipulations on the behavioral data using a 3 × 3 (stimulus set 1) or a 3 × 2 repeated measures ANOVA (stimulus set 2) with the factors vertical placement and emotional expression.

Comparable to Experiment 1, two different measures were extracted from the eye tracking data. First, we analyzed the first saccade that was executed after stimulus onset. The same trial exclusion criteria and drift corrections as in Experiment 1 were applied. Afterwards, the first saccade exceeding 1° within 2000 ms (i.e., during stimulus presentation) was detected. Furthermore, the saccade was required to occur at least 150 ms after stimulus onset. This saccade was classified according to whether it landed on the eye or the mouth region, respectively. For this purpose, the same regions of interest as in Experiment 1 were used. Finally, these numbers were divided by the total number of valid trials to obtain proportions of fixation changes as a function of the experimental manipulations. Using a 3 × 3 × 2 (stimulus set 1) or a 3 × 2 × 2 repeated measures ANOVA (stimulus set 2) on these proportions, we tested for effects of vertical placement, emotion and the facial feature where the first saccade landed (eye vs. mouth region).
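A possible sketch of this landing-position classification is given below; it complements the first-saccade sketch from Experiment 1 and assumes that the ROI rectangles are shifted together with the face by the trial's placement offsets (coordinates in screen pixels; the function and argument names are illustrative).

```python
def classify_landing(x, y, eye_box, mouth_box, dx=0, dy=0):
    """Label the landing position of the first valid saccade as 'eyes',
    'mouth' or 'other'; dx/dy are the horizontal/vertical face offsets."""
    def inside(box):
        x0, y0, x1, y1 = box
        return x0 + dx <= x <= x1 + dx and y0 + dy <= y <= y1 + dy

    if inside(eye_box):
        return "eyes"
    if inside(mouth_box):
        return "mouth"
    return "other"
```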

Second, fixation durations were analyzed for the whole stimulus duration (2000 ms) using the same procedures as in Experiment 1. Valid trials were identified similarly to the saccadic data as described above. The cumulative fixation time on the eye and the mouth region was divided by the amount of time subjects spent looking at the presented face in general. To determine whether the experimental manipulations affected fixations on the eye and mouth region, we calculated a 3 × 3 × 2 (stimulus set 1) or a 3 × 2 × 2 repeated measures ANOVA (stimulus set 2) on these proportions using the factors vertical placement, emotion and the facial feature that was actually fixated.

For all repeated-measures ANOVAs involving more than one degree of freedom in the numerator, the Huynh-Feldt procedure was applied to correct for potential violations of the sphericity assumption. A rejection criterion of p<.05 was used for all statistical tests and partial η2 is reported as an effect size index.

Results and Discussion

Behavioral data.

Similar to Experiment 1, participants were very accurate in the emotion classification task reaching mean proportions of correct responses above 95% in all experimental conditions (see Table 2). However, for stimulus set 1, participants were slightly less accurate in classifying fearful facial expressions (main effect emotion: F(2,46) = 5.71, p<.01, partial η2 = .20). All other effects did not reach statistical significance and no statistically significant result was obtained for the behavioral data of the second stimulus set.

Table 2. Experiment 2: Proportions of correct responses in the emotion classification task as a function of vertical placement and emotional expression.

https://doi.org/10.1371/journal.pone.0041792.t002

Eye tracking data: First saccade after stimulus onset.

The average number of valid trials without blinks or fixation changes between −300 and 150 ms relative to stimulus onset was 109.96 of 144 (SD = 24.17) for stimulus set 1 and 37.08 of 48 (SD = 9.00) for stimulus set 2. Since no valid trial was available for two conditions of stimulus set 2 for one participant, results on this stimulus set are based on 23 of the 24 subjects.

As depicted in Figure 7, proportions of saccades toward the eye or mouth region were highly similar across both stimulus sets. The first saccade after stimulus onset landed most frequently on the facial feature closest to the location of the fixation cross that participants fixated at trial start. Thus, when the faces were presented in the upper half of the screen, participants tended to fixate the mouth region first, whereas they made more saccades to the eye region when faces were presented in the lower part of the screen. Consequently, a significant interaction between vertical placement and target region of the first saccade was obtained for stimulus set 1 (F(2,46) = 74.83, ε = .98, p<.001, partial η2 = .76) as well as for set 2 (F(2,44) = 59.96, ε = 1.00, p<.001, partial η2 = .73).

Figure 7. Proportions of fixation changes towards the eye and the mouth region as a function of the position in the visual field and the facial expression (Experiment 2).

Results are depicted separately for a stimulus set with a similar saliency ratio of the eye as compared to the mouth region across facial expressions (stimulus set 1) and for a stimulus set with a saliency ratio of approximately 1 (stimulus set 2). The regions of interest that were used to define whether saccades targeted the eye or mouth region, respectively, are shown on the right side. Error bars indicate standard errors of the mean.

https://doi.org/10.1371/journal.pone.0041792.g007

More interestingly, we additionally observed an interaction between the factors emotion and target region of the first saccade after stimulus onset for stimulus set 1 (F(2,46) = 8.68, ε = 1.00, p<.001, partial η2 = .27) as well as for set 2 (F(1,22) = 5.74, p<.05, partial η2 = .21). This effect replicates the results of Experiment 1 by revealing an enhanced proportion of saccades towards the eye as compared to the mouth region for fearful and neutral faces, whereas happy facial expressions triggered relatively more saccades toward the mouth. For stimulus set 1, this effect was slightly reduced when faces were presented in the upper part of the display screen (interaction of vertical placement, emotion and target of the saccade: F(4,92) = 4.40, ε = 1.00, p<.01, partial η2 = .16). However, no such interaction was observed for stimulus set 2 (F(2,44) = 1.50, ε = .87, p = .24, partial η2 = .06). Finally, a main effect of emotional expression was observed for stimulus set 1 (F(2,46) = 6.62, ε = .93, p<.01, partial η2 = .22) indicating that across all other factors, slightly more saccades landed on the eye or mouth region when happy facial expressions were presented.

Eye tracking data: Fixation duration.

Interestingly, the fixation durations showed a very different pattern as compared to the saccadic data. As shown in Figure 8, results were again highly similar for both stimulus sets. Participants spent much more time fixating the eye region as compared to the mouth in stimulus set 1 (F(1,23) = 47.66, p<.001, partial η2 = .67) as well as set 2 (F(1,22) = 54.14, p<.001, partial η2 = .71). This effect was similar for the different screen positions where the faces appeared. However, for both stimulus sets, participants showed a general bias to fixate less on the eye or mouth region when faces were shown in the lower part of the screen (main effect vertical placement in stimulus set 1, F(2,46) = 6.34, ε = 1.00, p<.01, partial η2 = .22; and stimulus set 2, F(2,44) = 6.67, ε = 1.00, p<.01, partial η2 = .23).

Figure 8. Proportion of time spent fixating either the eye or the mouth region in relation to the time subjects spent fixating the overall face (Experiment 2).

Mean proportions for fixations on the eye or mouth region are shown as a function of the position in the visual field and the facial expression. Results are depicted separately for a stimulus set with a similar saliency ratio of the eye as compared to the mouth region across facial expressions (stimulus set 1) and for a stimulus set with a saliency ratio of approximately 1 (stimulus set 2). The regions of interest that were used to define fixations in the eye or mouth region, respectively, are shown on the right side. Error bars indicate standard errors of the mean.

https://doi.org/10.1371/journal.pone.0041792.g008

Most interestingly, and corresponding to the saccadic data, we again observed a significant interaction between emotion and fixated facial feature for stimulus set 1 (F(2,46) = 8.84, ε = .79, p<.01, partial η2 = .28) and set 2 (F(1,22) = 4.93, p<.05, partial η2 = .18). Similar to Experiment 1, participants spent a larger amount of time fixating on the eye as compared to the mouth region when neutral or fearful faces were displayed but fixated the mouth relatively longer when faces displayed a happy emotional expression. This effect was similar across the different vertical placements and no significant triple interactions were observed. Finally, the interaction between vertical placement and emotional expression was significant for stimulus set 1 (F(4,92) = 2.76, ε = .96, p<.05, partial η2 = .11) indicating overall reduced fixations on the eye or mouth region of happy and neutral facial expressions when they were presented in the lower half of the display screen. All other effects did not reach statistical significance.

Discussion

This study aimed at examining whether facial features that are diagnostic of the current emotional expression are automatically processed irrespective of the task at hand and the position in the visual field. In Experiment 1, eye movements were recorded while participants accomplished an emotion classification, a gender discrimination or a passive (oddball) task. To examine whether facial expressions trigger fast, potentially reflexive, eye movements, half of all stimuli were presented briefly (150 ms) whereas the other faces were shown for 2000 ms. The initial fixation was controlled for by unpredictably shifting faces either down- or upwards on each trial such that participants initially fixated on the eye or mouth region, respectively. In Experiment 2, participants were instructed to classify emotional facial expressions of faces that were presented at different positions in the upper half, the middle or the lower half of the display screen.

In both experiments, the amount of time spent looking at the eye region distinctly exceeded the amount of time spent looking at the mouth. Moreover, subjects in Experiment 1 made few saccades leaving the eye region but shifted their gaze from the mouth to the eyes very often. This gazing pattern occurred even when saccades did not allow for extracting further information from the stimuli (i.e., for short presentation durations). Experiment 2 indicates that these results do not reflect a general bias for attending the upper visual field [22] but suggests that initial saccades target the facial feature that is nearest to fixation. These findings substantiate the conclusion that the eyes of a conspecific, even independent of the depicted emotional expression, provide important information that needs to be assessed quickly [23], [24]. Eye gaze is a crucial part of human non-verbal communication and interaction, as gaze cues signal the current focus of attention to the counterpart and facilitate interactions [25]. In line with this reasoning, healthy as well as autistic subjects begin exploring depicted faces significantly more often at the right or left eye than anywhere else [8]. These data suggest that the eyes play a crucial role in exploring and recognizing faces as well as in analyzing the emotional content of a particular expression. Our study further indicates that the eye region of conspecifics is processed preferentially across different facial expressions, positions in the visual field and experimental tasks.

Our second main finding was that the amount of fixation changes towards the eyes depended on the diagnostic relevance of this facial feature for decoding the emotional expression [6]. This finding was stable across all tasks, presentation times and positions in the visual field. Subjects showed a higher preference for the eyes when viewing fearful or neutral expressions but tended to shift their gaze more often towards the mouth when a happy face was presented. Similar effects were also obtained for fixation durations in both experiments: subjects spent more time looking at the eyes of neutral or fearful faces and fixated relatively longer on the mouth region when happy expressions were shown. This indicates that an evaluation of diagnostic emotional information takes place irrespective of experimental conditions. As gaze shifts depended on the particular diagnostic facial feature even when faces were shown very briefly in Experiment 1, this might imply that the processing of emotional expressions takes place preattentively [26], [27]. Such a preattentive emotion processing mechanism driving subsequent eye movements might be relevant for quickly and accurately determining another person's emotional state. In human interactions, it is useful to continuously monitor a counterpart's feelings and intentions even if the current situation (i.e., "task") does not require such information, because the interaction might quickly shift to a topic that requires a precise assessment of them.

Additional analyses using a computational model of bottom-up visual attention [13], [14] indicate that both above-mentioned results, the preferential scanning of the eye region as well as the enhanced processing of diagnostically relevant facial features, could not be explained by low-level image properties. A reanalysis of the data of Experiment 1 only for stimuli with a comparable saliency ratio in the eye compared to the mouth region revealed results highly similar to the original analysis. Furthermore, comparable results were found with the specifically tailored stimulus set that was used in Experiment 2. Remarkably, participants fixated the eye region 3 to 4 times longer than the mouth region even for faces with a similar saliency in both parts of the face (stimulus set 2).

Interestingly, participants performed at ceiling in the experimental tasks of both experiments. Thus, even when the faces were only shown for 150 ms (Experiment 1) and subjects were unable to change their gaze from the initial fixation to a different facial feature, they were almost perfect in determining the gender or the emotional expression. This finding indicates that participants were able to perceive the face as a whole (which is also a necessary precondition for showing subsequent saccades toward specific diagnostic features) within this short time duration. Since we additionally refrained from using backward masking, face processing could continue even after stimulus offset. The observed early recognition of emotional expressions is in line with studies showing high face recognition rates after presentation times well below 200 ms [28] as well as larger amplitudes of early event-related brain potentials (N170) for facial displays containing diagnostic as compared to anti-diagnostic features [29]. Taken together, emotional expressions seem to be detected very early even without redirecting the gaze toward diagnostic features [30]. However, the observed pronounced tendency to shift the gaze to facial features that are diagnostic for the depicted emotional expression, which was unaffected by task demands and partly occurred after stimulus offset, indicates that attention is automatically shifted to such diagnostic features even when these saccades do not contribute to explicit emotion recognition.

To further characterize to what degree the observed effects rely on featural as compared to configural or holistic processing, it might be worthwhile to conduct future experiments with upright and inverted faces. In general, holistic processing seems to contribute to face perception since inverted faces are usually recognized more slowly than upright faces or other inverted objects [31]. However, this so-called face-inversion effect was found to be substantially reduced when directing participants' gaze toward the eye region [32]. Moreover, face inversion does not generally impair emotion recognition [33], [34], which indicates that even for inverted faces, a similar pattern of gaze changes as in the current study might be obtained.

With respect to the neurobiological underpinnings of the present results, recent studies suggest that differential attention to facial features, and specifically to the eye region, may be mediated by the amygdala. For example, it was shown that amygdala damage impairs spontaneous gaze fixation on the eye region of facial stimuli presented in an experimental setting [35] but also during free conversations [36]. We recently demonstrated that amygdala activation in healthy individuals predicted the amount of gaze changes toward the eye region of fearful faces [11]. Furthermore, a strong positive correlation between fusiform gyrus as well as amygdala activation and time spent fixating the eyes of happy, fearful, angry and neutral faces was shown in autistic subjects [37]. The amygdala can thus be robustly activated by a variety of facial expressions [38], but activations seem to be largest when the eye region is crucial for determining the emotional state of the counterpart (e.g., for fearful facial expressions [39]–[41]). These findings indicate that the strong preference for processing the eye region of others that was also found in the current study might rely on an involvement of the amygdala in detecting salient facial features in the visual periphery and directing attention toward them [42]. It remains an interesting question for future research whether the relatively enhanced attention to the mouth region of happy faces that was found in the current study also relies on amygdala functioning. Benuzzi and colleagues recently reported increased amygdala activity to whole faces as compared to parts of faces displaying neutral expressions and suggested that the amygdala might be involved in orienting attention toward socially relevant facial parts [43]. Transferred to the current study, it might be reasoned that diagnostic emotional features constitute such relevant parts and, therefore, it remains an intriguing possibility that the amygdala also mediates directed attention toward these features. Of course, this hypothesis needs to be tested by future studies utilizing, for example, neuroimaging methods.

The current study revealed a very consistent pattern of preferentially scanning the eye region and attending to diagnostic emotional facial features across several experimental conditions. The robustness of these findings in healthy individuals suggests that an application of the current experimental paradigm might also be advantageous in patient groups. For example, it is still debated whether patients with autism spectrum disorders scan (emotional) faces differently than healthy observers [7], [10]. Reduced attention to the eye region in these patients has been linked to amygdala hypoactivation [44], and the variability in fixating the eye region within this group was found to be correlated with amygdala activity [37]. Using an emotion classification task comparable to that of Experiment 1, it was recently demonstrated that individuals with autism spectrum disorders show enhanced avoidance of the eye region for briefly presented emotional expressions [45]. In addition, the experimental paradigm of Experiment 1 offers the unique possibility to examine the time course of social attention in such patients in more detail by differentiating fast, potentially reflexive, eye movements from the subsequent scanning behavior that presumably is under conscious control. Moreover, it remains an interesting idea for future research to determine whether the degree of bottom-up attentional capture in patients with autism spectrum disorders differs from that in healthy controls. The computation of saliency maps and the comparison of eye movements to these low-level image statistics might be a first step in this direction.

A second clinical condition that seems to be characterized by abnormal face scanning is social anxiety disorder. Under free viewing conditions, these patients tend to avoid fixating the eye region of conspecifics [46]. Instead, they show hyperscanning, as reflected by increased scanpath lengths and short fixation periods, also on non-diagnostic features such as hair or ears [47]. Interestingly, these patients show hyperactivation of the amygdala [48]–[50]. In line with the hypothesized functional role of the amygdala outlined above, one may speculate that patients with social phobia initially show enhanced reflexive gaze shifts toward the eye region but subsequently avoid scanning this feature to reduce the emerging fear of being observed and evaluated by others. This hypothesis can be addressed with the experimental paradigm of Experiment 1: patients with social phobia would be expected to show a large number of initial gaze shifts toward the eye region, whereas for longer viewing durations, this pattern should change into enhanced attention to non-diagnostic features and an active avoidance of the eye region.

To sum up, our study clearly underlined the importance of another person’s eye region in driving gazing behavior. Moreover, we showed that diagnostic facial features of emotional expressions are preferentially processed irrespective of task demands, position in the visual field and low-level image statistics. These gaze preferences were evident very early after stimulus onset and occurred even when saccades did not allow for extracting further information from the stimuli. Thus, they may result from an automatic detection of salient facial features in the visual field.

Author Contributions

Conceived and designed the experiments: ES CB MG. Performed the experiments: ES MG. Analyzed the data: ES MG. Contributed reagents/materials/analysis tools: MG. Wrote the paper: ES CB MG.

References

  1. De Martino B, Kalisch R, Rees G, Dolan RJ (2009) Enhanced processing of threat stimuli under limited attentional resources. Cereb Cortex 19: 127–133.
  2. Palermo R, Rhodes G (2007) Are you always on my mind? A review of how face perception and attention interact. Neuropsychologia 45: 75–92.
  3. Hanawalt NG (1944) The role of the upper and the lower parts of the face as a basis for judging facial expressions: II. In posed expressions and “candid-camera” pictures. J Gen Psychol 31: 23–36.
  4. Gosselin F, Schyns PG (2001) Bubbles: a technique to reveal the use of information in recognition tasks. Vision Res 41: 2261–2271.
  5. Schyns PG, Bonnar L, Gosselin F (2002) Show me the features! Understanding recognition from the use of visual information. Psychol Sci 13: 402–409.
  6. Smith ML, Cottrell GW, Gosselin F, Schyns PG (2005) Transmitting and decoding facial expressions. Psychol Sci 16: 184–189.
  7. Rutherford MD, Towns AM (2008) Scan path differences and similarities during emotion perception in those with and without autism spectrum disorders. J Autism Dev Disord 38: 1371–1381.
  8. Hernandez N, Metzger A, Magné R, Bonnet-Brilhault F, Roux S, et al. (2009) Exploration of core features of a human face by healthy and autistic adults analyzed by visual scanning. Neuropsychologia 47: 1004–1012.
  9. Pelphrey KA, Sasson NJ, Reznick JS, Paul G, Goldman BD, et al. (2002) Visual scanning of faces in autism. J Autism Dev Disord 32: 249–261.
  10. Spezio ML, Adolphs R, Hurley RSE, Piven J (2007) Abnormal use of facial information in high-functioning autism. J Autism Dev Disord 37: 929–939.
  11. Gamer M, Büchel C (2009) Amygdala activation predicts gaze toward fearful eyes. J Neurosci 29: 9123–9126.
  12. Vinette C, Gosselin F, Schyns PG (2004) Spatio-temporal dynamics of face recognition in a flash: it’s in the eyes. Cognitive Sci 28: 289–301.
  13. Itti L, Koch C (2000) A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Res 40: 1489–1506.
  14. Itti L, Koch C (2001) Computational modelling of visual attention. Nat Rev Neurosci 2: 194–203.
  15. Lundqvist D, Flykt A, Öhman A (1998) The Karolinska Directed Emotional Faces - KDEF, CD ROM.
  16. Ekman P, Friesen WV (1976) Pictures of facial affect. Palo Alto, CA: Consulting Psychologists Press.
  17. Ebner NC, Riediger M, Lindenberger U (2010) FACES - a database of facial expressions in young, middle-aged, and older women and men: development and validation. Behav Res Methods 42: 351–362.
  18. Gamer M, Zurowski B, Büchel C (2010) Different amygdala subregions mediate valence-related and attentional effects of oxytocin in humans. Proc Natl Acad Sci U S A 107: 9400–9405.
  19. Itti L, Koch C, Niebur E (1998) A model of saliency-based visual attention for rapid scene analysis. IEEE Trans Pattern Anal Mach Intell 20: 1254–1259.
  20. Previc FH, Blume L (1993) Visual search asymmetries in three-dimensional space. Vision Res 33: 2697–2704.
  21. Levine MW, McAnany JJ (2005) The relative capabilities of the upper and lower visual hemifields. Vision Res 45: 2820–2830.
  22. Pflugshaupt T, von Wartburg R, Wurtz P, Chaves S, Déruaz A, et al. (2009) Linking physiology with behaviour: Functional specialisation of the visual field is reflected in gaze patterns during visual search. Vision Res 49: 237–248.
  23. Emery N (2000) The eyes have it: the neuroethology, function and evolution of social gaze. Neurosci Biobehav Rev 24: 581–604.
  24. Langton S, Watt R, Bruce V (2000) Do the eyes have it? Cues to the direction of social attention. Trends Cogn Sci 4: 50–59.
  25. Baron-Cohen S (1995) The eye direction detector (EDD) and the shared attention mechanism (SAM): Two cases for evolutionary psychology. In: Moore C, Dunham PJ, editors. pp. 41–59. Hillsdale, NJ: Erlbaum.
  26. Pizzagalli D, Regard M, Lehmann D (1999) Rapid emotional face processing in the human right and left brain hemispheres: an ERP study. Neuroreport 10: 2691–2698.
  27. White M (1995) Preattentive analysis of facial expressions of emotion. Cogn Emot 9: 439–460.
  28. Genetti M, Khateb A, Heinzer S, Michel CM, Pegna AJ (2009) Temporal dynamics of awareness for facial identity revealed with ERP. Brain Cogn 69: 296–305.
  29. Joyce CA, Schyns PG, Gosselin F, Cottrell GW, Rossion B (2006) Early selection of diagnostic facial information in the human visual cortex. Vision Res 46: 800–813.
  30. Eimer M, Holmes A (2007) Event-related brain potential correlates of emotional face processing. Neuropsychologia 45: 15–31.
  31. Valentine T (1988) Upside-down faces: A review of the effect of inversion upon face recognition. Br J Psychol 79: 471–491.
  32. Hills PJ, Ross DA, Lewis MB (2011) Attention misplaced: the role of diagnostic features in the face-inversion effect. J Exp Psychol Hum Percept Perform 37: 1396–1406.
  33. Lipp OV, Price SM, Tellegen CL (2009) No effect of inversion on attentional and affective processing of facial expressions. Emotion 9: 248–259.
  34. Arnold DH, Lipp OV (2011) Discrepant integration times for upright and inverted faces. Perception 40: 989–999.
  35. Adolphs R, Gosselin F, Buchanan TW, Tranel D, Schyns P, et al. (2005) A mechanism for impaired fear recognition after amygdala damage. Nature 433: 68–72.
  36. Spezio ML, Huang P-YS, Castelli F, Adolphs R (2007) Amygdala damage impairs eye contact during conversations with real people. J Neurosci 27: 3994–3997.
  37. Dalton KM, Nacewicz BM, Johnstone T, Schaefer HS, Gernsbacher MA, et al. (2005) Gaze fixation and the neural circuitry of face processing in autism. Nat Neurosci 8: 519–526.
  38. Fitzgerald DA, Angstadt M, Jelsone LM, Nathan PJ, Phan KL (2006) Beyond threat: amygdala reactivity across multiple expressions of facial affect. NeuroImage 30: 1441–1448.
  39. Morris JS, Friston KJ, Büchel C, Frith CD, Young AW, et al. (1998) A neuromodulatory role for the human amygdala in processing emotional facial expressions. Brain 121: 47–57.
  40. Morris JS, Frith CD, Perrett DI, Rowland D, Young AW, et al. (1996) A differential neural response in the human amygdala to fearful and happy facial expressions. Nature 383: 812–815.
  41. Whalen PJ, Kagan J, Cook RG, Davis FC, Kim H, et al. (2004) Human amygdala responsivity to masked fearful eye whites. Science 306: 2061.
  42. Adolphs R (2008) Fear, faces, and the human amygdala. Curr Opin Neurobiol 18: 166–172.
  43. Benuzzi F, Pugnaghi M, Meletti S, Lui F, Serafini M, et al. (2007) Processing the socially relevant parts of faces. Brain Res Bull 74: 344–356.
  44. Baron-Cohen S, Ring HA, Bullmore ET, Wheelwright S, Ashwin C, et al. (2000) The amygdala theory of autism. Neurosci Biobehav Rev 24: 355–364.
  45. Kliemann D, Dziobek I, Hatri A, Steimke R, Heekeren HR (2010) Atypical reflexive gaze patterns on emotional faces in autism spectrum disorders. J Neurosci 30: 12281–12287.
  46. Horley K, Williams LM, Gonsalvez C, Gordon E (2003) Social phobics do not see eye to eye: a visual scanpath study of emotional expression processing. J Anxiety Disord 17: 33–44.
  47. Horley K, Williams LM, Gonsalvez C, Gordon E (2004) Face to face: visual scanpath evidence for abnormal processing of facial expressions in social phobia. Psychiatry Res 127: 43–53.
  48. Straube T, Kolassa I-T, Glauer M, Mentzel H-J, Miltner WHR (2004) Effect of task conditions on brain responses to threatening faces in social phobics: an event-related functional magnetic resonance imaging study. Biol Psychiatry 56: 921–930.
  49. Straube T, Mentzel H-J, Miltner WHR (2005) Common and distinct brain activation to threat and safety signals in social phobia. Neuropsychobiology 52: 163–168.
  50. Phan KL, Fitzgerald DA, Nathan PJ, Tancer ME (2006) Association between amygdala hyperactivity to harsh faces and severity of social anxiety in generalized social phobia. Biol Psychiatry 59: 424–429.