First impressions have a profound effect on our everyday lives. We use them to determine who we should approach and who we should avoid. They can be a deciding factor in mate choice, trustworthiness judgments, and hiring decisions. Moreover, there is evidence that they may influence court decisions (Zebrowitz & McDonald, 1991; Zebrowitz & Montepare, 2008), election results (Olivola & Todorov, 2010; Verhulst, Lodge, & Lavine, 2010), and professional evaluations (Ambady & Rosenthal, 1993). A growing number of studies are examining the way in which we quickly and automatically make trait impressions of others and use that knowledge (Cloutier, Kelley, & Heatherton, 2010; Uleman, Saribay, & Gonzalez, 2008; Van Overwalle, 2009; Van Overwalle & Labiouse, 2004), but few (Harvey, Fossati, & Lepage, 2007; Mitchell, Macrae, & Banaji, 2004; Schiller, Freeman, Mitchell, Uleman, & Phelps, 2009) have examined the conditions under which we remember these impressions. This is surprising, because the memory of these impressions has the capacity to influence our future actions. Though current research suggests that we are experts at forming quick, automatic impressions, little is known about the processes that support retaining these impressions in long-term memory.

Though much can be learned about first impressions from behavioral measures, an investigation of the factors that influence first-impression formation and the corresponding neural underpinnings would allow one to ask more nuanced questions about forming impressions and their storage in memory. Accumulating evidence in the memory literature has suggested that the broad distinction between the neural substrates supporting semantic, episodic, and procedural memory may also extend to distinct classes of elaborative semantic encoding processes, perhaps including those in the social domain. From their review of the literature, Macrae and colleagues suggested that largely disparate neural networks are activated during the successful formation of memories in response to verbal, visual, emotional, and self-referential processing, consistent with the idea that different processes contribute to the formation of distinct varieties of episodic memories (Macrae, Moran, Heatherton, Banfield, & Kelley, 2004). Recent investigations have upheld this division, in particular as it relates to the processing of social and emotional information (Gutchess, Kensinger, & Schacter, 2010; Haas & Canli, 2008; Harvey et al., 2007; Mitchell et al., 2004). Though the hippocampus plays a key role in the encoding network for memory for many classes of information, additional disparate brain regions support specific subcategories. Thus, as in a comparable system that aids in encoding emotional information into memory (Schacter, Gutchess, & Kensinger, 2009), there may be a dedicated system for encoding first impressions (and more broadly, social information) into memory. Given how important social interaction is to the human condition, we would expect to find evidence for the contributions of a social cognition network to the encoding of first impressions into memory.

Applying a social neuroscience approach to understand how people form impressions of others advances our understanding of the component processes and the feedforward and feedback loops that shape our perceptions of others (for reviews, see Ames, Fiske, & Todorov, 2011; Rule & Ambady, 2008). A number of neural regions respond to impression formation, reflecting the complexity of the processes involved and the interconnectedness of the network that allows impressions to be invoked so instantaneously. These regions include the amygdala, which responds to emotional and evaluative conditions (Schiller et al., 2009) and to appearance-based cues, such as trustworthiness (Said, Baron, & Todorov, 2009; Winston, Strange, O’Doherty, & Dolan, 2002); the caudate nucleus, which responds to reward and feedback (Delgado, Nystrom, Fissell, Noll, & Fiez, 2000); the superior temporal sulcus, which responds to others’ intentions (Saxe, Xiao, Kovacs, Perrett, & Kanwisher, 2004); and the fusiform gyrus, which is invoked by face processing (Winston, Henson, Fine-Goulden, & Dolan, 2004). The dorsal medial prefrontal cortex (dmPFC), a region of the frontal cortex, has been particularly implicated in impression formation, as well as in a wide array of social processes (Amodio & Frith, 2006; D’Argembeau et al., 2007; Harvey et al., 2007; Macrae et al., 2004; Mitchell, Cloutier, Banaji, & Macrae, 2006; Mitchell, Macrae, & Banaji, 2004, 2005, 2006). Furthermore, virtually all studies that have investigated first impressions in their manifestation as trait judgments have implicated the dmPFC (Mason, Dyer, & Norton, 2009; Mitchell, 2008; Mitchell, Ames, Jenkins, & Banaji, 2009; Mitchell, Cloutier, et al., 2006; Mitchell, Macrae, & Banaji, 2004, 2005, 2006; Todorov, Gobbini, Evans, & Haxby, 2007; the exception is Heberlein & Saxe, 2005, whose control was an emotional task, which might also engage dmPFC). 
Not only does the dmPFC respond to social information, but it also mediates the encoding of first impressions into memory (Mitchell et al., 2004). In Mitchell et al.’s (2004) study, participants read sentences depicting actions that were paired with a picture of a face. Participants were asked either to form an impression of the face–sentence pair or to remember the sequence of the presented actions. When participants formed an impression, dmPFC activity was higher for face–statement pairs that were later remembered than for those that were later forgotten. However, activity in this region did not predict successful encoding when participants were oriented to the sequence of statements. In contrast, activity in the right hippocampus predicted successful encoding for the sequencing task, but not for the impression task. Mitchell et al.’s (2004) study suggests that successful encoding of social information, but not of nonsocial information, may be mediated by the dmPFC.

The dmPFC not only contributes to the encoding of impressions but is also more engaged under specific conditions of impression formation. Mitchell, Cloutier, et al. (2006) found that the dmPFC was sensitive to the nature of the information on which an impression was based. Participants viewed face–sentence pairs with sentences depicting actions that were either diagnostic for a specific trait (i.e., behaviors that conveyed information that allowed one to form an impression, such as generous, boring, lazy, or friendly) or nondiagnostic (i.e., behaviors that did not strongly indicate a specific personality trait, such as “he photocopied the article”). When participants were told to form an impression, the dmPFC was engaged, but the activity did not differ for diagnostic versus nondiagnostic information, suggesting that the region is highly engaged when forming impressions, regardless of the content of the information. However, when impression formation was unintentional (i.e., participants focused on the sequences in which statements were presented), the dmPFC response differentiated diagnostic from nondiagnostic trials. This suggests that when impressions are formed incidentally, the region selectively and spontaneously responds to information that supports impression formation, and that it can be engaged when people consciously attempt to form an impression, even when the available information does not indicate a particular trait.

Taken together, these studies highlight the important role that the dmPFC plays in encoding first impressions. However, the conditions under which the region contributes to impression formation are little understood. Mitchell, Cloutier, et al.’s (2006) surprising finding that the dmPFC does not differentiate between diagnostic and neutral information when people explicitly attempt to form impressions (although this is not the case for implicit impression formation) suggests that an individual’s goals and the type of information with which he is presented have implications for how impressions are stored in memory for future use. Importantly, this study did not test memory, so it is unknown whether these differential effects also influence the encoding of trait impressions into memory.

In addition to the dmPFC, one may expect to find that structures implicated in memory formation, such as the hippocampus and medial temporal lobes, would also support memory for impressions. However, the hippocampus did not emerge in a study investigating the encoding of impressions, and in fact was only implicated in forming memories for the nonsocial-sequencing comparison condition (Mitchell et al., 2004). This is consistent with the patient literature: Patients with damaged hippocampi are able to learn trait associations to faces, whereas patients with damage to the amygdalae and temporal lobes are not (Todorov & Olson, 2008). Circumstantial support also comes from patient H.M., who exhibited some semantic knowledge of famous people postoperatively (O’Kane, Kensinger, & Corkin, 2004). In the neuroimaging literature, however, it is unclear whether Mitchell et al.’s (2004) use of a sequencing task is the best nonsocial comparison task with which to assess the contribution of the hippocampus to the encoding of impressions, because the hippocampus is implicated in sequencing, regardless of memory demands (Eichenbaum, 2004; Lehn et al., 2009). A comparison task that is not known to be hippocampally dependent may be more appropriate to test the involvement of the dmPFC and the hippocampus in the successful formation of memories.

Thus, we can make a number of predictions regarding the function of a dedicated social memory system. We would expect this system to respond differentially on the basis of intentionality, such that intentionally formed impressions would contribute to encoding success differently than would incidentally formed impressions. Though some research has suggested that intentionality does not affect the processing of trait impressions (Todorov & Uleman, 2003; J. Willis & Todorov, 2006), other research has suggested that additional neural regions are recruited during intentional impression formation, perhaps reflecting broader consideration of the available information in order to confirm initial impressions (Ma, Vandekerckhove, Van Overwalle, Seurinck, & Fias, 2011). Explicit effort supports the ability to recognize emotional facial expressions, although implicit conditions reveal impairments in patients with lesions to the orbitofrontal cortex (M. L. Willis, Palermo, Burke, McGrillen, & Miller, 2010). Given the importance of the orbitofrontal cortex to emotional processing, this finding suggests that explicit judgments recruit and utilize additional resources as compared to implicit impression formation. Mitchell, Cloutier, et al. (2006) also found differences in dmPFC activation in implicit but not explicit person perception, suggesting that additional resources may be recruited when explicitly forming impressions even for nondiagnostic events. Thus, we expect that information that easily lends itself to forming trait impressions (e.g., diagnostic information) would engage dmPFC structures, as well as contributing to encoding success more than neutral information does.

Thus, our study aimed to investigate the neural mechanisms supporting the encoding of trait impressions into memory. Through it, we hoped to assess the importance of one’s state of mind (intention to form an impression) and of the type of information that one is presented with (whether diagnostic of a particular trait or not) when making trait impressions, as well as the way in which these factors influence the social cognition network in the brain.

Method

Participants

A group of 22 participants (11 male, 11 female) completed the study (mean age = 21.8, SD = 3.28) in exchange for payment of $25/h. Two of the participants were excluded from the analysis (1 male, 1 female), one for excessive movement in the scanner (>10 mm), and the other for near-chance performance (52.3% across all conditions) on the task. All participants signed informed consent forms and were screened for fMRI eligibility, including right-handedness; English learned before the age of 8; good neurological, psychological, and physical health; and the absence of medications that affect the central nervous system and devices or implants contraindicated for magnetic resonance scanning. The protocol was approved by the Brandeis University and Massachusetts General Hospital Institutional Review Boards.

Materials and procedure

The encoding stimuli consisted of 240 face–sentence pairs, with half of the faces (120) female and half male. The faces had neutral expressions and were evenly distributed across four different age groups, ranging from 18 to 89, and each face was displayed once during encoding. The faces were color photographs selected from the Center for Vital Longevity Face Database: https://pal.utdallas.edu/facedb/request/index

The first independent variable was the type of encoding task performed by the participants: impression based or semantic based. During the encoding phase, participants were asked to make one of two judgments regarding each face–sentence pair. One task emphasized the semantic nature of the behavior, in which we asked participants to indicate whether the action the person performed took place at home or away from home (SEM); this served as a comparison task that was not highly social or evaluative. In the other task, participants were asked to think of a trait that described the person depicted in the face–sentence pair and to decide whether this trait was positive or negative (IMP), a highly social task.

The content of the sentences served as the second independent variable. Half of the sentences (120) were diagnostic, in that they implied one of 24 traits (e.g., boring); 12 of these traits were positive, and 12 negative (see Footnote 1). The remaining 120 sentences were “neutral” and described an action that did not readily lend itself to a trait judgment, such as “He made a peanut butter and jelly sandwich.” All of the diagnostic sentences had previously been used in studies by Mitchell, Macrae, and Banaji (2004, 2006), and the neutral ones were created by the experimenters. Each sentence appeared below a face during the encoding phase (see Fig. 1). Participants were not asked to intentionally memorize the face–sentence pairs.

Fig. 1 Experimental displays at encoding and retrieval. (I) Four stimuli are displayed, representing the four encoding conditions, either impression (a and c) or semantic (b and d), denoted by the question posed at the top of the screen. Each face was paired with either a diagnostic (a and b) or a neutral (c and d) sentence, determined by the content of the sentence at the bottom of the screen. Faces were matched across all conditions for gender and age. Stimuli were presented without the condition-identifying boxes shown here in the top right of panels a–d. (II) Participants indicated with a buttonpress (left or right) which face they had previously seen paired with the sentence.

Participants performed the task during six functional runs, each lasting 6 min and consisting of 40 face–sentence pairs, with 20 male and 20 female targets. The faces were equally distributed across each age group. Face–sentence pairs were pseudorandomly assigned to either the SEM (nonsocial) or IMP (social) condition, for a total of four experimental conditions: SEM–diagnostic, SEM–neutral, IMP–diagnostic, and IMP–neutral. Each face–sentence pair was presented for 6 s, during which time the participants made a response with a buttonpress to indicate “home” or “not home,” for SEM judgments, or “positive” or “negative,” for IMP judgments (see Fig. 1). The 6-s trial length was necessary because the reaction times to longer sentences were approximately 5 s, due to the amount of information to be processed (e.g., condition label, face, sentence). Baseline trials in which participants saw a fixation cross were intermixed throughout the run. Trials were presented in a jittered design, with intertrial intervals ranging from 0 to 12 s, and were ordered using optseq2 (Dale, 1999).
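The timing figures in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the fixation time per run is inferred as the remainder of the run, and the even split of pairs across the four conditions is an assumption, since neither is reported directly.

```python
# Design figures taken from the Method section.
N_PAIRS = 240        # face-sentence pairs shown at encoding
N_RUNS = 6           # functional runs
TRIAL_S = 6          # seconds per face-sentence trial
RUN_S = 6 * 60       # each run lasted 6 min
N_CONDITIONS = 4     # (SEM, IMP) x (diagnostic, neutral)

trials_per_run = N_PAIRS // N_RUNS             # pairs shown in each run
task_s_per_run = trials_per_run * TRIAL_S      # seconds spent on task trials

# Assumption: all run time not spent on task trials is fixation/jitter.
fixation_s_per_run = RUN_S - task_s_per_run

# Assumption: pairs are divided evenly among the four conditions.
pairs_per_condition = N_PAIRS // N_CONDITIONS

print(trials_per_run, task_s_per_run, fixation_s_per_run, pairs_per_condition)
# prints: 40 240 120 60
```

Under these assumptions, roughly one third of each run is available for jittered fixation.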

Stimuli were assigned to conditions in a within-subjects design. Four counterbalanced versions were used in order to account for stimulus-specific effects. Across participants, the face–sentence pairs originally coupled with the SEM (nonsocial) task were recoupled with the IMP (social) task, and vice versa, and new face–sentence pairs were formed in an attempt to reduce face-specific effects. Both pairs were counterbalanced for trait (i.e., faces previously paired with diagnostic statements were now paired with neutral statements, and vice versa) and gender (i.e., sentences paired with a female face were switched so that they were paired with a male face; see Footnote 2).

Between the encoding and retrieval phases was an approximately 12-min retention interval, during which time structural images of the brain were acquired while the participants did not perform any task. During the retrieval phase, participants saw two faces and a target sentence that had previously been paired in the encoding phase with one of the faces (see Fig. 1 bottom). Participants chose the face that they remembered as having been paired with the target sentence during encoding and indicated their response with a buttonpress during the 6-s trial interval. As in the encoding phase, participants completed 40 trials during each of the six runs; each run lasted 6 min. Each face was presented twice during recognition, once as the correct answer and once as a lure, and whether each particular face appeared first as a correct answer or a lure was counterbalanced across participants. The first three retrieval runs contained novel face pairs, whereas in the remaining runs the same pairings of faces were repeated with the remaining sentences. Face pairs were always from the same age group and were matched on gender. In addition, faces were matched across conditions (e.g., a face originally encoded in the IMP condition was paired with a lure also encoded in the IMP condition) but were not matched in terms of sentence diagnosticity. Runs were also balanced with regard to encoding conditions, such that face–sentence pairs from all conditions present at encoding were included in each encoding run. The retrieval data will be analyzed in a separate study.

Prior to the encoding and retrieval stages of the experiment, participants practiced the task with additional face–sentence pairs. Participants also completed measures to assess cognitive ability, including the Vocabulary subscale of the Shipley Institute of Living Scale (Shipley, 1986) and the digit comparison task (Hedden et al., 2002; modeled after Salthouse & Babcock’s, 1991, letter comparison task) to measure speed of processing. The experimental task was presented using E-Prime software (Psychology Software Tools, Pittsburgh, PA), which also recorded participants’ yes/no responses and reaction times from a button box.

Image acquisition and data analysis

The data were acquired with a Siemens Avanto 1.5 T scanner, using an echo-planar imaging sequence (TR = 2,000 ms, TE = 40 ms, FOV = 200 mm, flip angle = 90°) to acquire 26 AC/PC-oriented slices 3.2 mm thick with a 10% skip. Slices covered most of the cortex, with the exception of the dorsal portion of the parietal lobes and the ventral portion of the temporal lobes. Stimuli were back-projected onto a screen behind the scanner and viewed by the participants using a mirror attached to the head coil. High-resolution anatomical images were acquired using a multiplanar rapidly acquired gradient echo (MP-RAGE) sequence.

The preprocessing and data analysis were conducted with SPM5 (Wellcome Trust Centre for Neuroimaging, London, UK). Functional images were slice-time corrected, realigned to the first image in order to correct for motion, normalized to the Montreal Neurological Institute template, resampled to 3-mm cubic voxels, and spatially smoothed using a 6-mm full-width half maximum isotropic Gaussian kernel (Dale, 1999).

We used a subsequent-memory (Dm) paradigm in order to sort our imaging results according to the success of memory formation (Wagner et al., 1998). In a Dm paradigm, one separates encoding trials on the basis of the success of memory formation (i.e., whether the trials were later remembered or forgotten at the time of retrieval). Thus, we used the recognition data to distinguish all face–sentence pairs that were incorrectly identified at retrieval and binned these as “forgotten.” We then binned all of the face–sentence pairs that were correctly identified at retrieval as “remembered.” This analysis approach, applied to the encoding data, allows one to identify brain regions that are more engaged during the successful encoding of information (i.e., remembered > forgotten), in what will be referred to as a Dm effect.
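The binning step of the Dm paradigm can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function name, the dictionary keys (`pair_id`, `task`, `sentence`), and the handling of trials without a retrieval record are all hypothetical.

```python
from collections import defaultdict


def bin_by_subsequent_memory(encoding_trials, retrieval_correct):
    """Sort encoding trials into remembered/forgotten bins per condition.

    encoding_trials: list of dicts with hypothetical keys
        'pair_id', 'task' ('IMP' or 'SEM'), 'sentence' ('diagnostic' or 'neutral').
    retrieval_correct: dict mapping pair_id -> True if the correct face
        was chosen at retrieval, False otherwise.
    """
    bins = defaultdict(list)
    for trial in encoding_trials:
        pair_id = trial["pair_id"]
        # Assumption: trials with no retrieval record are left unbinned.
        if pair_id not in retrieval_correct:
            continue
        memory = "remembered" if retrieval_correct[pair_id] else "forgotten"
        bins[(trial["task"], trial["sentence"], memory)].append(pair_id)
    return dict(bins)


# Toy data for illustration only.
toy_encoding = [
    {"pair_id": 1, "task": "IMP", "sentence": "diagnostic"},
    {"pair_id": 2, "task": "IMP", "sentence": "diagnostic"},
    {"pair_id": 3, "task": "SEM", "sentence": "neutral"},
]
toy_retrieval = {1: True, 2: False, 3: True}
toy_bins = bin_by_subsequent_memory(toy_encoding, toy_retrieval)
```

A Dm effect then corresponds to the contrast remembered > forgotten computed on the encoding-phase signal within the bins of interest.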

To model each participant’s data, events were convolved with a canonical hemodynamic response function in an event-related design. A total of eight regressors were created for the combination of conditions: orientation at encoding (IMP/SEM), sentence diagnosticity (diagnostic/neutral), and subsequent memory (remembered/forgotten). A ninth regressor was included for participants who had trials with no response. Regressors were also created to model the six separate runs. Contrasts of interest were defined using these regressors of interest and then estimated for each participant’s fMRI data.
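The eight condition regressors follow from crossing the three two-level factors. The short sketch below enumerates them; the regressor labels and the helper `regressors_for` are illustrative names only and are not taken from the SPM5 model used in the study.

```python
from itertools import product

ORIENTATION = ("IMP", "SEM")
DIAGNOSTICITY = ("diagnostic", "neutral")
MEMORY = ("remembered", "forgotten")

# 2 x 2 x 2 = 8 condition regressors (labels are hypothetical).
condition_regressors = [
    "_".join(levels) for levels in product(ORIENTATION, DIAGNOSTICITY, MEMORY)
]


def regressors_for(has_no_response_trials):
    """Return regressor names for one participant.

    A ninth regressor is added only when the participant had
    trials with no response, as described in the text.
    """
    names = list(condition_regressors)
    if has_no_response_trials:
        names.append("no_response")
    return names


print(len(condition_regressors))  # prints: 8
```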

In addition to the regressors stemming from the study design, we introduced a parametric regressor when modeling the data to control for the effects of face attractiveness. Face attractiveness has the capacity to influence first impressions, most notably by adding to a general positive impression (Langlois et al., 2000) as well as activating the social network (such as dmPFC; Ishai, 2007; Liang, Zebrowitz, & Zhang, 2010). To control for these effects, we added an attractiveness rating for each face as a parametric regressor. Ratings were based on the average attractiveness rating provided by a separate sample of 15 participants (9 male, 6 female) in the laboratory. Participants responded to the prompt of “how attractive is this person?” on a 1–7 scale.
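One way to build such a parametric regressor is sketched below: ratings are averaged across raters for each face and then assigned to trials in presentation order. Mean-centering the values is an assumption on our part (a common convention for parametric modulators), as are the function name and the toy face identifiers.

```python
from statistics import mean


def attractiveness_modulator(ratings_by_face, trial_faces):
    """Build a per-trial attractiveness value from independent ratings.

    ratings_by_face: dict mapping face id -> list of 1-7 ratings,
        one per rater in the separate rating sample.
    trial_faces: face ids in the order the trials were presented.
    """
    face_mean = {face: mean(r) for face, r in ratings_by_face.items()}
    values = [face_mean[face] for face in trial_faces]
    # Assumption: the modulator is mean-centered across trials.
    centre = mean(values)
    return [value - centre for value in values]


# Toy ratings from three hypothetical raters.
toy_ratings = {"f1": [4, 5, 3], "f2": [6, 6, 6], "f3": [2, 3, 1]}
toy_modulator = attractiveness_modulator(toy_ratings, ["f1", "f2", "f3"])
```

With the toy ratings, the face means are 4, 6, and 2, so the centered values are 0, 2, and -2.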

We then pooled the relevant contrasts across participants in a series of random-effects whole-brain group analyses. These were thresholded at p < .001 at the voxel level and with a spatial extent threshold of 10 voxels. Note that this threshold surpasses an overall correction level of p < .05, which can be achieved through the combination of a voxel-level correction of p < .001 and a cluster-level threshold of 7 voxels. To estimate the cluster-level threshold, we used a script that determined the number of contiguous voxels required to achieve an overall correction of p < .05, on the basis of the parameters of the data (e.g., slice thickness) and the selected voxel-level threshold (as in Slotnick, Moo, Segal, & Hart, 2003). We highlight regions in medial prefrontal cortex (particularly BA 10/32/8) because prior studies have identified these regions as relevant for the encoding of social information (Amodio & Frith, 2006; Harvey et al., 2007; Mitchell, Cloutier, et al., 2006; Mitchell et al., 2004; Schiller et al., 2009). We also focused on regions of the medial temporal lobes (MTL) that have been broadly implicated in explicit encoding processes (Macrae et al., 2004; Squire, 2004), although, to date, little evidence has associated MTL with the encoding of social information. By comparing two different types of tasks (IMP/SEM), we have the potential to distinguish the involvement of MTL regions in the encoding of social stimuli during conditions conducive to impression formation, relative to more semantic or knowledge-based conditions. This may be a sensitive comparison with which to reveal a role for this region in the encoding of social stimuli or to further support the separation of a “social memory system” from an MTL-based explicit memory system.
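The combination of a voxel-level threshold and a spatial extent threshold described at the start of this paragraph can be illustrated with a small connected-components sketch. It is not the SPM5 procedure or the cluster-threshold script referred to above: the function name is hypothetical, the input is a toy dictionary of p values, and 6-connectivity (voxels sharing a face) is an assumption.

```python
from collections import deque

# Assumption: voxels are neighbours only if they share a face (6-connectivity).
NEIGHBOUR_OFFSETS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def surviving_clusters(p_values, voxel_p=0.001, min_extent=10):
    """Return clusters of suprathreshold voxels meeting the extent threshold.

    p_values: dict mapping (x, y, z) voxel indices -> uncorrected p value.
    """
    suprathreshold = {v for v, p in p_values.items() if p < voxel_p}
    visited = set()
    clusters = []
    for start in suprathreshold:
        if start in visited:
            continue
        # Breadth-first search over contiguous suprathreshold voxels.
        cluster = {start}
        visited.add(start)
        queue = deque([start])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOUR_OFFSETS:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in suprathreshold and neighbour not in visited:
                    visited.add(neighbour)
                    cluster.add(neighbour)
                    queue.append(neighbour)
        if len(cluster) >= min_extent:
            clusters.append(cluster)
    return clusters


# Toy map: a 12-voxel line, one subthreshold voxel, and an isolated 2-voxel pair.
toy_map = {(x, 0, 0): 0.0005 for x in range(12)}
toy_map[(12, 0, 0)] = 0.01
toy_map[(20, 0, 0)] = 0.0005
toy_map[(21, 0, 0)] = 0.0005
toy_clusters = surviving_clusters(toy_map)
```

In the toy map only the 12-voxel line survives; the 2-voxel pair passes the voxel-level threshold but fails the 10-voxel extent threshold.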

To explore memory effects in regions that responded to information relevant for impression formation, we first identified regions responding to the social relevance of the information as an effect of the orientation of the participant (IMP vs. SEM) or the content of the information (diagnostic vs. neutral) in whole-brain contrasts. We identified medial prefrontal and MTL regions emerging from these analyses, and in a second step, probed these regions for orthogonal effects of memory by focusing on interactions or main effects of memory (remembered/forgotten). This allowed us to assess the extent to which regions involved in the processing of social information also respond to memory formation. To characterize the activity in each individual condition, we used MarsBaR (Brett, Anton, Valabregue, & Poline, 2002) to extract percent signal change from each region of interest (ROI) based on the three factors of interest (intentionality, diagnosticity, memory), each with two levels: Participants formed first impressions with an intentional focus on social information (IMP) or incidentally with a more semantic, nonsocial focus (SEM); the sentences were either diagnostic or neutral; and each pair was either remembered or forgotten.

As an additional test of the involvement of MTL in the encoding of social information, we conducted an a priori ROI analysis on MTL regions. To do so, we created anatomical masks of MTL regions, including hippocampal, parahippocampal, and amygdala regions, using PickAtlas software (Maldjian, Laurienti, Kraft, & Burdette, 2003). We then applied these masks to the SPM analyses. While this ROI analysis overlapped with the whole-brain approach described earlier, it offered a more lenient test to detect the contribution of MTL regions to the encoding of social information, regions that have not emerged in the literature to date.

Results

Behavioral results

We conducted a 2 × 2 repeated measures ANOVA to probe the effects of sentence diagnosticity (diagnostic/neutral) and task orientation (IMP/SEM) on memory performance. We found a main effect for sentence diagnosticity, such that diagnostic (M = 69.7%, SD = 8.2%) face–sentence pairs were better remembered than neutral (M = 64.4%, SD = 8.4%) face–sentence pairs, F(1, 19) = 21.4, p < .001, ηp² = .53. Surprisingly, memory performance was not significantly better in the IMP condition (M = 67.9%, SD = 8.9%) than in the SEM condition (M = 66.2%, SD = 8.5%), F(1, 19) = 0.67, p = .42, ηp² = .03. The interaction was also not significant (F < 1). See the data in Fig. 2.
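The partial eta-squared values reported here can be recovered from the F ratios and their degrees of freedom using the standard identity: partial eta squared equals F times the effect degrees of freedom, divided by that product plus the error degrees of freedom. The sketch below applies it to the two main effects; the function name is ours.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of sentence diagnosticity, F(1, 19) = 21.4
print(round(partial_eta_squared(21.4, 1, 19), 2))  # prints: 0.53

# Main effect of task orientation, F(1, 19) = 0.67
print(round(partial_eta_squared(0.67, 1, 19), 2))  # prints: 0.03
```

Both values match the effect sizes reported in the text.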

Fig. 2 Memory effects. The behavioral results indicated that diagnostic face–sentence pairs (leftmost bars) were better remembered than neutral ones (rightmost bars). ** p < .001

fMRI results

As expected, our neuroimaging analysis contrasting task orientation (IMP > SEM) identified neural regions previously implicated in person evaluation, particularly medial prefrontal cortex. This contrast yielded brain regions that responded more to intentionally formed impressions with a social focus (IMP) than to incidentally formed impressions (SEM) and suggested that intentional impression formation evokes more social (e.g., medial prefrontal), emotional (e.g., orbitofrontal and insula), and semantic (temporal lobes) processing than does incidental impression formation (Table 1). Of the regions that emerged in the IMP > SEM contrast, we selected two ROIs in medial prefrontal regions, along with anterior cingulate cortex, for further analysis of memory effects (bolded in Table 1), because in previous research these regions had shown robust activation in response to person evaluation, particularly during successful encoding of information into memory (Harvey et al., 2007; Mitchell, 2008; Schiller et al., 2009). No mPFC or MTL regions emerged from the contrast of SEM > IMP.

Table 1 Montreal Neurological Institute coordinates of neural activations for contrasts of the social impression formation (IMP) versus the nonsocial semantic (SEM) conditions

A region in dmPFC (3, 30, 42; see Fig. 3) near those previously implicated in person evaluation was the only region to exhibit effects of encoding success. The region was initially selected from the random-effects analysis because it revealed a main effect of intentionality (verified in the ROI analysis: F(1, 19) = 22.06, p < .001, ηp² = .53), with greater deactivation for the SEM (nonsocial) condition than for the IMP (social) condition. In addition, there was an orthogonal main effect of diagnosticity, F(1, 19) = 331, p < .001, ηp² = .94, with greater deactivation for diagnostic (M = −0.88%, SD = 0.39) than for neutral sentences (M = −0.36%, SD = 0.35). In addition, we found an interaction between intentionality and memory, F(1, 19) = 4.55, p < .05, ηp² = .19, such that a Dm effect emerged for the IMP condition but not for the SEM condition. To characterize the nature of the memory effects across conditions, we conducted two-way ANOVAs separately for the IMP (social) and SEM (nonsocial) conditions with the factors Diagnosticity (diagnostic/neutral) and Memory (remembered/forgotten). For intentionally formed impressions (IMP), a main effect of memory emerged, F(1, 19) = 6.34, p < .03, ηp² = .25, such that the dmPFC deactivated more for remembered than for forgotten sentences. For incidentally formed impressions (SEM), we did not find a main effect of memory (p = .47). None of the other ROIs selected in the contrast of IMP > SEM reached significance in ANOVAs with the Diagnosticity or Memory factors.

Fig. 3 Subsequent memory effects in dmPFC. A region in the dmPFC (3, 30, 42), which emerged from the contrast of impression formation (IMP, social) > semantic (SEM, nonsocial), showed an interaction of orientation with memory. R stands for “remembered,” and F for “forgotten.” The region contributed to memory formation for the IMP condition, but not the SEM condition.

The contrast of diagnostic and neutral revealed a number of regions suggesting classic memory networks (e.g., hippocampus) and social processes (e.g., mPFC) in addition to the lingual and fusiform gyri, likely consistent with increased processing time for diagnostic sentences (as discussed in Mitchell, Cloutier, et al., 2006). As in Mitchell, Cloutier, et al. (2006), we found activity in the dmPFC associated with person perception. We performed ANOVAs on medial prefrontal and MTL regions (bolded in Table 2), but none of the ROIs returned significant main effects or interactions involving memory.

Table 2 Montreal Neurological Institute coordinates of neural activations for the contrasts of diagnostic (DIAG) versus neutral (NEU)

In the a priori ROI analyses of MTL regions, based on anatomical masks, no significant effects involving memory emerged in the contrast of IMP > SEM or diagnostic > neutral in the restricted volumes. To further probe this region, we contrasted remembered versus forgotten collapsed across all conditions, and also separately estimated the contrast for the IMP (social) and SEM (nonsocial) conditions. These analyses also failed to reveal significant effects.

Finally, we pooled a direct contrast of memory (remembered/forgotten), diagnosticity (diagnostic/neutral), and intentionality (IMP/SEM) across participants as a whole-brain test of any interactions among all three factors. This exploratory analysis did not return any regions that achieved significance.

Discussion

Our neuroimaging results reveal two primary findings about the memory system engaged in the encoding of social information, one of which is related to encoding, the other to processing social information. First, our findings converge with previous data indicating that social information is encoded by a distinct memory system, even when compared with another person-centered condition. However, we did identify some selectivity to the role of the social memory system, in that it is primarily engaged when encoding impressions formed intentionally with a social focus (IMP), rather than implicitly (SEM). Second, we found that the diagnosticity of impression information affected the engagement of the social system, but that this effect occurred regardless of the success of memory formation and, contrary to previous findings, regardless of intentionality. Interestingly, our results did not indicate a role for medial temporal regions in the encoding of social information, further indicating the potential for the social memory system to rely on mechanisms distinct from other types of explicit memory. These findings will be discussed in turn.

Our first result suggests that the dmPFC supports encoding of first impressions when intentionally trying to form impressions, but not when incidentally forming impressions. Previous studies have highlighted Dm effects for social versus nonsocial information (Harvey et al., 2007) or for encoding intentionally formed first impressions versus memorizing order (Mitchell et al., 2004). However, both of these studies contrasted against control conditions that did not require participants to make evaluative, person-centered judgments. Therefore, the previous results may reflect the act of evaluation rather than a dedicated social process per se, or may reflect differences in the attention devoted to evaluating a single individual. By contrasting a social task (form impression, IMP) with a person-centered and evaluative task emphasizing semantic rather than social components (judging the location where a behavior occurred, SEM), we showed in the present study that when forming and encoding first impressions intentionally, the dmPFC is recruited. We believe that this result advances our current knowledge because it better characterizes the involvement of the dmPFC in social evaluation, as well as highlighting its role in encoding first impressions into memory.

Somewhat surprisingly, this result is based on differences in deactivations, as opposed to previous studies in which activations have been found (Mitchell et al., 2004). Closer scrutiny of our neural data for remembered and forgotten trials across both the intentional- and incidental-encoding tasks reveals that the differences across conditions seem to be driven by the forgotten rather than by the remembered trials. We believe this pattern of deactivation is in line with the activity of the “task-negative” network of the brain, or the “default mode” network (Buckner, Andrews-Hanna, & Schacter, 2008; Raichle et al., 2001). Previous studies have found that activity of the default mode network hampers encoding, such that deactivating it would support better memory performance (Daselaar et al., 2009). While posterior regions, such as posterior cingulate, precuneus, and bilateral ventral posterior parietal cortices, have emerged more consistently in the literature, some work has implicated anterior regions as well, such as anterior cingulate and medial prefrontal cortices, such that deactivation supports successful encoding (Kim, Daselaar, & Cabeza, 2010).

Though the previous studies did not implicate dmPFC as part of the task-negative network, it is possible that the social nature of our stimuli and the unique neural regions engaged to encode this type of information account for these differences. Previous studies (Daselaar et al., 2009; Kim et al., 2010) focused on the encoding of words, scenes, and faces, but did not incorporate stimuli relevant to socioemotional goals. Under such conditions, dmPFC could be associated with the network of regions deactivated during encoding and activated during retrieval. Deactivating this network to support encoding processes implies that focusing on internal processes detracts from the ability to focus on external stimuli and successfully encode them. In our task, this could mean that focusing too much on internal cognition, perhaps retrieving autobiographical memories of familiar individuals with appearances similar to the target stimulus or creating associations based on facial features alone, impairs one’s ability to form a memory trace of the face–behavior association presented in the study (see Shrager, Kirwan, & Squire, 2008, for a discussion of similar effects in the hippocampus). This is consistent with our pattern of results showing less deactivation (i.e., greater activity) in the forgotten trials of the impression condition, perhaps reflecting a failure to inhibit distracting internal associations that hamper successful memory formation. Although one might expect to see a similar pattern for remembered versus forgotten trials in the semantic condition (incidentally forming impressions), it may be that internal information is most interfering when one is focused on impression formation. Thus, when trying to intentionally form impressions, inhibiting interference is important in order for one to later remember externally presented information leading to an impression. 
Future research explicitly testing the behavioral and neural effects of potential interference, such as from facial characteristics (e.g., Zebrowitz & Montepare, 2008), during the encoding of face–behavior pairings would help to resolve this question, particularly if the effects vary with the goals of the task (e.g., incidental vs. intentional impression formation).

Our second major finding is that diagnostic information that easily lends itself to forming an impression deactivates the dmPFC more than does neutral information. We would expect that diagnostic information would engage regions implicated in forming impressions more strongly than would neutral information, regardless of orientation. This is because diagnostic information lends itself more easily toward forming trait inferences (as shown by Uleman et al., 2008). However, Mitchell, Cloutier, et al. (2006) found that diagnostic information fails to engage the dmPFC more than does neutral information when one intentionally forms impressions, and that diagnosticity differentially affects neural engagement only when one incidentally (unconsciously) forms impressions. Mitchell, Cloutier, et al.’s interpretation was that when one is intentionally forming impressions, everything is “diagnostic” (even neutral information), but when unconsciously forming impressions, only diagnostic information activates frontal regions associated with forming impressions.

In contrast to their findings, we find that both intentional and incidental impression formation engage the dmPFC, such that it deactivates more for diagnostic than for neutral information. In conjunction with our finding of better memory for diagnostic than for neutral face–sentence pairs, our data lend support to the notion that increased depth of processing may be associated with dmPFC activity for social tasks. This idea is consistent with previous behavioral studies showing that depth of processing contributes to better memory (Craik & Lockhart, 1972; Craik & Tulving, 1975) and, in the social domain, that making complex judgments about people’s traits leads to better memory than do simple judgments of their sex (Bower & Karlin, 1974; Wenger & Ingvalson, 2002). Again, our pattern of greater deactivation, rather than activation, for diagnostic information is surprising, but perhaps reflects the mismatch between behavioral information and facial appearance, which is more salient for diagnostic than for neutral trials. Relative to prior studies that did not find this pattern, the need to integrate face and behavior could be more salient for one-shot impression formation tasks, as were employed in the present study.

Another possible explanation for differences from the prior studies may lie in the different designs employed across studies. Though both we and Mitchell et al. (2004) used similar statements in our studies, our procedure was very different, in that we presented each face once, paired with a single unique sentence, whereas Mitchell et al. (2004) presented each face paired with 10 different sentences. Pooling impression information across multiple trials may decrease the importance of diagnostic information on any single trial when intentionally forming an impression of an individual. Our use of a single actor–single behavior design may be more consistent with research on spontaneous trait inferences (STI), according to Uleman et al.’s (2008) claim that in order to generate the most robust STI, one must be presented with a single or very few related behaviors and integrate these with an actor representation. In contrast, “integrating meanings and/or evaluations of one target’s many behaviors is less likely to occur spontaneously and requires high levels of relevant chronic goals” (p. 333). This argument indicates that a more naturalistic setting, in which we form impressions based on a range of behaviors, is not ideal for forming a lasting, distinct first impression. With a single defined behavior, there is evidence that first impressions occur spontaneously, even in the absence of conscious effort to create an impression (Ambady, Krabbenhoft, & Hogan, 2006; Todorov & Uleman, 2002, 2003; Uleman et al., 2008).

Because our study presented only a single sentence for each person, this might have increased the perceived diagnosticity of each sentence, in contrast to studies with multiple sentences converging on a single trait. Distinct impression formation processes may be recruited when one is more concerned with updating and comparing impressions to current knowledge (e.g., Schiller et al., 2009), as opposed to forming initial impressions (as in our study). Although more research will be needed to elucidate the underlying processes that contribute to encoding first impressions, we believe that our finding highlights the sensitivity of dmPFC to diagnostic information, regardless of the state of mind one adopts. One would expect that a system that operates seemingly automatically, such as impression formation (Todorov et al., 2007; J. Willis & Todorov, 2006), would be sensitive to the type of information at all times, whether impressions are incidental or intentional.

Surprisingly, diagnosticity did not influence the role that the dmPFC played in encoding impressions. This is particularly unexpected because the behavioral memory measures indicated that diagnostic information was better remembered than neutral information. While this suggests that we are better at encoding impressions that are based on meaningful behavior than impressions that do not contain trait diagnostic information, one might also expect intentionality to impact memory (e.g., Mitchell et al., 2004). This was not the case for our data: Intentionally and incidentally formed impressions were encoded equally well. However, orientation did affect the engagement of dmPFC during successful versus unsuccessful encoding trials. This apparent inconsistency between the behavioral and neural measures of memory may reflect the greater sensitivity of neural measures in some instances, allowing us to reveal a contribution of intentionality using neural measures. However, this potential for greater sensitivity may rely heavily on the selectivity of particular regions for specific processes.

Our results are consistent with those of Mitchell et al. (2004) in indicating the lack of an MTL contribution to the encoding of social information. This is surprising given the pervasive nature of MTL contributions to explicit memory (Shrager et al., 2008; Squire, 2009; Tulving & Markowitsch, 1998), although a small body of literature has suggested that amnesics may be able to form new memories of impressions of others under some circumstances, despite MTL impairment (Johnson, Kim, & Risse, 1985; Todorov & Olson, 2008). Importantly, MTL regions that did emerge in our diagnostic > neutral contrast did not show a Dm effect. This was further tested using an anatomical MTL mask in order to have greater sensitivity to detect effects, and these results also indicated that MTL regions do not respond significantly or differentiate between social and nonsocial tasks or between information that is diagnostic or neutral. Although we are limited in the inferences that we can make based on a null finding, our data add to the growing evidence that social memory formation may not be MTL dependent.

While our focus was on mPFC and MTL contributions to the encoding of impressions, our analyses also probed effects of the orientation and content of information more broadly. While some of the regions implicated in intentional over incidental impression formation are consistent with prior studies (Schiller et al., 2009) that have suggested a role of emotion (e.g., insula) and face processing (e.g., fusiform) in forming impressions of others, our results also identify some novel regions, such as the anterior cingulate cortex (ACC). The ACC is known to be involved in decision making and conflict monitoring (Pochon, Riis, Sanfey, Nystrom, & Cohen, 2008), which might suggest that intentional impression formation involves deeper processing and may allow one to be better able to account for ambiguity or inconsistency (e.g., a pretty face engaged in an ugly behavior). Notably, the reverse contrast (SEM > IMP) did not produce any MTL regions, suggesting that our contrast was successful in contrasting social with nonsocial evaluation while avoiding activation of classic memory networks.

While we have achieved some success in differentiating conditions loading more heavily on social information, such as diagnosticity and intentional trait impressions, from those that invoke these processes less, it would have been helpful to have a true control condition that did not involve social information. This would have allowed us to further differentiate social from nonsocial information processing, potentially allowing us to more directly investigate the role of the MTL in encoding nonsocial and social information. In addition, the relatively lengthy trials might have resulted in some blurring of the intentional trait impression condition and the unintentional semantic condition. The amount of time available in which to deeply process information, as well as the interspersing of trial types within a participant, might have encouraged participants to form impressions even on the semantic trials. Such a possibility may account for the lack of memory differences across these conditions.

In conclusion, our finding that encoding first impressions relies on the dmPFC only when intentionally trying to form impressions highlights the importance of orientation and the unique role played by this region when intentionally forming first impressions. It adds to our current knowledge in that it shows that this is true not only in comparison to a nonsocial sequencing task, but also when compared to a more nuanced, person-centered evaluative task. In addition, we found a role for the dmPFC in processing diagnostic information that easily lends itself to impression formation, as compared to neutral information. This highlights the role of the region as being dedicated to forming first impressions, and it may indicate the importance of the task in engaging this region. Compared with previous findings, our results suggest that the processes differ when impressions are based on single versus multiple behaviors. Further investigations will likely highlight the types of diagnostic information that most impact impression formation, the effects of multiple versus singular impression formation, and the roles of these regions during retrieval of first impressions from memory.