Introduction

Individuation in primates primarily relies on face recognition. However, the contributions of developmental processes to face perception are relatively unclear. Recent evidence for an early developmental mechanism comes from case studies of bilateral cataracts1 and a face deprivation study in monkeys2. Holistic processing3, the hallmark of face processing, does not occur in humans born with bilateral cataracts, even after many years of exposure to human faces following the removal of the cataracts within 2 to 6 months of age1. Sugita2 showed that monkeys raised without exposure to faces displayed a preference for human and monkey faces, but soon after they had been exposed to one face class the perceptual system selectively tuned toward that class, resulting in difficulty in discriminating the non-exposed face class. These studies provide two major insights into the early developmental processing of faces: (a) the existence of a sensitive period early in life, when an inborn neural system decays away if the expected stimulation is absent soon after birth4, and (b) evidence for perceptual narrowing, when a broad ability at birth, e.g. the ability to discriminate monkey and human faces2, narrows to a subtype of stimulation, e.g. to the ability to discriminate monkey faces only. Perceptual narrowing typically tunes the recognition system towards the predominant race5 and species6. Sensitive periods early in life and perceptual narrowing are strongly suggestive of an innate component in face perception4; however, we here refer to these processes as early processes in face perception without implying genetic predetermination. On the other hand, it goes without saying that individuals with lifelong intense exposure to faces know more about faces than newborns do. Representations qualitatively change with experience7,8.
It has been shown that extensive training is sufficient to develop the ability to discriminate an object class for which no innate representation was present, e.g. dog experts classify dogs as fast at the subordinate level as at the basic level and are more likely to use subordinate-level labels in speeded naming tasks8,9. This ability is comparable to that of typical adults, who can identify faces (subordinate level) as quickly as they categorize them at the basic level (human)10. Thus, our perception does not merely passively describe the world; it is modulated by experience11,12,13. We refer to the process of perceptual learning as the late component of face perception. Each component, early or late, can by itself describe the processes of face perception to a great extent. Conceptually, an ability to do better in one class of faces than in the other with only a small amount of experience (in the first years of life) provides evidence for an early component, while an ability to do better with one class than the other with immense experience accumulated over a lifetime of exposure provides evidence for a late component in the development of face processing. In a naturalistic setting, disentangling these two components is challenging, since individuals are exposed to the same class of faces in early and late phases of their lives (e.g. conspecifics, own race). The other-species effect, in which the discrimination abilities of one species for conspecific faces are contrasted with those for non-conspecific faces14,15, reveals an advantage for the own species, but leaves the relative contribution of early and late components unanswered: this effect might be due to an early tuning towards conspecific face morphology, the high degree of exposure to conspecific faces, or both4. In other words, the participants in those studies were adults who had experienced human faces early as well as later in life.
Our approach to separating early and late components in the development of face perception is to investigate captive chimpanzees, a species with immense exposure to non-conspecific faces and with limited exposure to the own species' face. ‘Limited exposure’ refers to the number of individuals which chimpanzees are exposed to in the wild16 as well as the number of human individuals the chimpanzees in the current study are exposed to. Moreover, we test individuals at different stages of development to vary the amount of exposure to human faces. Assuming that an early component in development tunes the face perception system rapidly to the natural class of faces, here chimpanzee faces, we predict a relatively increased discrimination performance for chimpanzee as opposed to human faces in the young chimpanzee participants. Assuming that life-long exposure modulates the perceptual system along the critical dimensions of the faces to which it is predominantly exposed, here human faces, we predict a relatively increased, or an equal, discrimination performance for human as opposed to chimpanzee faces in the old chimpanzee participants. The aim of the study is to quantify and characterize the components and their interaction. We present a detailed description of how the early and late components drive the ability of face discrimination. We simulate the processes using Hebbian learning in an artificial neuron under the assumptions stated above, indicating that the face discrimination system maintains life-long adaptability to optimize its performance in the current environment.

Results

We evaluated the face discrimination abilities of six chimpanzees in two age groups (young chimpanzees (YC): 10.75 +/− 0.17 (s.d.) years; old chimpanzees (OC): 30.78 +/− 3.82 (s.d.) years) during presentations of familiar and unfamiliar chimpanzee and human face stimuli. The chimpanzees (three in each group) classified chimpanzee and human faces in a delayed matching-to-sample (DMS) task (Figure 1a). At the level of participants, we found a systematic modulation in discrimination performance (percent correct responses) between chimpanzee and human faces (Figure 1b, Figure S1a). OC showed better discrimination of human than chimpanzee faces, while YC showed the opposite trend. In a mixed model ANOVA (with stimulus class and age group as fixed factors and participants as random factor nested in age group), this trend is reflected in a significant interaction between the factors age group and stimulus class (F(1,11) = 29.09, p < .01, mean square = .016). There were no significant main effects for the factors age group (p = .94) and stimulus class (p = .17). Jarque-Bera tests were consistent with normally distributed samples in both age groups and stimulus classes (all p > 0.16). T-test comparisons revealed a significant difference between performance scores for chimpanzee and human faces in OC (T(2) = 4.58, p < .05) and YC (T(2) = 5.72, p < .05). Further, using a mixed model ANOVA (with stimulus class, age group and face familiarity as fixed factors and participants as random factor nested in age group), we found no significant modulation by face familiarity of the interaction between stimulus class and age group (F(1,23) = 6.21, p = .10, mean square = .01). To account for possible violations of the assumptions of parametric statistical comparisons due to the small sample size, we confirmed that the classification bias found at the group level is robust at the level of individual runs (i.e. 50 trials each) by using a linear classifier to predict the outcome of individual runs.
In detail, we paired the mean performances for chimpanzee face runs with those for human face runs, pooled the data of all participants and used support vector machines (SVM) to predict the age group (YC or OC) for each pair (Figure S1b; Supplementary material). This approach is conservative; nevertheless, it yielded the following results: detection rate (10-fold cross-validation): 77.64% (SEM: 0.11); accuracy: 76.9% (SEM: 0.09); precision: 78.15% (SEM: 0.14).
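The three reported scores can be read as standard confusion-matrix quantities. The following is a minimal sketch, assuming that "detection rate" corresponds to recall and that one age group (here arbitrarily OC) is treated as the positive class; the function name `binary_metrics` and the label strings are hypothetical and not taken from the original analysis.

```python
def binary_metrics(y_true, y_pred, positive="OC"):
    """Return (detection_rate, accuracy, precision) for two-class predictions.

    Assumption: 'detection rate' is interpreted as recall for the class
    named by `positive`; the original analysis may define it differently.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    detection_rate = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs) if pairs else float("nan")
    return detection_rate, accuracy, precision
```

In a cross-validated setting, these scores would be computed once per held-out fold and then averaged across folds.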

Figure 1

Face discrimination task and modulation by age.

a, Procedure. In each trial, a face picture of an individual (cue) was centrally presented on the display for 750 ms, followed by an inter-trial interval (ITI) of 500 ms and a presentation of two horizontally aligned face pictures: one of the same individual (match), but not the identical picture as the cue, and one of a different individual (distractor). Chimpanzees indicated their choice by touching either the match or the distractor stimulus. The correct answer (match) depicts the same individual as shown in the cue stimulus. b, Proportion of correct responses. Performance scores (correct trials / number of trials) were averaged according to age group (YC, OC) and stimulus class (chimpanzee (C, blue boxes), human (H, light red boxes)). Boxplots describe the samples of individual runs. Dots and text labels mark the mean performance of the participants.

In theory, differential similarity within the chimpanzee and human face stimuli could facilitate the discrimination of one class as opposed to the other, as discussed in the literature4. However, our results show a full cross-over interaction in which YC discriminate the chimpanzee better than human faces and OC discriminate the human better than the chimpanzee faces. Nevertheless, using similarity estimates, we further excluded that the main effect was driven by stimulus similarity (Figure S2). The approach of estimating similarity scores via Gabor jet filters is based on a model by Okada et al.17, designed for artificial face recognition and has been successfully implemented18. It also has been successfully introduced to experimental research14 and could serve as a future standard procedure to evaluate the contribution of similarity in any visual object discrimination task. The core principle of this model was inspired by a theoretical face recognition model19. In detail, the similarity among human and chimpanzee faces was evaluated by computing the Euclidean distance (1/(1+ Euclidean distance)) for pairs of human and chimpanzee pictures (for details, see method section). The similarity scores are shown in Figure S2d, illustrating the distribution of similarity scores for both face types as well as the overlapping region of shared similarity (indicated by the green lines). To estimate whether similarity can account for the main effect, we applied a very conservative test, focusing on the performance values within the shared similarity scores: If similarity does not account for the main effect, a differential effect for chimpanzee and human stimuli must occur independent of the similarity scores. Therefore performance scores of the two face types must still be separable when the analysis was restricted to the trials within the range of equal mean similarities of chimpanzee and human stimuli. 
Indeed, the performance values in the shared range of similarity scores reflect the main effect (Figure S2e). Using a mixed model ANOVA (with stimulus class and age group as fixed factors and participants as random factor nested in age group) this trend is reflected in a significant interaction between the factors age group and stimulus class (F(1,11) = 26.17, p < .01, mean square = .015) at the level of participants. There were no significant main effects for the factors age group (p = .93) and stimulus class (p = .10). Thus, despite the physical similarity differences between the two face types, we here demonstrate that the overall performance for the two face types was not affected. We exclude similarity as the main component contributing to the discrimination performance presented in this experiment.

In the following, we attempt to further characterize early and late developmental processes. Under the assumption that only the unique history of exposure of the chimpanzee participants to the two classes of faces can account for the differences in human and chimpanzee face discrimination, we simulate a neuronal adaptation process of the face processing system with parameters chosen to be in accordance with the participants' real-life exposure to conspecific and human faces. The model predicts the time courses of late and early components and thus puts the experimental observation in a broader context. We estimated the exposure toward chimpanzee and human faces for 35 years of life and simulated how the neural system adapts to the continuously changing environment. We used a simplified neuronal model simulating the adaptation of the face processing area to a greater exposure to one class of faces (chimpanzee faces) at the beginning of the learning period and a gradual change over time into greater exposure to a second class of faces (human faces) having distinct features with respect to the first class. The model aims to create an exposure condition to class 1 and class 2 which is comparable to the condition of the chimpanzee participants. The discrimination ability for the faces of class 1 and class 2 was tested in each learning step on 10 newly drawn individuals from the respective distributions (comparable to the DMS task). The performance of the neural system at time t was evaluated as the variance of the neural output y on these test sets (for details, see Methods section). The results show a clear tendency in “early childhood” to discriminate pattern type 1 to a greater extent than pattern type 2 (Figure 2a). “Over the years” of exposure to pattern type 2, we observe a gradual switch towards better discrimination of those exemplars as opposed to exemplars of pattern type 1 (Figure 2a,b).
These time courses allow the early and late developmental processes to be characterized. We quantified them by contrasting the specific chimpanzee scenario with the natural scenario16, i.e. equal exposure to one class exclusively over a lifetime. We calculated the natural scenario by setting the constant exposure of class one to 100 and the variable exposure to 1. Contrasting the chimpanzee scenario with the natural scenario reveals the relative contribution of the late component, a slow process of specialization (time constant of 11.55 (95%-CI: 11.37, 11.73) years; Figure 2e). The best fitting function describing these results is $y_{\mathrm{late}}(t) = \alpha\,(1 - \exp(-t/\tau_{\mathrm{late}}))$, with $\tau_{\mathrm{late}}$ being the time constant in years ($\tau_{\mathrm{late}}$ = 11.55 (95%-CI: 11.37, 11.73)) and $\alpha$ being the performance towards infinity ($\alpha$ = 0.52 (95%-CI: 0.51, 0.52)). The relative contribution of the early component of face perception can be described by contrasting the later phase of the natural scenario (steps 1500 to 3500) with the earlier phase, revealing a rapid increase of discrimination performance (Figure 2d), which was fitted by the exponential function $y_{\mathrm{early}}(t) = -\beta \exp(-t/\tau_{\mathrm{early}})$, with $\tau_{\mathrm{early}}$ = 1.54 years (95%-CI: 1.52, 1.57) and $\beta$ = 0.13 (95%-CI: 0.13, 0.13). We set the parameters of the model in accordance with the experimental conditions (see Methods section) and found that the model could qualitatively describe our findings from chimpanzees well. The exact time courses, however, were not constrained by the data and therefore reflect the assumptions and internal dynamics of the model simulation. The time courses can thus be seen as testable model predictions.
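To make the two fitted time courses concrete, the following sketch simply evaluates them with the point estimates reported above (confidence intervals are ignored). The function and constant names are illustrative only, and t is assumed to be measured in years from the start of exposure.

```python
import math

# Point estimates reported in the text (95% confidence intervals omitted).
ALPHA, TAU_LATE = 0.52, 11.55    # asymptote and time constant (years), late component
BETA, TAU_EARLY = 0.13, 1.54     # amplitude and time constant (years), early component


def y_late(t):
    """Late component: saturating exponential rising from 0 towards ALPHA."""
    return ALPHA * (1.0 - math.exp(-t / TAU_LATE))


def y_early(t):
    """Early component: starts at -BETA and relaxes exponentially towards 0."""
    return -BETA * math.exp(-t / TAU_EARLY)


if __name__ == "__main__":
    for years in (0.0, TAU_EARLY, TAU_LATE, 35.0):
        print(f"t = {years:5.2f} y  late = {y_late(years):.3f}  early = {y_early(years):.3f}")
```

After one time constant each process has covered about 63% of its total change, which is why the early component is essentially complete within a few years while the late component is still changing after a decade.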

Figure 2

Simulation using simplified neuronal model.

a, Face discrimination performance of a neuronal model following the Oja-learning rule42, trained on numeric values representing face features. Performances from 200 simulation runs are plotted as a function of time (yrs.), showing a decrease for class 1 (blue) (equivalent to chimpanzee faces) and a gradual increase for class 2 (red) (equivalent to human faces). Solid lines indicate the means, dashed lines the standard errors. The raw data is plotted in light colors. b, The distributions of the input feature space of the two classes. Lines indicate the approximation from class 1 (blue) to class 2 (red) in time (yrs). The dots indicate the distributions of exemplars of the two classes from which individual exemplars were selected during the simulation. The neuronal model gradually aligns the coding dimension (weight vector) from class 1 to class 2. c, Illustration of the number of constantly and variably presented exemplars of both classes (y-axis logarithmically scaled) over time. d, e, Relative contribution of processes in early and late phases of development. The relative contribution of early and late components to face perception, as suggested by this simulation, is shown as a function of time. The closest fitting function is indicated by the solid line; the raw data is plotted in light colors.

Discussion

We presented a systematic modulation of face discrimination abilities for two classes of faces (chimpanzee and human) in chimpanzee participants by age. We showed better performance for the natural class of faces (chimpanzee) in YC than in OC and a better performance for human faces in OC than in YC. We claimed that these results reflect a distinctive involvement of early and late developmental processes, i.e. a perceptual narrowing process early in life that tunes the system towards the natural category of faces and a perceptual learning process that selectively drives the perceptual system towards the morphological specifications of the faces dominantly exposed to. We further support the claim by illustrating the development of face perception in a computational simulation under the assumption that this process is governed by neural adaption of the face discrimination system to retain optimal discrimination performance in its current environment. We quantified early and late developmental contribution mathematically. This data is in accordance with the psychological framework face feature space20,21,22, which is a flexible construct that undergoes experimental and, as shown here, experience-dependent changes.

In line with the presented results, an earlier study found an equally strong or stronger discrimination performance for non-conspecific faces14. Chimpanzees with plentiful exposure to humans and chimpanzees were equally good at discriminating both types of faces; however, chimpanzees from a different primate center with intense exposure to humans, but limited exposure to chimpanzees, showed an advantage for human over chimpanzee faces. Despite the conceptual importance of that study and the coherence of its findings with ours, its results were not entirely conclusive in the given context. Here, we account for potential confounding factors. First, the quality of exposure: the social relevance to the participant could contribute to the discrimination advantage for the other species. In ref. 14, the chimpanzee participants consisted of two groups from different primate centers following different levels and styles of human-chimpanzee interaction, leading to a different level of social relevance of chimpanzee and human faces across the two groups. In contrast, our participants received an equal amount of caretaking and established qualitatively equal relationships to humans. However, we cannot quantify the quality of exposure, and hence we cannot entirely rule out that remaining differences in quality of exposure among individuals contributed to the findings; under the given circumstances, though, it is unlikely that the effects are solely driven by quality of exposure. Second, representations of conspecific faces due to early developmental or innate processes might generalize to morphologically close human faces4. In the study by Martin-Malivel and Okada14 and in our study this factor can be ruled out, since a complete cross-over to better discrimination of human faces as opposed to chimpanzee faces seems highly unlikely based on similarity to an innate template for conspecific faces.
Hence, morphological similarity cannot be the cause of our findings; instead, only the history of exposure towards one or the other type of faces can account for our results. In addition, the simulation showed that the face discrimination system adapts to the feature dimensions of the face type to which it is most dominantly exposed and that morphological similarity explains a minor “recognition benefit” for the faces which the perceptual system is not tuned to at a given point in time (Figures 1b, 2a): the model is capable of generalizing to some extent, given the amount of overlap between the distributions of the two face classes.

Our chimpanzee experiments suggest that an early component, such as perceptual narrowing, tunes the face processing system towards conspecific faces through initial exposure (YC), which is in accordance with child development studies showing that 3- and 6-month old children are able to discriminate faces of non-experienced races and monkeys while 9-month-old children and adults have lost this ability6,23. However, even after an initial period of perceptual narrowing (early component), the system responds flexibly upon changeable aspects in later life (OC), such as extensive exposure towards a novel category (late component). In the chimpanzee scenario, the initial, rather short tuning period has a strong effect on the perception of the participants, but cannot fully counter the influences of long-lasting tuning in their later lives. In the case of two competing face categories this results in a reduced discrimination performance of the initially acquired face category. This is in accordance with the findings that adult Koreans living among Caucasians since mid-childhood demonstrated a reverse other-race effect24. In the natural scenario, the early component builds a repertoire of conspecific face representations, while the late component serves the fine-tuning leading to the ability of very precise individuation among conspecific faces.

These findings provide insights into a series of interesting issues. (1) Are conspecific and non-conspecific faces processed in the same recognition system? It has been shown that subordinate-level categorization can be achieved in areas not assigned to face processing and without relying on a holistic or configural type of processing25; instead, areas involved in object processing seem to be involved more strongly than areas involved in face processing26,27. On the other hand, in macaques, face patches elicited by presentations of human faces overlap to a great extent with those elicited by presentations of macaque faces28. In the current study, we show that a recognition system processing both types of faces can explain the experimental findings in chimpanzees by adapting the neural weights to select the current subset of (facial) features for optimal discrimination. Logically, the adaptation leads to an optimal performance for one class of faces and a decreased performance for the other class of faces. The amount of both increase and decrease depends on the amount of shared features, and hence on the morphological similarity between the classes. This is in line with electrophysiological findings that face individuation relies on neurons tuned to different subsets of features, with neurons peaking at one extreme of a feature dimension (monotonic tuning)22. (2) Theories in object recognition predict a switch from one processing system to another, e.g. the representations change according to the processing mechanisms (holistic versus part-based)29, memory representations change30,31, or the decision processes change due to holistic effects32. However, the transition from part-based to configural or holistic processing has been shown to occur gradually33. Thus, it might well be that, instead of a representational change, the acquisition of expertise is a process of fine-tuning within the same representational system.
The current case supports this assumption: it seems unlikely that a previous expert class, such as the chimpanzee faces for the OC, which turns into a non-expert class, is reassigned to a different neural substrate. Again, a more plausible explanation is a readjustment of synaptic weights from the distribution of one face class to the distribution of a novel face class (Figure 2b).

We here described a specific case of a species with life-long exposure to non-conspecific faces and only little exposure to conspecific faces. In our model we described how the neural machinery for face perception adapts to the given environment. Hence, we can predict a general trend of development at any point in time. However, as pointed out, the exact time courses are predictions and will depend on assumptions about the model as well as the model parameters. If our assumptions are correct, a shift from a discrimination advantage for chimpanzee faces to one for human faces occurs around the age of 15. This remains to be investigated. What seems clear is that the neural system for face perception stays plastic even after initial early tuning (perceptual narrowing) toward the conspecific face class. However, this later developmental process has long-lasting, but strong re-tuning characteristics and can fully overwrite the initial tuning.

In contrast to face exposure in humans, the exposure of captive chimpanzees to any class of faces can be estimated more precisely. Furthermore, their unique exposure to human and chimpanzee faces makes the captive chimpanzees at the Primate Research Institute of Kyoto University a suitable model to study early and late developmental processes of face perception. Given the phylogenetic proximity of chimpanzees and humans and the shared principles of visual processing34,35, this study goes beyond face perception in chimpanzees and is transferable to the perceptual system of humans.

Methods

Participants

Six chimpanzees (Pan troglodytes; 1 male adolescent, 2 female adolescents and 3 female adults; YC: 10.75 +/− 0.17 (s.d.), OC: 30.78 +/− 3.82 (s.d.) years) participated in the study. The age nomenclature was determined according to commonly assumed standards in the literature36,37. The chimpanzees are socially housed in a group of 14 individuals in outdoor (770 m2) as well as indoor compounds containing environmental enrichment. They are exposed to human faces with and without mouth masks on a daily basis (i.e. researchers, caretakers, visitors, media, construction workers, etc.). They have been engaged in various types of computer-controlled perceptual–cognitive tasks in the past, also involving faces of both chimpanzees and humans. All chimpanzees participated in pairs (mother and offspring).

Stimuli

Black-and-white pictures of 20 chimpanzee and 20 human individuals were used. For both chimpanzee and human faces, half of the faces were familiar and half were unfamiliar to the participants. For each individual, two pictures were taken at different time points. All faces were normalized for luminance and contrast and placed in an image canvas of 533 × 702 pixels (at 40 cm distance approximately 10.7 × 14.25 degrees of visual angle). Generally, chimpanzees recognize and discriminate individuals based on black-and-white pictures of faces, as shown in previous studies14,38.

Apparatus

Stimuli were presented on a 17-inch LCD touch panel display (1280 × 1024 pixels) controlled by custom-written software under Visual Basic 2010 (Microsoft Corporation, Redmond, Washington, USA). Chimpanzee participants sat in two connected experimental booths (each 2.5 m wide, 2.5 m deep, 2.1 m high), with the experimenter and the participants separated by transparent acrylic panels. The display was embedded into the acrylic panel. The distance between the display and the participants was approximately 40 cm. One degree of gaze angle corresponded to approximately 0.7 cm on the screen at a 40 cm viewing distance. Responses were given by touching the display surface with a finger. The display was protected from deterioration by a transparent acrylic panel fitted with an armhole (10 × 47 cm) allowing hand contact with the display. Below the display, a food tray was installed into which pieces of food reward were delivered by a custom-designed feeder. Display and feeder were controlled by a personal computer.
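The stated correspondence between gaze angle and on-screen extent follows from standard viewing geometry. The sketch below, with hypothetical helper names, uses the usual relation size = 2 d tan(θ/2) and assumes a flat screen viewed head-on at distance d.

```python
import math


def size_on_screen_cm(angle_deg, distance_cm):
    """On-screen extent subtended by a visual angle at a given viewing distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle subtended by an on-screen extent at a given viewing distance."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


if __name__ == "__main__":
    # One degree at the 40 cm viewing distance reported above.
    print(round(size_on_screen_cm(1.0, 40.0), 3))
```

At 40 cm, one degree comes out at just under 0.7 cm, in agreement with the approximate value given in the text.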

Procedure

We used a delayed matching-to-sample (DMS) paradigm. In each trial, a face picture of an individual (cue) was presented for 750 ms, followed by an inter-trial interval (ITI) of 500 ms and a presentation of two face pictures, one of the same individual (match) and one of a different individual (distractor). The spatial separation between the match and the distractor pictures was 20 mm. Identities of faces as well as the positions of match and distractor were counterbalanced across the whole sequence of trials. We divided the sequence into runs of 50 trials and alternated between runs of chimpanzee and human stimulus presentations. The chimpanzees completed 84.17 +/− 12.89 runs with chimpanzee faces and 86 +/− 13.77 runs with human faces over 32 +/− 6.5 days of experimental testing. Given the total set of faces, this is equivalent to 2.77 +/− 0.42 repetitions of stimulus pairs for the chimpanzee faces and 2.82 +/− 0.45 repetitions of stimulus pairs for the human faces.

Data analysis

The analyses were performed using Matlab (Mathworks Inc., Natick, MA, USA). The dependent variable was error rates. Trials with response times faster than 200 ms or slower than 4000 ms were excluded. An analysis of variance among the participants was conducted using a mixed model ANOVA with stimulus class and age group as fixed factors and participants as random factor nested in age group. In addition, we conducted an analysis using support vector machines (SVM) and linear classification based on pairs of sessions with human and chimpanzee stimuli.
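The response-time exclusion and the per-run score can be sketched as follows. This is an illustration rather than the original Matlab analysis: the trial representation (dictionaries with `rt_ms` and `correct` keys) and the function name are hypothetical, and treating the 200 ms and 4000 ms bounds as inclusive is an assumption.

```python
def proportion_correct(trials, rt_min_ms=200, rt_max_ms=4000):
    """Proportion of correct trials after excluding too-fast and too-slow responses.

    `trials` is a list of dicts with keys 'rt_ms' (response time in ms) and
    'correct' (bool). Returns None if no trial survives the exclusion.
    """
    # Assumption: responses exactly at the bounds are kept.
    kept = [t for t in trials if rt_min_ms <= t["rt_ms"] <= rt_max_ms]
    if not kept:
        return None
    return sum(1 for t in kept if t["correct"]) / len(kept)


if __name__ == "__main__":
    run = [
        {"rt_ms": 150, "correct": True},    # excluded: faster than 200 ms
        {"rt_ms": 500, "correct": True},
        {"rt_ms": 900, "correct": False},
        {"rt_ms": 4500, "correct": True},   # excluded: slower than 4000 ms
        {"rt_ms": 1200, "correct": True},
    ]
    print(proportion_correct(run))  # 2 of the 3 retained trials are correct
```

The error rate for a run is then one minus this proportion.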

Similarity estimation

We convolved each individual picture with Gabor filters (sinusoid × Gaussian) of 5 scales (spatial frequency of the sine wave of the Gabor: 4, 2, 1, 0.5, 0.25) and 8 orientations (0, 22.5, 45, 67.5, 90, 112.5, 135, 157.5 degrees)39 with an aspect ratio of 4.54 and a number of sub-regions of 2.65 (Figure S2a). The scales reflect the hierarchy of receptive field sizes across the visual system40 and the shape parameters are taken from previous modeling work on the primary visual system41. We then correlated the filter output vectors (Figure S2b,c) for each individual face picture with each other picture of the same face type, resulting in two similarity matrices, one for each face type. We report similarity values (1/(1 + Euclidean distance)), ranging from 0 to 1, with higher values indicating greater similarity.
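The reported similarity score is a simple transform of the Euclidean distance between two feature vectors. The sketch below applies it to arbitrary numeric vectors; the Gabor filtering step is not reproduced, and the function names are illustrative.

```python
import math


def similarity(a, b):
    """Similarity 1 / (1 + Euclidean distance) between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + distance)


def similarity_matrix(vectors):
    """Pairwise similarity matrix for the feature vectors of one face type."""
    return [[similarity(u, v) for v in vectors] for u in vectors]


if __name__ == "__main__":
    print(similarity([0.0, 0.0], [3.0, 4.0]))  # distance 5, similarity 1/6
```

Identical vectors give a similarity of 1, and the score approaches 0 as the distance grows, which matches the stated range of 0 to 1.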

Computational simulation

We set the exposure frequency to class 1 (equivalent to chimpanzee faces) as follows. Over 35 years of life, long-term face exposure varied slightly among the chimpanzee individuals in our lab, but in order to generalize the results, we set the number of constantly perceived fellow chimpanzee individuals (class 1) to 8. In addition, we presented one random novel exemplar in each simulation step (3.5 days), representing pictures of novel or unfamiliar chimpanzee faces seen by our chimpanzee participants. The number of human faces perceived over time (class 2) was chosen according to the amount of exposure to human faces. Human visitors (short-term face exposure) were represented by 3 novel exemplars randomly drawn in each step. Researchers and caretakers (long-term face exposure) were simulated by a gradual increase to 90 exemplars over the 35 years, each of which kept constant face features for the duration of the simulation (Figure 2c, Figure S3a). However, we allowed for a slight jitter (s.d. 0.05) on each face feature to account for some noise in the assumed face feature extraction process (see below). In more mathematical terms, we assumed that the neural network coding for the discriminability of faces has access to complex non-linear face features. These complex features represent the facial features as extracted from high-dimensional face space by means of sensory processing. The most meaningful features are those that have high variance among individuals, since they differentiate most reliably between individuals, e.g. the configuration of facial parts. Therefore, it is plausible to assume that a neural network of limited resources (i.e. too small to code all possible face features) should use those feature sub-spaces which maximize the variance of the neural response to individual faces (similar to dimensionality reduction based on principal component analysis).
We here simulate the process of learning in a simplified scenario with a two-dimensional input x and a neural network limited to just one linear neuron that computes a weighted sum of the inputs, $y = w \cdot x$ (1). Because exposure to faces of different species changes gradually over time, the face feature distribution provided by the sensory areas changes. Thus, a neuron of the face discrimination system has to ensure that its synaptic weights learn to optimally discriminate the face features present at any given time. Therefore, the system has to adapt from maximally describing the variance among exemplars of one class (chimpanzee faces) to maximally describing the variance among exemplars of a novel class (human faces). Features which are not critical for successful discrimination at a given time will be “forgotten” due to the size constraints of the neural system. Thus, the performance for the class which the system is not “tuned” to will deteriorate. We use the Oja learning rule42 (a variant of the Hebb rule ensuring normalized weights) to adapt the synaptic weights of the neuron in each time step t. In detail, we take $\Delta w_t = \alpha \langle y\,x_i - y^2 w_t \rangle$ (2), where the average is taken over the individuals present in each time step (see below) and the learning rate $\alpha$ is set to 0.02 at the beginning and decreases linearly to 0 at the end of the simulation. We here do not explicitly model any face feature extraction process but instead assume that the face features of the two classes are given by two different distributions in the face feature space. Since we use the variance of the neural activation to face features (and not higher moments) as the indicator for face discriminability in the model, we can assume without loss of generality that the face features are normally distributed in the feature space.
Thus, we used two normal distributions (class 1 and class 2) of features in the two-dimensional input space, referring to conspecific and non-conspecific faces, respectively. The Gaussian distribution of class 1 had a width of 0.2 and a height of 0.8 at an angle of 0.78, while the Gaussian of class 2 had a width of 0.3 and a height of 0.65, which is broader than class 1 due to the assumption that the sensory system is not optimized for this class. The angle of class 2 was −0.45 (see Figure 2b for an illustration of these distributions) to model the distinct differences of the face features between classes. One sample from either distribution represents the face feature composition of a particular individual from class 1 or 2, respectively. We did not impose an innate preference for conspecific faces prior to the beginning of the learning process, but set the synaptic weight $w_0$ for each run of the simulation to a random direction in the feature space (and ensured normalization). We simulated 3500 steps, equivalent to 35 years; thus one step reflects 3.5 days. In each step, the weight of the neuron was adapted according to the mean response to the individuals present, using Equation 2.
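The simulation described in this section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the original implementation: "width" and "height" are read as standard deviations along two orthogonal axes of a zero-mean Gaussian, the angles are taken to be in radians, the number of constant class 2 exemplars is assumed to grow linearly from 0 to 90, the jitter is redrawn in every step, and all function names and the fixed random seed are hypothetical.

```python
import math
import random

# (width, height, angle) as given in the text; units are assumptions (see lead-in).
CLASS1 = (0.2, 0.8, 0.78)     # stands in for chimpanzee faces
CLASS2 = (0.3, 0.65, -0.45)   # stands in for human faces


def draw(rng, params, n):
    """Draw n two-dimensional feature vectors from a rotated zero-mean Gaussian."""
    width, height, angle = params
    c, s = math.cos(angle), math.sin(angle)
    points = []
    for _ in range(n):
        a, b = rng.gauss(0.0, width), rng.gauss(0.0, height)
        points.append((c * a - s * b, s * a + c * b))
    return points


def jitter(rng, points, sd=0.05):
    """Add independent Gaussian noise (s.d. 0.05) to every feature."""
    return [(x0 + rng.gauss(0.0, sd), x1 + rng.gauss(0.0, sd)) for x0, x1 in points]


def oja_update(w, xs, lr):
    """Averaged Oja step: w <- w + lr * mean(y*x - y^2*w), with y = w . x."""
    d0 = d1 = 0.0
    for x0, x1 in xs:
        y = w[0] * x0 + w[1] * x1
        d0 += y * x0 - y * y * w[0]
        d1 += y * x1 - y * y * w[1]
    return (w[0] + lr * d0 / len(xs), w[1] + lr * d1 / len(xs))


def output_variance(w, xs):
    """Variance of the neuron's output over a test set (discrimination score)."""
    ys = [w[0] * x0 + w[1] * x1 for x0, x1 in xs]
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)


def simulate(seed=0, steps=3500, n_test=10):
    """Run one simulation; returns the final weight and per-step (class 1, class 2) scores."""
    rng = random.Random(seed)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    w = (math.cos(phi), math.sin(phi))       # random, normalized initial weight
    const1 = draw(rng, CLASS1, 8)            # constantly present class 1 individuals
    const2 = draw(rng, CLASS2, 90)           # pool of long-term class 2 individuals
    history = []
    for step in range(steps):
        lr = 0.02 * (1.0 - step / steps)     # learning rate decays linearly towards 0
        n_const2 = round(90 * step / max(steps - 1, 1))  # assumed linear growth
        present = (jitter(rng, const1) + draw(rng, CLASS1, 1)
                   + jitter(rng, const2[:n_const2]) + draw(rng, CLASS2, 3))
        w = oja_update(w, present, lr)
        history.append((output_variance(w, draw(rng, CLASS1, n_test)),
                        output_variance(w, draw(rng, CLASS2, n_test))))
    return w, history
```

Calling `simulate()` returns the final weight vector together with one pair of discrimination scores per step, which can be averaged over repeated runs with different seeds to obtain smooth time courses. Because the growth schedule of class 2 exposure is assumed here, the timing of any cross-over in this sketch should not be expected to match the reported time courses.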

Ethics statement

All experiments were carried out in accordance with the 2002 version of the Guidelines for the Care and Use of Laboratory Primates by the Primate Research Institute, Kyoto University. The experimental protocol was approved by the Animal Welfare and Care Committee of the same institute.