Predictive coding explains binocular rivalry: An epistemological review
Introduction
If one stimulus is shown to one eye and another stimulus to the other, then subjective experience alternates between them. For example, when an image of a house is presented to one eye and an image of a face to the other, then subjective experience alternates between the house and the face. This is known as binocular rivalry. Binocular rivalry is a challenge to our understanding of the visual system, and it is of special importance for studies of phenomenal consciousness in humans and monkeys, because the stimulus presented to subjects can be held constant while the phenomenal percept changes (Frith et al., 1999; Koch, 2004).
There have been many empirical studies of binocular rivalry, but the data they produce are conflicting and very difficult to interpret unequivocally. A number of proposals have been made, but the neurocognitive mechanism that explains this striking visual effect remains unresolved (for reviews and overviews, see Alais and Blake, 2005; Blake and Logothetis, 2002; Leopold and Logothetis, 1999; Tong et al., 2006). Recent formal models explain a growing number of psychophysical findings and fit a range of neurophysiological facts (Koene, 2006; Moreno-Bote et al., 2007; Noest et al., 2007; Wilson, 2007), and there is a general trend towards approaches that integrate top–down and bottom–up processes in the brain (Tong et al., 2006). However, we believe the study of binocular rivalry may benefit from a principled theoretical framework that can motivate these new developments. Most approaches to rivalry stress the role of inhibition, adaptation and stochastic noise. We take the approach of epistemology (the theory of knowledge) to go behind these approaches and ask the more fundamental theoretical question: “why should a perceptual system, such as the brain, have and exploit such mechanisms in the first place?” The motivation behind this approach is the idea that binocular rivalry is an epistemic response to a seemingly incompatible stimulus condition where two distinct objects occupy the same spatiotemporal location. This paves the way for a principled and unified account of rivalling perceptions under dichoptic viewing conditions. Our intent is thus not to add new data to the burgeoning body of data already in hand concerning binocular rivalry but to describe a unifying framework for it.
There is growing support for the idea that the brain is an inference machine, or hypothesis tester, which approaches sensory data using principles similar to those that govern the interrogation of scientific data. In this view, perception is a type of unconscious inference. As Gregory states:
[P]erceptions are hypotheses, predicting unsensed characteristics of objects, and predicting in time, to compensate neural signalling delay (discovered by von Helmholtz in 1850), so ‘reaction time’ is generally avoided, as the present is predicted from delayed signals […] Further time prediction frees higher animals from the tyranny of control by reflexes, to allow intelligent behaviour into anticipated futures (1997, p. 1122).
This view goes back at least to von Helmholtz (1860) and has been expressed with increasing finesse since that time (Gregory, 1998; MacKay, 1956; Neisser, 1967; Rock, 1983). More recently, it has been proposed that this intuitive idea can be captured in terms of hierarchical Bayesian inference, using generative models with predictive coding or free-energy minimisation, and that this is the main neurocomputational principle for the brain’s perception of the environment as well as its learning of new contingencies (Ballard et al., 1983; Dayan et al., 1995; Friston, 2002; Friston, 2003; Friston, 2005; Friston and Stephan, 2007; Kawato et al., 1993; Kersten et al., 2004; Knill and Pouget, 2004; Mumford, 1992; Murray et al., 2004; Rao and Ballard, 1999).
Our proposal is that this general theoretical framework, in its more recent incarnations, provides the computational mechanism that best explains binocular rivalry and reconciles conflicting evidence. We set out some core properties of predictive coding, show how it explains binocular rivalry, and relate the explanation to a number of empirical neurophysiological, imaging and psychophysical findings concerning binocular rivalry. A Bayesian framework has recently been suggested for bistable perception (slant rivalry) (van Ee, 2003); however, though this framework is congenial to the account given here, it is not couched in terms of generative models, predictive coding and empirical Bayes. As we shall see, in its more complex version Bayesian theory has great explanatory promise. Our account has more in common with an earlier model by Dayan (1998) that uses explicit generative models. (A further recent study of bistable perception (monocular rivalry) by Knapen, Kanai, Brascamp, van Boxtel, & van Ee, 2007, seems to count against the use of generative models; we discuss this further in Section 6.)
Core properties of predictive coding
A core task for the brain is to represent the environmental causes of its sensory input. This is computationally difficult, because the causes must be computed when only the effects are known. As Hume (1739–40) reminded us, causes and effects are distinct existences and, in principle, many different environmental events could be causes of the same sensory effect. Conversely, the same environmental causes can occur in different contexts, so the same environmental event can be the cause of many
Two problems concerning rivalry: Selection and alternation
In dichoptic viewing conditions, where one stimulus is shown to one eye and another to the other eye, binocular matching fails because two different objects seem to occupy the same spatiotemporal position (Blake & Boothroyd, 1985). The epistemological task for the system, given this incompatible or “un-ecological” condition, is then to explain the combined bottom–up signal stemming from the two stimuli: it does this rather elegantly by selecting only one stimulus at a time and then alternating
A Bayesian approach to the selection problem
Assume the stimuli are a house and a face and that the percept currently experienced by the subject is the face. Then the question, from a Bayesian perspective, is why the face hypothesis (F) has the highest probability, given the conjoint evidence (I) of a house and a face. The question splits into two: (i) why is F favoured over the hypothesis that it is a house (H)? (ii) Why is F selected over some kind of conjunctive or blended hypothesis that it is a ‘house-face’ (F AND H)? (see Fig. 2).
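To make the Bayesian framing of the selection problem concrete, here is a minimal numerical sketch in Python. It is not the authors' model: the function name `posterior`, the hypothesis labels, and every prior and likelihood value are illustrative assumptions, chosen only to show how a conjoint hypothesis that fits the evidence well can still lose because its prior is low.

```python
def posterior(priors, likelihoods):
    """Return normalised posteriors P(h | I) for each hypothesis h via Bayes' rule."""
    # Unnormalised posterior: prior times likelihood of the conjoint evidence I.
    unnormalised = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(unnormalised.values())  # P(I), the normalising constant
    return {h: value / evidence for h, value in unnormalised.items()}


# ASSUMED numbers for the rivalry case (hypothetical, not taken from the paper).
# F and H each explain only part of the conjoint evidence I, so their
# likelihoods are moderate; "F_and_H" explains all of it but is given a very
# low prior, since two objects rarely occupy the same place at the same time.
rivalry_priors = {"F": 0.50, "H": 0.49, "F_and_H": 0.01}
rivalry_likelihoods = {"F": 0.4, "H": 0.4, "F_and_H": 0.9}

rivalry_posterior = posterior(rivalry_priors, rivalry_likelihoods)
selected = max(rivalry_posterior, key=rivalry_posterior.get)

# ASSUMED numbers for mutually consistent stimuli: the conjoint hypothesis
# now receives a substantial prior, everything else is unchanged.
fusion_priors = {"F": 0.3, "H": 0.3, "F_and_H": 0.4}
fusion_posterior = posterior(fusion_priors, rivalry_likelihoods)
fused = max(fusion_posterior, key=fusion_posterior.get)

print("rivalry case selects:", selected)
print("consistent case selects:", fused)
```

With these assumed numbers, F wins narrowly over H only because of the small prior edge it was given, and the conjoint hypothesis ends far behind despite its high likelihood. Raising the assumed prior of the conjoint hypothesis, as one might for mutually consistent stimuli, makes it the selected hypothesis instead.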
Solving the alternation problem
The theoretical challenge is to explain why the system, having selected one stimulus for perception, after a few seconds decides to de-select it in favour of the alternative stimulus. It is clear a priori that some kind of reciprocal inhibition must be involved but inhibition cannot be the whole story, if alternation is to be explained. There must be a dynamic evolution of inhibition and activity to ensure alternation. Traditionally, one appeals to adaptation, which allows disinhibition; this
Less rivalry for consistent stimuli
As noted by Blake (1989), rivalry tends to occur when there is an increasing incompatibility between the stimuli presented to the two eyes. More consistent stimuli will tend to fuse. This fits within the predictive coding framework because it is a case where the conjoint hypothesis does have a high prior. That is, were the stimuli a mouth-less face and a mouth, then the updated, dominant hypothesis F∗ (“it’s a face with a mouth”) would have a substantial prior. Fusion would then be allowed since
Accounting for conflicting neurophysiological and imaging evidence
Empirical findings on rivalry using single unit recordings and fMRI seem to be in conflict and are difficult to unify under a single theoretical framework. However, it is important to remember that neuronal implementations of predictive coding require both the representation of the prediction and the prediction error in hierarchically ordered pairs of levels in the brain. It is the hierarchical deployment of reciprocal changes among these that will offer an explanation for diverse empirical
Discussion
Under the account described here, an empirical Bayes framework with generative models, implemented with predictive coding or free-energy minimisation, explains many aspects of binocular rivalry, because dichoptic viewing of mutually inconsistent stimuli creates a situation where no hypothesis about the environmental causes of the incoming sensory signal has both a high prior and high likelihood. The system therefore settles into a rhythm, where at any time the hypothesis with the highest
Conclusions
Core properties of a theoretical framework for perceptual inference in the brain based on generative models and predictive coding can be described in fairly basic probabilistic terms. The framework can explain and unify many aspects of binocular rivalry, in particular why one stimulus is selected for perception and why there is alternation between stimuli. The framework also accommodates many of the major psychophysical findings on rivalry and provides a unified interpretation of the apparently
Acknowledgements
This research was supported by the Danish Research Council for Communication and Culture, The Danish National Research Foundation, The Wellcome Trust, and a Monash Arts/IT Grant.
References (108)
- et al. Grouping visual features during binocular rivalry. Vision Research (1999).
- et al. The ventriloquist effect results from near-optimal bimodal integration. Current Biology (2004).
- et al. Strength and coherence of binocular rivalry depends on shared stimulus complexity. Vision Research (2007).
- et al. A test of Levelt’s second proposition for binocular rivalry. Vision Research (1993).
- Functional integration and inference in the brain. Progress in Neurobiology (2002).
- Learning and inference in the brain. Neural Networks (2003).
- et al. The neural correlates of conscious experience: An experimental framework. Trends in Cognitive Sciences (1999).
- et al. Dynamic causal modelling of evoked potentials: A reproducibility study. Neuroimage (2007).
- Neural model of temporal and stochastic properties of binocular rivalry. Neurocomputing (2000).
- et al. A neural network model of multistable perception. Acta Psychologica (1985).