Introduction

In higher mammals, prefrontal-cortex-dependent executive functions support behavioral adaptation to a changing environment by permitting the introduction of flexibility into responding. Without this top-down cognitive control, behavior is more likely to be driven by simple sensory motor associations and habits (depending upon other cortical and subcortical regions), resulting in automatic and/or inflexible behavior (Miller and Cohen 2001).

Laboratory tasks, such as discrimination reversal learning, have been used in both human and animal subjects to study aspects of executive control and behavioral flexibility (Ragozzino 2007; Robbins 2007; Roberts and Wallis 2000). In related procedures, the associations between stimuli and outcomes are learnt by a subject before being unexpectedly reversed by the experimenter; after reversal, subjects have to alter their behavioral repertoire in order to discover the predictors of reward and/or to avoid a punishment. This may require the subject to stop pre-potent or conditioned responses (i.e., response inhibition) and/or overcome the learned irrelevance for the alternative responses (Clarke et al. 2007; Tait and Brown 2007), while the generation of new behaviors may arise by the intervention of other executive processes, such as working memory and/or response selection (Frank and Claus 2006; Miller and Cohen 2001).

Previous studies have shown that damage to the orbitofrontal cortex produces a selective reversal learning deficit (Boulougouris et al. 2007; Dias et al. 1996; Hornak et al. 2004; Iversen and Mishkin 1970; McAlonan and Brown 2003; Rolls et al. 1994; Schoenbaum et al. 2002), along with poor response inhibition (Aron et al. 2003b; Eagle et al. 2008), while damage to other prefrontal regions impairs behavioral flexibility only when the updating of attentional biases or rules is required (Birrell and Brown 2000; Dias et al. 1996; Owen et al. 1991; Ragozzino and Kesner 2001).

These phenomena appear to be of clinical relevance, as reversal learning deficits and/or response inhibition deficits have been reported in children with attention deficit hyperactivity disorder (ADHD; Itami and Uno 2002; Schachar et al. 2000, 1995), in stimulant-dependent humans (Fillmore and Rush 2002, 2006; Monterosso et al. 2005), or in animal models exposed chronically to addictive psychostimulant drugs (Calu et al. 2007; Jentsch et al. 2002). These conditions have each been empirically or theoretically associated with abnormal orbito-prefrontal functions (Everitt et al. 2007; Hesslinger et al. 2002; Olausson et al. 2007; Schoenbaum and Shaham 2008).

Ascending catecholaminergic systems are involved in the regulation of frontal-cortex-related executive functions (Arnsten and Li 2005; Robbins 2005), and clinical data related to the effective treatment of ADHD point to these systems as a potential target for the modulation of response inhibition and cognitive control. Stimulant drugs, such as methylphenidate (MPH), are common medications for ADHD and are known to inhibit norepinephrine and dopamine transporters (NET and DAT, respectively; Bymaster et al. 2002; Han and Gu 2006), increasing the extracellular levels of these catecholamines in different cortical and subcortical regions (Berridge et al. 2006; Bymaster et al. 2002; Kuczenski and Segal 1997). A large body of data suggests beneficial effects of these drugs on different domains of executive control, including behavioral inhibition, in both ADHD patients (Aron et al. 2003a; Scheres et al. 2003; Tannock et al. 1989; Willcutt et al. 2005) and animal models (Arnsten and Dudley 2005; Berridge et al. 2006; Blondeau and Dellu-Hagedorn 2007; Eagle et al. 2007; Lapiz et al. 2007; Navarra et al. 2008; Robinson et al. 2008).

Although these beneficial effects of stimulant drugs, as well as their abuse liability (which makes them problematic for the treatment of stimulant abuse and dependence; Volkow and Swanson 2003), have been attributed to their dopaminergic actions, new evidence strongly suggests that the modulation of the noradrenergic system may be relevant as well. In fact, the selective NET inhibitor atomoxetine (ATO) has been shown to be an effective alternative to stimulant drugs in ADHD treatment (Faraone et al. 2005; Spencer et al. 2001), and recent data have shown that this drug improves response inhibition in ADHD patients (Chamberlain et al. 2007), healthy subjects (Chamberlain et al. 2006), and experimental animals (Blondeau and Dellu-Hagedorn 2007; Navarra et al. 2008; Robinson et al. 2008), suggesting that the inhibition of the NET alone may improve cognitive control.

In the current study, we compared the actions of drugs selective for each of the two catecholamine transporters with those of MPH in reversal learning tasks designed to measure aspects of cognitive control and flexible responding in rodents and non-human primates. We exposed rats to serial retentions and reversals of a four-choice spatial discrimination task, and we tested the effect of the administration of the stimulant drug MPH, the selective DAT inhibitor GBR-12909 (GBR), and the selective NET inhibitors ATO and desipramine (DMI). Monkeys performing a three-choice discrimination task that included two different sets of visual discriminanda were also used to examine the effects of a subset of these agents. We predicted that by virtue of NET inhibition, MPH, ATO, and DMI would improve performance selectively under reversal conditions, while GBR would not. By employing three- and four-choice tasks in monkeys and rats, respectively, we were able to examine the patterns of errors made by subjects in order to dissociate perseverative errors (i.e., responses to the previously rewarded hole) from neutral errors (i.e., responses to the other incorrect holes), clarifying whether changes in reversal learning performance could be attributed to the ability to change the initially trained response (i.e., response inhibition).

Materials and methods

Subjects

Sixty-one adult male Long–Evans rats (Harlan, Indianapolis, IN, USA) were used in these experiments. The subjects were ~60 days of age at the initiation of training and ranged in weight from 300 to 400 g during the experimental period. All rats were initially food-restricted to 80–85% of their free-feeding weights and subsequently fed ~15 g rat chow per day in their home cage within 1–3 h after testing. Water was continuously available, except while in the operant testing chambers. Rats were maintained on a 14/10-h light/dark schedule (lights on at 7 a.m.).

In addition, four male vervet monkeys (Chlorocebus aethiops sabaeus) from the UCLA Vervet Research Colony were included in the experiments; they were 9–11 years of age and weighed 6.0–8.0 kg at the time of testing. The monkeys were housed individually in a climate-controlled vivarium on a 12-h light/dark cycle (lights on at 6 a.m.); they had unlimited access to water and received a nutritionally balanced diet of monkey chow (Teklad, Madison WI, USA) supplemented with fresh fruit. The monkeys received their full daily allotment of food (which was not reduced in order to support behavioral performance) immediately after morning testing (one half portion) and again at 1600–1700 hours (one half portion).

The experimental protocols employed were consistent with the NIH “Guide for the Care and Use of Laboratory Animals” and were approved by the Chancellor’s Animal Research Committee at UCLA. All methods for the care and use of nonhuman primates conformed to US Department of Agriculture and Public Health Service standards.

Drugs

For the rat studies, doses of atomoxetine hydrochloride (1.0 mg/kg, gift from Pfizer) and desipramine hydrochloride (5.0 mg/kg; Sigma-Aldrich; St Louis, MO, USA) were chosen on the basis of a pilot study with a two-choice variant of the task described here (Seu and Jentsch 2006). Doses of methylphenidate hydrochloride (0.33–1.0 mg/kg; Sigma-Aldrich) and GBR-12909 dihydrochloride (2.5–5 mg/kg; Sigma-Aldrich) were chosen based upon previous studies showing relevant effects of these drugs in other behavioral procedures that measure inhibitory control in rats (Eagle et al. 2007; van Gaalen et al. 2006a).

All drugs were dissolved in sterile saline (0.9%) and were administered in a volume of 1.0 ml/kg, with the exception of GBR-12909 that was administered in a volume of 2.0 ml/kg. Methylphenidate, GBR-12909, and desipramine were injected 30 min before testing, while atomoxetine was administered 45 min prior to testing.

For the monkey studies, doses of atomoxetine (1 mg/kg; Tocris-Cookson; Ellisville, MO, USA) and methylphenidate (0.33 mg/kg; Sigma-Aldrich) were selected on the basis of pilot studies in which two different doses were tested on the same subjects but using a slightly different reversal learning task. We chose to administer all drugs orally rather than via intramuscular injection because we expected this route of administration to be more comparable with clinical studies. Drugs were mixed with fruit jam or peanut butter and put in a small cookie. Before every acquisition session, a cookie with jam or peanut butter (but without drug) was given to the subjects so that the cookie would not come to serve as a cue for the subsequent retention or reversal session. Drugs were administered 1 h before the retention or reversal sessions, and the experimenter verified whether the cookie and jam had been eaten.

Rat behavioral testing apparatus

Standard extra-tall aluminum and Plexiglas operant conditioning chambers with a photocell-equipped pellet delivery magazine on one side and a curved panel with five photocell-equipped apertures on the opposite side (Med Associates, Mount Vernon, VT, USA) were used. The boxes were housed inside a sound-attenuating cubicle, background white noise was broadcast, and the environment was illuminated with a houselight (a light diffuser that was located outside of the operant chamber but within the cubicle).

Rat training

Most of the prior reversal learning studies described in rodents and monkeys have been conducted using two-choice discrimination tasks (Boulougouris et al. 2007; Dias et al. 1996; Iversen and Mishkin 1970; McAlonan and Brown 2003; Schoenbaum et al. 2002). We chose to adopt a novel four-choice task in order to increase general task difficulty and to permit us to dissect perseverative from neutral errors.

The procedure for the initial training was similar to that used for a lateralized reaction time task (Jentsch 2003). Rats were first trained in a single session in which the houselight was continuously illuminated and single pellets (45 mg Dustless Precision Pellets; Bio-serv, Frenchtown, NJ, USA) were delivered into an illuminated magazine on a fixed-time 20-s schedule over a 45-min period. Across three subsequent daily sessions, the rats were then trained to make a sustained, variable duration nose poke (200, 500, 700, or 1,000 ms) in an illuminated center nose poke aperture (hole 3) to receive a pellet. This response (called the observing response) was used in the subsequent sessions to begin a new trial in order to demonstrate task engagement and to avoid random responding. All rats were trained until they earned at least 70–80 pellets in this initial shaping component.

The rats were then tested in daily discrimination sessions in which the initiation of individual trials was signaled by the illumination of the central aperture. A variable-duration observing response at that location resulted in the immediate switching off of the central light and illumination of the four remaining apertures, two to the left of the central aperture (holes 1 and 2; H1 and H2) and two to the right (holes 4 and 5; H4 and H5). On any given day, one of the holes was chosen by the experimenter to be the target hole, and rats were reinforced with a pellet for a response into that hole only (correct response). Importantly, no signal, other than reinforcement feedback, indicated which of the four holes was the target on any given day.

If the rat responded at a location that was not the established target, all lights in the box were extinguished, and the rat was given a 3-s time-out period in complete darkness (incorrect response). If no response was made within 15 s, the rat received a 3-s time-out in darkness (omission). The inter-trial interval that followed a completed trial or omission was 3 s. On occasion, rats responded into one of the lateral apertures before completing the sustained nose poke (and before the target presentation); in this case, a 3-s time-out was delivered (as above), and an anticipatory response was scored.
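The trial outcomes described above can be summarized as a small scoring function. This is only an illustrative sketch: the function name, its arguments, and the representation of a missing response as `None` are assumptions made for the example and are not taken from the original control software.

```python
from typing import Optional

RESPONSE_WINDOW_S = 15.0  # response limit stated in the text


def score_trial(target_hole: int,
                response_hole: Optional[int],
                latency_s: Optional[float],
                observing_complete: bool = True) -> str:
    """Label one trial with one of the four outcome categories (hypothetical helper)."""
    if not observing_complete and response_hole is not None:
        # Lateral poke made before the sustained centre poke was finished.
        return "anticipatory"
    if response_hole is None or latency_s is None or latency_s > RESPONSE_WINDOW_S:
        # No response within the 15-s window.
        return "omission"
    return "correct" if response_hole == target_hole else "incorrect"
```

In this sketch, incorrect, omitted, and anticipatory trials would each be followed by the 3-s time-out described in the text; the time-out itself is not modeled.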

Sessions were terminated when rats reached a criterion of 18 correct responses in 20 consecutive trials, after 1 h, or when 200 trials were completed, whichever came first. If rats failed to achieve criterion performance in 1 h or 200 trials, the discrimination was repeated on subsequent days until criterion was met.
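As an illustration of the performance criterion, the sketch below scans a session with a sliding window of 20 trials and reports the first trial on which at least 18 of the last 20 responses were correct. The function name and the boolean encoding of trials are assumptions for the example; the 1-h time limit is not modeled, and omissions or anticipatory responses are assumed to have been excluded beforehand.

```python
from typing import Optional, Sequence


def trials_to_criterion(outcomes: Sequence[bool],
                        window: int = 20,
                        needed: int = 18,
                        max_trials: int = 200) -> Optional[int]:
    """Return the 1-based trial on which criterion is first met, else None."""
    correct_in_window = 0
    for i, correct in enumerate(outcomes[:max_trials]):
        correct_in_window += int(correct)
        if i >= window:
            # Drop the trial that has just left the sliding window.
            correct_in_window -= int(outcomes[i - window])
        # Criterion is only evaluated once a full window of trials exists.
        if i + 1 >= window and correct_in_window >= needed:
            return i + 1
    return None
```

Because a full window is required, the smallest possible value is 20 trials, which matches the minimum session length mentioned for the retention sessions.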

The subjects were exposed to approximately 1 month of initial training. During this time, rats were tested every 2 to 4 days for an average of 12 discrimination sessions (including one to two retention sessions), with each of the four holes assigned as target locations at least twice.

Rat testing—experimental design

After initial training, rats were tested for 2 days a week in pairs of experimental sessions. During the first session, referred to as the “new-hole session”, an aperture was selected pseudorandomly. In the second session, rats were tested for “retention” of the discrimination learned in the prior new-hole session (the reinforcement rule was kept the same) or they were subjected to a “reversal” of the discrimination learned in the prior new-hole session (a different aperture was rewarded). Pharmacological treatments were administered only during retention or reversal sessions, while the new-hole sessions were always performed drug-free.

Note that the new-hole sessions may be considered a reversal session because the rats are experiencing a change in the condition learned in the previous session. The new-hole sessions were used because it seemed to be important to always test retention or reversal of a discrimination learned in a drug-free state. This allowed us to ensure that there were no differences amongst the various drug and reversal vs. retention conditions in terms of performance in the immediately preceding session. In addition, a difference between new-hole and reversal sessions was that more restrictions were imposed in the choice of the target hole during reversal sessions in order to simplify the design and ensure that all the conditions within and between drug studies were balanced. For example, only a switch between holes positioned on different sides of the central hole (H5–H1, H5–H2, H4–H2, or H4–H1 and vice versa) was allowed, while this was not a requirement for new-hole sessions.
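The side-switch restriction on reversal targets can be written as a simple lookup. The sketch below is illustrative only; the constant and function names are hypothetical, and only the hole numbering (H1, H2 left; H4, H5 right; hole 3 central) comes from the text.

```python
from typing import Tuple

LEFT_HOLES: Tuple[int, ...] = (1, 2)   # H1, H2
RIGHT_HOLES: Tuple[int, ...] = (4, 5)  # H4, H5


def allowed_reversal_targets(previous_target: int) -> Tuple[int, ...]:
    """Holes on the opposite side of the central aperture (hole 3)."""
    if previous_target in LEFT_HOLES:
        return RIGHT_HOLES
    if previous_target in RIGHT_HOLES:
        return LEFT_HOLES
    raise ValueError("target must be one of holes 1, 2, 4, or 5")


# Every permitted (previous, new) pair: 4 switches plus their mirror images.
ALL_REVERSALS = [(p, n) for p in LEFT_HOLES + RIGHT_HOLES
                 for n in allowed_reversal_targets(p)]
```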

If in a drug study a rat failed to reach performance criteria in one or more drug sessions, these sessions were repeated after completion of the formal Latin square design. However, only one to three rats in each of the drug studies ever failed to complete a session, and the frequency of failures never differed between treatment groups.

All rats required at least eight to 12 discrimination sessions (including retention, reversal, and new-hole sessions) to complete each pharmacological study; considering initial training and pharmacological testing, each subject was exposed to 20–40 sessions over the course of the procedures described in this manuscript.

A minimum of four and a maximum of 14 drug administrations (of which two to four were saline administrations) were given to each rat, with an interval between injections of at least 1 week.

The measures collected during daily sessions included total number of trials (number of trials required to reach criterion), the mean trial initiation latency (the average interval between illumination of the central aperture and the initiation of the observing response), the mean pellet retrieval time (the average interval between pellet delivery and head entry into the magazine), and the number of anticipatory responses (calculated as a fraction of completed trials). Omissions were very rare and all drugs tested failed to affect them, so omissions are not presented here.

Correct and incorrect responses were measured as percentage of all completed trials in five-trial bins (trials 1–5, 6–10, etc.). In each study, we analyzed these measures across the maximum number of trials bins where data points were present for all rats (because the number of trials available for analysis depended upon how quickly the rat reached criteria). For the retention sessions, only four bins were considered in all drug studies (all experiments included rats that completed the retention sessions in the minimum number of trials, i.e., 20), while for the reversal sessions, the number of bins was variable (between four and seven).
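The binning procedure can be sketched as follows. The helper names are hypothetical, and two details are assumptions made for the example: a trailing partial bin is discarded, and the common number of bins is taken as the length of the shortest series among the rats.

```python
from typing import List, Sequence


def percent_correct_by_bin(outcomes: Sequence[bool], bin_size: int = 5) -> List[float]:
    """Percentage of correct trials in each complete bin of `bin_size` trials."""
    n_bins = len(outcomes) // bin_size  # a trailing partial bin is ignored
    return [
        100.0 * sum(outcomes[b * bin_size:(b + 1) * bin_size]) / bin_size
        for b in range(n_bins)
    ]


def common_bins(per_rat: Sequence[Sequence[float]]) -> List[List[float]]:
    """Truncate every series to the shortest one so each bin has data for all rats."""
    n = min(len(series) for series in per_rat)
    return [list(series[:n]) for series in per_rat]
```

A rat that reached criterion in 20 trials contributes four bins, which is why the retention analyses in this sketch would be limited to four bins whenever at least one such rat is present.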

Furthermore, incorrect responses were defined as perseverative (i.e., responses to the hole rewarded in the previous new-hole session) or neutral (responses to the other two incorrect holes) and were analyzed as a percentage of all completed trials in five-trial bins, as described above.
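The error classification can be illustrated with a short sketch. The function names and the list-of-hole-numbers input format are assumptions for the example, and only complete five-trial bins are summarized.

```python
from collections import Counter
from typing import Dict, List, Sequence

LABELS = ("correct", "perseverative", "neutral")


def classify_response(response_hole: int, target_hole: int, previous_target: int) -> str:
    """Label one completed trial relative to the current and previous targets."""
    if response_hole == target_hole:
        return "correct"
    if response_hole == previous_target:
        return "perseverative"  # hole rewarded in the previous new-hole session
    return "neutral"            # one of the two remaining incorrect holes


def response_profile_by_bin(responses: Sequence[int], target_hole: int,
                            previous_target: int, bin_size: int = 5) -> List[Dict[str, float]]:
    """Percentage of each response type in every complete bin of trials."""
    profile = []
    for start in range(0, len(responses) - bin_size + 1, bin_size):
        counts = Counter(classify_response(r, target_hole, previous_target)
                         for r in responses[start:start + bin_size])
        profile.append({k: 100.0 * counts[k] / bin_size for k in LABELS})
    return profile
```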

Monkey behavioral testing apparatus

A modified Wisconsin general test apparatus (WGTA) consisting of an opaque screen (that could be raised and lowered) that separated the monkey from a tray fitted with three opaque square food boxes with hinged opaque lids was used. The tops of the food boxes were fitted with distinctive colored pictures (clip art from the Microsoft Office library), and the monkeys were trained to open the lids of food boxes to retrieve food rewards (bits of apple, grape, or banana) hidden within. The monkeys, which had previously been trained to move voluntarily from their cages to a transport cart, were moved, one at a time, to an adjacent testing room. In this room, the transport cart was aligned to the WGTA so that monkeys could easily access the food boxes on the tray when the opaque screen was raised. The monkeys were presented with trials on which the screen was raised to reveal three food boxes with the visual cues. They were allowed to open only one box lid per trial. Each trial lasted either until the animal opened a lid or 2 min passed, whichever came first. The inter-trial interval was approximately 15 s. For each set of three visual cues, the picture that was chosen to be the “positive” stimulus was varied across the four subjects. The position of the food boxes varied pseudorandomly across all trials. Each subject was tested between 8:30 and 11:00 a.m. on each day.

Monkey testing—reversal learning task

Before the four monkeys participated in this study, they were trained on a slightly different reversal learning paradigm and were involved in other pharmacological studies (Lee et al. 2007). We modified our earlier task design in an attempt to increase the general difficulty of the task. Our procedure, like some human discrimination learning tasks (Frank et al. 2007), involved training subjects on two separate discriminations, composed of three stimuli each, within single sessions (discrimination set 1: A,B,C; discrimination set 2: D,E,F). In every session, half of the trials included discrimination set 1 and the other half involved discrimination set 2. The two sets of stimuli were mixed pseudorandomly across trials; the only constraint on stimuli presentation order was that the same triad of pictures was not presented in more than three consecutive trials. Across daily sessions of 30 trials (15 trials with stimuli A,B,C and 15 with D,E,F), the monkeys had first to learn that one stimulus per discrimination was associated with reward while the others were not (i.e., A+, B−, C− and D+, E−, F−). Once a minimum criterion of 13 correct responses out of 15 trials was reached for both sets of stimuli in a single session, the subjects were tested on the subsequent day for retention or reversal of the two discriminations.
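One way to generate a trial order that satisfies the stated constraint (no more than three consecutive trials with the same triad) is rejection sampling, sketched below. This is an assumption about how such an order could be produced, not a description of how the original sequences were made; the names `make_trial_order`, `max_run_length`, and the labels `"set1"`/`"set2"` are hypothetical.

```python
import random
from typing import List


def max_run_length(seq: List[str]) -> int:
    """Length of the longest run of identical consecutive items."""
    longest = run = 0
    prev = None
    for item in seq:
        run = run + 1 if item == prev else 1
        longest = max(longest, run)
        prev = item
    return longest


def make_trial_order(n_per_set: int = 15, max_run: int = 3, seed: int = 0) -> List[str]:
    """Shuffle until no discrimination set appears more than `max_run` times in a row."""
    rng = random.Random(seed)
    order = ["set1"] * n_per_set + ["set2"] * n_per_set
    while True:
        rng.shuffle(order)
        if max_run_length(order) <= max_run:
            return order
```

Rejection sampling keeps the two sets exactly balanced (15 trials each in an acquisition session) while enforcing the run limit, at the cost of discarding shuffles that violate it.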

The maximum number of acquisition sessions allowed was four, and when a monkey did not meet this criterion in four sessions, the testing for that subject was interrupted and restarted the following week with new sets of stimuli. However, this circumstance was not very common, and often, monkeys learned both discriminations within one session.

The retention and reversal sessions consisted of two contiguous blocks of 20 trials, each consisting of an equal mix of 10 A-B-C trials and 10 D-E-F trials (Fig. 1). In the retention sessions, the reward contingencies in the two blocks were the same as in the acquisition sessions (A+, B−, C− and D+, E−, F−). In contrast, in the reversal sessions, the stimulus–reward associations were changed so that discrimination set 1 was reversed from the beginning of the test session (A−, B+, C−), while discrimination set 2 was retained in the first block (D+, E−, F−) and reversed in the second block (D−, E+, F−). In this way, monkeys were subjected to two different reversal conditions in the session, one preceded by a retention test to “prime” responding (involving discrimination set 2) and one that was not (involving discrimination set 1). The retention and reversal sessions were alternated in a pseudorandom way so that no more than two consecutive weeks included the same learning conditions.

Fig. 1

Schematic representation of the experimental design used in the monkey studies. The acquisition sessions consisted of 15 trials for each set of discriminanda (discrimination 1: A, B, C, disc1; discrimination 2: D, E, F, disc2) presented in a pseudo-random order. Once acquisition criterion was achieved, monkeys were tested the following day on retention or reversal of the two discriminations. During retention, the contingencies for both sets of discriminanda were the same as in acquisition, while during reversal, one discrimination set (disc1) was reversed from the beginning of the session and the other was retained in block 1 and reversed in block 2.

The measures collected were retention session errors (total errors committed in the retention sessions), reversal session retention errors (errors committed during retention of discrimination set 2 in the first block of the reversal session), perseverative errors (number of responses to the previously reinforced stimulus during reversal of discrimination set 1 or 2), neutral errors (number of non-perseverative errors during the reversal session), retention session duration (total time required to complete the retention session; in s), and reversal session duration (total time required to complete the reversal session; in s).

Data analysis

All studies were within subjects and the order of drug conditions counter-balanced (cyclic Latin square design) across subjects and testing conditions (retention and reversal).
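A cyclic Latin square of the kind mentioned above can be built by rotating the list of conditions by one position per row. The sketch below is illustrative; the function name is hypothetical, and the assignment of subjects to rows (for example, subject i to row i modulo the number of conditions) is an assumption rather than a detail given in the text.

```python
from typing import List, Sequence


def cyclic_latin_square(conditions: Sequence[str]) -> List[List[str]]:
    """Row r is the condition list rotated left by r positions."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


# Example with the three MPH study conditions (saline and two doses).
square = cyclic_latin_square(["SAL", "MPH 0.33", "MPH 1.0"])
```

Each condition then appears exactly once in every row (a subject's treatment order) and once in every column (a position in the testing sequence).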

For the rat experiments, the measures described above were subjected to repeated measures analysis of variance (ANOVA), with testing condition, bin, and dose as factors. Where significant main effects or interactions were detected, they were further analyzed using paired two-tailed t tests. Our a priori hypotheses were tested using paired one-tailed t tests.
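For reference, the paired t statistic is the mean of the within-subject differences divided by its standard error, with n − 1 degrees of freedom. The sketch below computes only the statistic and the degrees of freedom using the Python standard library; the function name is hypothetical, the control-minus-treatment sign convention is an assumption, and the numbers in the example are invented for illustration and are not data from this study.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t(control: Sequence[float], treatment: Sequence[float]) -> Tuple[float, int]:
    """Return (t, df) for a paired comparison; differences are control - treatment."""
    if len(control) != len(treatment) or len(control) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [c - t for c, t in zip(control, treatment)]
    n = len(diffs)
    # Standard error of the mean difference (sample SD with n - 1 denominator).
    # Raises ZeroDivisionError if all differences are identical.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


# Invented trials-to-criterion values for four hypothetical subjects.
t_value, df = paired_t([10, 12, 14, 16], [8, 11, 11, 14])
```

With this sign convention, a positive t indicates fewer trials under treatment than under control. For a one-tailed test in the predicted direction, the p value is half the two-tailed value when the effect lies in that direction.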

For the monkey studies, specific a priori hypotheses were examined using one-tailed paired t tests, while for all the other comparisons, two-tailed paired t tests were used.

Results

Rat studies: baseline performance characteristics

The data collected under saline-only conditions in all the rat experiments are summarized in Fig. 2 in order to exhibit baseline performance characteristics of rats performing the four-choice discrimination task. In retention, rats perform at ~50% accuracy within the first five trials of the session (chance performance = 25%), reaching ~80% accuracy by trials 16–20, while under the reversal condition, rats start out with accuracy which is below chance (Fig. 2a), mainly due to a tendency to respond toward the previously rewarded hole (Fig. 2b). Indeed, nearly 50% of the total responses emitted in the first four bins of the reversal phase were perseverative (Fig. 2b). Together with this observation, the analysis of responses to the two “neutral” holes revealed that performance was guided by a set of long-term reinforcement rules, incorporating information learned over many sessions; in fact, rats responded more to the neutral hole that was rewarded in the most recent session (i.e., recent neutral) than to the neutral hole reinforced longest in the past (i.e., past neutral; Fig. 2b). When considering accuracy of response during the first four five-trial bins, ANOVA detected a significant interaction between testing condition (retention versus reversal) and trial bin [F (1,80) = 3.0, p ≤ 0.05], along with main effects of testing condition [F (1,80) = 228.6, p ≤ 0.0001] and trial bin [F (1,80) = 26.7, p ≤ 0.0001]. A main effect of testing condition was also observed in the total number of trials required to reach criterion [F (1,80) = 142.1, p ≤ 0.0001] (data not shown).

Fig. 2

Rat performance in the four-position discrimination task is illustrated by compiling the saline data collected in all studies. a Correct responding in the first 20 trials of retention and reversal sessions (dotted line indicates chance level = 25%). b Under reversal conditions, the majority of incorrect responses were perseverative (i.e., responses toward the hole rewarded in the previous session), followed by recent neutral errors (i.e., responses toward the hole rewarded two sessions before) and past neutral errors (i.e., responses to the remaining hole). Data are expressed as mean percentage ± SEM of total responses in each of the first four five-trial bins of the sessions (n = 84)

We ran several analyses in order to evaluate whether there were differences in the difficulty of performance in reversal depending upon which hole was being reinforced in the test session and which hole had been reinforced in the previous session because it is possible that some switches were easier than others. During reversal learning testing, there were no differences in performance (measured by the total number of trials or proportions of correct, perseverative, or neutral errors) that were related only to the current hole being reinforced (all F’s < 1). Nevertheless, ANOVA revealed a main effect of the hole trained in the previous session when considering the percentage of responses that were perseverative [F (1,80) = 11.2, p ≤ 0.001] and on total number of trials to meet criteria [F (1,80) = 4.1, p ≤ 0.05]; in fact, when the previous hole was internal (i.e., directly adjacent to the hole in which the observing response was made: H2 and H4), subjects made more perseverative responses and took more trials to complete the session. In other words, rats have slightly more difficulty switching away from the internal holes. However, all drug conditions were explicitly balanced to control for the type of reversal sequence being imposed (all drug treatment conditions were evaluated using an equal number of shifts from the internal-to-external and external-to-internal holes).

Experiment 1: differential effects of MPH on the retention and reversal of a four-choice discrimination

Based upon previous studies showing the beneficial effect of MPH on aspects of behavioral inhibition (Aron et al. 2003a; Blondeau and Dellu-Hagedorn 2007; Eagle et al. 2007; Navarra et al. 2008; Tannock et al. 1989), we hypothesized that this drug would improve reversal performance while not affecting retention performance. As shown in Fig. 3a, the two doses of MPH tested differently affected the total number of trials required to reach criteria in the two testing conditions [dose × testing condition: F (1,22) = 6.1, p ≤ 0.01; dose: F (1,22) = 0.4, p = 0.69]. Consistent with our a priori hypothesis, MPH at the dose of 0.33 mg/kg decreased the total number of trials required to complete the reversal session (one-tailed t test: t = 1.7, df = 22, p ≤ 0.05); notably, it increased the total number of trials required to reach criteria in the retention session (two-tailed t test: t = −2.8, df = 22, p ≤ 0.01). The higher dose of MPH (1 mg/kg), on the other hand, did not significantly affect the total number of trials to criteria in either the reversal (one-tailed t test: t = 0.3, df = 22, p = 0.39) or retention sessions (two-tailed t test: t = −1.2, df = 22, p = 0.24).

Fig. 3

a Total number of trials required to reach criteria on retention and reversal sessions after administration of MPH (0.33–1.0 mg/kg) or saline (1 ml/kg, SAL). b Effect of MPH or SAL administration on correct responses (% of total responses) in the first four and five bins of retention and reversal sessions respectively. c Perseverative responses (% of total responses) in the first five bins of the reversal session after saline or MPH. Data are expressed as mean ± SEM (n = 23). *p ≤ 0.05, significantly different from saline; **p ≤ 0.01

Neither dose of MPH affected the other measures collected, such as the latency to initiate a trial [dose × testing condition: F (1,22) = 0.8, p = 0.46; dose: F (1,22) = 1.2, p = 0.29], the pellet retrieval time [dose × testing condition: F (1,22) = 0.6, p = 0.55; dose: F (1,22) = 0.2, p = 0.81], or the number of anticipatory responses made per trial completed [dose × testing condition: F (1,22) = 0.3, p = 0.73; dose: F (1,22) = 0.3, p = 0.77] (Table 1).

Table 1 Effect of MPH, GBR, ATO, and DMI on motor latencies and anticipatory responding

In order to evaluate the source of these effects of MPH on trials to criteria, we next examined drug effects on the proportion of total trials that were completed correctly and incorrectly in the two learning conditions. For retention performance, ANOVA detected a main effect of dose on correct responding in the first four five-trial bins [bin × dose: F (1,22) = 2.4, p ≤ 0.05; dose: F (1,22) = 3.2, p ≤ 0.05] due to the fact that rats treated with the lower dose of MPH were overall less accurate than after saline administration (Fig. 3b). For the reversal sessions, there was also a significant bin × dose interaction [F (1,22) = 13.0, p ≤ 0.001], without a main effect of dose, for the proportion of trials completed correctly [F (1,22) = 1.7, p = 0.20]; MPH increased correct responding in the last two five-trial bins (Fig. 3b). Furthermore, no main effect of dose was found for perseverative responding [F (1,22) = 0.1, p = 0.87] or for neutral errors [F (1,22) = 1.4, p = 0.26], though there was a significant dose × bin interaction for perseveration [F (1,22) = 2.2, p ≤ 0.05] (Fig. 3c), but not for neutral errors [F (1,22) = 0.5, p = 0.87] (data not shown). This resulted from decreased perseveration in the final five-trial bins, an effect specific to the lower dose of MPH (Fig. 3c).

Experiment 2: selective inhibition of the DAT does not affect performance of the four-choice task

For these studies, the same set of rats (n = 23) involved in the MPH study were used to evaluate the effects of GBR. As shown in Fig. 4a, the two doses of GBR tested (2.5 and 5 mg/kg) did not affect the total number of trials required to reach criteria in either the retention or reversal sessions [testing condition × dose: F (1,22) = 0.2, p = 0.82; dose: F (1,22) = 1.2, p = 0.31]. However, GBR did dose-dependently increase the number of anticipatory responses per trial [dose × testing condition: F (1,22) = 1.1, p = 0.35; dose: F (1,22) = 3.9, p ≤ 0.05]. Post hoc comparisons revealed that the increase in anticipatory responses was significant for the higher dose of GBR in the retention session (t = −2.6, df = 22, p ≤ 0.05; Table 1). The latency to initiate a trial and the pellet retrieval time were not altered by GBR [dose × testing condition: all F’s < 1; dose, respectively: F (1,22) = 1.2, p = 0.31, F (1,22) = 1.3, p = 0.75] (Table 1).

Fig. 4
Effect of GBR (2.5–5 mg/kg) or SAL (1 ml/kg) on (a) total number of trials to criteria, (b) correct, and (c) perseverative responses (% of total responses). Data are expressed as mean ± SEM (n = 23)

In terms of the effect of GBR on the accuracy in the first 20 trials of the retention session, the ANOVA failed to reach significance [dose × bin: F (1,22) = 0.1, p = 0.99; dose: F (1,22) = 2.1, p = 0.13] (Fig. 4b). Considering the first five bins of the reversal session, no effect of GBR was found for the proportion of total trials completed correctly [dose × bin: F (1,22) = 1.5, p = 0.17; dose: F (1,22) = 0.1, p = 0.88] (Fig. 4b), for perseveration [dose × bin: F (1,22) = 0.5, p = 0.84; dose: F (1,22) = 0.7, p = 0.50] (Fig. 4c), or for neutral errors [dose × bin: F (1,22) = 1.8, p = 0.08; dose: F (1,22) = 0.7, p = 0.53] (data not shown).

Experiment 3: NET inhibitors selectively improve performance in the reversal sessions

Based upon our preliminary studies using a two-choice task (Seu and Jentsch 2006), we hypothesized that NET inhibitors (ATO and DMI) would improve performance of the four-choice reversal learning task. A priori tests revealed that both ATO and DMI decreased the total number of trials required to reach criteria in the reversal session (ATO: t = 1.7, df = 21, p ≤ 0.05; DMI: t = 2.4, df = 15, p ≤ 0.05), while neither treatment affected the number of trials to criteria in retention (ATO t = 0.8, df = 21, p = 0.44; DMI: t = −0.51, df = 15, p = 0.62; Figs. 5a and 6a). The ANOVA failed to detect a significant effect of dose on total trials to criterion for both ATO [F (1,21) = 3.3, p = 0.08] and DMI [F (1,15) = 2.4, p = 0.14]; however, there was a significant dose × testing condition interaction for DMI [F (1,15) = 6.3, p ≤ 0.05], but not for ATO [F (1,21) = 1.3, p = 0.27]. As shown in Figs. 5b and 6b, ATO and DMI did not affect correct responding in the first four five-trial bins of the retention sessions (all F’s < 1, ns). In addition, ANOVA detected only trends for increases in correct responding in reversal for both NET inhibitors [ATO: F (1,21) = 2.7, p = 0.12; DMI: F (1,15) = 3.3, p = 0.09]; however, there was a significant dose × bin interaction for DMI [F (1,15) = 2.3, p = 0.04], but not for ATO [F (1,21) = 1.4, p = 0.24] (Figs. 5b and 6b).

Fig. 5
Effect of ATO (1 mg/kg) or SAL (1 ml/kg) on (a) total number of trials to criteria, (b) correct, and (c) perseverative responses (% of total responses). Data are expressed as mean ± SEM (n = 22). *p ≤ 0.05, significantly different from saline

Fig. 6
Effect of DMI (5 mg/kg) or SAL (1 ml/kg) on (a) total number of trials to criteria, (b) correct, and (c) perseverative responses (% of total responses). Data are expressed as mean ± SEM (n = 16). *p ≤ 0.05, significantly different from saline; **p ≤ 0.01; ***p ≤ 0.001

Notably, ANOVA detected a main effect of dose for both NET inhibitors on perseverative responses during reversal [ATO: F (1,21) = 4.8, p = 0.04; DMI: F (1,15) = 4.9, p = 0.04], as well as a dose × bin interaction for DMI [F (1,15) = 2.4, p = 0.04], but not for ATO [F (1,21) = 1.3, p = 0.28] (Figs. 5c and 6c). On the other hand, there was no main effect of dose for either of the two drugs on neutral errors [ATO: F (1,21) = 0.04, p = 0.85; DMI: F (1,15) = 0.06, p = 0.81] (data not shown) nor were there any dose × testing condition interactions [ATO: F (1,21) = 1.5, p = 0.21; DMI: F (1,15) = 0.9, p = 0.50].

As shown in Table 1, the latency to initiate a trial tended to be affected by ATO [dose × testing condition: F (1,21) = 1.6, p = 0.22; dose: F (1,21) = 3.1, p = 0.09], while for DMI, there was a main effect of dose [F (1,15) = 5.2, p ≤ 0.05] but no dose × testing condition interaction [F (1,15) = 0.1, p = 0.78]; further paired comparisons revealed that DMI significantly increased latency only in the retention sessions (t = −2.1, df = 15, p ≤ 0.05; Table 1). In addition, the time required to retrieve the reward was affected by both ATO [dose × testing condition: F (1,21) = 2.8, p = 0.11; dose: F (1,21) = 4.3, p ≤ 0.05] and DMI [dose × testing condition: F (1,15) = 0.0, p = 0.95; dose: F (1,15) = 18.2, p ≤ 0.001]. As shown in Table 1, ATO increased pellet retrieval time in the reversal sessions (t = −2.5, df = 21, p ≤ 0.05), while DMI exerted effects in both the retention (t = −2.2, df = 15, p ≤ 0.05) and the reversal sessions (t = −3.4, df = 15, p ≤ 0.01). Anticipatory responses were not affected by ATO [dose × testing condition: F (1,21) = 1.4, p = 0.24; dose: F (1,21) = 1.9, p = 0.17], while there was a trend for a main effect of dose for DMI [F (1,15) = 3.9, p = 0.06] without a significant dose × testing condition interaction [F (1,15) = 0.9, p = 0.36] (Table 1).
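The paired comparisons reported in this section have df = n − 1 (for example, df = 21 with n = 22 for ATO and df = 15 with n = 16 for DMI), which is what a within-subject t test on saline versus drug scores produces. The Python sketch below shows that calculation under stated assumptions: the function name `paired_t` is hypothetical, the example scores are invented purely for illustration and are not data from this study, and the saline-minus-drug sign convention (positive t when the drug lowers the score) is an assumption rather than something the text specifies.

```python
import math
from statistics import mean, stdev


def paired_t(saline, drug):
    """Return (t, df) for a paired t test on within-subject differences.

    Differences are taken as saline minus drug (an assumed convention),
    so a positive t means the score was lower under the drug.
    """
    if len(saline) != len(drug):
        raise ValueError("each subject needs one saline and one drug score")
    diffs = [s - d for s, d in zip(saline, drug)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # sample SD of the differences
    return mean(diffs) / standard_error, n - 1


# Invented trials-to-criterion scores for four hypothetical subjects.
saline_scores = [40, 35, 50, 45]
drug_scores = [30, 33, 42, 41]
t_value, df = paired_t(saline_scores, drug_scores)
print(f"t = {t_value:.2f}, df = {df}")
```

Converting t to a p value requires the t distribution with the returned degrees of freedom, and the result depends on whether the test is one-tailed or two-tailed, so the sketch stops at the statistic itself.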

Monkey studies: effect of MPH and ATO on performance of a visual discrimination task

The effect of MPH and ATO was also tested in non-human primates performing a visual discrimination task. Considering the retention session (data not shown), there was no main effect of drug treatment for the total number of errors [F (2,6) = 1.5, p = 0.3]. ATO did not alter the total number of errors in the retention sessions (two-tailed t test: t = −0.8, df = 3, p = 0.47), while MPH exerted a very weak trend to increase the total number of errors (two-tailed t test: t = −1.8, df = 3, p = 0.16). Furthermore, none of the drugs affected the number of errors made during the retention component given within the reversal sessions [ANOVA for effect of treatment: F (2,6) = 2.3, p = 0.4; two-tailed t tests, ATO: t = −0.7, df = 3, p = 0.49; MPH: t = −1.2, df = 3, p = 0.29] (Fig. 7).

Fig. 7
Effect of MPH (0.33 mg/kg) or ATO (1 mg/kg) on performance on the retention or reversal of discrimination set 2 (i.e., set of discriminanda that were reversed after a priming retention). For the retention phase, we report total number of errors, while for the reversal phase, perseverative and neutral errors are presented separately. Data are expressed as mean of errors ± SEM (n = 4). *p ≤ 0.05, significantly different from saline

On the other hand, there was a main effect of drug treatment for perseverative errors when reversal was not preceded by retention [between session reversal; main effect of drug: F (2,6) = 5.4, p = 0.04], while the effect of treatment on perseverative errors preceded by retention was at a trend level [within session reversal; F (2,6) = 3.8, p = 0.08]. Tests of a priori hypotheses showed that ATO decreased the number of perseverative errors made by subjects when reversing the discrimination that was preceded by a retention test (one-tailed t test: t = 2.4, df = 3, p ≤ 0.05; Fig. 7) while not affecting perseveration for the reversal which was not preceded by a retention test (one-tailed t test: t = −0.5, df = 3, p = 0.69; data not shown). The opposite trend was observed for MPH; the drug did not significantly affect reversal of the discrimination that was preceded by a retention test (one-tailed t test: t = 1.2, df = 3, p = 0.15; Fig. 7) but tended to decrease perseverative errors when the non-retained discrimination was reversed (one-tailed t test: t = 1.9, df = 3, p = 0.07; data not shown). Neither of the drugs tested affected the number of neutral errors made when reversal followed retention [effect of treatment: F (2,6) = 1.2, p = 0.4; two-tailed t tests, ATO: t = 1.2, df = 3, p = 0.31; MPH: t = 1.0, df = 3, p = 0.36] (Fig. 7) or when it did not [effect of treatment: F (2,6) = 0.9, p = 0.5; two-tailed t tests, ATO: t = −1.2, df = 3, p = 0.28; MPH: t = −1.2, df = 3, p = 0.30] (data not shown).

Discussion

Using a novel four-choice position discrimination task developed to study behavioral flexibility in rats, we show that three drugs that inhibit NET with varying degrees of selectivity—ATO, DMI, and MPH—exerted a common effect to reduce the total number of trials required to reverse a learned discrimination; selective inhibition of DAT was not associated with a similar performance effect. Furthermore, very similar effects of ATO and MPH were found in monkeys trained to perform a visual discrimination reversal task.

In the four-choice discrimination task for rats, subjects were required to either retain or reverse a position–reward association learnt in the previous session; the difficulty associated with switching responses was revealed by the significantly higher number of trials required to complete reversal, as opposed to retention, conditions. The dissociations between retention and reversal performance specifically arose from the conditioned tendency to respond to the previously learnt rule; this pre-potent response style facilitates efficient performance when the rule is unaltered (retention), while it causes difficulty with switching position–reward associations (reversal). Notably, rats appeared to guide their performance according to a set of long-term reinforcement rules that incorporated information learned over many sessions because they responded most to the hole reinforced in the most recent session and least to the hole reinforced longest in the past (Fig. 2b). In addition, reversal of a learned discrimination in monkeys was associated with a specific persistence of responding towards the previously trained stimulus. Taken together, these data support the notion that our tasks incorporate an interesting element of reversal performance which is the need to inhibit a previously learned response.

Behavioral basis of NET inhibitor effects on reversal learning

Further evidence that the reversal tasks used measure the ability to inhibit pre-potent responses stems from the pharmacological studies. In rats, selective NET inhibitors specifically reduced the total number of trials required to complete reversal sessions and tended to increase the proportion of trials completed correctly in the first stages of the session while having no measured effect on performance in retention. Importantly, the effect of NET inhibitors on total trials required to complete the reversal session was associated with a specific decrease in the perseverative responding in the first stage of the session. In monkeys, ATO specifically decreased perseverative responses when the subjects were tested for their ability to change a response after retention (a condition that likely “primes” the conditioned action); this was contrasted with the effect of MPH which only decreased perseverative errors when no retention was given prior to reversal, an effect that may relate to the ability of MPH to disrupt retention and, hence, reduce interference from the previously learned association. These data may suggest that the effects of the NET inhibitors were mediated by increased ability to overcome pre-potent responding toward the previously trained response. Nevertheless, we cannot rule out the possibility that these drugs were improving the ability to overcome learned irrelevance; in both cases, a similar reduction in perseverative responses would be expected.

On the other hand, the selective DAT inhibitor GBR failed to affect retention or reversal performance other than producing a dose-dependent increase in anticipatory responses. Accordingly, previous experiments have shown that the same doses of GBR used in our study increased premature responses in the five-choice serial reaction time task (van Gaalen et al. 2006a) and impulsive decision making in a delayed reward task (van Gaalen et al. 2006b). Together with our results, this evidence suggests that the inhibition of DAT may result in disinhibited responding without affecting other forms of impulsivity such as perseveration; this is in line with the idea that different neuronal mechanisms may underlie different forms of impulsivity (Dalley et al. 2008; Evenden 1999).

The ability of ATO and DMI to improve reversal performance was also displayed by the lower dose (0.33 mg/kg) of the stimulant drug MPH, which is known to inhibit both NET and DAT (Bymaster et al. 2002; Han and Gu 2006). Interestingly, similar doses of MPH have been shown to improve response inhibition (Eagle et al. 2007) and other executive functions in rodents (Arnsten and Dudley 2005; Berridge et al. 2006). Collectively, this dose range is associated with plasma concentrations very similar to those produced by clinical dosing in humans. Although the improvement in reversal learning induced by MPH in rats and monkeys was associated with a decrease in perseverative responses, the same dose of MPH generally impaired retention performance, an effect not found with selective NET inhibitors. While the improvement in reversal and impairment in retention caused by MPH could, theoretically, be attributed to a deficit in the retrieval of the previously learned rule, making it easier to learn a new response in reversal, this seems not to be the case in the current study because, in rats, MPH reduced perseveration only in the last bins considered (trials 16–25), whereas the effect would be expected from the first trials if a deficit in rule retrieval were the cause.

Another possible explanation for the opposite effects of MPH on retention and reversal is that the drug generally increases switching behavior; in fact, it has been shown that stimulant drugs increase switching behavior in rodents (Evenden and Robbins 1983). If this effect alone explained the results, however, we might have expected to observe an increase in neutral errors, but this was not the case in the current study.

In contrast to the effects of the 0.33-mg/kg dose, no difference in performance of either testing condition was induced by the high dose of MPH. This may result from an inverted-U dose–response for MPH on cognition such that low doses of this stimulant improve cognitive control, while higher doses induce hyperactivity and cognitive impairment (Berridge et al. 2006; Eagle et al. 2007). In this study, however, the highest dose of MPH did not produce any impairment, disinhibited responding, or hyperactivity, suggesting that an inverted-U dose–response effect cannot solely account for our results.

Monoamine systems and behavioral flexibility

Recent data using neurochemical depletion or pharmacological approaches have highlighted a role for the serotonergic system as an important modulator of reversal learning (Boulougouris et al. 2008; Clarke et al. 2007). Our results, along with other experimental and clinical evidence (see below), suggest that the dopaminergic and noradrenergic systems may also contribute to behavioral flexibility and cognitive control. For example, pharmacological or genetic interventions targeting different dopamine receptor subtypes affect performance in reversal learning or attention set-shifting paradigms (Floresco et al. 2006; Izquierdo et al. 2006; Lee et al. 2007; Mehta et al. 2001; Ragozzino 2002). Additionally, administration of the alpha-2 adrenergic receptor agonist guanfacine has been shown to improve reversal learning in aged monkeys (Steere and Arnsten 1997), while in rodents, stimulus reversals and extra-dimensional shifts were improved by a high dose of the alpha-2 antagonist atipamezole (Lapiz and Morilak 2006). Furthermore, accumulating evidence suggests that dysfunctions of both the noradrenergic and the dopaminergic systems may contribute to ADHD symptoms, including impulsivity, inattention, executive dysfunction, and poor cognitive control, and that drugs acting on these systems may be helpful for the treatment of the disorder (Arnsten 2006; Biederman and Spencer 1999; Frank et al. 2007; Robbins 2007).

Interestingly, the current study shows that NET inhibitors with clinical efficacy in the treatment of ADHD improve the ability of monkeys and rats to overcome pre-potent responding. Our results are in accord with recent studies reporting the beneficial effects of selective and non-selective NET inhibitors on behavioral inhibition in rodents, healthy humans, and ADHD subjects (Aron et al. 2003a; Blondeau and Dellu-Hagedorn 2007; Chamberlain et al. 2007, 2006; Eagle et al. 2007; Navarra et al. 2008; Robinson et al. 2008).

Pharmacological and neurochemical mechanisms

NET and DAT are expressed on noradrenergic and dopaminergic terminals, respectively, and their function is to reuptake norepinephrine or dopamine released in the extracellular compartment in order to terminate their synaptic or extra-synaptic actions; therefore, blockade of NET or DAT results in an increase of extracellular levels of norepinephrine and dopamine, respectively, and consequent stimulation of receptors in brain regions receiving projections from these neurotransmitter systems.

The benefit associated with administration of NET inhibitors, compared with the absence of relevant effects of the selective DAT inhibitor, seems to suggest that the modulation of the noradrenergic system is crucial for the improvement in behavioral flexibility observed in our study. However, it is well known that the selective NET inhibitors are also capable of increasing extracellular dopamine in specific regions such as the prefrontal cortex (Berridge et al. 2006; Bymaster et al. 2002; Swanson et al. 2006). In fact, in the prefrontal cortex, a low expression of DAT (Lammel et al. 2008) indicates that a significant portion of extracellular dopamine is cleared by NET (Carboni et al. 2006; Mazei et al. 2002; Moron et al. 2002), which has high affinity for both catecholamines. For this reason, administration of selective and non-selective NET inhibitors results in an increase in both norepinephrine and dopamine within the prefrontal cortex (Berridge et al. 2006; Bymaster et al. 2002; Swanson et al. 2006), while local infusion of selective DAT inhibitors does not significantly affect extracellular levels of dopamine in the cortex (Mazei et al. 2002). On the other hand, drugs acting on DAT, including MPH, increase dopamine in the ventral and dorsal striatum (Carboni et al. 2006; Kuczenski and Segal 1997), an action that is not displayed by selective NET inhibitors (Bymaster et al. 2002), and these neurochemical effects have been related to the abuse liability of stimulant medications. For these reasons, the modulation of extracellular levels of both catecholamines in specific regions such as the prefrontal or orbitofrontal cortex may be the mechanism by which NET inhibitors improve inhibitory control and behavioral flexibility.

Conclusion

Conceptually, treatments that enhance executive/cognitive control over pre-potent responses are emerging as leading candidate medications for the treatment of an array of externalizing disorders, including ADHD, substance abuse and dependence, and other forms of impulse control problems. The extent to which selective inhibitors of the norepinephrine membrane transporter improve multiple aspects of inhibitory control over responding (Blondeau and Dellu-Hagedorn 2007; Chamberlain et al. 2007; Lapiz et al. 2007; Navarra et al. 2008; Robinson et al. 2008) underscores their potential for the treatment of these behavioral disorders. What remain unknown, however, are the discrete neuronal effects of increased extracellular dopamine and norepinephrine levels that contribute to these enhancements in inhibitory control. While certain component processes that contribute to cognitive control, e.g., working memory maintenance, depend more on dopamine D1-like receptor signaling (Arnsten et al. 1994; Sawaguchi and Goldman-Rakic 1991), recent work also indicates that D2-like signaling likely plays an important role in updating central representations and behavior (Floresco et al. 2006; Lee et al. 2007; Wang et al. 2004), demonstrating the complex mechanisms by which catecholamine transmitters modulate dissociable aspects of cognitive and executive functioning. Further work exploring these multifaceted effects of catecholamine transmitters is clearly needed.