Introduction

There are presently no authentic longevity therapeutics. Such compounds would intervene in the process of aging to extend mean and/or maximum life span, maintain physiological function, and mitigate the onset and severity of a broad spectrum of age-related diseases in mammals. Such drugs might engage the pathways used by caloric, methionine, and phenylalanine restriction, and the longevity-enhancing mutations (reviewed in Spindler 2009). The terms “CR mimetics” and “geroprotectors” have been used to describe such compounds (Weindruch et al. 2001; Roth et al. 2001; Cao et al. 2001; Anisimov 1982; Lippman 1981). In this report, we will use the general term “longevity therapeutics.”

While a full understanding of the mechanisms of aging will greatly facilitate the development and deployment of longevity therapeutics, drug discovery and development have a long history of using surrogate assays for identifying therapeutics, often with little knowledge or understanding of the etiology of the diseases for which the therapeutics were intended (discussed in Spindler 2006). Indeed, most of the medications currently in our armamentarium were discovered using surrogate assays. Thus, the development and refinement of surrogate assays for longevity therapeutics should speed their identification.

Multiple methods have been used in attempts to identify such compounds. For example, we and others have utilized genome-wide microarray studies of treated mice to identify potential therapeutics (Barger et al. 2008; Spindler and Dhahbi 2007; Spindler and Mote 2007; Spindler 2006; Dhahbi et al. 2005; Corton et al. 2004). Another approach, which will be discussed here, is the direct assay of compounds for their effects on the life span of rodents.

Longevity assays using genetically normal, healthy rodents

In mice, a number of natural mutations, gene knockouts, and overexpressed transgenes are known to extend longevity and increase health span (Selman et al. 2008; Taguchi et al. 2007; Kurosu et al. 2005; Holzenberger et al. 2003; Flurkey et al. 2001; Coschigano et al. 2000; Zhou et al. 1997; Brown-Borg et al. 1996). Thus, potential therapeutic targets for life span extension exist in mammals. However, no robustly effective, safe, and widely recognized longevity therapeutics exist at present. The likely reason that such drugs have not been identified is that we have not mounted an effective search for them. Life span studies in rodents have been used in this search (Table 1 and Electronic supplementary material (ESM) Table 1). More recently, this literature benefits from the improved levels of hygiene used in animal husbandry (e.g., see Sebesteny 1991). For example, several older studies in Table 1 appear to report data consistent with the presence of infectious agents in the rodent colony (Ferder et al. 1993; LaBella and Vivian 1978; Sperling et al. 1978). Despite these improvements, the design and implementation of rodent life span studies could be improved further.

Table 1 Summary appraisal of the published life span studies using healthy rodents

Table 1 and ESM Table 1 summarize and evaluate all of the rodent life span studies we found using repeated key word searches of the online databases. In ESM Table 1, under the heading “Evaluation,” we present our evaluation of the study results. ESM Table 1 presents 106 life span studies performed with healthy rodents. We excluded from this table 20 rodent life span studies performed with melatonin, which are contradictory in their results and which have been reviewed elsewhere (Anisimov et al. 2006).

Despite the fact that the effects of caloric restriction on life span were described 76 years ago (McCay et al. 1935), drug screening studies which regulate or measure food consumption are rare. We found only six studies which measured food consumption and also found life span extension (Liang et al. 2010; Caldeira da Silva et al. 2008; Cai et al. 2007; Stoll et al. 1997; Yen and Knoll 1992; Cotzias et al. 1977). These were deprenyl fed to Syrian hamsters (Stoll et al. 1997); deprenyl and Dinh lang root extract fed to mice (Yen and Knoll 1992); dinitrophenol fed to normal mice of a short-lived strain (Caldeira da Silva et al. 2008); l-dopa fed to male mice (Cotzias et al. 1977); marine collagen peptides fed to Sprague–Dawley rats (Liang et al. 2010); and reduced advanced glycation end product-containing standard mouse diet fed to mice (Cai et al. 2007). These are the only studies in the literature showing an increase in rodent longevity for which the potential effects of “voluntary” CR on life span can be confidently excluded. Four studies which controlled or measured caloric intake found no change in life span with various treatments (Smith et al. 2010; Spindler and Mote 2007; Lee et al. 2004; Pugh et al. 1999b).

Six other studies found life span extension and reported the effects of the treatments on body weight as a surrogate measure of food consumption (Table 1 and ESM Table 1). However, there are demonstrated instances in which a discordance was found between body weight and food consumption, making body weight a potentially unreliable surrogate measure of caloric consumption (see below). These treatments are: coenzyme Q10 administered orally to male Wistar rats fed a diet high in polyunsaturated fatty acids (Quiles et al. 2004); Ginkgo biloba extract administered orally to male F344 rats (Winter 1998); green tea polyphenols administered in drinking water to male C57BL/6 (B6) mice (Kitani et al. 2007); 2-mercaptoethanol administered orally in food to male BC3F1 mice (Heidrick et al. 1984); PBN fed to B6 male mice (Saito et al. 1998); and piperoxane administered by injection to F344 rats (Compton et al. 1995).

Twenty studies found extended life span, but potential CR effects cannot be excluded based on the data available (Table 1 and ESM Table 1). Many of these reports include statements to the effect that no change in body weight (most common) or food intake (rarely) occurred, but no data or analysis are shown. No indication is given of whether the data were anecdotal or systematic, when and how many times during the study the measurements were taken, or what statistical methods were used to analyze the data. These uncertainties, coupled with the potential fallibility of weight as a biomarker for food consumption (see below), make these studies less persuasive.

Twenty-nine other studies report life span extension by treatments, but the body weight and/or food consumption data presented in the publication suggest that induced voluntary CR was responsible for the longevity effects observed. Of the remaining studies, nine would be difficult to repeat because the composition, preparation, or mode of delivery of the treatment agents is either published in difficult-to-obtain journals or not reported.

Food consumption should be measured

Body weight is often used in longevity studies as a surrogate measure of caloric consumption (Table 1 and ESM Table 1). The vast majority of the studies using body weight as a surrogate do not report the methods used or the results obtained (ESM Table 1). Thus, the reader cannot know whether the conclusions drawn used systematic or anecdotal measures. The number of animals weighed, the number of times they were weighed, and the statistics used are not reported. Such problems are evident in two reports from the NIA Interventions Testing Program (NIH-ITP; Harrison et al. 2009; Strong et al. 2008; Miller et al. 2007). While the studies are unusually robust in many aspects of experimental design, including large cohorts of genetically heterogeneous mice of both sexes tested at multiple sites, most of their reports give no details regarding body weight measurements (Harrison et al. 2009; Strong et al. 2008; Miller et al. 2007). For example, NIH-ITP investigators have reported that the same concentration of rapamycin fed to HET3 mice produced either no effect on body weight (methods and data unspecified; Harrison et al. 2009) or a 6% or 10% decrease in body weight (for females and males, respectively; Miller et al. 2011). It is therefore possible that the mice in the first study experienced an undetected reduction in body weight. It is also unclear whether the reductions in body weight found in the second study were due to reduced caloric intake. Thus, “voluntary CR” may have played a role in the longevity effects observed. While one may seek further information from these investigators at this time, our publications are likely to outlive us.

Body weight is an unreliable surrogate measure of caloric intake. Both dietary l-dopa and dietary dinitrophenol reduce body weight without changing food consumption (Caldeira da Silva et al. 2008; Cotzias et al. 1977). A drug-induced discordance between body weight and food intake may not be uncommon. We found five agents or combinations of agents that significantly decreased body weight and four agents or combinations of agents that significantly increased the body weight of mice fed isocalorically (unpublished results). For example, mice fed food supplemented with four doses of nordihydroguaiaretic acid (NDGA) experienced an approximately dose-responsive decrease in body weight without a corresponding decrease in food consumption (Fig. 1). Food was packed in 1-g pellets and fed daily. Food intake for each of these groups was carefully monitored and recorded. Any uneaten food, even when masticated and dropped into the bedding, was readily identifiable by shape, color, and texture. Quantitatively, the group fed the highest dose of NDGA weighed the same as or less than a 20% calorie-restricted (20% CR) group at most times during the study (Fig. 1). Others have reported, without showing data, that mice consuming NDGA-supplemented diets ad libitum have no change in body weight relative to controls (Strong et al. 2008). Thus, it is possible that the mice in this published study maintained their body weight by increasing food consumption. Feeding measured quantities of food and monitoring its consumption ensures that life span data are not confounded by changes in caloric consumption. This reduces the likelihood of CR-related changes in life span (Merry 2002; Compton et al. 1995).

Fig. 1

Isocaloric feeding of diets containing NDGA reduced body weight without altering food consumption. The left axis shows the mean bimonthly weights of dietary groups fed AIN-93M diet with no additional additives (empty square) or AIN-93M diet containing NDGA at 1.5-g/kg diet (empty triangle), 2.5-g/kg diet (empty diamond), 3.5-g/kg diet (empty hexagon), and 4.5-g/kg diet (); a 20% CR diet (empty downturned triangle); or a 40% CR diet (circle). The mice were shifted from chow feeding to the defined diets at 12 months of age. The right axis shows the percentage of the kilocalories fed to each group of mice which were actually consumed for the group fed AIN-93M diet with no additives (filled square); AIN-93M diet containing NDGA at 1.5-g/kg diet (filled triangle), 2.5-g/kg diet (filled diamond), 3.5-g/kg diet (filled hexagon), and 4.5-g/kg diet (); a 20% CR diet (filled downturned triangle); or a 40% CR diet (filled circle). The symbols representing food consumption are superimposed in the figure, making them difficult to distinguish because the mice ate essentially all their food. Error bars and symbols for statistical significance were omitted for the sake of clarity. The body weights were significantly different from controls, as judged by the non-parametric Mann–Whitney test, for the NDGA 1.5-g/kg diet group at 22 months (P < 0.01), 24 months (P < 0.001), 26 months (P < 0.01), 28 months (P < 0.05), and 30 months (P < 0.01); for the 2.5-g/kg diet group at 18 months (P < 0.01), 20–26 months (P < 0.001), and 28 months (P < 0.01); for the 3.5-g/kg diet group at 20 and 22 months (P < 0.01), 24 and 26 months (P < 0.001), and 28 and 30 months (P < 0.01); and for the 4.5-g/kg diet group at 16 months (P < 0.01) and 18–30 months (P < 0.001). These studies used male B6C3F1 mice (Harlan Breeders, Indianapolis) randomly assigned to treatment groups at 3 weeks of age. 
At 12 months of age, the mice were shifted from ad libitum chow feeding (Diet no. 5001, Purina Mills, Richmond, IN) to daily feeding with either 13.3 kcal/day per mouse of control diet (AIN-93M, Diet no. F05312; Bio-Serv, Frenchtown, NJ) or daily feeding with an identical quantity of control diet supplemented with the indicated concentrations of NDGA. The 20% CR group was shifted from ad libitum chow feeding to 11 kcal/day per mouse of AIN-93M 20% Restricted Diet (Diet no. F06298, Bio-Serv). The 40% CR group was shifted from ad libitum chow feeding to 11 kcal/day per mouse of AIN-93M 20% Restricted Diet for 2 weeks and thereafter to 7.46 kcal/day of AIN-93M 40% Restricted Diet (Diet no. F05314, Bio-Serv). The diets for the 20% and 40% calorically restricted groups were fortified so the mice received fewer calories in the form of carbohydrate than the other groups, but approximately equal amounts of fat, protein, vitamins, and minerals. All mice were fed the amounts indicated daily. Food consumption was monitored at the time of feeding, and any food left was noted and removed. With rare exceptions, all food was eaten each day. The drugs were mixed with powdered diet and cold-pressed into 1-g pellets by Bio-Serv. The food was stored moisture free at 4°C until used. The mice drank acidified (pH 4.0) tap water ad libitum and were maintained on a 12-h light/dark cycle at 22°C. Cohorts of 296 negative control mice and 36 CR and treated mice were utilized.

Monitoring of both food consumption and body weight will identify instances in which a compound produces a discordance between them. Drug-induced changes in activity, metabolic rate, or intestinal absorption of calories might lead to such a discordance, which would not be detected by monitoring of only body weight. Once detected, a discordance can be investigated further using measurements of spontaneous activity, metabolic rate, and absorption of calories (e.g., Westbrook et al. 2009; Adams et al. 2006). Thus, measured feeding coupled with body weight monitoring is a much more robust approach to life span studies than body weight monitoring alone.
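The comparison described above can be sketched as a simple screening check. The following Python fragment is a minimal illustration, not a method used in the studies reviewed here. The function names, the dictionary keys, and the 5% and 2% thresholds are all hypothetical choices made for the example, and such a check would supplement rather than replace a formal statistical comparison of the groups.

```python
from statistics import mean


def percent_change(treated, control):
    """Percent difference of the treated-group mean relative to the control mean."""
    control_mean = mean(control)
    return 100.0 * (mean(treated) - control_mean) / control_mean


def flag_discordance(weights_treated, weights_control,
                     intake_treated, intake_control,
                     weight_threshold=5.0, intake_threshold=2.0):
    """Flag a body weight change that is not matched by a food intake change.

    The thresholds (in percent) are hypothetical values for illustration only.
    """
    weight_change = percent_change(weights_treated, weights_control)
    intake_change = percent_change(intake_treated, intake_control)
    # Discordance: body weight shifts noticeably while intake stays essentially flat.
    discordant = (abs(weight_change) >= weight_threshold
                  and abs(intake_change) < intake_threshold)
    return {
        "weight_change_pct": weight_change,
        "intake_change_pct": intake_change,
        "discordant": discordant,
    }
```

A group whose body weight and food intake fall together would not be flagged by this check. That pattern is instead consistent with reduced caloric intake, and it would be missed entirely if only body weight were recorded and happened not to change.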

Methods for isocaloric feeding

In the author’s experience, measuring food consumption is less difficult and expensive than it is sometimes assumed to be. In an ongoing longevity study involving 2,400 mice, measured feeding is ~9% of total costs. To deliver a known amount of food to each cage conveniently, we use the method described by Weindruch and colleagues (Pugh et al. 1999a). The food (AIN-93M) and any additional components are cold-packed into 1-g pellets by Bio-Serv (Frenchtown, NJ). These round pellets are conveniently scooped into a 1.6-cm inner diameter Plexiglas tube fitted with a commercially available plastic cap. Tubes cut to different lengths are used to deliver different numbers of pellets to the cages.

If a supplemented diet is under-consumed, flavoring can be added, the supplement can be changed to an agent with similar actions, the supplement concentration in the food can be reduced, or, if desired, the amount of food given to a control group can be decreased to match the amount consumed by the treated group. We slightly underfeed all the mice in our studies to ensure that all food is eaten.

Healthy, long-lived rodents, such as an F1 hybrid or a more genetically heterogeneous mouse, should be used for compound screening

During our survey of the literature, we found many reports of life span-based compound screening performed with short-lived or enfeebled rodents (data not shown). By “enfeebled,” we mean natural or selected rodent lines that have genetic (or possibly epigenetic) changes that reduce longevity and health relative to their unaltered parental or control strains. For example, many studies utilized senescence-accelerated prone mouse strains (SAMP1 through SAMP9) to rapidly screen for longevity therapeutics (Rodriguez et al. 2008; Li et al. 2007; Umezawa et al. 2000; Boldyrev et al. 1999; Kumari et al. 1997; Edamatsu et al. 1995; Zhang et al. 1994). SAMP mice suffer from the early onset of a spectrum of age-related pathologies which abbreviate their life span. We found only one study in which the effectiveness of an agent was tested in both an SAMP mouse (SAMP8) and in one of its associated control mouse strains (SAMR1; Zhang et al. 1994). In this study, a botanical which extended the life span of SAMP8 mice did not extend the life span of the control strain. Similarly, resveratrol was reported to extend the life span of high fat-fed, obese, and diabetic mice (Baur et al. 2006). While this article has been cited by many as evidence that resveratrol can extend mammalian life span, the results did not translate to healthy mice (Pearson et al. 2008). Thus, screening agents in enfeebled rodents has not yet been shown to facilitate the identification of compounds which extend the life span of healthy animals.

For these reasons, studies designed to identify longevity therapeutics should utilize long-lived mice, such as an F1 hybrid or more genetically heterogeneous mouse. F1 hybrid mice, which are widely available, are genetically heterozygous at all loci for which their parents are heteroallelic. They are more disease- and stress-resistant and have larger litters and longer life spans than their inbred parental lines (Flurkey et al. 2009). HET3 mice, which are produced using a four-way crossbreeding scheme, are more genetically heterogeneous than F1 mice and are used by the NIH-ITP. However, they are more difficult to produce and have shorter life spans than some F1 mice. For example, B6C3F1 mice have a mean life span of about 915 days (Spindler and Mote 2007; Pugh et al. 1999b; Smith and Walford 1977), while HET3 mice have a mean life span of about 800 days (Strong et al. 2008). Longer life spans are usually regarded as signs of greater vigor. Outbred mice, which are even more genetically heterogeneous than HET3 mice, are more vigorous and less expensive than inbred mice (Flurkey et al. 2009). However, they have the disadvantage of being genetically undefined. Because each mouse is genetically unique, study results can be more varied and thus more difficult to reproduce.

Chemically defined diets should be used for gerontological research

There are three general categories of rodent diets: cereal-based (non-purified), purified, and chemically defined (Kozul et al. 2008; Reeves et al. 1993; American Institute of Nutrition ad hoc Committee on Standards for Nutritional Studies 1977). The majority of the studies summarized in ESM Table 1 appeared to have used non-purified or purified cereal-based diets. However, cereal-based diets are often variable in composition (American Institute of Nutrition ad hoc Committee on Standards for Nutritional Studies 1977), and this variability, and the presence of trace contaminants, can strongly influence experimental results (Kozul et al. 2008; Jensen and Ritskes-Hoitinga 2007; Allred et al. 2004; Thigpen et al. 2004, 2003). For example, Prolab-RMH 1000 rodent chow contains appreciable quantities of polychlorinated dibenzo-p-dioxins and dibenzofurans, probably from pesticide residues (Schecter et al. 1996). Purina Laboratory Rodent diet 5001 (LRD-5001) contains high concentrations of methylmercury and a mixture of inorganic and organic arsenic compounds at a concentration 36 times the EPA-recommended level for drinking water (Kozul et al. 2008; Weiss et al. 2005). The specifications for diets such as NIH-31 allow manufacturers to use any of a number of sources of protein, including fish meal, a possible source of arsenic and other contaminants, or soy, a possible source of pesticide residue. Thus, purified, defined diets are preferable.

Use of a positive control is highly desirable

Many life span studies are published without the benefit of a positive control group, such as a 40% CR group. If none of the compounds tested in a study extends life span, the possibility cannot be excluded that the rodents would not have responded to any longevity treatment under the study conditions. Few reviewers would endorse the publication of negative biochemical data without the inclusion of a positive control to show that the assay was working. This should be similarly important for rodent life span studies.

Dosages of agents tested in rodents

The dosages at which potential therapeutics are tested in rodents must balance a number of competing theoretical and practical issues. Ideally, one would like to know that a therapeutic level of the agent is maintained throughout a life span study. Of course, the ideal therapeutic level of an agent is not known for most life span studies. Furthermore, food intake, body volume, intestinal absorption, and metabolism may change with age. Monitoring the blood levels of an agent throughout a life span study would be difficult and expensive. Group sizes which would make rodents available for testing throughout the study are often impractical.

Several approaches can mitigate these limitations. Published studies with well-defined treatment endpoints can be used to estimate dosages. In this way, one can be reasonably certain that a therapeutic level of the agent is achieved. Initial signs that a dosage is too high, such as reduced food intake or inattention to grooming, can be used to adjust dosages “on the fly.” Where rodent studies cannot be found, equivalent rodent dosages can be calculated from human dosages using default cross-species scaling factors (Reagan-Shaw et al. 2008; US EPA 2005; Rhomberg and Lewandowski 2004; Dourson et al. 1992, 1996; Dourson and Stara 1983). These scaling factors are often used to set and assess drug dosages in human and animal studies (e.g., Chalastanis et al. 2010). Empirically, small animals have been found to require larger dosages per gram body weight than larger animals. These differences are due to pharmacokinetic differences (e.g., rates of uptake, metabolism, and clearance of compounds) and to pharmacodynamic differences (e.g., rates of damage to macromolecules, cellular repair and regeneration, signaling cascades, and proliferative responses) between small and large animals. One widely used scaling formula increases the human dosage in milligrams per kilogram body weight/day by tenfold to obtain the equivalent mouse dosage. Another scaling factor also in use is based on the 3/4 power of body weight [i.e., (milligrams/kilogram body weight)^(3/4)/day], which leads to equivalent mouse dosages that are about sevenfold higher than the equivalent human dosages. While these calculations were initially developed for chemotherapeutics, they are also used as starting points when human dosages must be extrapolated from preclinical rodent data (Chalastanis et al. 2010).
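The arithmetic behind the approximately sevenfold figure can be illustrated with a short calculation. The Python sketch below assumes that the total daily dose scales with body weight raised to the 3/4 power, so that the dose per kilogram scales with body weight raised to the power of -1/4. The reference body weights (70 kg for a human and 25 g for a mouse) and the function names are assumptions introduced for the example, not values taken from the cited references.

```python
def allometric_factor(human_kg=70.0, mouse_kg=0.025, exponent=0.75):
    """Multiplier converting a human mg/kg/day dosage to a mouse mg/kg/day dosage.

    Assumes total daily dose is proportional to body_weight ** exponent, so the
    per-kilogram dose is proportional to body_weight ** (exponent - 1).
    The default body weights are illustrative assumptions.
    """
    if human_kg <= 0 or mouse_kg <= 0:
        raise ValueError("body weights must be positive")
    return (human_kg / mouse_kg) ** (1.0 - exponent)


def mouse_equivalent_dose(human_mg_per_kg, human_kg=70.0, mouse_kg=0.025,
                          exponent=0.75):
    """Scale a human dosage (mg/kg/day) to an equivalent mouse dosage (mg/kg/day)."""
    if human_mg_per_kg < 0:
        raise ValueError("dosage must be non-negative")
    return human_mg_per_kg * allometric_factor(human_kg, mouse_kg, exponent)
```

With these assumed body weights, the factor is the fourth root of 2,800, or roughly 7.3, which agrees with the approximately sevenfold difference described above. A different assumed mouse or human body weight changes the factor only modestly because of the small exponent.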

Summary: the preferred design for testing potential longevity therapeutics using mouse life span studies

Based on the information reviewed above, we recommend a number of design parameters essential or highly desirable for rodent life span assays: (1) The diets should be fed in measured amounts and consumption monitored. Body weight should be monitored regularly. These measurements and their statistical analysis should be reported. (2) A long-lived, healthy rodent strain should be used, preferably an F1 or further outcrossed strain. (3) Chemically defined diets should be used. They ensure the greatest degree of reproducibility and avoid the confounds introduced by contaminants or compositional variability. (4) Use of a positive control is highly desirable. Without a positive control, negative results are of questionable significance. We use a 40% CR control, which also allows us to calibrate the effects of a treatment (e.g., Fig. 1). (5) Dosages can be chosen using treatment endpoints gleaned from the literature or, where necessary, from human dosages using accepted cross-species scaling factors. Use of these methods will produce a more reliable literature on which to base further studies.