Abstract
This study was conducted to comprehensively survey the available literature on intravenous pharmacokinetic parameters in the rat, dog, monkey, and human, and to compare common methods for extrapolation of clearance, to identify the most appropriate species to use in pharmacokinetic lead optimization, and to ascertain whether adequate prospective measures of predictive success are currently available. One hundred three nonpeptide xenobiotics were identified with intravenous pharmacokinetic data in rat, dog, monkey, and human; both body weight- and hepatic blood flow-based methods were used for scaling of clearance. Allometric scaling approaches, particularly those using data from only two of the preclinical species, were less successful at predicting human clearance than methods based on clearance as a set fraction of liver blood flow from an individual species. Furthermore, commonly used prospective measures of allometric scaling success, including correlation coefficient and allometric exponent, failed to discriminate between successful and failed allometric predictions. In all instances, the monkey tended to provide the most qualitatively and quantitatively accurate predictions of human clearance and also afforded the least biased predictions compared with other species. Additionally, the availability of data from both common nonrodent species (dog and monkey) did not ensure enhanced predictive quality compared with having only monkey data. The observations in this investigation have major implications for pharmacokinetic lead optimization and for prediction of human clearance from in vivo preclinical data and support the continued use of nonhuman primates in preclinical pharmacokinetics.
During lead optimization in drug discovery, species selection for pharmacokinetic investigation is driven by several factors, including the requisite species for preclinical efficacy and safety studies and, to a lesser extent, an understanding of which species are likely to be most predictive of human pharmacokinetics. Rats are used nearly universally for pharmacokinetic studies due in part to their accessibility and the likelihood that the rat will be used in safety assessment. Although the primary nonrodent species varies from effort to effort, the key nonrodent safety assessment species in the pharmaceutical industry is generally either the dog and/or monkey (Dorato and Vodicnik, 1994). Consequently, drug discovery pharmacokineticists often possess preclinical data from rat, dog, and/or monkey for a given clinical candidate, and are faced with the task of using these data, at least in part, to project likely pharmacokinetic behavior in humans.
The reliable prediction of human pharmacokinetic parameters from in vitro or preclinical in vivo data has long been a subject of extensive investigation (Ings, 1990). In particular, the practice of allometric extrapolation from in vivo preclinical data is widespread and well documented, with hundreds of primary and review articles available in the literature. However, a quantitative, or even qualitative, understanding of the relative ability of each of the major preclinical species to provide predictive pharmacokinetic data in humans has heretofore been limited by the lack of a comprehensive survey of primary pharmacokinetic data. Recently, the individual abilities of the rat, dog, and monkey to predict human pharmacokinetics for a limited set of compounds have been evaluated (Chiou et al., 1998, 2000; Chiou and Buehler, 2002). Although these studies have provided some valuable insights, the underlying data sets are relatively small. Additionally, there is a lack of commonality between data sets in these studies, so that different compounds appear in the evaluation of the rat compared with the dog or the monkey. These limitations preclude a thorough interrogation of the pharmacokinetic relationships among species for an adequate number of compounds to support firm conclusions.
Given the magnitude and the significance of the issue of interspecies pharmacokinetic extrapolation, we have conducted an exhaustive literature survey and compiled intravenous pharmacokinetic data from the rat, dog, monkey, and human for 103 nonpeptide xenobiotics. Using this data set, we are conducting a detailed analysis of interspecies pharmacokinetic relationships, and in this article we report our findings regarding clearance. Based on our analysis, we have addressed several key questions often faced in a drug discovery setting, including whether any of the major species appears to be a pharmacokinetic outlier with respect to human clearance predictions, whether data from all three species are needed for reliable clearance extrapolation, and whether accurate prospective tools exist for evaluation of the probability of success of these scaling techniques to predict human clearance.
Materials and Methods
Data Collection. An exhaustive primary literature review was conducted as the basis for the data set reported herein. For the purposes of this study, no restrictions were placed on gender or on the rat or dog strain from which data were gathered (although dog data were available predominantly from beagle dogs). For the monkey, only data from rhesus (Macaca mulatta) and cynomolgus (Macaca fascicularis) monkeys were used, since these tend to be the most commonly used nonhuman primates in the pharmaceutical industry and comprised the largest portion of the monkey data set. Care was taken to ensure that the pharmacokinetic parameters for each compound were derived from the same body fluid (usually plasma) across species. Using these methods, a total of 103 nonpeptide xenobiotics were identified with data of sufficient quality for further consideration. These molecules and the supporting pharmacokinetic references are listed in the Appendix, which is available online as a data supplement to this article. The resulting data set was biased toward low clearance compounds (70%), with the remaining compounds split between moderate and high clearance in humans (16% each; Fig. 1). To verify that this inherent bias in the data set did not unduly affect the conclusions of this work, all the comparisons performed in this study were repeated using a vetted data set containing a randomly selected equal portion of low, moderate, and high clearance compounds. As none of the conclusions were altered using the smaller data set, all data presented here are for the complete 103-compound collection.
Data Analysis. For the 103 xenobiotics evaluated, traditional body weight-based allometric scaling was first conducted from rat, dog, and monkey or from any two of these species together to predict human clearance according to the following equation (Boxenbaum, 1982): Clearance = a · W^b, where W is body weight and a and b represent the allometric coefficient and exponent, respectively. For the three-species approach, the correlation coefficient of this relationship was also calculated. Scaling of clearance from each preclinical species to humans also was conducted using clearance values as a percentage of liver blood flow. Assuming that clearance is primarily hepatic and that the blood to plasma ratio is constant across species, clearance can be expressed in each of the preclinical species as a fraction of liver blood flow, and human clearance projected from each species as follows: human clearance = animal clearance · (human liver blood flow/animal liver blood flow). For this investigation, liver blood flow values of 85, 30, 45, and 21 ml/min/kg were used for rat, dog, monkey, and human, respectively (Davies and Morris, 1993; Brown et al., 1997).
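As an illustration only, the two scaling approaches described above can be sketched in a few lines of Python. The function names are hypothetical and are not taken from the original analysis. The allometric fit is assumed here to be an ordinary least-squares regression of log clearance on log body weight, with clearance in absolute units (for example, ml/min), while the liver blood flow method works on body weight-normalized clearance (ml/min/kg). Only the liver blood flow values come from the text; any body weights supplied by a user are assumptions.

```python
import math

# Liver blood flow values (ml/min/kg) quoted in the Data Analysis section.
LBF = {"rat": 85.0, "dog": 30.0, "monkey": 45.0, "human": 21.0}


def allometric_fit(weights_kg, clearances):
    """Fit Clearance = a * W^b by least squares on log10-transformed data.

    Returns (a, b, r), where r is the correlation coefficient of the
    log-log regression. Requires at least two species with distinct weights.
    """
    xs = [math.log10(w) for w in weights_kg]
    ys = [math.log10(c) for c in clearances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # allometric exponent
    a = 10 ** (mean_y - b * mean_x)    # allometric coefficient
    r = sxy / math.sqrt(sxx * syy) if syy > 0 else float("nan")
    return a, b, r


def predict_allometric(a, b, weight_kg):
    """Predicted clearance for a given body weight from the fitted relationship."""
    return a * weight_kg ** b


def predict_lbf(animal_cl, species):
    """Human clearance (ml/min/kg) assuming the same fraction of liver blood flow.

    animal_cl is the clearance in the preclinical species in ml/min/kg.
    """
    return animal_cl * LBF["human"] / LBF[species]
```

With this sketch, a monkey clearance equal to monkey liver blood flow (45 ml/min/kg) maps to a human clearance equal to human liver blood flow, which is the behavior the fraction-of-blood-flow equation implies.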
Results
Before conducting any mathematical extrapolations from preclinical data to humans, the intrinsic relationship between clearance in each species and in humans was first explored. Figures 1 and 2 display the relationship between clearance (normalized for liver blood flow) in the rat, dog, and monkey versus human. Two approaches were used to compare the preclinical species with humans: by categorizing compounds as low (<30% liver blood flow), moderate (30 to 70% liver blood flow), or high (>70% liver blood flow) clearance in each species (Fig. 1); and by plotting the data in comparison with deviation from unity and evaluating outliers (compounds with clearance more than ±15% of liver blood flow [LBF] from the line of unity; Fig. 2). When considering clearance by category, the rat generally tended to be the least similar to human, with a total of 46 of the 103 compounds (45%) being in a different clearance category in rats compared with humans (32 with a higher clearance category for rat than human, and 14 with a lower clearance category in rat than human). The dog had a total of 45 of the 103 compounds (44%) in a different clearance category compared with humans (33 with a higher clearance category for dog than human, and 12 with a lower clearance category in dog than human). The monkey overall tended to be most similar to human, with only 32 of the 103 compounds (31%) in a different clearance category compared with humans. In addition, the monkey was the least biased of the preclinical species, with 16 compounds having a higher clearance category for monkey than human, and 16 with a lower clearance category in monkey than human. The interspecies differences were similar when considering predictivity based on deviation from unity (Fig. 2), and the rank order of similarity was the same, with 41 compounds falling outside the ±15% LBF lines for rat, 39 for dog, and 25 for monkey.
Of the rat and dog outliers, 71% and 79%, respectively, fell above the ±15% LBF line, whereas only 56% of the monkey outliers were above the ±15% LBF line, again demonstrating the lack of bias in the monkey data. Although not the conventional statistical mechanism for evaluating outliers, the ±15% LBF analysis in Fig. 2 allows for a more intuitive outlier evaluation than a more traditional absolute fold deviation analysis.
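The categorization and outlier rules used in this comparison can be written out as a short sketch. The function names are hypothetical, and the outlier test reflects one reading of the text: a compound is treated as an outlier when its clearance, expressed as a fraction of liver blood flow, differs between the animal and human by more than 15 percentage points. Treating a value of exactly 30% or 70% as moderate is also an assumption.

```python
# Liver blood flow values (ml/min/kg) quoted in the Data Analysis section.
LBF = {"rat": 85.0, "dog": 30.0, "monkey": 45.0, "human": 21.0}


def clearance_category(cl, species):
    """Classify clearance (ml/min/kg) as low, moderate, or high.

    Thresholds follow the text: <30%, 30 to 70%, and >70% of liver blood flow.
    """
    fraction = cl / LBF[species]
    if fraction < 0.30:
        return "low"
    if fraction <= 0.70:
        return "moderate"
    return "high"


def is_outlier(animal_cl, human_cl, species, band=0.15):
    """True when animal and human clearance, each normalized to liver blood
    flow, differ by more than the band (assumed interpretation of the
    15% LBF criterion)."""
    difference = animal_cl / LBF[species] - human_cl / LBF["human"]
    return abs(difference) > band
```

For example, a rat clearance of 42.5 ml/min/kg (50% of rat liver blood flow) compared with a human clearance of 4.2 ml/min/kg (20% of human liver blood flow) would fall in different categories and would also be flagged as an outlier under this reading.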
Quantitative predictivity of human clearance from the various preclinical species to humans was evaluated using several methodologies. Using standard allometric scaling from rat, dog, and monkey, rat and dog only, or rat and monkey only to humans, a predicted human clearance was generated and compared with actual human clearance values. Allometric scaling from dog and monkey only was also considered, but the resulting predictions were very poor and were not considered further (data not shown). Furthermore, an alternative method of predicting clearance, based on predicting that clearance will be the same fraction of hepatic blood flow in humans as in one of the preclinical species, was also used, along with combinations of the various species. The accuracy of each of these methods at predicting human clearance is shown in Table 1; median values are presented to ensure that the relationships are not obscured by extreme outliers. Each of the predictive methods evaluated resulted in broadly similar accuracy; for example, the median absolute difference between predicted and observed clearance for each method ranged from 1.2 to 3.2 ml/min/kg. However, based on each of the criteria evaluated (fold difference from observed value, variance from observed value, or absolute variance from observed value), hepatic blood flow-based scaling from monkey liver blood flow provided the most accurate prediction of human clearance. Finally, the number of times each method provided the most and least accurate clearance predictions was evaluated; these observations are presented in Table 2. Prediction based on monkey or rat hepatic blood flows alone provided the most accurate clearance prediction in the most instances, followed by allometric scaling from all three species or from rat and monkey only; allometric scaling from rat and dog or scaling based on dog hepatic blood flow provided the most accurate clearance prediction in the fewest instances. 
Furthermore, in only one instance was scaling based on monkey liver blood flow the least accurate method for predicting clearance, whereas three-species allometric scaling was never the least predictive method. Obviously, averaging the results from the liver blood flow analysis of any two species produced neither the most nor least accurate predictions in any instance. Interestingly, allometric scaling from rat and monkey was the least predictive method in the most instances, followed by allometric scaling from rat and dog, rat hepatic blood flow, and dog hepatic blood flow.
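A minimal sketch of how median accuracy measures of this kind might be computed is given below. The exact definitions used for Table 1 are not given in the text, so the three measures here (predicted/observed ratio, signed difference, and absolute difference) are assumptions, as is the function name.

```python
from statistics import median


def accuracy_summary(predicted, observed):
    """Median fold difference, signed difference, and absolute difference
    between predicted and observed clearance (same units for both lists)."""
    folds = [p / o for p, o in zip(predicted, observed)]
    differences = [p - o for p, o in zip(predicted, observed)]
    absolute = [abs(d) for d in differences]
    return {
        "median_fold": median(folds),
        "median_difference": median(differences),
        "median_abs_difference": median(absolute),
    }
```

Medians are used, as in the text, so that a few extreme outliers do not dominate the summary.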
As another way to evaluate these data, extrapolation was next considered on a qualitative, rather than a strictly quantitative, basis. The predicted clearance from each species and the actual human clearance were classified as low, moderate, or high on the basis of the categories described above. Table 2 displays the qualitative accuracy of each of the scaling methods at predicting human clearance. All the methods evaluated correctly classified clearance for 62 to 72% of compounds, with monkey hepatic blood flow or the average of dog and monkey hepatic blood flow being the most qualitatively predictive, followed by the average of rat and monkey hepatic blood flow. For those instances where human clearance was incorrectly classified, the preclinical species generally tended to overpredict clearance (Table 2), with monkey hepatic blood flow overpredicting clearance in 59% of the incorrectly classified compounds, and three-species allometric scaling overpredicting clearance in 57% of the incorrectly classified compounds. Although the monkey liver blood flow and the combination of dog and monkey liver blood flow correctly classified the same proportion of compounds (72%), the results from monkey liver blood flow alone were less biased (59% versus 79% overprediction rate).
To further clarify the relationships between various preclinical species, as opposed to comparing the various scaling methodologies, hepatic blood flow-based scaling was used to estimate human clearance from each preclinical species, and then the predicted clearance categories were compared (Fig. 3). When comparing rat versus dog, the two species qualitatively classified human clearance the same way (although not necessarily correctly) 71 times; of the compounds where the prediction differed between rat and dog, each was correct approximately half the time. Comparing rat versus monkey, a similar clearance classification was generated 66 times; of the compounds where the prediction differed between these two species, the monkey was correct in 24 of the 37 instances (65%). Finally, when comparing dog versus monkey, the two species produced the same clearance prediction 79 times, and of the compounds where the prediction differed, monkey was correct for 17 of the 24 compounds (∼70%). These observations demonstrate that when a species disagreement is encountered between rat and dog, either species is equally likely to be correct; however, when the rat and monkey or dog and monkey data disagree, the monkey is more likely to generate the correct human clearance prediction.
Often in drug discovery, data are available in a rodent and a nonrodent species, and the question faced is how much additional predictive value would be added via pharmacokinetic optimization in a second nonrodent species, or conversely, how much risk is incurred by not generating pharmacokinetic data in two nonrodent species. To address this question, the present data set was analyzed assuming that data were available in the rat and either the dog or the monkey, with an emphasis on whether the additional nonrodent species would have improved the human clearance prediction (Fig. 4A). An improvement in quality of the prediction was defined as a third species correctly predicting human clearance when the other two species data were either incorrect or disagreed; likewise, a worsened prediction was defined as a third species incorrectly predicting human clearance when the other two species data were either both correct or disagreed. With rat and dog data available, generating monkey data would have resulted in no change in predictive quality 58 times; of the remaining 45 instances, generating monkey data would have improved the quality of the prediction 30 times (67%) and decreased the quality of the prediction in 15 instances. Also, as shown in Fig. 4B, the 15 instances where monkey data would have worsened the prediction were divided approximately evenly between overpredicting and underpredicting human clearance, further demonstrating the lack of bias incurred when using the monkey. On the other hand, with rat and monkey data available, generating dog data would have resulted in no change in predictive quality 66 times; of the remaining 37 instances, generating dog data would have improved the prediction 16 times and worsened the prediction in 21 instances. Furthermore, of the 21 instances where dog data would have worsened the prediction, in most instances (81%), use of dog data would have resulted in an overprediction of human clearance.
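The definitions of an improved or worsened prediction given above can be expressed as a small decision function. This is a literal reading of those definitions, offered as a sketch rather than the procedure actually used; the function name and the string labels are hypothetical. Each argument is a clearance category ("low", "moderate", or "high").

```python
def third_species_effect(first, second, third, observed):
    """Classify the effect of adding a third species' predicted category.

    first, second: categories predicted from the two species already in hand.
    third: category predicted from the additional species.
    observed: actual human clearance category.
    """
    others_disagree = first != second
    both_correct = first == observed and second == observed
    both_incorrect = first != observed and second != observed
    third_correct = third == observed

    # Improved: third species is right while the other two were wrong or disagreed.
    if third_correct and (both_incorrect or others_disagree):
        return "improved"
    # Worsened: third species is wrong while the other two were both right or disagreed.
    if not third_correct and (both_correct or others_disagree):
        return "worsened"
    return "no change"
```

Under this reading, a third species that agrees with two species that were already correct, or that is wrong along with two species that agreed on a wrong category, counts as no change.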
Finally, consideration was given to the ability to prospectively evaluate the predictive quality of the various extrapolation techniques evaluated. As shown in Fig. 5, for compounds predicted to demonstrate low clearance in humans, reasonable predictive accuracy was achieved for each of the methods evaluated, and approximately 80 to 90% of the compounds that were predicted to be low clearance in humans experimentally demonstrated low clearance. Lower accuracy was observed for those compounds predicted to have high human clearance, where only 30 to 50% of the compounds predicted to have high human clearance experimentally demonstrated high clearance. Finally, the lowest qualitative predictive accuracy was observed for compounds predicted to have moderate human clearance, where only 15 to 30% of these compounds experimentally demonstrated moderate human clearance; 47 to 85% of the compounds predicted to be moderate clearance experimentally demonstrated low clearance in humans. For compounds predicted to have moderate and high clearance, the monkey hepatic blood flow scaling method was the most predictive of all the methods evaluated. Additionally, for the three-species body weight-based allometric scaling, both the correlation coefficient and the allometric exponent for each compound used in this exercise were examined with respect to qualitative predictive accuracy. As shown in Fig. 6A, the correlation coefficient as a measure of goodness of fit of the allometric relationship did not correspond to predictive accuracy. Furthermore, as shown in Fig. 6B, although the median allometric exponent differed between compounds for which qualitative predictive success was achieved (0.63) versus those for which allometric extrapolation failed (0.80), there was substantial overlap between the two groups, demonstrating that the allometric exponent cannot be reliably used as a prospective marker of predictive success.
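The prospective accuracy figures quoted above are, in effect, the fraction of compounds in each predicted category that turned out to be in that category in humans. A minimal sketch of that calculation follows; the function name is hypothetical and the inputs are assumed to be parallel lists of category labels.

```python
from collections import Counter


def accuracy_by_predicted_category(predicted, observed):
    """For each predicted category, return the fraction of compounds whose
    observed human category matched the prediction."""
    totals = Counter(predicted)
    hits = Counter(p for p, o in zip(predicted, observed) if p == o)
    # Counter returns 0 for categories with no correct predictions.
    return {category: hits[category] / totals[category] for category in totals}
```

Applied to a full data set, the output would correspond to values such as the 80 to 90% reported for compounds predicted to be low clearance.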
Discussion
Within the pharmaceutical industry, substantial resources are expended to select drug candidates with favorable in vivo pharmacokinetics in various preclinical species to enable future preclinical pharmacology or safety studies and to improve the likelihood of clinical success. However, although local opinions and rules have been developed using relatively small proprietary data sets, a broader understanding of the relative predictivity of various preclinical species for human pharmacokinetics has been elusive. Through an exhaustive literature survey, we have identified 103 xenobiotics with reliable intravenous pharmacokinetic data, and we are conducting a detailed analysis of these data to probe interspecies pharmacokinetic extrapolation. The present investigation leads to some conclusions regarding extrapolation of clearance from various preclinical species to humans, several of which conflict with dogma and current practices in lead optimization and interspecies extrapolation.
One key observation from this exercise is that the monkey is an important species for extrapolating clearance. From a naïve perspective, this observation is unsurprising, given the phylogenetic proximity of humans to monkeys versus other preclinical species. However, the use of monkeys in conducting pharmacokinetic studies has been questioned. From a resource and ethical perspective, minimizing the use of nonhuman primates is an important overall goal (Animal Procedures Committee, 2002). Additionally, there have been reports of specific compounds for which monkey pharmacokinetics were not representative of human disposition (Krause and Kühne, 1993; Tanaka et al., 1994; Chiu et al., 1995; Kuroda et al., 2000), and Chiou and Buehler (2002) have recently proposed on the basis of a ∼30-compound data set that monkeys may systematically overpredict human clearance. However, from the present investigation using a 103-compound data set, it is clear that extrapolation methods using monkey pharmacokinetic data provide the most qualitatively and quantitatively accurate estimate of human clearance. This observation is consistent with previous reports using more limited data sets also indicating that extrapolation from monkey may provide the most accurate prediction of human clearance (Campbell, 1994) or half-life (Obach et al., 1997). Consequently, although the use of nonhuman primates must be carefully managed to minimize resource and animal use, these data suggest that monkey pharmacokinetic data are important for accurate prediction of human clearance.
Another key observation from the present investigation is that generating pharmacokinetic data in multiple preclinical species does not always result in improved extrapolation. For the 103 xenobiotics evaluated in this study, obtaining pharmacokinetic data in the dog when data from the rat and monkey are already available would not have changed the prediction in 66 instances and would have worsened predictivity in 21 instances, with the majority of these resulting in an overprediction of clearance. These observations are in agreement with previous observations in a more limited data set (Chiou et al., 2000), suggesting that the dog is not a good predictor of human absorption. Consequently, a strategy could be envisioned in which dog data are generated only if needed to ensure adequate exposure in preclinical safety or efficacy studies, with the monkey being the primary nonrodent pharmacokinetic lead optimization species. However, it should be noted that when data from only rat and monkey are available for extrapolation, body weight-based allometric scaling would not be recommended; rather, the most accurate approach would appear to be extrapolation based on scaling from monkey hepatic blood flow.
In addition to implications for scaling to humans, the present data also allow an evaluation of how pharmacokinetic parameters in the different preclinical species relate to one another. As mentioned above, the rat is a common pharmacokinetic lead optimization species due to its accessibility and the need for drug candidates to have adequate rodent pharmacokinetic properties for preclinical safety and efficacy studies and due to the general perception that rat is often reasonably predictive of pharmacokinetics in higher species (Chiou et al., 1998). Indeed, although rat is not the best species for predicting human clearance, rat clearance is in the same qualitative category as monkey clearance in 66 of the 103 instances compiled here. These data suggest that the use of the rat as an early pharmacokinetic screen, coupled with subsequent evaluation in the monkey for key compounds of interest, will likely be a useful lead optimization strategy. Also of interest is the comparison of dog and monkey clearance shown in Fig. 3. Although these two nonrodent species would similarly classify clearance the majority of the time, when the two species data disagree, the monkey provided the correct human clearance classification in 17 of 24 instances, further supporting the continued use of the monkey in pharmacokinetic research.
Finally, the present data highlight interesting findings regarding traditional three-species body weight-based allometric scaling. In conducting such extrapolation, one frequently used measure of predictive quality or extrapolation success is the degree of interspecies agreement (i.e., the r^2 value of the allometric linear regression); the correlation coefficient has also been used as a justification to exclude certain species from an extrapolation set (Chung et al., 1985; Brazzell et al., 1990; Efthymiopoulos et al., 1991; Cherkofsky, 1995; Feng et al., 1998; Kim et al., 1998; Sukbuntherng et al., 2001). Although previous investigations have demonstrated this idea to be incorrect for specific compounds (Ward et al., 2002), the present study demonstrates that an improved mathematical correlation coefficient does not signify an improved ability of allometric scaling to correctly predict human clearance on a more global basis. Additionally, the allometric exponents calculated from the three-species scaling of the present data contrast with the historical understanding of allometric exponents. It has long been considered that for clearance, an allometric exponent of ∼0.75 indicates a favorable relationship (Ings, 1990), with decreased confidence in predictive success as the exponent deviates from this value. Also, previous extrapolation exercises based on smaller data sets have suggested that for compounds with an allometric exponent <0.55, human clearance cannot be predicted accurately using three-species allometric scaling, with or without the use of correction factors (Mahmood and Balian, 1996). In contrast, the present data demonstrate that allometric exponents <0.55 are not necessarily related to predictive failure, nor do allometric exponents >0.55 provide improved predictivity.
Although these data provide a reasonably comprehensive view of scaling of clearance from preclinical species to humans, several outstanding questions remain to be addressed. For example, it has long been proposed that application of correction factors to allometry (maximum lifespan potential, brain weight, etc.) may improve predictivity (Boxenbaum, 1984; Mahmood, 1999). However, to date, it has been difficult to establish a priori when such factors are needed (Bonate and Howard, 2000); we are presently investigating the influence of a plethora of such nonmechanistic correction factors on allometric predictive quality with this data set. Additionally, numerous investigators have established that for some xenobiotics, extrapolation is improved by incorporation of various in vitro data or physiologic correction factors, such as intrinsic clearance, plasma protein binding, glomerular filtration rate, bile flow rate, or intrinsic metabolic activity, along with in vivo pharmacokinetics (Obach et al., 1997; Ward et al., 1999). A thorough investigation of how and whether such mechanistic correction factors influence extrapolation for the present large interspecies data set would be a useful adjunct to the present observations. However, at present, it is not possible to apply such in vitro corrections to the in vivo data in hand, due to the lack of a comprehensive and internally consistent in vitro data set. Our laboratory is currently generating in vitro plasma protein binding and intrinsic clearance data for the compounds described here and will report separately on their influence on the overall scaling of these data. Application of these correction factors will additionally necessitate a consideration of the major route of elimination of each of these molecules (Mahmood, 1998; Mahmood and Sahajwalla, 2002). 
Finally, the present analysis does not consider the degree to which physicochemical properties or structural features of a given molecule may contribute to whether its clearance can be successfully extrapolated to humans from the preclinical data, or the potential use of computational approaches as replacements for or adjuncts to the preclinical data in extrapolation of clearance. We are currently interrogating these important questions using this data set, and we will describe the results in a future publication.
In summary, the results from an exhaustive literature compilation of preclinical and clinical intravenous pharmacokinetic parameters suggest that allometric scaling approaches, particularly those using data from only two of the preclinical species, were less successful at predicting human clearance than methods based on clearance as a set fraction of liver blood flow from an individual species. Furthermore, commonly used prospective measures of allometric scaling success, including correlation coefficient and allometric exponent, failed to discriminate between successful and failed allometric predictions. In all instances, the monkey tended to provide the most qualitatively and quantitatively accurate predictions of human clearance and also afforded the least biased predictions compared with other species. Additionally, the availability of data from both common nonrodent species (dog and monkey) did not ensure enhanced predictive quality compared with having only monkey data. From this extrapolation exercise, we would recommend that liver blood flow-based extrapolation from monkey be used for the most accurate and reliable estimation of human clearance. These observations have major implications for pharmacokinetic lead optimization and for prediction of human pharmacokinetic parameters from in vivo preclinical data, and they support the continued use of nonhuman primates in preclinical pharmacokinetics.
Acknowledgments
We thank Kelly Frank, Lauren Hardy, Melanie Nord, and Dr. Kelly Byrnes-Blake for their assistance in gathering the preclinical pharmacokinetic data for this reference set of compounds, and Dr. Anthony J. Pateman for helpful discussions regarding the use of the hepatic blood flow extrapolation technique.
Footnotes
- These data were presented in part at the Preclinical Development Forum, 23-25 February 2004, Boston, MA.
- Received October 20, 2003.
- Accepted February 23, 2004.
- The American Society for Pharmacology and Experimental Therapeutics