In Silico Predictions of Blood-Brain Barrier Penetration: Considerations to “Keep in Mind”
- ADMETRx, Inc., Kalamazoo, Michigan (J.T.G.); and Argenta Discovery Ltd., Harlow, Essex, United Kingdom (D.E.C.)
- Address correspondence to:
Dr. Jay T. Goodwin, ADMETRx, Inc., 4717 Campus Drive, Suite 600, Kalamazoo, MI 49008. E-mail: jtgoodwin{at}admetrx.com
Abstract
Within drug discovery, it is desirable to determine whether a compound will penetrate and distribute within the central nervous system (CNS) with the requisite pharmacokinetic and pharmacodynamic performance required for a CNS target or if it will be excluded from the CNS, wherein potential toxicities would limit its applicability. A variety of in vivo and in vitro methods for assessing CNS penetration have therefore been developed and applied to advancing drug candidates with the desired properties. In silico methods to predict CNS penetration from chemical structures have been developed to address virtual screening and prospective design. In silico predictive methods are impacted by the quality, quantity, sources, and generation of the measured data available for model development. Key considerations for predictions of CNS penetration include the comparison of local (in chemistry space) versus global (more structurally diverse) models and where in the drug discovery process such models may be best deployed. Preference should also be given to in vitro and in vivo measurements of greater mechanistic clarity that better support the development of structure-property relationships. Although there are numerous statistical methods that have been brought to bear on the prediction of CNS penetration, a greater concern is that such models are appropriate for the quality of measured data available and are statistically validated. In addition, the assessment of prediction uncertainty and relevance of predictive models to structures of interest are critical. This article will address these key considerations for the development and application of in silico methods in drug discovery.
The main interfaces between the central nervous system (CNS) and the peripheral circulation are the blood-brain barrier (BBB) and the blood-cerebrospinal fluid barrier. The surface area of the former (approximately 20 m²) is several thousand times larger than that of the latter; thus, the BBB represents the most important barrier between the CNS and the systemic circulation (Pardridge, 2002; Graff and Pollack, 2004). There are at least two aspects of the BBB that make it a formidable barrier to candidate CNS drugs. First, in terms of its morphology, the endothelial cells of which the BBB is composed are connected by complex tight junctions, which severely limit any paracellular transport. Furthermore, a minimal level of pinocytosis and lack of significant membrane fenestrae afford an additional hindrance to the transport of hydrophilic molecules. Second, the BBB includes an array of metabolic enzyme systems and efflux transporters that constitutes a biochemical barrier to the majority of xenobiotics (de Boer et al., 2003). This combination of physical and biochemical barriers establishes the BBB endothelium as quite distinct from other endothelia, and it has been estimated to prevent the brain uptake of more than 98% of potential neurotherapeutics (Pardridge, 2002).
Determinants of BBB Penetration
The generally accepted biophysical/physicochemical models of BBB permeability have as their primary determinants for passive transport the solute's lipophilicity, hydrogen-bond desolvation potential, pKa/charge, and molecular size (Levin, 1980; Young et al., 1988; Jezequel, 1992; Chikhale et al., 1994; Atkinson et al., 2002; Abraham, 2004). Such models must, however, also acknowledge the potential for active mechanisms, which generally depend upon some specificity of molecular recognition and solute concentration at the transporter. BBB permeability is therefore impacted by other factors that determine solute concentration at the brain capillary surface, including plasma protein binding, blood flow through and partitioning into capillary membranes, and distribution into brain parenchyma (Kalvass and Maurer, 2002). These and other pharmacokinetic parameters such as absorption, first-pass metabolism, distribution into other tissues, and elimination pathways are in turn complex functions of physicochemical, biochemical, and physiological determinants, not all of which are directly related to the chemical structure of the drug (Burton et al., 2002). The question then becomes whether, and how well, these determinants are captured by current in silico methods and which determinants can be related directly to solute chemical structure. Qualitative, expert-based rules such as those proposed by Lipinski [Raub, 2004 (http://www.aapspharmaceutica.com/meetings/files/36/Raub.revised.092804.PPT); Raub et al., 2005] or simple molecular polar surface metrics are considered grossly reflective of the major determinants of passive cellular membrane permeability (Clark, 1999). 
These simplified rule-based approaches have utility because they approximate our understanding of the underlying mechanisms that contribute to CNS penetration; however, they are unlikely to accurately reflect the complexity of the interactions and combination of these physicochemical and biochemical determinants. Now that predictions of CNS penetration are playing a greater role in discovery decision making, an examination of the primary endpoints that are measured and modeled is warranted.
How Is BBB Permeation Measured?
Log BB, the logarithm of the blood (plasma)-brain partitioning (BB), is a measure of the partitioning between blood (plasma) and brain tissue, with BB quantified by the ratio of the solute concentrations in brain and plasma. Log BB is generally obtained by methods such as the brain uptake index (Ohlendorf, 1981), which measures first-pass extraction from a single intravenous injection to yield log BB. Log BB can also be determined from the bolus carotid intravenous method (Ohno et al., 1978) with area under the curve plasma quantitation and single-point brain concentration determination. In this case, a permeability-surface area (PS) coefficient is determined, and log BB can be calculated from log PS with certain assumptions regarding the endothelial surface area. It is an apparent value because the perfusate contains serum protein and other blood components to which the compound of interest may bind. Note that BBB permeability is implicitly captured in this measure and that it does not distinguish free and plasma-bound solute (Pardridge, 2004). It also does not address intracerebral distribution; that is, it does not differentiate between intracellular and extracellular concentrations. As a consequence, log BB is but a crude assessment of the likely concentrations at the targeted site of CNS activity, whether extracellular or cytosolic. Results from log BB measurements also depend upon experimental conditions, particularly dosing regimen (single bolus dose versus multiple doses versus continuous infusion) as well as sampling time from dose. Ideally, brain concentrations are measured at the plasma tmax for the solute; however, differences in brain and systemic clearance can lead to variations in the measured log BB for a given compound depending upon sampling time.
It is appropriate to consider the various contributions to experimental uncertainty when employing such data in model development, particularly when attempting to infer improved performance of a given method or set of computational descriptors relative to others.
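As a minimal sketch of the definition of log BB given above, the following computes it from a pair of concentrations. The function name and the concentration values are hypothetical and chosen only for illustration; both concentrations are assumed to be in the same units.

```python
import math

def log_bb(brain_conc: float, plasma_conc: float) -> float:
    """Return log10 of the brain-to-plasma concentration ratio.

    Both concentrations must be positive and in the same units.
    """
    if brain_conc <= 0 or plasma_conc <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(brain_conc / plasma_conc)

# Hypothetical values for one compound sampled at two times after dosing.
early = log_bb(brain_conc=50.0, plasma_conc=500.0)
late = log_bb(brain_conc=40.0, plasma_conc=100.0)
print(f"early: {early:.2f}, late: {late:.2f}")
```

With these made-up numbers the same compound yields two different log BB values, which is the sampling-time dependence described in the text.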
When considering CNS penetration, perhaps the most appropriate in vivo measure is the log of the permeability-surface area coefficient (log PS) (Martin, 2004; Pardridge, 2004), which represents the permeability of a given solute across the brain capillary endothelium (the anatomical representation of the BBB). This measure reflects the free (unbound) extracellular solute concentrations and is most often performed following the perfusion method established by Takasato et al. (1984). This method eliminates serum binding and provides a direct measure of trans-BBB apparent permeability; however, it is a resource-intensive measure that requires microsurgical expertise and therefore is of relatively low throughput. Solubility of the solute of interest in the perfusate can also be a limiting factor. As with log BB, this measure does not address specific intracerebral solute tissue distribution and represents the potential combination of passive and active transport mechanisms [deconvolution requires additional experiments, perhaps in combination with in vitro methods (as addressed in Current Issues and Future Directions)]. Considering the greater mechanistic clarity of log PS compared with log BB, the former property is likely to be more informative as a measure of CNS penetration for use in lead optimization. Its value will be greatly enhanced in combination with measures of other determinants of CNS penetration, including plasma protein binding and susceptibility to active transport (Mahar-Doan et al., 2002; Raub et al., 2005).
There have been numerous efforts recently to classify chemical structures as CNS+ or CNS-, where these predictions are derived from available data for marketed compounds (for examples, see Ajay et al., 1999). These data represent pharmacological activity, which is the activity of the compound at the receptor/targeted site of action, and are indirect and implicit measures of BBB permeability and intracerebral concentration and distribution. It should be understood that if a given compound does not show CNS pharmacological activity, the compound may yet access the CNS but without the targeted or modeled pharmacological potency: “the absence of evidence is not the evidence of absence.” Such predictions may be best suited for combinatorial library design, in which a general bias for chemistries more likely to manifest CNS potency is desired, to select from among a much larger virtual or existing compound library and to reduce the burden on compound acquisition, library synthesis, or high-throughput screening resources. Another critical consideration is the utility of such classification methods (e.g., CNS+ versus CNS-) in prioritization and selection of potential candidates. What is the error in such classification, particularly at the boundaries separating the classes? What is the associated risk of advancing poor candidates or excluding viable candidates as a result of such classification error? Practitioners should also be cognizant of the risks in assuming that marketed drugs and existing compounds for which such pharmacological data exist represent the breadth and diversity of all chemistries likely to demonstrate favorable CNS activities.
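One simple way to make the boundary question concrete is to refuse to assign a class when the prediction lies within its own error of the cut-off. The sketch below assumes a model that returns a continuous score together with an error estimate; the function name, the threshold of 0.0, and the example numbers are all hypothetical and are not taken from any of the cited models.

```python
def classify_cns(score: float, std_error: float, threshold: float = 0.0) -> str:
    """Label a prediction CNS+, CNS-, or uncertain.

    A class is assigned only if the whole interval score +/- std_error
    lies on one side of the (hypothetical) class threshold.
    """
    if std_error < 0:
        raise ValueError("std_error must be non-negative")
    if score - std_error > threshold:
        return "CNS+"
    if score + std_error < threshold:
        return "CNS-"
    return "uncertain"

# Hypothetical predictions: (score, estimated error).
for score, err in [(0.8, 0.4), (-0.9, 0.4), (0.2, 0.4)]:
    print(score, classify_cns(score, err))
```

Compounds labeled uncertain are the ones for which a classification error is most likely, and thus the ones where a measurement may be worth more than a prediction.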
Why Predict BBB Permeation?
What are the decisions that are made in drug discovery with respect to CNS penetration? Certainly, in the simplest terms, it is desirable to determine whether a compound will penetrate and distribute within the CNS with the requisite pharmacokinetic and pharmacodynamic performance required for a CNS target or if it will be excluded from the CNS, wherein potential toxicities would limit its applicability. As a result, a variety of in vivo and in vitro methods for assessing CNS penetration have been developed and applied to advancing drug candidates with the desired properties. However, such methods are inevitably resource-intensive in terms of skilled personnel, animals, cell culturing, and bioanalytical support, and they are of course retrospective in their application, requiring the existence of synthesized compound. In silico prediction methods address this limitation, supporting the prospective design and selection of candidate structures prior to synthesis.
A Bit of History
The prediction of BBB permeation from computed or experimental parameters has been of interest for at least 30 years. Some of the earliest attempts sought to discover relationships between brain penetration (usually determined as log BB) and partition coefficients measured in various solvent systems. For instance, it was suggested in the early 1970s that the optimum log Poct value for diffusion into the CNS was around 2 (Glave and Hansch, 1972; Levin, 1980). In something of a seminal paper, Young et al. (1988) at Smith Kline & French Laboratories discovered a linear relationship between log BB and Δ log P (= log Poct - log Pcyc) for 20 histamine H2 receptor antagonists. This seems to have been one of the first indications of the crucial role of hydrogen-bonding capacity, as well as lipophilicity,1 in passive diffusion across the BBB. The 20 compounds in Young et al. (1988) are still part of most log BB training sets today, having been added to over the years, notably by Platts et al. (2001). Since the mid-1990s, when interest in predictive ADMET began to burgeon, many groups have developed predictive models using log BB data or surrogate measures such as CNS (in)activity. These have been thoroughly reviewed at regular intervals in recent years (Norinder and Haeberlein, 2002; Clark, 2003; Ecker and Noe, 2004) and so will not be covered here.
The State of the Art
The state of the art in log BB prediction can be illustrated by Winkler and Burden (2004). A Bayesian neural network was used to model the relationship between an 85-compound log BB data set and a set of computed molecular property descriptors, including counts of hydrogen-bond acceptors and donors, hydrophobes, rotatable bonds, log P, molecular weight, and polar surface area (PSA), a popular measure of hydrogen-bonding capacity. The model with the best statistics (r² = 0.81, s = 0.37) was obtained using four nodes in the hidden layer. Applying this model to a 21-compound test set gave rise to a q² value of 0.65 (standard error of prediction = 0.54). The statistics for the training and test sets are typical of what is observed in high-quality log BB models (Clark, 2003). Given that the experimental error in log BB data is estimated to be up to 0.3 log units, the current generation of in silico models appears to be approaching the limits of predictive accuracy that might be expected. Applying automatic relevance determination to the whole data set showed that the most important descriptor in the model was log P, closely followed by the count of rotatable bonds and PSA. The combination of lipophilicity (log P) and hydrogen-bonding (PSA) descriptors is a feature of many modern log BB models (Clark, 2003). The count of hydrogen-bond donors was more important than that of acceptors, mirroring findings by other workers in the field of absorption prediction (Clark, 2005).
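For readers less familiar with the statistics quoted above, the sketch below shows one common way to compute a coefficient of determination and a root-mean-square error from observed and predicted values. Definitions of q² and of the standard error differ between papers (for example, in the reference mean and in degrees-of-freedom corrections), so this is an illustration under stated assumptions rather than the exact formulas used by Winkler and Burden (2004). The data are made up.

```python
import math

def r_squared(observed, predicted, reference_mean=None):
    """Return 1 - SS_res / SS_tot.

    reference_mean defaults to the mean of `observed`; for an external
    test set some authors use the training-set mean instead.
    """
    if reference_mean is None:
        reference_mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - reference_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rms_error(observed, predicted):
    """Return the root-mean-square difference (no degrees-of-freedom correction)."""
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(ss_res / len(observed))

# Made-up log BB values for three compounds.
observed = [-1.0, 0.0, 1.0]
predicted = [-0.8, 0.1, 0.7]
fit_r2 = r_squared(observed, predicted)
fit_rmse = rms_error(observed, predicted)
print(f"r2 = {fit_r2:.2f}, rmse = {fit_rmse:.2f}")
```

Comparing the error obtained this way with the estimated experimental error in the measurements is what supports the argument that a model has reached the accuracy the data allow.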
It would seem then that, given the current data sets available, the prediction of log BB is about as good as can be expected. There is, however, little room for complacency, especially given some recent high-profile criticism leveled at log BB as an index of brain penetration (Martin, 2004; Pardridge, 2004). In particular, the fact that log BB reflects the total drug concentration in the brain rather than the free drug concentration, which crucially determines the receptor saturation by the drug, has been highlighted (Atkinson et al., 2002). Pardridge (2004) contends that the best index of BBB permeability is the BBB PS product, because this can predict the level of free drug in the brain.
It is thus timely that two papers on log PS prediction have been published recently. In the first of these, Abraham (2004) collated a set of log PS values for 30 neutral compounds and correlated these data (r² = 0.87, s = 0.52) with his five solute descriptors previously used for log BB modeling. It is interesting to compare two equations for log BB (eq. 6 in Platts et al., 2001) and log PS (eq. 3 in Abraham, 2004):
[Equations not reproduced here; see eq. 6 in Platts et al. (2001) and eq. 3 in Abraham (2004).]
where E is an excess molar refraction, S is dipolarity/polarizability, A is the hydrogen-bond acidity of the compound, B is the hydrogen-bond basicity of the compound, and V is its characteristic (McGowan) volume. With the exception of the additional indicator variable (I) for carboxylic acids, the two equations are reassuringly similar in terms of the descriptors that they comprise and the signs of the coefficients. This indicates that the factors that determine log BB are also those that determine log PS, suggesting that the knowledge about the molecular determinants of brain permeation gained over recent years from log BB modeling is still useful, at least in a qualitative sense.
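As an aid to reading the comparison, both equations can be thought of as instances of one general linear form in the descriptors defined above. The expression below is only a schematic: the lowercase coefficients c, e, s, a, b, v, and i are placeholder symbols for fitted constants whose values are not given here, and the indicator term iI is taken to be absent (i = 0) in whichever equation does not use it.

```latex
\log SP = c + eE + sS + aA + bB + vV + iI,
\qquad SP \in \{\mathrm{BB},\ \mathrm{PS}\}
```

Under this reading, "similar signs of the coefficients" means that each descriptor pushes log BB and log PS in the same direction, even if the magnitudes differ.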
The second paper (Liu et al., 2004) describes the measurement of log PS values for 28 drug-like compounds. Of these, 23 compounds were assumed to cross the BBB by passive diffusion and were selected for modeling. Using stepwise multivariate linear regression and a variety of computed descriptors, the following equation (eq. 3 in Liu et al., 2004) was derived:
[Equation not reproduced here; see eq. 3 in Liu et al. (2004).]
where log D (at pH 7.4) was calculated by ACD/LogD software (Advanced Chemistry Development, Inc., Toronto, ON, Canada), vsa_base is the van der Waals' surface area due to basic atoms, and TPSA is the topological polar surface area, the latter two descriptors being computed by MOE (Chemical Computing Group, Montreal, QC, Canada). Once again, the combination of polar surface area and lipophilicity descriptors previously observed in many log BB models is noteworthy, and reassuring. This model was found to predict two small literature log PS data sets quite well.
Some Success Stories
Although it is hard to find instances in the medicinal chemistry literature in which BBB permeability calculations have been used successfully, some examples are beginning to emerge in which it is clear that, at least in a qualitative sense, predictive modeling has had influence, either directly or indirectly. For example, scientists at AstraZeneca, citing a log BB modeling paper (Clark, 1999), have described their efforts to produce a non-CNS penetrating κ-opioid receptor agonist by adding polar substituents to increase the PSA and reduce lipophilicity (Semple et al., 2003). Conversely, scientists at Neurogen have reported the results of lipophilicity and PSA calculations as evidence that a series of corticotrophin-releasing factor-1 receptor antagonists was likely to permeate the BBB (Hodgetts et al., 2003), again citing a review of predictive modeling (Atkinson et al., 2002) as their justification. Finally, the VolSurf BBB permeation model (Cruciani et al., 2000) has been employed recently to predict the BBB permeability of some prodrugs of nonsteroidal anti-inflammatory agents (Perioli et al., 2004). It seems likely that, given the lag time inherent in most medicinal chemistry publications, more such applications will be reported in the future that reflect the uptake of in silico BBB permeation predictions in the last few years.
Current Issues and Future Directions
The increased ease, speed, and throughput of in silico methods do not come without a cost, viz., the increased risk that such predictions may be somewhat or totally inaccurate for the structures of interest. Such inaccuracies would derive from two primary sources: the nature of the chemistry space represented by the compounds used in training these models (does it cover, and how well, the chemistry of interest?) and the mechanistic clarity and relevance represented in the data as they are measured.
A common refrain at the present time in the field of predictive ADMET is the need for better and larger data sets from which to build the next generation of models. The area of BBB permeation is no exception to this. Even after more than a decade of intense interest in predicting BBB permeation, the number of log BB measurements available in the public domain is still probably fewer than 200. If, as has been suggested above, the focus should now shift to log PS, the need for additional data is even more acute, since few log PS data are available. The composition and size of the data sets also need attention. To date, the publicly available data sets comprise a motley assortment of compounds from various sources, with little guarantee of their being derived by a common experimental protocol. If the science of BBB permeation prediction is to advance as we would wish, then what is required is bespoke data generation for the purposes of model building, creating data sets that span appropriate ranges in the biological endpoint, molecular diversity, and physicochemical properties.
Again, considering the possible structural diversity of chemistry space (even subsetting for “small” molecular therapeutics of mol. wt. ≤500) and the potential liabilities associated with “global” models, e.g., sparseness of coverage in chemical-space regions of interest and lack of motivation to generate such a large and comprehensive set of measurements due to the substantial financial burden for any one organization, it may be more appropriate to focus resources for quantitative model development within lead optimization series. In such settings, there will be sufficient amounts of compound generated for measurements (less required for in vitro measures and more required for in vivo confirmatory experiments) and greater structural similarity to support local model development. The importance of local chemistry space-relevant model development should not be underestimated—through working in more homologous chemistry spaces, there is a greater likelihood of reducing complexities due to multiple mechanisms and the resulting need to generate nonlinear statistical/computational models using methods such as neural networks. Such nonlinear models in turn make a structure-based mechanistic interpretation of predicted trends from computational descriptors difficult at best and are thus best avoided during lead optimization, when the key question asked by the medicinal chemist is “Which compound should I make next?”
In vivo methods, although offering a more direct measure of performance and greater acceptance in discovery decision making as a result, often cannot meet the time constraints placed on such decision making due to their low throughput. In addition, in vivo methods are not as readily amenable to the mechanistic deconvolution of the various determinants of CNS penetration, including passive and active transport. Therefore, in vitro methods that can supplant more resource-intensive in vivo methods, such as the use of wild-type and transformed/transfected cell lines [e.g., Madin-Darby canine kidney cells transfected with human multidrug resistance 1 or other transformed cell lines (Terasaki et al., 2003)], have emerged and have found utility in combination with in vivo methods for assessing CNS penetration with greater throughput and increased mechanistic clarity and relevance (Raub et al., 2005). Such mechanistic clarity, along with more rapid turnaround times for the generation of data, is critical for the development and application of predictive models of CNS penetration in discovery decision making.
No method is without its drawbacks, and there are several concerns associated with predicting CNS penetration. These can perhaps best be enumerated by indicating where in the discovery process such methods are currently applied. The earliest stages in drug discovery where knowledge of likely CNS penetration will assist in decision making are in library design and virtual screening, as well as in the hit-to-lead/lead selection stages (Fig. 1). In silico methods offer the promise of rapid, relatively inexpensive (beyond the initial investment in software models and infrastructure) characterization and can be used with the objective of populating libraries either exclusively or with a preponderance of structures anticipated to be either CNS penetrants or nonpenetrants (for library design) or biasing lead selection toward desired CNS penetration (or lack thereof). Existing models such as those of Cruciani et al. (2000) or simpler metric/rule-based approaches (Clark, 1999, 2003; Norinder and Haeberlein, 2002; Raub, 2004) (simple indicator variables and surface area calculations) are readily applied in library design and virtual screening. However, these models are in large part based upon phenomenological data (e.g., pharmacologically active CNS+ versus nonactive CNS- compounds) or experimentally measured sets of pharmacokinetic property endpoints of limited size and structural diversity. The compounds that comprise these data sets are subsets of existing, synthesized compounds, which are in turn vanishingly small (and potentially sparsely sampled) subsets of possible chemistry space. It is appropriate to consider the relevance of the training data structural composition relative to the chemical classes of interest. Such models may be globally applied but perhaps not locally relevant.
Therefore, there is an inherent risk in application of such models—that credible compounds will be excluded from further consideration before there is an opportunity to validate them through in vivo studies (false negatives). As such, these models or mnemonic devices should mirror current biophysical and biochemical understanding of pharmacokinetic processes involved in CNS penetration (e.g., plasma binding, passive and active transport mechanisms, and metabolism) to imbue some sense of confidence in their applicability. The simpler expert-rule/metric-based approaches as referred to above may therefore be most appropriate at this stage. Their application may be best suited in combination with some structural similarity clustering to ascertain variability of the CNS metrics across structural classes. Such a combination of methods will help to diagnose liabilities intrinsic to the core template or due to side-chain modifications.
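The combination of a simple metric with structural clustering can be sketched as follows. The example assumes that each compound has already been assigned a cluster label by some similarity method and has a computed metric such as PSA; the labels, values, and function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

def metric_by_cluster(records):
    """Return {cluster_id: (mean, population std dev)} for a CNS metric.

    `records` is an iterable of (cluster_id, metric_value) pairs.
    """
    groups = defaultdict(list)
    for cluster_id, value in records:
        groups[cluster_id].append(value)
    return {cid: (mean(vals), pstdev(vals)) for cid, vals in groups.items()}

# Hypothetical PSA values for compounds in two structural clusters.
records = [("A", 60.0), ("A", 64.0), ("A", 62.0), ("B", 45.0), ("B", 95.0)]
summary = metric_by_cluster(records)
for cid, (avg, spread) in sorted(summary.items()):
    print(f"cluster {cid}: mean = {avg:.1f}, spread = {spread:.1f}")
```

A small spread within a cluster suggests that the metric is largely set by the shared core template, whereas a large spread points to side-chain modifications as the main source of variation.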
Fig. 1. Application of in silico, in vitro, and in vivo CNS penetration assessment methods to the various stages of drug discovery.
The more appropriate setting for determining CNS penetration is arguably in the lead optimization stage in drug discovery. At this point in the discovery process, there is generally a sufficient investment of synthetic capacity and medicinal chemistry support for compound design and scale-up to support experimental measurements. Furthermore, the data resulting from in vitro and in vivo assessments of CNS penetration may be used for the development of more locally (in the sense of chemistry space) relevant and quantitatively predictive models. In addition, there usually exist sufficient resources for the confirmation of predicted CNS penetration and in turn the application of such data to iterative model development for improved prospective design of additional analogs.
Regardless of the stage in the drug discovery process, it is the responsibility of the user to determine the appropriateness of any predictive model that they may choose to apply (Table 1), whether made available from the scientific literature or developed within their own organizations. Model “appropriateness” will be dictated by the quality (mechanistic clarity and relevance to the process to be modeled, statistical robustness provided by replicate measurements, and range and distribution of measured data) (Stouch et al., 2003), quantity, and whether the chemical structures in the training set represent the chemical space of interest (does the model extrapolate or interpolate predictions for the structures of interest?). What is the statistical confidence surrounding the predicted value, and how good does it need to be for the decisions to be made based upon such predictions? There have been numerous statistical methods developed and applied to the modeling of CNS penetration, and aside from concerns regarding the appropriateness of qualitative predictions (uncertainty increases as predictions approach category boundaries), the utilization of one or more of these methods is basically a matter of user preference.
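A crude first check of whether a model would interpolate or extrapolate for a structure of interest is to compare the descriptor values of that structure with the ranges spanned by the training set. The sketch below is one simple range-based approach, not a method prescribed by the references cited here; the descriptor names and values are hypothetical, and passing the check does not by itself guarantee a reliable prediction.

```python
def descriptor_ranges(training_set):
    """Return {descriptor: (min, max)} over a list of descriptor dicts."""
    names = training_set[0].keys()
    return {
        name: (min(row[name] for row in training_set),
               max(row[name] for row in training_set))
        for name in names
    }

def out_of_range(query, ranges):
    """Return the descriptors of `query` that fall outside the training ranges."""
    return [name for name, (low, high) in ranges.items()
            if not low <= query[name] <= high]

# Hypothetical training-set descriptors and two query compounds.
training = [
    {"logP": 1.0, "PSA": 40.0},
    {"logP": 3.5, "PSA": 90.0},
    {"logP": 2.2, "PSA": 65.0},
]
ranges = descriptor_ranges(training)
inside = out_of_range({"logP": 2.0, "PSA": 70.0}, ranges)
outside = out_of_range({"logP": 5.1, "PSA": 70.0}, ranges)
print(inside, outside)
```

A non-empty result means that the prediction for that compound is an extrapolation in at least one descriptor and should be treated with correspondingly less confidence.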
Table 1. Essential considerations for the development and application of effective and validated predictive models. For several representative references, see Golbraikh and Tropsha (2002), Stouch et al. (2003), and Hawkins (2004). See also http://home.neo.rr.com/catbar/chemo/pit-falls.htm.
Concluding Remarks
The utility of methods for predicting CNS penetration fundamentally depends upon what is to be decided and at what point in the drug discovery process. This will impact the type of data to be generated (in silico, in vitro, or in vivo) and will determine the requirements for the mechanistic relevance and clarity of the measured endpoint and how they can be related to chemical structure. The discovery scientist will need to determine how much uncertainty associated with data (predicted or measured) can be accommodated in the decision process (risk assessment). They should consider the advantages of employing local as opposed to global models and integration with in vitro and in vivo measurements to validate and iteratively refine models and extend their utility. These issues will continue to drive the measurement of data of appropriate statistical and mechanistic clarity for the development and improvement of in silico models. This will in turn reduce uncertainty and improve efficiencies by focusing local models on lead refinement, where greater resources are made available and where the opportunities exist for ongoing validation of in silico and in vitro methods against in vivo measurements.
Footnotes
- 1 Note that lipophilicity, as measured by octanol/water partitioning, is in fact a function of both hydrophobicity (cavity formation in solvent) and hydrogen-bond donor potential, as indicated by solvatochromic analysis (El Tayar et al., 1991). Hydrogen-bond potential, as represented by Δ log P [the difference in octanol/water and heptane (or isooctane)/water partitioning] is primarily a function of both hydrogen-bond acceptor and donor energetics, with minimal contribution from solute volume (or cavity formation).
- Article, publication date, and citation information can be found at http://jpet.aspetjournals.org.
- doi:10.1124/jpet.104.075705.
- ABBREVIATIONS: CNS, central nervous system; BBB, blood-brain barrier; BB, blood (plasma)-brain partitioning; PS, permeability-surface area; ADMET, absorption, distribution, metabolism, excretion, toxicity; PSA, polar surface area.
- Received January 18, 2005.
- Accepted May 24, 2005.
- The American Society for Pharmacology and Experimental Therapeutics




