Original article
Correlation and prediction of a large blood–brain distribution data set—an LFER study

https://doi.org/10.1016/S0223-5234(01)01269-7

Abstract

We report linear free energy relationship (LFER) models of the equilibrium distribution of molecules between blood and brain, expressed as log BB values. The method relates log BB to fundamental molecular properties such as hydrogen-bonding capability, polarity/polarisability and size. Our best model of this form covers 148 compounds, the largest set of log BB data yet used in such a model, and gives R²=0.745 and e.s.d.=0.343 after inclusion of an indicator variable for carboxylic acids. This is rather better accuracy than that of a number of previously reported models based on subsets of our data. The model also reveals the factors that affect log BB: molecular size and dispersion effects increase brain uptake, while polarity/polarisability and hydrogen-bond acidity and basicity decrease it. By splitting the full data set into several randomly selected training and test sets, we conclude that such a model can predict log BB values with an error of less than 0.35 log units. The method is very rapid: log BB can be calculated from structure at a rate of 700 molecules per minute on a Silicon Graphics O2.

Introduction

Brain uptake, or the ability of a molecule to enter brain tissue, has been a subject of great interest to the pharmaceutical industry for over 30 years; for recent reviews see [1], [2]. Although measures such as biological activity [3] and ‘brain uptake index’ [4], [5] have been proposed, the two measures that are most amenable to physicochemical analysis are: (a) brain perfusion; and (b) blood–brain distribution. The former is obtained as a rate constant from experiments over a very short time scale. Because of the variation in experimental procedure, the various sets of available data are not compatible, and so only rather restricted data sets have been analysed [6]. The latter is obtained from much longer time scale experiments, and since results from a number of workers seem self-consistent, it has been possible to build a much larger database. Blood–brain distribution, BB, is the measure we shall analyse; it is defined through Eq. (1):

$$\mathrm{BB} = \frac{\text{conc. in brain}}{\text{conc. in blood}} \tag{1}$$

Young et al. [7] published the first analysis of brain uptake in terms of BB, reporting in vivo values in rats for a large number of H2 receptor histamine antagonists, and modelling these values using water–octanol and water–cyclohexane partition coefficients, log P(octanol) and log P(cyclohexane). Abraham et al. [8] and Chadha et al. [9] added a further 37 BB values found indirectly from solubilities in blood and brain. Many attempts to model this and related data sets have been made, and are summarised below.
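As a small illustration of Eq. (1), the sketch below converts a pair of measured concentrations into log BB. The function name `log_bb` is hypothetical, the logarithm is assumed to be base 10 (the usual convention for log BB), and the example concentrations are invented for illustration rather than taken from the data set.

```python
import math


def log_bb(conc_brain: float, conc_blood: float) -> float:
    """Return log10(BB), where BB = conc_brain / conc_blood as in Eq. (1).

    Both concentrations must be in the same units and strictly positive.
    """
    if conc_brain <= 0 or conc_blood <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_brain / conc_blood)


# Illustrative (invented) values: a solute ten times more concentrated
# in brain than in blood has BB = 10, i.e. log BB = 1.
print(log_bb(10.0, 1.0))
# Equal concentrations give BB = 1, i.e. log BB = 0.
print(log_bb(2.5, 2.5))
```

A positive log BB therefore indicates a solute that favours brain over blood at equilibrium, and a negative value the reverse.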

Lombardo et al. [10] correlated log BB against the free energy of solvation in water, obtaining a reasonable linear fit. Kelder et al. [11] used the polar surface area (PSA) as a descriptor of log BB, with encouraging results, but we note that the resulting model, Eq. (2),

$$\log \mathrm{BB} = 1.33 - 0.322\,\mathrm{PSA} \tag{2}$$

is physically unrealistic, since every compound with zero PSA is predicted to have log BB=1.33 (however, this may not be of great concern to drug designers, since no drug will have zero PSA). Clark [12] also used PSA to model log BB, this time in combination with log P(octanol), to yield a slightly better model. Norinder et al. [13] and Luco [14] both used large numbers of descriptors and partial least squares (PLS) methods to correlate log BB, with generally excellent results. Luco collated a total of 100 BB values, but made no attempt to combine these into one data set, restricting his training model to just 58 compounds. Exactly the same data set of 100 compounds was later analysed by Feher et al. [15]; using 61 compounds as a training set, a statistically very reasonable equation was obtained with only three descriptors:

$$\log \mathrm{BB} = 0.4275 - 0.3873\,n_{\mathrm{a,aq}} + 0.1092\,\log P(\mathrm{octanol}) - 0.0017\,\mathrm{PSA} \tag{3}$$

$$n = 61,\quad R^2 = 0.730,\quad R^2_{\mathrm{CV}} = 0.688,\quad \mathrm{e.s.d.} = 0.424,\quad F = 51$$

Here, and in all that follows, n is the number of points used in the regression, R² is the square of the overall correlation coefficient, R²CV is the cross-validated correlation coefficient, e.s.d. is the standard deviation, and F is Fisher's F-statistic. The descriptor $n_{\mathrm{a,aq}}$ in Eq. (3) represents the number of hydrogen-bond acceptors in an aqueous medium. Although this seems to be rather a simple descriptor, it is not obvious how it is to be determined or estimated. Eq. (3) was fitted to 61 of the 100 compounds listed. Two test sets of compounds were constructed from the remaining 39 compounds; no statistical details were given, but we have calculated that for the two test sets the e.s.d. values were 0.76 (n=14) and 0.80 (n=25).
The most recent quantitative analysis [16] uses the free energy of solvation [13] as the sole descriptor. For a training set of 55 compounds Keserü and Molnár [16] obtained a correlation equation with R²=0.722 and e.s.d.=0.37 log units. A number of test sets were studied; values of e.s.d. ranged from 0.14 (n=5) to 0.37 (n=25). The lower e.s.d. value is clearly an artefact, because the experimental error in log BB must be around 0.30 units, but the e.s.d. value of 0.37 for 25 test compounds is impressive. It was suggested [16] that the calculation of log BB via the free energy of solvation was the fastest method to date, at >10 molecules per minute.
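The test-set comparisons above rest on two simple calculations: evaluating a published equation for each compound, and summarising the deviations between observed and predicted log BB as an e.s.d. The sketch below shows one way this could be done for Eq. (3). The function names are hypothetical, the descriptor values and observed log BB values are invented for illustration, and the e.s.d. formula is an assumption: the text does not state which divisor was used for the test sets, so the root-mean-square deviation (divisor n) is taken as the default.

```python
import math


def feher_log_bb(n_acc_aq: float, log_p_octanol: float, psa: float) -> float:
    """Evaluate Eq. (3) using the coefficients quoted in the text."""
    return 0.4275 - 0.3873 * n_acc_aq + 0.1092 * log_p_octanol - 0.0017 * psa


def esd(observed, predicted, n_fitted_params: int = 0) -> float:
    """Standard deviation of the residuals observed - predicted.

    ASSUMPTION: the divisor is n - n_fitted_params. With the default of 0
    this is the plain RMS deviation, a natural choice for a test set in
    which no parameters were fitted. For a training set one would pass the
    number of fitted coefficients, including the constant.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have equal length")
    dof = len(observed) - n_fitted_params
    if dof <= 0:
        raise ValueError("not enough points for the requested divisor")
    sum_sq = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sum_sq / dof)


# Invented test set: (n_acc_aq, log P(octanol), PSA) and observed log BB.
descriptors = [(1, 2.0, 30.0), (3, 0.5, 80.0), (0, 3.0, 10.0)]
observed = [0.20, -0.90, 0.60]

predicted = [feher_log_bb(*d) for d in descriptors]
print("predicted log BB:", [round(p, 3) for p in predicted])
print("test-set e.s.d.:", round(esd(observed, predicted), 3))
```

An e.s.d. computed this way for a test set is directly comparable between models only if the same divisor is used throughout, which is worth checking when figures from different papers are set side by side.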

Summaries of the various models of blood–brain distribution that used reasonably large data sets are shown in Table I, where details of the original model of Abraham et al. [8] are given for comparison. There are a number of models that give e.s.d. values of about 0.35 log units for a training set, and slightly higher values for the deviation between observed and predicted log BB values of a test set. This appears to be close to the accuracy limit of models constructed with large data sets, and e.s.d.=0.35 is the accuracy for which we aim here. Note that most models exclude a number of compounds from the analysis; invariably these are compounds that have much more negative observed log BB values than calculated. There are several possible reasons for such outliers. For example, if the analytical method is radiochemical detection, any biological degradation will lead to much smaller observed values than calculated. Efflux mechanisms [17], most notably by P-glycoprotein [18], will also result in more negative observed values than calculated.

In addition to the models of blood–brain distribution shown in Table I, Crivori et al. [19] have recently described an analysis of brain uptake of 120 compounds using 3D VolSurf parameters. Their analysis was qualitative only, classifying compounds as +, +/−, or −. It gave a remarkably rigorous model, considering that quite different measures of brain uptake were used; these included the equilibrium blood–brain distribution that we have dealt with, and the kinetic rate of perfusion from saline to brain. In the present work, we have been careful to use only a single measure of uptake, namely blood–brain distribution.

Chemistry

The general equation we use is the same as that originally employed by Abraham et al. [8], and subsequently reviewed [2], [20]. We use a simplified notation, and write the equation as follows:

$$SP = c + e\,E + s\,S + a\,A + b\,B + v\,V \tag{4}$$

Here SP is a set of solute properties in a given system, e.g. a set of log BB values. The independent variables are solute descriptors: E is an excess molar refraction, S is the dipolarity/polarisability, A and B are the hydrogen-bond acidity and basicity, respectively, and V is the

Results

An initial regression of all 157 log BB values in Table II against LFER descriptors showed a reasonably accurate fit, but nine molecules had to be omitted as outliers (see below for details). The remaining set of 148 values yielded the following equation:

$$\log \mathrm{BB} = 0.044 + 0.511\,E - 0.886\,S - 0.724\,A - 0.666\,B + 0.861\,V \tag{5}$$

$$n = 148,\quad R^2 = 0.710,\quad R^2_{\mathrm{CV}} = 0.682,\quad \mathrm{e.s.d.} = 0.367,\quad F = 71$$

Eq. (5) is a reasonable model of log BB, with many similarities to our previous [8] equation and e.s.d. close to those found in several other studies using
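To make the use of Eq. (5) concrete, the following minimal sketch evaluates it for one set of solute descriptors and reports the contribution of each term. The coefficients are those quoted in Eq. (5); the function name `log_bb_eq5` and the descriptor values in the example are hypothetical and do not correspond to any compound in the data set.

```python
# Coefficients of Eq. (5) as quoted in the text.
EQ5_CONSTANT = 0.044
EQ5_COEFFS = {"E": 0.511, "S": -0.886, "A": -0.724, "B": -0.666, "V": 0.861}


def log_bb_eq5(E: float, S: float, A: float, B: float, V: float):
    """Return (log BB, per-term contributions) predicted by Eq. (5)."""
    descriptors = {"E": E, "S": S, "A": A, "B": B, "V": V}
    # Each contribution is coefficient * descriptor, e.g. e.E or v.V.
    terms = {name: EQ5_COEFFS[name] * value for name, value in descriptors.items()}
    total = EQ5_CONSTANT + sum(terms.values())
    return total, terms


# Hypothetical descriptor values, chosen only to illustrate the arithmetic.
total, terms = log_bb_eq5(E=0.5, S=0.5, A=0.0, B=0.5, V=1.0)
for name, contribution in terms.items():
    print(f"{name}: {contribution:+.4f}")
print(f"predicted log BB = {total:.4f}")
```

Listing the terms separately shows at a glance which descriptors raise the predicted log BB (E and V, with positive coefficients) and which lower it (S, A and B, with negative coefficients).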

Discussion

Because the descriptors we use in the general Eq. (4) reflect solute/solvent interactions, we can interpret the coefficients in terms of the effect that particular interactions have on the process under consideration. The positive e- and v-coefficients in Eq. (6) indicate that increasing molecular size and (less importantly) the presence of n- and π-electron pairs tend to push compounds out of blood and into brain. In this respect, Eq. (6) is similar to many water–solvent partition (log P)

Conclusions

We have developed and tested an LFER model for the equilibrium distribution of solutes between blood and brain, log BB. After collating data from several sources to yield a data set of 157 log BB values, a model was constructed using multiple linear regression analysis (MLRA). The best fit resulted from an equation which shows that size strongly enhances brain uptake, and that polarity/polarisability, hydrogen-bond acidity, basicity, and the presence of carboxylic acid groups strongly retard brain penetration. Thus, we are able to

Acknowledgements

J.A.P. and Y.H.Z. are grateful to GlaxoWellcome for post-doctoral fellowships. A.H. and D.B. are members of GlaxoSmithKline's Mechanism and Extrapolation Technologies group.

References (48)

  • W.H. Oldendorf, Brain Res. (1970)
  • M.H. Abraham et al., J. Pharm. Sci. (1994)
  • H.S. Chadha et al., Bioorg. Med. Chem. Lett. (1994)
  • D.E. Clark, J. Pharm. Sci. (1999)
  • U. Norinder et al., J. Pharm. Sci. (1998)
  • M. Feher et al., Int. J. Pharm. (2000)
  • M. Niyagi et al., J. Pharmacobio-Dyn. (1986)
  • M.I. Aly et al., Neurochem. Res. (1980)
  • E.C.M. De Lange et al., Brain Res. (1994)
  • H.G. Bolander et al., Acta Pharmacol. Toxicol. (1984)
  • N. Miyagi et al., J. Pharmacobio-Dyn. (1986)
  • Leo A.J., MedChem 2000 database, BioByte Corp., P.O. 517, Claremont, CA...
  • F. Hervé et al., Clin. Pharmacokinet. (1994)
  • M.H. Abraham et al., J. Chem. Soc. Perkin Trans. 2 (1994)
  • M.H. Abraham et al.
  • C. Hansch et al., J. Med. Chem. (1968)
  • W.H. Oldendorf, Am. J. Physiol. (1971)
  • J.A. Gratton et al., J. Pharm. Pharmacol. (1997)
  • R.C. Young et al., J. Med. Chem. (1988)
  • F. Lombardo et al., J. Med. Chem. (1996)
  • J. Kelder et al., Pharm. Res. (1999)
  • J.M. Luco, J. Chem. Inf. Comput. Sci. (1999)
  • G.M. Keserü et al., J. Chem. Inf. Comput. Sci. (2001)