Transient lower esophageal sphincter relaxation (TLESR) is the major mechanism for gastroesophageal reflux. Characterizations of candidate compounds for reduction of TLESRs are traditionally done through summary exposure and response measures and would benefit from model-based analyses of exposure-TLESR events relationships. Pharmacokinetic (PK)-pharmacodynamic (PD) modeling approaches treating TLESRs either as count data or repeated time-to-event (RTTE) data were developed and compared in terms of their ability to characterize system and drug characteristics. Vehicle data comprising 294 TLESR events were collected from nine dogs. Compound [(R)-(+)-[2,3-dihydro-5-methyl-3-(4-morpholinylmethyl)pyrrolo[1,2,3-de]-1,4-benzoxazin-6-yl]-1-naphthalenylmethanone mesylate (WIN55212-2)] data containing 66 TLESR events, as well as plasma concentrations, were obtained from four dogs. Each experiment lasted for 45 min and was initiated with a meal. Counts in equispaced 5- and 1-min intervals were modeled based on a Poisson probability distribution model. TLESR events were analyzed with the RTTE model. The PK was connected to the PD with a one-compartment model. Vehicle data were described by a baseline and a surge function; the surge peak was determined to be approximately 9.69 min by all approaches, and its width in time at half-maximal intensity was 5 min (1-min count and RTTE) or 10 min (5-min count). TLESR inhibition by WIN55212-2 was described by an Imax model, with an IC50 of on average 2.39 nmol · l−1. Modeling approaches using count or RTTE data linked to a dynamic PK-PD representation of exposure are superior to using summary PK and PD measures and are associated with a higher power for detecting a statistically significant drug effect.
Transient lower esophageal sphincter relaxation (TLESR) is the predominant mechanism behind gastroesophageal reflux. Gastroesophageal reflux disease affects more than 10% of the Western population, and even if effective treatments are available, some patients still fail to achieve adequate symptom relief (Fass et al., 2005). TLESR is characterized by a rapid prolonged decrease in lower esophageal sphincter (LES) pressure in the absence of swallowing (Dodds et al., 1982; Mittal and McCallum, 1988).
TLESRs are usually recorded as events with a corresponding time of occurrence, which offers several possibilities regarding the modeling approach. The method currently used to quantify the relationship between exposure and number of TLESRs is a statistical test based on the counted number of events in a designated time span (the entire observation time or the last fraction). This method therefore disregards part of the data, but more importantly ignores the time course of the events and the dynamics of the drug effect and may consequently result in poor power and predictive performance.
Analyzing counts of the events occurring during predefined time intervals is an alternative method more commonly used for the analysis of epileptic seizures (Frame et al., 2003; Miller et al., 2003; Trocóniz et al., 2009). Count models often contain a low number of parameters and are easily used for simulations, but they may involve simplification of the data, with potential impact depending on the length of the time interval. On the other hand, treating these observations as repeated time-to-event (RTTE) data constitutes a more detailed method that has previously been introduced in biomedical analyses (Cox et al., 1999). Because time-to-event (TTE) analysis considers the time from entry into a study/start of drug administration until a subject presents a particular outcome, an RTTE model applies when events can occur more than once. The application of this method in the pharmacometric field is increasing, although it is still not frequently used, partly because it requires more complex models and data structures.
Therefore, this study aims at exploring the differences between the count and the RTTE methods through the analysis of TLESRs both in terms of their ability to estimate parameters of interest and simulate similar studies. It further aims to contrast both of these approaches to a traditional analysis with respect to power to detect a drug effect.
Materials and Methods
Nine adult Labrador retrievers, both male and female, were used in this crossover study. Cervical esophagostomies were made, and after recovery from surgery, the dogs were allowed to rest in a Pavlov stand. The experiments were approved by the Ethical Committee for Animal Experiments of the Gothenburg region of Sweden.
The method used for the study has been described previously (Lehmann et al., 1999). The dogs were intubated with a water-perfused Dentsleeve (Mississauga, Canada) multilumen assembly for measurement of LES, esophageal, and gastric pressures. An antimony pH electrode was placed 3 cm above the LES for measurement of acid reflux episodes, and a water-perfused catheter was placed in the hypopharynx to measure swallows.
TLESRs were stimulated by gastric infusion of an acidified liquid nutrient (30 ml/kg; 100 ml/min) followed by air insufflation (500 ml/min) to maintain a gastric pressure of 10 ± 1 mm Hg during the experiment. TLESRs were defined as a rapid decrease in LES pressure (>1 mm Hg/s) to a pressure <2 mm Hg above gastric pressure and a duration >1 s, without any pharyngeal signal <2 s before onset.
The cannabinoid receptor agonist (R)-(+)-[2,3-dihydro-5-methyl-3-(4-morpholinylmethyl)pyrrolo[1,2,3-de]-1,4-benzoxazin-6-yl]-1-naphthalenylmethanone mesylate (WIN55212-2) was administered at an intravenous bolus dose of 0.015 mg/kg 10 min before initiating the measurements. WIN55212-2 was purchased from Tocris Bioscience (Bristol, UK) and dissolved in 5% ethanol and 30% polyethylene glycol in sterile water. The control animals received 0.9% sodium chloride intravenously (0.5 ml/kg) 10 min before the start of the experiment.
Vehicle data comprising 294 TLESR events from 32 experiments were collected from nine dogs. Compound (WIN55212-2) data containing 66 TLESR events from 15 experiments, as well as plasma concentrations from eight experiments, were obtained from four dogs. TLESR data were recorded during a 45-min window starting 10 min after the intravenous bolus administration. Blood samples were collected at predetermined time points until 180 min.
Two experiments were conducted on each of the four dogs on separate days. The characterization of the pharmacokinetic (PK) profiles corresponded to the first step of the modeling process. An attempt to distinguish interoccasion variability was not conclusive; therefore, mean values of plasma concentrations at each sampling point were used for each dog. In the same time span with pharmacodynamic (PD) observations, the plasma concentrations followed monoexponential decay; thus, a one-compartment model was built to estimate individual pharmacokinetic parameters, further used in an individual pharmacokinetic parameter approach for the construction of the PK-PD model:
$$C(t) = C_0 \cdot e^{-k_{el} \cdot t}$$
where C(t) is the plasma concentration at time t, C0 is the plasma concentration at time 0, and kel is the elimination rate constant.
Two PK-PD approaches were used to characterize the exposure-response profile based on the TLESR events. The count approach consisted of dividing an experiment into time intervals, counting the number of events in each time interval, and modeling them. The RTTE approach considered the original data by modeling the risk of an event to occur as a hazard process.
Analyzed data consisted of event counts per time interval; therefore, obtained values were non-negative integers entering a probability distribution. The Poisson function is the most commonly used probability distribution to describe count data. It expresses the probability of a certain number of events to happen in a fixed period of time. The use of the Poisson model assumes that the hazard within each time interval is constant. Moreover it implies that events described are independent of each other (Feller, 1968).
The probability mass function [f(n;λ)] of the number of events to be equal to n following the Poisson distribution takes the form:
$$f(n;\lambda) = \frac{\lambda^{n} e^{-\lambda}}{n!}$$
where λ is the expected average number of TLESR events in each time interval, and n is the actually observed number of TLESR events in one specific time interval.
The experiment, with a duration of 45 min and a minimum time unit of 1 min, was divided into time intervals of different lengths, and TLESR events were represented as count numbers on a series of time intervals. Two count datasets were established, one with 1-min time intervals and the other with 5-min time intervals.
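As a minimal sketch of the count approach described above, the following Python snippet bins event times into equispaced intervals and evaluates a Poisson log-likelihood. The event times, the function names, and the constant rate of 0.2 events/min are illustrative assumptions; they are not data or code from the study, which was analyzed in NONMEM.

```python
import math

def bin_events(event_times, interval_min, duration_min=45):
    """Count events falling in each interval [k*w, (k+1)*w) of the experiment."""
    n_bins = math.ceil(duration_min / interval_min)
    counts = [0] * n_bins
    for t in event_times:
        if 0 <= t < duration_min:
            counts[int(t // interval_min)] += 1
    return counts

def poisson_pmf(n, lam):
    """Probability of observing n events when the expected count is lam."""
    return lam ** n * math.exp(-lam) / math.factorial(n)

def poisson_loglik(counts, lams):
    """Sum of log-probabilities over intervals (one expected count per interval)."""
    return sum(math.log(poisson_pmf(n, lam)) for n, lam in zip(counts, lams))

# Hypothetical event times (min) for one experiment; not study data.
events = [3.2, 8.1, 8.9, 9.4, 10.6, 12.0, 21.5, 37.8]
counts_1min = bin_events(events, 1)
counts_5min = bin_events(events, 5)

# Constant expected counts scaled to the interval length (assumed 0.2 events/min).
ll_1min = poisson_loglik(counts_1min, [0.2 * 1] * len(counts_1min))
ll_5min = poisson_loglik(counts_5min, [0.2 * 5] * len(counts_5min))
```

The 5-min dataset keeps the total number of events but loses their position within each interval, which is the simplification the count approach accepts in exchange for a simpler model.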
Repeated Time-to-Event Model.
TTE data can be modeled as a hazard process (Hosmer and Lemeshow, 1999); the response is often referred to as survival time. TLESRs are not unique events, and the experiment lasted for 45 min even though a first event had been observed, which gave us the opportunity to monitor several of them. Thus, the TTE modeling method was expanded to an RTTE approach, which is applied to events that can happen numerous times. As in count models, changes in the hazard with time of experiment or concentration of drug could be modeled. In addition, using the RTTE model, the assumption of a constant hazard within time intervals, necessary in count models, could be evaluated. Violation of such an assumption could occur if, for example, one event triggers (or inhibits) another one. Each TTE, starting after the last event and ending when the next event was occurring, was modeled as a function of the hazard. According to the independence assumption, the system was reset at each event, and the hazard started to accumulate again.
The instantaneous risk of a TLESR event is a hazard. The hazard can be time-varying because of physiological or pharmacological processes. Such time variation can be modeled by using an ordinary differential equation.
The probability density function [f(t)] for an event to happen at time t can be formulated as:
$$f(t_j) = h(t_j) \cdot S(t_j), \qquad S(t_j) = \exp\left(-\int_{t_{j-1}}^{t_j} h(u)\,du\right)$$
where h is the hazard and S is the survival probability, i.e., the exponential of negative cumulated hazard between tj-1 and tj (tj-1 being the time at the start of the experiment or of the last event). After an event has occurred, the hazard of having another event is zero in TTE modeling, whereas it is nonzero in an RTTE model.
The censoring at the end of the experiment (at 45 min) was handled by incorporation of the information that no event happened in the period between the last observed event and the end of the study.
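The RTTE likelihood described above (event density as hazard times survival since the previous event, plus a censoring term for the event-free tail) can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the cumulative hazard is obtained by a simple trapezoidal rule rather than the differential-equation solver used in the study, and the example hazard is a constant chosen only for demonstration.

```python
import math

def cumulative_hazard(hazard, t_start, t_end, n_steps=1000):
    """Trapezoidal approximation of the integral of hazard(t) over [t_start, t_end]."""
    if t_end <= t_start:
        return 0.0
    dt = (t_end - t_start) / n_steps
    total = 0.5 * (hazard(t_start) + hazard(t_end))
    for i in range(1, n_steps):
        total += hazard(t_start + i * dt)
    return total * dt

def rtte_loglik(event_times, hazard, t_end=45.0):
    """Log-likelihood of repeated events observed on [0, t_end]."""
    loglik = 0.0
    t_prev = 0.0
    for t in sorted(event_times):
        # log f(t) = log h(t) - cumulative hazard since the previous event
        loglik += math.log(hazard(t)) - cumulative_hazard(hazard, t_prev, t)
        t_prev = t
    # Right-censoring: survival from the last event to the end of the experiment
    loglik -= cumulative_hazard(hazard, t_prev, t_end)
    return loglik

# Hypothetical event times (min) and an assumed constant hazard of 0.2 per min.
ll = rtte_loglik([5.0, 10.0, 30.0], lambda t: 0.2)
```

For a constant hazard the result reduces to the number of events times log(h) minus h times the observation time, which gives a quick sanity check of the implementation.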
Time Course and Drug Effect.
The observed behavior of the TLESR events over time was, as stated before, described with the mean count parameter (λ) in the count model and the hazard (h) in the RTTE model. Both took an expression composed of a baseline (Base) incremented by a time-course function [g(t)] and affected by a drug effect [E(C(t))] when analyzing the compound data:
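An interindividual variability component exponentially affecting the base parameter was included in the models:
$$\mathrm{Base}_i = \mathrm{Base} \cdot e^{\eta_i}$$
where Base_i is the baseline of individual i and η_i is the corresponding individual random effect.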
The TLESR events were distributed heterogeneously over time, in both the vehicle and the WIN55212-2 experiments, and presented a high occurrence of events between 5 and 15 min. This apparent change in time course was described by a surge function used previously to describe hormone release (Nagaraja et al., 2003; Lönnebo et al., 2007; Jauslin et al., 2011), improved for shape estimation (Karlsson, 2009):
$$g(t) = \frac{SA}{\left|\dfrac{t - PT}{SW}\right|^{\gamma} + 1}$$
where SA is the surge amplitude, SW is half of the surge width in time units at half-maximal intensity, PT is the peak time, and γ is the shape parameter. A low value for the shape parameter results in a peaked surge, and higher values result in the surge approaching a square wave. These four parameters were estimated only as population values.
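The roles of the four surge parameters can be checked numerically. The sketch below assumes the functional form g(t) = SA / (|(t − PT)/SW|^γ + 1), which matches the parameter definitions given above; the function name and the rounded parameter values are illustrative assumptions rather than the study's implementation.

```python
def surge(t, sa, sw, pt, gamma):
    """Surge function: peaks at sa when t == pt and equals sa/2 at t == pt +/- sw."""
    return sa / (abs((t - pt) / sw) ** gamma + 1.0)

# Illustrative values only: amplitude 3.6, half-width 2.5 min, peak at 9.55 min, shape 1.6.
SA, SW, PT, GAMMA = 3.6, 2.5, 9.55, 1.6

peak = surge(PT, SA, SW, PT, GAMMA)             # value at the peak time
half_right = surge(PT + SW, SA, SW, PT, GAMMA)  # half-maximal, one half-width after the peak
half_left = surge(PT - SW, SA, SW, PT, GAMMA)   # half-maximal, one half-width before the peak

# A larger shape parameter flattens the top and steepens the flanks (closer to a square wave).
inside_low = surge(PT + 0.5 * SW, SA, SW, PT, 1.6)
inside_high = surge(PT + 0.5 * SW, SA, SW, PT, 10.0)
```

The absolute value keeps the function defined before the peak when γ is not an even integer.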
Because the WIN55212-2 compound inhibited TLESR events, its drug effect was described with an Imax inhibition model:
$$E(C(t)) = \frac{I_{max} \cdot C(t)}{IC_{50} + C(t)}$$
where Imax, the maximal inhibitory effect, was fixed to 100%, and IC50, the concentration at half of the maximal inhibitory effect, was estimated.
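A short sketch of the Imax relationship may help to read the IC50: at a concentration equal to IC50 the inhibition is half of Imax. The function name and the example concentrations are assumptions made for illustration; the IC50 of 2.39 nmol · l−1 is the average value quoted in this article.

```python
def imax_inhibition(conc, ic50, imax=1.0):
    """Fractional inhibition E(C) = Imax * C / (IC50 + C), with Imax fixed to 1 (100%)."""
    if conc < 0 or ic50 <= 0:
        raise ValueError("concentration must be >= 0 and IC50 must be > 0")
    return imax * conc / (ic50 + conc)

IC50 = 2.39  # nmol/l, average estimate reported in the article

# Hypothetical concentrations (nmol/l) spanning the range below and above IC50.
for conc in (0.0, 1.0, IC50, 10.0, 30.0):
    print(f"C = {conc:5.2f} nmol/l -> inhibition = {imax_inhibition(conc, IC50):.3f}")
```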
The 1-min count, the 5-min count, and the RTTE models were evaluated based on 2000 Monte Carlo simulations from the final models and the original study design. The time course, but also the variance, of the simulated events was evaluated with visual predictive checks (VPC) (Karlsson and Holford, 2008). Simulated datasets from the RTTE model were re-estimated with the final RTTE and count models in a stochastic simulation and estimation study to assess method performance and provide information on differences between models found when applied to the original data. If the estimation method is appropriate, the resulting estimates should show low bias. Likewise, if a simplification of the data structure is acceptable, both bias and imprecision should not be overly inflated compared with the original data structure. Finally, through stochastic simulations and estimations, the statistical power to detect a drug effect was explored by comparing a count approach with a traditional approach. For this purpose multiple data sets (n = 2000) were simulated using the final model and the same design as the original data, but with a very low dose (0.003 mg/kg). For both approaches it was tested whether a model including an estimated drug effect was significantly better than one that assumed that there was no difference between active and placebo treatments. For both methods the power was taken to be the fraction of data sets that showed a significant difference between placebo and active treatment. Thereafter this process was repeated many times; each time the data were simulated by using an incrementally increased dose, until the calculated power was close to 100% for both methods.
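The power calculation described above (simulate many datasets, test each for a drug effect, report the fraction of significant results) can be illustrated with a deliberately simplified sketch. It is not the study's procedure: it assumes a constant Poisson mean per experiment, no interindividual variability, and no PK model, and it uses a likelihood-ratio test between a common-mean and a two-mean model. All names, the number of simulations, and the seed are hypothetical; 6.635 is the chi-square critical value for 1 degree of freedom at the 1% level.

```python
import math
import random

CHI2_1DF_1PCT = 6.635  # critical value of chi-square, 1 df, alpha = 0.01

def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def loglik(counts, lam):
    """Poisson log-likelihood without the log(n!) term, which cancels in the ratio."""
    return sum((n * math.log(lam) if n > 0 else 0.0) - lam for n in counts)

def lrt_significant(placebo, active):
    """Compare a common-mean model with a model allowing separate means."""
    pooled = placebo + active
    ll_null = loglik(pooled, sum(pooled) / len(pooled))
    ll_alt = (loglik(placebo, sum(placebo) / len(placebo))
              + loglik(active, sum(active) / len(active)))
    return 2.0 * (ll_alt - ll_null) > CHI2_1DF_1PCT

def power(mean_placebo, mean_active, n_placebo, n_active, n_sim=500, seed=1):
    """Fraction of simulated datasets in which the drug effect is significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        placebo = [sample_poisson(mean_placebo, rng) for _ in range(n_placebo)]
        active = [sample_poisson(mean_active, rng) for _ in range(n_active)]
        hits += lrt_significant(placebo, active)
    return hits / n_sim
```

For example, `power(9.0, 4.5, 32, 15)` corresponds to assumed means of 0.2 and 0.1 events/min over 45 min with 32 vehicle and 15 active experiments; repeating the call with the active mean moved toward the placebo mean traces out a power curve of the kind described in the text.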
The traditional approach used for this illustration was a linear regression considering the number of counted events over the entire experiment. The drug effect was a constant multiplying the average concentration over the same 45-min time span. The average concentration was obtained as the integral of the individual concentration-time profile over the 45 min divided by 45 min. where E is the drug effect factor and ε is the unexplained variability. Thus, the traditional approach treats the total number of events as a continuous, not a count, variable.
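Under the one-compartment model, the average concentration used as the summary exposure has a closed form. Writing the observation window as [t1, t2] (notation introduced here for illustration, with t2 − t1 = 45 min), a worked version of the integral is:

$$\bar{C} = \frac{1}{t_2 - t_1}\int_{t_1}^{t_2} C_0\, e^{-k_{el} t}\, dt = \frac{C_0 \left(e^{-k_{el} t_1} - e^{-k_{el} t_2}\right)}{k_{el}\,(t_2 - t_1)}$$

This single number replaces the whole concentration-time profile, which is why the summary approach cannot distinguish early high exposure from late low exposure.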
First, individual PK parameters were estimated for each dog on drugs using the averaged concentrations over experiments. Then, PD parameters were estimated in a PK-PD model based on individual PK parameters. Model discrimination and study power were determined through likelihood-ratio tests. Analyses were performed with the Laplace method in NONMEM VI version 2.0 (Beal et al., 1989–2009). Data preprocessing and postprocessing and model diagnostics were completed using R 2.10.1 (R Development Core Team, 2010), Xpose4 4.2.2 (Jonsson and Karlsson, 1999; xpose.sourceforge.net), and PsN 3.2.4 (Lindbom et al., 2005; psn.sourceforge.net).
The individual PK parameters were estimated as follows: C0 was 21.2, 19.0, 14.9, and 32.8 nM for the four dogs participating in the experiment, and kel was 0.0304, 0.0341, 0.0332, and 0.0501 min−1, respectively.
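To make these estimates more tangible, the sketch below evaluates the one-compartment model C(t) = C0 · exp(−kel · t) for each dog and derives the elimination half-life as ln(2)/kel. The parameter values are those reported above; the function names and the chosen time points are illustrative assumptions.

```python
import math

# Individual estimates reported in the text (C0 in nM, kel in 1/min), one entry per dog.
C0 = [21.2, 19.0, 14.9, 32.8]
KEL = [0.0304, 0.0341, 0.0332, 0.0501]

def concentration(t, c0, kel):
    """Plasma concentration at time t (min) after the intravenous bolus."""
    return c0 * math.exp(-kel * t)

def half_life(kel):
    """Elimination half-life in minutes."""
    return math.log(2.0) / kel

for dog, (c0, kel) in enumerate(zip(C0, KEL), start=1):
    profile = [concentration(t, c0, kel) for t in (0, 10, 30, 55)]
    print(f"dog {dog}: t1/2 = {half_life(kel):.1f} min, "
          f"C(0, 10, 30, 55 min) = {[round(c, 1) for c in profile]} nM")
```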
Baseline and Time Course.
The observed average frequency of events was 0.20 min−1 (294 TLESRs/32 experiments/45 min) with vehicle and 0.10 min−1 (66 TLESRs/15 experiments/45 min) in the presence of WIN55212-2. A similar baseline of distributions of events was estimated by the 1-min count model and the RTTE model; the value of the baseline of the mean count λ or of the hazard h was approximately 0.14 min−1 and the exponentially distributed interindividual variability was 29%. The estimated surge parameters were similar between the 1-min count model and the RTTE model. The maximum λ or h (SA) was 3.6 times higher than base, PT was at approximately 9.55 min, the duration of the surge (2 × SW) was approximately 5 min, and the shape parameter, γ, was estimated to a value of 1.6. The latter was significantly better than fixing gamma to the commonly used value of 2. The baseline parameter estimate of the 5-min count model was considerably higher than that of the other models (Table 1).
The PD profile of the compound WIN55212-2 was described as proportional to the baseline. Other models allowing the drug effect to operate, for example, on both the baseline and the surge function did not significantly improve the description of data. The drug effect was quantified almost equally by the three approaches (1-min count model, 5-min count model, and RTTE model; Table 1), with the following typical values: IC50 of 2.43, 2.53, and 2.20 nmol · l−1, respectively. The 5-min count model had the highest imprecision, shown as relative standard error (RSE): 27, 36, and 19%, respectively.
The λ simulated with the 1-min count model and the hazard simulated from the RTTE model presented very similar time-course profiles (Fig. 1). The VPC of the frequency of events enabled the visualization of the distribution of both the observed and simulated events in the experimental time span (Fig. 1). Surges were clearly observed at approximately 10 min; for vehicle data all the median observed proportions of events except two fell within the 2.5th and the 97.5th percentiles of the confidence intervals, whereas for WIN55212-2 data all observations were contained in the 95% confidence intervals.
The data summarized for diagnostic purposes per 5-min time intervals presented up to five events per time interval; therefore, the corresponding VPC (Fig. 2) displayed the proportions for each present frequency. The proportions, while always adding up to one, decreased at approximately 10 min for the low frequencies and increased around the same time for the high ones.
A comparison of observed data with model predictions in the form of Kaplan-Meier plots reveals a good performance of the RTTE model (Fig. 3). Up to the first eight events for each experiment are depicted separately by experiment and treatment arm. These graphs also illustrate the difference in frequency of events between the treatment arms. For example, all dogs on vehicle were experiencing at least five events, whereas some of the dogs on drugs had no more than two events.
Count models make a distribution assumption of counts within intervals that can be explored via the variance and the mean of the counts per individual. A VPC constructed with the ratio variance over mean calculated in five intervals along the experiment (Fig. 4) displayed an increasing value of the ratio of the two quantities over time, which was accurately retrieved in the simulations, for both the placebo and WIN55212-2.
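The variance-over-mean ratio used in this diagnostic is straightforward to compute; for Poisson-distributed counts the variance equals the mean, so a ratio near 1 is expected under that distribution. The sketch below uses hypothetical counts and a hypothetical function name, and is not the code behind Fig. 4.

```python
from statistics import mean, variance

def variance_to_mean_ratio(counts):
    """Sample variance divided by the mean; None when the ratio is undefined."""
    if len(counts) < 2:
        return None
    m = mean(counts)
    if m == 0:
        return None
    return variance(counts) / m

# Hypothetical event counts from six experiments within one time window; not study data.
window_counts = [1, 3, 2, 0, 4, 2]
ratio = variance_to_mean_ratio(window_counts)
```

Ratios clearly above 1 point to overdispersion and ratios below 1 to underdispersion relative to the Poisson assumption.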
The re-estimation of parameters from simulated datasets (Table 1) resulted in the re-estimated parameters close to the original ones. These low biases indicate model stability as well as estimation method robustness. The Laplace estimation method applied seemed adequate for Poisson models as shown previously (Plan et al., 2009).
Finally, the power to detect a drug effect assessed for the current settings of the study for the 1-min dataset was found to be close to 100% at a significance level of 1%. When decreasing the dose to a third of the level administered in the experiments, i.e., 0.005 mg/kg, the pharmacometric model approach permitted us to ensure a power higher than 80% (Fig. 5). The traditional approach using summary measures for exposure and response systematically displayed a lower power than the dynamic modeling approach. Because of the nonlinearity linking the power to the dose, the difference in power increased while decreasing the dose. Thus, at the dose level required to reach 90% power with the traditional approach, a power of 97.5% could be reached with the 1-min modeling approach, and at the dose level required to reach 80% power, a power of 92.5% was obtained with the 1-min modeling approach.
Two pharmacodynamic modeling methods, count and RTTE, were implemented and applied to TLESRs. Although these two methodologies were dealing with different aspects of TLESRs, they both adequately described and quantified the pharmacodynamic characteristics of the events present with the vehicle (placebo) and the compound (WIN55212-2).
Improvements carried by such models compared with traditional analyses are undeniable, because they enable a characterization of the time course of the events as a consequence of the time course of exposure. Common ways of analyzing these types of data, such as using dose as exposure measure and average number of events per dose group as response measure (Omari et al., 2006), have several shortcomings: 1) they do not take into account that systemic exposure may differ between subjects receiving the same dose; 2) exposure in terms of systemic drug concentrations will change over time; 3) the frequency of events may change over time; and 4) they do not separate the contributions from underlying biological variability between subjects, stochastic noise, and measurement error. The consequence of these shortcomings is a poorer description of the underlying (patho)physiological and pharmacological processes. It also will lead to lower statistical power to detect drug effects and limited possibilities to make predictions and extrapolations. Models such as those presented in this analysis, on the other hand, do not suffer from these shortcomings.
Furthermore, traditional approaches usually take the dose as the predictor of the drug effect (Lehmann et al., 2002) or the average concentration during the experiment time span (as done in the present study). This disregards the time-variant exposure characterized in detail by pharmacokinetic techniques. In contrast, a dynamic representation of the exposure allows a more appropriate estimation of the correlation between the target-site concentration and the drug effect because the information is more fully preserved.
The applied count model considered data where the whole time span was separated into a series of time intervals. From this viewpoint, it consisted of a refinement of the common statistical approach, which was to study the relationship between the total counted number of TLESRs and the exposure.
A function can be designed to describe the changes in λ caused by changes in the underlying system or a drug effect. In the studied dataset, there was a time region with a high frequency of events observed at approximately 10 min; a surge function was found to be reasonable to express these system dynamics. An inhibitory Imax model, assuming that at high doses the occurrence of events can be fully inhibited, could adequately describe the changes associated with drug concentration. Hence, count models demonstrate certain flexibility, in addition to good simulation properties allowing the PD profile to be retrieved.
The number of time intervals within an experiment of a fixed duration is determined by the length of the time intervals, which reflects the resolution of the count models. In the analyzed study, 5-min intervals displayed low resolution, considering that the events were recorded with a time unit of 1 min. Consequently, the surge determined by the 5-min count model was flatter compared with the one accounted for with the other investigated models. This was mostly caused by an imperfect characterization of the surge width, in reality less than 5 min. There was also a significant bias on the γ determined by the 5-min count model, resulting in a higher value and higher RSE than otherwise determined. Therefore, one can state that the length of the time intervals had an impact on the estimation of the PD parameters.
It was also demonstrated that the interval length affects the power of a study: the more the information is summarized, from single events through 5-min time intervals to counts over the entire experiment, the lower the statistical power. The power to detect a drug effect could be increased with suitable model-based analysis. In addition, the model-based analysis allowed a more appropriate characterization of the concentration-response relationship compared with traditional methods based on summary exposures. The three Rs (Russell, 1995) define the basis upon which to launch animal research experiments: replacement (by alternatives to animal experiment), reduction (of the number of animals entering experiments), and refinement (of the protocol to enhance animal welfare). Hence, from this perspective, using a dynamic modeling approach supports a reduction of the number of laboratory animals and/or allows significant drug effects to be detected at lower doses.
The RTTE model describes the time-to-event process from a global perspective by integrating the hazard. The hazard can vary with time, and the same surge function proved suitable in the RTTE model to describe TLESR variations. Such a surge could not be well characterized by other time-varying hazard functions, e.g., Weibull or Gompertz (Lee and Go, 1997).
RTTE modeling is theoretically the best method because it treats events with their exact time of occurrence and does not involve any data simplification. The advantage becomes clearer when the hazard varies within time intervals and not only between them. Furthermore, it allows the exploration of whether one event triggers or suppresses the occurrence of subsequent events. This was investigated in the present study, but was not found to be significant (results not included).
The close similarity between the results from the RTTE model and 1-min count model in this study is explained by the fact that 1 min was the minimum time resolution possible considering the data recording methods. In that case, the two approaches are equivalent, and the hazard is equal to λ. The count model can then be chosen not because the data were collected at a coarse resolution, but because it offers easier model building and faster run times.
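The link between the two quantities can be written out explicitly. For events generated by a time-varying hazard h, the expected count in the k-th interval is the hazard integrated over that interval (the interval index k and the length Δt are notation introduced here for illustration):

$$\lambda_k = \int_{t_{k-1}}^{t_k} h(u)\,du \approx h(t_k)\,\Delta t, \qquad \Delta t = t_k - t_{k-1}$$

The approximation holds when the hazard changes little within the interval. With Δt = 1 min, λ per interval and h per minute therefore take the same numerical value, whereas with Δt = 5 min a surge only about 5 min wide is averaged over a single interval.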
In terms of variability, quantification was limited by the fact that only nine animals were included in the study; two to four experiments were performed on each dog. Intraindividual variability was not successfully assessed, but interindividual variability could be estimated for the most important parameter, the baseline. Addressing variability allows for unbiased population parameter estimates and adequate design of future experiments and may facilitate prediction into clinical studies.
A panel of visual evaluation tools that can be used in conjunction with count and RTTE model building methods was illustrated. Diagnostics of discrete events, as treated in the present article, are typically simulation-based and lead to evaluation of both typical responses and variability. The measures of variability are, however, affected by the number of individuals and may not be relevant to be presented when numbers are too low. Consequently, in this work we generally focused on the confidence interval around the predicted median, accurately described by a high number of simulations.
In conclusion, TLESRs from preclinical studies and drug treatment with WIN55212-2 were appropriately modeled with three models representing two different approaches. Both approaches provided a better description of the pharmacodynamic properties of WIN55212-2 with respect to TLESRs than the traditional analysis method ignoring the dynamic features of drug concentration and events.
Participated in research design: Plan, Ma, and Karlsson.
Conducted experiments: Någård and Jensen.
Performed data analysis: Plan and Ma.
Wrote or contributed to the writing of the manuscript: Plan, Ma, Någård, Jensen, and Karlsson.
We thank two anonymous reviewers for valuable comments.
E.L.P. was financially supported by UCB Pharma SA. G.M. was financially supported by Pfizer.
Article, publication date, and citation information can be found at http://jpet.aspetjournals.org.
- TLESR, transient lower esophageal sphincter relaxation
- LES, lower esophageal sphincter
- RTTE, repeated TTE
- VPC, visual predictive check
- WIN55212-2, (R)-(+)-[2,3-dihydro-5-methyl-3-(4-morpholinylmethyl)pyrrolo[1,2,3-de]-1,4-benzoxazin-6-yl]-1-naphthalenylmethanone mesylate
- RSE, relative standard error
- SA, surge amplitude
- SW, half of the surge width
- PT, peak time.
- Received May 7, 2011.
- Accepted August 31, 2011.
- Copyright © 2011 by The American Society for Pharmacology and Experimental Therapeutics