Abstract
CYP2B6 metabolizes many drugs, and its expression varies greatly. CYP2B6 genotype-phenotype associations were determined using human livers that were biochemically phenotyped for CYP2B6 (mRNA, protein, and CYP2B6 activity), and genotyped for CYP2B6 coding and 5′-flanking regions. CYP2B6 expression differed significantly between sexes. Females had higher amounts of CYP2B6 mRNA (3.9-fold, P < 0.001), protein (1.7-fold, P < 0.009), and activity (1.6-fold, P < 0.05) than did male subjects. Furthermore, 7.1% of females and 20% of males were poor CYP2B6 metabolizers. Striking differences among different ethnic groups were observed: CYP2B6 activity was 3.6- and 5.0-fold higher in Hispanic females than in Caucasian (P < 0.022) or African-American females (P < 0.038). Ten single nucleotide polymorphisms (SNPs) in the CYP2B6 promoter and seven in the coding region were found, including a newly identified 13072A>G substitution that resulted in an Lys139Glu change. Many CYP2B6 splice variants (SV) were observed, and the most common variant lacked exons 4 to 6. A nonsynonymous SNP in exon 4 (15631G>T), which disrupted an exonic splicing enhancer, and a SNP 15582C>T in an intron-3 branch site were correlated with this SV. The extent to which CYP2B6 variation was a predictor of CYP2B6 activity varied according to sex and ethnicity. The 1459C>T SNP, which resulted in the Arg487Cys substitution, was associated with the lowest level of CYP2B6 activity in livers of females. The intron-3 15582C>T SNP (in significant linkage disequilibrium with a SNP in a putative hepatic nuclear factor 4 (HNF4) binding site) was correlated with lower CYP2B6 expression in females. In conclusion, we found several common SNPs that are associated with polymorphic CYP2B6 expression.
CYP2B6 is expressed in human liver and some extrahepatic tissues. CYP2B6 activity in liver microsomes varies more than 100-fold among different persons (Yamano et al., 1989; Ekins et al., 1998). This finding suggests that there are significant interindividual differences in the systemic exposure to a variety of drugs that are metabolized by CYP2B6. The number of substrates that are partially or completely metabolized by CYP2B6 is estimated to be around 70 (Ekins et al., 1998, 1999). CYP2B6 substrates include clinically prescribed drugs such as alfentanil, ketamine, propofol, cyclophosphamide, ifosfamide, nevirapine, efavirenz, bupropion, and tamoxifen; drugs of abuse such as methylenedioxymethamphetamine (MDMA, “ecstasy”) and recreational drugs such as nicotine; and procarcinogens such as the environmental contaminants aflatoxin B1 and dibenzanthracene (Lang et al., 2001). Because differences in systemic exposure to drugs or environmental chemicals metabolically cleared by CYP2B6 may lead to variation in therapeutic and toxic responses to these xenobiotics, it is important to determine all of the genetic variants responsible for altered CYP2B6 expression. This is especially important because there is no validated drug that can currently be used as a probe to identify CYP2B6 activity in vivo and to guide dosing of CYP2B6 substrates with narrow therapeutic indices.
To determine the contribution of CYP2B6 genetic variants to polymorphic CYP2B6 expression, a recent study resequenced CYP2B6 exons in cDNA derived from 35 German Caucasians. Nine single nucleotide polymorphisms (SNPs), including five that caused nonsynonymous amino acid changes, were found (Lang et al., 2001). A 1459C>T substitution in exon 9, which resulted in an Arg487Cys substitution, was associated with CYP2B6 activity that was 8-fold lower in Cys487 homozygotes than in Arg487 homozygotes. However, the frequency of this variant (14%) in a German Caucasian population cannot explain the large variability in CYP2B6 activity. Using expressed cDNAs, another group found that 7-ethoxycoumarin O-deethylase activity of CYP2B6 was greater when the enzyme contained His172 (encoded by exon 4) rather than Gln172 (Ariyoshi et al., 2001). Furthermore, it has been reported that, compared with CYP2B6*1, the alleles CYP2B6*4 (Lys262Arg), 2B6*5 (Arg487Cys), 2B6*6 (Gln172His; Lys262Arg), and 2B6*7 (Gln172His; Lys262Arg; Arg487Cys) are associated with a higher intrinsic clearance (Vmax/Km) of 7-ethoxy-4-trifluoromethylcoumarin. Moreover, CYP2B6*2 (Arg22Cys) reportedly does not affect CYP2B6 activity (Jinno et al., 2003).
Although alternatively spliced transcripts of CYP2B6 have been reported, the early splicing studies (Miles et al., 1988; Yamano et al., 1989) did not attempt to fully characterize the splicing of CYP2B6 transcripts, and these studies contained a relatively small number of samples. Neither of the previous genotyping studies of CYP2B6 (Ariyoshi et al., 2001; Lang et al., 2001) identified splice site variations that could lead to alternatively spliced mRNA. However, both studies followed the exon-by-exon sequencing approach of genomic DNA analysis, and splice variants could only be inferred on the basis of canonical splice acceptor and donor nucleotides.
There is a growing awareness that some variation in human gene expression is dramatically influenced by variation in regulation of that expression. CYP2B expression is regulated by the transcription factor CAR (constitutive androstane receptor) (Wei et al., 2000), whose hepatic expression was recently reported to be highly correlated with that of hepatic CYP2B6 (Chang et al., 2003). However, this association needs to be validated in a larger study because liver samples from only 12 human subjects were analyzed.
To further investigate the polymorphic behavior of CYP2B6 and its causes, we conducted the study described in the present report. Our objectives were to determine whether alternative splicing is an important determinant of polymorphic expression of CYP2B6 in the liver; to determine whether SNPs in the coding region, introns, or 5′-flanking sequences of CYP2B6 are correlated with altered CYP2B6 expression in the livers of Caucasian, African, or Hispanic Americans; and to determine whether expression of CAR is correlated with CYP2B6 expression in a large number of human livers. Here we report a striking difference in CYP2B6 expression between sexes, newly identified alternatively spliced mRNA variants, several newly found SNPs whose presence is correlated with altered CYP2B6 expression, and an association between CYP2B6 expression and CAR expression.
Materials and Methods
Materials. S-mephenytoin was purchased from Toronto Research Chemicals (North York, Ontario, Canada) and purified at Eli Lilly & Co. (Indianapolis, IN). We bought nirvanol from Ultrafine (Manchester, UK) and phenobarbital and NADPH from Sigma Chemical Co. (St. Louis, MO).
Study Subjects and Liver Samples. Genomic DNA, total RNA, microsomes, or a combination of these materials was prepared from liver samples (University of Pittsburgh/St. Jude Liver Resource). The majority of the livers used in this study were from organ donors and additional relevant information on the donor history is listed in the legends to the tables. The liver samples had been isolated from 51 male subjects, 30 female subjects, and 2 subjects whose sex was unknown. Of the 80 subjects whose ethnicity was reported, 43 were Caucasian (15 females, 28 males), 29 were African-Americans (9 females and 18 males), 7 were Hispanic (3 females, 4 males), and 1 was Asian (female). The ethnicity of three of the samples (2 females and 1 male) and gender for two (both African-Americans) was unknown.
GenBank Accession Numbers. The CYP2B6 promoter sequence is AF081569 (Sueyoshi et al., 1999; Wang et al., 2003); the CYP2B6 gene sequence, NG_000008 and AC023172; CYP2B6 mRNA, NM_000767; CYP2B6 exon 3A, X16864 (Miles et al., 1990); and CYP2B7 gene, AC008537. Other CYP2B6 sequences used were those with the accession numbers X06399 (Miles et al., 1988), X06400 (Miles et al., 1988), X13494 (Miles et al., 1989), M29873, M29874, and J02864 (Yamano et al., 1989).
Western Blot Analysis. Quantitative immunoblotting of CYP2B6 from human liver microsomes was performed as described on 60 human livers (Ekins et al., 1997). Twenty micrograms of liver microsomes prepared from donor tissue were separated on 7.5% slab polyacrylamide gels, and 0.05 to 5 pmol of human CYP2B6 (supersomes from BD Gentest, Woburn, MA) were included so that a standard curve could be developed as we have done previously (Ekins et al., 1997). Blots were incubated with an anti-CYP2B6 antibody (BD Gentest) and subsequently with a biotinylated rabbit anti-mouse antibody and streptavidin alkaline phosphatase. Bound antibody was detected by treatment with 5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium (BCIP/NBT) (Kirkegaard and Perry Laboratories, Gaithersburg, MD). The integrated optical density of each band was determined by computer using the public domain NIH Image program (developed at the National Institutes of Health and available at http://rsb.info.nih.gov/nih-image/).
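The standard-curve interpolation described above can be sketched as follows. This is a minimal illustration only, assuming a linear relation between integrated optical density and the amount of CYP2B6 standard over the loaded range; the function names and all numerical values are hypothetical and are not data from this study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pmol_per_mg(band_iod, slope, intercept, microsomal_protein_ug=20.0):
    """Interpolate pmol CYP2B6 from a band's integrated optical density
    and express it per mg of microsomal protein loaded on the gel."""
    pmol = (band_iod - intercept) / slope
    return pmol / (microsomal_protein_ug / 1000.0)


# Hypothetical standards (pmol CYP2B6 loaded) and their optical densities.
standards_pmol = [0.05, 0.5, 1.0, 2.5, 5.0]
standards_iod = [10.0, 100.0, 200.0, 500.0, 1000.0]

slope, intercept = fit_line(standards_pmol, standards_iod)
sample = pmol_per_mg(band_iod=300.0, slope=slope, intercept=intercept)
print(f"{sample:.1f} pmol CYP2B6/mg microsomal protein")
```

The 20-μg default matches the microsomal protein load stated in the text; a nonlinear standard curve would require a different fit.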
CYP2B6-Mediated Biotransformation of S-Mephenytoin to Nirvanol. Microsomes from 73 human liver samples were characterized for the CYP2B6 form-selective formation of nirvanol after incubation with S-mephenytoin (Heyn et al., 1996). Duplicate mixtures of 100 μM S-mephenytoin and human liver microsomes (0.5 mg protein/ml) in 0.1 M sodium phosphate, pH 7.4, were preincubated at 37°C for approximately 3 min before the reaction was initiated by the addition of 0.6 mM NADPH (total volume, 500 μl). After a 45-min incubation period (we used linear rate conditions for nirvanol formation with respect to protein concentration and time of incubation), the reaction was stopped with 0.1 ml of 2% sodium azide containing phenobarbital as an internal standard. Liquid extraction was then undertaken with the addition of 5 ml of dichloromethane and a subsequent 10-min mixing period. The sample was centrifuged, the aqueous layer discarded, and the dichloromethane layer transferred to a clean tube. This sample was dried under nitrogen for 20 to 30 min at 37°C.
High-performance liquid chromatography was performed to determine the amount of nirvanol that had formed (Heyn et al., 1996). The sample was resuspended in 160 μl of mobile phase (20 mM perchlorate, pH 2.5; methanol; acetonitrile [ratio of volume of each component: 69:25:6, respectively]).
Forty microliters of the sample was injected into a Nucleosil C18 column (5 μm, 150 × 4.6 mm; MetaChem, Torrance, CA), and the gradient that was used consisted of 100% mobile phase for 0 to 11 min, 85% methanol for 11 to 13 min, and 100% mobile phase for 13 to 24 min. The flow rate was 1.0 ml/min, and UV detection (204 nm) was conducted to monitor the eluants. The retention times for nirvanol and for phenobarbital were 12 min and 13.5 min, respectively. The rate of nirvanol formation was expressed as picomoles per minute per milligram of microsomal protein. The lower limit of quantitation was 20 pmol nirvanol/ml incubation and the upper limit of quantitation was 2000 pmol nirvanol/ml incubation.
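As a worked illustration of the units, the conversion from nirvanol concentration to a formation rate can be sketched as below. The incubation time (45 min), protein concentration (0.5 mg/ml), and quantitation limits are taken from the text; the function name is hypothetical, and the assumption that the rate is obtained by simple division under linear conditions is ours.

```python
def nirvanol_rate(pmol_per_ml, incubation_min=45.0, protein_mg_per_ml=0.5):
    """Convert nirvanol formed (pmol per ml of incubation) into a rate
    expressed as pmol/min/mg microsomal protein."""
    return pmol_per_ml / (incubation_min * protein_mg_per_ml)


# Quantitation limits from the text, expressed as formation rates.
lloq_rate = nirvanol_rate(20.0)
uloq_rate = nirvanol_rate(2000.0)
print(f"LLOQ: {lloq_rate:.2f} pmol/min/mg; ULOQ: {uloq_rate:.1f} pmol/min/mg")
```

Under these assumptions, the quantifiable range corresponds to roughly 0.9 to 89 pmol/min/mg of microsomal protein.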
Quantitation of mRNA by Real-Time PCR. Total RNA was isolated from human liver samples, and the amount of CYP2B6 mRNA present was quantified by real-time PCR using TaqMan fluorogenic probes as part of an earlier study (Davila and Strom, manuscript in preparation). The CYP2B6 primers were 5′-CACTCATCAGCTCTGTATTCG-3′ (sense) and 5′-GTAGACTCTCTCTGCAACATGAG-3′ (antisense). Real-time PCR quantitation of CAR and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) mRNAs was carried out by using the QuantiTect SYBR Green PCR kit (QIAGEN, Valencia, CA) according to the manufacturer's instructions. Amplification was done with the ABI PRISM 7900HT Sequence Detection System (Applied Biosystems, Foster City, CA). CAR primers (Chang et al., 2003) (forward 5′-CCAGCTCATCTGTTCATCCA-3′ and reverse 5′-GGTAACTCCAGGTCGGTCAG-3′) were used in PCR in which the initial activation step was conducted at 95°C for 15 min and was followed by 40 cycles in which each cycle consisted of denaturation at 94°C for 30 s, annealing at 57.5°C for 30 s, and synthesis at 72°C for 30 s. The relative amount of CAR mRNA in each human liver sample was normalized to the GAPDH values. Forward and reverse primers used for GAPDH amplification were 5′-ACCACAGTCCATGCCATCAC-3′ and 5′-TCCACCACCCTGTTGCTGTA-3′, respectively. Amplification of GAPDH mRNA consisted of an initial activation step at 95°C for 15 min and was followed by 40 cycles in which each cycle consisted of denaturation at 94°C for 30 s, annealing at 60°C for 30 s, and synthesis at 72°C for 1 min. The CAR and CYP2B6 mRNAs were each normalized to GAPDH to control for quality of the mRNA. Although there is always a statistical risk when data are normalized to a single gene, GAPDH mRNA is the one most frequently used to normalize values of gene expression in liver tissue, and in one study its expression varied less than that of 18S rRNA in the same set of human liver samples (K. Thummel, personal communication).
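Normalization to GAPDH can be illustrated with the following sketch. The text does not state whether relative amounts were derived from threshold cycles or from standard curves, so the comparative-Ct form used here, the assumed doubling per cycle, the function name, and the Ct values are all assumptions made for illustration.

```python
def relative_expression(ct_target, ct_reference, efficiency=2.0):
    """Target mRNA relative to a reference gene in the same sample,
    computed as efficiency ** (Ct_reference - Ct_target)."""
    return efficiency ** (ct_reference - ct_target)


# Hypothetical threshold cycles for CAR and GAPDH in one liver sample.
car_relative = relative_expression(ct_target=26.0, ct_reference=20.0)
print(f"CAR relative to GAPDH: {car_relative:.6f}")
```

A target that crosses threshold six cycles after GAPDH is thus estimated at 1/64 of the GAPDH level, provided that both amplicons amplify with equal efficiency.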
Reverse Transcription-PCR and Amplification of CYP2B6 cDNA. Because the nucleotide sequence of CYP2B6 is approximately 95% similar to that of CYP2B7 (Yamano et al., 1989), mismatched primers were designed to selectively amplify CYP2B6. PCR primers were designed with PRIMER3 (http://www.genome.wi.mit.edu/cgibin/primer/primer3.cgi), and their sequence homology and specificity were checked by using BLASTn (http://www.ncbi.nlm.nih.gov).
To amplify CYP2B6, first-strand cDNA was synthesized from 3 to 5 μg of total RNA from human liver samples according to the manufacturer's instructions (SuperScript Preamplification System for First-Strand cDNA Synthesis, Invitrogen, Carlsbad, CA). Input CYP2B6 cDNA (1 μl) was amplified in a total volume of 50 μl consisting of 10× PCR buffer with 1.5 mM MgCl2, 50 ng of DNA, 10 pmol of each primer (utrF2 and R2, Table 1), 0.2 mM dNTP (Invitrogen), and 2.5 U of TaqDNA polymerase (Expand High-Fidelity PCR system, Roche Diagnostics, Mannheim, Germany). After an initial denaturation step at 94°C for 1 min, 32 cycles of PCR were performed; each cycle consisted of denaturation at 94°C for 30 s, annealing at 57–58°C for 30 s, and synthesis at 72°C for 1 min. The last cycle was followed by a final extension step at 72°C for 10 min.
If the amount of first-strand cDNA in the sample was low, 5 and 10 μl of first-strand cDNA samples were used in the first round of amplification with the primers CYP2B6/utrF1/R1 (Table 1), and the amplified products then underwent a second round of nested PCR using the primers CYP2B6/utrF2/R2 (Table 1).
CYP2B6 cDNAs were analyzed by agarose (1%) gel electrophoresis. If a single band that was the expected size of the CYP2B6 transcript was observed, the amplification products were incubated with shrimp alkaline phosphatase and exonuclease I (USB, Cleveland, OH) for 30 min at 37°C to remove unincorporated nucleotides and primers; the enzymes were inactivated by incubation at 80°C for 15 min before sequencing. If multiple bands were observed in the agarose gel, individual bands were excised from the gel. The DNA was extracted from the gel sections by using a Zymoclean Gel DNA recovery kit (Zymo Research, Orange, CA), eluted in water, and sequenced by using the PCR primers.
To obtain shorter amplicons of CYP2B6 for further identification of splice variants, first-round PCR products were amplified by second-round nested PCR in which various combinations of PCR primers (Table 1) were used. Sequencing was carried out on an ABI Prism 3700 Automated Sequencer (Applied Biosystems) by using the PCR primers.
PCR Amplification and Sequencing of CYP2B6 Exons with Flanking Introns. PCR was designed to amplify the 9 exons of CYP2B6 and approximately 150 bp of each intron that flanked the adjacent exon. PCR of genomic DNA was similar to that of cDNA, except that the initial denaturation step was at 95°C for 10 min and the number of cycles was 35.
PCR Amplification and Sequencing of the CYP2B6 Promoter Region. We used the primers (CYP2B6/5′UTRF/R, Table 1) in a single reaction to amplify the 2.4-kb proximal promoter region of CYP2B6 from genomic DNA. The products were treated with SAP-exonuclease and sequenced by using five pairs of promoter-sequencing primers (Table 1). The primers CYP2B6/XREMF1/R1 were used to amplify and sequence the region from –7.9 to –8.6 kb; the primers CYP2B6/XREMF2/R2 were used to amplify and sequence the region from –8.3 to –8.95 kb.
Analysis of CYP2B6 Sequences. The Phred-Phrap-Consed package (University of Washington, Seattle, http://droog.mbt.washington.edu/PolyPhred.html) was used to assemble the sequences. This program automatically detects the presence of heterozygous single nucleotide substitutions using fluorescence-based sequencing of PCR products. To represent a true variant, the SNP had to be present in sequences generated by the forward and reverse primers, and the presence of an SNP in a sample was confirmed by repeating the PCR and sequencing. Sequences were also analyzed with the Lasergene software (DNASTAR, Inc., Madison, WI). The wild-type and variant sequences were analyzed by a splice prediction program (http://www.fruitfly.org/seq_tools/splice.html) to determine whether any identified SNPs might have contributed to aberrant splicing. The promoter sequences were searched by the program TESS (Transcription Element Search System; http://www.cbil.upenn.edu/tess/) to detect the disruption of any putative transcription factor-binding sites. Nuclear hormone receptor binding motifs were detected using the computer program NUBIscan (http://www.nubiscan.unibas.ch) (Podvinec et al., 2002).
Statistical Analysis to Evaluate Possible Genotype-Phenotype Relationships. Phenotypes were compared between sexes and among racial groups with the exact Wilcoxon and exact Kruskal-Wallis tests, respectively. Phenotypes were compared between two genotypes and among three genotypes with the exact Wilcoxon and exact Kruskal-Wallis tests, respectively. Proc-StatXact for SAS users, statistical software for exact nonparametric inference (1997) (CYTEL Software Corporation, Cambridge, MA) was used for statistical analysis. When the overall test among three genotypes (homozygous major, heterozygous, homozygous minor) was significant at an α level of 0.10 or less, pairwise comparisons were conducted. P values were not adjusted for multiple comparisons because such strict control of the type I error rate would be expected to miss a potentially significant factor or association.
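The idea behind an exact two-group rank test can be sketched as follows. This is not the Proc-StatXact implementation used in the study; it is a minimal enumeration of all possible group assignments, practical only for small samples, with hypothetical function names and hypothetical activity values.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def exact_wilcoxon_rank_sum(group_a, group_b):
    """Two-sided exact P value: the fraction of all assignments of the
    pooled ranks to group A whose rank sum deviates from its expectation
    at least as much as the observed rank sum does."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    expected = n_a * (len(ranks) + 1) / 2.0
    observed_dev = abs(sum(ranks[:n_a]) - expected)
    total = 0
    extreme = 0
    for subset in combinations(ranks, n_a):
        total += 1
        if abs(sum(subset) - expected) >= observed_dev - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical activity values (pmol/min/mg) for two small groups.
females = [5.1, 6.3, 7.8, 9.0]
males = [1.2, 2.4, 3.3, 4.0]
p_value = exact_wilcoxon_rank_sum(females, males)
print(f"exact two-sided P = {p_value:.4f}")
```

With complete separation of two groups of four, only 2 of the 70 possible assignments are as extreme as the observed one, so the smallest attainable two-sided P value is 2/70.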
Linkage Disequilibrium. The significance of linkage disequilibrium between pairs of polymorphic sites was assessed by using Fisher's exact test as described (Weir, 1996) to analyze genotypic data.
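A simplified version of this test is sketched below. It assumes that genotypes at each site are collapsed to carrier versus noncarrier of the variant allele, giving a 2 × 2 table; the analysis described by Weir (1996) operates on genotypic data and may use a larger table. The function name and the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]]:
    the summed probability of every table with the same margins that is
    no more probable than the observed table."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def probability(x):
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        probability(x)
        for x in range(lowest, highest + 1)
        if probability(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows = carrier / noncarrier at site 1,
# columns = carrier / noncarrier at site 2.
p_ld = fisher_exact_two_sided(8, 2, 1, 9)
print(f"P = {p_ld:.5f}")
```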
Determination of Haplotype Frequencies in Different Ethnic Groups. To calculate the haplotype frequencies, we used the six loci for which the maximal amounts of genotyping data were available. The haplotype frequencies were estimated on the basis of the expectation-maximization algorithm (Excoffier and Slatkin, 1995). Haplotype frequencies were determined separately for each ethnic group by counting the estimated haplotypes in each ethnic group. The association between haplotypes and the different CYP2B6 phenotypes was evaluated on the basis of a score test from the haplo.score package (http://www.mayo.edu/statgen), which tests for association between traits and haplotypes when linkage phase is ambiguous and which is based on generalized linear models applicable to a number of traits (Schaid et al., 2002). The haplotype frequencies for females and males were also estimated separately and together. Because the current version of the haplo.score package could not cope with a covariate, i.e., sex, we reconstructed individual haplotypes by PHASE (Stephens et al., 2001) and used Wilcoxon's rank sum test to compare the phenotypes between group 1 (with at least one haplotype) and group 2 (without the haplotype).
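The expectation-maximization approach can be illustrated for the simplest case. The sketch below handles only two biallelic loci, whereas six loci were used in the study; the function name, the genotype coding (number of variant alleles at each locus), and the example genotypes are hypothetical.

```python
# Alleles carried at one locus for 0, 1, or 2 copies of the variant.
ALLELES = {0: (0, 0), 1: (0, 1), 2: (1, 1)}


def em_haplotype_frequencies(genotypes, n_iter=200):
    """Estimate two-locus haplotype frequencies from unphased genotypes.
    Each genotype is (g1, g2), the variant-allele count at each locus.
    Haplotypes are written (allele at locus 1, allele at locus 2)."""
    haplotypes = [(0, 0), (0, 1), (1, 0), (1, 1)]
    freq = {h: 0.25 for h in haplotypes}
    n_chromosomes = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = {h: 0.0 for h in haplotypes}
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E step: a double heterozygote is either 00/11 or 01/10;
                # weight the two phases by the current frequencies.
                cis = freq[(0, 0)] * freq[(1, 1)]
                trans = freq[(0, 1)] * freq[(1, 0)]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                counts[(0, 0)] += w
                counts[(1, 1)] += w
                counts[(0, 1)] += 1.0 - w
                counts[(1, 0)] += 1.0 - w
            else:
                # Phase is unambiguous when at most one locus is heterozygous.
                first, second = ALLELES[g1], ALLELES[g2]
                counts[(first[0], second[0])] += 1.0
                counts[(first[1], second[1])] += 1.0
        # M step: update frequencies from the expected haplotype counts.
        freq = {h: counts[h] / n_chromosomes for h in haplotypes}
    return freq


# Hypothetical sample: 6 reference homozygotes, 2 variant homozygotes,
# and 2 double heterozygotes whose phase is unknown.
genotypes = [(0, 0)] * 6 + [(2, 2)] * 2 + [(1, 1)] * 2
estimated = em_haplotype_frequencies(genotypes)
for haplotype, frequency in estimated.items():
    print(haplotype, round(frequency, 3))
```

In this example the unambiguous subjects carry only the 00 and 11 haplotypes, so the algorithm assigns the double heterozygotes to the same pair and the recombinant haplotypes converge to a frequency of zero.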
Results
The content of CYP2B6 mRNA, protein, and activity were quantitated in human livers. Real-time quantitative PCR revealed a large degree of variation in the relative levels of CYP2B6 mRNA (Table 2, Fig. 1a). The mean relative value for 22 females (1016 ± 2557) was 3.9 times greater than that for males (259 ± 676). Among females, Hispanics had the highest mean relative level of CYP2B6 mRNA. Among males, Caucasians and Hispanics had similar mean relative values but they were greater than the mean relative value for African-American males.
Quantitation of CYP2B6 Protein. CYP2B6 protein concentration was determined in human liver microsomes (Table 2, Fig. 1b). Fifty-nine of the 60 human liver samples contained detectable amounts of CYP2B6 protein (not shown). Females had a mean CYP2B6 protein level that was 1.6 times higher than the mean CYP2B6 protein level in male livers. Among females the highest mean CYP2B6 protein levels were found in Hispanics. In contrast, Hispanic males had a mean level of CYP2B6 protein that was similar to that of Caucasians, but higher than the mean value for African-American males.
Quantitation of the N-Demethylase Activity of CYP2B6. It has been conclusively demonstrated, using specific monoclonal antibodies and inhibitors, that the extent of N-demethylation of S-mephenytoin (i.e., the biotransformation of S-mephenytoin to nirvanol) is a specific indicator of the level of CYP2B6 activity (Heyn et al., 1996; Ekins et al., 1998; Ko et al., 1998). Therefore, we measured CYP2B6 activity in the 60 samples described in the preceding section and in samples from 13 African-Americans by measuring the amount of nirvanol formed by the N-demethylation of S-mephenytoin (Table 2, Fig. 1c). Our high-performance liquid chromatography method did not detect N-demethylation of mephenytoin in two (7.1%) of 28 liver samples from females and in nine (20%) of 45 samples from males. CYP2B6 enzyme activity in liver tissue of females was 1.7 times greater than that in liver tissue of males. CYP2B6 activity was 3.6 and 4.9 times higher in Hispanic females than in Caucasian females and African-American females (P < 0.0215 and P < 0.0381, respectively). Among the males, Hispanics had a mean activity level that was lower than the activity in Caucasians and African-Americans.
Correlation of Hepatic CYP2B6 mRNA Level, Protein Quantity, and Enzymatic Activity. A very good positive correlation was seen between CYP2B6 activity level and CYP2B6 protein quantity in females (r² = 0.88) and in males (r² = 0.73). Similarly, a good positive correlation was found between the relative level of CYP2B6 mRNA and the amount of CYP2B6 protein in liver samples from females (r² = 0.80), but the positive correlation between the two in liver samples from males was not as great (r² = 0.60). Relative values were also available for CYP1A2, CYP2D6, CYP4A11 (not shown), and CAR mRNAs (all normalized to the level of GAPDH mRNA, which served as a control for mRNA quality). CYP2B6 mRNA expression was not correlated with expression of CYP1A2, CYP2D6, or CYP4A11 mRNA. Because all of the relative quantities of mRNAs were normalized to that of GAPDH, there was minimal chance that differences in RNA quality among the samples led to spurious associations. If data for all persons were included in the analysis, the relative level of CYP2B6 mRNA was not correlated with that of CAR mRNA (r² = 0.19). However, CAR and CYP2B6 mRNAs were highly correlated (r² = 0.76) if we excluded data for two subjects who were receiving CYP2B6 inducers (dilantin and fluconazole) at the time of tissue donation and for one subject whose mRNA was highly degraded. The relative level of CAR mRNA was higher in females than in males (Fig. 1d).
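For reference, a coefficient of determination of the kind reported above can be computed as sketched here. The text does not state which correlation method was used, so the squared Pearson coefficient is an assumption; the function name and the paired values are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation coefficient between paired values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical paired measurements (e.g., protein level versus activity).
perfect = r_squared([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
weak = r_squared([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
print(perfect, weak)
```

Because a few extreme pairs can dominate this statistic, excluding or including individual subjects can change r² substantially, as seen for CAR and CYP2B6 mRNA.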
Alternative Splicing of CYP2B6 mRNAs. Although alternatively spliced transcripts of CYP2B6 have been reported, it was unclear whether alternative splicing is an important determinant of polymorphic expression of CYP2B6 in the liver. Therefore, CYP2B6 exons 1 through 9 were amplified from mRNA isolated from the same 60 human liver samples phenotyped for mRNA and protein expression, and the PCR products were analyzed by agarose gel electrophoresis (Fig. 2a). When we conducted one round of amplification of full-length CYP2B6 cDNA in samples with high, intermediate, and low CYP2B6 mRNA levels in representative liver samples, the CYP2B6 cDNA of the expected size (∼1500 bp) was amplified primarily in those samples with the highest level of CYP2B6 mRNA (Fig. 2b). CYP2B6 mRNA was more readily detected in the liver samples from females than in those from males. Notably, although only five samples from Hispanics were among the 60 samples analyzed, samples from two of the three female Hispanics and from one of the two male Hispanics had the highest level of CYP2B6 mRNA (Fig. 2a).
To detect CYP2B6 mRNA in samples that contained the lowest amounts of CYP2B6 mRNA (as indicated by real-time PCR), the amount of template cDNA was increased for first-round amplification, and a second round of amplification with nested primers was performed. Under these robust conditions, the expected CYP2B6 cDNA was found only in those samples with the highest level of CYP2B6 mRNA, whereas several smaller cDNAs (the chief variant was 1000 bp long) were seen in many of the samples (Fig. 2b). The level of variant cDNAs increased as the level of CYP2B6 cDNA representing properly spliced CYP2B6 mRNA decreased.
To obtain shorter CYP2B6 transcripts that could be more readily characterized and to avoid amplifying previously reported CYP2B6 splice variants in which portions of exon 8 were deleted (Miles et al., 1989), we used primers identical to sequences in exons 2 and 7 to amplify the CYP2B6 cDNA. Apart from the cDNA of the expected size, a variant cDNA of approximately 500 bp (splice variant 1, SV1) was observed (Fig. 3a). This variant lacked sequences corresponding to CYP2B6 exons 4, 5, and 6. Sequencing of intron 3 and exon 4 in genomic DNA from the liver samples that expressed SV1 revealed several SNPs. The exon 4 genotype 15631G>T was most frequently associated with SV1 (Fig. 3a). The SNP in intron 3 was not independently related to SV1 production, but the ratio of SV1 to full-length cDNA was greater in persons carrying the combination of the exon 4 (15631G>T) and intron-3 (15582C>T) genotypes than in those with only one of the two SNPs (Fig. 3a).
Amplification was done to check for the presence of a previously reported SV that includes 44 nucleotides from intron 3 (exon 3A) but lacks the first 29 nucleotides of exon 4 (Yamano et al., 1989; Miles et al., 1990). We found four SVs (SV2–5) (Fig. 3b), several of which had not been reported previously. These cDNAs represented variant mRNAs that included either 44 bp (exon 3A) (SV4, SV5) or 130 bp (exon 3B) (SV2, SV3) of sequence in intron 3. Some cDNAs representing alternatively spliced transcripts contained exon 4 (SV2, SV4), and others contained all of exon 4 except the first 29 bp (SV3, SV5) (Fig. 3c). Because of the presence of premature termination codons, all four variants encoded truncated proteins.
CYP2B6 exons 7 through 9 were also amplified from the same 60 liver cDNAs. Two previously reported SVs (Miles et al., 1989; Yamano et al., 1989) were observed in approximately half of the samples. One of these variants lacked sequences corresponding to exon 8, whereas the other variant lacked sequences for exon 8 but contained a 58-bp insertion (exon 8A) from intron 8 (Fig. 3d). Both variants contained premature termination codons.
Analysis of Splice Sites and Skipping of Exons 4 through 6. We used an information theory-based model to determine the information content of the splice donor and acceptor site sequences and whether the strength of the splice site (i.e., the Ri value) was related to alternative splicing (Table 3). Unlike other measures of binding strength, the splice site information we used is related to the thermodynamic entropy and therefore the free energy of binding (Schneider, 1997; Rogan et al., 1998; Thompson et al., 2002). One basis for alternative splicing of exons 4 through 6 appears to be that the skipped upstream acceptor site in intron 3 had a lower information content (Ri = 7.0 bits) than the ultimate downstream acceptor site in intron 6 (Ri = 10.6 bits) (Table 3). For the cryptic exons, all of the splice sites (except the exon 8A acceptor site) had nominal Ri values (Ri values greater than the minimum information content required for splicing, i.e., 1.6 bits). Nevertheless, the acceptor site for exon 3A is stronger than the exon 3 acceptor site; this feature is similar to that of the cryptic exons in CYP3A5 (Rogan et al., 2003). Thus, the higher information content of some of the cryptic splice sites is related to their enhanced use in splicing and to the inclusion of some cryptic exons (e.g., exon 3A) in the CYP2B6 mRNA.
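The individual information of a site can be illustrated with a toy calculation. The sketch below sums 2 + log2 f(b, l) over the positions of a sequence, where f(b, l) is the frequency of base b at position l among known sites; it omits the small-sample correction, and the three-position frequency matrix and function name are hypothetical and are not the splice-site model used in this study.

```python
from math import log2


def individual_information(sequence, freq_matrix):
    """Ri in bits: the sum over positions of 2 + log2 f(base, position).
    A base never observed at a position gives minus infinity."""
    if len(sequence) != len(freq_matrix):
        raise ValueError("sequence and matrix must have the same length")
    total = 0.0
    for base, column in zip(sequence, freq_matrix):
        frequency = column.get(base, 0.0)
        if frequency == 0.0:
            return float("-inf")
        total += 2.0 + log2(frequency)
    return total


# Toy matrix: position 1 is invariant, position 2 is uninformative,
# and position 3 favors G.
toy_matrix = [
    {"A": 1.0},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    {"G": 0.5, "A": 0.25, "C": 0.125, "T": 0.125},
]
strong_site = individual_information("ACG", toy_matrix)
weak_site = individual_information("ACT", toy_matrix)
print(strong_site, weak_site)
```

A single substitution at an informative position lowers Ri, which is the sense in which the intron 3 acceptor (7.0 bits) is a weaker site than the intron 6 acceptor (10.6 bits).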
Because the most common SV lacked exons 4 through 6, we further analyzed genomic DNA for SNPs surrounding exon 4 to see whether they might be predicted to affect splicing. Our analysis indicated that the exon 4 SNP led to several differences in splicing. First, the 15631G>T SNP disrupts a cryptic AG splice acceptor site that is used in creating SV3 and SV5, which lack the first 29 bp of exon 4. Second, the exon 4 SNP 15631G>T appears to lie within an exonic splicing enhancer (ESE). ESEs enhance pre-mRNA splicing, are present in most exons within approximately 30 bp of the exon border, and serve as binding sites for the serine/arginine-rich (SR) proteins, which are thought to promote exon definition by either directly recruiting the splicing apparatus or antagonizing the action of nearby silencer elements (Cartegni et al., 2002). The 15631G>T substitution in exon 4, the most frequently observed exonic SNP, was located approximately 30 bp from the 5′ end of exon 4. Computer analysis of the sequence CCA[G/T]TCC, which surrounds this exon 4 SNP, indicated that the sequence is a potential binding site for the SR protein srp40. The threshold value for srp40 binding is 2.67. The wild-type sequence of exon 4 has a score of 2.85, and replacement of G with T drops the score to 2.19, which is below the threshold. Thus, the ESE in exon 4 promotes appropriate splicing of exon 4, but the SNP in this ESE leads to the skipping of exon 4 (Fig. 3a).
Analysis of CYP2B6 Exons and Their Flanking Introns. A total of seven exonic SNPs were identified in CYP2B6 (Tables 4 and 5), and five were missense mutations. The most frequent SNP was the 15631G>T substitution in exon 4, which resulted in the Gln172His change. Although analysis of DNA from German Caucasians indicated that the Gln172His substitution was always associated with the nonsynonymous SNP (Lys262Arg) in CYP2B6*6 and CYP2B6*7 (Lang et al., 2001), we found that the Gln172His replacement was infrequently associated with other nonsynonymous SNPs and represented a newly identified allele, CYP2B6*9. The previously reported 64C>T and 12740G>C SNPs in exons 1 and 2, respectively, were found to be linked. Both SNPs occurred at frequencies of 9% and 14% in Caucasians and Hispanics, respectively, but were absent in African-Americans. A newly identified missense SNP, 13072A>G in exon 3, results in a Lys139Glu change and represented the previously unknown allele CYP2B6*8. This SNP was found only in a single Caucasian. In exon 5, two SNPs were identified: a silent 18000C>T and a 18053A>G that resulted in a Lys262Arg replacement. The Lys262Arg allele (CYP2B6*4) was found in 5% of Caucasians, 17% of African-Americans, and 14% of Hispanics. This finding was in contrast to an earlier report of a frequency of 32.6% in German Caucasians. The last exonic SNP that we observed was the previously reported 25505C>T substitution, which leads to an Arg487Cys change and was found in 13% of Caucasians and 9% of African-Americans, but was absent in Hispanics and the single Asian subject.
Analysis of SNPs in CYP2B6 5′-Flanking Sequences. Because SNPs in DNA regulatory sequences can have an impact on gene expression, we resequenced a total of 3.8 kb (the proximal –2.4 kb and a distal 1.2 kb [from –7.9 to –8.95 kb]) of the 5′-flanking region. The region resequenced included the proximal CAR-binding site (the phenobarbital-responsive enhancer module, PBREM) and the distal one (xenobiotic-responsive enhancer module, XREM) (Wang et al., 2003). We identified 10 SNPs, nine of which were present in at least 1% of subjects in an ethnic group (Fig. 4). No SNPs were discovered in either the PBREM or the XREM. The most frequent of these SNPs was a –750T>C substitution, which occurred in 48%, 83%, and 79% of Caucasians, African-Americans, and Hispanics, respectively (Table 4). The next most common alleles were those that possessed the –1456T>C and the –2320T>C changes. The SNP at –2320 lies within a putative HNF4-binding site, whereas the SNPs at –750 and –757 lie within putative binding sites for HNF1 and Sp-1, respectively (Fig. 4). The SNP at –8427 (T>C) resides in a glucocorticoid receptor-binding site (T/Cgtgtc). The SNP at –8207 is immediately downstream of a glucocorticoid receptor-binding site and within a putative nuclear hormone receptor binding motif (DR2) as predicted by NUBIscan (Podvinec et al., 2002).
Several SNPs appear to be specific to particular ethnic groups. For example, the SNP at –1224 and the SNPs in exons 3 and 5 were found only in Caucasians. African-Americans and Hispanics had several SNPs that were not seen in Caucasians (e.g., SNPs at –8427, –1578, –757). The exon 9 SNP was not present in any of the Hispanics studied.
Association between CYP2B6 Genotype and Phenotype. Inclusive recruitment from racially diverse populations without stratification can confound association studies because of the variance in allele frequency or the presence or absence of variants in one population but not in another. Therefore, we determined whether genotypes were associated with phenotypes in the entire study population and in stratified populations. CYP2B6 genotype-phenotype associations were not observed when data from the entire study population were analyzed. However, when the phenotypes were first stratified for sex alone or for sex and ethnicity, associations between phenotypes and genotypes were observed (Table 5). Three SNPs were significantly associated with CYP2B6 expression (Table 6). The Arg487Cys nonsynonymous change resulted in significantly lower CYP2B6 activity in Caucasian females who were heterozygous for the Arg487 allele. The SNP in the intron 3 branch site was also associated with lower levels of CYP2B6 activity in female subjects. For example, compared with those homozygous for the reference allele (15582CC), those subjects who were heterozygous (CT) (n = 32) and homozygous (TT) (n = 5) for the variant had 1.46-fold and 1.85-fold lower levels of CYP2B6 protein, respectively.
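To read the fold differences above as fractions of the reference level, a value that is k-fold lower can be taken as 1/k of the reference. The following Python sketch shows only that arithmetic; the function name is illustrative, and only the fold values come from the text.

```python
def fraction_of_reference(fold_lower: float) -> float:
    """Convert an 'x-fold lower' value into a fraction of the reference level."""
    if fold_lower <= 0:
        raise ValueError("fold difference must be positive")
    return 1.0 / fold_lower


# Fold-lower CYP2B6 protein values reported for the 15582C>T genotypes.
reported = {"CT": 1.46, "TT": 1.85}

for genotype, fold in reported.items():
    print(f"{genotype}: {fraction_of_reference(fold):.0%} of the CC level")
```

Read this way, heterozygotes retain roughly two thirds, and variant homozygotes roughly half, of the protein level seen in 15582CC subjects.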
Additional SNPs were correlated with decreased CYP2B6 protein levels, but their association did not reach statistical significance. The –750T>C SNP in the putative HNF1-binding site showed a modest effect on CYP2B6 protein levels. Compared with persons homozygous for the –750T allele, those with one or two alleles containing the –750C SNP had 1.42-fold and 1.81-fold less CYP2B6 protein, respectively (Fig. 5). The SNP in exon 1, which resulted in the Arg22Cys substitution, was associated with higher CYP2B6 activity in all females but not in males. A potential confounding factor was that some samples were from persons who may have been undergoing treatment with CYP2B6 inducers at the time that the liver tissue was donated. However, the dose and duration of exposure were not known, and no consistent effect on CYP2B6 expression could be seen among those with potential exposure to CYP2B6 inducers. Finally, we made no adjustments for multiple statistical comparisons, because such strict control of the type I error rate would increase the chance of missing a potentially significant association. An additional confounding factor inherent in our study is that African-Americans and Hispanic-Americans are heterogeneous populations because of the genetic admixture with European-Americans. Thus, it will be important to verify these results with larger studies designed with sufficient power to confirm these associations.
Linkage Disequilibrium (LD) and Haplotype Structure of CYP2B6. In our study, LD referred to the preferential association of two SNPs within a population. Fisher's exact test was used to detect significant pairwise LD between many of the SNPs in CYP2B6 (Fig. 6). LD was highly significant between more sites in Caucasians than in African-Americans or Hispanics (Fig. 6). This result may be due to the lower number of African-American subjects or may reflect the fact that a greater number of haplotypes are typically seen in African-Americans. A high degree of linkage was observed between the SNP in the HNF4-binding site in the promoter (–2320T>C) and the SNP in the intron 3 branch point (15582C>T).
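A pairwise test of this kind can be sketched as a two-sided Fisher's exact test on a 2 × 2 table of allele co-occurrence counts. The Python sketch below is an illustration under assumptions, not the procedure used in the study: the text does not say how the tables were built, the function name is invented here, and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9)
    )


# Hypothetical counts (not from the study):
# rows = allele at SNP 1, columns = allele at SNP 2.
table = (8, 2, 1, 9)
print(f"P = {fisher_exact_two_sided(*table):.4f}")
```

A small P value indicates that the two alleles co-occur more (or less) often than expected under independence. It does not by itself measure the strength of LD.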
Using the six most common SNPs and multisite haplotype inference (Table 7), we identified 11 total and 8 main haplotypes. Haplotype refers to a series of SNPs that are linked on a single chromosome. Haplotypes I, III, IV, and VII accounted for 94% of the observed haplotypes in Caucasians. Haplotypes I through VII accounted for 95% of the haplotypes in African-Americans. Haplotype III (–2320T/int3 branch point C) was associated with high levels of CYP2B6 protein in Caucasian females (P < 0.049), but haplotype IV (–2320C/int3 branch point T) was associated with low quantities of CYP2B6 protein in Caucasian females (P < 0.035). Haplotype VII (exon 9, 1459C, Cys487) was significantly associated with low CYP2B6 activity in Caucasian females (P < 0.004).
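The P values above were not adjusted for multiple comparisons, a choice the authors made deliberately. For readers who want to see how a stricter threshold would behave, the sketch below applies a Bonferroni correction. Treating the three haplotype tests as the whole family of comparisons is an assumption made only for illustration (the true number of tests in the study is larger and is not stated here), and the reported values are upper bounds on P.

```python
def bonferroni_significant(
    p_values: dict[str, float], alpha: float = 0.05
) -> dict[str, bool]:
    """Flag which tests stay significant at alpha after Bonferroni correction."""
    if not p_values:
        raise ValueError("at least one P value is required")
    threshold = alpha / len(p_values)  # per-test significance threshold
    return {name: p < threshold for name, p in p_values.items()}


# Upper bounds on P reported in the text for Caucasian females.
reported = {"haplotype III": 0.049, "haplotype IV": 0.035, "haplotype VII": 0.004}

for name, significant in bonferroni_significant(reported).items():
    print(f"{name}: {'significant' if significant else 'not significant'}")
```

Under this illustrative three-test family, only the haplotype VII association would remain below the corrected threshold, which is consistent with the authors' call for larger confirmatory studies.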
Effect(s) of Nonsynonymous SNPs in CYP2B6 on CYP2B6 Protein Function. One new way to prioritize those SNPs that might have a functional consequence is to use computer programs such as SIFT (sorting intolerant from tolerant), a sequence homology-based tool that evaluates whether amino acid changes are in evolutionarily conserved or unconserved regions, and thus predicts whether any of the nonsynonymous SNPs would affect protein function (Ng and Henikoff, 2002). Therefore, we used this program to identify those SNPs that are most likely to affect CYP2B6 protein function. Our analysis of the amino acid sequences of CYP2B6 and those of 137 other CYP2 proteins predicted that only the exon 3 SNP that results in Lys139Glu would affect the function of CYP2B6 (Fig. 7). Interestingly, this substitution occurs at an amino acid (Lys) that is conserved in CYP2B in all species studied so far and is also conserved in members of the CYP2C family, e.g., human CYP2C9 and CYP2C19, and rabbit CYP2C5.
Many base changes resulted in replacement of a highly conserved amino acid by a different (but biochemically similar) amino acid present in the same position in another species (e.g., exon 4 CYP2B6 Q172 was changed to H172, which is found in monkey, pig, dog, and rabbit CYP2B orthologs at this position). Although none of the amino acid changes was found in the six putative substrate recognition sites of CYP2B6 (Gotoh, 1992), some investigators have suggested that mutations outside of the substrate recognition sites can affect substrate activity (Domanski et al., 1999). The Arg22Cys substitution occurs in the membrane anchor region, whereas the Gln172His replacement occurs in the amino acid hinge between the D′ and E helices. There are only four amino acids in CYP2B6 that are in regions that are highly evolutionarily unconserved across species. Notably, the Arg487Cys substitution occurs at one of these evolutionarily unconserved amino acids and was not predicted by the SIFT program to affect protein function. Nevertheless, the SNP responsible for this amino acid change is associated with low levels of CYP2B6 protein and activity (Table 6) (Ariyoshi et al., 2001; Auboeuf et al., 2002). Although recent reports have identified SIFT as the best predictor of SNPs likely to affect function (Leabman et al., 2003), it is important to consider that SIFT analysis provides only one context in which substitutions that are likely to affect CYP2B6 expression can be identified.
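The contrast between Lys139Glu and Gln172His can be illustrated with a crude ortholog lookup: is the variant residue already present at that position in another species? The Python sketch below is not the SIFT algorithm, which scores substitutions from a full multiple-sequence alignment. The residues at position 172 follow the description above. Listing Lys at position 139 for these same four species is an assumption based on the statement that Lys139 is conserved in CYP2B in all species studied. The function and variable names are hypothetical.

```python
# Residues at two CYP2B positions, keyed by species.
# Position 172 follows the text; position 139 assumes the same four species
# are among those in which Lys139 is conserved.
ORTHOLOG_COLUMNS = {
    139: {"human": "K", "monkey": "K", "pig": "K", "dog": "K", "rabbit": "K"},
    172: {"human": "Q", "monkey": "H", "pig": "H", "dog": "H", "rabbit": "H"},
}


def variant_seen_in_orthologs(position: int, variant: str) -> bool:
    """Return True if the variant residue occurs in any non-human ortholog."""
    column = ORTHOLOG_COLUMNS[position]
    return any(
        residue == variant
        for species, residue in column.items()
        if species != "human"
    )


for position, variant in [(139, "E"), (172, "H")]:
    seen = variant_seen_in_orthologs(position, variant)
    label = "seen in an ortholog" if seen else "not seen in any listed ortholog"
    print(f"{position}{variant}: {label}")
```

A residue that already occurs in orthologs, such as His172, is more likely to be tolerated. A change at an invariant position, such as Lys139Glu, is the kind of substitution that homology-based tools tend to flag.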
Discussion
We conducted a study to discover SNPs in CYP2B6 because this enzyme metabolizes a growing number of clinically important medications and because there is considerable variation in its expression in humans; this variation is believed to be due to genetic factors such as SNPs. Although a previous study identified polymorphisms in German Caucasians (Lang et al., 2001) and an Arg487Cys variant has been shown to be associated with low CYP2B6 expression in some persons, the frequency of this variant allele in the population could not explain the extent of variation in human CYP2B6 expression. In our study the Arg487Cys variant was also associated with low expression in liver tissue, but only in that of females. In addition, differences in CYP2B6 expression in the liver tissues of females were associated with a SNP in the intron 3 branch site (15582C>T) of CYP2B6 and a SNP in a putative HNF4-binding site (–2320T>C) in the promoter.
Our strategy of SNP discovery included the resequencing of mRNAs from liver samples of subjects from different ethnic groups as a means to discover sequence variation and to simultaneously characterize alternatively spliced mRNAs. Indeed, the completion of the draft sequence of the human genome has led to the idea that alternative splicing is an important mechanism of generating the complexity of proteins seen in humans. Analysis of alternatively spliced mRNAs can also reveal transcripts that are associated with polymorphic expression of genes (e.g., CYP3A5; Kuehl et al., 2001). We found CYP2B6 cDNAs that represented numerous aberrantly spliced transcripts, several of which had been previously identified (Miles et al., 1988, 1989; Yamano et al., 1989). The main SV, SV1, lacked sequence corresponding to exons 4, 5, and 6 of the CYP2B6 gene and was found in many of the samples analyzed. The presence of a SNP in exon 4 and a SNP in intron 3 was correlated with the appearance of SV1, and the intron 3 SNP was highly associated with CYP2B6 phenotype.
The intron 3 SNP appeared to enhance the proportion of mRNA transcripts skipping exons 4–6 in those persons with the exon 4 ESE SNP (Fig. 3a). RNA splicing depends not only on the donor and acceptor sequences at the exon/intron boundaries, but also on intronic sequences (branch sites) located within 40 nt of the AG splice acceptor. An invariant adenine of the branch site consensus region carries out a nucleophilic attack on the 5′-terminal G nucleotide of the intron at the splice donor site. The intron 3 15582C>T alters the consensus C that is immediately adjacent 3′ of the invariant A in the putative branch site sequence (Fig. 8). Previous studies have shown that mutations in many of the branch site nucleotides can abolish splicing or lead to a reduction in the proper splicing and to switching of the pattern of 3′ splice site selection even if the conserved A residue is not affected.
It is difficult to know the exact proportion of any of the CYP2B6 SVs in the pool of CYP2B6 transcripts because most of the SVs contained premature termination codons and therefore would probably undergo nonsense-mediated decay. This possibility could explain the relatively low abundance of these SVs. Several factors may have contributed to the skipping of exons 4 through 6. Splice site selection appeared to be related to both the SNP in the exon 4 ESE and the strength of the interaction between the splice site and the spliceosome, because the skipped intron 3 acceptor site was weaker (i.e., it had a lower Ri value) than the intron 6 acceptor site used in the alternative transcript. Other reports have shown that there is a preference for stronger acceptor sites in splicing (Rogan et al., 1998; Thompson et al., 2002).
The ability to relate the exon 4 SNP to CYP2B6 phenotype is complicated by the possibility that the exon 4 SNP has pleiotropic effects (Fig. 8). The disruption of the ESE by the SNP in exon 4 leads to the skipping of exon 4 and the generation of an alternatively spliced CYP2B6 mRNA. This SNP would be predicted to be associated with a smaller pool of appropriately spliced CYP2B6 mRNA and decreased CYP2B6 activity. However, the exon 4 SNP also disrupts a cryptic acceptor 3′ splice site in the reference sequence (AG/T). The cryptic 3′-splice acceptor AG is used to create some of the SVs that lack the first 29 bp of exon 4 (Fig. 3b). If the exon 4 SNP did not exert pleiotropic effects, then it would be predicted to decrease the production of these SVs, and this decrease would be expected to result in more of the appropriately spliced CYP2B6 mRNA. The exon 4 SNP, which results in the Gln172His substitution, was recently introduced into CYP2B6 cDNA, and the protein encoded by this variant allele had higher 7-ethoxycoumarin O-deethylase activity than the protein encoded by the reference allele (Ariyoshi et al., 2001; Jinno et al., 2003). These types of cDNA-based assays were unable to determine whether the exon 4 SNP affected splicing of the pre-mRNA. The results of the phenotype analysis in our study population showed that there was no relationship between this exon 4 SNP alone and CYP2B6 activity.
Our ability to identify the functionally important CYP2B6 genotypes is hampered by the fact that many of the SNPs are in LD and in multiple haplotypes. For example, a high degree of linkage was observed between two SNPs (e.g., the –2320T>C SNP in the putative HNF4-binding site and the intron 3 branch point SNP). Because multiple haplotypes exist, the SNPs in LD in one haplotype may or may not be associated with SNPs in another haplotype. It must also be considered that some phenotypes are the sum of regulatory and splicing genotypes, as there is growing evidence that transcription and splicing are coincident. Because transcriptional regulation can be important for splicing choices (Maniatis and Reed, 2002), SNPs in the HNF4 binding site may influence splice-site selection.
It is possible that CYP2B6 transcription and splicing are coordinately regulated by nuclear hormone receptors and their coactivators (Auboeuf et al., 2002). This possibility is particularly intriguing in light of the recent identification of PGC-1α (peroxisome proliferator-activated receptor γ coactivator 1α) as a coactivator of the gene coding for the CAR receptor (a master regulator of CYP2B) (Shiraki et al., 2003). PGC-1α contains an SR domain (found in the family of splicing factors that bind to ESEs) and an RNA-binding domain, and thus may function to help couple splicing to transcription. PGC-1α colocalizes with splicing factors in the nucleus and interacts with SR proteins. CYP2B6 promoter-dependent recruitment of regulatory splicing factors, such as PGC-1α, to CAR could influence CYP2B6 transcription and the selection of splice sites.
Significant differences in CYP2B6 expression in liver were found between sexes. Indeed, CYP2B6 activity was below quantifiable limits in 7.1% of female and in 20% of male liver microsomes. This difference appears to translate to humans in vivo because a difference between sexes in metabolism of the anticancer drug ifosfamide, a CYP2B6 substrate, has been reported (Schmidt et al., 2001). The relative level of CAR mRNA, a master regulator of CYP2B (Wei et al., 2000), is higher in females than in males (Fig. 1), and could directly affect differences between the sexes in the rate of CYP2B6 transcription. In rodents, sexually dimorphic expression of hepatic CYP2B is regulated by the pattern of growth hormone secretion (Teglund et al., 1998). In humans, the patterns of growth hormone secretion are sexually dimorphic (Jaffe et al., 1998), and growth hormone secretion is a regulator of CYP expression (Jaffe et al., 2002). However, whether the differences between the sexes in growth hormone secretion in humans regulate either CAR or CYP2B expression remains to be determined. CYP2B6 is the main enzyme involved in the hydroxylation of efavirenz (an inhibitor of HIV-1 reverse transcriptase) (Ward et al., 2003) and in the bioactivation of ifosfamide (Schmidt et al., 2001). Therefore, the effectiveness of efavirenz and ifosfamide may be lower and higher, respectively, in females (especially Hispanic females) than in males.
It was important to look for relationships between CYP2B6 genotype and phenotype not just in the combined study population, but also in groups stratified on the basis of sex and of ethnicity, as others have done (Stengard et al., 2002). This stratification according to sex and ethnicity was crucial because our results clearly demonstrated that liver tissues of females express significantly higher amounts of CYP2B6 than do liver tissues of males, that CYP2B6 expression varied among the different ethnic groups, and that the frequencies of the SNPs and haplotypes differed among ethnic groups. The reason that we detected a genotype-phenotype association in females but not in males may be the high level of CYP2B6 expression in females and the low level in males. However, it is also possible that CYP2B6 variability depends on a combination of SNPs that differs in each ethnic group, and that a particular SNP depends on another gene product (e.g., CAR) or on the environment (e.g., sex hormones) to exert its effect. Our results also suggest that no single SNP is associated with CYP2B6 activity in both sexes. Therefore, we should not necessarily be asking which SNPs are absolute markers of CYP2B6 phenotype, but rather which genetic variants are associated with CYP2B6 expression in each ethnic population stratified by sex.
Acknowledgments
We thank the Hartwell Center for much of the DNA sequencing, Dr. Sean Ekins for insightful comments, and Wenjian Yang for help with the haplotyping.
Footnotes
- ABBREVIATIONS: SNP, single nucleotide polymorphism; CAR, constitutive androstane receptor; PCR, polymerase chain reaction; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; SV, splice variant; ESE, exonic splicing enhancer; SR, serine/arginine-rich; PBREM, phenobarbital-responsive enhancer module; XREM, xenobiotic-responsive enhancer module; HNF, hepatic nuclear factor; Sp-1, simian virus 40 promoter factor 1; LD, linkage disequilibrium; SIFT, sorting intolerant from tolerant; PGC-1α, peroxisome proliferator-activated receptor γ coactivator 1α.
- Normal human liver and hepatocytes were obtained through the Liver Tissue Procurement and Distribution System (Pittsburgh, PA), which was funded by National Institutes of Health (NIH) Contract N01-DK-9-2310.
- This work is supported in part by NIH Grant GM60346; by the NIH/National Institute of General Medical Sciences (NIGMS) Pharmacogenetics Research Network and Database (U01GM61374, http://pharmgkb.org) under grant U01 GM61393; by NIH Grant P30 CA21765; by NIH Grant ES 10855; by NIH Grant CA51001; and by the American Lebanese Syrian Associated Charities (ALSAC).
- DOI: 10.1124/jpet.103.054866.
- Received May 21, 2003.
- Accepted August 22, 2003.
- The American Society for Pharmacology and Experimental Therapeutics