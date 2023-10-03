Our study aimed to investigate whether genetic variants around the HBA1 locus at 16p13.3, which are associated with erythrocyte indices and HbA1c levels, predict α-thalassemia-related erythrocyte indices, cardiometabolic traits, and diabetes risk in Taiwanese individuals. We analyzed Taiwan Biobank data, including whole-genome sequencing from 1,493 participants and genotyping arrays from 129,542 individuals. First, we performed regional association analysis using whole-genome sequencing data to identify genetic variants significantly associated with erythrocyte indices, confirming their linkage disequilibrium with the α0-thalassemia --SEA deletion mutation, a common cause of α-thalassemia in Southeast Asian populations. Deletion mutation sequencing further validated these variants’ association with α-thalassemia. Subsequently, we analyzed genotyping array data, revealing associations between specific genetic variants and cardiometabolic traits, including lipid profiles, HbA1c levels, bilirubin levels, and diabetes risk. Using Mendelian randomization, we established causal relationships between α-thalassemia-related erythrocyte indices and cardiometabolic traits, elucidating their role in diabetes susceptibility. Our findings highlight genetic variants around the α-globin genes as surrogate markers for common α-thalassemia mutations in Taiwan, emphasizing the causal links between α-thalassemia-related erythrocyte indices, cardiometabolic traits, and heightened diabetes risk.

For MR analyses, we selected rs375498857-related cardiometabolic traits (with a P-value threshold of 10⁻⁴). In the 2SLS IV analysis of the direction and causality between α-thalassemia-related erythrocyte indices and cardiometabolic traits, we examined whether the associations of PGAP6 rs375498857 genotypes with cardiometabolic traits remained significant after additional adjustment for MCV or MCH (Table 5). The associations between the rs375498857 genotype and LDL cholesterol levels and DM subsided after adjustment for MCH. Moreover, the associations between the rs375498857 genotype and total cholesterol levels, total bilirubin levels, and HbA1c subsided after adjustment for MCV (all P > 0.05). The association between the rs375498857 genotype and HDL cholesterol levels was not completely abolished after adjustment for either MCH or MCV (P = 2.22 × 10⁻¹⁰ and P = 0.0008, respectively). These results suggest that the associations of the rs375498857 genotype with cardiometabolic traits and DM, with the exception of HDL cholesterol levels, depend on MCH or MCV. Moreover, consistent findings were observed in the same 2SLS analysis using either NPRL3 rs191086839 or LUC7L rs372755452 as the instrumental variable (Table S5).
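The logic of an instrumental-variable estimate can be illustrated with a minimal sketch on simulated data. Nothing here reproduces the study's analysis: the sample size, allele frequency, effect sizes, and the function name `wald_ratio_2sls` are assumptions chosen for illustration, and covariate adjustment is omitted. With a single instrument and no covariates, the 2SLS estimate reduces to the ratio of the genotype-outcome covariance to the genotype-exposure covariance.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def wald_ratio_2sls(genotype, exposure, outcome):
    """Single-instrument 2SLS estimate (hypothetical helper, no covariates)."""
    return covariance(genotype, outcome) / covariance(genotype, exposure)


def ols_slope(exposure, outcome):
    """Naive regression slope of outcome on exposure, for comparison."""
    return covariance(exposure, outcome) / covariance(exposure, exposure)


# Simulated data; every number below is an assumption for illustration only.
rng = random.Random(0)
n = 20000
true_effect = 0.5  # assumed causal effect of the exposure on the outcome

genotype, exposure, outcome = [], [], []
for _ in range(n):
    g = (rng.random() < 0.3) + (rng.random() < 0.3)  # allele count 0, 1, or 2
    u = rng.gauss(0, 1)                              # unmeasured confounder
    x = 1.0 * g + u + rng.gauss(0, 1)                # exposure (e.g., an erythrocyte index)
    y = true_effect * x - 1.0 * u + rng.gauss(0, 1)  # outcome (e.g., a metabolic trait)
    genotype.append(g)
    exposure.append(x)
    outcome.append(y)

iv_estimate = wald_ratio_2sls(genotype, exposure, outcome)
ols_estimate = ols_slope(exposure, outcome)
print(f"IV estimate:  {iv_estimate:.3f}")
print(f"OLS estimate: {ols_estimate:.3f}")
```

In this simulation the confounder biases the naive regression slope, whereas the genotype-based estimate stays close to the assumed effect because the genotype is generated independently of the confounder.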

Various metabolic traits were significantly associated with MCV and MCH, and most of the studied traits met the P-value threshold of 10⁻⁴. MCV and MCH showed consistent associations with nearly all studied traits, with MCV generally having the stronger effect. Elevated MCV and MCH were associated with increased uric acid levels and lipid measures, including total, HDL, and LDL cholesterol and triglyceride levels. Elevated MCV was associated with a reduced risk of metabolic syndrome and with reductions in most of the metabolic syndrome-related components, such as systolic blood pressure, hypertension, fasting plasma glucose level, HbA1c level, DM status, eGFR, albuminuria, and microalbuminuria. All liver function-related test results changed in the same direction with MCV and MCH (Table 4).

The α0-thalassemia --SEA deletion is the most common cause of α-thalassemia in Taiwan. We analyzed the association between the three significant SNVs and the α0-thalassemia --SEA deletion mutation detected through PCR in 1,474 TWB participants with WGS data (Fig 5M–O). For NPRL3 rs191086839 risk allele carriers, the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were 85.00%, 99.93%, 98.08%, and 99.37%, respectively. For LUC7L rs372755452 risk allele carriers, the sensitivity, specificity, PPV, and NPV were 90.00%, 99.79%, 94.74%, and 99.58%, respectively. For PGAP6 rs375498857 risk allele carriers, the sensitivity, specificity, PPV, and NPV were 85.00%, 99.65%, 91.07%, and 99.37%, respectively (Table S3).
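The four diagnostic measures above follow directly from a 2 × 2 table of risk allele carrier status against PCR-confirmed deletion status. The sketch below shows the arithmetic. The cell counts are not reported in the text; they are assumptions back-calculated so that each table sums to 1,474 participants and matches the reported percentages, and the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # carriers among deletion-positive participants
        "specificity": tn / (tn + fp),  # non-carriers among deletion-negative participants
        "ppv": tp / (tp + fp),          # deletion-positive among carriers
        "npv": tn / (tn + fn),          # deletion-negative among non-carriers
    }


# ASSUMED counts (tp, fp, fn, tn), back-calculated from the reported
# percentages; they are illustrative and not taken from Table S3.
assumed_counts = {
    "NPRL3 rs191086839": (51, 1, 9, 1413),
    "LUC7L rs372755452": (54, 3, 6, 1411),
    "PGAP6 rs375498857": (51, 5, 9, 1409),
}

results = {name: diagnostic_metrics(*counts) for name, counts in assumed_counts.items()}

for name, metrics in results.items():
    formatted = ", ".join(f"{key} {100 * value:.2f}%" for key, value in metrics.items())
    print(f"{name}: {formatted}")
```

Under these assumed counts, the high NPVs reflect the low prevalence of the deletion in the sample, whereas the PPVs are driven by the small number of carriers without the deletion.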

(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O) Diagrams of α-thalassemia mutations for gene alignment (A, B), genotyping by BreakDancer v1.3.6 (C, D, E, F), PCR (G, H), direct DNA sequencing (I, J, K), and association results of chromosome 16p13.3 variants (M, N, O) and genotyping performed using BreakDancer v1.3.6 (L) with the α0-thalassemia --SEA deletion detected through PCR. (A) Genes are represented as black boxes and pseudogenes as white boxes. The α-thalassemia mutations are represented as grey lines. (B) Diagrams illustrate the structure of the α-globin gene cluster on chromosome 16 with α-thalassemia mutations. (G, H) Agarose gels show representative results of PCR assays. Sizes of amplified fragments are expressed in base pairs (bp). Lane N, negative control; lane M, 100-bp DNA ladder H3 RTU (GeneDireX, Inc.). (G) Lanes 1, 2, 5, and 7 indicate α-thalassemia --SEA heterozygotes because of the presence of the deletion-specific 188-bp band and a 314-bp band obtained from the control DNA sequence. Lanes 3, 4, 6, and 8 indicate participants without deletion mutations, who provided only the 314-bp band. (H) Lane 1 indicates an α-thalassemia --THAI heterozygote because of the presence of the deletion-specific 411-bp band and a 314-bp band. Lanes 2–5 indicate α-thalassemia --FIL heterozygotes because of the presence of the deletion-specific 550-bp band and a 314-bp band. Lane 6 indicates a participant without a deletion mutation.

In total, the data of 1,493 TWB volunteers were included in the study on the association of genotypes and phenotypes with RBC parameters (Table 2 and Fig 4). Using general linear regression with an additive model, we determined that after adjustment for age, sex, body mass index (BMI), and smoking status, participants harboring the minor allele of the three variants (i.e., the C allele for NPRL3 rs191086839, a single-nucleotide deletion for LUC7L rs372755452, and the A allele for PGAP6 rs375498857) exhibited a genome-wide significant association (P < 5 × 10⁻⁸) with a higher erythrocyte count and lower Hb, MCV, MCH, and mean corpuscular hemoglobin concentration values, and a subthreshold significant association (P between 5 × 10⁻⁸ and 1 × 10⁻⁴) with lower HCT values (Table 2). Furthermore, participants harboring the minor allele of each variant had significantly higher risks of the microcytic hypochromic trait, anemia, and microcytic anemia (Table 2 and Fig 4A–I). Stepwise linear regression analysis for MCV and MCH revealed that the association with these erythrocyte traits markedly diminished for the PGAP6 rs375498857 genotype, but not for the other two variants (Table 3). These results suggest that the association between PGAP6 rs375498857 and the erythrocyte traits is attributable to its strong LD with NPRL3 rs191086839 and LUC7L rs372755452. Through a logistic regression analysis of all three variants, we determined a significant association of NPRL3 rs191086839 and LUC7L rs372755452 with the risk of the microcytic hypochromic trait (odds ratio [OR] = 40.87 and 20.2; 95% confidence interval [CI] = 7.26–229.99 and 2.13–191.3, respectively; Table S2) and of NPRL3 rs191086839 with the risk of microcytic anemia (OR = 132.74; 95% CI = 11.80–1,493.49). These results revealed that the NPRL3 rs191086839 variant was independently associated with all the erythrocyte traits analyzed.

We analyzed the association between genetic variants at positions between 0.06 and 0.68 Mb on chromosome 16p13.3 and the RBC parameters in TWB participants with WGS data. Our findings revealed that three single-nucleotide variations (SNVs), namely NPRL3 rs191086839, LUC7L rs372755452, and PGAP6 rs375498857, were specific to Asians (with minor allele frequencies ranging from 0.0177 to 0.0218 among TWB and East Asian populations versus <0.0001 among all other ethnic populations) (Table S1) and significantly associated with four RBC traits, namely the erythrocyte count, Hb level, mean corpuscular volume (MCV), and mean corpuscular hemoglobin (MCH), with the lowest P-values for these traits being 8.68 × 10⁻⁹³, 2.30 × 10⁻¹⁷, 1.98 × 10⁻¹⁰¹, and 1.76 × 10⁻¹²², respectively (Fig 2). MCV and MCH were thus selected for in-depth examination with cardiometabolic traits based on their most significant association with 16p13.3 and their impact on erythrocyte count and Hb level. The LD map revealed strong LD among these three variants (Fig 3) in our study population.

Discussion

This study explored the association of chromosome 16p13.3 variants with erythrocyte indices in the Taiwanese population. Our findings revealed that three significant SNVs at this chromosomal location around the HBA1 gene were closely associated with erythrocyte parameters, namely RBC count, Hb level, MCV, and MCH. The erythrocytes of TWB participants who carried the minor allele of these three variants were both microcytic and hypochromic. Using BreakDancer v1.3.6 for screening, followed by PCR amplification and direct sequencing, we confirmed that the α-thalassemia deletion mutation --SEA, the most common α-thalassemia mutation in the Taiwanese population, exhibited strong LD with these SNVs, with NPVs ranging from 0.994 to 0.996 and PPVs from 0.911 to 0.981. Thus, these SNVs can be considered crucial surrogate genetic markers for the α-thalassemia deletion mutation --SEA. We performed an MR study using PGAP6 rs375498857 as the IV in participants with whole-genome genotyping data. We observed causal relationships between MCV/MCH and rs375498857-related cardiometabolic traits and DM (Fig 3). This study is the first to demonstrate that α-thalassemia caused by the --SEA deletion mutation can be analyzed using SNVs as surrogate markers, obviating the need to directly genotype the deletion mutation. This approach can be applied to larger populations using genotyping analyses, such as array data, for the mass screening of α-thalassemia in adults or newborns, making it a powerful and effective tool for epidemiological research on α-thalassemia in Taiwan.

Linking three significant SNVs located on chromosome 16p13.3 with a deletion mutation of α-thalassemia

Previous studies have reported that the chromosome 16p13.3 locus is associated with erythrocyte traits (Kichaev et al, 2019; Vuckovic et al, 2020; Sinnott-Armstrong et al, 2021). By including TWB participants in their study, Lee et al (2022) revealed that the PGAP6 rs375498857 genotype is associated with several RBC traits. In our study, we observed that NPRL3 rs191086839, LUC7L rs372755452, and PGAP6 rs375498857 are associated with multiple erythrocyte indices, exhibit strong LD, and are specific to the Asian population. The minor allele frequencies of all three variants were <0.0001 in European populations but were between 0.0188 and 0.0218 in Asian populations in the 1000 Genomes Project (Table S1). Our findings indicate that the three significant SNVs may be linked to the common deletion mutation of α-thalassemia in the Taiwanese population. First, α-thalassemia is characterized by a deficit in α-globin chain synthesis and is most commonly caused by a deletion involving the HBA1 gene, which encodes α-globin and is located on chromosome 16p13.3, where the three SNVs are located. Second, thalassemia is prevalent in some regions of Asia and around the Mediterranean region but not in most European countries; these three SNVs are specific to the Asian population. Third, most TWB participants carrying the minor alleles of these three variants were both microcytic and hypochromic, with more than 90% of these participants having the microcytic hypochromic trait. Finally, more than 100 genetic forms of α-thalassemia have been identified, of which α0-thalassemia is usually the most common clinically relevant form (Giardine et al, 2014). In Taiwan, the --SEA type of the α0-thalassemia mutation is the most common deletion mutation, accounting for 69–91% of cases, which corresponds to a prevalence of ∼4.0–4.5% in Taiwanese individuals (Chen et al, 2002; Lee et al, 2015; Wang et al, 2017). All three SNVs are specific to the Asian population and highly linked to the microcytic hypochromic trait, with a minor allele frequency of ∼1.77–1.94%. This translates to a heterozygous genotype frequency of ∼3.5–3.8% (Table 2), which is close to the predicted prevalence of the α0-thalassemia mutation --SEA. The associations were further confirmed using both BreakDancer screening and PCR with direct sequencing, which revealed considerably high NPV (0.9937–0.9958) and PPV (0.9107–0.9808) values for the three SNVs. With the increasing health and economic burden of thalassemia in recent years because of population growth and epidemiologic transition, identifying surrogate genetic markers can be helpful for future mass screening and for developing preventive medicine strategies for thalassemia.
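The step from a minor allele frequency of ∼1.77–1.94% to a heterozygous genotype frequency of ∼3.5–3.8% can be checked with a short calculation. The sketch below assumes Hardy-Weinberg proportions, which the text does not state explicitly, so the expected heterozygote frequency is 2pq with p the minor allele frequency and q = 1 − p. The function name `heterozygote_frequency` is hypothetical.

```python
def heterozygote_frequency(maf):
    """Expected heterozygote frequency 2pq, ASSUMING Hardy-Weinberg proportions."""
    return 2 * maf * (1 - maf)


# Minor allele frequencies quoted in the text (as proportions).
for maf in (0.0177, 0.0194):
    het = heterozygote_frequency(maf)
    print(f"MAF {maf:.4f} -> expected heterozygote frequency {het:.4f}")
```

Because the minor allele is rare, the result is close to simply doubling the allele frequency, and homozygotes for the minor allele are expected to be very uncommon.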

The NPRL3 rs191086839 variant is the strongest and independent genetic surrogate marker for the α0-thalassemia deletion mutation --SEA in Taiwanese individuals

NPRL3 is a highly conserved gene located upstream of the HBA gene cluster. The α-globin locus of all mammalian species analyzed lies within a region of 135–155 kb of conserved synteny, with α-like genes arranged along the chromosome in the order 5′-ζ-α-α-3′ (Fig 3). The HBA cluster is located between the NPRL3 and LUC7L genes in almost all mammals except the mouse, in which the HBA cluster no longer has LUC7L downstream of the globin genes (Vernimmen, 2014; Philipsen & Hardison, 2018). The NPRL3 gene contains the enhancers of the HBA gene cluster. The erythroid-specific multispecies conserved sequences (MCSs) identified by DNase-hypersensitive sites are numbered from MCS-R1 to MCS-R4 (Hughes et al, 2005). Three of these elements (MCS-R1, MCS-R2, and MCS-R3) lie within the body of the NPRL3 gene, and MCS-R4 lies upstream of its promoter. MCS-R2 has multiple roles, which may be applicable to any other enhancer: the recruitment of polymerase II and key transcription factors at the promoter, formation of a looped structure involving several remote regulatory elements, and removal of repressive complexes, such as PcG. By performing functional analysis, Miyata et al (2020) demonstrated that an erythroid-specific enhancer is located in intron 7 of vertebrate NPRL3, which indicates a role for a remote enhancer within nprl3 in multiglobin gene expression. In multivariate analysis, NPRL3 rs191086839 was observed to have the strongest effect on various erythrocyte indices, including MCV and MCH, and on the microcytic hypochromic trait and microcytic anemia (Tables 3 and S2). Therefore, considering the robust evidence of co-inheritance reflected in the LD analysis, and given that surrogate markers are conventionally evaluated by their sensitivity and specificity, we propose NPRL3 rs191086839 as a strong and independent candidate genetic surrogate marker for the α0-thalassemia deletion mutation --SEA in Taiwanese individuals.

Association between α-thalassemia-related erythrocyte indices and metabolic traits

Our findings revealed that an elevated MCV was associated with lower risks of metabolic syndrome and some of its components and complications, such as DM, hypertension, microalbuminuria, and low HDL cholesterol levels. These results are compatible with previous reports that an elevated MCV was associated with lower risks of metabolic syndrome and visceral obesity (Tanaka et al, 2020, 2021). Metabolic syndrome and its related components have been reported to be associated with various adverse cardiovascular and cancer outcomes (Lakka et al, 2002; Emery et al, 2022). In addition, a high MCV was associated with elevated total and LDL cholesterol, uric acid, and triglyceride levels in our study. Previous studies have reported that macrocytosis was associated with the severity or poor prognosis of various cardio-renal diseases (Solak et al, 2013; Ueda et al, 2013; Hsieh et al, 2017; Wang et al, 2020). Similar trends of association were noted between MCH and most of the metabolic traits analyzed. These results suggest that MCV and MCH have diverse and bidirectional effects, resulting in both favorable and unfavorable cardiometabolic outcomes. Additional studies may be necessary to elucidate how the interaction between erythrocyte indices and various cardiometabolic risk factors affects the outcome of cardiovascular diseases.