Date: 2024-02-06

Results

Alleles of genes exhibit similar burst kinetics but have different coordination during iPSC reprogramming

Next, we investigated whether the two alleles of a gene have similar burst kinetics. To explore this, we profiled burst frequency and burst size at the allelic level for the biallelic bursty genes using SCALE. SCALE relies on a “two-state model” of transcription, in which a gene switches between an inactive and an active state with an activation rate of Kon and a deactivation rate of Koff. When a gene is in the active state, the rate of transcription is S and the rate of RNA decay is d (Fig 1A). Burst frequency is defined as the number of bursts per unit time (Kon), and burst size as the average number of mRNA molecules produced per burst while the gene is in the active state (S/Koff) (Fig 1A). We observed a high degree of correlation between the two alleles for both burst frequency (r = 0.64–0.778) and burst size (r = 0.669–0.803) across all day points (Fig 2A and B). Very few genes exhibited significant differences in burst frequency or burst size between the two alleles, as marked by the red triangles (Fig 2A and B). Taken together, our results suggest that the alleles of most genes have similar burst kinetics.

Next, we explored the degree of coordination of bursting between the two alleles by plotting the percent of cells expressing neither allele (p0) versus the percent of cells expressing both alleles (p2) (Fig 2C). The blue diagonal line represents perfect coordination (p0 + p2 = 1), whereas the red curve signifies independent bursting with shared kinetics (Fig 2C).
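The burst frequency and burst size definitions of the two-state model described above can be illustrated with a minimal Python sketch. It is not SCALE itself and does not estimate parameters from data; the class name `TwoStateAllele`, its field names, and the numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoStateAllele:
    """Two-state model parameters for one allele (illustrative names).

    All rates are assumed to share the same, arbitrary time unit.
    """
    k_on: float   # activation rate (inactive -> active), Kon in the text
    k_off: float  # deactivation rate (active -> inactive), Koff in the text
    s: float      # transcription rate while the gene is active, S in the text

    def __post_init__(self) -> None:
        # Rates must be positive for the ratios below to be meaningful.
        if min(self.k_on, self.k_off, self.s) <= 0:
            raise ValueError("k_on, k_off and s must be positive")

    @property
    def burst_frequency(self) -> float:
        # Burst frequency as defined in the text: bursts per unit time (Kon).
        return self.k_on

    @property
    def burst_size(self) -> float:
        # Burst size as defined in the text: mean mRNAs per burst (S / Koff).
        return self.s / self.k_off


# Hypothetical parameter values for the two alleles of one gene.
allele_129 = TwoStateAllele(k_on=0.5, k_off=2.0, s=20.0)
allele_cast = TwoStateAllele(k_on=0.4, k_off=2.5, s=20.0)

for name, allele in (("129S1", allele_129), ("CAST", allele_cast)):
    print(name, allele.burst_frequency, allele.burst_size)
```

Comparing the two derived quantities per gene, rather than the raw rates, mirrors the allele-versus-allele comparison plotted in Fig 2A and B.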
We categorized genes into three categories based on the degree of coordination of allelic bursting: (1) highly coordinated genes, with p0 + p2 > 0.90, marked by gray asterisks between the blue dotted diagonal lines; (2) independent genes, which lie near the red curve within a threshold of +0.05 (upper red curve) and −0.05 (lower red curve), marked by rosewood triangles; and (3) semi-coordinated genes, which lie between the upper red curve and the lower blue dotted diagonal line, marked by Persian blue dots (Fig 2C).

Figure 2. Profiling of allelic burst kinetics and coordination. (A, B) Plots representing the correlation between (A) allelic burst frequency across different stages of reprogramming: day 0 MEF (r = 0.725), day 8 (r = 0.705), day 9 (r = 0.689), day 10 (r = 0.64), day 12A (r = 0.69), day 12B (r = 0.69), and induced pluripotent stem cells (iPSCs) (r = 0.778); and (B) allelic burst size in day 0 MEF (r = 0.741), day 8 (r = 0.757), day 9 (r = 0.747), day 10 (r = 0.772), day 12A (r = 0.66), day 12B (r = 0.80), and iPSCs (r = 0.76). Genes that exhibit significant differences in burst frequency or burst size between the two alleles are marked by red triangles. (C) Smooth scatterplots representing bursting coordination between the two alleles of genes for day 0 MEF, day 8, day 9, day 10, day 12A, day 12B, and iPSCs. The percent of cells expressing neither allele (p0) is plotted against the percent of cells expressing both alleles (p2); the blue diagonal line represents perfect coordination (p0 + p2 = 1), whereas the red curve signifies independent bursting with shared kinetics. Categories of genes based on allelic bursting coordination: low p0 and high p2 (green filled squares); perfectly coordinated (p0 + p2 > 0.90, gray asterisks between the blue dotted diagonal lines); independent (rosewood triangles between the upper and lower red curves, which mark thresholds of +0.05 and −0.05, respectively); and semi-coordinated (Persian blue dots).

We observed that most genes exhibited semi-coordinated allelic bursting at all day points. However, many genes also showed highly coordinated allelic bursting across the different stages of reprogramming (Fig 2C).
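The three coordination categories can be sketched as a small classifier. This is an illustrative reading of the thresholds, not the authors' code, and it rests on several assumptions: the red independence curve is taken to be p2 = (1 − √p0)², which follows if each allele is detected independently with the same probability; the ±0.05 threshold is applied as a vertical offset in p2; the highly coordinated rule takes precedence where categories overlap; and points below the lower red curve are returned as "unclassified". The function names are hypothetical.

```python
import math


def independent_p2(p0: float) -> float:
    """p2 expected under independent bursting with shared kinetics.

    Assumption: each allele is detected independently with the same
    probability q, so p0 = (1 - q)**2 and p2 = q**2.
    """
    q = 1.0 - math.sqrt(p0)
    return q * q


def classify_coordination(p0: float, p2: float,
                          coord_cutoff: float = 0.90,
                          indep_margin: float = 0.05) -> str:
    """Assign a gene to a coordination category from p0 and p2 (fractions)."""
    if not (0.0 <= p0 <= 1.0 and 0.0 <= p2 <= 1.0 and p0 + p2 <= 1.0 + 1e-9):
        raise ValueError("p0 and p2 must be fractions with p0 + p2 <= 1")
    # Highly coordinated: p0 + p2 > 0.90 (checked first by assumption).
    if p0 + p2 > coord_cutoff:
        return "highly coordinated"
    expected = independent_p2(p0)
    # Independent: within +/- 0.05 of the independence curve (vertical offset).
    if abs(p2 - expected) <= indep_margin:
        return "independent"
    # Semi-coordinated: above the upper red curve, below the coordination band.
    if p2 > expected + indep_margin:
        return "semi-coordinated"
    # Below the lower red curve: not covered by the three categories.
    return "unclassified"


# Hypothetical (p0, p2) pairs for three genes.
for p0, p2 in [(0.50, 0.45), (0.25, 0.25), (0.30, 0.45)]:
    print(p0, p2, classify_coordination(p0, p2))
```

Under these assumptions the independence curve meets the diagonal at p0 = 1 and p2 = 1, so genes that are almost never or almost always expressed from both alleles fall in the highly coordinated band regardless of how the alleles interact.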

Coordinated allelic bursting is linked to chromatin accessibility

To understand the mechanisms of allelic bursting coordination, we asked whether the degree of coordination of allelic bursting is linked to epigenomic states. To address this, we profiled genome-wide allelic chromatin accessibility across different stages of MEF to iPSC reprogramming through allele-specific analysis of available ATAC-sequencing (ATAC-seq) datasets (Talon et al, 2021). The same hybrid MEFs (129S1 × CAST) described for scRNA-seq were used for this experiment, allowing us to profile chromatin accessibility at the allelic level. We analyzed ATAC-seq in MEFs (day 0) and across reprogramming stages (SSEA1+ reprogramming intermediates at days 8, 9, 10, 12, and iPSCs), as in the burst kinetics analysis (Fig 4A).

We first validated our allele-specific ATAC-seq analysis pipeline by quantifying the difference in the enrichment of ATAC-seq reads between the active-X (CAST allele) and the inactive-X (129S1 allele) (Fig S3A). Consistent with previous reports, in MEFs and early reprogramming intermediates, the active-X (CAST) showed strong enrichment of ATAC-seq reads, whereas the inactive-X (129S1) showed almost no enrichment; upon reactivation of the inactive-X towards the attainment of iPSCs, the inactive-X gained chromatin accessibility (Fig S3A). Taken together, enrichment analysis of ATAC-seq reads of X-linked genes validated the accuracy of our method.

Next, we compared the enrichment of ATAC-seq reads between the two alleles of the different categories of genes (highly coordinated, semi-coordinated, and independent) across the gene body, 3 kb upstream of TSS, and 3 kb downstream of TES during reprogramming (Fig 4A). Interestingly, our analysis revealed that the two alleles of highly coordinated genes have very similar enrichment at most day points, whereas the enrichment of ATAC-seq reads differed between the alleles of semi-coordinated and independent genes in most cases (Fig 4A).
As expected, allelic enrichment of ATAC-seq reads considering all autosomal genes was quite similar (Fig 4A).

Figure 4. Comparison of allelic chromatin accessibility of highly coordinated, semi-coordinated, and independent genes. (A) Quantitative analysis of allelic accessibility enrichment in the gene body and across 3 kb upstream of TSS and 3 kb downstream of TES of all autosomal genes, highly coordinated, semi-coordinated, and independent genes throughout different stages of reprogramming: day 0, day 8, day 10, day 12, and induced pluripotent stem cells. In the boxplots, the line inside each box denotes the median value, and the edges of each box represent the 25th and 75th percentiles of the dataset, respectively (Mann–Whitney U test: P-value < 0.0001, ****; P-value < 0.001, ***; P-value < 0.01, **; and P-value < 0.05, *). (B) Allelic accessibility enrichment analysis in the gene body and across 3 kb upstream of TSS and 3 kb downstream of TES of genes that remain semi-coordinated or highly coordinated throughout reprogramming (Mann–Whitney U test: P-value < 0.05, *).

Figure S3. Related to Fig 4. (A) Allelic chromatin accessibility for X-linked genes. Quantification of enrichment of allelic accessibility across the gene body and 3 kb upstream of TSS and 3 kb downstream of TES of X-linked genes at all day points of reprogramming. In the boxplots, the line inside each box signifies the median value, whereas the edges of each box denote the 25th and 75th percentiles of the dataset (Mann–Whitney U test: P-value < 0.0001, ****). (B) Plots representing allelic accessibility enrichment for genes that converted from semi-coordinated on all days to highly coordinated in induced pluripotent stem cells. In the boxplots, the line inside each box denotes the median value, and the edges of each box represent the 25th and 75th percentiles of the dataset, respectively (Mann–Whitney U test: P-value < 0.001, ***; P-value < 0.01, **; and P-value < 0.05, *).
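The allelic comparisons in Fig 4 and Fig S3 use the Mann–Whitney U test with the significance symbols listed in the captions. The sketch below shows one way such a comparison could be computed with the Python standard library. It is an assumption-laden illustration, not the authors' pipeline: it uses the two-sided normal approximation with tie correction and no continuity correction, so its P-values can differ from exact tests reported by statistics packages, especially for small samples. The function names and input values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic for x, two-sided P-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    # Pool the samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this group of tied values
        avg_rank = (i + 1 + j) / 2.0   # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0.0:                   # all values identical: no evidence
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2.0))


def significance_stars(p: float) -> str:
    """Map a P-value to the star notation used in the figure captions."""
    for cutoff, stars in ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*")):
        if p < cutoff:
            return stars
    return "ns"  # label for non-significant results is an assumption


# Hypothetical per-gene enrichment values for the two alleles.
enrichment_129 = [1.2, 0.8, 1.5, 1.1, 0.9, 1.3]
enrichment_cast = [2.1, 1.9, 2.4, 1.7, 2.2, 2.0]
u_stat, p_value = mann_whitney_u(enrichment_129, enrichment_cast)
print(u_stat, p_value, significance_stars(p_value))
```

For real analyses, an established implementation such as `scipy.stats.mannwhitneyu` would normally be preferable to a hand-written approximation.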
Notably, genes that maintained semi-coordinated bursting throughout reprogramming maintained differences in allelic accessibility (Fig 4B). Similarly, genes that maintained highly coordinated bursting throughout reprogramming always maintained similar allelic accessibility (Fig 4B). Furthermore, for genes that were semi-coordinated at all other day points but became highly coordinated in iPSCs, allelic accessibility differences were reduced in iPSCs (Fig S3B). A similar trend was found for genes that switched from highly coordinated in MEFs to semi-coordinated on other days (data not shown). Together, our analysis suggested a positive correlation between the coordination of allelic bursting and the similarity of allelic chromatin accessibility.

Next, we explored whether allelic accessibility differences in semi-coordinated and independent genes are associated with differential binding of transcription factors (TFs) between the alleles of individual genes. To test this, we determined TF binding scores for the individual alleles of a gene using TOBIAS (Transcription factor Occupancy prediction By Investigation of ATAC-seq Signal). Interestingly, we found that many TFs had significantly different binding scores between the individual alleles of semi-coordinated and independent genes, which was not the case for highly coordinated genes (Fig 5). Taken together, our analysis suggests that allelic accessibility differences in semi-coordinated or independent genes allow differential binding of certain TFs between the alleles of a gene, which in turn leads to semi-coordinated or independent transcriptional bursting.

Figure 5. Binding kinetics of TFs between alleles correlate with the allelic bursting coordination. Left: Plots showing the correlation of TF binding scores between alleles of highly coordinated, semi-coordinated, and independent genes throughout different stages of reprogramming: day 0, day 8, day 10, day 12, and induced pluripotent stem cells. TFs that exhibit significant differences in binding between the two alleles are marked by red triangles. Right: Venn diagram representing the comparison of TFs that exhibit significant differences in binding between the two alleles across highly coordinated, semi-coordinated, and independent genes.