Meta-analysis of dispensable essential genes and their interactions with bypass suppressors
Introduction
Identification of the genes required for viability is key for both fundamental and applied biological research. Essential genes constrain genome evolution (Jordan et al, 2002; Bergmiller et al, 2012; Luo et al, 2015), define core cellular processes (Wang et al, 2015), identify potential drug targets in pathogens and tumors (Roemer et al, 2003; Behan et al, 2019), and are the starting point to determine minimal genomes (Juhas et al, 2011; Hutchison et al, 2016). The fraction of essential genes within a genome reflects its complexity and redundancy and anticorrelates with the number of encoded genes (Rancati et al, 2018). For instance, 80% of 482 genes in Mycoplasma genitalium (Glass et al, 2006), 18% of ∼6,000 genes in S. cerevisiae (Giaever et al, 2002), and only 10% of the ∼20,000 genes in human cell lines (Blomen et al, 2015; Hart et al, 2015; Wang et al, 2015) are essential for viability. Essential genes tend to code for protein complex members (Dezso et al, 2003; Hart et al, 2007), play central roles in genetic networks (Costanzo et al, 2010), have few duplicates (Giaever et al, 2002), and share other properties (Deng et al, 2011; Hart et al, 2015) that differentiate them from non-essential genes, enabling their prediction (Hwang et al, 2009; Lloyd et al, 2015; Zhang et al, 2016). Although gene essentiality is significantly conserved, essentiality changes are frequent across species and even between individuals. For instance, 17% of the 1:1 orthologs between S. cerevisiae and Schizosaccharomyces pombe have different essentialities (Kim et al, 2010). Also, 57 genes differ in essentiality between two closely related S. cerevisiae strains (Dowell et al, 2010), and a systematic analysis of 324 cancer cell lines from 30 cancer types found that only ∼40% of essential genes were shared across cell lines (Behan et al, 2019). Thus, essentiality is not a static property, and changes in the genetic background can change the essentiality of a gene (Rancati et al, 2018).
Recently, we and others have systematically identified essential genes that become non-essential in the presence of suppressor mutations (i.e., genetic changes that bypass the requirement for the essential gene) in S. cerevisiae (Liu et al, 2015; van Leeuwen et al, 2020) and S. pombe (Li et al, 2019b; Takeda et al, 2019). We refer to these genes as dispensable essential genes (DEGs). DEGs differ from the remaining essential genes (i.e., core essential genes), and their bypass suppressors differ from passenger mutations (i.e., randomly acquired mutations without an effect on fitness). For instance, DEGs are more likely to have paralogs, to be absent in other species, and to encode members of smaller protein complexes compared with core essential genes (Liu et al, 2015; van Leeuwen et al, 2020), whereas suppressor genes tend to be functionally related to the DEG (van Leeuwen et al, 2020). We previously exploited the specific properties of these genes for their accurate prediction (van Leeuwen et al, 2020).
Identification of the suppressor genes responsible for bypassing the requirement for the essential gene is important to dissect the function of both genes (van Leeuwen et al, 2016), to expose the genetic architecture of phenotypic traits (Mackay, 2014; Wei et al, 2014), and to understand drug resistance mechanisms (Woodford & Ellington, 2007). Suppressor mutations could also explain the presence of presumably highly detrimental genetic variants in natural populations (Jordan et al, 2015; Narasimhan et al, 2015; Chen et al, 2016). For instance, highly penetrant disease-associated mutations are sometimes present in healthy individuals (Chen et al, 2016), and human pathogenic variants can be fixed in other mammalian species without obvious deleterious consequences (Jordan et al, 2015). However, whether suppression interactions identified in laboratory strains are relevant in natural evolutionary landscapes and could explain the presence of deleterious genetic variants in populations remains an open question.
Here, we compiled a comprehensive set of DEGs in S. cerevisiae identified across different studies to exhaustively compare their properties to core essential and non-essential genes, with a particular focus on phylogenetic features. We integrated bypass suppressor genes into an interaction network with DEGs to identify prevalent interaction motifs and to analyze the relationship of bypass suppression pairs in other species. This work presents a systematic characterization of DEGs and explores how their interactions with suppressors reflect evolution in natural populations.
Results
Dispensable essential gene datasets
We compiled a comprehensive list of DEGs in S. cerevisiae from two large-scale studies (Liu et al, 2015; van Leeuwen et al, 2020) and from individual cases described in the literature (van Leeuwen et al, 2020) (Fig 1A). We only considered studies in which gene essentiality was bypassed in a laboratory yeast strain, as these often involve a single causal bypass suppressor gene, and disregarded studies that focused on essentiality changes across natural yeast strains, which are frequently driven by complex combinations of genetic variants (Dowell et al, 2010; Chen et al, 2022; Wang et al, 2022). In total, 205 DEGs had been identified, representing ∼20% of all tested essential genes (Fig 1B). Cases of bypass suppression were identified by looking for rare survivors in populations of 100–150 million cells deleted for an essential gene (van Leeuwen et al, 2020; experimental dataset), by following germination of single deletion mutant spores (Liu et al, 2015), or by a mixture of methods, including directly testing the effect of a mutation on essential gene deletion mutant viability (van Leeuwen et al, 2020; literature dataset). These methodological differences could possibly affect the detected DEGs.
(A) Number of dispensable essential genes per individual dataset and their overlap. (B) Fraction of dispensable and core essential genes in the combined dataset. Labels include the number of genes in each category. (C) Enrichment of dispensable essential versus core essential genes (left column), dispensable essential versus non-essential genes (center column), and essential versus non-essential genes (right column) for a panel of 21 gene features. Top and bottom panels include numeric and binary features, respectively. Dot size is proportional to the median z-score and fold enrichment, respectively, and only enrichments with P < 0.05 (Mann–Whitney U test and Fisher’s exact test, respectively) and FDR < 10% are shown. The arrows indicate properties of dispensable essential genes not previously identified.
To determine whether the datasets could be merged, we compared various properties of the DEGs described in each dataset. The DEGs identified in the three datasets overlapped significantly (P < 0.001, randomization test). In all datasets, DEGs showed similar functional enrichments (Fig S1A) and were depleted for fundamental cellular processes like RNA processing or translation and enriched for more peripheral functions related to signaling or transport (P < 0.05, Fisher’s exact tests, and false discovery rate [FDR] < 10%). Furthermore, protein complexes tended to be either completely dispensable or indispensable across datasets (P < 0.05 in the combined dataset, randomization test, Fig S1B). For instance, the combined dataset contained 14 protein complexes with only dispensable essential subunits (Fig S1C), significantly more than expected by chance (P < 0.002, randomization test, Fig S1B). DEGs were more likely than core essential genes to be non-essential in the closely related S. cerevisiae strain Sigma1278b (P < 0.0005 in the combined dataset, Fisher’s exact test, Fig S1D), and to be absent in the S. cerevisiae core pangenome (P < 0.05 in the combined dataset, Fisher’s exact tests, Fig S1E). Because the properties of the combined and individual datasets were similar, we used the combined dataset in the following analyses.
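The randomization test used for protein complexes can be illustrated with a minimal Python sketch. It compares the observed number of complexes whose essential subunits are all dispensable with the number found in randomly selected gene sets of the same size and derives an empirical P-value. The function names, the toy gene and complex identifiers, and the add-one estimator are assumptions made for illustration and are not taken from the original analysis.

```python
import random

def count_fully_dispensable(complexes, dispensable):
    """Count complexes in which every essential subunit is dispensable."""
    return sum(
        1 for subunits in complexes.values()
        if subunits and subunits <= dispensable
    )

def empirical_p_value(complexes, tested_genes, dispensable, n_iter=1000, seed=0):
    """One-sided empirical P-value from random gene sets of the same size."""
    rng = random.Random(seed)
    observed = count_fully_dispensable(complexes, dispensable)
    pool = sorted(tested_genes)  # sorted so that sampling is reproducible
    hits = 0
    for _ in range(n_iter):
        random_set = set(rng.sample(pool, len(dispensable)))
        if count_fully_dispensable(complexes, random_set) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero (an assumption).
    return observed, (hits + 1) / (n_iter + 1)

# Toy data: hypothetical gene and complex names.
tested = {f"g{i}" for i in range(20)}
toy_complexes = {
    "cpx_a": {"g0", "g1"},
    "cpx_b": {"g2", "g3"},
    "cpx_c": {"g4", "g5", "g6"},
    "cpx_d": {"g7", "g8"},
}
toy_dispensable = {"g0", "g1", "g2", "g3"}

observed, p_value = empirical_p_value(toy_complexes, tested, toy_dispensable)
print(observed, p_value)
```

In this toy example, two complexes are fully dispensable, which random gene sets of four genes rarely reproduce. The same scheme applies to core essential complexes by counting complexes without any dispensable subunit.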
(A) Functional enrichments of each dataset of dispensable essential genes against a set of 14 broad functional classes. Significant enrichments correspond to cases with P < 0.05 (Fisher’s exact test) and FDR < 10%. (B) (top) Number of dispensable essential (i.e., all essential subunits are dispensable) and (bottom) core essential complexes (i.e., all essential subunits are core essential genes) for each dispensable gene set. The number of complexes in randomly selected gene sets of the same size is indicated in grey, from which we derived empirical P-values. (C) The 14 complexes for which all essential genes are dispensable in the combined dataset. The subunits are colored by their dispensability in each individual dataset. Note that for the literature dataset, we consider all essential genes as tested. (D) Fold enrichment of each dataset of dispensable essential genes for non-essential genes in the Sigma1278b S. cerevisiae strain. ***P < 0.0005 (Fisher’s exact test). (E) Overlap between dispensable essential genes and missing genes in the core pangenome, defined at different thresholds. Fold enrichment with respect to the corresponding core essential genes is shown. We also show the fold enrichment of absence in the pangenome for all core essential genes versus all dispensable essential genes and for non-essential genes versus essential genes. Significant enrichments are identified with a circle (P < 0.05, Fisher’s exact test). (F) Spearman’s correlation coefficients between the numeric features in the panel of gene features. Orange and blue dots identify significant positive and negative correlations, respectively (P < 0.05 and FDR < 10%).
Properties of dispensable essential genes
By querying an extensive panel of 21 gene features (see the Materials and Methods section, Fig S1F), we compared the properties of dispensable and core essential genes and found several significant differences (P < 0.05 using Mann–Whitney U tests for numeric features and Fisher’s exact tests for binary features, and FDR < 10%). DEGs tended to exhibit more stable gene expression levels and lower transcript counts, to be less conserved across species, to have more gene duplicates and higher evolutionary rates, and to be coexpressed with fewer genes than core essential genes. The proteins encoded by DEGs tended to be more multifunctional, to lack structural domains, to localize to a membrane, to be absent from protein complexes, and to have fewer protein–protein interactions, lower abundances, and shorter half-lives compared with those encoded by core essential genes (Fig 1C and Table S1). Interestingly, the observed differences between dispensable and core essential genes resembled the differences between non-essential and essential genes (Fig 1C and Table S1). Thus, we asked whether dispensable essential and non-essential genes shared the same properties and found that they comprised two different classes of genes with clearly distinct features (Fig 1C and Table S1). Broadly, features of DEGs fell between those of core essential and non-essential genes, consistent with and extending previous findings in a smaller dataset (Liu et al, 2015).
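For binary features, the comparison described above amounts to a 2 × 2 contingency test per feature followed by a multiple-testing correction. The sketch below implements a two-sided Fisher's exact test from the hypergeometric distribution and a Benjamini-Hochberg adjustment using only the Python standard library. The choice of Benjamini-Hochberg as the FDR procedure, the function names, and the example counts are assumptions; the text above does not specify them.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as likely as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

def fold_enrichment(a, b, c, d):
    """Fraction with the feature in set 1 divided by the fraction in set 2."""
    return (a / (a + b)) / (c / (c + d))

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical counts: genes with / without a feature in two gene sets.
p_single = fisher_exact_two_sided(8, 2, 1, 5)
fold = fold_enrichment(8, 2, 1, 5)
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(round(p_single, 4), round(fold, 2), [round(q, 4) for q in q_values])
```

A feature would then be reported only if its P-value is below 0.05 and its adjusted value is below 0.10, mirroring the thresholds stated above.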
Phylogenetic analysis of dispensable essential genes
We further explored the differences in gene conservation between dispensable and core essential genes using the phylogeny of S. cerevisiae, starting with a large panel of sequenced S. cerevisiae strains (Peter et al, 2018). DEGs were more likely than core essential genes, but less than non-essential genes, to harbor deleterious mutations disrupting protein sequences (P < 0.0005, Fisher’s exact test, Fig S2A), to present higher non-synonymous mutation rates (P < 0.0005, Mann–Whitney U test, Fig S2B), and to show copy number loss (CNL) events in other S. cerevisiae strains (P < 0.0005, Fisher’s exact test, Fig S2C). To further investigate differences in the evolutionary pressure on dispensable essential and core essential genes, we analyzed essentiality data and orthology relationships in Candida albicans, S. pombe, and human cell lines (Figs 2A and S2D and E and Table S2). Genes that were dispensable essential in S. cerevisiae were more often absent than core essential genes in each of the analyzed species (P < 0.0005, Fisher’s exact tests, Fig 2B). We hypothesized that this bias could be caused by: (i) genes specific to the S. cerevisiae phylogenetic branch and, thus, not present in their common ancestor or (ii) genes present in their common ancestor but lost in the phylogenetic branch of the analyzed species. To determine the contribution of each factor, we calculated the age of each S. cerevisiae gene by identifying the furthest species with an orthologous gene. DEGs were enriched for younger genes with respect to core essential genes (P < 0.0005, Mann–Whitney U test, Fig 2C), particularly for genes with no ortholog in any other species (i.e., specific to S. cerevisiae; P < 0.005, Fisher’s exact test, Fig 2D). Next, for each species, we defined lost genes as those absent in that species but present in its common ancestor with S. cerevisiae. We found DEGs were more often lost in other species than core essential genes (P < 0.0005, Mann–Whitney U test, Fig 2E). 
Thus, the absence of DEGs in other species can be explained both by genes specific to S. cerevisiae and by gene loss events in those species.
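The two definitions used above, gene age and gene loss, can be sketched as follows. Each gene is represented by a presence/absence vector over phylogenetic levels ordered from the closest to the most distant relatives of S. cerevisiae. Inferring that a gene was present in a common ancestor because an ortholog exists at a more distant level is a simplifying assumption of this sketch, as are the function names and the example vectors.

```python
def gene_age(presence):
    """Age of a gene: index of the most distant level with an ortholog.

    `presence` lists booleans ordered from the closest to the most distant
    level. Age 0 means that no ortholog was found outside S. cerevisiae.
    """
    age = 0
    for level, present in enumerate(presence, start=1):
        if present:
            age = level
    return age

def lost_levels(presence):
    """Levels at which the gene is absent although it is found further away.

    Presence at a more distant level is taken as evidence that the gene
    existed in the common ancestor (an assumption of this sketch).
    """
    age = gene_age(presence)
    return [
        level
        for level, present in enumerate(presence, start=1)
        if not present and level < age
    ]

# Hypothetical presence vectors over five levels.
species_specific = [False, False, False, False, False]
old_with_loss = [True, False, True, False, False]

print(gene_age(species_specific), lost_levels(species_specific))
print(gene_age(old_with_loss), lost_levels(old_with_loss))
```

Under this scheme, a gene absent from a species contributes either to the S. cerevisiae-specific class (age zero) or to the loss class, which separates the two explanations considered above.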
(A) Fraction of loss-of-function mutations across S. cerevisiae strains for non-essential, dispensable essential, and core essential genes. (B) Evolutionary rate (dN/dS) across S. cerevisiae strains for the three gene sets. (C) Fraction of copy number loss events across S. cerevisiae strains for the three gene sets. (D) Orthology relationships in C. albicans of dispensable essential and core essential genes. The fraction of absent, duplicated, N:1, and essential and non-essential 1:1 orthologs is shown for each gene set. (E) Orthology relationships in human of dispensable essential and core essential genes. The fraction of absent, duplicated, N:1, and essential and non-essential 1:1 orthologs is shown for each gene set. (F) Fraction of essential genes in S. cerevisiae that is non-essential in S. uvarum. (G) Orthology relationships in S. pombe of non-essential and essential genes. The fraction of absent, duplicated, N:1, and essential and non-essential 1:1 orthologs is shown for each gene set. (H) Fold enrichment of dispensable genes with respect to non-essential genes for absence, duplication, N:1 relationships, and non-essential 1:1 orthologs in S. pombe, C. albicans, and human. Blue and green bars identify significant enrichments (P < 0.05, Fisher’s exact test) with higher overlaps for dispensable and non-essential genes, respectively (see Table S2 for details). Grey bars identify non-significant enrichments. (I) Fold enrichment of non-essential genes with respect to essential genes for absence, duplication, N:1 relationships, and non-essential 1:1 orthologs in S. pombe, C. albicans, and human. Green and orange bars identify significant enrichments (P < 0.05, Fisher’s exact test) with higher overlaps for non-essential and essential genes, respectively (see Table S2 for details). Grey bars identify non-significant enrichments. (J) Ratio between the protein sequence length in S. cerevisiae and the 1:1 ortholog in S. pombe. 
The shorter length is divided by the longer one. (K) Protein sequence identity between gene products in S. cerevisiae and 1:1 orthologs in C. albicans. (L) Ratio between the protein sequence length in S. cerevisiae and the 1:1 ortholog in C. albicans. The shorter length is divided by the longer one. (A, B, C, F, J, K, L) CE, core essential; DE, dispensable essential; NE, non-essential. Statistical significance was calculated using Fisher’s exact (A, C, F) and Mann–Whitney U tests (B, J, K, L). n.s., not significant; *P < 0.05; ***P < 0.0005.
(A) Orthology relationships in S. pombe of dispensable and core essential S. cerevisiae genes. The fraction of absent, duplicated, N:1, and essential and non-essential 1:1 orthologs is shown for each gene set. (B) Fold enrichment of dispensable essential S. cerevisiae genes with respect to core essential genes for absence, duplication, N:1 relationships, and non-essential 1:1 orthologs in S. pombe, C. albicans, and human. Purple and orange bars identify significant enrichments (P < 0.05, Fisher’s exact test) with higher overlaps for dispensable essential and core essential genes, respectively (see Table S2 for details). Grey bars identify non-significant enrichments. (C) Fraction of genes within each age group, ranging from zero (found only in S. cerevisiae) to five (found in the furthest ancestor), for the three sets of genes. (D) Fraction of genes with age zero (S. cerevisiae specific) for each gene set. (E) Fraction of gene loss events across species for each S. cerevisiae gene grouped by gene set. (F) Median fitness per gene knockout across a panel of cancer cell lines. Genes are grouped by their essentiality in S. cerevisiae, and the density is shown. (G) Protein sequence identity between gene products in S. cerevisiae and 1:1 orthologs in S. pombe. (C, D, E, F, G) CE, core essential; DE, dispensable essential; NE, non-essential. Statistical significance was calculated using Fisher’s exact (D) and Mann–Whitney U tests (C, E, F, G). n.s., not significant; *P < 0.05; **P < 0.005; ***P < 0.0005.
Furthermore, DEGs present in other species were more frequently duplicated and had more N:1 orthology relationships (P < 0.05, Fisher’s exact test, Fig 2B) than core essential genes. For genes with a 1:1 ortholog in other species, DEG orthologs were more often non-essential than orthologs of core essential genes (P < 0.0005, Fisher’s exact test, Fig 2B), including in the closely related species Saccharomyces uvarum (P < 0.05, Fisher’s exact test, Fig S2F). Similarly, fitness data from a panel of 1,070 cancer cell lines (Meyers et al, 2017) revealed that knockout of DEG orthologs led to less severe proliferation defects than knockout of core essential gene orthologs (P < 0.0005, Mann–Whitney U test, Fig 2F). Thus, genes that can be bypassed by genetic mutations in S. cerevisiae tend to be non-essential in other species. For context, we also show the comparisons between essential and non-essential genes and between dispensable essential and non-essential genes (Fig S2G–I and Table S2).
Finally, we compared sequences of S. cerevisiae proteins and their 1:1 orthologs in S. pombe and C. albicans. Gene products of DEGs had lower sequence identity to their orthologs and differed more in sequence length than those of core essential genes (P < 0.05, Mann–Whitney U tests, Figs 2G and S2J–L), in line with the dN/dS data (Figs 1C and S2B). Overall, orthology relationships, phenotypic changes, and sequence divergence indicate that the evolutionary pressure on DEGs is weaker than on core essential genes but stronger than on non-essential genes.
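The two sequence measures compared here can be written down compactly. The length ratio follows the definition given in the figure legends (the shorter length divided by the longer one). For sequence identity, several conventions exist; the sketch below assumes a pre-computed pairwise alignment and counts identical residues over columns without gaps, which may differ from the convention used in the original analysis. The sequences are hypothetical.

```python
def length_ratio(len_a, len_b):
    """Shorter protein length divided by the longer one (between 0 and 1)."""
    return min(len_a, len_b) / max(len_a, len_b)

def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if a != gap and b != gap
    ]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)

# Hypothetical ortholog pair: lengths and a short aligned fragment.
ratio = length_ratio(300, 400)
identity = percent_identity("MKT-A", "MRTLA")
print(ratio, identity)
```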
The bypass suppressor interaction network
Identification of the relevant genetic changes (i.e., suppressors) required to tolerate the deletion of an essential gene is key to interpreting the presence of deleterious genetic variants in natural populations. To improve our knowledge on the mechanisms of genetic suppression, we built an interaction network between DEGs and their bypass suppressors by combining data from our recent systematic study (van Leeuwen et al, 2020) and the literature (van Leeuwen et al, 2020). The two individual suppression interaction networks overlapped significantly (P < 0.001, randomization test, Fig 3A) and were similarly enriched in functional associations (P < 0.0005, Fisher’s exact tests, Fig S3A). The combined network included a total of 319 unique bypass suppression gene pairs, corresponding to 243 suppressors and 137 DEGs out of the 205 known DEGs. For the remaining DEGs (33% of the dataset), the suppressor variants have not been identified. Dispensable essential and suppressor genes tended to be functionally related (P < 0.05, randomization test, and FDR < 10%, Fig S3B), particularly for close functional relationships like cocomplex or copathway membership (P < 0.0005, Fisher’s exact tests, Fig S3A), and suppressors related to nuclear-cytoplasmic transport and transcription processes were more frequent than expected by chance (P < 0.05, Fisher’s exact test, and FDR < 10%, Fig S3B). For a subset of bypass suppressors, we and others have previously determined experimentally whether a suppressor mutation had a loss-of-function (LOF) or gain-of-function (GOF) effect, by testing the effect of suppressor gene mutation or overexpression on the viability of the corresponding DEG deletion mutant (van Leeuwen et al, 2020). Here, we found that for 50% and 26% of the dispensable genes, only LOF and GOF suppressors had been identified, respectively, and in 15% of the cases, both types of suppressors had been described (Fig S3C). 
For the remaining cases, the nature of the suppressor had not been determined.
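The summary statistics of the combined network (unique pairs, suppressors, DEGs, and the suppressor type per DEG) follow from simple bookkeeping over the list of interactions. The sketch below shows one way to do this. The tuple layout, the category labels, and the rule that suppressors of unknown type are ignored when at least one typed suppressor exists are assumptions for illustration; the gene names are hypothetical.

```python
from collections import Counter, defaultdict

def classify_degs(pairs):
    """Label each DEG by the types of its known suppressors.

    `pairs` holds (deg, suppressor, mode) tuples, where mode is "LOF",
    "GOF", or None when the nature of the suppressor is undetermined.
    """
    modes = defaultdict(set)
    for deg, _suppressor, mode in pairs:
        modes[deg].add(mode)
    labels = {}
    for deg, seen in modes.items():
        known = seen & {"LOF", "GOF"}
        if known == {"LOF", "GOF"}:
            labels[deg] = "LOF and GOF"
        elif known == {"LOF"}:
            labels[deg] = "LOF only"
        elif known == {"GOF"}:
            labels[deg] = "GOF only"
        else:
            labels[deg] = "unknown"
    return labels

# Toy network with hypothetical gene names.
toy_pairs = [
    ("degA", "s1", "LOF"),
    ("degA", "s2", "LOF"),
    ("degB", "s3", "GOF"),
    ("degC", "s1", "LOF"),
    ("degC", "s4", "GOF"),
    ("degD", "s5", None),
]

labels = classify_degs(toy_pairs)
n_pairs = len({(deg, sup) for deg, sup, _ in toy_pairs})
n_suppressors = len({sup for _, sup, _ in toy_pairs})
n_degs = len(labels)
print(n_pairs, n_suppressors, n_degs, Counter(labels.values()))
```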
(A) Number of bypass suppression gene pairs in each individual dataset and their overlap. (B) Fraction of loss-of-function (LOF) and gain-of-function (GOF) bypass suppression pairs that overlap with negative and positive genetic interactions. (C) (left) Fraction of monochromatic complexes in which all dispensable essential genes are suppressed by either LOF or GOF bypass suppressors. Only complexes with two or more dispensable essential subunits are shown. In one complex, all subunits could be suppressed by LOF suppressors but also by GOF suppressors (indicated by “LOF & GOF” in the panel). (right) Number of monochromatic complexes in the suppression bypass network (blue) and in 1,000 randomized networks (grey). (D) Fraction of gene pairs encoding members of the same complex and of different complexes that share an interactor. Dispensable essential gene pairs are shown on the left, bypass suppressor gene pairs on the right. (E) Interaction modularity of the bypass suppressor genes coding for members of the RPD3L histone deacetylase complex (CPX-1852). (F) Genetic interaction profiles of the bypass suppressor genes in (E). (left) Hierarchical clustering of the genetic interaction profiles; (right) network showing genetic interaction profile similarities above 0.2. (B, D) Statistical significance was calculated using Fisher’s exact test. *P < 0.05; ***P < 0.0005.
(A) Functional enrichment for cocomplex, copathway, coexpression, colocalization, and GO co-annotation functional relationships of bypass suppression gene pairs in the combined and the two individual bypass suppression datasets. All P < 0.0005 (Fisher’s exact tests). (B) (top) Functional enrichment of interacting pairs in the bypass suppression network for 14 broad functional categories. Only significant enrichments are shown (P < 0.05, empirical test, and FDR < 10%). Functional enrichments among suppressor genes (bottom) are also shown, in which orange and grey bars identify significant (P < 0.05, Fisher’s exact test, and FDR < 10%) and non-significant associations, respectively. (C) Fraction of dispensable essential genes with only loss-of-function suppressors, only gain-of-function suppressors, both loss-of-function and gain-of-function suppressors, and with suppressors of unknown type. Labels include the number of genes in each category. (D) Fraction of monochromatic complexes supported by each bypass suppression dataset. (E) (left) Number of suppressors per dispensable essential gene. (right) Number of dispensable essential genes per suppressor. Labels include the number of genes in each category. (F) Enrichment of dispensable essential genes for which multiple suppressors have been described versus dispensable essential genes with a single identified suppressor for a panel of 21 gene features. Top and bottom panels include numeric and binary features, respectively. Dot size is proportional to the median z-score and fold enrichment, respectively, and only significant enrichments with P-value < 0.05 (Mann–Whitney U test and Fisher’s exact test, respectively) and FDR < 10% are shown. (G) Interaction modularity of the bypass suppressor genes coding for members of the negative cofactor 2 complex (NC2, CPX-1662).
Genetic interactions occur when a combination of mutations results in an unexpected phenotype given the phenotypes of the individual mutants. In negative genetic interactions, the resulting phenotype is more severe than expected, whereas in positive genetic interactions, it is less severe than expected. In a bypass suppression interaction, a secondary mutation rescues the lethality caused by an essential gene deletion and therefore represents an extreme form of positive genetic interaction. We grouped interacting pairs in the bypass suppression network by their suppression mode (i.e., LOF or GOF) and evaluated their overlap with a global genetic interaction network (Costanzo et al, 2016), generated using hypomorphic alleles of essential genes and deletion alleles of non-essential genes (see the Materials and Methods section). We first analyzed bypass suppression gene pairs with LOF suppressors and found that LOF alleles of these gene pairs often had a positive genetic interaction with each other in the global network (P < 0.0005, Fisher’s exact test, Fig 3B). Despite the different experimental protocols, this overlap is expected because both bypass suppression and genetic interactions were identified using LOF alleles. Conversely, when analyzing bypass suppression gene pairs with GOF suppressors, we found that the corresponding LOF alleles mainly showed negative genetic interactions (P < 0.05, Fisher’s exact test, Fig 3B). Thus, GOF and LOF alleles of the suppressor gene have opposite effects when combined with a LOF allele of the corresponding DEG, being beneficial or detrimental as shown by the bypass suppression and genetic interaction networks, respectively.
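The distinction between positive and negative genetic interactions can be made concrete with a small numeric sketch. It assumes the commonly used multiplicative model, in which the expected double-mutant fitness is the product of the single-mutant fitness values; this model and all fitness values below are assumptions for illustration and are not stated in this section.

```python
def interaction_score(f_a, f_b, f_ab):
    """Observed minus expected double-mutant fitness (multiplicative model)."""
    return f_ab - f_a * f_b

def interaction_sign(score, tolerance=1e-9):
    """Classify a score as positive, negative, or neutral."""
    if score > tolerance:
        return "positive"
    if score < -tolerance:
        return "negative"
    return "neutral"

# Bypass suppression: the essential gene deletion alone is lethal
# (fitness 0), but the double mutant is viable (hypothetical fitness 0.6).
bypass = interaction_score(f_a=0.0, f_b=0.9, f_ab=0.6)

# A double mutant that is sicker than expected (hypothetical values).
sick = interaction_score(f_a=0.8, f_b=0.9, f_ab=0.5)

print(interaction_sign(bypass), interaction_sign(sick))
```

Because the expected fitness of any double mutant involving a lethal deletion is zero under this model, every viable double mutant yields a positive score, which is why bypass suppression represents an extreme positive interaction.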
Structure of the bypass suppression interaction network
Interaction density (i.e., the percentage of gene pairs with an interaction) of the bypass suppression network ranged from 0.007% to 0.96%, depending on whether we considered all possible gene pairs or only pairs between the identified dispensable essential and suppressor genes, respectively. Despite the sparsity of this network, several patterns emerged that reveal its structure and modularity. For instance, all DEGs in the same protein complex tended to interact with either GOF or LOF suppressors. These monochromatic interactions affected 13 out of 17 non-redundant protein complexes with at least two dispensable essential subunits in our dataset (P < 0.05, randomization test, Fig 3C), suggesting that similar suppression types apply to functionally related genes. Importantly, both individual suppression networks contributed to this result (Fig S3D), which argues against a bias introduced by specific hypothesis-driven experiments in the literature dataset. We analyzed the topology of the network and found that for 45% of the DEGs, multiple suppressors had been described (Fig S3E). This set of genes exhibited specific features compared with DEGs for which only a single suppressor had been described (Fig S3F). For instance, DEGs with multiple identified suppressors tended to have higher multifunctionality and an increased number of structural domains (P < 0.05, Fisher’s exact test, and FDR < 10%), which suggests that multiple different molecular mechanisms of suppression may exist for these DEGs. Suppressors were more specific than DEGs, and only 23% of them interacted with multiple genes (Fig S3E). Next, we explored the relationship between functional similarity and connectivity patterns. We found that genes in the same protein complex tended to have the same interactors: 52% of the DEGs encoding members of the same complex shared suppressor genes, and 70% of the suppressor genes encoding members of the same complex shared DEGs (Fig 3D), more than expected by chance (P < 0.0005, Fisher’s exact test).
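The upper density value can be reproduced from the network size reported earlier: 319 interactions among 137 DEGs and 243 suppressors. The sketch below performs this calculation and also shows one possible test for monochromatic complexes, in which a complex qualifies if at least one suppressor type is shared by all of its dispensable essential subunits. The function names, the exact monochromaticity rule, and the toy complexes are assumptions for illustration.

```python
def interaction_density(n_interactions, n_degs, n_suppressors):
    """Percentage of possible DEG-suppressor pairs that interact."""
    return 100.0 * n_interactions / (n_degs * n_suppressors)

# Values reported for the combined network.
density = interaction_density(319, 137, 243)
print(f"{density:.2f}%")  # prints 0.96%

def shared_modes(deg_subunits, modes_by_deg):
    """Suppressor types (LOF, GOF) found for every DEG subunit of a complex.

    A non-empty result marks the complex as monochromatic in this sketch.
    """
    shared = {"LOF", "GOF"}
    for deg in deg_subunits:
        shared &= modes_by_deg.get(deg, set())
    return shared

# Toy data with hypothetical gene and complex names.
toy_modes = {"degA": {"LOF"}, "degB": {"LOF", "GOF"}, "degC": {"GOF"}}
toy_complexes = {"cpx_1": ["degA", "degB"], "cpx_2": ["degA", "degC"]}
monochromatic = {
    cpx: shared_modes(subunits, toy_modes)
    for cpx, subunits in toy_complexes.items()
}
print(monochromatic)
```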
To illustrate the underlying modular structure of the bypass suppression interaction network, we explored the connectivity of NCB2 and BUR6, both DEGs with known suppressors and the only two members of the negative cofactor 2 transcription regulator complex (ID CPX-1662 in the Complex Portal [Meldal et al, 2021]). NCB2 and BUR6 have seven and 10 identified bypass suppressor genes, respectively, six of which are in common, again showing that functionally related DEGs tend to share suppressors (Fig S3G). Two of these common suppressors belong to the core Mediator complex, which plays a role in the regulation of transcription (CPX-3226), showing that interactors of the same dispensable gene tend to be functionally related both to each other and to the DEG they suppress. The other four shared suppressor genes also affect transcription and encode subunits of the transcription factor TFIIA complex (CPX-1633), the general transcription factor complex TFIIH (CPX-1659), and the DNA-directed RNA polymerase II complex (CPX-2662). Interestingly, the NCB2-specific suppressor, TOA2, also encodes a member of TFIIA, and three of the four BUR6-specific suppressors encode members of RNA pol II or Mediator, further illustrating the modularity of the network. In another example (Fig 3E), members of the RPD3L histone deacetylase complex (CPX-1852) suppress subunits of two different protein complexes. DEP1, SAP30, and SDS3 suppress the two essential subunits of the piccolo NuA4 histone acetyltransferase complex (CPX-3185), whereas RPD3, SIN3, and SDS3 interact with the Rer2 subunit of the dehydrodolichyl diphosphate synthase complex (CPX-162). This modularity in the suppression interaction pattern of RPD3L subunits is also observed in genome-wide genetic interaction patterns, which are more similar for RPD3L subunits that suppress the same query gene than for RPD3L subunits that suppress functionally diverse query genes (Fig 3F). These patterns suggest a functional modularity within the complex that is supported by its modeled structure (Sardiu et al, 2009).
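The degree to which two DEGs share suppressors, as in the NCB2 and BUR6 example, can be quantified with set operations. The sketch below reproduces the counts given above (seven and 10 suppressors, six in common) and computes a Jaccard index as one possible summary. Only TOA2 is named in the text; all other suppressor labels are placeholders, and the use of the Jaccard index is an assumption of this sketch.

```python
def jaccard(set_a, set_b):
    """Size of the intersection divided by the size of the union."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0

# Placeholder labels stand in for suppressors not named in the text.
shared = {f"shared_{i}" for i in range(1, 7)}
ncb2_suppressors = shared | {"TOA2"}
bur6_suppressors = shared | {f"bur6_only_{i}" for i in range(1, 5)}

common = ncb2_suppressors & bur6_suppressors
similarity = jaccard(ncb2_suppressors, bur6_suppressors)
print(len(ncb2_suppressors), len(bur6_suppressors), len(common))
print(round(similarity, 3))
```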
Mutational landscape of S. cerevisiae strains reflects bypass suppression relationships
We wondered if the genetic dependencies described in the suppression interaction network were reflected in the genomic variation present in natural populations. Because homozygous deletions of essential genes are extremely rare across S. cerevisiae strains (median of one per strain), we first focused on DEGs with CNL events. Hemizygosity is associated with a decrease in gene expression levels and can impact cell growth (Pavelka et al, 2010), particularly in essential genes. For instance, even if only ∼10% of essential genes were haploinsufficient under rich media conditions (Deutschbauer et al, 2005), this increased to 30–50% when more conditions and phenotypes were tested (Delneri et al, 2008; Ohnuki & Ohya, 2018). In strains in which a copy of a DEG was lost, we evaluated if the corresponding suppressor gene had a simultaneous copy number change. Interestingly, bypass suppression gene pairs with LOF and GOF suppressor mutations showed different preferences for co-occurring copy number changes, in agreement with their LOF or GOF phenotype. Bypass suppression gene pairs that involved a LOF suppressor mutation were enriched for co-loss of both dispensable essential and suppressor genes (P < 0.0005, Fisher’s exact tests, Figs 4A and S4A). In contrast, cases with GOF suppressor mutations were enriched for events in which CNL of the DEG was accompanied by a copy number gain of the suppressor gene (P < 0.005, Fisher’s exact tests, Figs 4A and S4A). Thus, when the DEG has a CNL in a natural strain, the functional effect of the bypass suppressor mutation (GOF or LOF) identifies the most likely copy number change of the suppressor gene in that same strain. Next, we asked whether deleterious coding mutations in DEGs and in identified bypass suppressor genes co-occurred in S. cerevisiae isolates. We only considered haploid strains so the deleterious effects of mutations would not be masked by other alleles. 
When considering only bypass suppression gene pairs in which the suppressor carried a LOF mutation, we found 18 cases in which both the DEG and the suppressor gene carried deleterious mutations in at least one of the haploid strains, significantly more than in randomized gene pairs (P < 0.05, randomization test, Fig 4B). As expected, we did not observe a similar enrichment in diploid strains (P > 0.05, randomization test, Fig S4B) or for gene pairs involving GOF suppressor mutations (P > 0.05, randomization test, Fig S4C). Thus, the bypass suppression network mapped in a laboratory environment reflects evolutionary outcomes in natural S. cerevisiae strains.
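The randomization test above can be illustrated with a minimal sketch. It assumes that randomized networks are built by shuffling suppressor labels across gene pairs, which keeps the number of pairs each gene takes part in but may create duplicate pairs; the exact randomization scheme used for Fig 4B may differ. All gene names, strain names, and helper functions below are hypothetical toy data.

```python
import random

def count_cooccurring_degs(pairs, deleterious):
    """Count DEGs that share at least one strain carrying deleterious
    mutations with at least one of their suppressor genes."""
    hits = set()
    for deg, sup in pairs:
        if deleterious.get(deg, set()) & deleterious.get(sup, set()):
            hits.add(deg)
    return len(hits)

def randomize_pairs(pairs, rng):
    """Shuffle suppressor labels across pairs (assumed scheme).

    Every gene keeps the number of pairs it takes part in, but
    duplicate pairs can appear in this simple version.
    """
    degs = [deg for deg, _ in pairs]
    sups = [sup for _, sup in pairs]
    rng.shuffle(sups)
    return list(zip(degs, sups))

def empirical_p(pairs, deleterious, n_random=1000, seed=0):
    """Fraction of randomized networks with a count at least as large
    as the observed one, with an add-one correction."""
    rng = random.Random(seed)
    observed = count_cooccurring_degs(pairs, deleterious)
    at_least = sum(
        count_cooccurring_degs(randomize_pairs(pairs, rng), deleterious) >= observed
        for _ in range(n_random)
    )
    return observed, (at_least + 1) / (n_random + 1)

# Hypothetical toy input: gene -> set of haploid strains in which the
# gene carries a deleterious mutation.
toy_pairs = [("DEG1", "SUP1"), ("DEG2", "SUP2"), ("DEG3", "SUP3"), ("DEG4", "SUP4")]
toy_deleterious = {
    "DEG1": {"strainA"}, "SUP1": {"strainA"},
    "DEG2": {"strainB"}, "SUP2": {"strainB"},
    "DEG3": {"strainC"}, "SUP3": {"strainC"},
    "DEG4": {"strainD"}, "SUP4": {"strainD"},
}

observed, p_value = empirical_p(toy_pairs, toy_deleterious)
print(f"observed DEGs with co-occurring mutations: {observed}, empirical P = {p_value:.3f}")
```

The add-one correction keeps the empirical P-value from being reported as exactly zero when no randomized network reaches the observed count.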
(A) Proportion of copy number co-loss and loss-gain (DEG–suppressor) events across a panel of S. cerevisiae strains for bypass suppression gene pairs in which the suppressor carried either a LOF or a GOF mutation and for a set of background pairs. CNL–CNL: DEG and suppressor have both a copy number loss; CNL–CNG: DEG and suppressor have a copy number loss and gain, respectively. ***P < 0.0005 (Fisher’s exact test). (B) (left) Fraction of dispensable essential genes with no deleterious mutation across haploid S. cerevisiae strains, with a deleterious mutation in at least one of the strains but not co-occurring with deleterious mutations in any of its bypass suppressor genes, and with at least one strain in which it has a deleterious mutation co-occurring with a deleterious mutation in one of its known bypass suppressor genes. (right) Number of dispensable essential genes with a deleterious mutation in any of the haploid S. cerevisiae strains co-occurring with a deleterious variant in at least one of its known bypass suppressor genes using the bypass suppression network (pink) and a set of 1,000 randomized networks. In both analyses, only bypass suppression gene pairs with LOF suppressor mutations are considered.
(A) Proportion of S. cerevisiae strains in which bypass suppression gene pairs overlap more with copy number co-loss events than copy number loss–gain events (pink), and vice versa (green). Gene pairs are grouped by the bypass suppressor type: loss-of-function and gain-of-function. We also show the result for a background set of gene pairs. Statistical significance was calculated with Fisher’s exact tests. **P < 0.005; ***P < 0.0005. (B) Number of dispensable essential genes with a deleterious mutation in any of the diploid S. cerevisiae strains that co-occurs with a deleterious variant in at least one of its known bypass suppressor genes using the bypass suppression network (pink) and a set of 1,000 randomized networks. Only bypass suppression gene pairs with loss-of-function suppressor mutations are considered. (C) Number of dispensable essential genes with a deleterious mutation in any of the haploid S. cerevisiae strains that co-occurs with a deleterious variant in at least one of its known bypass suppressor genes using the bypass suppression network (green) and a set of 1,000 randomized networks. Only bypass suppression gene pairs with gain-of-function suppressor mutations are considered.
Co-occurrence of viability changes and fixed bypass suppressor mutations
We have shown that genes that are dispensable essential in S. cerevisiae are often non-essential in other species (Fig 2B). Differences in the genetic background in those species may be responsible for these changes in essentiality. Here, we hypothesized that the genetic changes that bypass the essentiality of a gene in S. cerevisiae should be reflected in the genome of species in which the gene is also dispensable (i.e., non-essential or absent). To test this, we evaluated whether changes in essentiality for DEGs in a given target species co-occurred with bypass suppressor mutations that were fixed in the target genome. Briefly, we considered as equivalent bypass mutations those that could reduce or increase the gene activity in the target species, for LOF and GOF suppressors, respectively (see the Materials and Methods section). Given that genome-scale essentiality data are scarce, we focused our analysis on S. pombe, for which high-quality essentiality data are available for most genes (Harris et al, 2022).
We found that 67% (18/27) of the S. cerevisiae DEGs that are non-essential in S. pombe co-occurred with bypass suppressor mutations in that species, whereas this happened for only 26% (12/47) of the DEGs that were essential in S. pombe (P < 0.005, Fisher’s exact test, Fig 5A and B). A similar trend (48%) was observed for S. cerevisiae DEGs that were absent (i.e., without an ortholog) in S. pombe, although this difference was not significant compared with the set of essential orthologs (P > 0.05, Fisher’s exact test, Fig 5B). To increase the statistical power of our analyses, we combined the non-essential and absent genes in S. pombe into a single set and observed a clear difference with the essential orthologs (2.3-fold enrichment, P < 0.005, Fisher’s exact test).
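As an illustration of the comparison between non-essential and essential orthologs, the following sketch runs a two-sided Fisher's exact test on the counts given above (18/27 versus 12/47). It is a plain-Python implementation written for this example; whether the original analysis used a one- or two-sided test is not stated here, so the two-sided choice is an assumption. The 2.3-fold enrichment quoted in the text refers to the combined non-essential and absent set, whose counts are not listed, so the fold value computed below applies only to the non-essential versus essential comparison.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that x of the col1 first-column items fall in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Counts from the text: DEGs co-occurring with a bypass suppressor
# mutation in S. pombe, split by the essentiality of the ortholog.
with_mut_ne, total_ne = 18, 27   # non-essential in S. pombe
with_mut_e, total_e = 12, 47     # essential in S. pombe

p_value = fisher_exact_two_sided(
    with_mut_ne, total_ne - with_mut_ne,
    with_mut_e, total_e - with_mut_e,
)
fold = (with_mut_ne / total_ne) / (with_mut_e / total_e)
print(f"fold enrichment = {fold:.2f}, P = {p_value:.4g}")
```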
(A) Dispensable essential S. cerevisiae genes without an ortholog or with a 1:1 ortholog in S. pombe, and their bypass suppressors. Color code reflects whether dispensable essential and bypass suppressor genes have similar phenotypes (i.e., absent or non-essential) and mutations, respectively, in S. pombe compared with the bypass suppression interactions identified in S. cerevisiae. Blue squares with a black border identify dispensable essential genes without an ortholog in S. pombe. The circle indicates, for each dispensable essential gene, whether any of the bypass suppressor mutations is present in S. pombe. (B) Fraction of dispensable essential genes with at least one bypass suppressor mutation in the S. pombe genome. Dispensable essential genes are grouped by the phenotype of their 1:1 ortholog in S. pombe (E, essential; NE, non-essential; absent: without an ortholog). (C) Fraction of dispensable essential genes with bypass suppressor mutations in both S. pombe and C. albicans or in neither of those species. Dispensable essential genes are grouped by the essentiality of their 1:1 orthologs in those species. (B, C) Statistical significance was calculated using Fisher’s exact tests. *P < 0.05; **P < 0.005.
We controlled for potential biases to ensure the robustness of our observation (Fig 5B). We evaluated the effect of interaction degree by generating 1,000 randomized bypass suppression networks while respecting the original topology (Fig S5A) and by considering only DEGs with a single known bypass suppressor (Fig S5B). In addition, we removed bypass suppression interactions from the literature, which may have been identified because of phylogenetic properties (Fig S5C); functionally related bypass suppression pairs, which may be prone to present similar evolutionary patterns (Fig S5D); and each node in the network in turn, to rule out dependence on a single gene (Fig S5E). We also restricted the analysis to suppressors with 1:1 orthologs or absent in S. pombe to account for the potential expression divergence of duplicated genes (Fig S5F) and, separately, used genes with large expression changes between both species, instead of orthology relationships, to identify gene activity changes (Fig S5G). Also, we applied three alternative orthology mappings (Fig S5H) and used essentiality annotations and orthology mappings from C. albicans (Fig S5I). In all these analyses, DEGs without orthologs or with non-essential orthologs more often co-occurred with bypass suppressor mutations than DEGs with essential orthologs (P < 0.05, Fisher’s exact tests). Conversely, switching LOF and GOF annotations resulted in a non-significant difference, as expected (P > 0.05, Fisher’s exact test, Fig S5J).
(A) Presence of bypass suppressor mutations in the S. pombe genome for dispensable essential genes without an ortholog or with 1:1 orthologs that are non-essential in S. pombe versus dispensable essential genes with 1:1 orthologs that are essential. Fold enrichments for the suppression interaction network (blue) and 1,000 randomizations (grey), and the derived empirical P-value are shown. (B) Like (5B) but considering only dispensable essential genes with a single bypass suppressor. (C) Like (5B) but removing bypass suppression pairs from the literature. (D) Like (5B) but removing bypass suppression pairs belonging to the same protein complex or pathway. (E) Bypass suppression subnetworks grouped by the enrichment P-value of the co-occurrence of dispensable essential S. cerevisiae genes without orthologs or with 1:1 orthologs that were non-essential in S. pombe, and bypass suppressor mutations in the S. pombe genome. Each bypass suppression subnetwork has a different gene removed. (F) Like (5B) but considering only suppressors with a 1:1 ortholog in S. pombe or absent in that species. (G) Like (5B) but considering large expression changes between S. cerevisiae and S. pombe instead of orthology relationships (i.e., duplications and N:1 orthologs) to identify suppressor activity changes. (H) Like (5B) but using different orthology mappings for S. pombe: OrthoMCL (left), PomBase (middle), and SonicParanoid (right). (I) Like (5B) but using C. albicans orthology relationships and phenotype data: OrthoMCL (left), Panther (middle), and SonicParanoid (right). (J) Like (5B) but with loss-of-function and gain-of-function suppressor annotations switched. (K) Pearson’s correlation between the transcript counts of 1:1 orthologs in S. cerevisiae and S. pombe. (L) S. pombe to S. cerevisiae transcript ratios for N:1, 1:1, and 1:N orthologs. Transcript ratios were calculated by dividing the transcript counts in S. pombe by the counts in S. cerevisiae. The x-axis shows the S. cerevisiae to S. pombe orthology relationship. In N:1 orthologs, the transcript levels of the N S. cerevisiae genes were aggregated before calculating the transcript ratio versus the single S. pombe gene. In 1:N orthologs, the transcript levels of the N S. pombe genes were aggregated before calculating the transcript ratio versus the single S. cerevisiae gene. (B, C, D, F, G, H, I, J, L) Statistical significance was calculated using Fisher’s exact (B, C, D, F, G, H, I, J) and Mann–Whitney U tests (L). n.s., not significant; *P < 0.05; **P < 0.005; ***P < 0.0005.
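The transcript ratio described for panel (L) can be sketched as follows. The sketch assumes that "aggregated" means summed across the N paralogs, which the caption does not specify; the gene identifiers, counts, and function names are hypothetical.

```python
def relationship(sc_genes, sp_genes):
    """Label an orthology group as S. cerevisiae : S. pombe gene counts."""
    n_sc, n_sp = len(sc_genes), len(sp_genes)
    if n_sc == 1 and n_sp == 1:
        return "1:1"
    if n_sc > 1 and n_sp == 1:
        return "N:1"
    if n_sc == 1 and n_sp > 1:
        return "1:N"
    return "N:N"

def transcript_ratio(sc_genes, sp_genes, sc_counts, sp_counts):
    """S. pombe to S. cerevisiae transcript ratio for one orthology group.

    Counts of paralogs on either side are summed first (assumption).
    """
    sc_total = sum(sc_counts[g] for g in sc_genes)
    sp_total = sum(sp_counts[g] for g in sp_genes)
    return sp_total / sc_total

# Hypothetical transcript counts per gene.
sc_counts = {"scA": 10.0, "scB1": 4.0, "scB2": 6.0, "scC": 8.0}
sp_counts = {"spA": 5.0, "spB": 20.0, "spC1": 2.0, "spC2": 2.0}

# Hypothetical orthology groups: (S. cerevisiae genes, S. pombe genes).
groups = [
    (["scA"], ["spA"]),
    (["scB1", "scB2"], ["spB"]),
    (["scC"], ["spC1", "spC2"]),
]

ratios = [
    (relationship(sc, sp), transcript_ratio(sc, sp, sc_counts, sp_counts))
    for sc, sp in groups
]
for label, ratio in ratios:
    print(label, ratio)
```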
Finally, we selected DEGs with 1:1 orthologs in both S. pombe and C. albicans and found that DEGs with non-essential orthologs in both species were more likely to have bypass suppressor mutations in those species than DEGs with essential orthologs (P < 0.05, Fisher’s exact test, Fig 5C). In all, these analyses reveal that the relationship between DEGs and their bypass suppressor genes identified in S. cerevisiae is reflected in the gene essentiality and mutational space of other species.
Discussion
Differences between essential and non-essential genes have been widely characterized (Figs 1C and S2G and I) and a myriad of machine learning algorithms have exploited this information for the successful prediction of gene essentiality (Hwang et al, 2009; Lloyd et al, 2015; Zhang et al, 2016). Recently, we and others have identified a subset of S. cerevisiae essential genes that become dispensable in the presence of specific genetic variants (Liu et al, 2015; van Leeuwen et al, 2020). Here, we have combined these datasets of DEGs, after showing they exhibit similar properties (Fig 1), for the comprehensive characterization of these genes. While recapitulating previously reported features in smaller datasets, we have also revealed new properties of DEGs (Figs 1C and 2). These features can be incorporated in existing methods for the prediction of essential gene dispensability (van Leeuwen et al, 2020). Because properties of DEGs are highly conserved (van Leeuwen et al, 2020), predictions could potentially target other species. Although the differences between dispensable essential and core essential genes resemble the differences between essential and non-essential genes (Figs 1C, 2B, and S2I), dispensable essential and non-essential genes also make up two clearly distinct groups (Figs 1C and S2H). Thus, in contrast to the classical binary classification of genes based on their essentiality, three different sets of genes exist with specific properties that distinguish them from each other: non-essential, dispensable essential, and core essential genes, as was also previously suggested (Liu et al, 2015).
Importantly, we presented extensive evidence of the distinct evolutionary pressure exerted on these gene sets by performing phylogenetic analyses spanning very different evolutionary timescales (Figs 2 and S2), further expanding previous observations (Liu et al, 2015; van Leeuwen et al, 2020). The observed differences in conservation of dispensable essential compared with core essential S. cerevisiae genes in S. uvarum, C. albicans, S. pombe, and even human, which diverged from S. cerevisiae ∼1 billion years ago, reflect the substantial evolutionary footprint of essential gene dispensability.
For a better characterization of the mechanisms associated with the tolerance of highly deleterious mutations, we integrated data from multiple studies to build a bypass suppression interaction network between DEGs and their suppressors. Several properties emerged demonstrating the modularity and structure of the bypass suppression network. Complexes tended to be either composed of only dispensable essential subunits or of only core essential subunits (Fig S1B), mirroring the essentiality composition bias previously described (Hart et al, 2007) and the functional modularity that complexes encapsulate. Dispensable essentiality, thus, would be a modular feature of protein complexes (Li et al, 2019b), as is essentiality. Also, protein complexes exhibited monochromaticity of suppressor type (Fig 3C) with members of the same complex being all suppressed by either LOF or GOF mutations. Last, members of the same complex exhibited interaction coherence, with cocomplexed DEGs sharing suppressors and cocomplexed suppressor genes interacting with the same DEGs (Fig 3D), as illustrated in Figs 3E and S3G. All these observations expose the inherent modularity of the bypass suppression network and suggest that similar suppression mechanisms apply for functionally related genes, which can lead to the identification of new dispensable essential and suppressor genes. Certainly, network modularity is not restricted to the bypass suppression network, and it is in fact a hallmark of a global genetic interaction network (Costanzo et al, 2016), but it is particularly relevant here, given its directionality, small size, and low interaction density, reflecting the strong functional relationships bypass suppression interactions encapsulate.
The potential role of genetic suppression in explaining the existence of deleterious variants among natural populations (Chen et al, 2016) is still not fully understood. To address this knowledge gap, we evaluated how bypass suppression gene pairs reflected simultaneous genomic changes across evolution. Remarkably, we found co-occurrence of copy number changes and deleterious mutations in both the dispensable essential and the suppressor genes across S. cerevisiae strains (Fig 4). Furthermore, S. cerevisiae DEGs that were absent or non-essential in S. pombe were more likely to co-occur with a bypass suppressor mutation in the S. pombe genome than DEGs that were essential in S. pombe (Fig 5). These results suggest that within- and across-species genetic variability can follow the same evolutionary paths as spontaneous mutations in a laboratory environment, illustrating the constraints genetic networks may impose on evolutionary trajectories.