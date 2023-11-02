Here, we compiled a comprehensive set of DEGs in S. cerevisiae identified across different studies to exhaustively compare their properties to core essential and non-essential genes, with a particular focus on phylogenetic features. We integrated bypass suppressor genes into an interaction network with DEGs to identify prevalent interaction motifs and to analyze the relationship of bypass suppression pairs in other species. This work presents a systematic characterization of DEGs and explores how their interactions with suppressors reflect evolution in natural populations.

Results

Properties of dispensable essential genes

By querying an extensive panel of 21 gene features (see the Materials and Methods section, Fig S1F), we compared the properties of dispensable and core essential genes and found several significant differences (P < 0.05 using Mann–Whitney U tests for numeric features and Fisher’s exact tests for binary features, and FDR < 10%). DEGs tended to exhibit more stable gene expression levels and lower transcript counts, to be less conserved across species, to have more gene duplicates and higher evolutionary rates, and to be coexpressed with fewer genes than core essential genes. The proteins encoded by DEGs tended to be more multifunctional, to lack structural domains, to localize to a membrane, to be absent from protein complexes, and to have fewer protein–protein interactions, lower abundances, and shorter half-lives compared with those encoded by core essential genes (Fig 1C and Table S1). Interestingly, the observed differences between dispensable and core essential genes resembled the differences between non-essential and essential genes (Fig 1C and Table S1). Thus, we asked whether dispensable essential and non-essential genes shared the same properties and found that they comprised two different classes of genes with clearly distinct features (Fig 1C and Table S1). Broadly, features of DEGs fell between those of core essential and non-essential genes, consistent with and extending previous findings in a smaller dataset (Liu et al, 2015).
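
The significance filter used for the feature comparison (P < 0.05 and FDR < 10%) can be illustrated with a minimal sketch. It assumes that the FDR was controlled with the Benjamini–Hochberg procedure, which the text does not state, and it implements Fisher's exact test for binary features with the standard library only. The function names are hypothetical, and the Mann–Whitney U test for numeric features is left out (in practice it could come from a statistics package such as SciPy).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value to enforce monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def significant_features(pvalues_by_feature, alpha=0.05, fdr=0.10):
    """Keep features that pass both the nominal and the FDR threshold."""
    names = list(pvalues_by_feature)
    qvalues = benjamini_hochberg([pvalues_by_feature[n] for n in names])
    return [n for n, q in zip(names, qvalues)
            if pvalues_by_feature[n] < alpha and q < fdr]
```

Each of the 21 features would contribute one P-value (one test per feature), and only features passing both thresholds would be reported as differing between gene sets.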

Phylogenetic analysis of dispensable essential genes

We further explored the differences in gene conservation between dispensable and core essential genes using the phylogeny of S. cerevisiae, starting with a large panel of sequenced S. cerevisiae strains (Peter et al, 2018). DEGs were more likely than core essential genes, but less than non-essential genes, to harbor deleterious mutations disrupting protein sequences (P < 0.0005, Fisher’s exact test, Fig S2A), to present higher non-synonymous mutation rates (P < 0.0005, Mann–Whitney U test, Fig S2B), and to show copy number loss (CNL) events in other S. cerevisiae strains (P < 0.0005, Fisher’s exact test, Fig S2C).

To further investigate differences in the evolutionary pressure on dispensable essential and core essential genes, we analyzed essentiality data and orthology relationships in Candida albicans, S. pombe, and human cell lines (Figs 2A and S2D and E and Table S2). Genes that were dispensable essential in S. cerevisiae were more often absent than core essential genes in each of the analyzed species (P < 0.0005, Fisher’s exact tests, Fig 2B). We hypothesized that this bias could be caused by: (i) genes specific to the S. cerevisiae phylogenetic branch and, thus, not present in their common ancestor or (ii) genes present in their common ancestor but lost in the phylogenetic branch of the analyzed species. To determine the contribution of each factor, we calculated the age of each S. cerevisiae gene by identifying the furthest species with an orthologous gene. DEGs were enriched for younger genes with respect to core essential genes (P < 0.0005, Mann–Whitney U test, Fig 2C), particularly for genes with no ortholog in any other species (i.e., specific to S. cerevisiae; P < 0.005, Fisher’s exact test, Fig 2D). Next, for each species, we defined lost genes as those absent in that species but present in its common ancestor with S. cerevisiae. We found DEGs were more often lost in other species than core essential genes (P < 0.0005, Mann–Whitney U test, Fig 2E). Thus, the absence of DEGs in other species can be explained both by genes specific to S. cerevisiae and by gene loss events in those species.

Figure S2. Phylogenetic analysis of dispensable essential genes. (A) Fraction of loss-of-function mutations across S. cerevisiae strains for non-essential, dispensable essential, and core essential genes. (B) Evolutionary rate (dN/dS) across S. cerevisiae strains for the three gene sets. (C) Fraction of copy number loss events across S. cerevisiae strains for the three gene sets. (D) Orthology relationships in C. albicans of dispensable essential and core essential genes. The fraction of absent, duplicated, N:1, and essential and non-essential 1:1 orthologs is shown for each gene set. (E) Orthology relationships in human of dispensable essential and core essential genes. The fraction of absent, duplicated, N:1, and essential and non-essential 1:1 orthologs is shown for each gene set. (F) Fraction of essential genes in S. cerevisiae that is non-essential in S. uvarum. (G) Orthology relationships in S. pombe of non-essential and essential genes. The fraction of absent, duplicated, N:1, and essential and non-essential 1:1 orthologs is shown for each gene set. (H) Fold enrichment of dispensable genes with respect to non-essential genes for absence, duplication, N:1 relationships, and non-essential 1:1 orthologs in S. pombe, C. albicans, and human. Blue and green bars identify significant enrichments (P < 0.05, Fisher’s exact test) with higher overlaps for dispensable and non-essential genes, respectively (see Table S2 for details). Grey bars identify non-significant enrichments. (I) Fold enrichment of non-essential genes with respect to essential genes for absence, duplication, N:1 relationships, and non-essential 1:1 orthologs in S. pombe, C. albicans, and human. Green and orange bars identify significant enrichments (P < 0.05, Fisher’s exact test) with higher overlaps for non-essential and essential genes, respectively (see Table S2 for details). Grey bars identify non-significant enrichments. (J) Ratio between the protein sequence length in S. cerevisiae and the 1:1 ortholog in S. pombe. The shorter length is divided by the longer one. (K) Protein sequence identity between gene products in S. cerevisiae and 1:1 orthologs in C. albicans. (L) Ratio between the protein sequence length in S. cerevisiae and the 1:1 ortholog in C. albicans. The shorter length is divided by the longer one. (A, B, C, F, J, K, L) CE, core essential; DE, dispensable essential; NE, non-essential. Statistical significance was calculated using Fisher’s exact (A, C, F) and Mann–Whitney U tests (B, J, K, L). n.s., not significant; *P < 0.05; ***P < 0.0005.

Figure 2. Phylogenetic analysis of dispensable essential genes. (A) Orthology relationships in S. pombe of dispensable and core essential S. cerevisiae genes. The fraction of absent, duplicated, N:1, and essential and non-essential 1:1 orthologs is shown for each gene set. (B) Fold enrichment of dispensable essential S. cerevisiae genes with respect to core essential genes for absence, duplication, N:1 relationships, and non-essential 1:1 orthologs in S. pombe, C. albicans, and human. Purple and orange bars identify significant enrichments (P < 0.05, Fisher’s exact test) with higher overlaps for dispensable essential and core essential genes, respectively (see Table S2 for details). Grey bars identify non-significant enrichments. (C) Fraction of genes within each age group, ranging from zero (found only in S. cerevisiae) to five (found in the furthest ancestor), for the three sets of genes. (D) Fraction of genes with age zero (S. cerevisiae specific) for each gene set. (E) Fraction of gene loss events across species for each S. cerevisiae gene grouped by gene set. (F) Median fitness per gene knockout across a panel of cancer cell lines. Genes are grouped by their essentiality in S. cerevisiae, and the density is shown. (G) Protein sequence identity between gene products in S. cerevisiae and 1:1 orthologs in S. pombe. (C, D, E, F, G) CE, core essential; DE, dispensable essential; NE, non-essential. Statistical significance was calculated using Fisher’s exact (D) and Mann–Whitney U tests (C, E, F, G). n.s., not significant; *P < 0.05; **P < 0.005; ***P < 0.0005.

Furthermore, DEGs present in other species were more frequently duplicated and had more N:1 orthology relationships (P < 0.05, Fisher’s exact test, Fig 2B) than core essential genes. For genes with a 1:1 ortholog in other species, DEG orthologs were more often non-essential than orthologs of core essential genes (P < 0.0005, Fisher’s exact test, Fig 2B), also in the closely related Saccharomyces uvarum species (P < 0.05, Fisher’s exact test, Fig S2F). Similarly, fitness data from a panel of 1,070 cancer cell lines (Meyers et al, 2017) revealed that knockout of DEG orthologs led to less severe proliferation defects than knockout of core essential gene orthologs (P < 0.0005, Mann–Whitney U test, Fig 2F). Thus, genes that can be bypassed by genetic mutations in S. cerevisiae tend to be non-essential in other species. We show the comparison between essential and non-essential genes and dispensable essential and non-essential genes to contextualize the observed differences (Fig S2G–I and Table S2).

Finally, we compared sequences of S. cerevisiae proteins and their 1:1 orthologs in S. pombe and C. albicans. Gene products of DEGs had lower sequence identity and differed more in sequence length than core essential proteins (P < 0.05, Mann–Whitney U tests, Figs 2G and S2J–L), in line with the dN/dS data (Figs 1C and S2B). Overall, orthology relationships, phenotypic changes, and sequence divergence reflect that the evolutionary pressure on DEGs is more lenient than on core essential genes but more strict than on non-essential genes.
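
The gene age and gene loss definitions used in this section can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the species list and its distance levels are hypothetical placeholders, and presence of a gene in the common ancestor is approximated by the existence of an ortholog in a more distant species.

```python
# Hypothetical species panel, ordered from closest to furthest from
# S. cerevisiae. The levels are placeholders, not the age groups of Fig 2C.
SPECIES_LEVEL = {
    "S. uvarum": 1,
    "C. albicans": 2,
    "S. pombe": 3,
    "H. sapiens": 4,
}


def gene_age(species_with_ortholog):
    """Age = level of the furthest species with an ortholog.

    Age zero means the gene has no ortholog in any other species.
    """
    return max((SPECIES_LEVEL[s] for s in species_with_ortholog), default=0)


def lost_in(species_with_ortholog):
    """Species in which the gene is inferred to be lost.

    Assumption: a gene was present in the common ancestor of S. cerevisiae
    and a species if an ortholog exists in any more distant species.
    """
    age = gene_age(species_with_ortholog)
    return {
        species
        for species, level in SPECIES_LEVEL.items()
        if level < age and species not in species_with_ortholog
    }
```

For example, a gene with an ortholog only in S. pombe would be assigned the S. pombe age level and counted as lost in the two closer species, whereas a gene with no ortholog at all would have age zero and no inferred loss events.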

The bypass suppressor interaction network

Identification of the relevant genetic changes (i.e., suppressors) required to tolerate the deletion of an essential gene is key to interpreting the presence of deleterious genetic variants in natural populations. To improve our knowledge on the mechanisms of genetic suppression, we built an interaction network between DEGs and their bypass suppressors by combining data from our recent systematic study (van Leeuwen et al, 2020) and the literature (van Leeuwen et al, 2020). The two individual suppression interaction networks overlapped significantly (P < 0.001, randomization test, Fig 3A) and were similarly enriched in functional associations (P < 0.0005, Fisher’s exact tests, Fig S3A). The combined network included a total of 319 unique bypass suppression gene pairs, corresponding to 243 suppressors and 137 DEGs out of the 205 known DEGs. For the remaining DEGs (33% of the dataset), the suppressor variants have not been identified. Dispensable essential and suppressor genes tended to be functionally related (P < 0.05, randomization test, and FDR < 10%, Fig S3B), particularly for close functional relationships like cocomplex or copathway membership (P < 0.0005, Fisher’s exact tests, Fig S3A), and suppressors related to nuclear-cytoplasmic transport and transcription processes were more frequent than expected by chance (P < 0.05, Fisher’s exact test, and FDR < 10%, Fig S3B). For a subset of bypass suppressors, we and others have previously determined experimentally whether a suppressor mutation had a loss-of-function (LOF) or gain-of-function (GOF) effect, by testing the effect of suppressor gene mutation or overexpression on the viability of the corresponding DEG deletion mutant (van Leeuwen et al, 2020). Here, we found that for 50% and 26% of the dispensable genes, only LOF and GOF suppressors had been identified, respectively, and in 15% of the cases, both types of suppressors had been described (Fig S3C). For the remaining cases, the nature of the suppressor had not been determined.

Figure 3. Bypass suppression interaction network. (A) Number of bypass suppression gene pairs in each individual dataset and their overlap. (B) Fraction of loss-of-function (LOF) and gain-of-function (GOF) bypass suppression pairs that overlap with negative and positive genetic interactions. (C) (left) Fraction of monochromatic complexes in which all dispensable essential genes are suppressed by either LOF or GOF bypass suppressors. Only complexes with two or more dispensable essential subunits are shown. In one complex, all subunits could be suppressed by LOF suppressors but also by GOF suppressors (indicated by “LOF & GOF” in the panel). (right) Number of monochromatic complexes in the suppression bypass network (blue) and in 1,000 randomized networks (grey). (D) Fraction of gene pairs encoding members of the same complex and of different complexes that share an interactor. Dispensable essential gene pairs are shown on the left, bypass suppressor gene pairs on the right. (E) Interaction modularity of the bypass suppressor genes coding for members of the RPD3L histone deacetylase complex (CPX-1852). (F) Genetic interaction profiles of the bypass suppressor genes in (E). (left) Hierarchical clustering of the genetic interaction profiles; (right) network showing genetic interaction profile similarities above 0.2. (B, D) Statistical significance was calculated using Fisher’s exact test. *P < 0.05; ***P < 0.0005.

Figure S3. Bypass suppression interaction network. (A) Functional enrichment for cocomplex, copathway, coexpression, colocalization, and GO co-annotation functional relationships of bypass suppression gene pairs in the combined and the two individual bypass suppression datasets. All P < 0.0005 (Fisher’s exact tests). (B) (top) Functional enrichment of interacting pairs in the bypass suppression network for 14 broad functional categories. Only significant enrichments are shown (P < 0.05, empirical test, and FDR < 10%). Functional enrichments among suppressor genes (bottom) are also shown, in which orange and grey bars identify significant (P < 0.05, Fisher’s exact test, and FDR < 10%) and non-significant associations, respectively. (C) Fraction of dispensable essential genes with only loss-of-function suppressors, only gain-of-function suppressors, both loss-of-function and gain-of-function suppressors, and with suppressors of unknown type. Labels include the number of genes in each category. (D) Fraction of monochromatic complexes supported by each bypass suppression dataset. (E) (left) Number of suppressors per dispensable essential gene. (right) Number of dispensable essential genes per suppressor. Labels include the number of genes in each category. (F) Enrichment of dispensable essential genes for which multiple suppressors have been described versus dispensable essential genes with a single identified suppressor for a panel of 21 gene features. Top and bottom panels include numeric and binary features, respectively. Dot size is proportional to the median z-score and fold enrichment, respectively, and only significant enrichments with P-value < 0.05 (Mann–Whitney U test and Fisher’s exact test, respectively) and FDR < 10% are shown. (G) Interaction modularity of the bypass suppressor genes coding for members of the negative cofactor 2 complex (NC2, CPX-1662).

Genetic interactions identify combinations of mutants that result in unexpected phenotypes given the phenotypes of the individual mutants. In negative genetic interactions, the resulting phenotype is more severe than expected, whereas in positive genetic interactions, the phenotype is healthier than expected. In a bypass suppression interaction, a secondary mutation recovers the lethal phenotype caused by an essential gene deletion, therefore representing an extreme form of positive genetic interaction. We grouped interacting pairs in the bypass suppression network by their suppression mode (i.e., LOF or GOF) and evaluated their overlap with a global genetic interaction network (Costanzo et al, 2016), generated using hypomorphic alleles of essential genes and deletion alleles of non-essential genes (see the Materials and Methods section). We first analyzed bypass suppression gene pairs with LOF suppressors and found that LOF alleles of these gene pairs often had a positive genetic interaction with each other in the global network (P < 0.0005, Fisher’s exact test, Fig 3B). In spite of the different experimental protocols, this overlap is expected because both bypass suppression and genetic interactions were identified using LOF alleles. Conversely, when analyzing bypass suppression gene pairs with GOF suppressors, we found that the corresponding LOF alleles mainly showed negative genetic interactions (P < 0.05, Fisher’s exact test, Fig 3B). Thus, GOF and LOF alleles of the suppressor gene have opposite effects when combined with a LOF allele of the corresponding DEG, being beneficial or detrimental as shown by the bypass suppression and genetic interaction networks, respectively.
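
The grouping of bypass suppression pairs by suppression mode and their overlap with genetic interactions can be sketched as below. The data layout (triples of DEG, suppressor, and mode, plus sets of unordered gene pairs) and the function name are assumptions made for illustration. The statistical comparison against background pairs is not shown.

```python
def overlap_fractions(suppression_pairs, positive_gi, negative_gi):
    """Fraction of LOF and GOF suppression pairs found among genetic interactions.

    suppression_pairs: iterable of (deg, suppressor, mode), mode in {"LOF", "GOF"}.
    positive_gi, negative_gi: sets of frozenset({gene_a, gene_b}) pairs.
    """
    pairs_by_mode = {}
    for deg, suppressor, mode in suppression_pairs:
        # Gene pairs are unordered, so store them as frozensets.
        pairs_by_mode.setdefault(mode, []).append(frozenset((deg, suppressor)))

    fractions = {}
    for mode, pairs in pairs_by_mode.items():
        n = len(pairs)
        fractions[mode] = {
            "positive": sum(p in positive_gi for p in pairs) / n,
            "negative": sum(p in negative_gi for p in pairs) / n,
        }
    return fractions
```

Under the pattern described above, the "positive" fraction would be the larger one for LOF pairs and the "negative" fraction the larger one for GOF pairs.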

Structure of the bypass suppression interaction network

Interaction density (i.e., the percentage of gene pairs with an interaction) of the bypass suppression network ranged from 0.007% to 0.96% depending on whether we considered all possible gene pairs or only pairs between the identified dispensable essential and suppressor genes, respectively. In spite of the sparsity of this network, several patterns emerge showing its structure and modularity. For instance, all DEGs in the same protein complex tended to interact with either GOF or LOF suppressors. These monochromatic interactions affected 13 out of 17 non-redundant protein complexes with at least two dispensable essential subunits in our dataset (P < 0.05, randomization test, Fig 3C), suggesting that similar suppression types apply to functionally related genes. Importantly, both individual suppression networks contributed to this result (Fig S3D), ruling out a potential bias from specific hypothesis-driven experiments in the literature dataset.

We analyzed the topology of the network and found that for 45% of the DEGs, multiple suppressors had been described (Fig S3E). This set of genes exhibited specific features compared with DEGs for which only a single suppressor had been described (Fig S3F). For instance, DEGs with multiple identified suppressors tended to have higher multifunctionality and an increased number of structural domains (P < 0.05, Fisher’s exact test, and FDR < 10%), which suggests that multiple different molecular mechanisms of suppression may exist for these DEGs. Suppressors were more specific than DEGs, and only 23% of them interacted with multiple genes (Fig S3E).

Next, we explored the relationship between functional similarity and connectivity patterns. We found that genes in the same protein complex tended to have the same interactors: 52% of the DEGs encoding members of the same complex shared suppressor genes, and 70% of the suppressor genes encoding members of the same complex shared DEGs (Fig 3D), more than expected by chance (P < 0.0005, Fisher’s exact test).

To illustrate the underlying modular structure of the bypass suppression interaction network, we explored the connectivity of NCB2 and BUR6, both DEGs with known suppressors and the only two members of the negative cofactor 2 transcription regulator complex (ID CPX-1662 in the Complex Portal [Meldal et al, 2021]). NCB2 and BUR6 have seven and 10 identified bypass suppressor genes, respectively, six of which are in common, again showing that functionally related DEGs tend to share suppressors (Fig S3G). Two of these common suppressors belong to the core Mediator complex that plays a role in the regulation of transcription (CPX-3226), showing that interactors of the same dispensable gene tend to be functionally related both to each other and to the DEG they are suppressing. The other four shared suppressor genes also affect transcription and encode subunits of the transcription factor TFIIA complex (CPX-1633), the general transcription factor complex TFIIH (CPX-1659), and the DNA-directed RNA polymerase II complex (CPX-2662). Interestingly, the NCB2-specific suppressor, TOA2, also encodes a member of TFIIA, and three of the four BUR6-specific suppressors encode members of RNA pol II or Mediator, further illustrating the modularity of the network.

In another example (Fig 3E), members of the RPD3L histone deacetylase complex (CPX-1852) suppress two different protein complexes. DEP1, SAP30, and SDS3 suppress the two essential subunits of piccolo NuA4 histone acetyltransferase complex (CPX-3185), whereas RPD3, SIN3, and SDS3 interact with the Rer2 subunit of the dehydrodolichyl diphosphate synthase complex (CPX-162). This modularity in the suppression interaction pattern of RPD3L subunits is also observed in genome-wide genetic interaction patterns, which are more similar for RPD3L subunits that suppress the same query gene than for RPD3L subunits that suppress functionally diverse query genes (Fig 3F). These patterns suggest a functional modularity within the complex, which is supported by its modeled structure (Sardiu et al, 2009).
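
Two of the network statistics in this section can be made concrete with a short sketch. The first function reproduces the upper density value from the counts given earlier (319 pairs between 137 DEGs and 243 suppressors). The second part counts monochromatic complexes and compares the count with randomized networks. The randomization scheme shown here (shuffling the suppression modes among DEGs) is an assumption, because the text does not describe how the randomized networks were built, and all function names are hypothetical.

```python
import random


def interaction_density(n_pairs, n_degs, n_suppressors):
    """Percentage of DEG-suppressor gene pairs that have an interaction."""
    return 100 * n_pairs / (n_degs * n_suppressors)


def is_monochromatic(complex_degs, deg_modes):
    """True if one suppression mode (LOF or GOF) applies to every DEG subunit."""
    return bool(set.intersection(*(deg_modes[g] for g in complex_degs)))


def count_monochromatic(complexes, deg_modes):
    """Count monochromatic complexes with at least two DEG subunits."""
    return sum(
        is_monochromatic(members, deg_modes)
        for members in complexes.values()
        if len(members) >= 2
    )


def randomization_test(complexes, deg_modes, n_iter=1000, seed=0):
    """Empirical P-value for the observed number of monochromatic complexes.

    Assumed null model: suppression modes are shuffled among DEGs.
    """
    rng = random.Random(seed)
    observed = count_monochromatic(complexes, deg_modes)
    genes = list(deg_modes)
    at_least_as_extreme = 0
    for _ in range(n_iter):
        shuffled = [deg_modes[g] for g in genes]
        rng.shuffle(shuffled)
        if count_monochromatic(complexes, dict(zip(genes, shuffled))) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the empirical P-value above zero.
    return observed, (at_least_as_extreme + 1) / (n_iter + 1)
```

Here `deg_modes` maps each DEG to the set of suppression modes described for it, so a complex whose subunits can all be suppressed by both LOF and GOF suppressors is also counted as monochromatic, in line with the "LOF & GOF" case in Fig 3C. With the counts from the text, `interaction_density(319, 137, 243)` evaluates to roughly 0.96.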

Mutational landscape of S. cerevisiae strains reflects bypass suppression relationships

We wondered if the genetic dependencies described in the suppression interaction network were reflected in the genomic variation present in natural populations. Because homozygous deletions of essential genes are extremely rare across S. cerevisiae strains (median of one per strain), we first focused on DEGs with CNL events. Hemizygosity is associated with a decrease in gene expression levels and can impact cell growth (Pavelka et al, 2010), particularly in essential genes. For instance, even if only ∼10% of essential genes were haploinsufficient under rich media conditions (Deutschbauer et al, 2005), this increased to 30–50% when more conditions and phenotypes were tested (Delneri et al, 2008; Ohnuki & Ohya, 2018). In strains in which a copy of a DEG was lost, we evaluated if the corresponding suppressor gene had a simultaneous copy number change. Interestingly, bypass suppression gene pairs with LOF and GOF suppressor mutations showed different preferences for co-occurring copy number changes, in agreement with their LOF or GOF phenotype. Bypass suppression gene pairs that involved a LOF suppressor mutation were enriched for co-loss of both dispensable essential and suppressor genes (P < 0.0005, Fisher’s exact tests, Figs 4A and S4A). In contrast, cases with GOF suppressor mutations were enriched for events in which CNL of the DEG was accompanied by a copy number gain of the suppressor gene (P < 0.005, Fisher’s exact tests, Figs 4A and S4A). Thus, when the DEG has a CNL in a natural strain, the functional effect of the bypass suppressor mutation (GOF or LOF) identifies the most likely copy number change of the suppressor gene in that same strain.

Next, we asked whether deleterious coding mutations in DEGs and in identified bypass suppressor genes co-occurred in S. cerevisiae isolates. We only considered haploid strains so the deleterious effects of mutations would not be masked by other alleles. When considering only bypass suppression gene pairs in which the suppressor carried a LOF mutation, we found 18 cases in which both the DEG and the suppressor gene carried deleterious mutations in at least one of the haploid strains, significantly more than in randomized gene pairs (P < 0.05, randomization test, Fig 4B). As expected, we did not observe a similar enrichment in diploid strains (P > 0.05, randomization test, Fig S4B) or for gene pairs involving GOF suppressor mutations (P > 0.05, randomization test, Fig S4C). Thus, the bypass suppression network mapped in a laboratory environment reflects evolutionary outcomes in natural S. cerevisiae strains.

Figure 4. Co-occurring mutations in S. cerevisiae strains. (A) Proportion of copy number co-loss and loss-gain (DEG–suppressor) events across a panel of S. cerevisiae strains for bypass suppression gene pairs in which the suppressor carried either a LOF or a GOF mutation and for a set of background pairs. CNL–CNL: DEG and suppressor have both a copy number loss; CNL–CNG: DEG and suppressor have a copy number loss and gain, respectively. ***P < 0.0005 (Fisher’s exact test). (B) (left) Fraction of dispensable essential genes with no deleterious mutation across haploid S. cerevisiae strains, with a deleterious mutation in at least one of the strains but not co-occurring with deleterious mutations in any of its bypass suppressor genes, and with at least one strain in which it has a deleterious mutation co-occurring with a deleterious mutation in one of its known bypass suppressor genes. (right) Number of dispensable essential genes with a deleterious mutation in any of the haploid S. cerevisiae strains co-occurring with a deleterious variant in at least one of its known bypass suppressor genes using the bypass suppression network (pink) and a set of 1,000 randomized networks. In both analyses, only bypass suppression gene pairs with LOF suppressor mutations are considered.

Figure S4. Co-occurring mutations in S. cerevisiae strains. (A) Proportion of S. cerevisiae strains in which bypass suppression gene pairs overlap more with copy number co-loss events than copy number loss–gain events (pink), and vice versa (green). Gene pairs are grouped by the bypass suppressor type: loss-of-function and gain-of-function. We also show the result for a background set of gene pairs. Statistical significance was calculated with Fisher’s exact tests. **P < 0.005; ***P < 0.0005. (B) Number of dispensable essential genes with a deleterious mutation in any of the diploid S. cerevisiae strains that co-occurs with a deleterious variant in at least one of its known bypass suppressor genes using the bypass suppression network (pink) and a set of 1,000 randomized networks. Only bypass suppression gene pairs with loss-of-function suppressor mutations are considered. (C) Number of dispensable essential genes with a deleterious mutation in any of the haploid S. cerevisiae strains that co-occurs with a deleterious variant in at least one of its known bypass suppressor genes using the bypass suppression network (green) and a set of 1,000 randomized networks. Only bypass suppression gene pairs with gain-of-function suppressor mutations are considered.
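
The two co-occurrence counts described in this section can be sketched as follows. The input formats (per-strain copy number calls and per-strain sets of genes carrying deleterious variants) and the function names are assumptions for illustration. The comparison with background pairs or randomized networks is not included.

```python
from collections import Counter


def count_copy_number_events(pairs, copy_number):
    """Count CNL-CNL and CNL-CNG events for DEG-suppressor pairs.

    pairs: iterable of (deg, suppressor).
    copy_number: strain -> {gene: "loss" or "gain"}; genes without a
    copy number change are simply absent from the inner dictionary.
    """
    events = Counter()
    for calls in copy_number.values():
        for deg, suppressor in pairs:
            if calls.get(deg) != "loss":
                continue  # only strains in which a copy of the DEG was lost
            if calls.get(suppressor) == "loss":
                events["CNL-CNL"] += 1
            elif calls.get(suppressor) == "gain":
                events["CNL-CNG"] += 1
    return events


def degs_with_cooccurrence(pairs, deleterious):
    """DEGs with a deleterious variant co-occurring, in at least one strain,
    with a deleterious variant in one of their suppressors.

    deleterious: strain -> set of genes carrying a deleterious variant.
    """
    return {
        deg
        for genes in deleterious.values()
        for deg, suppressor in pairs
        if deg in genes and suppressor in genes
    }
```

Calling `count_copy_number_events` separately on the LOF and GOF pairs would give the two event profiles that are contrasted in Fig 4A, and the size of the set returned by `degs_with_cooccurrence` for LOF pairs in haploid strains corresponds to the kind of count that is compared with randomized networks in Fig 4B.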