calculate_kegg_enrichment function

Perform KEGG pathway enrichment analysis

Analyses enrichment of KEGG pathways associated with proteins in the fraction of significant proteins compared to all detected proteins. A Fisher's exact test is performed to test significance of enrichment.
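To make the test concrete, here is a minimal sketch of a Fisher's exact test for a single pathway using base R's fisher.test. All counts are invented for illustration, and the variable names are hypothetical. The function's internal implementation may differ in its details.

```r
# Hypothetical counts for one pathway (illustrative values only)
n_detected       <- 1000 # all detected proteins
n_significant    <- 100  # significant proteins among them
n_in_pathway     <- 50   # detected proteins annotated to the pathway
n_sig_in_pathway <- 15   # significant proteins annotated to the pathway

# 2 x 2 contingency table: significance (rows) vs. pathway membership (columns)
contingency <- matrix(
  c(
    n_sig_in_pathway,                 # significant, in pathway
    n_in_pathway - n_sig_in_pathway,  # not significant, in pathway
    n_significant - n_sig_in_pathway, # significant, not in pathway
    n_detected - n_in_pathway - (n_significant - n_sig_in_pathway) # neither
  ),
  nrow = 2,
  dimnames = list(
    significant = c("yes", "no"),
    in_pathway = c("yes", "no")
  )
)

# Two-sided Fisher's exact test on the table
result <- fisher.test(contingency)

result$p.value  # significance of the association
result$estimate # odds ratio; a value above 1 points towards enrichment
```

With these counts, 5 significant proteins would be expected in the pathway by chance (50 * 100 / 1000), so the 15 observed correspond to an enrichment.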

calculate_kegg_enrichment(
  data,
  protein_id,
  is_significant,
  pathway_id = pathway_id,
  pathway_name = pathway_name,
  plot = TRUE,
  plot_cutoff = "adj_pval top10"
)

Arguments

  • data: a data frame that contains at least the input variables.
  • protein_id: a character column in the data data frame that contains the protein accession numbers.
  • is_significant: a logical column in the data data frame that indicates if the corresponding protein has a significantly changing peptide. The input data frame may contain peptide level information with significance information. The function is able to extract protein level information from this.
  • pathway_id: a character column in the data data frame that contains KEGG pathway identifiers. These can be obtained from KEGG using fetch_kegg.
  • pathway_name: a character column in the data data frame that contains KEGG pathway names. These can be obtained from KEGG using fetch_kegg.
  • plot: a logical value indicating whether the result should be plotted or returned as a table.
  • plot_cutoff: a character value indicating if the plot should contain the top 10 most significant pathways (by p-value or adjusted p-value), or if a significance cutoff should be used to determine the number of pathways in the plot. Provide the type first, followed by the threshold, separated by a space. Examples are plot_cutoff = "adj_pval top10", plot_cutoff = "pval 0.05" or plot_cutoff = "adj_pval 0.01". The threshold can be chosen freely.
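The plot_cutoff format (type, a space, then threshold) can be illustrated with a short sketch. The helper parse_plot_cutoff below is hypothetical and not part of the package. It only shows how such a string could be split, and it assumes that a "top" prefix marks a rank-based cutoff while a bare number marks a significance cutoff.

```r
# Hypothetical helper (not part of the package): split a plot_cutoff
# string such as "adj_pval top10" into its type and threshold parts.
parse_plot_cutoff <- function(plot_cutoff) {
  parts <- strsplit(plot_cutoff, " ", fixed = TRUE)[[1]]

  # Expect exactly two parts, with a known p-value type first
  stopifnot(length(parts) == 2, parts[1] %in% c("pval", "adj_pval"))

  # Assumption: a "top" prefix means "keep the N most significant entries"
  is_top <- grepl("^top", parts[2])

  list(
    type = parts[1],
    mode = if (is_top) "top" else "threshold",
    value = as.numeric(sub("^top", "", parts[2]))
  )
}

top_cutoff <- parse_plot_cutoff("adj_pval top10")
pval_cutoff <- parse_plot_cutoff("pval 0.05")

str(top_cutoff)
str(pval_cutoff)
```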

Returns

A bar plot displaying negative log10 adjusted p-values for the enriched pathways selected by plot_cutoff (the top 10 by default). Bars are coloured according to the direction of the enrichment. If plot = FALSE, a data frame is returned.
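The quantity shown on the bars can be sketched with base R. The p-values below are invented, and the Benjamini-Hochberg correction is an assumption made for illustration. The correction method the function actually applies is not stated here.

```r
# Hypothetical raw p-values for five pathways (illustrative values only)
pval <- c(0.0002, 0.004, 0.03, 0.2, 0.6)

# Multiple testing correction; Benjamini-Hochberg is an assumption here
adj_pval <- p.adjust(pval, method = "BH")

# Bar heights: negative log10 of the adjusted p-value,
# so more significant pathways get longer bars
neg_log10_adj_pval <- -log10(adj_pval)

data.frame(pval, adj_pval, neg_log10_adj_pval)
```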

Examples

# Load libraries
library(dplyr)

set.seed(123) # Makes example reproducible

# Create example data
kegg_data <- fetch_kegg(species = "eco")

if (!is.null(kegg_data)) { # only proceed if information was retrieved
  # Assign one random significance value per protein and
  # repeat it for every pathway row of that protein
  data <- kegg_data %>%
    group_by(uniprot_id) %>%
    mutate(significant = rep(
      sample(
        x = c(TRUE, FALSE),
        size = 1,
        replace = TRUE,
        prob = c(0.2, 0.8)
      ),
      times = n()
    ))

  # Plot KEGG enrichment
  calculate_kegg_enrichment(
    data,
    protein_id = uniprot_id,
    is_significant = significant,
    pathway_id = pathway_id,
    pathway_name = pathway_name,
    plot = TRUE,
    plot_cutoff = "pval 0.05"
  )

  # Calculate KEGG enrichment
  kegg <- calculate_kegg_enrichment(
    data,
    protein_id = uniprot_id,
    is_significant = significant,
    pathway_id = pathway_id,
    pathway_name = pathway_name,
    plot = FALSE
  )

  head(kegg, n = 10)
}
  • Maintainer: Jan-Philipp Quast
  • License: MIT + file LICENSE
  • Last published: 2024-10-21