Analyses whether KEGG pathways are enriched among significant proteins compared to all detected proteins. Fisher's exact test is used to assess the significance of the enrichment.
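To illustrate the kind of test described above, the following is a minimal sketch of Fisher's exact test for a single pathway using base R. The counts are made up for illustration, and the exact way the function builds its contingency table internally is an assumption here, not something stated in this documentation.

```r
# Hypothetical counts for one pathway (illustrative only, not real data)
n_detected <- 1000 # all detected proteins
n_significant <- 200 # significant proteins among them
n_in_pathway <- 50 # detected proteins annotated to the pathway
n_sig_in_pathway <- 20 # significant proteins annotated to the pathway

# 2x2 contingency table: pathway membership (rows) vs. significance (columns).
# matrix() fills column by column, so the first two values form the
# "significant" column and the last two the "not_significant" column.
contingency <- matrix(
  c(
    n_sig_in_pathway,
    n_significant - n_sig_in_pathway,
    n_in_pathway - n_sig_in_pathway,
    n_detected - n_significant - (n_in_pathway - n_sig_in_pathway)
  ),
  nrow = 2,
  dimnames = list(
    pathway = c("in_pathway", "not_in_pathway"),
    significance = c("significant", "not_significant")
  )
)

# Two-sided Fisher's exact test (the default alternative of fisher.test)
result <- fisher.test(contingency)

result$p.value # p-value of the enrichment or depletion
result$estimate # odds ratio; a value above 1 suggests enrichment
```

In an enrichment analysis a test of this kind would be repeated for every pathway, and the resulting p-values would then typically be adjusted for multiple testing, for example with p.adjust().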
data: a data frame that contains at least the input variables.
protein_id: a character column in the data data frame that contains the protein accession numbers.
is_significant: a logical column in the data data frame that indicates if the corresponding protein has a significantly changing peptide. The input data frame may contain peptide level information with significance information. The function is able to extract protein level information from this.
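As a rough sketch of what extracting protein level information from peptide level data can look like, the example below marks a protein as significant if at least one of its peptides is significant. The toy data frame and its column names are hypothetical, and whether the function uses exactly this rule internally is an assumption.

```r
# Hypothetical peptide level data: several peptides per protein
peptide_data <- data.frame(
  protein_id = c("P1", "P1", "P2", "P2", "P3"),
  peptide = c("a", "b", "c", "d", "e"),
  significant = c(FALSE, TRUE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Collapse to one row per protein: TRUE if any peptide of the protein is TRUE
protein_data <- aggregate(
  significant ~ protein_id,
  data = peptide_data,
  FUN = any
)

protein_data
```

With this rule, P1 and P3 count as significant proteins while P2 does not.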
pathway_id: a character column in the data data frame that contains KEGG pathway identifiers. These can be obtained from KEGG using fetch_kegg.
pathway_name: a character column in the data data frame that contains KEGG pathway names. These can be obtained from KEGG using fetch_kegg.
plot: a logical value indicating whether the result should be plotted or returned as a table.
plot_cutoff: a character value indicating whether the plot should contain the top 10 most significant pathways (by p-value or adjusted p-value), or whether a significance cutoff should be used to determine the number of pathways in the plot. This information should be provided as the type followed by the threshold, separated by a space. Examples are plot_cutoff = "adj_pval top10", plot_cutoff = "pval 0.05" or plot_cutoff = "adj_pval 0.01". The threshold can be chosen freely.
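The "type, space, threshold" format can be made concrete with a small sketch. The helper parse_plot_cutoff() below is hypothetical and only illustrates how such a string splits into its two parts; it is not part of the documented interface, and the function's own parsing may differ.

```r
# Hypothetical helper: split a plot_cutoff string into type and threshold
parse_plot_cutoff <- function(plot_cutoff) {
  parts <- strsplit(plot_cutoff, " ", fixed = TRUE)[[1]]

  # Expect exactly two parts, the first being the p-value type
  stopifnot(
    length(parts) == 2,
    parts[1] %in% c("pval", "adj_pval")
  )

  list(type = parts[1], threshold = parts[2])
}

top <- parse_plot_cutoff("adj_pval top10")
cutoff <- parse_plot_cutoff("pval 0.05")

top$type # "adj_pval"
top$threshold # "top10"
as.numeric(cutoff$threshold) # 0.05
```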
Returns
A bar plot displaying negative log10 adjusted p-values for the top 10 enriched pathways. Bars are coloured according to the direction of the enrichment. If plot = FALSE, a data frame is returned.
Examples
# Load libraries
library(dplyr)

set.seed(123) # Makes example reproducible

# Create example data
kegg_data <- fetch_kegg(species = "eco")

if (!is.null(kegg_data)) { # only proceed if information was retrieved
  data <- kegg_data %>%
    group_by(uniprot_id) %>%
    mutate(significant = rep(
      sample(
        x = c(TRUE, FALSE),
        size = 1,
        replace = TRUE,
        prob = c(0.2, 0.8)
      ),
      n = n()
    ))

  # Plot KEGG enrichment
  calculate_kegg_enrichment(
    data,
    protein_id = uniprot_id,
    is_significant = significant,
    pathway_id = pathway_id,
    pathway_name = pathway_name,
    plot = TRUE,
    plot_cutoff = "pval 0.05"
  )

  # Calculate KEGG enrichment
  kegg <- calculate_kegg_enrichment(
    data,
    protein_id = uniprot_id,
    is_significant = significant,
    pathway_id = pathway_id,
    pathway_name = pathway_name,
    plot = FALSE
  )

  head(kegg, n = 10)
}