net4pg R package [Documentation]

cc_composition

Get peptides and peptide-to-protein mappings for each connected compon...

cc_stats

Provide statistics on the CCs size

get_adj_matrix

Generate adjacency matrix

get_cc

Generate graph and calculate its connected components

net4pg-package

net4pg: Handle Ambiguity of Protein Identifications from Shotgun Prote...

peptide_stats

Calculate percentage of shared vs specific peptides

plot_cc

Plot peptide-to-protein mapping graph

read_inc_matrix

Read incidence matrix of proteomic identifications

reduce_inc_matrix

Reduce size of incidence matrix for downstream analyses

transcriptome_filter

Perform transcriptome-informed post-hoc filtering


In shotgun proteomics, shared peptides (i.e., peptides that might originate from different proteins sharing homology or from different proteoforms due to alternative mRNA splicing, post-translational modifications, proteolytic cleavages, and/or allelic variants) represent a major source of ambiguity in protein identifications. The 'net4pg' package makes it possible to assess and handle this ambiguity. It implements methods for two main applications. First, it represents and quantifies ambiguity of protein identifications by means of graph connected components (CCs). In graph theory, CCs are the maximal subgraphs in which any two vertices are connected to each other by a path and no vertex is connected to any vertex outside the subgraph. Here, proteins sharing one or more peptides are gathered in the same CC (multi-protein CC), while unambiguous protein identifications constitute CCs with a single protein vertex (single-protein CCs). Therefore, the proportion of single-protein CCs and the size of multi-protein CCs can be used to measure the level of ambiguity of protein identifications. The package implements a strategy to efficiently calculate graph connected components on large datasets and to visually inspect them. Second, the 'net4pg' package exploits the increasing availability of matched transcriptomic and proteomic datasets to reduce ambiguity of protein identifications. More precisely, it implements a transcriptome-based filtering strategy that removes those proteins whose corresponding transcript is not expressed in the sample-matched transcriptome. The underlying assumption is that, according to the central dogma of biology, there can be no proteins without the corresponding transcript.
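The connected-component view of ambiguity can be illustrated with a few lines of base R. This is a minimal sketch on a made-up incidence matrix, not the package's own implementation: the matrix `incM`, the helper `protein_components`, and all peptide and protein names are hypothetical, and the dense breadth-first search used here is not the efficient strategy the package applies to large datasets.

```r
# Toy incidence matrix: rows are peptides, columns are proteins;
# a 1 means the peptide maps to that protein. All names are made up.
incM <- matrix(
  c(1, 0, 0, 0,
    1, 1, 0, 0,
    0, 1, 0, 0,
    0, 0, 1, 0,
    0, 0, 0, 1,
    0, 0, 0, 1),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("pep", 1:6), paste0("prot", 1:4))
)

# Protein-to-protein adjacency: two proteins are adjacent when they
# share at least one peptide.
adjM <- crossprod(incM) > 0

# Hypothetical helper: label each protein with the index of its
# connected component, using a breadth-first search.
protein_components <- function(adj) {
  n <- nrow(adj)
  label <- integer(n)            # 0 means "not visited yet"
  current <- 0L
  for (start in seq_len(n)) {
    if (label[start] != 0L) next
    current <- current + 1L
    label[start] <- current
    queue <- start
    while (length(queue) > 0) {
      v <- queue[1]
      queue <- queue[-1]
      # Unvisited neighbours of v join the current component
      nb <- which(adj[v, ] & label == 0L)
      label[nb] <- current
      queue <- c(queue, nb)
    }
  }
  stats::setNames(label, colnames(adj))
}

cc <- protein_components(adjM)

# Ambiguity summary: number of proteins per CC and the proportion of
# single-protein CCs (unambiguous identifications).
cc_sizes <- table(cc)
n_single <- sum(cc_sizes == 1)
prop_single <- n_single / length(cc_sizes)
```

In this toy example `prot1` and `prot2` share `pep2` and therefore fall in the same multi-protein CC, while `prot3` and `prot4` each form a single-protein CC.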
Most importantly, the package makes it possible to visually inspect the effect of the filtering on protein identifications and to quantify ambiguity before and after filtering by means of graph connected components. As such, it constitutes a reproducible and transparent method to exploit transcriptome information to enhance protein identifications. All methods implemented in the 'net4pg' package are fully described in Fancello and Burger (2022) <doi:10.1186/s13059-022-02701-2>.
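The idea behind transcriptome-informed filtering, and a before/after comparison, can be sketched in base R as follows. Everything here is an assumption made for illustration: the toy matrix, the `prot2tx` mapping, the `expressed_tx` vector, and the `shared_fraction` helper are hypothetical and do not reproduce the package's functions or input formats. For brevity the sketch measures ambiguity as the fraction of shared peptides, a simpler proxy than the connected components the package reports.

```r
# Toy incidence matrix (rows: peptides, columns: proteins); names are made up.
incM <- matrix(
  c(1, 1, 0,
    1, 0, 0,
    0, 1, 1,
    0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("pep", 1:4), c("protA", "protB", "protC"))
)

# Hypothetical protein-to-transcript mapping and the set of transcripts
# found expressed in the sample-matched transcriptome.
prot2tx <- c(protA = "txA", protB = "txB", protC = "txC")
expressed_tx <- c("txA", "txC")   # txB is not expressed

# Keep only proteins whose transcript is expressed.
keep <- prot2tx[colnames(incM)] %in% expressed_tx
filtered <- incM[, keep, drop = FALSE]

# Drop peptides that no longer map to any remaining protein.
filtered <- filtered[rowSums(filtered) > 0, , drop = FALSE]

# Hypothetical helper: fraction of peptides mapping to more than one protein.
shared_fraction <- function(m) mean(rowSums(m) > 1)

before <- shared_fraction(incM)
after <- shared_fraction(filtered)
```

Here removing `protB`, whose transcript is absent, turns `pep1` and `pep3` from shared into protein-specific peptides, so the remaining identifications become less ambiguous.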

Maintainer: Laura Fancello
License: GPL-3
Last published: 2025-12-14

net4pg 0.1.2 package
