biogram 1.6.3 package

N-Gram Analysis of Biological Sequences

add_1grams

Add 1-grams

as.data.frame.feature_test

Coerce feature_test object to a data frame

binarize

Binarize

biogram-package

biogram - analysis of biological sequences using n-grams

calc_criterion

Calculate value of criterion

calc_cs

Calculate Chi-squared-based measure

calc_ed

Calculate encoding distance

calc_ig

Calculate IG for single feature

calc_kl

Calculate KL divergence of features

calc_pi

Calculate partition index

calc_si

Compute similarity index

check_criterion

Check chosen criterion

cluster_reg_exp

Clustering of sequences based on regular expression

code_ngrams

Code n-grams

construct_ngrams

Construct and filter n-grams

count_multigrams

Detect and count multiple n-grams in sequences

count_ngrams

Count n-grams in sequences

count_specified

Count specified n-grams

count_total

Count total number of n-grams

create_encoding

Create encoding

create_feature_target

Create feature according to given contingency matrix

create_ngrams

Get all possible n-grams

criterion_distribution

criterion_distribution class

cut.feature_test

Categorize tested features

decode_ngrams

Decode n-grams

degenerate

Degenerate protein sequence

degenerate_ngrams

Degenerate n-grams

distr_crit

Compute criterion distribution

encoding2df

Convert encoding to data frame

fast_crosstable

2d cross-tabulation

feature_test

feature_test class

full2simple

Convert encoding from full to simple format

gap_ngrams

Gap n-grams

generate_sequence

Generate sequence

generate_single_region

Generate single region

generate_single_unigram

Generate single unigram

generate_unigrams

Generate unigrams

get_ngrams_ind

Get indices of n-grams

is_ngram

Validate n-gram

l2n

Convert letters to numbers

list2matrix

Convert list of sequences to matrix

n2l

Convert numbers to letters

ngrams2df

n-grams to data frame

plot.criterion_distribution

Plot criterion distribution

position_ngrams

Position n-grams

print.feature_test

Print tested features

read_fasta

Read FASTA files

regenerate

Regenerate n-grams

regional_param

regional_param class

seq2ngrams

Extract n-grams from sequence

simple2full

Convert encoding from simple to full format

summary.feature_test

Summarize tested features

table_ngrams

Tabulate n-grams

test_features

Permutation test for feature selection

validate_encoding

Validate encoding

write_encoding

Write encodings to a file

write_fasta

Write FASTA files

Tools for extraction and analysis of various n-grams (k-mers) derived from biological sequences (proteins or nucleic acids). Contains QuiPT (quick permutation test) for fast feature-filtering of the n-gram data.
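
To make the idea of n-gram (k-mer) extraction concrete, here is a minimal sketch in Python of counting contiguous n-grams across a set of sequences. It is only an illustration of the concept, not the package's implementation: the helper name `count_contiguous_ngrams` and the example sequences are hypothetical, and it ignores the gapped and position-specific n-grams suggested by the function list above.

```python
from collections import Counter


def count_contiguous_ngrams(sequences, n):
    """Count contiguous n-grams (k-mers) over all sequences.

    Hypothetical helper for illustration only; it is not part of biogram.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    counts = Counter()
    for seq in sequences:
        # Slide a window of width n along the sequence; sequences shorter
        # than n contribute nothing because the range is empty.
        for start in range(len(seq) - n + 1):
            counts[seq[start:start + n]] += 1
    return counts


# Made-up nucleotide sequences used purely as example input.
example_sequences = ["ACGTAC", "CGT"]
bigram_counts = count_contiguous_ngrams(example_sequences, 2)

for ngram, count in sorted(bigram_counts.items()):
    print(ngram, count)
```

A table of such counts, with one row per sequence instead of a pooled total, is the kind of feature matrix that a filtering step such as a permutation test would then operate on.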

  • Maintainer: Michal Burdukiewicz
  • License: GPL-3
  • Last published: 2020-03-31