biomartr 1.0.7 package

Genomic Data Retrieval

is.genome.available

Check Genome Availability

listDatabases

Retrieve a List of Available NCBI Databases for Download

biomart

Main BioMart Query Function

biomartr-package

Genomic Data Retrieval

cachedir

Get directory to store back end files like kingdom summaries etc.

cachedir_set

Set directory to store back end files like kingdom summaries etc.

check_annotation_biomartr

Check whether an annotation file contains outlier lines

download.database.all

Download all elements of an NCBI database

download.database

Download an NCBI Database to Your Local Hard Drive

ensembl_divisions

List all available ENSEMBL divisions

get.ensembl.info

Helper function to retrieve species information from the ENSEMBL API

getAssemblyStats

Genome Assembly Stats Retrieval

getAttributes

Retrieve All Available Attributes for a Specific Dataset

getBio

A wrapper to all bio getters, selected with 'type' argument

getBioSet

Generic Bio data set extractor

getCDS

Coding Sequence Retrieval

getCDSSet

CDS retrieval of multiple species

getCollection

Retrieve a Collection: Genome, Proteome, CDS, RNA, GFF, Repeat Masker,...

getCollectionSet

Retrieve a Collection: Genome, Proteome, CDS, RNA, GFF, Repeat Masker,...

getDatasets

Retrieve All Available Datasets for a BioMart Database

getENSEMBL.gtf

Helper function for retrieving gtf files from ENSEMBL

getENSEMBL

Download sequence or annotation from ENSEMBL

getENSEMBL.Seq

Helper function for retrieving biological sequence files from ENSEMBL

getENSEMBLGENOMESInfo

Retrieve ENSEMBLGENOMES info file

getENSEMBLInfo

Retrieve ENSEMBL info file

getFilters

Retrieve All Available Filters for a Specific Dataset

getGenome

Genome Retrieval

getGENOMEREPORT

Retrieve NCBI GENOME_REPORTS file

getGenomeSet

Genome Retrieval of multiple species

getGFF

Genome Annotation Retrieval (GFF3)

getGFFSet

GFF retrieval of multiple species

getGO

Gene Ontology Query

getGroups

Retrieve available groups for a kingdom of life (only available for NC...

getGTF

Genome Annotation Retrieval (GTF)

getKingdomAssemblySummary

Retrieve and summarise the assembly_summary.txt files from NCBI for al...

getKingdoms

Retrieve available kingdoms of life

getMarts

Retrieve information about available Ensembl Biomart databases

getMetaGenomeAnnotations

Retrieve annotation *.gff files for metagenomes from NCBI Genbank

getMetaGenomes

Retrieve metagenomes from NCBI Genbank

getMetaGenomeSummary

Retrieve the assembly_summary.txt file from NCBI genbank metagenomes

getProteome

Proteome Retrieval

getProteomeSet

Proteome retrieval of multiple species

getReleases

Retrieve available database releases or versions of ENSEMBL

getRepeatMasker

Repeat Masker Retrieval

getRNA

RNA Sequence Retrieval

getRNASet

RNA Retrieval of multiple species

getSummaryFile

Helper function to retrieve the assembly_summary.txt file from NCBI

getUniProtInfo

Get UniProt info for an organism

getUniProtSTATS

Retrieve UniProt Database Information File (STATS)

listGenomes

List All Available Genomes either by kingdom, group, or subgroup

listGroups

List number of available genomes in each taxonomic group

listKingdoms

List number of available genomes in each kingdom of life

listMetaGenomes

List available metagenomes on NCBI Genbank

meta.retrieval.all

Perform Meta-Genome Retrieval of all organisms in all kingdoms of life

meta.retrieval

Perform Meta-Genome Retrieval

organismAttributes

Retrieve Ensembl Biomart attributes for a query organism

organismBM

Retrieve Ensembl Biomart marts and datasets for a query organism

organismFilters

Retrieve Ensembl Biomart filters for a query organism

read_assemblystats

Import Genome Assembly Stats File

read_cds

Import CDS as Biostrings or data.table object

read_genome

Import Genome Assembly as Biostrings or data.table object

read_gff

Import GFF File

read_proteome

Import Proteome as Biostrings or data.table object

read_rm

Import Repeat Masker output file

read_rna

Import RNA as Biostrings or data.table object

refseqOrganisms

Retrieve All Organism Names Stored on refseq

summary_cds

Retrieve summary statistics for a coding sequence (CDS) file

summary_genome

Retrieve summary statistics for a genome assembly file

Perform large scale genomic data retrieval and functional annotation retrieval. This package aims to provide users with a standardized way to automate genome, proteome, 'RNA', coding sequence ('CDS'), 'GFF', and metagenome retrieval from 'NCBI RefSeq', 'NCBI Genbank', 'ENSEMBL', and 'UniProt' databases. Furthermore, an interface to the 'BioMart' database (Smedley et al. (2009) <doi:10.1186/1471-2164-10-22>) allows users to retrieve functional annotation for genomic loci. In addition, users can download entire databases such as 'NCBI RefSeq' (Pruitt et al. (2007) <doi:10.1093/nar/gkl842>), 'NCBI nr', 'NCBI nt', 'NCBI Genbank' (Benson et al. (2013) <doi:10.1093/nar/gks1195>), etc. with only one command.

  • Maintainer: Hajk-Georg Drost
  • License: GPL-2
  • Last published: 2023-12-02
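
The retrieval workflow summarised in the package description can be illustrated with a minimal sketch that chains three of the functions listed above: is.genome.available, getGenome, and read_genome. The helper name fetch_genome_sketch, the default organism, and the temporary download path are assumptions made for illustration and are not part of biomartr. Calling the helper needs network access and, for the import step, an installed Biostrings package.

```r
# Hypothetical helper (not part of biomartr): check availability, download,
# and import one genome assembly. Requires network access when called.
fetch_genome_sketch <- function(organism = "Saccharomyces cerevisiae",
                                db = "refseq",
                                path = file.path(tempdir(), "genomes")) {
  if (!requireNamespace("biomartr", quietly = TRUE)) {
    stop("Package 'biomartr' is required; install it from CRAN first.")
  }
  # Ask whether the organism is available in the chosen database.
  if (!isTRUE(biomartr::is.genome.available(db = db, organism = organism))) {
    stop("No genome found for '", organism, "' in database '", db, "'.")
  }
  # Download the assembly; getGenome() returns the path of the local file.
  genome_file <- biomartr::getGenome(db = db, organism = organism, path = path)
  # Import the downloaded FASTA file (a Biostrings object by default).
  biomartr::read_genome(file = genome_file)
}
```

A call such as fetch_genome_sketch("Saccharomyces cerevisiae") would then return the imported assembly. Nothing is downloaded until the function is called, so defining it is safe offline.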