BigDataStatMeth 1.0.3 package

Tools and Infrastructure for Developing 'Scalable' 'HDF5'-Based Methods

bd_wproduct

Weighted matrix–vector products and cross-products

bdapply_Function_hdf5

Apply function to different datasets inside a group

bdBind_hdf5_datasets

Bind matrices by rows or columns

bdblockmult_hdf5

HDF5 dataset multiplication

bdblockmult_sparse_hdf5

Block matrix multiplication for sparse matrices

bdblockMult

Block-Based Matrix Multiplication

bdblockSubstract_hdf5

HDF5 dataset subtraction

bdblockSubstract

Block-Based Matrix Subtraction

bdblockSum_hdf5

HDF5 dataset addition

bdblockSum

Block-Based Matrix Addition

bdCheckMatrix_hdf5

Check Matrix Suitability for Eigenvalue Decomposition with Spectra

bdCholesky_hdf5

Cholesky Decomposition for HDF5-Stored Matrices

bdcomputeMatrixVector_hdf5

Apply Vector Operations to HDF5 Matrix

bdCorr_hdf5

Compute correlation matrix for matrices stored in HDF5 format

bdCorr_matrix

Compute correlation matrix for in-memory matrices (unified function)

bdCreate_diagonal_hdf5

Create Diagonal Matrix or Vector in HDF5 File

bdCreate_hdf5_emptyDataset

Create an empty HDF5 dataset (no data written)

bdCreate_hdf5_group

Create Group in an HDF5 File

bdCreate_hdf5_matrix

Create HDF5 data file and write data to it

bdCrossprod_hdf5

Cross-product with HDF5 matrix

bdCrossprod

Efficient Matrix Cross-Product Computation

bdDiag_add_hdf5

Add Diagonal Elements from HDF5 Matrices or Vectors

bdDiag_divide_hdf5

Divide Diagonal Elements from HDF5 Matrices or Vectors

bdDiag_multiply_hdf5

Multiply Diagonal Elements from HDF5 Matrices or Vectors

bdDiag_scalar_hdf5

Apply Scalar Operations to Diagonal Elements

bdDiag_subtract_hdf5

Subtract Diagonal Elements from HDF5 Matrices or Vectors

bdEigen_hdf5

Eigenvalue Decomposition for HDF5-Stored Matrices using Spectra

bdgetDatasetsList_hdf5

List Datasets in HDF5 Group

bdgetDiagonal_hdf5

Get Matrix Diagonal from HDF5

bdgetDim_hdf5

Get HDF5 Dataset Dimensions

bdgetSDandMean_hdf5

Compute Matrix Standard Deviation and Mean in HDF5

bdImportData_hdf5

Import data from URL or file to HDF5 format

bdImportTextFile_hdf5

Import Text File to HDF5

bdImputeSNPs_hdf5

Impute Missing SNP Values in HDF5 Dataset

bdInvCholesky_hdf5

Matrix Inversion using Cholesky Decomposition for HDF5-Stored Matrices

bdIsLocked_hdf5

Test whether an HDF5 file is locked (in use)

bdmove_hdf5_dataset

Move HDF5 Dataset

bdNormalize_hdf5

Normalize dataset in HDF5 file

bdPCA_hdf5

Principal Component Analysis for HDF5-Stored Matrices

bdpseudoinv_hdf5

Compute Matrix Pseudoinverse (HDF5-Stored)

bdpseudoinv

Compute Matrix Pseudoinverse (In-Memory)

bdQR_hdf5

QR Decomposition for HDF5-Stored Matrices

bdQR

QR Decomposition for In-Memory Matrices

bdReduce_hdf5_dataset

Reduce Multiple HDF5 Datasets

bdRemove_hdf5_element

Remove Elements from HDF5 File

bdRemovelowdata_hdf5

Remove Low-Representation SNPs from HDF5 Dataset

bdRemoveMAF_hdf5

Remove SNPs Based on Minor Allele Frequency

bdScalarwproduct

Matrix–scalar weighted product

bdSolve_hdf5

Solve Linear System AX = B (HDF5-Stored)

bdSolve

Solve Linear System AX = B (In-Memory)

bdSort_hdf5_dataset

Sort HDF5 Dataset Using Predefined Order

bdSplit_matrix_hdf5

Split HDF5 Dataset into Submatrices

bdsubset_hdf5_dataset

Create Subset of HDF5 Dataset

bdSVD_hdf5

Singular Value Decomposition for HDF5-Stored Matrices

bdtCrossprod_hdf5

Transposed cross product with HDF5 matrices

bdtCrossprod

Efficient Matrix Transposed Cross-Product Computation

bdWrite_hdf5_dimnames

Write dimnames to an HDF5 dataset

bdWriteDiagonal_hdf5

Write Matrix Diagonal to HDF5

bdWriteOppsiteTriangularMatrix_hdf5

Write Upper/Lower Triangular Matrix

BigDataStatMeth

BigDataStatMeth package documentation

A framework for 'scalable' statistical computing on large on-disk matrices stored in 'HDF5' files. It provides efficient block-wise implementations of core linear-algebra operations (matrix multiplication, SVD, PCA, QR decomposition, and canonical correlation analysis) written in C++ and R. These building blocks are designed not only for direct use, but also as foundational components for developing new statistical methods that must operate on datasets too large to fit in memory. The package supports data provided either as 'HDF5' files or standard R objects, and is intended for high-dimensional applications such as 'omics' and precision-medicine research.
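To make the block-wise idea concrete, here is a minimal base-R sketch of a blocked matrix product. It is not the package's implementation and does not call any BigDataStatMeth function: `block_matmul()` and its `block_size` argument are hypothetical names used only for illustration, and the operands are ordinary in-memory matrices rather than 'HDF5' datasets. The point is only that each step touches one small block of each operand, which is what allows an on-disk version to keep memory use bounded.

```r
# Illustrative sketch only: block_matmul() and block_size are hypothetical
# names, not part of BigDataStatMeth. Computes C = A %*% B one block at a time.
block_matmul <- function(A, B, block_size = 2L) {
  stopifnot(ncol(A) == nrow(B), block_size >= 1L)

  C <- matrix(0, nrow = nrow(A), ncol = ncol(B))

  # Starting index of each block along the three dimensions involved.
  row_starts   <- seq(1L, nrow(A), by = block_size)
  col_starts   <- seq(1L, ncol(B), by = block_size)
  inner_starts <- seq(1L, ncol(A), by = block_size)

  for (i in row_starts) {
    rows <- i:min(i + block_size - 1L, nrow(A))
    for (j in col_starts) {
      cols <- j:min(j + block_size - 1L, ncol(B))
      for (k in inner_starts) {
        inner <- k:min(k + block_size - 1L, ncol(A))
        # Accumulate the contribution of one block pair; an on-disk version
        # would read only these two blocks from storage at this point.
        C[rows, cols] <- C[rows, cols, drop = FALSE] +
          A[rows, inner, drop = FALSE] %*% B[inner, cols, drop = FALSE]
      }
    }
  }
  C
}

# Usage: dimensions chosen so the last block in each direction is partial.
set.seed(1)
A <- matrix(rnorm(5 * 7), nrow = 5)
B <- matrix(rnorm(7 * 3), nrow = 7)
C_blocked <- block_matmul(A, B, block_size = 2L)
```

The blocked result should agree with `A %*% B` up to floating-point rounding, since the blocks only change the order in which partial sums are accumulated.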

  • Maintainer: Dolors Pelegri-Siso
  • License: MIT + file LICENSE
  • Last published: 2025-12-22