bigPLSR 0.7.2 package

Partial Least Squares Regression Models with Big Matrices

bigPLSR_stream_kstats

Streamed centering statistics for RKHS kernels

bigPLSR-package

bigPLSR-package

cpp_irls_binomial

Fast IRLS for binomial logit with class weights

cpp_kernel_pls

Internal kernel and wide-kernel PLS solver

dot-finalize_pls_fit

Finalize pls objects

dot-resolve_training_ref

Internal: resolve training reference for RKHS predictions

kf_pls_state_fit

Finalize a KF-PLS state into a fitted model

kf_pls_state_new

KF-PLS streaming state (constructor)

kf_pls_state_update

Update a KF-PLS streaming state with a mini-batch

plot_pls_biplot

PLS biplot

plot_pls_bootstrap_coefficients

Boxplots of bootstrap coefficient distributions

plot_pls_bootstrap_scores

Boxplots of bootstrap score distributions

plot_pls_individuals

Plot individual scores

plot_pls_variables

Plot variable loadings

plot_pls_vip

Plot Variable Importance in Projection (VIP)

pls_bootstrap

Bootstrap a PLS model

pls_cross_validate

Cross-validate PLS models

pls_cv_select

Select components from cross-validation results

pls_fit

Unified PLS fit with auto backend and selectable algorithm

pls_information_criteria

Compute information criteria for component selection

pls_predict_response

Predict responses from a PLS fit

pls_predict_scores

Predict latent scores from a PLS fit

pls_select_components

Component selection via information criteria

pls_threshold

Naive sparsity control by coefficient thresholding

pls_vip

Variable importance in projection (VIP) scores

predict.big_plsr

Predict method for big_plsr objects

print.summary.big_plsr

Print a summary.big_plsr object

summarise_pls_bootstrap

Summarise bootstrap estimates

summary.big_plsr

Summarize a big_plsr model

Fast partial least squares (PLS) for dense and out-of-core data. Provides SIMPLS (straightforward implementation of a statistically inspired modification of the PLS method) and NIPALS (non-linear iterative partial least squares) solvers, plus kernel-style PLS variants ('kernelpls' and 'widekernelpls') with parity to 'pls'. Optimized for 'bigmemory'-backed matrices with streamed cross-products and chunked BLAS (Basic Linear Algebra Subprograms) operations (XtX/XtY and XXt/YX), optional file-backed score sinks, and deterministic testing helpers. Includes an auto-selection strategy that chooses between XtX SIMPLS, XXt (wide) SIMPLS, and NIPALS based on (n, p) and a configurable memory budget.

For background on the package, see Bertrand and Maumy (2023) <https://hal.science/hal-05352069> and <https://hal.science/hal-05352061>, who highlighted fitting and cross-validating PLS regression models on big data. For more details about some of the techniques featured in the package, see Dayal and MacGregor (1997) <doi:10.1002/(SICI)1099-128X(199701)11:1%3C73::AID-CEM435%3E3.0.CO;2-%23>, Rosipal and Trejo (2001) <https://www.jmlr.org/papers/v2/rosipal01a.html>, Tenenhaus, Viennet, and Saporta (2007) <doi:10.1016/j.csda.2007.01.004>, Rosipal (2004) <doi:10.1007/978-3-540-45167-9_17>, Rosipal (2019) <https://ieeexplore.ieee.org/document/8616346>, and Song, Wang, and Bai (2024) <doi:10.1016/j.chemolab.2024.105238>.

Also includes kernel logistic PLS with 'C++'-accelerated alternating iteratively reweighted least squares (IRLS) updates, streamed reproducing kernel Hilbert space (RKHS) solvers with reusable centering statistics, and bootstrap diagnostics with graphical summaries for coefficients, scores, and cross-validation workflows, alongside dedicated plotting utilities for individuals, variables, ellipses, and biplots.

The streaming backend uses far less memory than the dense in-memory solvers and keeps its footprint bounded as the data grow. For PLS1, streaming is often fast enough while preserving a small memory footprint; for PLS2 it remains competitive with a bounded footprint. On small problems that fit comfortably in RAM (random-access memory), dense in-memory solvers are slightly faster; the crossover occurs as n or p grow and the Gram/cross-product cost dominates.
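
The streamed cross-product idea mentioned in the description can be illustrated with a minimal sketch. It is plain Python for exposition only, not the package's R or 'C++' code, and the names `streamed_cross_products`, `first_pls1_weight`, and `row_chunks` are hypothetical. It assumes X and y are already centered and that rows arrive in blocks; only the p x p matrix XtX and the length-p vector XtY are kept in memory, so the footprint depends on p and the chunk size rather than on n.

```python
from math import sqrt
from typing import Iterable, Iterator, List, Sequence, Tuple

Chunk = Tuple[Sequence[Sequence[float]], Sequence[float]]


def streamed_cross_products(
    chunks: Iterable[Chunk], p: int
) -> Tuple[List[List[float]], List[float], int]:
    """Accumulate X'X (p x p) and X'y (length p) one row block at a time."""
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    n = 0
    for x_chunk, y_chunk in chunks:
        for row, y_val in zip(x_chunk, y_chunk):
            for j in range(p):
                xty[j] += row[j] * y_val
                for k in range(p):
                    xtx[j][k] += row[j] * row[k]
            n += 1
    return xtx, xty, n


def first_pls1_weight(xty: Sequence[float]) -> List[float]:
    """Unit-norm first PLS1 weight vector, which is proportional to X'y."""
    norm = sqrt(sum(v * v for v in xty))
    if norm == 0.0:
        raise ValueError("X'y is zero; no PLS direction can be extracted")
    return [v / norm for v in xty]


def row_chunks(
    x: Sequence[Sequence[float]], y: Sequence[float], size: int
) -> Iterator[Chunk]:
    """Yield (X rows, y values) blocks of at most `size` rows."""
    for start in range(0, len(x), size):
        yield x[start:start + size], y[start:start + size]


# Toy centered data (every column of X and y sums to zero).
X = [[-3.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [3.0, 1.0]]
Y = [-1.0, 1.0, -1.0, 1.0]

# Two rows per block here; in an out-of-core setting each block would be
# read from disk and discarded after it has been accumulated.
xtx, xty, n = streamed_cross_products(row_chunks(X, Y, size=2), p=2)
w = first_pls1_weight(xty)
```

Because the accumulators are plain sums over rows, the result does not depend on how the rows are split into blocks. That is why a streaming backend can trade some speed for a bounded memory footprint.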

  • Maintainer: Frederic Bertrand
  • License: GPL-3
  • Last published: 2025-12-01