bioLeak 0.1.0 package

Leakage-Safe Modeling and Auditing for Genomic and Clinical Data

as_rsample

Convert LeakSplits to an rsample resample set

audit_leakage_by_learner

Audit leakage per learner

audit_leakage

Audit leakage and confounding

audit_report

Render an HTML audit report

calibration_summary

Calibration diagnostics for binomial predictions

confounder_sensitivity

Confounder sensitivity summaries

dot-circular_block_permute

Circular block permutation indices

dot-guard_ensure_levels

Ensure consistent categorical levels for guarded preprocessing

dot-guard_fit

Fit leakage-safe preprocessing pipeline

dot-permute_labels_factory

Restricted permutation label factory

dot-quantile_break_cache

Quantile break cache for permutation stratification

dot-stationary_bootstrap

Stationary bootstrap indices

fit_resample

Fit and evaluate with leakage guards over predefined splits

impute_guarded

Leakage-safe data imputation via guarded preprocessing

LeakClasses

S4 Classes for bioLeak Pipeline

make_split_plan

Create leakage-resistant splits

plot_calibration

Plot calibration curve for binomial predictions

plot_confounder_sensitivity

Plot confounder sensitivity

plot_fold_balance

Plot fold balance of class counts per fold

plot_overlap_checks

Plot overlap diagnostics between train/test groups

plot_perm_distribution

Plot permutation distribution for a LeakAudit object

plot_time_acf

Plot ACF of test predictions for time-series leakage checks

predict_guard

Apply a fitted GuardFit transformer to new data

show-LeakSplits-method

Display summary for LeakSplits objects

simulate_leakage_suite

Simulate leakage scenarios and audit results

summary.LeakAudit

Summarize a leakage audit

summary.LeakFit

Summarize a LeakFit object

summary.LeakTune

Summarize a nested tuning result

tune_resample

Leakage-aware nested tuning with tidymodels

Prevents and detects information leakage in biomedical machine learning. Provides leakage-resistant split policies (subject-grouped, batch-blocked, study leave-out, time-ordered), guarded preprocessing (train-only imputation, normalization, filtering, feature selection), cross-validated fitting with common learners, permutation-gap auditing, batch and fold association tests, and duplicate detection.
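
Two of the ideas named above, subject-grouped splitting and train-only normalization, can be illustrated without the package itself. The sketch below uses base R only and does not call any bioLeak function. The toy data, the helper names `grouped_folds` and `scale_train_only`, and the choice of four folds are hypothetical and serve only to show the general technique, not how the package implements it.

```r
# Illustrative base-R sketch; none of this is bioLeak's API.
set.seed(1)

# Hypothetical toy data: 12 subjects with 3 samples each.
dat <- data.frame(
  subject = rep(sprintf("S%02d", 1:12), each = 3),
  x       = rnorm(36, mean = 5, sd = 2)
)

# Assign whole subjects to folds so that no subject spans train and test.
grouped_folds <- function(groups, v = 4) {
  ids <- unique(groups)
  fold_of_id <- sample(rep_len(seq_len(v), length(ids)))
  names(fold_of_id) <- ids
  unname(fold_of_id[as.character(groups)])
}

# Estimate center and scale on the training rows only,
# then apply the same transform to every row.
scale_train_only <- function(x, train_idx) {
  mu  <- mean(x[train_idx])
  sdv <- sd(x[train_idx])
  (x - mu) / sdv
}

fold <- grouped_folds(dat$subject, v = 4)

for (k in sort(unique(fold))) {
  train_idx <- which(fold != k)
  test_idx  <- which(fold == k)

  # Check that no subject appears on both sides of the split.
  overlap <- intersect(dat$subject[train_idx], dat$subject[test_idx])
  stopifnot(length(overlap) == 0)

  # Test rows are transformed with training statistics only.
  x_scaled <- scale_train_only(dat$x, train_idx)
}
```

Estimating the mean and standard deviation on all rows instead would let test samples influence the training features, which is the kind of leakage that guarded preprocessing is meant to rule out.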

  • Maintainer: Selcuk Korkmaz
  • License: MIT + file LICENSE
  • Last published: 2026-02-06