Ultra-Fast Analysis of Sparse DNA Methylome via Recurrent Pattern Encoding
Produce confidence score based on top 95 percent for XGBoost predictio...
Produce confidence score for XGBoost prediction
Filter final prediction to reduce noise
Generate pattern level data for cell type annotation
Generate reference pattern labels (no default writing)
Impute missing values for the 100K window matrix
Train XGBoost model to predict cell type
Estimate relative cell-type proportions
Generate confusion table for the final prediction
Generate F1 score barplot for each class
Generate UMAP for the final prediction based on fixed windows, e.g. 100 kb ...
Generate UMAP for the final prediction based on cell patterns
Predict cell type annotation from the trained model
Smooth the cell-by-pattern matrix to reduce noise
Methods for analyzing DNA methylation data via Most Recurrent Methylation Patterns (MRMPs). Supports cell-type annotation, spatial deconvolution, unsupervised clustering, and cancer cell-of-origin inference. Includes C-backed summaries for YAME “.cg/.cm” files (overlap counts, log2 odds ratios, beta/depth aggregation), an XGBoost classifier, NNLS deconvolution, and plotting utilities. Scales to large spatial and single-cell methylomes and is robust to extreme sparsity.
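The overlap counts and log2 odds ratios mentioned above can be illustrated with a small 2x2 enrichment calculation. This is a minimal sketch in Python, not the package's C-backed implementation: the function names `overlap_table` and `log2_odds_ratio`, the use of plain sets of CpG indices, and the 0.5 pseudocount (added to every cell so that empty cells do not produce infinite values) are all assumptions made for illustration.

```python
import math


def overlap_table(query, feature, universe):
    """Count the four cells of a 2x2 overlap table.

    query, feature, universe are iterables of CpG identifiers
    (hypothetical representation; real data would come from packed files).
    """
    universe = set(universe)
    query = set(query) & universe
    feature = set(feature) & universe
    n_both = len(query & feature)
    n_query_only = len(query - feature)
    n_feature_only = len(feature - query)
    n_neither = len(universe) - n_both - n_query_only - n_feature_only
    return n_both, n_query_only, n_feature_only, n_neither


def log2_odds_ratio(n_both, n_query_only, n_feature_only, n_neither,
                    pseudocount=0.5):
    """log2 odds ratio of the table, with a pseudocount on every cell.

    The pseudocount is an assumption of this sketch, not a documented
    default of the package.
    """
    a = n_both + pseudocount
    b = n_query_only + pseudocount
    c = n_feature_only + pseudocount
    d = n_neither + pseudocount
    return math.log2((a * d) / (b * c))


# Toy example: 100 CpGs, a query set of 20 and a feature set of 30
# that share 10 CpGs.
counts = overlap_table(range(0, 20), range(10, 40), range(100))
enrichment = log2_odds_ratio(*counts)
```

A positive value indicates that the query set overlaps the feature more often than expected from the background, a negative value indicates depletion, and a value near zero indicates no association.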