Privacy-Preserving Synthetic Data for 'LLM' Workflows
Detect sensitive columns by name
Save a fake dataset to disk
Generate fake data from real dataset structure
Generate fake data from a DB schema data.frame
Generate a fake POSIXct column
Generate fake data with privacy controls
Create a copy-paste prompt for LLMs
Build an LLM bundle directly from a database table
Create a fake-data bundle for LLM workflows
Prepare Input Data: Coerce to data.frame and (optionally) normalize va...
Extract a table schema from a DB connection
Validate a fake dataset against the original
Zip a set of files for easy sharing
Generate privacy-preserving synthetic datasets that mirror structure, types, factor levels, and missingness; export bundles for 'LLM' workflows (data plus 'JSON' schema and guidance); and build fake data directly from 'SQL' database tables without reading real rows. Methods are related to approaches in Nowok, Raab and Dibben (2016) <doi:10.32614/RJ-2016-019> and the foundation-model overview by Bommasani et al. (2021) <doi:10.48550/arXiv.2108.07258>.
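To make the idea of "mirroring structure, types, factor levels, and missingness" concrete, here is a minimal base-R sketch. It is not the package's implementation or API: the helper name `fake_like` and its arguments are hypothetical, chosen only for illustration. It assumes columns are factor, logical, integer, double, or character; any other type (for example Date or POSIXct) would fall into the character branch and lose its class.

```r
# Hypothetical helper (NOT a function of this package): build a fake data frame
# that keeps column names, basic types, factor levels, and per-column NA rates.
fake_like <- function(df, n = nrow(df), seed = 1) {
  set.seed(seed)
  fake_col <- function(x) {
    na_rate <- mean(is.na(x))          # share of missing values to reproduce
    obs <- x[!is.na(x)]
    out <- if (is.factor(x)) {
      # draw from the declared levels, including unused ones
      factor(sample(levels(x), n, replace = TRUE), levels = levels(x))
    } else if (is.logical(x)) {
      sample(c(TRUE, FALSE), n, replace = TRUE)
    } else if (is.integer(x)) {
      if (length(obs)) {
        # uniform integers within the observed range
        min(obs) + sample.int(max(obs) - min(obs) + 1L, n, replace = TRUE) - 1L
      } else rep(NA_integer_, n)
    } else if (is.numeric(x)) {
      if (length(obs)) runif(n, min(obs), max(obs)) else rep(NA_real_, n)
    } else {
      # placeholder strings; no real values are reused
      paste0("value_", sample.int(n))
    }
    out[runif(n) < na_rate] <- NA      # inject NAs at roughly the original rate
    out
  }
  data.frame(lapply(df, fake_col), check.names = FALSE, stringsAsFactors = FALSE)
}

# Toy input standing in for a real dataset
real <- data.frame(
  age   = c(34L, 51L, NA, 28L),
  group = factor(c("a", "b", "a", NA), levels = c("a", "b", "c")),
  score = c(1.5, NA, 3.2, 4.8)
)
fake <- fake_like(real, n = 100)
str(fake)
```

Numeric values here are drawn uniformly from the observed range, which is a simplification: it preserves type and bounds but not the shape of the original distribution, and the observed minimum and maximum themselves may be sensitive in some settings.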