Toolkit for Credit Modeling, Analysis and Visualization
Logarithmic transformation
Loop Function: loop_function is an iterator to loop through
love_color
Filtering Low Variance Variables
Logistic Regression & Scorecard Parameters
Variance-Inflation Factors
Max Min Normalization
Merge Category
Min Max Normalization
Model result plots: model_result_plot is a wrapper of following: `per...
Arrange list of plots into a grid
multi_left_join
The length of a string.
Encode NAs
One-Hot Encoding
Outliers Detection: outliers_detection is for detecting outliers usin...
Entropy
prob to score
partial_dependence_plot
PCA Dimension Reduction
Plot Colors
add_variable_process
address_varieble
Missing Analysis
Outliers Analysis
Percent Format
auc_value: auc_value is for getting the best lambda required in lasso_filter....
Cramer's V matrix between categorical variables.
character to number
Checking Data
city_varieble
Processing of Address Variables
cohort_table_plot: cohort_table_plot is for plotting cohort (vintage) a...
Correlation Heat Plot
Correlation Plot
cos_sim
creditmodel: toolkit for credit modeling and data analysis
Customer Segmentation
Generating Initial Equal Size Sample Bins
Stratified Folds
Data Cleaning
Data Exploration
Date Time Cut Point
Recovery One-Hot Encoding
Recovery Percent Format
derived_interval
derived_partial_acf
derived_pct
Derivation of Behavioral Variables
Number of digits
Entropy Weight Method
Max Percent of Missing Value
euclid_dist
Functions of xgboost feval
high_cor_filter
Feature Selection Wrapper
Fuzzy Cluster means.
gather or aggregate data
Select Features using GBM
GBM Parameters
get_auc_ks_lambda: get_auc_ks_lambda is for getting the best lambda required ...
Table of Binning
Generates Best Breaks for Binning
get_correlation_group
Calculate Information Value (IV): get_iv is used to calculate Informa...
get logistic coef
get central value.
Get Variable Names
get_nas_random
Calculate Population Stability Index (PSI): get_psi is used to calcul...
Calculate IV & PSI
Plot PSI(Population Stability Index)
Score Card
get_shadow_nas
get_sim_sign_lambda: get_sim_sign_lambda is for getting the best lambda requi...
Getting the breaks for terminal nodes from decision tree
Get X List.
Fuzzy String matching
Fuzzy String matching
Compare the two highly correlated variables
is_date
Impute NAs using KNN
ks_table & plot
ks_value
Variable selection by LASSO
lift_value
local_outlier_factor: local_outlier_factor is a function for calculatin...
plot_oot_perf: plot_oot_perf is for plotting performance of cross time...
plot_table
plot_theme
pred_score
Missing Treatment
Outliers Treatment
Variable reduction based on Information Value & Population Stability I...
List as data.frame quickly
Ranking Percent Process
re_code: re_code searches for matches to argument pattern within each e...
Rename
Read data
Filtering highly correlated variables with reduce method
Remove Duplicated Observations
Replace Value
Packages required and installation
Random Forest Parameters
Functions for vector operation.
Save data
Score Transformation
Generates Best Binning Breaks
sim_str
split_bins
Split bins all
Automatic production of hive SQL
Parallel computing and export variables to global Env.
Stop parallel computing
String match: str_match searches for matches to argument pattern wit...
Summary table
TF-IDF
Process time series data
Time Format Transferring
time_variable
Processing of Time or Date Variables
tnr_value
Training LR model
Train-Test-Split
Training XGBoost
Training model
Process group numeric variables
variable_process
WOE Transformation
XGBoost data
Select Features using XGB
XGBoost Parameters
Provides a highly efficient R tool suite for credit modeling, analysis and visualization. It contains infrastructure functionality such as data exploration and preparation, missing value treatment, outlier treatment, variable derivation, variable selection, dimensionality reduction, grid search for hyperparameters, data mining and visualization, model evaluation, and strategy analysis. The package is designed to make the development of binary classification models (machine-learning-based models as well as credit scorecards) simpler and faster. References: (1) Refaat, M. (2011, ISBN: 9781447511199). Credit Risk Scorecard: Development and Implementation Using SAS; (2) Bezdek, James C. FCM: The fuzzy c-means clustering algorithm. Computers & Geosciences (0098-3004), <DOI:10.1016/0098-3004(84)90020-7>.