creditmodel 1.3.1 package

Toolkit for Credit Modeling, Analysis and Visualization

log_trans

Logarithmic transformation

loop_function

Loop Function. loop_function is an iterator to loop through

love_color

love_color

low_variance_filter

Filtering Low Variance Variables

lr_params

Logistic Regression & Scorecard Parameters

lr_vif

Variance-Inflation Factors

max_min_norm

Max Min Normalization

merge_category

Merge Category

min_max_norm

Min Max Normalization

model_result_plot

Model result plots. model_result_plot is a wrapper of following: `per...

multi_grid

Arrange list of plots into a grid

multi_left_join

multi_left_join

n_char

The length of a string.

null_blank_na

Encode NAs

one_hot_encoding

One-Hot Encoding

outliers_detection

Outliers Detection outliers_detection is for outliers detecting usin...

p_ij

Entropy

p_to_score

prob to score

partial_dependence_plot

partial_dependence_plot

PCA_reduce

PCA Dimension Reduction

plot_colors

Plot Colors

add_variable_process

add_variable_process

address_varieble

address_varieble

analysis_nas

Missing Analysis

analysis_outliers

Outliers Analysis

as_percent

Percent Format

auc_value

auc_value. auc_value is for getting the best lambda required in lasso_filter....

char_cor_vars

Cramer's V matrix between categorical variables.

char_to_num

character to number

checking_data

Checking Data

city_varieble

city_varieble

city_varieble_process

Processing of Address Variables

cohort_table_plot

cohort_table_plot. cohort_table_plot is for plotting cohort(vintage) a...

cor_heat_plot

Correlation Heat Plot

cor_plot

Correlation Plot

cos_sim

cos_sim

creditmodel-package

creditmodel: toolkit for credit modeling and data analysis

customer_segmentation

Customer Segmentation

cut_equal

Generating Initial Equal Size Sample Bins

cv_split

Stratified Folds

data_cleansing

Data Cleaning

data_exploration

Data Exploration

date_cut

Date Time Cut Point

de_one_hot_encoding

Recovery One-Hot Encoding

de_percent

Recovery Percent Format

derived_interval

derived_interval

derived_partial_acf

derived_partial_acf

derived_pct

derived_pct

derived_ts_vars

Derivation of Behavioral Variables

digits_num

Number of digits

entropy_weight

Entropy Weight Method

entry_rate_na

Max Percent of Missing Value

euclid_dist

euclid_dist

eval_auc

Functions of xgboost feval

fast_high_cor_filter

high_cor_filter

feature_selector

Feature Selection Wrapper

fuzzy_cluster_means

Fuzzy Cluster means.

gather_data

gather or aggregate data

gbm_filter

Select Features using GBM

gbm_params

GBM Parameters

get_auc_ks_lambda

get_auc_ks_lambda. get_auc_ks_lambda is for getting the best lambda required ...

get_bins_table_all

Table of Binning

get_breaks_all

Generates Best Breaks for Binning

get_correlation_group

get_correlation_group

get_iv_all

Calculate Information Value (IV). get_iv is used to calculate Informa...

get_logistic_coef

get logistic coef

get_median

get central value.

get_names

Get Variable Names

get_nas_random

get_nas_random

get_psi_all

Calculate Population Stability Index (PSI). get_psi is used to calcul...

get_psi_iv_all

Calculate IV & PSI

get_psi_plots

Plot PSI(Population Stability Index)

get_score_card

Score Card

get_shadow_nas

get_shadow_nas

get_sim_sign_lambda

get_sim_sign_lambda. get_sim_sign_lambda is for getting the best lambda requi...

get_tree_breaks

Getting the breaks for terminal nodes from decision tree

get_x_list

Get X List.

grapes-alike-grapes

Fuzzy String matching

grapes-islike-grapes

Fuzzy String matching

high_cor_selector

Compare the two highly correlated variables

is_date

is_date

knn_nas_imp

Impute NAs using KNN

ks_table

ks_table & plot

ks_value

ks_value

lasso_filter

Variable selection by LASSO

lift_value

lift_value

local_outlier_factor

local_outlier_factor local_outlier_factor is function for calculatin...

plot_oot_perf

plot_oot_perf. plot_oot_perf is for plotting performance of cross time...

plot_table

plot_table

plot_theme

plot_theme

pred_score

pred_score

process_nas

Missing Treatment

process_outliers

Outliers Treatment

psi_iv_filter

Variable reduction based on Information Value & Population Stability I...

quick_as_df

List as data.frame quickly

ranking_percent_proc

Ranking Percent Process

re_code

re_code re_code search for matches to argument pattern within each e...

re_name

Rename

read_data

Read data

reduce_high_cor_filter

Filtering highly correlated variables with reduce method

remove_duplicated

Remove Duplicated Observations

replace_value

Replace Value

require_packages

Packages required and installation

rf_params

Random Forest Parameters

rowAny

Functions for vector operation.

save_data

Save data

score_transfer

Score Transformation

select_best_class

Generates Best Binning Breaks

sim_str

sim_str

split_bins

split_bins

split_bins_all

Split bins all

sql_hive_text_parse

Automatic production of hive SQL

start_parallel_computing

Parallel computing and export variables to global Env.

stop_parallel_computing

Stop parallel computing

str_match

String match. str_match search for matches to argument pattern wit...

sum_table

Summary table

term_tfidf

TF-IDF

time_series_proc

Process time series data

time_transfer

Time Format Transferring

time_variable

time_variable

time_vars_process

Processing of Time or Date Variables

tnr_value

tnr_value

train_lr

Training LR model

train_test_split

Train-Test-Split

train_xgb

Training XGBoost

training_model

Training model

var_group_proc

Process group numeric variables

variable_process

variable_process

woe_trans_all

WOE Transformation

xgb_data

XGBoost data

xgb_filter

Select Features using XGB

xgb_params

XGBoost Parameters

Provides a highly efficient R tool suite for credit modeling, analysis and visualization. It contains infrastructure functionality such as data exploration and preparation, missing value treatment, outlier treatment, variable derivation, variable selection, dimensionality reduction, grid search for hyperparameters, data mining and visualization, model evaluation, and strategy analysis. The package is designed to make the development of binary classification models (machine-learning-based models as well as credit scorecards) simpler and faster. References: 1. Refaat, M. (2011, ISBN: 9781447511199). Credit Risk Scorecard: Development and Implementation Using SAS; 2. Bezdek, James C. FCM: The fuzzy c-means clustering algorithm. Computers & Geosciences (0098-3004), <DOI:10.1016/0098-3004(84)90020-7>.
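
The index above lists functions for Information Value (get_iv_all) and Population Stability Index (get_psi_all). As a rough illustration of what those two quantities measure, here is a minimal sketch of the standard textbook formulas. It is written in Python for illustration only and is not the package's code or API: the function names `woe_iv` and `psi`, the `eps` floor for empty bins, and the sign convention WOE = ln(good share / bad share) are all assumptions made for this example.

```python
import math


def woe_iv(good, bad, eps=1e-10):
    """Return (woe_per_bin, iv) from per-bin counts of good and bad cases.

    Hypothetical helper: WOE is taken as ln(good share / bad share);
    some references use the opposite sign. IV is the same either way.
    """
    total_good = sum(good)
    total_bad = sum(bad)
    woe = []
    iv = 0.0
    for g, b in zip(good, bad):
        # Share of all goods / bads that fall in this bin, floored at eps
        # so that an empty bin does not produce log(0) or division by zero.
        p_good = max(g / total_good, eps)
        p_bad = max(b / total_bad, eps)
        w = math.log(p_good / p_bad)
        woe.append(w)
        iv += (p_good - p_bad) * w
    return woe, iv


def psi(expected, actual, eps=1e-10):
    """Return the PSI between two binned count distributions.

    Hypothetical helper: `expected` is the reference sample (e.g. training),
    `actual` is the comparison sample (e.g. a later time window).
    """
    total_expected = sum(expected)
    total_actual = sum(actual)
    value = 0.0
    for e, a in zip(expected, actual):
        p_e = max(e / total_expected, eps)
        p_a = max(a / total_actual, eps)
        value += (p_a - p_e) * math.log(p_a / p_e)
    return value


if __name__ == "__main__":
    woe, iv = woe_iv(good=[80, 20], bad=[20, 80])
    print("WOE per bin:", woe)
    print("IV:", iv)
    print("PSI:", psi(expected=[50, 50], actual=[40, 60]))
```

Both quantities are sums of (share difference) × (log share ratio), so each term is non-negative; a value of zero means the two distributions are identical across the bins.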

  • Maintainer: Dongping Fan
  • License: AGPL-3
  • Last published: 2022-01-07