funModeling 1.9.5 package

Exploratory Data Analysis and Data Preparation Tool-Box

auto_grouping

Reduce cardinality in categorical variable by automatic grouping

categ_analysis

Profiling analysis of categorical vs. target variable

compare_df

Compare two data frames by keys

concatenate_n_vars

Concatenate 'N' variables

convert_df_to_categoric

Convert every column in a data frame to character

coord_plot

Coordinate plot

correlation_table

Get correlation against target variable

cross_plot

Cross-plotting input variable vs. target variable

data_integrity

Data integrity

data_integrity_model

Check data integrity model

desc_groups

Profiling categorical variable

desc_groups_rank

Profiling categorical variable (rank)

df_status

Get a summary for the given data frame (or vector).

discretize_df

Discretize a data frame

discretize_get_bins

Get the data frame thresholds for discretization

discretize_rgr

Variable discretization by gain ratio maximization

entropy_2

Computes the entropy between two variables

equal_freq

Equal frequency binning

export_plot

Export plot to jpeg file

fibonacci

Fibonacci series

freq

Frequency table for categorical variables

funModeling-package

funModeling: Exploratory data analysis, data preparation and model per...

gain_lift

Generates lift and cumulative gain performance table and plot

gain_ratio

Gain ratio

get_sample

Sampling training and test data

hampel_outlier

Hampel Outlier Threshold

infor_magic

Computes several information theory metrics between two vectors

information_gain

Information gain

plot_num

Plotting numerical data

plotar

Correlation plots

prep_outliers

Outliers Data Preparation

profiling_num

Profiling numerical data

range01

Transform a variable into the [0-1] range

status

Get a summary for the given data frame (or vector).

tukey_outlier

Tukey Outlier Threshold

v_compare

Compare two vectors

var_rank_info

Importance variable ranking based on information theory

Only around 10% of almost any predictive modeling project is spent on predictive modeling itself. 'funModeling' and the book Data Science Live Book (<https://livebook.datascienceheroes.com/>) are intended to cover the remaining 90%: data preparation, profiling, selecting the best variables, 'dataViz', assessing model performance, and other functions.

  • Maintainer: Pablo Casas
  • License: GPL-2
  • Last published: 2024-04-01