casimir 0.3.3 package

Comparing Automated Subject Indexing Methods in R

ndcg_score

Helper function for document-wise computation of ranked retrieval scor...

option_params

Declaration of options to be used as identical function arguments

options

casimir Options

pr_curve_post_processing

Postprocessing of pr curve data

process_cost_fp

Process cost for false positives

check_repair_relevance_pred

Check for inconsistent relevance values

apply_threshold

Filter predictions based on score and rank

boot_worker_fn

Compute bootstrap replica of pr auc

casimir-package

casimir: Comparing Automated Subject Indexing Methods in R

check_id_vars_col

Coerce column to character

check_id_vars

Coerce id columns to character

check_repair_relevance_compare

Check for inconsistent relevance values

compute_intermediate_results_rr

Compute intermediate ranked retrieval results per group

compute_intermediate_results

Compute intermediate set retrieval results per group

compute_pr_auc_from_curve

Compute area under precision-recall curve

compute_pr_auc

Compute area under precision-recall curve

compute_pr_curve

Compute precision-recall curve

compute_propensity_scores

Compute inverse propensity scores

compute_ranked_retrieval_scores

Compute ranked retrieval scores

compute_set_retrieval_scores

Compute multi-label metrics

create_comparison

Join gold standard and predicted results

create_rank_col

Create a rank column

dcg_score

Helper function for document-wise computation of ranked retrieval scor...

find_ps_rprec_deno

Compute the denominator for R-precision

lrap_score

Helper function for document-wise computation of ranked retrieval scor...

generate_pr_auc_replica

Compute bootstrap replica of pr auc

generate_replicate_results

Compute bootstrapping results

helper_f_dplyr

Calculate bootstrapping results for one sample

helper_f

Calculate bootstrapping results for one sample

join_propensity_scores

Join propensity scores

rename_metrics

Rename metrics

set_grouping_var

Set grouping variables

set_ps_flags

Set flags for propensity scores

summarise_intermediate_results_dplyr

Compute the mean of intermediate results

summarise_intermediate_results

Compute the mean of intermediate results

Evaluate automatic subject indexing methods. The package focuses on efficient computation of set retrieval and ranked retrieval metrics across multiple dimensions of a dataset, e.g., document strata or subsets of the label set. It can also compute bootstrap confidence intervals for all major metrics, integrates parallel computation, and offers propensity-scored variants of standard metrics.
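To make the ideas in this description concrete, the following is a minimal base R sketch of set retrieval metrics (precision, recall, F1) averaged over documents, with a percentile bootstrap confidence interval obtained by resampling documents. It does not use or reproduce the casimir API. The data frames, column names (`doc_id`, `label_id`), the helper `doc_scores`, and the choice of document-wise averaging with a percentile interval are all assumptions made for illustration.

```r
# Toy data: one row per (document, label) pair.
# All object and column names are hypothetical, not taken from casimir.
gold <- data.frame(
  doc_id   = c("d1", "d1", "d2", "d2", "d3"),
  label_id = c("a",  "b",  "a",  "c",  "b"),
  stringsAsFactors = FALSE
)
pred <- data.frame(
  doc_id   = c("d1", "d1", "d2", "d3", "d3"),
  label_id = c("a",  "c",  "a",  "b",  "c"),
  stringsAsFactors = FALSE
)

# Precision, recall and F1 for a single document (hypothetical helper).
doc_scores <- function(gold_labels, pred_labels) {
  tp   <- length(intersect(gold_labels, pred_labels))
  prec <- if (length(pred_labels) > 0) tp / length(pred_labels) else 0
  rec  <- if (length(gold_labels) > 0) tp / length(gold_labels) else 0
  f1   <- if (prec + rec > 0) 2 * prec * rec / (prec + rec) else 0
  c(precision = prec, recall = rec, f1 = f1)
}

# One row per document, one column per metric.
docs <- sort(unique(c(gold$doc_id, pred$doc_id)))
per_doc <- t(sapply(docs, function(d) {
  doc_scores(
    gold$label_id[gold$doc_id == d],
    pred$label_id[pred$doc_id == d]
  )
}))

# Average the document-wise scores (an assumed aggregation mode).
macro <- colMeans(per_doc)

# Percentile bootstrap for mean F1: resample documents with replacement.
set.seed(1)
n_boot <- 1000
boot_f1 <- replicate(n_boot, {
  idx <- sample(nrow(per_doc), replace = TRUE)
  mean(per_doc[idx, "f1"])
})
ci_f1 <- quantile(boot_f1, probs = c(0.025, 0.975))

print(macro)
print(ci_f1)
```

With only three documents the interval is very wide. The sketch is meant to show the mechanics of resampling at the document level, not to give a realistic estimate.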