aifeducation 1.1.3 package

Artificial Intelligence for Education

generate_embeddings

Generate test embeddings

generate_id

Generate ID suffix for objects

generate_tensors

Generate test tensors

get_alpha_3_codes

Country Alpha 3 Codes

get_batches_index

Assign cases to batches

get_called_args

Called arguments

get_layer_dict

Dictionary of layers

get_layer_documentation

Generate layer documentation

get_magnitude_values

Magnitudes of an argument

get_n_chunks

Get the number of chunks/sequences for each case

get_param_def

Definition of an argument

gwet_ac

Calculate Gwet's AC1 and AC2

HuggingFaceTokenizer

HuggingFaceTokenizer

inspect_tmp_dir

Inspect Temporary directory

install_aifeducation_studio

Install 'AI for Education - Studio' on a machine

LargeDataSetForTextEmbeddings

Abstract class for large data sets containing text embeddings

License_Server

Server function for: graphical user interface for showing the license.

load_all_py_scripts

Load and re-load all python scripts

load_from_disk

Loading objects created with 'aifeducation'

load_py_scripts

Load and re-load python scripts

long_load_target_data

Load target data for long running tasks

matrix_to_array_c

Reshape matrix to array

ModelsBasedOnTextEmbeddings

Base class for models using neural nets

monitor_test_time_on_CI

Print duration of a test on CI

output_message

Print message

prepare_r_array_for_dataset

Convert R array for arrow data set

reset_loss_log

Reset log for loss information

run_py_file

Run python file

save_to_disk

Saving objects created with 'aifeducation'

set_transformers_logger

Sets the level for logging information of the 'transformers' library

start_aifeducation_studio

Aifeducation Studio

tensor_to_matrix_c

Transform tensor to matrix

tensor_to_numpy

Tensor_to_numpy

TextEmbeddingModel

Text embedding model

to_categorical_c

Transforming classes to one-hot encoding

TokenizerBase

Base class for tokenizers

TokenizerIndex

List of all available Tokenizers

create_data_embeddings_description

Generate description for text embeddings

get_test_data_for_classifiers

Get test data

get_time_stamp

Time stamp

build_layer_stack_documentation_for_vignette

Generate documentation of all layers for a vignette or article

calc_standard_classification_measures

Calculate recall, precision, and f1-scores

clean_pytorch_log_transformers

Clean pytorch log of transformers

Reliability_UI

Graphical user interface for displaying the reliability of classifiers...

reset_log

Function that resets a log file.

create_dir

Create directory if it does not exist

create_object

Create object

add_missing_args

Add missing arguments to a list of arguments

get_TEClassifiers_class_names

Get names of classifiers

AIFEBaseModel

Base class for objects using a pytorch model as core model.

AIFEMaster

Base class for most objects

auto_n_cores

Number of cores for multiple tasks

BaseModelBert

BERT-Transformer

BaseModelCore

Abstract class for all BaseModels

BaseModelDebertaV2

DeBERTa V2

cohens_kappa

Calculate Cohen's Kappa

BaseModelFunnel

Funnel transformer

BaseModelModernBert

ModernBert

BaseModelMPNet

MPNet

BaseModelRoberta

RoBERTa

BaseModelsIndex

List of all available BaseModels

build_documentation_for_model

Generate documentation for a classifier class

create_config_state

Create config for R interfaces

calc_tokenizer_statistics

Estimate tokenizer statistics

cat_message

Print message (cat())

check_adjust_n_samples_on_CI

Set sample size for argument combinations

check_aif_py_modules

Check if all necessary python modules are available

check_all_args

Check arguments automatically

check_class_and_type

Check class and type

class_vector_to_py_dataset

Convert class vector to arrow data set

ClassifiersBasedOnTextEmbeddings

Abstract class for all classifiers that use numerical representations ...

create_synthetic_units_from_matrix

Create synthetic units

data.frame_to_py_dataset

Convert data.frame to arrow data set

DataManagerClassifier

Data manager for classification tasks

DataSetsIndex

List of all available types of data sets

doc_formula

Create rd formula

EmbeddedText

Abstract class for small data sets containing text embeddings

fleiss_kappa

Calculate Fleiss' Kappa

generate_args_for_tests

Generate combinations of arguments

get_coder_metrics

Calculate reliability measures based on content analysis

get_current_args_for_print

Print arguments

get_depr_obj_names

Get names of deprecated objects

get_desc_for_core_model_architecture

Generate documentation for core models

get_dict_cls_type

Dictionary of classifier types

get_dict_core_models

Dictionary of core models

get_dict_input_types

Dictionary of input types

get_file_extension

Get file extension

get_fixed_test_tensor

Generate static test tensor

get_param_dict

Get dictionary of all parameters

get_param_doc_desc

Description of an argument

get_parameter_documentation

Generate layer documentation

get_py_package_version

Get versions of a specific python package

get_py_package_versions

Get versions of python components

get_recommended_py_versions

Recommended version of python packages

get_synthetic_cases_from_matrix

Create synthetic cases for balancing training data

install_aifeducation

Install aifeducation on a machine

install_py_modules

Installing necessary python modules to an environment

kendalls_w

Calculate Kendall's coefficient of concordance w

knnor_is_same_class

Validate a new point

knnor

K-Nearest Neighbor OveRsampling approach (KNNOR)

kripp_alpha

Calculate Krippendorff's Alpha

LargeDataSetBase

Abstract base class for large data sets

LargeDataSetForText

Abstract class for large data sets containing raw texts

prepare_session

Function for setting up a python environment within R.

print_message

Print message (message())

py_dataset_to_embeddings

Convert arrow data set to embeddings

random_bool_on_CI

Random bool on Continuous Integration

read_log

Function for reading a log file in R

read_loss_log

Function for reading a log file containing a record of the loss during...

reduce_to_unique

Reduce to unique cases

Reliability_Server

Server function for: graphical user interface for displaying the relia...

summarize_args_for_long_task

Summarize arguments from shiny input

summarize_tracked_sustainability

Summarizing tracked sustainability data

TEClassifierParallel

Text embedding classifier with a neural net

TEClassifierParallelPrototype

Text embedding classifier with a ProtoNet

TEClassifierProtoNet

Text embedding classifier with a ProtoNet

TEClassifierRegular

Text embedding classifier with a neural net

TEClassifiersBasedOnProtoNet

Base class for classifiers relying on numerical representations of tex...

TEClassifiersBasedOnRegular

Base class for regular classifiers relying on EmbeddedText or LargeDat...

TEClassifierSequential

Text embedding classifier with a neural net

TEClassifierSequentialPrototype

Text embedding classifier with a ProtoNet

TEFeatureExtractor

Feature extractor for reducing the number for dimensions of text embed...

tensor_list_to_numpy

Convert list of tensors into numpy arrays

update_aifeducation

Updates an existing installation of 'aifeducation' on a machine

WordPieceTokenizer

WordPieceTokenizer

write_log

Write log

In social and educational settings, the use of Artificial Intelligence (AI) is a challenging task. Relevant data is often available only in handwritten form, or its use is restricted by privacy policies, which often leads to small data sets. Furthermore, in the educational and social sciences, data is often imbalanced in terms of category frequencies. To support educators as well as educational and social researchers in using the potential of AI for their work, this package provides a unified interface for neural nets in 'PyTorch' to deal with natural language problems. In addition, the package ships with a shiny app that provides a graphical user interface, making AI usable for people without skills in writing Python/R scripts. The tools integrate existing mathematical and statistical methods for dealing with small data sets via pseudo-labeling (e.g. Cascante-Bonilla et al. (2020) <doi:10.48550/arXiv.2001.06001>) and with imbalanced data via the creation of synthetic cases (e.g. Islam et al. (2022) <doi:10.1016/j.asoc.2021.108288>). Performance evaluation of AI is connected to measures from content analysis, with which educational and social researchers are generally more familiar (e.g. Berding & Pargmann (2022) <doi:10.30819/5581>, Gwet (2014) <ISBN:978-0-9708062-8-4>, Krippendorff (2019) <doi:10.4135/9781071878781>). Energy consumption and CO2 emissions during model training are estimated with the Python library 'codecarbon'. Finally, all objects created with this package can be used to share trained AI models with other people.
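The description ties model evaluation to reliability measures from content analysis, and the index lists a `cohens_kappa` function among them. As a rough illustration of what such a measure captures, here is a minimal Python sketch of unweighted Cohen's Kappa for two coders. It is written from the textbook definition and is not the package's implementation. The function name `cohens_kappa_sketch` and the sample labels are hypothetical.

```python
from collections import Counter


def cohens_kappa_sketch(rater_a, rater_b):
    """Unweighted Cohen's Kappa for two equally long sequences of nominal codes.

    Hypothetical helper for illustration only; not part of 'aifeducation'.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must code the same non-empty set of cases")
    n = len(rater_a)

    # Observed agreement: share of cases that both raters coded identically.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the raters' marginal category frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single identical category")

    # Kappa rescales observed agreement by the agreement expected by chance.
    return (p_observed - p_expected) / (1 - p_expected)


# Made-up example: a human coder versus a classifier on eight cases.
human = ["a", "a", "b", "b", "a", "b", "a", "b"]
model = ["a", "a", "b", "a", "a", "b", "b", "b"]
kappa = cohens_kappa_sketch(human, model)
print(round(kappa, 3))  # 0.5
```

In this made-up example the two coders agree on 6 of 8 cases (0.75), while the balanced category frequencies give a chance agreement of 0.5, so Kappa is 0.5. Reporting a chance-corrected value instead of raw accuracy is one reason such measures matter for the imbalanced data sets the description mentions.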

  • Maintainer: Berding Florian
  • License: GPL-3
  • Last published: 2025-11-19