whisper 0.1.0 package

Native R 'torch' Implementation of 'OpenAI' 'Whisper'

apply_bpe

Apply BPE Merges

audio_duration

Get Audio Duration

audio_to_mel

Convert Audio to Mel Spectrogram

byte_to_token

Convert Byte to BPE Token

clean_text

Clean Transcribed Text

compute_stft

Compute STFT Magnitude

copy_if_exists

Copy Weight if Exists

create_decoder

Create Decoder from Config

create_encoder

Create Encoder from Config

create_mel_filterbank_fallback

Create Mel Filterbank (Fallback)

decode_bpe_bytes

Decode BPE Bytes Back to Text

decode_timestamp

Decode Timestamp Token

download_tokenizer_files

Download Tokenizer Files from HuggingFace

download_whisper_model

Download Model from HuggingFace

ensure_tokenizer_files

Ensure Tokenizer Files are Downloaded

extract_segments

Extract Segments with Timestamps

get_initial_tokens

Get Initial Decoder Tokens

get_model_path

Get Model Cache Path

get_weights_path

Get Path to Model Weights

greedy_decode

Greedy Decoding

hz_to_mel

Convert Hz to Mel Scale

is_timestamp_token

Check if Token is Timestamp

list_downloaded_models

List Downloaded Models

list_whisper_models

List Available Models

load_added_tokens

Load Added Tokens from HuggingFace

load_audio

Load and Preprocess Audio

load_decoder_weights

Load Decoder Weights

load_encoder_weights

Load Encoder Weights

load_mel_filterbank

Load Pre-computed Mel Filterbank

load_whisper_model

Load Whisper Model

load_whisper_weights

Load Weights from Safetensors

mel_to_hz

Convert Mel Scale to Hz

model_exists

Check if Model is Downloaded

pad_or_trim

Pad or Trim Audio to Fixed Length

parse_device

Parse Device Argument

parse_dtype

Parse Dtype Argument

split_audio

Split Long Audio into Chunks

tokenizer_decode

Decode Token IDs to Text

tokenizer_encode

Encode Text to Token IDs

transcribe_chunk

Transcribe Single Chunk

transcribe_long

Transcribe Long Audio

transcribe

Whisper Transcription

whisper_attention

Whisper Encoder

whisper_config

Whisper Model Configurations

whisper_decoder_layer

Whisper Decoder

whisper_decoder

Text Decoder

whisper_device

Device and Dtype Management

whisper_dtype

Get Default Dtype

whisper_encoder_layer

Encoder Layer

whisper_encoder

Audio Encoder

whisper_lang_token

Get Language Token ID

whisper_model

Whisper Model

WHISPER_SAMPLE_RATE

Audio Preprocessing for Whisper

whisper_special_tokens

Special Token IDs

whisper_tokenizer

Whisper BPE Tokenizer

Speech-to-text transcription using a native R 'torch' implementation of the 'OpenAI' 'Whisper' model <https://github.com/openai/whisper>. Supports multiple model sizes, from tiny (39M parameters) to large-v3 (1.5B parameters), with integrated download from 'HuggingFace' <https://huggingface.co/> via the 'hfhub' package. Provides automatic speech recognition with optional language detection and translation to English. Audio preprocessing, mel spectrogram computation, and transformer-based encoder-decoder inference are all implemented in R using the 'torch' package.
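To make the audio preprocessing step more concrete, here is a minimal base-R sketch of two ideas that the index entries pad_or_trim, hz_to_mel, and mel_to_hz refer to. It is not the package's code: the `sketch_*` function names are hypothetical, the 16 kHz sample rate and 30-second window come from OpenAI's reference Whisper implementation, and the HTK-style mel formula is an assumption (the package may use a different mel variant, such as the Slaney scale).

```r
# Constants from OpenAI's reference Whisper: 16 kHz mono input, 30-second windows.
SAMPLE_RATE <- 16000L
CHUNK_SECONDS <- 30L
N_SAMPLES <- SAMPLE_RATE * CHUNK_SECONDS  # 480000 samples per window

# Hypothetical helper: zero-pad or truncate a waveform to a fixed length.
sketch_pad_or_trim <- function(x, n_out = N_SAMPLES) {
  n <- length(x)
  if (n >= n_out) {
    x[seq_len(n_out)]          # keep only the first n_out samples
  } else {
    c(x, numeric(n_out - n))   # append zeros (silence) up to n_out
  }
}

# Hypothetical helpers: HTK-style Hz <-> mel conversion (assumed variant).
sketch_hz_to_mel <- function(hz) 2595 * log10(1 + hz / 700)
sketch_mel_to_hz <- function(mel) 700 * (10^(mel / 2595) - 1)

# One second of a 440 Hz tone, padded out to a full 30-second window.
t <- (seq_len(SAMPLE_RATE) - 1) / SAMPLE_RATE
short_clip <- sin(2 * pi * 440 * t)
padded <- sketch_pad_or_trim(short_clip)
length(padded)  # 480000

# The two conversions are inverses of each other.
all.equal(sketch_mel_to_hz(sketch_hz_to_mel(440)), 440)  # TRUE
```

Fixing every input to the same length means the encoder always sees a spectrogram of the same shape, which is why longer recordings are split into chunks before transcription.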

  • Maintainer: Troy Hernandez
  • License: MIT + file LICENSE
  • Last published: 2026-02-06