Helper function to download training data from the official tessdata repository. On Linux, the fast training data can be installed directly with yum or apt-get.
tesseract_download(lang, datapath = NULL, model = c("fast", "best"), progress = interactive())
Arguments
lang: three-letter language code; see the tessdata repository.
datapath: destination directory in which to store the downloaded file
model: either "fast" or "best"; only these two are currently supported. "best" downloads more accurate (but slower) trained models for Tesseract 4.0 or higher
progress: print progress while downloading
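As a minimal sketch of how these arguments fit together, the call below downloads the "best" French model into a temporary directory and then loads it. The language code "fra", the directory name, and the use of tempdir() are illustrative assumptions, and the download needs network access.

```r
library(tesseract)

# Hypothetical destination directory, chosen for illustration only.
dest <- file.path(tempdir(), "tessdata")

# Create the directory up front, in case the function expects it to exist.
dir.create(dest, showWarnings = FALSE, recursive = TRUE)

# Download the more accurate (but slower) French model without a progress bar.
tesseract_download("fra", datapath = dest, model = "best", progress = FALSE)

# Build an engine that reads the training data from the same directory.
fra <- tesseract("fra", datapath = dest)
```

Passing the same datapath to tesseract() keeps the downloaded model separate from any system-wide training data.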
Details
Tesseract uses training data to perform OCR. Most systems default to English training data. To improve OCR performance for other languages, you can install the training data from your distribution. For example, to install the Spanish training data: