ocr function

Image Text OCR

Extract text from an image using the tesseract package.

image_ocr(image, language = "eng", HOCR = FALSE, ...)
image_ocr_data(image, language = "eng", ...)

Arguments

  • image: magick image object returned by image_read() or image_graph()
  • language: passed to tesseract. To install additional languages, see the instructions in tesseract_download().
  • HOCR: if TRUE return results as HOCR xml instead of plain text
  • ...: additional parameters passed to tesseract
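As a rough illustration of the arguments above, the sketch below renders a small test image locally (so nothing has to be downloaded) and calls image_ocr() twice, once for plain text and once with HOCR = TRUE. The image text, dimensions, and variable names are invented for this example, and recognition quality on a synthetic image like this is not guaranteed.

```r
library(magick)  # the tesseract package must also be installed

# Build a synthetic test image; the text and dimensions are arbitrary choices.
img <- image_blank(width = 600, height = 150, color = "white")
img <- image_annotate(img, "Hello OCR", size = 60, color = "black",
                      gravity = "center")

# Plain-text result, returned as a character vector.
txt <- image_ocr(img, language = "eng")

# Same image, but with the result returned as HOCR XML instead of plain text.
hocr <- image_ocr(img, language = "eng", HOCR = TRUE)

cat(txt)
```

The HOCR output is still a character string; parsing it (for example with an XML package) is left to the caller.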

Details

To use this function you need to install the tesseract package first:

install.packages("tesseract")

Best results are obtained if you set the correct language in tesseract. To install additional languages, see the instructions in tesseract_download().
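One way to act on this advice is to check which languages are installed before calling image_ocr(). The sketch below assumes French ("fra") as the wanted language, which is purely an illustrative choice, and falls back to "eng" when it is missing. The download call is left commented out because it needs network access.

```r
library(magick)  # the tesseract package must also be installed

wanted <- "fra"  # hypothetical target language for this example

# Languages whose training data tesseract can currently find.
installed <- tesseract::tesseract_info()$available

if (!wanted %in% installed) {
  # To fetch the training data, one could run:
  # tesseract::tesseract_download(wanted)
  message("Language '", wanted, "' is not installed; falling back to 'eng'.")
  wanted <- "eng"
}

# Synthetic test image; the text and size are arbitrary.
img <- image_blank(width = 600, height = 150, color = "white")
img <- image_annotate(img, "Bonjour", size = 60, color = "black",
                      gravity = "center")

txt <- image_ocr(img, language = wanted)
```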

Examples

if(require("tesseract")){
  img <- image_read("http://jeroen.github.io/images/testocr.png")
  image_ocr(img)
  image_ocr_data(img)
}
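image_ocr_data() returns a data frame rather than a single string, which makes it convenient for filtering. The following sketch assumes the result has a confidence column (as produced by tesseract's ocr_data()) and keeps only words above an arbitrary threshold of 80. The synthetic image and the threshold are assumptions made for this example.

```r
library(magick)  # the tesseract package must also be installed

# Synthetic test image so the example does not depend on a download.
img <- image_blank(width = 600, height = 150, color = "white")
img <- image_annotate(img, "Hello OCR", size = 60, color = "black",
                      gravity = "center")

# One row per recognized word.
words <- image_ocr_data(img)

# Keep only words tesseract is fairly sure about; 80 is an arbitrary cutoff.
confident <- words[words$confidence > 80, ]

print(confident)
```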

See Also

Other image: _index_, analysis, animation, attributes(), color, composite, defines, device, edges, editing, effects(), fx, geometry, morphology, options(), painting, segmentation, transform(), video

  • Maintainer: Jeroen Ooms
  • License: MIT + file LICENSE
  • Last published: 2025-03-23