Version: 5.4.1+pkg-944a
Revision: 2507
Size: 35.5 MB
License: Proprietary
Confinement: strict
Base: core22

open source optical character recognition engine


Tesseract has Unicode (UTF-8) support and can recognize more than 100
languages "out of the box". It can be trained to recognize other languages.
Tesseract supports various output formats: plain text, hOCR (HTML), and PDF.
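
As a rough sketch of how an output format is chosen on the command line: the
format is given as a trailing config name (none for plain text, hocr, or pdf).
The file names scan.png and out are placeholders, not part of this listing,
and the example assumes the English language data (eng) is available.

```shell
#!/bin/sh
# Sketch: OCR one image into each of the three output formats.
# scan.png and the "out" basename are placeholder names (assumptions).
img="scan.png"
base="out"

if command -v tesseract >/dev/null 2>&1 && [ -f "$img" ]; then
    tesseract "$img" "$base" -l eng          # plain text -> out.txt
    tesseract "$img" "$base" -l eng hocr     # hOCR (HTML) -> out.hocr
    tesseract "$img" "$base" -l eng pdf      # searchable PDF -> out.pdf
else
    echo "tesseract or $img not found; nothing to do" >&2
fi
```

Because the snap is strictly confined, the input image has to sit in a
location the snap is allowed to read.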

If you want to access files under /media/* or /run/media/*, you'll have
to connect the snap to the core snap's removable-media interface:

 $ sudo snap connect tesseract:removable-media
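
To check afterwards whether the interface is connected, one option is to list
the snap's connections. This is a minimal sketch that assumes the snap is
installed under the name tesseract; a "-" in the Slot column means the plug
is still unconnected.

```shell
#!/bin/sh
# Sketch: show the removable-media line from the snap's connection list.
# Assumes the snap is installed as "tesseract".
iface="removable-media"

if command -v snap >/dev/null 2>&1; then
    snap connections tesseract 2>/dev/null | grep "$iface" \
        || echo "no $iface line found (is the snap installed?)"
else
    echo "snap command not found" >&2
fi
```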

Update History

5.4.1+pkg-944a (2507)
13 Dec 2025, 09:47 UTC

Published: 30 Jun 2017, 12:17 UTC

Last updated: 23 Aug 2024, 04:14 UTC

First seen: 13 Dec 2025, 09:47 UTC