Thanks to artificial intelligence and machine learning applied to text recognition, it is now possible to translate text in real time or transcribe the text in an image just by pointing a mobile phone camera at it.
The Tesseract OCR engine was born in the Hewlett-Packard laboratories and was later sponsored by Google
Optical character recognition (OCR) is one of the best-known and most widely used digitization technologies for converting physical documents into digital ones. Tesseract is a free software OCR engine with its own repository on GitHub, so developers can access it and use it to build their own applications.
Development of the Tesseract OCR engine began in the Hewlett-Packard laboratories. It was later sponsored by Google, and it runs on Windows, Linux, and macOS. Version 4 of the tool, Tesseract 4, uses neural networks to convert images to text and supports more than a hundred languages.
Tesseract is operated from the command line. Although the official documentation includes a complete user guide and command reference, in practice it is not easy to use that way. Fortunately, graphical front ends such as gImageReader make the tool easier to manage.
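To give an idea of what command-line use looks like, here is a minimal Python sketch that wraps a basic Tesseract call. It assumes the `tesseract` executable is installed and on the PATH, along with the language data for the chosen language. The function names `build_tesseract_command` and `image_to_text` are hypothetical helpers for this illustration, not part of Tesseract itself.

```python
import shutil
import subprocess
from typing import List


def build_tesseract_command(image_path: str, lang: str = "eng") -> List[str]:
    """Build the argument list for a basic Tesseract run (hypothetical helper).

    Passing "stdout" as the output base makes Tesseract print the
    recognized text instead of writing it to a file. The -l option
    selects the language; several can be combined, e.g. "eng+spa".
    """
    return ["tesseract", image_path, "stdout", "-l", lang]


def image_to_text(image_path: str, lang: str = "eng") -> str:
    """Run Tesseract on an image and return the recognized text."""
    # Assumption: Tesseract is installed separately and reachable on PATH.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,  # collect the recognized text from stdout
        text=True,            # decode output as text rather than bytes
        check=True,           # raise CalledProcessError if Tesseract fails
    )
    return result.stdout
```

Calling `image_to_text("scan.png")`, where `scan.png` is a placeholder file name, would return whatever text Tesseract recognizes in that image. A graphical front end spares the user from assembling these arguments by hand.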
What is gImageReader
gImageReader is a free application, released under a free license (GPL 3.0) and available for Windows and Linux, that converts images to text. When you open an image that contains text, the tool detects the text, extracts it, and converts it into a text document that can then be opened, for example, in Word.
It can convert text from images embedded in a PDF, from images in various formats and from different devices, and even from screenshots or from an image that has been copied to the clipboard.
The following image shows the gImageReader user interface:
Depending on the resolution and quality of the image, text recognition can be done manually or automatically. Once the image has been converted to text, the result can be formatted or edited to correct any grammatical or spelling errors.