url https://towardsdatascience.com/secret-of-google-web-based-ocr-service-fe30eecedd01

Introduction to Optical Character Recognition

Optical Character Recognition (OCR) is one of the ways to connect the real world with the virtual world. The first OCR systems were introduced in the late 1920s. The objective of OCR is to recognise text in an image. However, achieving very high accuracy is challenging because of factors such as image quality, font variety, and page layout. In this story, I will introduce how Google built its solution, which is part of the Google Cloud Vision API, to tackle this problem.

When it comes to OCR, Tesseract is one of the best-known open source libraries that anyone can use to perform OCR. Tesseract was originally developed at HP, and its development has been sponsored by Google since 2006. The Tesseract 3.x models are the older generation, while 4.x is built on deep learning (LSTM). If you want to understand the difference between 3.x and 4.x, you can visit this sharing for more detail.

As Tesseract is implemented in C++, we cannot call it the way we call other Python libraries. We can invoke its C API from Python, but that is not very user friendly. Therefore a Python wrapper, pytesseract, was introduced to make our lives easier.
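To make the wrapper idea concrete, here is a minimal sketch of calling Tesseract through pytesseract. It assumes the Tesseract binary plus the `pytesseract` and `Pillow` packages are installed; the helper names `ocr_image` and `normalize_ocr_text` are hypothetical and only for illustration, not part of either library.

```python
def normalize_ocr_text(raw: str) -> str:
    """Strip surrounding whitespace on each line and drop empty lines."""
    lines = (line.strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


def ocr_image(image_path: str, lang: str = "eng") -> str:
    """Run Tesseract on one image file and return the cleaned text.

    Hypothetical helper: `image_path` and `lang` are illustrative parameters.
    """
    # Imported lazily so normalize_ocr_text stays usable on machines
    # where the OCR dependencies are not installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        # image_to_string returns the recognised text as one string.
        raw = pytesseract.image_to_string(image, lang=lang)
    return normalize_ocr_text(raw)
```

Calling `ocr_image("sample.png")` (with `sample.png` standing in for any image on disk) should return the recognised text as a plain string. The clean-up step is optional; it is there because raw OCR output often carries blank lines and trailing whitespace.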
