Secret of Google Web-Based OCR Service

Introduction to Optical Character Recognition

How to OCR Documents for Free in Google Drive

Google Drive makes it painless to go paperless. Its collaborative documents, spreadsheets, and presentations already help curtail paper usage, but its OCR feature helps curb the paper mess even more.

OCR, or Optical Character Recognition, is the most important tech to help you go paperless. Scanned documents on their own are only glorified pictures of your documents, but let your computer recognize the text and they instantly become a ton more useful. We’ve already looked at how to OCR documents in Adobe Acrobat:

We’ve looked at turning PDF files into documents you can edit in Word as well:

Now if you don’t have a copy of Acrobat or Word, there’s an even better option: Google Drive. It includes a little-known free OCR tool that is a powerful, easy to use image to text converter.

In this tutorial, we’ll look at what is Google Drive’s OCR process and simple steps to begin working with it. I’ll show you how to use Google Drive to quickly convert your scanned images and PDF documents into editable text files online.

Google Cloud: Optical Character Recognition (OCR) Tutorial

Learn how to perform optical character recognition (OCR) on Google Cloud Platform. This tutorial demonstrates how to upload image files to Google Cloud Storage, extract text from the images using the Google Cloud Vision API, translate the text using the Google Cloud Translation API, and save your translations back to Cloud Storage. Google Cloud Pub/Sub is used to queue various tasks and trigger the right Cloud Functions to carry them out.

Objectives

  • Write and deploy several Background Cloud Functions.
  • Upload images to Cloud Storage.
  • Extract, translate and save text contained in uploaded images.

Visualizing the flow of data

The flow of data in the OCR tutorial application involves several steps:

  1. An image that contains text in any language is uploaded to Cloud Storage.
  2. A Cloud Function is triggered, which uses the Vision API to extract the text and detect the source language.
  3. The text is queued for translation by publishing a message to a Pub/Sub topic. A translation is queued for each target language different from the source language.
  4. If a target language matches the source language, the translation queue is skipped, and text is sent to the result queue, another Pub/Sub topic.
  5. A Cloud Function uses the Translation API to translate the text in the translation queue. The translated result is sent to the result queue.
  6. Another Cloud Function saves the translated text from the result queue to Cloud Storage.
  7. The results are found in Cloud Storage as txt files for each translation.

It may help to visualize the steps: