Using OCR for Document Recognition and Automatic Data Entry

Ivan Ozhiganov

Ivan Ozhiganov

Founder & CEO at Azoft

#Advanced technology


6 Jul 2017

Reading time:

6 Jul 2017

Arnold is an insurance salesman. Every single day he spends 2 hours editing reports and issuing policies. There are 18 more insurance brokers working alongside Arnold. All of them have one concern: too much manual, routine work. Since routine work is rarely exciting, it sabotages overall productivity. The time spent on typing, editing and spell checking would better be spent on client communication and skills development.

Let machines do all the manual, routine work to boost productivity and inspire people to solve cognitive tasks.

What is OCR?

Optical character recognition or OCR is a technology that helps you convert hard copies of documents into soft copies in a matter of minutes. You just scan documents and get them in a digital format with fully readable, editable text.

You need OCR when it comes to manual, routine work, and paperwork. Insurance, fintech, transport providers and government entities have been the first to take advantage of the technology. In fact, everyone who needs to scan documents and extract text from images can benefit from OCR.

How it works?

An OCR software recognizes and converts an image with text or a hard copy of a text document into a soft copy. You are free to edit the obtained copy any way you like; there are no limitations. These are the generic principles of OCR.

We developed a specific algorithm to serve the needs of Arnold’s company. Insurance salesmen are now able to issue policies in a heartbeat. Once you take a photo of the needed document using a mobile device, the OCR software instantly fills in all the required fields.

Why is it helpful?

You complete tasks faster

Using the traditional approach of manual typing, you’ll need about 20 minutes to issue one policy. Our OCR application requires no more than 5 minutes to solve the task. 3 easy steps for you to consider:

  • Take a photo of the required hard copy
  • See the required fields filled out automatically
  • Scan the text to eliminate errors, if there are any

No more frustrating, manual, routine work

The less routine work employees need to do, the better it is for their productivity, mood and desire to do their jobs. Our OCR software simplifies policy issuing by recognizing text and filling in required fields.

Arnold and his co-workers don’t waste their time on manual typing anymore. Now they are able to spend free hours solving worthy tasks.

How we recognized insurance documents

Here is an overview of how we tested our document recognition algorithms on insurance papers. With some small changes, the entire workflow can be applied to recognize all kinds of hard copies regardless their size and format. We used the methods listed below to recognize a passport and a driving license.

  • Localization
  • Filtration
  • Lines detection
  • Character recognition


Localization is the process of finding documents on images. We applied 3 different approaches during our localization efforts.

  • OpenCV accompanied by a trained Haar classifier
  • Fully convolutional neural networks
  • Analytical approach based on finding connected components

The first approach implies that you apply a rectangle frame to an image and crop it until the area that contains text is captured inside the frame. The document recognition algorithm runs until it captures the document on a video.

The second approach is about fully convolutional neural networks. A colored passport image is taken as an input. The output has 2 channels. First is used to find the center of the first passport page, the second one is used to find the center of the second page. The network, in turn, is trained to guess the mask that contains Gaussian peaks.

The third approach was applied to recognize a passport. We used the template with two areas on a passport spread, each containing the document serial numbers. Every time a picture matched the template, we knew it was a passport. This approach is based on finding connected components.


The backgrounds of all documents are laminated and contain watermarks. This means nothing more than just noise for a neural network, and prevents it from proper character recognition. We used the following methods to get rid of this noise:

  • The fastNlMeansDenoisingColored algorithm that reduces noise and smooths watermark lines
  • Bilateral filter that helps obtain a homogeneous background
  • Adaptive binarization

Detecting lines

To detect lines in driving licenses we used OpenCV and contour detection. We then split the lines into separate characters and used them as an input for our neural network.

We made use of a fully convolutional neural network to detect passport lines.

Text recognition

We applied neural networks to recognize text. For drivers license recognition we adapted a convolutional neural network that was trained on a huge dataset. The dataset contained images of different characters. The Torch framework was very useful to design and train the network.

Passport text was recognized by a fully convolutional neural network. By the way, recognition of individual letters wasn’t considered. We used a specifically written script, training data generator, to train our neural network.

Training data generator

  • Driving license
  • Camera
  • Executable file
  • C++ sources to build the executable file
  • Script in Lua that operates the neural network. The CNN architecture is classified in the script
  • Binary file with networks weights that is executed by Lua script. Weights were obtained after the training

Why does your business need our document recognition & automatic data entry solution?

We achieved 90% document recognition accuracy by combining various approaches. This is far beyond the industry average.

How did we manage to be so accurate? We tried hard testing hypotheses and figured out what works best for each and every business case.

Start using OCR technologies in business to recognize documents and take advantage of automatic data entry. OCR is a real turbocharger that speeds up workflow, boosts productivity, and keeps your employees from doing meaningless work.

Ping us any time to get a quote! We are happy to craft a unique solution that serves your personal needs.


Filter by