
How to OCR Embossed Text: A Quick Guide With Examples

Ivan Ozhiganov

Founder & CEO at Azoft

#Advanced technology

11 Apr 2017

Imagine being able to take a picture of barely legible text and have your smartphone automatically recognize what it says. There are already many solutions available for scanning and decoding printed text in an image, but these solutions usually require the text to be clear and high-contrast. Now, what if you need to detect text that has little contrast with its background, such as embossed credit card numbers?

Detecting embossed text in images poses a number of challenges. Embossed characters typically don't have a uniform color, may have low contrast with their background, and may intersect various surrounding irregularities. Traditional approaches to character segmentation designed for scanned text cannot be used in such conditions. Clearly, some kind of preprocessing is required here, but classical filters such as the Gaussian and median filters fail to produce good results. For all these reasons, we decided to search for a specialized algorithm and found one particularly suitable for our project. In this article we'd like to present a slightly modified version of this algorithm that helped us OCR embossed text in images.

The stroke width algorithm for OCR text recognition is based on the assumption that textual characters generally have a nearly constant stroke width. The algorithm separates such strokes from other elements to recover the regions that contain text. Background noise is reduced, leaving outlines and stroke patterns.

The stroke width algorithm requires certain preprocessing of the original data to achieve the desired result. The preprocessing stage consists of the following steps:

Step 1. Convert the source image I to grayscale.

Step 2. Detect edges in the grayscale image using the Sobel or similar operator. The resulting image we denote by Ie.

Step 3. Perform binarization of Ie using Otsu’s method. The resulting image we denote by Ib.
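
For reference, these three steps map to a handful of OpenCV calls. The helper method below is our own sketch; its name and signature are not from the original project, and they are chosen only to mirror the hands-on example later in the article:

// A minimal sketch of steps 1-3 (the helper name and signature are ours):
- (void)preprocessImage:(cv::Mat)I edges:(cv::Mat *)Ie binary:(cv::Mat *)Ib
{
    cv::Mat gray;
    if (I.channels() > 1) {
        cv::cvtColor(I, gray, CV_BGR2GRAY); // step 1: convert to grayscale
    } else {
        gray = I;
    }

    // Step 2: Sobel gradients combined into a single edge image Ie:
    cv::Mat gx, gy, agx, agy;
    cv::Sobel(gray, gx, CV_16S, 1, 0);
    cv::Sobel(gray, gy, CV_16S, 0, 1);
    cv::convertScaleAbs(gx, agx);
    cv::convertScaleAbs(gy, agy);
    cv::addWeighted(agx, 0.5, agy, 0.5, 0, *Ie);

    // Step 3: Otsu binarization of Ie gives Ib:
    cv::threshold(*Ie, *Ib, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);
}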

Both Ie and Ib are used as input for the stroke width algorithm. Next, we need to perform the local binarization and voting steps:

Step 4. Create a 2-dimensional array S with the same dimensions as I and fill it with zeroes.

Step 5. Create two binary masks, Win and Wout, with dimensions Nin×Nin and Nout×Nout respectively. The values of Nin and Nout depend on the stroke width in the image, and Nin is always less than or equal to Nout.

Step 6. For every pixel Ie[i, j] that satisfies the condition Ib[i, j] = 1, we apply the Win mask centered on this pixel and look for the minimum and maximum values (Pmin, Pmax) among the pixels covered by the mask.

Step 7. The same pixel Ie[i, j] is then used to center the Wout mask, and for every pixel falling into Wout we perform the transform: S[i+k, j+l] = S[i+k, j+l] + 1 if Ie[i+k, j+l] ≥ t(i, j), where |k|, |l| ≤ Nout / 2 and t(i, j) = (Pmax + Pmin) / 2. For example, if Pmin = 40 and Pmax = 200, then t(i, j) = 120 and every pixel in the Wout window whose value in Ie is at least 120 receives a vote.

The resulting grayscale image stored in the 2-dimensional array S will have a suppressed background and intensified strokes that compose text. This image is suitable for additional binarization or further processing (segmentation of digits, etc.). Like Ie, S is a grayscale image, but with a decreased range of pixel brightness. The brightness range depends on the sizes of Win and Wout (smaller masks result in a smaller range).
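
For example, before segmenting the digits, S can be stretched back to the full 0-255 range and binarized once more. This post-processing step is our own suggestion, sketched with standard OpenCV calls:

// Optional post-processing of S (our suggestion, not part of the core algorithm):
cv::Mat S8, Sbin;
cv::normalize(S, S8, 0, 255, cv::NORM_MINMAX, CV_8UC1); // stretch the reduced brightness range
cv::threshold(S8, Sbin, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU); // binary image for digit segmentation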

After some experiments, we discovered that the results can be improved if the binarization level used to produce Ib (step 3) is calculated as follows:

  • Apply Gaussian blur to Ie after detecting edges with the Sobel operator. The resulting image we denote by Ig, and we’ll continue to use Ie in step 6.
  • Calculate the binarization level by processing the difference matrix abs(Ig - Ie) using Otsu's method (a compact variant of this calculation opens the hands-on example below).

The presented algorithm is suitable for practical use and has several distinct features:

  • The algorithm reduces noise and emphasizes text boundaries, resulting in better character segmentation.
  • Distinctive character shapes are not lost during processing, except when the original image has been preprocessed with binarization methods.
  • The algorithm works equally well on high-contrast and low-contrast images and does not require a separate normalization step.
  • The resulting image has a reduced brightness range compared to the original, which can be useful for character recognition and further binarization.

Implementing credit card number recognition in Objective-C: a hands-on example
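
The routine below is written in Objective-C++ on top of OpenCV. It derives the binarization level from the difference between the input and its blurred copy (a compact variant of the modified level calculation described above) and then runs the voting from steps 4, 6, and 7. The mask sizes Win and Wout and the 8-bit accumulator are example values; treat the body of the voting loop as a sketch of those steps rather than a tuned implementation.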

- (void)processingByStrokesMethod:(cv::Mat)src dst:(cv::Mat *)dst
{
    /*
     src - input grayscale image
     dst - output grayscale image
    */
    cv::Mat tmp;
    cv::GaussianBlur(src, tmp, cv::Size(3, 3), 2.0); // Gaussian blur
    tmp = cv::abs(src - tmp); // matrix of differences between the source and blurred images

    // Binarization (Otsu's method picks the threshold automatically):
    cv::threshold(tmp, tmp, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);

    // Using the method of strokes:
    int Wout = 12;      // outer mask size Nout
    int Win = Wout / 2; // inner mask size Nin
    int startXY = Win;
    int endY = src.rows - Win;
    int endX = src.cols - Win;

    // Step 4: voting accumulator S with the same dimensions as the source image:
    cv::Mat S = cv::Mat::zeros(src.size(), CV_8UC1);

    for (int j = startXY; j < endY; j++) {
        for (int i = startXY; i < endX; i++) {
            // Only edge pixels:
            if (tmp.at<unsigned char>(j, i) == 255) {
                // Step 6: minimum and maximum inside the inner mask centered at (j, i):
                double Pmin, Pmax;
                cv::Rect inner(i - Win / 2, j - Win / 2, Win, Win);
                cv::minMaxLoc(src(inner), &Pmin, &Pmax);
                double t = (Pmax + Pmin) / 2.0; // local threshold t(i, j)

                // Step 7: vote for every pixel in the outer mask that reaches the threshold:
                for (int k = -Wout / 2; k <= Wout / 2; k++) {
                    for (int l = -Wout / 2; l <= Wout / 2; l++) {
                        if (src.at<unsigned char>(j + k, i + l) >= t) {
                            S.at<unsigned char>(j + k, i + l) += 1;
                        }
                    }
                }
            }
        }
    }
    S.copyTo(*dst);
}
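
Calling the routine from iOS code might look like the sketch below. The image name is hypothetical, and depending on your OpenCV version the conversion helpers live in opencv2/imgcodecs/ios.h or opencv2/highgui/ios.h:

#import <opencv2/imgcodecs/ios.h> // or <opencv2/highgui/ios.h> on OpenCV 2.x

UIImage *photo = [UIImage imageNamed:@"embossed_card"]; // hypothetical test image
cv::Mat rgba, gray, strokes;
UIImageToMat(photo, rgba);
cv::cvtColor(rgba, gray, CV_RGBA2GRAY); // the method expects a grayscale input
[self processingByStrokesMethod:gray dst:&strokes];
UIImage *result = MatToUIImage(strokes); // ready for inspection or further binarization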
