Image Quality Assessment for predicting OCR accuracy

The aim of OCR (Optical Character Recognition) is to read and transcribe the text present in an image.

The best-known commercial OCR software is probably FineReader, from ABBYY. OmniPage, from Nuance, is also widely used. Among open-source OCR engines, Tesseract (Google) works quite well.

OCR systems work quite well on relatively “new”, black-and-white, 300 dpi document images. Rotation, unusual character sizes and fonts, blur, stains, noise, etc. all decrease OCR quality. Most of the time, several preprocessing steps are applied to an image before it is submitted to the OCR engine: deskewing, despeckling, segmentation, etc.
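As an illustration of one such preprocessing step, here is a minimal despeckling sketch in plain Python. It assumes a grayscale image stored as a list of rows and uses a 3x3 median filter, which is only one common way to remove isolated specks and not necessarily what a given OCR product does. The function name `despeckle` is made up for this example.

```python
from statistics import median


def despeckle(image):
    """Apply a 3x3 median filter to a grayscale image (list of rows).

    Border pixels are copied unchanged to keep the sketch short.
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Collect the 3x3 neighbourhood around (x, y).
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            result[y][x] = median(window)
    return result


# A white page (255) with one isolated black speck (0) in the middle.
page = [[255] * 5 for _ in range(5)]
page[2][2] = 0
cleaned = despeckle(page)
```

A median filter removes specks smaller than half its window while keeping edges reasonably sharp, which is why it is often preferred over a plain average for this task.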

Some researchers are working on predicting OCR accuracy by using IQA (Image Quality Assessment). Three kinds of methods exist, depending on how much of the undegraded original image is available for comparison: full-reference [1], reduced-reference [2] and no-reference [3].
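To make the difference between these families concrete, here is a small sketch in plain Python. It uses two textbook measures as stand-ins, not the methods of the cited papers: PSNR as a full-reference score (it needs the original image) and the variance of the Laplacian as a no-reference sharpness score (it only needs the degraded image). The helper `box_blur` and the synthetic striped image are assumptions made for the example.

```python
import math


def psnr(reference, degraded, peak=255.0):
    """Full-reference score: peak signal-to-noise ratio in dB."""
    diffs = [(r - d) ** 2
             for ref_row, deg_row in zip(reference, degraded)
             for r, d in zip(ref_row, deg_row)]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def laplacian_variance(image):
    """No-reference sharpness score: variance of the 4-neighbour Laplacian."""
    responses = []
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            responses.append(image[y - 1][x] + image[y + 1][x]
                             + image[y][x - 1] + image[y][x + 1]
                             - 4 * image[y][x])
    mean = sum(responses) / len(responses)
    return sum((v - mean) ** 2 for v in responses) / len(responses)


def box_blur(image):
    """3x3 mean filter, used here only to simulate a blurred scan."""
    result = [row[:] for row in image]
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            result[y][x] = sum(window) / 9
    return result


# Synthetic "sharp" image: vertical black and white stripes, 2 pixels wide.
sharp = [[255 if (x // 2) % 2 else 0 for x in range(8)] for _ in range(8)]
blurred = box_blur(sharp)

score_with_reference = psnr(sharp, blurred)           # needs the original
score_without_reference = laplacian_variance(blurred)  # degraded image only
```

On this example the blurred image gets a lower Laplacian variance than the sharp one. A quality-based OCR predictor would then relate such scores to the observed recognition rate.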

[1] CAPODIFERRO, Licia, JACOVITTI, Giovanni, and DI CLAUDIO, Elio D. Two-Dimensional Approach to Full-Reference Image Quality Assessment Based on Positional Structural Information. IEEE Transactions on Image Processing, 2012, vol. 21, no. 2, p. 505-516.

[2] REHMAN, Abdul and WANG, Zhou. Reduced-Reference Image Quality Assessment by Structural Similarity Estimation. IEEE Transactions on Image Processing, 2012, vol. 21, no. 8, p. 3378-3389.

[3] CIANCIO, Alexandre, DA COSTA, A. L. N. Targino, DA SILVA, Eduardo A. B., et al. No-Reference Blur Assessment of Digital Pictures Based on Multifeature Classifiers. IEEE Transactions on Image Processing, 2011, vol. 20, no. 1, p. 64-75.

CIFED 2012: Best Paper Award

From 21 to 23 March 2012, the Symposium on Writing and Document (CIFED) was held in Bordeaux. Every two years since 1992, CIFED has gathered the French-speaking scientific community to present and exchange ideas on topics related to written documents.

International researchers presented their work on document image analysis and processing. I had the opportunity not only to participate in this conference, but also to present part of my thesis work.

At the end of the conference, I had the honor of receiving the Best Paper Award for my work on the recognition of semi-structured documents.

My oral presentation (in French) is available here.

Document image recognition

Document image matching

Recognizing document images can be a complex problem because the method has to be robust to translation, rotation and changes of scale. The documents may also be degraded (noise, spots, cuts, etc.).

Techniques based on interest points, such as SIFT and SURF, are commonly used on natural images (photographs). I worked on an extension of this approach to quickly recognize document patterns supplied by a user, such as an identity card, a passport, a train ticket, etc.

The method is simple and can be extended to many other kinds of document images. It is divided into four main steps:

  1. Extraction of interest points. (SURF)
  2. Description of points. (SURF)
  3. Matching the current image points with those of the query image. (FLANN)
  4. Estimation of a 4-parameter transformation. (RANSAC)

The technological choices in brackets may be replaced in the future by newer algorithms that are more efficient and better suited to this context.
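Step 4 can be sketched in plain Python as follows. This is a minimal illustration, not the implementation from the publication: it assumes the matches from step 3 are already available as two lists of 2D points stored as complex numbers, and it models the 4-parameter transformation (scale, rotation, and a 2D translation) as `dst = a * src + b` with complex `a` and `b`. The function names, the threshold and the synthetic data are all made up for the example.

```python
import cmath
import math
import random


def fit_similarity(src, dst):
    """Least-squares fit of dst ~ a * src + b over complex points.

    a encodes scale and rotation, b the translation: 4 real parameters.
    """
    n = len(src)
    src_mean = sum(src) / n
    dst_mean = sum(dst) / n
    num = sum((d - dst_mean) * (s - src_mean).conjugate()
              for s, d in zip(src, dst))
    den = sum(abs(s - src_mean) ** 2 for s in src)
    a = num / den
    return a, dst_mean - a * src_mean


def ransac_similarity(src, dst, threshold=2.0, iterations=200, seed=0):
    """Estimate (a, b) robustly; returns the model and the inlier indices."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        i, j = rng.sample(range(len(src)), 2)  # minimal sample: 2 matches
        if src[i] == src[j]:
            continue  # degenerate sample, cannot define a transformation
        a, b = fit_similarity([src[i], src[j]], [dst[i], dst[j]])
        inliers = [k for k in range(len(src))
                   if abs(a * src[k] + b - dst[k]) < threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no valid transformation found")
    # Refit on all inliers of the best model.
    a, b = fit_similarity([src[k] for k in best_inliers],
                          [dst[k] for k in best_inliers])
    return a, b, best_inliers


# Synthetic matches: scale 1.5, rotation 30 degrees, translation (40, 25).
a_true = cmath.rect(1.5, math.radians(30))
b_true = complex(40, 25)
src = [complex(x, y) for x in (0, 50, 100, 150) for y in (0, 40, 80)]
dst = [a_true * s + b_true for s in src]
for k in (2, 5, 9):  # simulate three wrong matches
    dst[k] = complex(500 + 10 * k, -300)

a, b, inliers = ransac_similarity(src, dst)
scale, angle_deg = abs(a), math.degrees(cmath.phase(a))
```

Two matches are enough to determine the four parameters, which keeps the number of RANSAC iterations low compared with a full homography (four matches). In practice a library routine such as OpenCV's `estimateAffinePartial2D` covers the same kind of model.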

The details of the technique can be found in the CIFED 2012 publication: Recognition and Extraction of identity documents (in French).