Handwritten digit recognition based on neural networks

During the summers of 2012 and 2013 I supervised two master's internships on handwritten digit recognition at Gestform, a digitizing company. The first covered the basics of reading a handwritten digit in a simple case, and the second covered the segmentation of digits.

Here is an example of what we want to do:

[Figure: HandwrittenDigits]

Two things are done here: 1) finding which parts of the text contain digits and 2) reading the digits. Almost the same tools are used for both parts.

Which parts of the text contain digits?

First, we try to segment the text line by line. Several algorithms can be applied for this; here a simple one is used: RLSA (Run Length Smoothing Algorithm).
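The horizontal pass of RLSA can be sketched as follows. This is a minimal illustration, assuming the page is already binarized into a list of rows with 1 for ink and 0 for background. The function name `rlsa_horizontal` and the `max_gap` threshold are illustrative choices, not the values used during the internships.

```python
def rlsa_horizontal(image, max_gap):
    """Fill every background run of at most max_gap pixels that lies
    between two ink pixels on the same row."""
    result = [row[:] for row in image]
    for src, dst in zip(image, result):
        last_ink = None  # column of the previous ink pixel on this row
        for col, value in enumerate(src):
            if value != 1:
                continue
            if last_ink is not None:
                gap = col - last_ink - 1
                if 0 < gap <= max_gap:
                    for c in range(last_ink + 1, col):
                        dst[c] = 1
            last_ink = col
    return result


row = [[1, 0, 0, 1, 0, 0, 0, 0, 1]]
smeared = rlsa_horizontal(row, max_gap=2)
print(smeared)
```

With a threshold close to the spacing between words, the characters of a line merge into a few long blobs, which makes the lines easier to separate.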

[Figure: RLSA]

Then each connected component is extracted and analyzed in order to classify it as digit or non-digit.
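Connected component extraction can be sketched with a simple flood fill. This is only an illustration, assuming a binary image stored as a list of rows (1 for ink) and 8-connectivity; the helper names `connected_components` and `bounding_box` are hypothetical, and a real pipeline would more likely rely on an image processing library.

```python
from collections import deque


def connected_components(image):
    """Return one list of (row, col) pixels per 8-connected ink blob."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for r in range(height):
        for c in range(width):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right) of a component, inclusive."""
    rows = [y for y, _ in pixels]
    cols = [x for _, x in pixels]
    return min(rows), min(cols), max(rows), max(cols)


image = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
for blob in connected_components(image):
    print(bounding_box(blob))
```

Each bounding box can then be cropped and passed to the feature extraction step.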

[Figure: CC]

[Figure: CC2]

As we can see, some preprocessing based on mathematical morphology (dilation and erosion) should be done before extracting the connected components, because some parts of the digits are cut.
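A dilation followed by an erosion (a morphological closing) is one way to reconnect such cut strokes. The sketch below assumes a 3x3 square structuring element and treats pixels outside the image as background; both are assumptions made for the example, and the function names are illustrative.

```python
NEIGHBOURHOOD = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _pixel(image, r, c):
    """Read a pixel, treating everything outside the image as background."""
    if 0 <= r < len(image) and 0 <= c < len(image[0]):
        return image[r][c]
    return 0


def dilate(image):
    """A pixel becomes ink if any pixel of its 3x3 neighbourhood is ink."""
    return [[int(any(_pixel(image, r + dy, c + dx) for dy, dx in NEIGHBOURHOOD))
             for c in range(len(image[0]))]
            for r in range(len(image))]


def erode(image):
    """A pixel stays ink only if its whole 3x3 neighbourhood is ink."""
    return [[int(all(_pixel(image, r + dy, c + dx) for dy, dx in NEIGHBOURHOOD))
             for c in range(len(image[0]))]
            for r in range(len(image))]


def close_gaps(image):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(image))


broken_stroke = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],  # one-pixel gap in the middle of the stroke
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
print(close_gaps(broken_stroke)[2])
```

The closing fills the one-pixel gap while leaving the rest of the stroke unchanged, so the two halves end up in the same connected component.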

So that the features are comparable from one sample to another, each digit is normalized before feature extraction: its slope and its angle are corrected.

Then many features are extracted, such as:

  • Hu invariant moments
  • Projection histograms
  • Profile histograms
  • Intersection with horizontal lines
  • Position of holes
  • Extremities
  • Junction points

[Figure: Profile histogram]

[Figure: Projection histogram]

[Figure: Intersection with horizontal and vertical lines]

They are all concatenated into one vector of 124 dimensions. Another vector is built from the Freeman chain code (a histogram of 128 dimensions).
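Three of these features can be sketched in a few lines. The example assumes a cropped binary digit stored as a list of rows (1 for ink); the function names are illustrative, and the exact sampling that leads to 124 dimensions is not reproduced here.

```python
def projection_histograms(image):
    """Number of ink pixels in each row and in each column."""
    rows = [sum(row) for row in image]
    cols = [sum(col) for col in zip(*image)]
    return rows, cols


def left_profile(image):
    """Distance from the left edge to the first ink pixel of each row
    (the image width when the row is empty)."""
    width = len(image[0])
    return [row.index(1) if 1 in row else width for row in image]


def horizontal_crossings(image):
    """Number of strokes crossed by a horizontal line on each row,
    counted as background-to-ink transitions."""
    return [sum(1 for prev, cur in zip([0] + row, row) if prev == 0 and cur == 1)
            for row in image]


zero_like = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
rows, cols = projection_histograms(zero_like)
feature_vector = rows + cols + left_profile(zero_like) + horizontal_crossings(zero_like)
print(feature_vector)
```

The final line shows the concatenation step: every feature is flattened into a list and the lists are joined into a single vector.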

[Figure: Freeman chain code]
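As an illustration, the sketch below encodes an already traced contour with the 8-direction Freeman code and builds a normalized histogram of the directions. The numbering (0 for east, then counter-clockwise) is one common convention and is an assumption here, as are the function names. How the 128 dimensions are laid out is not specified above, so only the basic 8-bin histogram is shown.

```python
# (row step, column step) -> Freeman direction, with rows growing downwards.
FREEMAN_DIRECTIONS = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}


def freeman_chain_code(contour):
    """Encode a closed contour, given as successive 8-connected (row, col)
    points, as a list of direction codes."""
    codes = []
    for (r0, c0), (r1, c1) in zip(contour, contour[1:] + contour[:1]):
        codes.append(FREEMAN_DIRECTIONS[(r1 - r0, c1 - c0)])
    return codes


def direction_histogram(codes):
    """Relative frequency of each of the 8 directions."""
    counts = [0] * 8
    for code in codes:
        counts[code] += 1
    total = len(codes) or 1
    return [count / total for count in counts]


square = [(0, 0), (0, 1), (1, 1), (1, 0)]
codes = freeman_chain_code(square)
print(codes)
print(direction_histogram(codes))
```

One possible way to get a longer vector would be to compute such a histogram on several zones of the digit and concatenate them, but that is a guess rather than the method used here.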

After extracting the features, two neural networks are used to classify each connected component as digit or non-digit. The first one has 124 inputs and the second one 128; each has 2 outputs: D (digit) and R (reject, i.e. non-digit). Many examples are needed to train the classifiers (around 10,000 for each class).
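The shape of such a classifier can be sketched as a forward pass through a small multilayer perceptron. Only the input size (124) and the two outputs come from the description above; the single hidden layer of 30 units, the tanh activation, the softmax output and the random untrained weights are assumptions made for the example.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Create a dense layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, inputs):
    """Weighted sum of the inputs plus bias, for every unit of the layer."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(values)
    exps = [math.exp(v - highest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, hidden_layer, output_layer, labels):
    """Return the most probable label and the probability of each label."""
    hidden = [math.tanh(v) for v in dense(hidden_layer, features)]
    probabilities = softmax(dense(output_layer, hidden))
    best = max(range(len(labels)), key=probabilities.__getitem__)
    return labels[best], probabilities


rng = random.Random(0)
hidden_layer = make_layer(124, 30, rng)  # 30 hidden units: arbitrary choice
output_layer = make_layer(30, 2, rng)
features = [rng.random() for _ in range(124)]  # stand-in for a real feature vector
label, probabilities = classify(features, hidden_layer, output_layer, ["D", "R"])
print(label, probabilities)
```

Since the weights are not trained, the printed label is meaningless; the point is only the data flow from a 124-dimensional vector to one probability per class. The same structure works with any list of labels.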

[Figure: NonDigits]

Reading the digits

Here, the same features and neural networks are used, but with 10 classes (0, 1, 2, 3, 4, 5, 6, 7, 8, 9) instead of 2 (digit / non-digit).

You can download training examples here. Some examples of the digit 6:

[Figure: mnist6]

Correcting some classification errors

Many digits touch their neighbours and are classified as R, so we introduced a new class, “DD” (double digit). Furthermore, the sequence of labels can be used to correct some errors. For example, if you are looking for a 5-digit postal code, a result such as RRDDDRDRR can be changed to RRDDDDDRR, and noise can be filtered out: RRRRRRDRRRR -> RRRRRRRRRRR. A hidden Markov model (HMM) is used for this. Here is an HMM designed for postal codes with the D (digit), DD (double digit) and R (reject / non-digit) classes:

[Figure: HMM]
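The correction step can be illustrated with Viterbi decoding on a small left-to-right HMM. This is a simplified sketch, not the model shown in the figure: it leaves out the DD class, and the state names, transition probabilities and the 0.8 probability that the classifier outputs the right label are all assumptions chosen for the example.

```python
import math

STATES = ["R_before", "D1", "D2", "D3", "D4", "D5", "R_after"]
LABEL = {s: ("D" if s.startswith("D") else "R") for s in STATES}
TRANSITIONS = {
    "R_before": {"R_before": 0.9, "D1": 0.1},
    "D1": {"D2": 1.0}, "D2": {"D3": 1.0}, "D3": {"D4": 1.0}, "D4": {"D5": 1.0},
    "D5": {"R_after": 1.0},
    "R_after": {"R_after": 1.0},
}
START = {"R_before": 1.0}
P_CORRECT = 0.8  # assumed probability that the classifier label is right


def log(p):
    return math.log(p) if p > 0 else float("-inf")


def emission(state, observed):
    return P_CORRECT if LABEL[state] == observed else 1.0 - P_CORRECT


def correct_sequence(observations):
    """Return the most likely true label sequence for the observed labels."""
    scores = {s: log(START.get(s, 0.0)) + log(emission(s, observations[0]))
              for s in STATES}
    back_pointers = []
    for observed in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: scores[p] + log(TRANSITIONS[p].get(s, 0.0)))
            new_scores[s] = (scores[prev] + log(TRANSITIONS[prev].get(s, 0.0))
                             + log(emission(s, observed)))
            pointers[s] = prev
        back_pointers.append(pointers)
        scores = new_scores
    # Backtrack from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    return "".join(LABEL[s] for s in reversed(path))


print(correct_sequence("RRDDDRDRR"))    # RRDDDDDRR
print(correct_sequence("RRRRRRDRRRR"))  # RRRRRRRRRRR
```

Because the digit states form a chain of exactly five steps, a single R inside a run of digits is cheaper to explain as a classifier error than as the end of the code, and an isolated D is cheaper to explain as noise than as a whole postal code.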

Double digit segmentation

To segment double digits, the “drop fall” algorithm is used.

[Figure: dropFall]

The drop fall algorithm can be pictured as a drop of water sliding along the digits. Four drop falls can be computed, depending on whether the starting point is at the top left, top right, bottom right or bottom left. Then, to choose the best segmentation, the resulting digits are recognized by the neural network, and the pair of digits with the best recognition score is kept.
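A descending drop fall can be sketched as below. The movement rules vary between the papers listed in the bibliography; this simplified version assumes the priority down, down-right, down-left, right, left, and lets the drop cut straight down through the ink when it is stuck. The function name and the image encoding (list of rows, 1 for ink) are illustrative.

```python
def drop_fall(image, start_col):
    """Return the path of (row, col) cells followed by a drop released on
    the top row at start_col, down to the bottom row."""
    height, width = len(image), len(image[0])

    def is_free(r, c):
        return 0 <= r < height and 0 <= c < width and image[r][c] == 0

    row, col = 0, start_col
    path = [(row, col)]
    last_side = 0  # direction of the previous sideways move, 0 if none
    while row < height - 1:
        if is_free(row + 1, col):
            row, last_side = row + 1, 0
        elif is_free(row + 1, col + 1):
            row, col, last_side = row + 1, col + 1, 0
        elif is_free(row + 1, col - 1):
            row, col, last_side = row + 1, col - 1, 0
        elif last_side != -1 and is_free(row, col + 1):
            col, last_side = col + 1, 1
        elif last_side != 1 and is_free(row, col - 1):
            col, last_side = col - 1, -1
        else:
            # Stuck in a basin: cut through the ink, which is where
            # two touching digits get separated.
            row, last_side = row + 1, 0
        path.append((row, col))
    return path


touching = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
print(drop_fall(touching, start_col=0))
# [(0, 0), (0, 1), (1, 2), (2, 2), (3, 2)]
```

The drop slides into the valley between the two blobs and cuts the single row of ink that joins them. The other three variants could be obtained by running the same function on a mirrored or vertically flipped image, and each candidate cut would then be scored by the digit classifier.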

Bibliography

Yi-Kai Chen and Jhing-Fa Wang. Segmentation of single- or multiple-touching handwritten numeral string using background and foreground analysis. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 22(11):1304–1317, 2000.

Britto de S, Robert Sabourin, Flavio Bortolozzi, Ching Y Suen, et al. A string length predictor to control the level building of HMMs for handwritten numeral recognition. In Pattern Recognition, 2002. Proceedings. 16th International Conference on, volume 4, pages 31–34. IEEE, 2002.

RV Kulkarni and PN Vasambekar. An overview of segmentation techniques for handwritten connected digits. In Signal and Image Processing (ICSIP), 2010 International Conference on, pages 479–482. IEEE, 2010.

Umapada Pal, Abdel Belaïd, and Christophe Choisy. Water reservoir based approach for touching numeral segmentation. In Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on, pages 892–896. IEEE, 2001.

Ma Rui, Du Jie, Gu Yunhua, and Yan Yunyang. An improved drop-fall algorithm based on background analysis for handwritten digits segmentation. In Intelligent Systems, 2009. GCIS’09. WRI Global Congress on, volume 4, pages 374–378. IEEE, 2009.

Javad Sadri, Ching Y Suen, and Tien D Bui. Automatic segmentation of unconstrained handwritten numeral strings. In Frontiers in Handwriting Recognition, 2004. IWFHR-9 2004. Ninth International Workshop on, pages 317–322. IEEE, 2004.

Jie Zhou and Ching Y Suen. Unconstrained numeral pair recognition using enhanced error correcting output coding: a holistic approach. In Document Analysis and Recognition, Proceedings. Eighth International Conference on, pages 484–488. IEEE, 2005.

Clément Chatelain, Guillaume Koch, Laurent Heutte, and Thierry Paquet. Une méthode dirigée par la syntaxe pour l’extraction de champs numériques dans les courriers entrants. 2006.
