On 1 December 2009, I started a “CIFRE” thesis with Gestform, a document digitization company, and the LaBRI (Bordeaux Laboratory for Computer Science), in the Image and Sound team. My supervisors are Nicholas Journet and Jean-Philippe Domenger.

A CIFRE thesis is a little special because it is funded by a company: I work half of the time at the company and half at the laboratory.

My thesis focuses on the analysis and classification of document images.

A paperless office has many benefits: it reduces the number of printouts, and therefore the consumption of paper and ink, and it saves physical space. But the main benefit is the time saved when searching for documents with digital search tools. We can easily search on the content of a document or on its creation date, type, etc.

The difficulty lies in transforming a paper document into a digital document. We need to analyze it and retrieve information such as the text (using OCR), the layout of the document, drawings, logos, etc. It is also useful for a company to sort or group documents automatically or semi-automatically, so that when a new document is scanned it can be stored automatically in a folder with documents of the same type.
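To make the idea of automatic sorting a bit more concrete, here is a minimal sketch in Python. It is only an illustration and not the method developed in the thesis: it assumes the text of each document has already been extracted by OCR, and it simply compares word counts with cosine similarity to pick the closest folder. The function names, folder names and sample texts are all hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphabetic words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def choose_folder(new_text, examples):
    """Return the folder whose example text is most similar to new_text.

    `examples` maps a folder name to the OCR text of one document
    already stored in that folder (hypothetical data).
    """
    new_vector = bag_of_words(new_text)
    return max(
        examples,
        key=lambda folder: cosine_similarity(
            new_vector, bag_of_words(examples[folder])
        ),
    )


# Hypothetical OCR output for documents that are already sorted.
examples = {
    "invoices": "invoice number total amount due payment vat",
    "contracts": "contract agreement between the parties signature clause",
}

# Hypothetical OCR output for a newly scanned document.
new_document = "Invoice 2009-12: total amount due before payment deadline"
folder = choose_folder(new_document, examples)
```

A real system would of course need more than the text: OCR errors, layout and logos all carry information that a plain word count ignores, and a semi-automatic mode could ask the user to confirm the folder when the best similarity is low.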

The main research areas of my thesis will be image analysis and processing, classification, and learning techniques.

This blog will mainly contain information about my personal and professional research on the analysis, processing and classification of document images.

I will certainly also post information on related areas such as image processing, machine learning, data mining, etc.

You can also visit my website: and contact me at the following address: augereau.o at