October 6, 2019

U-Net-bin: hacking the document image binarization contest

Computer Optics. – 2019. – Vol. 43(5). – P. 825-832. – DOI: 10.18287/2412-6179-2019-43-5-825-832.

Image binarization remains a challenging task in a variety of applications. In particular, the Document Image Binarization Contest (DIBCO) is organized regularly to track state-of-the-art techniques for historical document binarization. In this work we present a binarization method that was ranked first in the DIBCO'17 contest. It is a convolutional neural network (CNN) based method that uses the U-Net architecture, originally designed for biomedical image segmentation. We describe our approach to training data preparation and to examining the contest ground truth, and provide multiple insights into how that ground truth was constructed (so-called hacking). This led to a more accurate problem statement for historical document binarization with respect to the challenges one may face in open-access datasets. A Docker container with the final network, along with all the supplementary data used in the training process, has been published on GitHub.
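
To illustrate what CNN-based binarization means in practice, here is a minimal, dependency-free sketch of the usual inference pattern: the page is cut into tiles, a model predicts a per-pixel ink probability for each tile, and the probabilities are thresholded into a black-and-white image. This is not the published U-Net-bin pipeline. The function names `binarize_with_model` and `dummy_predictor`, the tile size, and the 0.5 threshold are all assumptions made for illustration, and the stand-in predictor simply inverts intensity where a trained network would go.

```python
def binarize_with_model(image, predict_patch, patch=4, threshold=0.5):
    """Binarize a grayscale image (list of rows, values 0..255) tile by tile.

    predict_patch is a hypothetical callable: it takes a tile of floats in
    [0, 1] and returns a same-shaped tile of ink probabilities.
    Returns a list of rows with 0 for ink and 255 for background.
    """
    h, w = len(image), len(image[0])
    out = [[255] * w for _ in range(h)]
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            # Cut a tile (smaller at the right/bottom edges) and scale to [0, 1].
            tile = [[image[r][c] / 255.0 for c in range(x, min(x + patch, w))]
                    for r in range(y, min(y + patch, h))]
            probs = predict_patch(tile)
            # Threshold the per-pixel probabilities into ink / background.
            for dy, row in enumerate(probs):
                for dx, p in enumerate(row):
                    if p >= threshold:
                        out[y + dy][x + dx] = 0
    return out


def dummy_predictor(tile):
    # Stand-in for a trained network (assumption): darker pixels are
    # treated as more likely to be ink.
    return [[1.0 - v for v in row] for row in tile]
```

A trained segmentation network would replace `dummy_predictor`. The tiling and thresholding around it stay the same regardless of which model produces the probabilities.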

