Automatic recognition of business forms and documents

Smart Document Engine

Automatic analysis and data extraction from business documents for desktop, server and mobile platforms.

 

Smart Document Engine is high-performance software for automatic classification, recognition and analysis of documents and forms. The system helps automate document management workflows and optimize document entry processes. Smart Document Engine quickly and securely scans and extracts the required data from various document types: standard and reporting forms; primary, business, statutory, financial, notarial, legal, insurance and banking documents; and standard questionnaires and strict-accountability forms.

 

Like other Smart Engines products, Smart Document Engine works autonomously: data is NOT transferred to servers, is NOT saved or stored, and internet access is NOT required. The document recognition process is performed on the user’s device.

 

Thanks to our state-of-the-art recognition and computer vision algorithms, compact deep neural network models with the Hough transform (HoughNet and HoughEncoder) and energy-efficient GreenOCR® text recognition technology developed by our scientists and engineers, Smart Document Engine solves business document recognition tasks even on mobile phones, a job that previously only high-performance servers could handle.

 

Thanks to Smart Engines technologies, a modern mobile phone can process a document stream from a specialized document scanner running at up to 30 pages per minute, a rate that previously only high-performance workstations or servers could sustain. Quality remains high: the solution delivers highly accurate recognition of text, digits and other document data. For example, recognition of a full-page tax certificate on a Galaxy S10 takes less than 3 seconds. The system quickly and precisely processes both scans and photos and automatically performs all the steps needed for classification, data extraction and text recognition, while remaining robust to lighting conditions, geometric distortions and poor image quality.

 

The most important feature of Smart Document Engine is the ability to create specialized solutions for automatic processing, classification, recognition and analysis of documents and forms of varying complexity, tailored to the needs of particular customers. High-performance, accurate document recognition solutions optimized for a specific data flow and business process allow your organization to reduce costs and ease the processing workload without compromising data security: the images are not transferred to any third parties or their services.

 

The GreenOCR® technology included in Smart Document Engine provides high recognition accuracy for printed text (OCR), handprinted and handwritten form fields (ICR), and marks and checkmarks (OMR). The implemented AI-based “I extract what I see” approach does not use linguistic models; high-quality text recognition is achieved through extremely accurate character recognition. This yields high digitization accuracy and avoids the errors that occur when a recognition system replaces what is actually written with a result suggested by a linguistic or statistical language model.

 

Developers are provided with a simple API for integrating Smart Document Engine into their software, with support for various programming languages such as C, C++, C#, Java, Python, PHP, Swift and Objective-C, on a wide range of operating systems: iOS, Android, Sailfish Mobile, Linux, Windows, macOS, Astra Linux, Atlix OS, etc. It is also possible to connect to popular RPA frameworks.

 

Send Request

 

 

 

Smart Document Engine customers

Gazprombank

Gazprombank integrated Smart Engines AI for form and document scanning

Rosbank

Rosbank implemented an artificial intelligence solution for paperwork

Tessi

Tessi partners with Smart Engines, a Russian developer of ID recognition systems

Alfa Insurance

Alfa Insurance has introduced the Intelligent Document Recognition system from Smart Engines

 

 




Overview

Classification, scanning and recognition of documents with fixed layout

 

Fixed-layout documents are documents whose copies are identical once the filled-in field contents are removed. Classic examples are medical absence certificates and machine-readable student examination forms. Smart Document Engine includes state-of-the-art algorithms for fast localization and type identification of fixed-layout forms in scans, photos and video streams. The technology allows you to instantly detect and scan such documents, as well as verify their types.

Classification and recognition of flexible forms

 

Flexible forms are documents whose elements and details can change their position relative to one another. Classic examples of flexible forms are tax and accounting documents, such as bills, enterprise balance sheets, financial performance reports, payment orders, etc. Smart Document Engine quickly identifies such documents, detects significant details and performs their recognition and analysis.

Analysis of unstructured documents

 

Smart Document Engine lets you create solutions for scanning and analyzing documents such as powers of attorney, agreements and contracts. With Smart Document Engine it is possible to classify such documents, extract text or other details, verify signatures and stamps, and much more.

Scans, photos and videos support

 

Smart Document Engine can recognize documents and forms both in images captured with flatbed and sheetfed scanners and in photos and videos captured with smartphones, tablets or other mobile devices.

Analysis of multi-page documents

 

Smart Document Engine allows you to classify and recognize individual images of documents, questionnaires and forms, as well as multi-page documents and page sequences containing multiple documents. Page sequence processing simplifies stream scanning: it lets you sort the stream and check that all required documents are present (completeness check).
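As an illustration of the completeness check described above, here is a minimal Python sketch. It assumes that each page of the stream has already been classified and labeled with a document type. The function name check_completeness and the type labels are hypothetical and are not part of the Smart Document Engine API.

```python
from collections import Counter


def check_completeness(page_types, required_types):
    """Return the sorted list of required document types missing from a page stream.

    page_types: document type labels, one per classified page (hypothetical labels).
    required_types: document types that must appear at least once in the stream.
    """
    seen = Counter(page_types)
    # A Counter returns 0 for labels it has never seen, so no KeyError is raised.
    return sorted(t for t in set(required_types) if seen[t] == 0)


# Hypothetical labels produced by an upstream classification step.
stream = ["payment_order", "balance_sheet", "balance_sheet", "tax_certificate"]
required = ["payment_order", "tax_certificate", "profit_and_loss"]

missing = check_completeness(stream, required)
print("Missing documents:", missing)
```

An empty result means the stream contains every required document type. A real pipeline would probably also need to group consecutive pages into documents, which this sketch leaves out.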

Text details recognition

 

The technology stack implemented in the Smart Document Engine platform allows you to quickly and accurately recognize single-line and multi-line text fields in more than 100 languages, extract details from dense paragraphs, typewritten and handprinted fields, and much more.

Recognition of marks and checkboxes

 

Smart Document Engine allows you to accurately and reliably recognize marks and checkboxes made both digitally and with a pen.

Verification and recognition of handwritten notes and signatures

 

Smart Document Engine allows you to extract handwritten fields and signatures, recognize handwritten numerical fields, and verify the presence of handwritten marks and signatures.

Table recognition

 

For high-quality processing of accounting, tax, banking and other documents, Smart Document Engine implements search and recognition of tabular data. Recognition is supported for tables with a fixed structure, relational and non-relational tables, tables with a variable number of columns, and table-like structured data.

Stamp detection and recognition

 

Smart Document Engine is equipped with fast stamp detection, location, and classification modules. In addition to the stamp search and verification, the platform allows you to recognize individual text components of stamps.

Checking keywords and keyphrases

 

When analyzing complex structured documents, Smart Document Engine allows you to check for the presence of keywords and keyphrases and to monitor the integrity of targeted paragraphs. This helps to identify significant phrases and paragraphs and to detect unauthorized changes in printed document text.
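The idea of keyphrase and paragraph integrity checks can be sketched in a few lines of Python. The sketch works on text that has already been recognized. The helper names and the normalization rule (lowercasing and collapsing whitespace) are assumptions made for illustration and do not describe how Smart Document Engine performs these checks internally.

```python
import re


def normalize(text):
    """Lowercase the text and collapse whitespace so line breaks do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_missing_phrases(recognized_text, keyphrases):
    """Return the keyphrases that do not occur in the recognized text."""
    haystack = normalize(recognized_text)
    return [p for p in keyphrases if normalize(p) not in haystack]


def paragraph_intact(recognized_paragraph, reference_paragraph):
    """Check that a recognized paragraph matches its reference wording."""
    return normalize(recognized_paragraph) == normalize(reference_paragraph)


# Hypothetical recognized text and reference wording.
text = "This Power of\nAttorney is valid until revoked."
print(find_missing_phrases(text, ["power of attorney", "notarized"]))
print(paragraph_intact("The  Agent may sign\ncontracts.", "The Agent may sign contracts."))
```

Exact comparison after normalization is the simplest possible rule. A production check would likely need to tolerate minor recognition differences, for example by using an edit-distance threshold.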

Content control

 

Smart Document Engine verifies that document data has been entered, including checking that mandatory text or graphic fields are filled in, and analyzes document zones intended for handwritten or handprinted entries.

Detection of blemishes, corrections and other markings

 

Smart Document Engine allows you to detect, locate and recognize blots, strikethroughs, corrections and other markings in a document in order to check its validity and extract additional information.

Logos and other graphic elements verification

 

Smart Document Engine allows you to detect, locate and verify graphic elements, such as company logos, as well as important document graphic fields, for example, a personal photo attached to a questionnaire.

Document chromaticity control

 

Smart Document Engine allows you to determine the colour of both the document as a whole and its individual elements (stamps and signatures). This makes it possible to recognize a black-and-white copy of a document even after it has been re-scanned or photographed.

Text field attributes analysis

 

In addition to text field recognition, Smart Document Engine allows you to analyze the attributes of text fields and individual characters, such as font characteristics (presence of serifs, italics and boldness), estimation of uniformity, monotonicity, etc.


Deliverables

With the Smart Document Engine SDK, you can add deep document layout analysis and recognition to your infrastructure solutions for back-office automation, as well as to mobile applications to maximally simplify remote automatic document.

 

Smart Document Engine is delivered as an autonomous document scanner SDK (software development kit) that contains all the necessary precompiled libraries, programming interface documentation and integration examples for various programming languages. Developers are provided with a simple but multifunctional API (application programming interface) for integrating the document recognition solution into the client’s software using C++, C#, Java, Python and Objective-C on a wide range of systems: iOS, Android, Linux, Windows, macOS, Sailfish Mobile, Astra Linux, etc. The following hardware platforms are supported: x86_64, ARM v7 and v8 (AArch32, AArch64), and MIPS. It is also possible to connect to popular RPA frameworks.

 

Out-of-the-box document templates include PCR test results from some popular laboratories, tax certificates, payment orders, balance sheets, and profit and loss statements.

 

Smart Document Engine can be bundled with Smart ID Engine to add ID document data entry. That solution provides fast, high-quality recognition of more than 1,600 document types. Recognition of codified objects such as machine-readable zones (MRZ) and barcodes is available through Smart Code Engine.

 

 


Features

GreenOCR®
Developed by our scientists and engineers within the framework of the Green AI initiative, GreenOCR® technology provides superior recognition quality and speed with minimal energy consumption and environmental impact. The recognition process is performed on the CPU and does not require additional GPU accelerators. The technology allows you to recognize typed or printed text (OCR), handwritten text (ICR) and marks (OMR).

 

Speed
An innovative integer image processing pipeline, including 8-bit and 4-bit neural network architectures, makes efficient use of the available hardware resources and allows you to run intelligent document recognition even on low-end devices.

 

Efficiency
High performance is achieved through computer vision algorithms and compact deep neural networks. The full cycle, from identifying the document type to recognizing all of its details, may take only 2 seconds for an A4 document page.

 

Precision
Our latest achievements in computational intelligence and deep learning allowed us to create next-generation OCR technologies and set a new benchmark of computer vision quality. The recognition accuracy of document details reaches 99.5% without human involvement.

Reliability
To increase reliability, the AI-based “I extract what I see” approach is implemented. This approach does not rely on any dictionaries or grammars and is based on responsible compact networks. In addition to the recognition results, the user can access a confidence rate for each document field and other information about the recognition process.
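Per-field confidence rates are typically used to decide which values can be accepted automatically and which should go to an operator. The following Python sketch shows one way to do this. The FieldResult structure, the field names, the 0.0 to 1.0 confidence scale and the 0.9 threshold are all assumptions made for illustration. The actual result format is described in the SDK documentation.

```python
from dataclasses import dataclass


@dataclass
class FieldResult:
    """Hypothetical container for one recognized document field."""
    name: str
    value: str
    confidence: float  # assumed to lie in the range 0.0 to 1.0


def split_by_confidence(fields, threshold=0.9):
    """Split fields into (accepted, needs_review) using a confidence threshold."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


# Hypothetical recognition results for a payment order.
fields = [
    FieldResult("payer_name", "ACME LLC", 0.98),
    FieldResult("amount", "1250.00", 0.72),
    FieldResult("date", "2021-03-15", 0.95),
]

accepted, needs_review = split_by_confidence(fields)
print("Accepted:", [f.name for f in accepted])
print("Needs review:", [f.name for f in needs_review])
```

The right threshold depends on the cost of an error for a particular field, so in practice it may be set separately for each field.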

 

Security
Thanks to the achievements of our scientists, all computations are performed on the device and data is not transferred to servers, as confirmed by an independent international audit. Data is NOT transferred, NOT stored, Internet access is NOT required, and data processing is performed in local RAM. The "rule of three NOTs" ensures a high level of security and privacy for our customers.

 

Operating Systems
Smart Document Engine supports a wide range of operating systems, including specialized operating systems for personal data processing: CentOS, Ubuntu, Red Hat Enterprise Linux, SUSE Linux Enterprise Server, Arch Linux and other Linux distributions, MS Windows, macOS, Aurora OS, iOS, Android and Sailfish Mobile OS.

Easy-to-use software
Our advanced AI algorithms automatically detect the document in the frame, automatically determine its type, find textual and graphical details and perform their recognition. The system is resistant to various geometric distortions, noise, inconsistent lighting, printing defects and low resolution.

 

Languages
The system supports recognition of documents in 100 languages worldwide, covering the Cyrillic and Latin alphabets and East Asian scripts including Korean, Japanese, and traditional and simplified Chinese. In addition, the system detects tables, stamps, signatures and handwritten notes, and recognizes checkboxes and barcodes.

 

Product line
Within Smart Document Engine you can use all the codified object recognition features of Smart Code Engine, as well as the ID document recognition and document liveness verification features of Smart ID Engine.
Our customers benefit from all Smart Engines products through a single SDK interface, which reduces integration costs.

 

RPA
Support for almost all operating systems and hardware platforms, speed, quality, a wide range of supported imaging devices (smartphones, tablets, smart cameras, webcams, document cameras, scanners) and flexible integration capabilities make the product suitable for automating data entry with robotic process automation (RPA).


Technical Specifications

Supported CPU architectures:

  • x86
  • x86_64
  • ARMv7-v8 (AArch32 and AArch64)
  • MIPS (MIPS32 and MIPS64)

Supported mobile OS:

  • Android (5.1 and up)
  • iOS (10 and up)
  • Sailfish Mobile OS (2.2 and up)
  • Custom versions: upon request

Supported server and desktop OS:

  • MS Windows (all officially supported versions)
  • Linux kernel based OS, including Ubuntu, Red Hat (RHEL), Debian, CentOS, SUSE, Astra Linux, Oracle Linux, and others
  • macOS (all officially supported versions)
  • QNX (version 7.0 and up)
  • Solaris (version 11.3 and up)
  • Custom versions: upon request

Supported document types:

  • PCR test results
  • Tax certificates
  • Payment orders
  • Balance sheets
  • Profit and loss statements

Other document types can be added upon request.

Multilingual text OCR:

Abkhaz, Afrikaans, Albanian, Armenian, Aymara, Azerbaijanian, Belarusian, Berber (Latin alphabet), Bosnian, Bulgarian, Cantonese, Castilian, Catalan, Chewa (Latin alphabet), Chibarwe (Latin alphabet), Chichewa (Latin alphabet), Comorian (Latin alphabet), Croatian, Czech, Danish, Dutch, English, Estonian, Fiji Hindi (Latin script), Fijian, Filipino, Finnish, French, Georgian, German, Greek, Haitian Creole, Hebrew, Hungarian, Icelandic, Indonesian, Irish, Italian, Japanese, Kazakh, Kinyarwanda, Kirundi, Korean, Kyrgyz, Latvian, Lithuanian, Luxembourgish, Macedonian, Malay (Latin alphabet), Maltese, Mandarin, Maori, Moldovan, Mongolian (Cyrillic alphabet), Nauruan, Ndau, Ndebele, Northern Sotho, Norwegian, Ossetian, Palauan, Polish, Portuguese, Quechua, Romanian, Russian, Serbian, Seychellois Creole, Shangani, Shona, Slovak, Slovene, Somali (Latin alphabet), Sotho, Southern Ndebele, Spanish, Swahili (Latin alphabet), Swazi, Swedish, Thai, Tajik, Tamazight (Latin alphabet), Tetum, Tok Pisin, Tonga, Tsonga, Tswana, Turkish, Turkmen, Ukrainian, Uzbek, Woleaian, Xhosa, Zulu.

Supported programming languages:

  • Java (version 1.7 and higher)
  • Python (version 3.7.2 and higher)
  • C++ (standard C++ 11 and higher)
  • C (standard C99 and higher)
  • C# (version 6.0 and higher)
  • PHP (version 5 and higher)

Mobile SDK includes React and Flutter integration interfaces.

Integration with other programming languages and frameworks (such as Go, Perl, Xamarin, etc.) is ensured by using the API in C.

The SDK includes examples of using Smart Document Engine in Java, Python, C++, C, Swift, Objective-C, C# and PHP.

Send Request

Please fill out the form to get more information about the products, pricing and a trial SDK for Android, iOS, Linux and Windows.

Our customers

Rosbank

Rosbank has implemented an artificial intelligence solution for paperwork

Tessi

Tessi integrates Smart Engines AI-based solutions into its business process services

Oman Arab Bank

Smart Engines helps to implement Digital User Onboarding at Oman Arab Bank

Sum&Substance

Sum&Substance, a global KYC/AML service provider, uses Smart ID Engine for ID scanning
