Smart Document Engine SDK Overview

About Smart Document Engine

Smart Document Engine is a multi-platform, stand-alone SDK for recognizing structured documents and standard forms, from payment bills to acts, invoices, and transfer documents.

Supported operating systems

The following operating systems are supported:

  • Windows;
  • Linux-based systems;
  • Mac OS;
  • iOS;
  • Android.

Workflow

General workflow includes the following stages:

  1. creation of a DocEngine instance, see Creating a DocEngine instance;
  2. setting up a recognition session, see Setting up a recognition session;
  3. creation of a processing settings object, see Creating a processing settings object;
  4. registering the image object in the session, see Registering the Image object in the session;
  5. processing the image, see Processing the image;
  6. getting the recognition result, see Recognition result.

Creating a DocEngine instance

Create a DocEngine instance as follows:

// C++
std::unique_ptr<se::doc::DocEngine> engine(se::doc::DocEngine::Create(
configuration_bundle_path));

// Java
import com.smartengines.doc.*;
DocEngine engine = DocEngine.Create(configuration_bundle_path);

Parameters:

  • configuration_bundle_path is the path to the provided configuration bundle (usually a file with the .se extension);
  • an optional boolean flag that enables lazy configuration (true by default):
    • if lazy configuration is enabled, some of the internal structures are allocated and initialized only when first needed;
    • if lazy configuration is disabled, all internal structures and components are initialized during DocEngine instance creation.

TIP

Disable lazy configuration for:
  • server applications for which the first recognition response time is more important than the total memory consumption;
  • measuring the maximum memory consumed by an application.

The configuration process may take a while, but it needs to be performed only once during the program lifetime. A configured DocEngine is used to spawn sessions, which provide the actual recognition methods.

ATTENTION!

DocEngine::Create() is a factory method and returns an allocated pointer. You are responsible for deleting it.

See more about configuration bundles in Configuration Bundles.

Setting up a recognition session

Creating a session settings object

To create a DocSessionSettings object from a configured DocEngine:

// C++
std::unique_ptr<se::doc::DocSessionSettings> settings(
engine->CreateSessionSettings());

// Java
import com.smartengines.doc.*;
DocSessionSettings settings = engine.CreateSessionSettings();

ATTENTION!

DocEngine::CreateSessionSettings() is a factory method and returns an allocated pointer. You are responsible for deleting it.

Enabling document types

Enable required document types as shown in the following examples:

// C++
settings->AddEnabledDocumentTypes("deu.*"); // All the documents of Germany

// Java
settings.AddEnabledDocumentTypes("deu.*"); // All the documents of Germany

See more about document types in Specifying document types for DocSession.

Spawning a DocSession

A personal signature is provided to the customer when the Smart Document Engine product is delivered. This signature is located in the README.html file in the /doc directory.

Each time a Smart Document Engine recognition session instance is created, the signature must be passed as one of the arguments to the session creation function. It confirms that the caller is authorized to use the library and unlocks the library.

The signature is checked offline; the library does not access any external resources.

To spawn a DocSession:

// C++
const char* signature = "... YOUR SIGNATURE HERE ..."; //Your personal signature you use to start Smart Document Engine session
std::unique_ptr<se::doc::DocSession> session(
engine->SpawnSession(*settings, signature));

// Java
import com.smartengines.doc.*;

String signature = "... YOUR SIGNATURE HERE ..."; //Your personal signature you use to start Smart Document Engine session
DocSession session = engine.SpawnSession(settings, signature);

ATTENTION!

DocEngine::SpawnSession() is a factory method and returns an allocated pointer. You are responsible for deleting it.

Creating a processing settings object

Create a processing settings object as follows:

// C++
std::unique_ptr<se::doc::DocProcessingSettings> proc_settings(
session->CreateProcessingSettings());

// Java
import com.smartengines.doc.*;
DocProcessingSettings proc_settings = session.CreateProcessingSettings();

Creating a document image object

Create an Image object which will be used for processing:

// C++
std::unique_ptr<se::common::Image> image(
se::common::Image::FromFile(image_path)); // Loading from file

// Java
import com.smartengines.common.*; // Image is one of the common classes
Image image = Image.FromFile(image_path); // Loading from file

ATTENTION!

Image::FromFile() is a factory method and returns an allocated pointer. You are responsible for deleting it.

Registering the Image object in the session

Register the Image object in the session and set it as the current source:

// C++
int image_id = session->RegisterImage(*image);
proc_settings->SetCurrentSourceID(image_id);

// Java
int image_id = session.RegisterImage(image);
proc_settings.SetCurrentSourceID(image_id);

Processing the image

Call the Process(…) method to launch the session’s processing routine:

// C++
session->Process(*proc_settings);

// Java
session.Process(proc_settings);

ATTENTION!

DocSession::Process() is not a factory method, but the result object is not independent of the session: its lifetime does not exceed the session lifetime.

Recognition result

Obtaining the current result from the session

Obtain the current result from the session:

// C++
const se::doc::DocResult& result = session->GetCurrentResult();

// Java
import com.smartengines.doc.*;
DocResult result = session.GetCurrentResult();

Extracting the recognized information

Use DocResult fields to extract the recognized information:

// C++
// Going through the found documents
for (auto doc_it = result.DocumentsBegin();
     doc_it != result.DocumentsEnd();
     ++doc_it) {
    const se::doc::Document & doc = doc_it.GetDocument();
   
    // Going through the text fields
    for (auto it = doc.TextFieldsBegin();
         it != doc.TextFieldsEnd();
         ++it) {
        // Getting text field value (UTF-8 string representation)
        std::string field_value = it.GetField().GetOcrString().GetFirstString().GetCStr();
    }
}

// Java
import com.smartengines.doc.*;
// Going through the found documents
for (DocumentsIterator doc_it = result.DocumentsBegin();
     !doc_it.Equals(result.DocumentsEnd());
     doc_it.Advance()) {
    Document doc = doc_it.GetDocument();
    // Going through the text fields
    for (DocTextFieldsIterator it = doc.TextFieldsBegin();
         !it.Equals(doc.TextFieldsEnd());
         it.Advance()) {
        // Getting text field value (UTF-8 string representation)
        String field_value = it.GetField().GetOcrString().GetFirstString().GetCStr();
    }
}

Exporting the result to PDF

  1. Before calling SpawnSession, enable PDF/A creation with the enablePDF option in DocSessionSettings:

    // C++

    settings->SetOption("enablePDF", "true");

    // Java
    settings.SetOption("enablePDF", "true");

  2. Obtain the mutable current result from the session:

    // C++
    se::doc::DocResult& result = session->GetMutableCurrentResult();

    // Java
    import com.smartengines.doc.*;
    DocResult result = session.GetMutableCurrentResult();

  3. Check if a PDF/A buffer can be obtained for the processed document:

    // C++
    bool pdf_is_available = result.CanBuildPDFABuffer();

    // Java
    Boolean pdf_is_available = result.CanBuildPDFABuffer();

  4. Enable the addition of a text layer if needed (the default value is “image_only”):

    // C++
    result.SetAddTextMode("image_with_text");

    // Java
    result.SetAddTextMode("image_with_text");

  5. Change the text addition mode if you need exact character-to-character geometrical correspondence (the default value is “words”). Using this option makes the resulting file larger:

    // C++
    result.SetAddTextMode("chars");

    // Java
    result.SetAddTextMode("chars");

  6. Build a PDF/A buffer:

    // C++
    result.BuildPDFABuffer();

    // Java
    result.BuildPDFABuffer();

  7. Get the resulting buffer size and copy the buffer contents to a caller-created buffer:

    // C++
    const size_t pdf_size = result.GetPDFABufferSize();
    unsigned char* pdfb = new unsigned char[pdf_size]; // caller-owned buffer
    result.GetPDFABuffer(pdfb, pdf_size);
    // ... use the buffer, then release it
    delete[] pdfb;

    // Java
    int pdf_size = result.GetPDFABufferSize();
    byte[] pdfb = new byte[pdf_size];
    result.GetPDFABuffer(pdfb);
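
The copied buffer usually has to be saved or sent somewhere. The following is a minimal sketch of writing it to disk, assuming only the C++ standard library. The helper name WritePdfBuffer is hypothetical and not part of the SDK; a std::vector is used instead of new[] so that the memory is released automatically.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the SDK): writes a byte buffer to a file
// in binary mode. Returns true only if the stream reports no error.
bool WritePdfBuffer(const std::vector<unsigned char>& buffer,
                    const std::string& path) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;  // the file could not be opened for writing
    }
    out.write(reinterpret_cast<const char*>(buffer.data()),
              static_cast<std::streamsize>(buffer.size()));
    out.flush();
    return out.good();
}
```

With the SDK, the vector would be sized with result.GetPDFABufferSize() and filled with result.GetPDFABuffer(buffer.data(), buffer.size()), as in step 7, before being passed to WritePdfBuffer.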

Delivery package structure

The basic Smart Document Engine delivery package includes: 

  • C++ prebuilt library;
  • documentation;
  • usage examples;
  • prebuilt wrappers for C#/Java/Python/PHP;
  • Objective-C and/or React/Flutter wrapper source code for mobile platforms;
  • wrapper auto generators for building the wrappers for the client’s product environment (if an auto generator fails to start, use the client’s exact version of Python 3 or PHP).

The files are arranged in directories as shown in the table below:

Directory | Contents | Description
secommon | C++ se::common namespace files; integration files, for example, the Java com.smartengines.common module (one compiled file) | Common classes, such as Point, OcrString, Image, etc. See Common classes
doc | Documentation | See Code documentation
samples | Complete compilable and runnable sample usage code |
data-zip | Bundle files in the format bundle_something.se | Configuration files. See Configuration bundles

Common classes

Common classes, such as Point, OcrString, Image, etc., belong to the se::common namespace and are located in the secommon directory.

For C++, these are the following headers:

Header Description
#include <secommon/se_export_defs.h>   Contains export-related definitions of Smart Engines libraries
#include <secommon/se_exceptions_defs.h>   Contains the definition of exceptions used in Smart Engines libraries
#include <secommon/se_geometry.h>   Contains geometric classes and procedures (Point, Rectangle, etc.)
#include <secommon/se_image.h>   Contains the definition of the Image class 
#include <secommon/se_string.h>   Contains the string-related classes (MutableString, OcrString, etc.)
#include <secommon/se_string_iterator.h>   Contains the definition of string-targeted iterators
#include <secommon/se_serialization.h>   Contains auxiliary classes related to object serialization (not used in Smart Document Engine)
#include <secommon/se_common.h>   This is an auxiliary header which simply includes all of the above

The same common classes in Java API are located within com.smartengines.common module:

// Java
import com.smartengines.common.*; // Import all se::common classes

Main classes

The main Smart Document Engine classes belong to the se::doc namespace and are located in the docengine directory:

Header Description
#include <docengine/doc_document_info.h> Provides information about the document type (textual document description)
#include <docengine/doc_engine.h>   Contains DocEngine class definition
#include <docengine/doc_session_settings.h>  Contains DocSessionSettings class definition
#include <docengine/doc_session.h>   Contains DocSession class definition
#include <docengine/doc_video_session.h> Contains DocVideoSession class definition
#include <docengine/doc_processing_settings.h> Contains DocProcessingSettings class definition
#include <docengine/doc_result.h>   Contains DocResult class definition, as well as DocTemplateDetectionResult and DocTemplateSegmentationResult
#include <docengine/doc_document.h> Contains Document class definition
#include <docengine/doc_documents_iterator.h> Contains documents related iterators
#include <docengine/doc_fields.h>   Contains the definitions of classes representing Smart Document Engine fields
#include <docengine/doc_fields_iterator.h>  Contains fields related iterators
#include <docengine/doc_feedback.h>   Contains the DocFeedback interface and associated containers
#include <docengine/doc_external_processor.h> Contains the external document processing interface
#include <docengine/doc_graphical_structure.h> Contains DocGraphicalStructure class definition
#include <docengine/doc_tags_collection.h> Contains DocTagsCollection class definition
#include <docengine/doc_view.h> Contains DocView class definition
#include <docengine/doc_views_iterator.h> Contains DocView (document images) related iterators
#include <docengine/doc_views_collection.h> Contains DocViewsCollection class definition
#include <docengine/doc_basic_object.h> Contains DocBasicObject class definition
#include <docengine/doc_basic_objects_iterator.h> Contains DocBasicObject (basic document objects) related iterators
#include <docengine/doc_objects.h> Contains definitions of graphical object classes
#include <docengine/doc_objects_collection.h> Contains DocObjectsCollection class definition
#include <docengine/doc_objects_collection_iterator.h> Contains DocObjectsCollection-related iterators
#include <docengine/doc_forward_declarations.h>   Service header containing forward declarations of all classes

The same classes in Java API are located within com.smartengines.doc module:

// Java
import com.smartengines.doc.*; // Import all se::doc classes

Documentation

All classes, their methods, method options, and option values are described in code comments, which are also converted into docengine.pdf, included in the documentation directory.

The documentation is available in the doc directory. The doc directory structure:

  • DOCUMENTS_REFERENCE.html – contains the list of the countries whose documents are supported, of the supported document types and their description;
  • README.html – brief description of Smart Document Engine SDK;
  • docengine.pdf – description of the classes, their methods etc.;
  • NOTICE.txt – contains information about the used external software;
  • WHATSNEW.txt – contains information about updates;
  • WHATSNEW_FORENSICS.txt – contains information about forensics updates. Included if the bundle file contains forensics.

Exceptions

The C++ API may throw se::common::BaseException subclasses when the user passes invalid input, makes bad state calls or if something else goes wrong. 

The following exception (se::common::BaseException) subclasses are implemented:

Exception name Description
FileSystemException Thrown if an attempt is made to read from a non-existent file, or if another file-system-related I/O error occurs
InternalException Thrown if an unknown error occurs or if the error occurs within internal system components
InvalidArgumentException Thrown if a method is called with invalid input parameters
InvalidKeyException Thrown if an associative container is accessed with an invalid or non-existent key, or if a list is accessed with an invalid or out-of-range index
InvalidStateException Thrown if an error occurs because of an incorrect internal state of the system objects
MemoryException Thrown if an allocation is attempted with insufficient RAM
NotSupportedException Thrown when a called method is not supported for the current state or the passed arguments, is not supported in the current version of the library, or is not supported at all by design
UninitializedObjectException Thrown if an attempt is made to access a non-existent or non-initialized object

Exceptions contain useful human-readable information. If an exception is thrown, read its e.what() message.

Note

se::common::BaseException is not a subclass of std::exception.
The Smart Document Engine interface does not have any dependency on the STL.

In the Java API, thrown exceptions are wrapped in a general java.lang.Exception. The exception type is included in the corresponding message text.

If you face a problem, contact us at sales@smartengines.com or support@smartengines.com.

Factory methods and memory ownership

Several Smart Document Engine SDK classes have factory methods which return pointers to heap-allocated objects. The caller is responsible for deleting such objects (the caller is probably the one who is reading this right now).

TIP

In C++:
For simple memory management and avoiding memory leaks, use smart pointers, such as std::unique_ptr<T> or std::shared_ptr<T>.

In Java API:
For the objects which are no longer needed it is recommended to use the .delete() method to force the deallocation of the native heap memory.
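
The C++ advice can be illustrated with a small self-contained sketch. FakeEngine is a hypothetical stand-in that only mimics the "Create() returns an owning raw pointer" convention; it is not an SDK class, and the instance counter exists purely to make the deallocation observable.

```cpp
#include <memory>

// Hypothetical stand-in for an SDK class with a factory method.
class FakeEngine {
public:
    static FakeEngine* Create() { return new FakeEngine(); }
    ~FakeEngine() { --live_count; }
    static int live_count;  // number of currently allocated instances
private:
    FakeEngine() { ++live_count; }
};
int FakeEngine::live_count = 0;

// Wraps the raw pointer immediately, so the object is deleted on every exit
// path, including early returns and exceptions.
int CountWhileAlive() {
    std::unique_ptr<FakeEngine> engine(FakeEngine::Create());
    return FakeEngine::live_count;  // the instance is still alive here
}
```

After CountWhileAlive() returns, the counter is back at its previous value, without any explicit delete in the calling code.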

Configuration bundles

Every delivery contains one or several configuration bundles – archives containing everything needed for Smart Document Engine to be created and configured. Usually they are named as bundle_something.se and located inside the data-zip directory.

Supported document types

A document type is simply a string encoding the real-world document type you want to recognize. The document types that the Smart Document Engine SDK delivered to you can potentially recognize can be obtained using the following procedure:

// C++
// Iterating through internal engines
for (int i_engine = 0;
     i_engine < settings->GetInternalEnginesCount();
     ++i_engine) {
    // Iterating through supported document types for this internal engine
    for (int i_doc = 0;
         i_doc < settings->GetSupportedDocumentTypesCount(i_engine);
         ++i_doc) {
        // Getting supported document type name
        std::string doctype = settings->GetSupportedDocumentType(i_engine, i_doc);
    }
}

// Java
// Iterating through internal engines
for (int i_engine = 0;
     i_engine < settings.GetInternalEnginesCount();
     i_engine++) {
    // Iterating through supported document types for this internal engine
    for (int i_doc = 0;
         i_doc < settings.GetSupportedDocumentTypesCount(i_engine);
         i_doc++) {
        // Getting supported document type name
        String doctype = settings.GetSupportedDocumentType(i_engine, i_doc);
    }
}

ATTENTION!

In a single session you can only enable document types that belong to the same internal engine.

Wildcard expressions

Since all documents in settings are disabled by default you need to enable some of them. In order to do so you may use AddEnabledDocumentTypes(…) method of DocSessionSettings:

// C++
settings->AddEnabledDocumentTypes("usa.forms.fed.ss4.type1"); // Enables the form of the application for an employer identification number of the USA

// Java
settings.AddEnabledDocumentTypes("usa.forms.fed.ss4.type1"); // Enables the form of the application for an employer identification number of the USA

You may also use the RemoveEnabledDocumentTypes(…) method to remove already enabled document types.

For convenience it’s possible to use wildcards (using the asterisk symbol) while enabling or disabling document types. When using document types related methods, each passed document type is matched against all supported document types. All matches in supported document types are added to the enabled document types list.

// C++
settings->AddEnabledDocumentTypes("deu.*"); // Enables all supported documents of Germany

// Java
settings.AddEnabledDocumentTypes("deu.*"); // Enables all supported documents of Germany
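
To make the matching rule concrete, here is a minimal sketch of asterisk matching against a list of supported types. It is illustrative only: the SDK performs the matching internally and its exact rules may differ. MatchesMask and SelectTypes are hypothetical helper names, not SDK functions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if `text` matches `mask`, where '*' in the mask stands for
// any (possibly empty) sequence of characters.
bool MatchesMask(const std::string& mask, const std::string& text) {
    std::size_t m = 0, t = 0;
    std::size_t star = std::string::npos;  // position of the last '*' seen
    std::size_t retry = 0;                 // text position to resume from
    while (t < text.size()) {
        if (m < mask.size() && mask[m] == '*') {
            star = m++;
            retry = t;
        } else if (m < mask.size() && mask[m] == text[t]) {
            ++m;
            ++t;
        } else if (star != std::string::npos) {
            m = star + 1;  // let the last '*' absorb one more character
            t = ++retry;
        } else {
            return false;
        }
    }
    while (m < mask.size() && mask[m] == '*') ++m;  // trailing stars match ""
    return m == mask.size();
}

// Collects every supported type that matches the mask.
std::vector<std::string> SelectTypes(const std::string& mask,
                                     const std::vector<std::string>& supported) {
    std::vector<std::string> enabled;
    for (const std::string& type : supported) {
        if (MatchesMask(mask, type)) enabled.push_back(type);
    }
    return enabled;
}
```

Given a supported list that contains two types starting with "deu." and one starting with "usa.", the mask "deu.*" selects the two German types and leaves the third one disabled.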

ATTENTION!

You can only enable document types that belong to the same internal engine for a single session. If you do otherwise then an exception will be thrown during session spawning.

TIP

If you know exactly what you are going to recognize, it is always better to enable as few document types as possible, because the system will spend less time deciding which of the enabled document types has been presented to it.

Session options

Some configuration bundle options can be overridden in runtime using DocSessionSettings methods. You can obtain all currently set option names and their values using the following procedure:

// C++
for (auto it = settings->OptionsBegin();
     it != settings->OptionsEnd();
     ++it) {
    // it.GetKey() returns the option name
    // it.GetValue() returns the option value
}

// Java
for (StringsMapIterator it = settings.OptionsBegin();
     !it.Equals(settings.OptionsEnd());
     it.Advance()) {
    // it.GetKey() returns the option name
    // it.GetValue() returns the option value
}

You can change option values using the SetOption(…) method:

// C++
settings->SetOption("enableMultiThreading", "true");

// Java
settings.SetOption("enableMultiThreading", "true");

Option values are always represented as strings, so if you want to pass an integer or boolean it should be converted to string first.
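
A minimal sketch of such a conversion is shown below. OptionValue is a hypothetical helper, not part of the SDK. The deleted overload prevents a string literal from being silently converted to bool.

```cpp
#include <string>

// Hypothetical helpers that turn typed values into the string form
// expected by SetOption(name, value).
inline std::string OptionValue(bool value) { return value ? "true" : "false"; }
inline std::string OptionValue(int value) { return std::to_string(value); }

// A const char* would otherwise convert to bool and always yield "true".
std::string OptionValue(const char*) = delete;
```

For example: settings->SetOption("enableMultiThreading", OptionValue(false).c_str()); this assumes that SetOption accepts C strings, as the string literals in the examples above suggest.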

Common options

Option name | Value type | Default | Description
enableMultiThreading | “true” or “false” | “true” | Enables parallel execution of internal algorithms
rgbPixelFormat | String of characters R, G, B, and A | “RGB” for 3-channel images, “BGRA” for 4-channel images | Sequence of color channels used to interpret images passed to the session.Process() method

Java API Specifics

Smart Document Engine SDK has Java API which is automatically generated from C++ interface using the SWIG tool.

The Java interface is the same as the C++ one except for minor differences; see the provided Java sample.

Object deallocation

Even though garbage collection is present and works, it is strongly advised to call obj.delete() manually for our API objects. They are wrappers around heap-allocated memory whose size is unknown to the garbage collector, which may result in delayed deletion of objects and thus in high overall memory consumption.

DocEngine engine = DocEngine.Create(config_path); // or any other object
// ...
engine.delete(); // forces and immediately guarantees wrapped C++ object deallocation

FAQ

  1. How can I test the system?

    You can install the demo apps from the App Store and Google Play using the links below:

    Google Play

    App Store

  2. Which sample do I need?

    You need the samples from the docengine_sample directory.

  3. How can I install and test your library?

    We provide the classic SDK as a library together with its integration samples. To test the library, install one of the samples, which you can find in the /samples directory.

  4. What file formats are supported?

    Supported formats:

    • jpeg, tiff (except TIFF_ZIP and TIFF_JPEG), png, bmp;
    • base64 (for the formats listed above);
    • a file buffer with a preliminary indication of the color scheme, width, height, stride, and number of channels.
  5. What should I do if I need to recognize an image in PDF format?

    We do not support PDF and recommend rasterizing it (converting it to a supported format) on your side before recognition.

  6. How can I update to the new version?

    You need to replace the libraries (from /bin), the bindings (from /bindings), and the configuration bundle (the *.se file from /data-zip). You can find them in the provided SDK.

  7. I have the SDK version for Windows, but our production system will use an OS from the Linux family. How do I run it in a Linux-based Docker container?

    Our SDK is platform-dependent, so please contact us at sales@smartengines.com or support@smartengines.com and we will provide you with the required SDK.

  8. I have an SDK for CentOS 7 and I am trying to run it on Ubuntu/Debian/Alpine. I get a lot of “undefined symbol” errors.

    Please do not run the SDK on operating systems it is not intended for. Contact us at sales@smartengines.com or support@smartengines.com and we will provide you with the required SDK.
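
FAQ item 4 mentions raw buffers described by width, height, stride, and number of channels. The sketch below shows how these values relate to each other for an interleaved, row-major buffer with one byte per channel (an assumption). RawImageLayout and the helper functions are hypothetical names for illustration, not SDK types.

```cpp
#include <cstddef>

// Hypothetical description of a raw image buffer.
struct RawImageLayout {
    int width;     // pixels per row
    int height;    // number of rows
    int stride;    // bytes per row, including any padding
    int channels;  // bytes per pixel, e.g. 3 for RGB
};

// Byte offset of channel `c` of pixel (x, y).
inline std::size_t PixelOffset(const RawImageLayout& img, int x, int y, int c) {
    return static_cast<std::size_t>(y) * img.stride +
           static_cast<std::size_t>(x) * img.channels + c;
}

// Minimum buffer size in bytes needed to hold the whole image.
inline std::size_t BufferSize(const RawImageLayout& img) {
    return static_cast<std::size_t>(img.height) * img.stride;
}

// A layout is consistent when each row is wide enough for its pixels.
inline bool IsConsistent(const RawImageLayout& img) {
    return img.width > 0 && img.height > 0 && img.channels > 0 &&
           img.stride >= img.width * img.channels;
}
```

Checking a layout with IsConsistent before handing a buffer to any image API is a cheap way to catch a stride given in pixels instead of bytes.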

Common Errors

  1. Failed to verify static auth.

    The signature you have specified is probably invalid.

  2. Found no single engine that supports settings-enabled document types.

    The bundle does not contain the specified documents mask or the specified mask matched with documents from multiple internal engines in the current mode. See Configuration bundles.

  3. Failed to initialize DocEngine: mismatching engine version in config.

    You cannot use a bundle version different from the library version. The files from the same SDK should be used.

  4. libdocengine.so: cannot open shared object file: No such file or directory.

    Many integrations require an additional wrapper build to work with the C++ library. This error occurs when the wrapper cannot find the main library at the specified path. The library must either be made visible to the code through environment variables, or the wrapper must be compiled with the correct paths to the library.

  5. For Python and PHP: libpython3.x.x: cannot open shared object file: No such file or directory.

    The module you are using is built for a different version of Python. /samples/docengine_sample_*/ contains a script for building the module on your side. Remember that you must have the dev packages installed for your language.

Smart Document Engine: best practices

  1. On mobile platforms, a common example of a recognition screen interface is a document drawing overlaid on top of the camera preview. This helps the user to align the document with the camera. We do not recommend cropping the image according to your drawing, since the search algorithm for this type of document works very quickly and the object can be recognized before the user places it in the required area. Pass the full image for recognition; cropping should be applied only for UX purposes.
  2. In mobile SDKs, we recommend placing the Scan button on the camera preview screen – this will allow you to align the document better, wait for focusing and initialization of the engine before starting OCR. This trick allows you to speed up the capture process, since the engine will process fewer empty initial frames.
  3. We recommend judging the quality of recognition by the isAccepted field attribute. If you need a more precise understanding of the quality of the recognition result, you can focus on the Confidence attribute, also available for each text field.
  4. On mobile platforms, initialize the recognition engine asynchronously before making the document scanning button available. In this case, the user will not have to wait for the engine to initialize when it is time to scan the document.
  5. Don’t store your personalized signature in a readable form (i.e. in resources), always store it either encoded in the application binary or load it remotely.
  6. The bundle is located separately from the library, so to minimize the size of the application, it can be downloaded from your servers on demand if necessary. 
  7. On mobile platforms, the library must be initialized (by calling the Create() method) strictly once, as a single instance. Initialization is one of the most resource-intensive operations (like the image analysis itself), so it should always be done outside the UI thread. Many recognition sessions can be spawned from one instance. Sessions can be run in parallel and will not affect each other.
  8. All features of our SDK are described in the docengine.pdf document – the main C++ library reference. 
  9. On mobile platforms, it is advisable to enable lazy initialization. In server versions, where the library is initialized once at application startup, we recommend disabling it (the false value).
  10. If you are loading images from files, make sure that the files are located on a high-performance storage device. On some devices, loading an image from a file may take longer than the recognition itself.
  11. Avoid pre-processing any input images. Our products work best with images taken directly from the capture device (a camera or a scanner).
  12. Avoid extremely large images. Sometimes high image resolution does not increase quality, but only increases recognition time. 
  13. It is always better to specify the minimum set of documents that you intend to recognize. The library will spend less time searching.
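
Item 3 above can be sketched as a simple review filter. FieldSummary is a hypothetical plain struct that an application might fill from the recognized fields; it is not an SDK type, and the 0..1 confidence range and the threshold value are assumptions.

```cpp
#include <string>
#include <vector>

// Hypothetical copy of the per-field attributes mentioned above.
struct FieldSummary {
    std::string name;
    bool is_accepted;   // mirrors the isAccepted attribute
    double confidence;  // mirrors the Confidence attribute (0..1 assumed)
};

// Returns the names of fields that need manual review: not accepted,
// or accepted but below the caller-chosen confidence threshold.
std::vector<std::string> FieldsToReview(const std::vector<FieldSummary>& fields,
                                        double min_confidence) {
    std::vector<std::string> review;
    for (const FieldSummary& f : fields) {
        if (!f.is_accepted || f.confidence < min_confidence) {
            review.push_back(f.name);
        }
    }
    return review;
}
```

Using isAccepted as the primary criterion and the confidence threshold as an optional stricter check follows the order of the recommendation; the threshold itself has to be chosen per application.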
