Today, document recognition technology has progressed to the point where the MRZ can be read easily even with an ordinary smartphone.
The purpose of MRZ
The machine-readable zone serves two main purposes. First, it presents the personal details of the passport’s owner in a standardized format, so that a machine can read and check them quickly. Second, it provides automatic access to the RFID chip embedded in every biometric passport. This chip contains extended information about the holder of the document and can be accessed only after entering the passport number, the date of birth, and the passport’s expiration date. Reading these values automatically from the MRZ makes access to the RFID data several times faster.
The MRZ was introduced to simplify and speed up identity checks in places that have controlled access but are nevertheless very crowded, such as airports. Before boarding a plane, each traveler has to be identified more than once: their travel document is checked at the check-in desk and then at various control points. In addition, airport personnel have very little time to check documents, because every flight is strictly scheduled. To make this work, a well-coordinated, orderly, and stable security organization has to be in place. Since its introduction, the machine-readable travel document (MRTD) has proved to be an effective part of the identification process in cross-border travel and has been universally recognized as such.
Eventually, large business centers and other restricted-access places with a continuous flow of people adopted the practice of automated identification as well. Today, even small offices and health clinics use automatic document check systems, for both security and convenience.
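As an illustration of how those three values can unlock the chip: in the Basic Access Control scheme described in ICAO Doc 9303, the reader concatenates the document number, date of birth, and expiration date (each with its check digit) and hashes the result to obtain a seed for the access keys. The sketch below shows only that first step and assumes a 44-character passport MRZ line; the function name bac_key_seed is made up for this example, and the sample line is the widely published ICAO specimen.

```python
import hashlib


def bac_key_seed(lower_line: str) -> bytes:
    """Derive the 16-byte key seed from the bottom MRZ line of a passport.

    Assumption: field positions follow the 2 x 44 passport layout, and the
    seed is the first 16 bytes of SHA-1 over number + birth date + expiry,
    each field taken together with its check digit.
    """
    if len(lower_line) != 44:
        raise ValueError("a passport MRZ line must contain exactly 44 characters")
    document_number = lower_line[0:10]   # 9 characters + check digit
    date_of_birth = lower_line[13:20]    # YYMMDD + check digit
    date_of_expiry = lower_line[21:28]   # YYMMDD + check digit
    mrz_information = document_number + date_of_birth + date_of_expiry
    return hashlib.sha1(mrz_information.encode("ascii")).digest()[:16]


specimen = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"
print(bac_key_seed(specimen).hex().upper())
```

This is why a misread character in any of the three fields matters: the seed changes completely and the chip refuses access.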
There are currently several types of ICAO standard machine-readable zones, which vary in the number of lines and characters in each line:
- MRP (all international passports) and MRV-A (machine-readable visas of type A, issued by the USA, Japan, China, and others): 2 lines of 44 characters each.
- TD-1 (e.g. citizen’s identification cards, EU ID cards, the US Green Card): 3 lines of 30 characters each.
- TD-2 (e.g. the Romanian ID, the old type of German ID) and MRV-B (machine-readable visas of type B, e.g. the Schengen visa): 2 lines of 36 characters each.
Strictly speaking, only the documents covered by the ICAO standard contain what we call an MRZ. Other documents may also have machine-readable zones, but these can deviate from the ICAO standard in both the number of lines and the content. Such MRZ-like codes can be found in some national ID cards, driver’s licenses, vehicle registration certificates, and other documents.
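The line counts and lengths listed above are enough for a rough first guess at the MRZ type. The following sketch is only an illustration: the function name guess_mrz_format is hypothetical, and using a leading V to tell visas apart from passports and ID cards is a simplification. It returns None for anything that does not match an ICAO layout, which is how an MRZ-like national code would typically show up.

```python
from typing import List, Optional


def guess_mrz_format(lines: List[str]) -> Optional[str]:
    """Guess the ICAO format from the number of lines and their length."""
    if not lines:
        return None
    lengths = {len(line) for line in lines}
    if len(lengths) != 1:
        return None  # ICAO formats use lines of equal length
    shape = (len(lines), lengths.pop())
    is_visa = lines[0].startswith("V")  # V marks a visa, as opposed to P
    if shape == (2, 44):
        return "MRV-A" if is_visa else "MRP"
    if shape == (2, 36):
        return "MRV-B" if is_visa else "TD-2"
    if shape == (3, 30):
        return "TD-1"
    return None  # possibly a non-ICAO, MRZ-like code


passport_lines = ["P<UTOERIKSSON<<ANNA<MARIA".ljust(44, "<"),
                  "L898902C36UTO7408122F1204159ZE184226B<<<<<10"]
print(guess_mrz_format(passport_lines))  # MRP
```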
MRZ on a passport
A national passport is a document that allows cross-border travel, so it has to be read equally reliably in every modern airport in the world. This means that the content and the structure of the identity page have to be standardized.
The identity page of the passport therefore consists of two parts: the Visual Inspection Zone (VIZ) and the Machine Readable Zone (MRZ). The VIZ provides the personal details of the passport owner, their photo, and the passport details, displayed in a format that a human can read. The MRZ is located at the bottom of the page; its contents correspond to the VIZ fields but are meant to be read by a machine.
In short, the MRZ of a passport always consists of two lines of characters which, as mentioned above, correspond to the following data from the VIZ:
– Document code
– State code, or code of the government agency (organization) that issued the passport
– Full Name
– Document number
– Other data provided at the discretion of the document’s issuing authority.
The MRZ on a passport also contains several check digits, which make it possible to detect crude attempts at document falsification as well as some machine recognition errors.
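The check digit scheme defined in ICAO Doc 9303 is simple enough to sketch in a few lines: each character is converted to a number (digits keep their value, A to Z map to 10 to 35, the placeholder counts as 0), the numbers are multiplied by the repeating weights 7, 3, 1, and the sum is taken modulo 10. The function names below are chosen for this example only.

```python
WEIGHTS = (7, 3, 1)


def char_value(ch: str) -> int:
    """Numeric value of a single MRZ character."""
    if "0" <= ch <= "9":
        return ord(ch) - ord("0")
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"unexpected MRZ character: {ch!r}")


def check_digit(field: str) -> int:
    """Weighted sum of the field's characters, modulo 10."""
    total = sum(char_value(ch) * WEIGHTS[i % 3] for i, ch in enumerate(field))
    return total % 10


# Fields of the published ICAO specimen passport
print(check_digit("L898902C3"))  # 6 (document number)
print(check_digit("740812"))     # 2 (date of birth)
print(check_digit("120415"))     # 9 (expiration date)
```

A single misrecognized character usually changes the result, which is why a mismatch is a cheap signal that the line should be read again. The scheme does not catch every error, so it complements rather than replaces other verification.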
Now, based on the example of a national passport, let us take a closer look at the MRP (Machine-Readable Passport) type of MRZ.
The top line of the passport’s MRZ
The first character indicates the type of document: P means a machine-readable passport (as opposed to V in the MRV-A type of MRZ, which corresponds to a visa). The state or organization that issued the passport can use the second character to specify the type of passport (civil, official, diplomatic, service, etc.). By default, an international travel document is simply called a passport; however, some countries also issue official passports (or service passports) to government employees for work-related cross-border travel. In such cases, the first two characters together indicate the type of passport, for example PS for a service passport. If the passport type is not specified, a placeholder (<) is inserted instead.
The following three characters indicate the country that issued the passport, in accordance with ISO 3166-1 alpha-3, or the organization that is authorized to issue passports and other machine-readable documents (for example, the UN, Interpol, or the EU Council).
The next 39 characters of the first line contain the name of the passport’s owner. First comes the primary identifier, that is, the last name. If the last name consists of several words, a placeholder (<) is used between them. Punctuation marks used in the VIZ, such as hyphens, apostrophes, and commas, are not used in machine-readable lines; a placeholder is used in their place.
In the machine-readable zone, the last name is separated from the given name(s) by two placeholder characters (<<). As with the last name, if there are several given names, or if a given name consists of several words, they are separated by single placeholder characters.
The number of characters per line is limited. For a passport, each MRZ line must contain exactly 44 characters. Therefore, if the full name is too long to fit into one line, the given names are abbreviated, since they form the secondary identifier, while the last name is the primary one.
In a machine-readable zone, only Latin letters without diacritics are used, so specific transliteration rules have to be applied to names that are written with diacritical marks or in other alphabets.
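The layout of the top line described above can be turned into a small parser. This is a minimal sketch rather than a production implementation: the function name parse_top_line and the dictionary keys are invented for the example, and it cannot restore punctuation or abbreviated names, because that information is not present in the MRZ.

```python
from typing import Dict, List, Union


def parse_top_line(line: str) -> Dict[str, Union[str, List[str]]]:
    """Split the top line of a passport MRZ into its fields."""
    if len(line) != 44:
        raise ValueError("a passport MRZ line must contain exactly 44 characters")
    name_field = line[5:44]  # the 39 characters reserved for the name
    # The first "<<" separates the last name from the given names.
    primary, _, secondary = name_field.partition("<<")
    return {
        "document_code": line[0:2].rstrip("<"),  # "P", "PS", ...
        "issuer": line[2:5].rstrip("<"),         # country or organization code
        "surname": primary.replace("<", " ").strip(),
        "given_names": [part for part in secondary.split("<") if part],
    }


top = "P<UTOERIKSSON<<ANNA<MARIA".ljust(44, "<")
print(parse_top_line(top))
# {'document_code': 'P', 'issuer': 'UTO', 'surname': 'ERIKSSON', 'given_names': ['ANNA', 'MARIA']}
```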
The bottom line of the passport’s MRZ
The first 9 characters of the second line of the passport’s machine-readable zone contain the document number. Although most countries that use machine-readable zones in their documents bring passport numbers to a 9-character form, in some cases the number may be longer or shorter. If it is longer, the characters that do not fit into the allotted 9 positions go into the “optional data” zone. The 10th character is a check digit that verifies the correctness of the number; it is calculated from the first 9 characters using a special algorithm.
The following three characters indicate the citizenship of the passport holder. The citizenship code is written in the ISO 3166-1 alpha-3 format; there are additional codes, such as XXA for stateless persons and XXB or XXC for refugees.
The next 6 digits are the date of birth in the YYMMDD format, followed by a check digit calculated from the date of birth. The next character indicates the sex of the passport holder: male (M), female (F), or a placeholder (<) if it is not specified. The next 6 digits give the passport’s expiration date in the YYMMDD format, again followed by a check digit.
The next 14 characters contain optional data, such as a personal number or other information that helps to identify the owner of the document more accurately, at the discretion of the issuer. If there is no personal number and no other information, the entire field is filled with placeholders (<). In that case, the check digit that follows the field is given either as 0 or as a placeholder. The last character on the bottom line of the passport’s MRZ is a check digit calculated from all the characters in the bottom line except those indicating the sex and the citizenship.
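Put together, the bottom line can be split by position and each check digit verified. The sketch below assumes the check digit scheme from ICAO Doc 9303 (weights 7, 3, 1, modulo 10), which this article refers to only as a special algorithm. The names parse_bottom_line and digit_matches are hypothetical, and document numbers longer than 9 characters are not handled.

```python
from typing import Dict

WEIGHTS = (7, 3, 1)


def check_digit(field: str) -> int:
    """ICAO check digit: weighted character sum modulo 10."""
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = ord(ch) - ord("0")
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10
        elif ch == "<":
            value = 0
        else:
            raise ValueError(f"unexpected MRZ character: {ch!r}")
        total += value * WEIGHTS[i % 3]
    return total % 10


def digit_matches(field: str, digit_char: str, allow_filler: bool = False) -> bool:
    """Compare a field against the check character printed after it."""
    if allow_filler and digit_char == "<":
        # An empty optional field may carry "<" instead of 0.
        return set(field) == {"<"}
    return digit_char == str(check_digit(field))


def parse_bottom_line(line: str) -> Dict[str, object]:
    """Split the bottom line of a passport MRZ and verify its check digits."""
    if len(line) != 44:
        raise ValueError("a passport MRZ line must contain exactly 44 characters")
    # Everything except the citizenship (11-13) and sex (21) positions.
    composite_input = line[0:10] + line[13:20] + line[21:43]
    return {
        "document_number": line[0:9].rstrip("<"),
        "nationality": line[10:13].rstrip("<"),
        "date_of_birth": line[13:19],    # YYMMDD
        "sex": line[20],                 # "M", "F" or "<"
        "date_of_expiry": line[21:27],   # YYMMDD
        "optional_data": line[28:42].rstrip("<"),
        "checks": {
            "document_number": digit_matches(line[0:9], line[9]),
            "date_of_birth": digit_matches(line[13:19], line[19]),
            "date_of_expiry": digit_matches(line[21:27], line[27]),
            "optional_data": digit_matches(line[28:42], line[42], allow_filler=True),
            "composite": digit_matches(composite_input, line[43]),
        },
    }


result = parse_bottom_line("L898902C36UTO7408122F1204159ZE184226B<<<<<10")
print(result["document_number"], result["nationality"], result["sex"])
print(all(result["checks"].values()))  # True for the ICAO specimen
```

Returning the individual check results instead of a single flag lets the caller see which field was misread and, for example, ask for another camera frame only when needed.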
MRZ recognition technologies
Non-standard MRZ formats, as well as differences in passport layout, create problems not only for the organizations that introduce automatic recognition systems but also for the developers of such technologies. Our role as developers therefore extends beyond recognizing simple standardized document forms and meeting the current market demand for identification.
Over the past two decades, we at Smart Engines have been evolving our intelligent character recognition (ICR) technologies, constantly surpassing quality standards by applying our latest achievements in computational intelligence and deep learning. Using original neural network models, we have been able to bring the quality of automatic document recognition to a new level. In terms of MRZ recognition, Smart Engines technology not only captures the forms of the ICAO Doc 9303 international standard, but also supports a number of MRZ-like codes out of the box, such as the ones used in French ID cards, Bulgarian vehicle registration certificates, Swiss driving licenses, and more. To learn more about MRZ recognition and other Smart Engines solutions, visit the OCR Engines page.