26.04.2024

Smart Engines scientists have found a way to improve the efficiency of neural networks by 40%

Scientists at the AI company Smart Engines have found a way to increase the efficiency of neural networks. The method is based on a fundamentally new quantization scheme that speeds up computation by 40%. The results were published in the MDPI journal Mathematics (Q1).

This is a breakthrough in optimizing neural network execution. Today, neural networks mainly run on specialized graphics cards, but not every device is equipped with one. Every user device does, however, have a central processing unit (CPU), and the de facto standard for running neural networks on CPUs is 8-bit quantization. Meanwhile, deep neural networks are becoming more complex, with hundreds of millions of coefficients or more, and demand ever more computing power. This limits the use of CPUs in artificial intelligence systems.

Smart Engines researchers have addressed this problem by proposing a qualitative improvement on the 8-bit model: “4.6-bit” networks. Such a network runs 40% faster than its 8-bit counterpart with almost no loss in quality, because it makes more efficient use of the features of the CPUs in mobile devices.

To achieve this, the input data and the model coefficients are quantized so that their products fit into 8-bit registers. The products are summed using a two-level system of 16- and 32-bit accumulators to maximize efficiency. As a result, each value carries an average of 4.6 bits of information.
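
The idea of 8-bit products with two-level accumulation can be illustrated with a minimal Python sketch. It is not the Smart Engines implementation: the function names `quantize` and `dot_two_level`, the symmetric ranges of ±11 and the simple linear scaling are all assumptions chosen for illustration, and plain Python integers stand in for the fixed-width CPU registers.

```python
INT8_MAX = 127      # largest magnitude of a signed 8-bit product
INT16_MAX = 32767   # largest value a signed 16-bit accumulator can hold

def quantize(values, max_level):
    """Map floats to integers in [-max_level, max_level] (assumed linear scale)."""
    peak = max(abs(v) for v in values) or 1.0
    scale = peak / max_level
    q = [max(-max_level, min(max_level, round(v / scale))) for v in values]
    return q, scale

def dot_two_level(qx, qw, x_max, w_max):
    """Integer dot product with a 16-bit inner and a 32-bit outer accumulator."""
    # Every product must fit into a signed 8-bit register.
    assert x_max * w_max <= INT8_MAX
    # Number of worst-case products a 16-bit accumulator can absorb safely.
    block = INT16_MAX // (x_max * w_max)
    acc32 = 0  # outer (32-bit) accumulator
    acc16 = 0  # inner (16-bit) accumulator
    for i, (a, b) in enumerate(zip(qx, qw), start=1):
        prod = a * b
        assert -INT8_MAX <= prod <= INT8_MAX
        acc16 += prod
        if i % block == 0:
            # Flush the inner accumulator before it can overflow.
            acc32 += acc16
            acc16 = 0
    return acc32 + acc16

if __name__ == "__main__":
    x = [0.9, -0.4, 0.1, 0.7] * 300   # hypothetical input activations
    w = [0.5, 0.5, -0.2, 0.3] * 300   # hypothetical model coefficients
    qx, sx = quantize(x, 11)
    qw, sw = quantize(w, 11)
    approx = dot_two_level(qx, qw, 11, 11) * sx * sw
    exact = sum(a * b for a, b in zip(x, w))
    print(f"quantized: {approx:.3f}, float: {exact:.3f}")
```

With both ranges capped at ±11, no product exceeds 121, so 270 products can be added to the 16-bit accumulator before it has to be flushed into the 32-bit one.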

This quantization scheme compares favorably with existing ones: the bit width of the input data can be set flexibly for the task at hand and is not restricted to powers of two. As a result, it delivers noticeably higher recognition quality than, for example, 4-bit models.
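
A fractional bit width arises naturally once the number of quantization levels is not a power of two: a value with N possible levels carries log2(N) bits. The sketch below assumes symmetric integer ranges and a product limit of 127; these are illustrative assumptions, and the exact level counts behind the 4.6-bit figure are not given in this text. Under these assumptions the average comes out at roughly 4.5 bits per value.

```python
import math

PRODUCT_LIMIT = 127  # assumed bound: products must fit a signed 8-bit register

def bits(max_level):
    """Bits of information in a symmetric integer range [-max_level, max_level]."""
    return math.log2(2 * max_level + 1)

def level_choices(limit=PRODUCT_LIMIT):
    """Yield (x_max, w_max, mean bits) with the largest w_max each x_max allows."""
    for x_max in range(1, limit + 1):
        w_max = limit // x_max
        yield x_max, w_max, (bits(x_max) + bits(w_max)) / 2

if __name__ == "__main__":
    for x_max, w_max, mean_bits in level_choices():
        if x_max in (7, 11, 15, 31):
            print(f"inputs ±{x_max}, weights ±{w_max}: "
                  f"{mean_bits:.2f} bits on average")
```

A wider input range can be traded for a narrower coefficient range, and vice versa, while every product still fits into 8 bits.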

The invention is already being used in Smart Engines’ products for object detection and text recognition.

