
This AI Research from China Introduces 1-Bit FQT: Enhancing the Capabilities of Fully Quantized Training (FQT) to 1-bit


Fully Quantized Training (FQT) speeds up deep neural network training by converting activations, weights, and gradients into lower-precision formats. Quantization makes the training procedure more efficient, enabling faster computation and lower memory utilization. The goal of FQT is to push numerical precision as low as possible while preserving the effectiveness of training, and researchers have now studied the viability of 1-bit FQT in an effort to explore these limits.
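To make the idea concrete, here is a minimal, generic sketch in PyTorch of what reducing a tensor to 1 bit can look like: only the sign of each element is stored, along with a single full-precision scale. This is an illustration of low-bit quantization in general, not the paper's exact scheme.

```python
import torch

def one_bit_quantize(x: torch.Tensor):
    """Illustrative 1-bit quantizer: keep only the sign of each element
    plus one per-tensor scale. A generic sketch, not the paper's method."""
    scale = x.abs().mean()          # single full-precision scalar per tensor
    q = torch.sign(x)               # values in {-1, 0, +1}, i.e. ~1 bit per element
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor):
    return q * scale

# Example: quantize a batch of activations and inspect the error introduced
acts = torch.randn(32, 128)
q, s = one_bit_quantize(acts)
approx = dequantize(q, s)
print((acts - approx).abs().mean())  # average error from 1-bit storage
```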

The study begins with a theoretical analysis of FQT, concentrating on well-known optimization algorithms such as Adam and Stochastic Gradient Descent (SGD). A crucial finding emerges from the analysis: the convergence of FQT depends heavily on the variance of the gradients. In other words, especially at low bitwidths, the noise injected into gradient values can determine whether training succeeds. Understanding this link between gradient variance and convergence is the key to building more efficient low-precision training techniques.
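The role of gradient variance can be illustrated with a tiny experiment. The sketch below uses a standard unbiased stochastic 1-bit quantizer (our own choice for illustration, not taken from the paper) and measures how much extra variance it adds to a gradient estimate; the larger that added variance, the slower SGD-style methods are expected to converge.

```python
import torch

def stochastic_one_bit(g: torch.Tensor):
    """Unbiased stochastic 1-bit quantizer (a standard construction):
    each element becomes +s or -s with probabilities chosen so that
    its expectation equals the original value, where s = max|g|."""
    s = g.abs().max()
    p = (g / s + 1) / 2                      # probability of mapping to +s
    return torch.where(torch.rand_like(g) < p, s, -s)

g = torch.randn(10_000) * 0.1                # a synthetic "gradient"
samples = torch.stack([stochastic_one_bit(g) for _ in range(200)])
print("bias:", (samples.mean(0) - g).abs().mean().item())   # ~0: estimator is unbiased
print("added variance:", samples.var(0).mean().item())      # extra noise from 1-bit quantization
```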

Building on these theoretical insights, the researchers introduce a new approach called Activation Gradient Pruning (AGP). AGP exploits the fact that not all gradients are equally important: by identifying and pruning the less informative gradients, those that contribute little to the model's learning, it can reallocate resources to improve the numerical precision of the most critical ones. This helps counter the harmful effect of gradient variance and keeps training stable even at very low precision.
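As a rough illustration of the pruning idea, one can drop the activation-gradient rows with the smallest norms and rescale the survivors, as in the sketch below. The function name and the per-sample scoring rule here are our own simplification, not the authors' exact AGP procedure.

```python
import torch

def prune_small_gradient_groups(act_grad: torch.Tensor, keep_ratio: float = 0.5):
    """Illustrative activation-gradient pruning: score each sample (row) by its
    gradient norm, keep only the top fraction, and rescale the survivors to
    compensate for the dropped rows. A simplification of the idea behind AGP,
    not the paper's algorithm."""
    scores = act_grad.norm(dim=1)                       # one score per sample
    k = max(1, int(keep_ratio * act_grad.shape[0]))
    keep = scores.topk(k).indices
    pruned = torch.zeros_like(act_grad)
    pruned[keep] = act_grad[keep] / keep_ratio          # rescale the surviving rows
    return pruned

# Example: prune the activation gradients for a batch of 64 samples
act_grad = torch.randn(64, 256)
pruned = prune_small_gradient_groups(act_grad, keep_ratio=0.25)
```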

In addition to AGP, the researchers propose a method called Sample Channel joint Quantization (SCQ). In SCQ, weight gradients and activation gradients are computed with different quantization schemes, a tailored design that ensures both kinds of gradients can be processed efficiently on low-bitwidth hardware and thereby improves the efficiency of the training process.
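The article does not spell out the exact schemes, but the name suggests sample-wise and channel-wise scaling. Under that assumption, a hypothetical sketch might quantize the gradients feeding the activation path per sample and those feeding the weight-gradient path per channel; this is our own illustration, not the released implementation.

```python
import torch

def quantize_per_sample(grad: torch.Tensor, bits: int = 1):
    """Quantize each sample (row) with its own scale; returns a dequantized approximation."""
    scale = grad.abs().amax(dim=1, keepdim=True).clamp(min=1e-8)
    levels = 2 ** (bits - 1)
    q = (grad / scale * levels).round().clamp(-levels, levels)
    return q * scale / levels

def quantize_per_channel(grad: torch.Tensor, bits: int = 1):
    """Quantize each channel (column) with its own scale; returns a dequantized approximation."""
    scale = grad.abs().amax(dim=0, keepdim=True).clamp(min=1e-8)
    levels = 2 ** (bits - 1)
    q = (grad / scale * levels).round().clamp(-levels, levels)
    return q * scale / levels

# One output gradient, quantized two ways: per sample for the activation-gradient
# computation and per channel for the weight-gradient computation (illustrative only).
out_grad = torch.randn(64, 512)
act_path = quantize_per_sample(out_grad, bits=1)
weight_path = quantize_per_channel(out_grad, bits=1)
```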

To validate their methodology, the team built a framework that makes the algorithm usable in real-world settings. They tested the approach by training popular neural network models, such as VGGNet-16 and ResNet-18, on various datasets. The algorithm delivered a significant accuracy gain over conventional per-sample quantization techniques, averaging about 6%, and training was roughly 5.13 times faster than full-precision training.

In conclusion, this study is a major advance in fully quantized training, particularly in pushing down the lowest usable numerical precision without compromising performance. It could eventually lead to even more efficient neural network training methods, especially as low-bitwidth hardware becomes more widely available.


Check out the Paper. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter and join our Telegram Channel and LinkedIn Group. If you like our work, you will love our newsletter.




Tanya Malhotra is a final-year undergraduate at the University of Petroleum & Energy Studies, Dehradun, pursuing a BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with strong analytical and critical-thinking skills, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.



