
This AI Research from China Explains How Common 7B Language Models Already Possess Strong Mathematical Capabilities


Large Language Models (LLMs) have demonstrated impressive capabilities in almost every domain. From generating human-like content and answering questions to summarizing long passages of text, completing code, and translating languages, LLMs are among the most notable advancements in the field of Artificial Intelligence (AI).

However, it is widely believed that for language models to have strong mathematical capabilities, they must either be very large in scale or go through rigorous math-focused pre-training. A recent research study from China challenges this idea by demonstrating that the LLaMA-2 7B model already displays outstanding mathematical abilities with only standard pre-training.

When the best response is selected from 256 random generations, the model reaches remarkable accuracy rates of 97.7% on the GSM8K benchmark and 72.0% on MATH. The main problem with the existing base model is that, although correct answers exist among its samples, it cannot reliably surface its innate mathematical capabilities: accuracy drops sharply to 49.5% on GSM8K and 7.9% on MATH when only the first response is considered, which highlights this discrepancy.
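The gap between best-of-256 and first-response accuracy is easy to make concrete. Below is a minimal evaluation sketch (not the paper's code; the data is a toy placeholder) contrasting the two metrics, assuming we already have, for each problem, the gold answer and a list of sampled answers from the model:

```python
# Contrast first-response accuracy with best-of-N ("any sample correct") accuracy.
# Toy illustration only; real evaluations compare extracted final answers.

def first_response_accuracy(samples, gold):
    """Fraction of problems whose *first* sampled answer is correct."""
    correct = sum(1 for answers, truth in zip(samples, gold) if answers[0] == truth)
    return correct / len(gold)

def best_of_n_accuracy(samples, gold, n=256):
    """Fraction of problems where *any* of the first n samples is correct,
    an oracle upper bound on the capability latent in the model."""
    correct = sum(1 for answers, truth in zip(samples, gold) if truth in answers[:n])
    return correct / len(gold)

if __name__ == "__main__":
    # 3 problems, 4 sampled answers each (the paper's setting uses 256 samples).
    gold = ["42", "7", "3.5"]
    samples = [
        ["41", "42", "42", "40"],   # correct answer appears, but not first
        ["7", "7", "6", "7"],       # correct on the first try
        ["3", "4", "3.2", "3.6"],   # never correct
    ]
    print("first-response accuracy:", first_response_accuracy(samples, gold))
    print("best-of-4 accuracy:     ", best_of_n_accuracy(samples, gold, n=4))
```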

To address this issue, the team has suggested scaling up the supervised fine-tuning (SFT) data. The accuracy of generated responses can be greatly improved by increasing the amount of data used for fine-tuning. However, the scarcity of publicly available math problems limits how far this scaling can go. To get around this restriction, the team has used synthetic data, which works almost as well as real data.
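For context on what the supervised fine-tuning step itself involves, here is a minimal sketch using Hugging Face Transformers. It is an illustrative setup rather than the paper's training recipe: the dataset path, prompt format, and hyperparameters are assumptions, and a real 7B fine-tune would typically require multi-GPU or parameter-efficient training.

```python
# Minimal SFT sketch: fine-tune LLaMA-2 7B on question/solution pairs with a
# causal language-modeling objective. Paths and hyperparameters are placeholders.
import json
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # gated checkpoint; access required

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def load_examples(path):
    """Each JSONL line is assumed to hold {"question": ..., "solution": ...}."""
    texts = []
    with open(path) as f:
        for line in f:
            ex = json.loads(line)
            texts.append(f"Question: {ex['question']}\nAnswer: {ex['solution']}")
    return texts

train_texts = load_examples("math_sft_data.jsonl")  # placeholder path
encodings = tokenizer(train_texts, truncation=True, max_length=1024)
train_dataset = [
    {"input_ids": ids, "attention_mask": mask}
    for ids, mask in zip(encodings["input_ids"], encodings["attention_mask"])
]

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama2-7b-math-sft",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=3,
        learning_rate=2e-5,
        bf16=True,
        logging_steps=50,
    ),
    train_dataset=train_dataset,
    # mlm=False gives a standard causal LM loss over the padded batches.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```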

The team generated synthetic math problems with the GPT-4 Turbo model and found that a simple generation prompt, followed by verification with GPT-4 Turbo, yields highly effective results. Using these artificially generated math problems allows the supervised fine-tuning data to be scaled up substantially, with accuracy nearly matching that achieved with real data.
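A generate-then-verify pipeline of this kind can be sketched as follows. This is not the authors' code: the prompts, the "gpt-4-turbo" model identifier, and the YES/NO filtering criterion are assumptions, and the sketch relies on the official openai Python client.

```python
# A minimal sketch of a generate-then-verify loop for building synthetic SFT
# data. Illustrative only; requires the `openai` package and an OPENAI_API_KEY.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4-turbo"  # assumed model identifier

def generate_problem(seed_problem: str) -> str:
    """Ask the model to write a new problem (plus solution) in the style of a seed."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{
            "role": "user",
            "content": (
                "Write one new grade-school math word problem followed by a "
                "step-by-step solution, similar in style to:\n" + seed_problem
            ),
        }],
    )
    return resp.choices[0].message.content

def verify(problem_and_solution: str) -> bool:
    """Ask the model to double-check the generated problem and its solution."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{
            "role": "user",
            "content": (
                "Is the following math problem well-posed and its solution "
                "correct? Answer YES or NO only.\n" + problem_and_solution
            ),
        }],
    )
    return resp.choices[0].message.content.strip().upper().startswith("YES")

def build_synthetic_set(seed_problems, target_size, max_attempts=10_000):
    """Generate and verify until target_size items pass the check (or attempts run out)."""
    data, attempts = [], 0
    while len(data) < target_size and attempts < max_attempts:
        for seed in seed_problems:
            attempts += 1
            item = generate_problem(seed)
            if verify(item):
                data.append(item)
            if len(data) >= target_size or attempts >= max_attempts:
                break
    return data
```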

Using this simple method, the team was able to improve accuracy noticeably, attaining 82.6% accuracy on GSM8K and 40.6% on MATH with LLaMA-2 7B models, which exceeds the accuracy of earlier models by 14.2% and 20.8%, respectively.

The research also offers insights into scaling behaviors across different error types and reasoning difficulties. This analysis clarifies how to reduce errors during the scaling process and helps explain how the model's performance changes as the data volume increases.

In conclusion, this study demonstrates that language models can attain excellent mathematical capabilities without requiring massive model scale or intensive math-specific pre-training. Considerable progress in mathematical problem-solving with language models can be made by utilizing synthetic data and scaling up supervised fine-tuning.


Check out the Paper. All credit for this research goes to the researchers of this project.



Tanya Malhotra is a final-year undergraduate at the University of Petroleum & Energy Studies, Dehradun, pursuing a BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with strong analytical and critical thinking skills, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.




