
Applying RLAIF for Code Generation with API-usage in Lightweight LLMs


This paper was accepted at the Natural Language Reasoning and Structured Explanations workshop at ACL 2024.

Reinforcement Learning from AI Feedback (RLAIF) has demonstrated significant potential across various domains, including mitigating harm in LLM outputs, enhancing text summarization, and improving mathematical reasoning. This paper introduces an RLAIF framework for improving the code generation abilities of lightweight (<1B parameters) LLMs. We specifically focus on code generation tasks that require writing appropriate API calls, which is challenging due to the well-known issue of hallucination in LLMs. Our framework extracts AI feedback from a larger LLM (e.g., GPT-3.5) through a specialized prompting strategy and uses this data to train a reward model for better aligning smaller LLMs. We run our experiments on the Gorilla dataset and meticulously assess the quality of the model-generated code across several metrics, including AST, ROUGE, and Code-BLEU, and we develop a pipeline to accurately compute its executability rate. Our approach significantly enhances the fine-tuned LLM baseline's performance, achieving a 4.5% improvement in executability rate. Notably, a smaller LLM (780M parameters) trained with RLAIF surpasses a much larger fine-tuned baseline with 7B parameters, achieving a 1.0% higher code executability rate.
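The abstract does not reproduce the evaluation code, but the core idea of an executability check is simple to sketch. Below is a minimal, hypothetical Python illustration that assumes each generated snippet is a standalone Python script; the function name `executability_rate` is our own, and the authors' actual pipeline (which must handle Gorilla's API calls, dependencies, and sandboxing) is necessarily more involved.

```python
import os
import subprocess
import sys
import tempfile

def executability_rate(snippets, timeout=10):
    """Return the fraction of generated code snippets that run without error.

    A minimal sketch: each snippet is written to a temporary file and
    executed in a fresh subprocess; a zero exit code counts as executable.
    """
    passed = 0
    for code in snippets:
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        try:
            result = subprocess.run(
                [sys.executable, path], capture_output=True, timeout=timeout
            )
            if result.returncode == 0:
                passed += 1
        except subprocess.TimeoutExpired:
            pass  # a hanging snippet counts as non-executable
        finally:
            os.remove(path)
    return passed / len(snippets) if snippets else 0.0
```

Running each snippet in its own subprocess with a timeout keeps a single hanging or crashing sample from stalling the rest of the evaluation, which matters when scoring thousands of model generations.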


