Transfer Learning for Structured Pruning under Limited Task Data

This paper was accepted at the Efficient Natural Language and Speech Processing (ENLSP-III) Workshop at NeurIPS.

Large, pre-trained models are difficult to deploy in resource-constrained applications. Task-aware structured pruning methods offer a solution: they reduce model size by dropping structural units such as layers and attention heads in a way that accounts for the end task. However, these pruning algorithms require more task-specific data than is typically available. We propose a framework that combines structured pruning with transfer learning to reduce the need for task-specific data. Our empirical results answer questions such as: How should the two tasks be coupled? Which parameters should be transferred? And when during training should transfer learning be introduced? Leveraging these insights, we demonstrate that our framework produces pruned models with improved generalization over strong baselines.
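To make the idea of structured pruning concrete, here is a minimal sketch of dropping whole attention heads from a projection matrix. This is an illustration, not the paper's method: it scores heads with a simple L2-norm proxy, whereas task-aware approaches score them by their effect on the end-task loss (e.g., via gradients). The function name, shapes, and the `keep_ratio` parameter are assumptions chosen for the example.

```python
import numpy as np

def prune_attention_heads(W, n_heads, keep_ratio=0.5):
    """Zero out the least important attention heads of a weight matrix.

    W: (rows, d_model) weights whose columns split into n_heads equal
       blocks, one block per head.
    Importance is an L2-norm proxy here; task-aware structured pruning
    would instead use loss sensitivity on the end task.
    """
    d_model = W.shape[1]
    head_dim = d_model // n_heads
    heads = W.reshape(W.shape[0], n_heads, head_dim)
    scores = np.linalg.norm(heads, axis=(0, 2))   # one score per head
    n_keep = max(1, int(round(keep_ratio * n_heads)))
    keep = np.argsort(scores)[-n_keep:]           # strongest heads survive
    mask = np.zeros(n_heads)
    mask[keep] = 1.0
    pruned = heads * mask[None, :, None]          # drop the weak heads
    return pruned.reshape(W.shape), mask

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
pruned, mask = prune_attention_heads(W, n_heads=4, keep_ratio=0.5)
print(int(mask.sum()), "of 4 heads kept")
```

In a real pipeline the zeroed head blocks would then be physically removed, shrinking the matrix and the compute cost rather than just masking it.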
