
Adaptive Training Distributions with Scalable Online Bilevel Optimization


Large neural networks pretrained on web-scale corpora are central to modern machine learning. In this paradigm, the distribution of the large, heterogeneous pretraining data rarely matches that of the application domain. This work considers modifying the pretraining distribution in the case where one has a small sample of data reflecting the targeted test conditions. We propose an algorithm motivated by a recent formulation of this setting as an online, bilevel optimization problem. With scalability in mind, our algorithm prioritizes computing gradients at training points which are likely to most improve the loss on the targeted distribution. Empirically, we show that in some cases this approach is beneficial over existing strategies from the domain adaptation literature but may not succeed in other cases. We propose a simple test to evaluate when our approach can be expected to work well and point towards further research to address current limitations.
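The abstract does not spell out the algorithm's details, but one common way to instantiate "prioritize computing gradients at training points likely to most improve the targeted loss" is to score candidate pretraining examples by how well their gradients align with the gradient of the loss on the small targeted sample, then spend update steps on the highest-scoring examples. The sketch below illustrates that idea only; the model, data shapes, batch sizes, and dot-product scoring rule are placeholders for illustration, not the authors' actual method.

```python
import torch
import torch.nn as nn

# Stand-ins for the large pretraining corpus and the small sample of
# targeted data described in the abstract (hypothetical shapes and sizes).
model = nn.Linear(16, 1)
loss_fn = nn.MSELoss()
pretrain_x, pretrain_y = torch.randn(4096, 16), torch.randn(4096, 1)
target_x, target_y = torch.randn(32, 16), torch.randn(32, 1)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)


def flat_grad(loss):
    """Gradient of `loss` w.r.t. the model parameters, flattened to one vector."""
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])


for step in range(100):
    # Outer signal: gradient of the loss on the small targeted sample.
    target_grad = flat_grad(loss_fn(model(target_x), target_y))

    # Score a random candidate batch of pretraining points by how well each
    # example's gradient aligns with the target gradient (dot product).
    idx = torch.randint(0, pretrain_x.size(0), (64,))
    scores = torch.empty(len(idx))
    for i, j in enumerate(idx.tolist()):
        g = flat_grad(loss_fn(model(pretrain_x[j:j + 1]), pretrain_y[j:j + 1]))
        scores[i] = torch.dot(g, target_grad)

    # Inner step: update the model only on the best-aligned candidates, i.e.
    # spend training gradients where the targeted loss is most likely to drop.
    top = idx[scores.topk(16).indices]
    opt.zero_grad()
    loss_fn(model(pretrain_x[top]), pretrain_y[top]).backward()
    opt.step()
```

Note that the naive per-example scoring loop above adds one backward pass per candidate at every step; a scalable version would need to amortize or approximate this scoring, which is the kind of efficiency concern the abstract alludes to.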


