AI

This AI Research Introduces AstroLLaMA: A 7B Parameter Model Fine-Tuned from LLaMA-2 Using Over 300K Astronomy Abstracts From ArXiv

2 Mins read

[ad_1]

The arrival of Large Language Models (LLMs) has attracted attention from many fields because of several important factors coming together. These factors include the availability of huge amounts of data, improvements in computer power, and breakthroughs in the design of neural networks. Prominent models like GPT-4, PaLM, and LLaMA have shown that they can do many different tasks really well. These tasks often use methods like giving them prompts, fine-tuning their abilities, and getting feedback from humans to help them learn and improve. The astronomy discipline presents both a unique challenge and a fertile ground for the application of LLMs.

In the above image, we can notice each model is prompted with the same short text snippet, highlighted in their respective boxes. GPT-4 tends to produce more generic statements, lacking domain-specific nuance. AstroLLaMA demonstrates the most robust completion, offering more relevant concepts and deeper insights specific to the field of astronomy, thus significantly outperforming LLaMA-2 and GPT-4.

However, AstroLLaMA does have some limitations that need to be acknowledged. One significant limitation is the model’s lack of knowledge in specific areas of astronomy, where AstroLLaMA’s ability to estimate potential star candidates from Gaia-ESO data is notably inaccurate. To address these issues, researchers are currently working on enhancing AstroLLaMA’s training dataset. Instead of just using abstracts, researchers plan to incorporate the complete LaTeX sources of existing astronomy articles. This expansion will substantially increase the number of tokens the model can learn from.

AstroLLaMA serves as an impressive prototype for specialized Large Language Models (LLMs) designed for astronomy. It exhibits remarkable context-aware abilities, outperforming GPT-4 even though it has significantly fewer parameters. This advancement not only opens doors for enhanced performance in various tasks like answering questions, summarising scientific content, and generating hypotheses but also has implications for multi-modal models.


Check out the PaperAll Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 30k+ ML SubReddit, 40k+ Facebook Community, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.

If you like our work, you will love our newsletter..


Janhavi Lande, is an Engineering Physics graduate from IIT Guwahati, class of 2023. She is an upcoming data scientist and has been working in the world of ml/ai research for the past two years. She is most fascinated by this ever changing world and its constant demand of humans to keep up with it. In her pastime she enjoys traveling, reading and writing poems.


[ad_2]
Source link

Related posts
AI

GENAUDIT: A Machine Learning Tool to Assist Users in Fact-Checking LLM-Generated Outputs Against Inputs with Evidence

2 Mins read
[ad_1] With the recent progress made in the field of Artificial Intelligence (AI) and mainly Generative AI, the ability of Large Language…
AI

This AI Paper from the University of Oxford Proposes Magi: A Machine Learning Tool to Make Manga Accessible to the Visually Impaired

2 Mins read
[ad_1] In storytelling, Japanese comics, known as Manga, have carved out a significant niche, captivating audiences worldwide with their intricate plots and…
AI

The Dawn of Grok-1: A Leap Forward in AI Accessibility

2 Mins read
[ad_1] In an era where the democratization of artificial intelligence technology stands as a pivotal turning point for innovation across industries, xAI…

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *