
Construction of Paired Knowledge Graph – Text Datasets Informed by Cyclic Evaluation


Datasets that pair knowledge graphs (KGs) with text (KG-T) can be used to train forward and reverse neural models that generate text from a KG and vice versa. However, models trained on datasets in which the KG and text pairs are not equivalent can suffer from more hallucination and poorer recall. In this paper, we verify this empirically by generating datasets with different levels of noise, and we find that noisier datasets do indeed lead to more hallucination. We argue that the ability of forward and reverse models trained on a dataset to cyclically regenerate the source KG or text is a proxy for the equivalence between the KG and the text in that dataset. Using cyclic evaluation, we find that the manually created WebNLG is much better than the automatically created TeKGen and T-REx. Informed by these observations, we construct a new, improved dataset called LAGRANGE using heuristics meant to improve equivalence between KG and text, and we show the impact of each heuristic on cyclic evaluation. We also construct two synthetic datasets using large language models (LLMs) and observe that models trained on them perform well on cyclic generation of text but less well on cyclic generation of KGs, probably because the datasets lack a consistent underlying ontology.
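To make the idea of cyclic evaluation concrete, the following is a minimal sketch of the KG-to-text-to-KG direction. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say which metric is used, so exact-match triple F1 is assumed here, and `toy_forward`, `toy_reverse`, and `lossy_forward` are hypothetical stand-ins for trained forward and reverse models.

```python
from typing import Callable, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def triple_f1(source: Set[Triple], regenerated: Set[Triple]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 between two triple sets (assumed metric)."""
    if not source and not regenerated:
        return 1.0, 1.0, 1.0
    overlap = len(source & regenerated)
    precision = overlap / len(regenerated) if regenerated else 0.0
    recall = overlap / len(source) if source else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def cyclic_kg_score(
    graphs: Iterable[Set[Triple]],
    forward: Callable[[Set[Triple]], str],
    reverse: Callable[[str], Set[Triple]],
) -> float:
    """Mean F1 of KG -> text -> KG regeneration over a collection of graphs."""
    scores = [triple_f1(set(g), set(reverse(forward(g))))[2] for g in graphs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical stand-in models: a real setup would call trained neural models.
def toy_forward(kg: Set[Triple]) -> str:
    """Linearize a KG into text, one 's | p | o' clause per triple."""
    return "; ".join(" | ".join(t) for t in sorted(kg))


def toy_reverse(text: str) -> Set[Triple]:
    """Parse the linearized text back into a set of triples."""
    if not text:
        return set()
    return {tuple(clause.split(" | ")) for clause in text.split("; ")}


def lossy_forward(kg: Set[Triple]) -> str:
    """Like toy_forward, but drops the last triple to mimic text that omits a fact."""
    return "; ".join(" | ".join(t) for t in sorted(kg)[:-1])


example_kg = {
    ("Alan_Turing", "birthPlace", "London"),
    ("Alan_Turing", "field", "Computer_science"),
}
perfect_score = cyclic_kg_score([example_kg], toy_forward, toy_reverse)
lossy_score = cyclic_kg_score([example_kg], lossy_forward, toy_reverse)
```

In this sketch, text that omits a fact lowers recall on the regenerated KG, which is the kind of signal the abstract treats as a proxy for KG-text equivalence. The text-to-KG-to-text direction would follow the same pattern with a text similarity metric in place of triple F1.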

