
Retrieval-Augmented Correction of Named Entity Speech Recognition Errors


In recent years, end-to-end automatic speech recognition (ASR) systems have become remarkably accurate and performant, but they still exhibit a significant error rate on entity names that appear infrequently in their training data. In parallel with the rise of end-to-end ASR, large language models (LLMs) have proven to be versatile tools for a range of natural language processing (NLP) tasks. In NLP tasks where a database of relevant knowledge is available, retrieval-augmented generation (RAG) has achieved impressive results when paired with LLMs. In this work, we propose a RAG-like technique for correcting speech recognition entity name errors. Our approach uses a vector database to index a set of relevant entities. At runtime, database queries are generated from possibly errorful textual ASR hypotheses, and the entities retrieved with these queries are fed, along with the ASR hypotheses, to an LLM that has been adapted to correct ASR errors. Overall, our best system achieves 33%-39% relative word error rate reductions on synthetic test sets focused on voice assistant queries about rare music entities, without regressing on the STOP test set, a publicly available voice assistant test set covering many domains.
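The pipeline described above — index entity names in a vector database, retrieve candidates for a possibly errorful ASR hypothesis, and assemble the candidates plus hypothesis into an LLM prompt — can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy character-trigram embedding stands in for a learned text encoder, the in-memory index stands in for a real vector database, the query-generation step (extracting the entity span from the hypothesis) is elided, and all entity names and the prompt format are invented for the example.

```python
import math
from collections import Counter

def embed(text, n=3):
    """Toy character-trigram embedding (stand-in for a learned text encoder)."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse trigram-count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class EntityIndex:
    """Minimal in-memory stand-in for a vector database of entity names."""

    def __init__(self, entities):
        self.items = [(name, embed(name)) for name in entities]

    def search(self, query, k=3):
        """Return the k entity names most similar to the query string."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [name for name, _ in ranked[:k]]

def build_correction_prompt(hypothesis, retrieved):
    """Assemble retrieved entities and the ASR hypothesis into an LLM prompt."""
    entity_list = "\n".join(f"- {name}" for name in retrieved)
    return (
        "Possible entity names:\n"
        f"{entity_list}\n\n"
        f"ASR hypothesis: {hypothesis}\n"
        "Corrected transcript:"
    )

# Hypothetical music entities and a hypothesis with a misrecognized artist name.
index = EntityIndex(["Sigur Rós", "Phoebe Bridgers", "Röyksopp", "CHVRCHES"])
hyp = "play see gur rose"
candidates = index.search("see gur rose")   # query derived from the hypothesis
prompt = build_correction_prompt(hyp, candidates)
```

Character-level similarity is a deliberate choice for this sketch: ASR entity errors tend to be phonetically and orthographically close to the true name ("see gur rose" vs. "Sigur Rós"), so they survive a surface-level match even when the words are wrong. The resulting `prompt` would then be sent to the error-correction LLM, which the snippet does not include.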


