AI

Contextualization of ASR with LLM Using Phonetic Retrieval-Based Augmentation

1 Mins read

Large language models (LLMs) have shown superb capability of modeling multimodal signals including audio and text, allowing the model to generate spoken or textual response given a speech input. However, it remains a challenge for the model to recognize personal named entities, such as contacts in a phone book, when the input modality is speech. In this work, we start with a speech recognition task and propose a retrieval-based solution to contextualize the LLM: we first let the LLM detect named entities in speech without any context, then use this named entity as a query to retrieve phonetically similar named entities from a personal database and feed them to the LLM, and finally run context-aware LLM decoding. In a voice assistant task, our solution achieved up to 30.2% relative word error rate reduction and 73.6% relative named entity error rate reduction compared to a baseline system without contextualization. Notably, our solution by design avoids prompting the LLM with the full named entity database, making it highly efficient and applicable to large named entity databases.


Source link

Related posts
AI

UC Berkeley Researchers Propose DocETL: A Declarative System that Optimizes Complex Document Processing Tasks using LLMs

3 Mins read
Large Language Models (LLMs) have gained significant attention in data management, with applications spanning data integration, database tuning, query optimization, and data…
AI

LongAlign: A Segment-Level Encoding Method to Enhance Long-Text to Image Generation

3 Mins read
The rapid progress of text-to-image (T2I) diffusion models has made it possible to generate highly detailed and accurate images from text inputs….
AI

Controllable Safety Alignment (CoSA): An AI Framework Designed to Adapt Models to Diverse Safety Requirements without Re-Training

4 Mins read
As large language models (LLMs) become increasingly capable and better day by day, their safety has become a critical topic for research….

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *