
SLiCK: Exploiting Subsequences for Length-Constrained Keyword Spotting


User-defined keyword spotting on a resource-constrained edge device is challenging. However, keywords are often bounded by a maximum keyword length, a property that prior work has largely under-leveraged. Our analysis of the keyword-length distribution shows that user-defined keyword spotting can be treated as a length-constrained problem, which eliminates the need for aggregation over variable text length. This leads to our proposed method for efficient keyword spotting, SLiCK (exploiting Subsequences for Length-Constrained Keyword spotting). We further introduce a subsequence-level matching scheme that learns audio-text relations at a finer granularity and, through this enhanced context, distinguishes similar-sounding keywords more effectively. SLiCK is trained with a multi-task learning approach using two modules: a Matcher (an utterance-level matching task and a novel subsequence-level matching task) and an Encoder (a phoneme recognition task). The proposed method improves on the baseline results on the LibriPhrase Hard dataset, increasing AUC from 88.52 to 94.9 and reducing EER from 18.82 to 11.1.
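For readers less familiar with the two metrics quoted above, here is a minimal sketch of how AUC and EER can be computed from a list of match scores and ground-truth labels. It is not the authors' evaluation code: the function names (`roc_points`, `auc`, `eer`) and the toy scores are assumptions made purely for illustration, and the abstract's numbers are presumably percentages, so the fractions returned here would be multiplied by 100.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (false-positive rate, true-positive rate)


def roc_points(scores: List[float], labels: List[int]) -> List[Point]:
    """Return the ROC curve as (FPR, TPR) pairs, one per distinct threshold.

    Assumes labels are 0/1 and that both classes are present.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # Sort by descending score: lowering the threshold admits one score group at a time.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points: List[Point] = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Admit every example that shares this score, so ties are handled together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: List[Point]) -> float:
    """Area under the ROC curve, using the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def eer(points: List[Point]) -> float:
    """Equal error rate: the point where FPR equals the miss rate (1 - TPR).

    Linearly interpolates along the ROC segment where the two rates cross.
    """
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d0 = x0 - (1 - y0)  # FPR minus miss rate at the segment start
        d1 = x1 - (1 - y1)  # FPR minus miss rate at the segment end
        if d0 <= 0 <= d1:
            if d1 == d0:
                return x0
            t = -d0 / (d1 - d0)
            return x0 + t * (x1 - x0)
    return 1.0


# Toy example (hypothetical scores): 1 = audio matches the keyword text, 0 = it does not.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
print(f"AUC = {100 * auc(curve):.2f}, EER = {100 * eer(curve):.2f}")
```

A higher AUC and a lower EER both indicate better separation between matching and non-matching audio-text pairs, which is the direction of the improvement reported above.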


