AI

Generalizable Error Modeling for Human Data Annotation: Evidence from an Industry-Scale Search Data Annotation Program

1 Mins read

Machine learning (ML) and artificial intelligence (AI) systems rely heavily on human-annotated data for training and evaluation. A major challenge in this context is the occurrence of annotation errors, as their effects can degrade model performance. This paper presents a predictive error model trained to detect potential errors in search relevance annotation tasks for three industry-scale ML applications (music streaming, video streaming, and mobile apps). Drawing on real-world data from an extensive search relevance annotation program, we demonstrate that errors can be predicted with moderate model performance (AUC=0.65-0.75) and that model performance generalizes well across applications (i.e., a global, task-agnostic model performs on par with task-specific models). In contrast to past research, which has often focused on predicting annotation labels from task-specific features, our model is trained to predict errors directly from a combination of task features and behavioral features derived from the annotation process, in order to achieve a high degree of generalizability. We demonstrate the usefulness of the model in the context of auditing, where prioritizing tasks with high predicted error probabilities considerably increases the amount of corrected annotation errors (e.g., 40% efficiency gains for the music streaming application). These results highlight that behavioral error detection models can yield considerable improvements in the efficiency and quality of data annotation processes. Our findings reveal critical insights into effective error management in the data annotation process, thereby contributing to the broader field of human-in-the-loop ML.


Source link

Related posts
AI

Summarize call transcriptions securely with Amazon Transcribe and Amazon Bedrock Guardrails

8 Mins read
Given the volume of meetings, interviews, and customer interactions in modern business environments, audio recordings play a crucial role in capturing valuable…
AI

Use Amazon SageMaker Studio with a custom file system in Amazon EFS

13 Mins read
Amazon SageMaker Studio is the latest web-based experience for running end-to-end machine learning (ML) workflows. SageMaker Studio offers a suite of integrated…
AI

Katanemo Open Sources Arch-Function: A Set of Large Language Models (LLMs) Promising Ultra-Fast Speeds at Function-Calling Tasks for Agentic Workflows

2 Mins read
One of the biggest hurdles organizations face is implementing Large Language Models (LLMs) to handle intricate workflows effectively. Issues of speed, flexibility,…

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *