Welcome to the Nexus of Ethics, Psychology, Morality, Philosophy and Health Care

Friday, July 8, 2022

AI bias can arise from annotation instructions

K. Wiggers & D. Coldewey
Originally posted 8 MAY 22

Here is an excerpt:

As it turns out, annotators’ predispositions might not be solely to blame for the presence of bias in training labels. In a preprint study out of Arizona State University and the Allen Institute for AI, researchers investigated whether a source of bias might lie in the instructions written by dataset creators to serve as guides for annotators. Such instructions typically include a short description of the task (e.g., “Label all birds in these photos”) along with several examples.

The researchers looked at 14 different “benchmark” datasets used to measure the performance of natural language processing systems, or AI systems that can classify, summarize, translate and otherwise analyze or manipulate text. In studying the task instructions provided to annotators who worked on the datasets, they found evidence that the instructions influenced the annotators to follow specific patterns, which then propagated to the datasets. For example, over half of the annotations in Quoref, a dataset designed to test the ability of AI systems to understand when two or more expressions refer to the same person (or thing), start with the phrase “What is the name,” a phrase present in a third of the instructions for the dataset.
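To make the Quoref observation concrete, here is a minimal Python sketch of how one might measure the share of annotations that open with a phrase also found in the instruction examples. It is an illustration only, not the researchers' method: the function names, the four-word window, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def leading_ngram(text, n=4):
    """Return the first n lowercase whitespace-separated tokens as one string."""
    return " ".join(text.lower().split()[:n])


def pattern_share(annotations, instruction_examples, n=4):
    """Share of annotations whose opening n-gram also opens an instruction example."""
    instruction_patterns = {leading_ngram(ex, n) for ex in instruction_examples}
    counts = Counter(leading_ngram(a, n) for a in annotations)
    matched = sum(c for p, c in counts.items() if p in instruction_patterns)
    return matched / len(annotations) if annotations else 0.0


# Toy data, invented for illustration only (not taken from Quoref).
instruction_examples = [
    "What is the name of the person who left?",
    "Who owned the house?",
    "Where did the team travel?",
]
annotations = [
    "What is the name of the band's singer?",
    "What is the name of the river?",
    "Which city hosted the fair?",
    "What is the name of the author?",
]

share = pattern_share(annotations, instruction_examples)
print(f"{share:.2f}")  # prints 0.75
```

In this toy set, three of the four annotations begin with the same four words as one of the instruction examples, which is the kind of concentration the excerpt describes.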

The phenomenon, which the researchers call “instruction bias,” is particularly troubling because it suggests that systems trained on biased instruction/annotation data might not perform as well as initially thought. Indeed, the co-authors found that instruction bias leads to overestimates of system performance and that these systems often fail to generalize beyond instruction patterns.
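One hedged way to picture the overestimation is to score a system separately on examples that follow an instruction pattern and on those that do not. The Python sketch below does this for a handful of invented (question, prediction, gold answer) triples. The helper names and data are hypothetical, and the split by a single prefix is a simplification, not the evaluation protocol used in the study.

```python
def accuracy(pairs):
    """Fraction of (prediction, gold) pairs that match; None if there are no pairs."""
    pairs = list(pairs)
    if not pairs:
        return None
    return sum(pred == gold for pred, gold in pairs) / len(pairs)


def accuracy_by_pattern(examples, pattern):
    """Split examples by whether the question starts with pattern, then score each group."""
    pattern = pattern.lower()
    in_pattern, out_of_pattern = [], []
    for question, prediction, gold in examples:
        bucket = in_pattern if question.lower().startswith(pattern) else out_of_pattern
        bucket.append((prediction, gold))
    return accuracy(in_pattern), accuracy(out_of_pattern)


# Toy (question, model prediction, gold answer) triples, invented for illustration.
examples = [
    ("What is the name of the singer?", "Ann", "Ann"),
    ("What is the name of the river?", "Nile", "Nile"),
    ("Which city hosted the fair?", "Paris", "Lyon"),
    ("Who wrote the letter?", "Tom", "Tom"),
]

in_acc, out_acc = accuracy_by_pattern(examples, "What is the name")
print(in_acc, out_acc)  # prints 1.0 0.5
```

If a benchmark is dominated by in-pattern questions, the headline accuracy sits close to the first number, while the second number is the better guide to how the system handles questions phrased differently.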

The silver lining is that large systems, like OpenAI’s GPT-3, were found to be generally less sensitive to instruction bias. But the research serves as a reminder that AI systems, like people, are susceptible to developing biases from sources that aren’t always obvious. The intractable challenge is discovering these sources and mitigating the downstream impact.