Weidinger, L., Uesato, J., et al. (2022, March). In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (pp. 19-30). Association for Computing Machinery.
Abstract
Responsible innovation on large-scale Language Models (LMs) requires foresight into and in-depth understanding of the risks these models may pose. This paper develops a comprehensive taxonomy of ethical and social risks associated with LMs. We identify twenty-one risks, drawing on expertise and literature from computer science, linguistics, and the social sciences. We situate these risks in our taxonomy of six risk areas: I. Discrimination, Hate speech and Exclusion, II. Information Hazards, III. Misinformation Harms, IV. Malicious Uses, V. Human-Computer Interaction Harms, and VI. Environmental and Socioeconomic harms. For risks that have already been observed in LMs, the causal mechanism leading to harm, evidence of the risk, and approaches to risk mitigation are discussed. We further describe and analyse risks that have not yet been observed but are anticipated based on assessments of other language technologies, and situate these in the same taxonomy. We underscore that it is the responsibility of organizations to engage with the mitigations we discuss throughout the paper. We close by highlighting challenges and directions for further research on risk evaluation and mitigation with the goal of ensuring that language models are developed responsibly.
Conclusion
In this paper, we propose a comprehensive taxonomy to structure the landscape of potential ethical and social risks associated with large-scale language models (LMs). We aim to support the research programme toward responsible innovation on LMs, broaden the public discourse on ethical and social risks related to LMs, and break risks from LMs into smaller, actionable pieces to facilitate their mitigation. More expertise and perspectives will be required to continue to build out this taxonomy of potential risks from LMs. Future research may also expand this taxonomy by applying additional methods such as case studies or interviews. Next steps building on this work will be to engage further perspectives, to innovate on analysis and evaluation methods, and to build out mitigation tools, working toward the responsible innovation of LMs.
Here is a summary of each of the six categories of risks:
- Discrimination, hate speech and exclusion: LMs can reproduce unjust stereotypes and social biases, generate toxic or hateful language, and perform worse for certain languages and social groups, leading to discriminatory outcomes and the exclusion of marginalised communities.
- Information hazards: LMs can leak, or allow inference of, private or sensitive information contained in their training data, compromising individuals' privacy and safety.
- Misinformation harms: LMs can produce false or misleading outputs that people believe or act on, causing material harm and eroding trust in shared information.
- Malicious uses: LMs can lower the cost of harmful activities such as disinformation campaigns, fraud and scams, and other forms of targeted manipulation.
- Human-computer interaction harms: conversational agents built on LMs can be anthropomorphised, leading users to overestimate their capabilities, place unwarranted trust in them, or be manipulated through them.
- Environmental and socioeconomic harms: training and operating LMs consumes substantial energy and compute, and their deployment may disrupt labour markets and distribute benefits unevenly, deepening existing inequalities.