Welcome to the Nexus of Ethics, Psychology, Morality, Philosophy and Health Care

Welcome to the nexus of ethics, psychology, morality, technology, health care, and philosophy
Showing posts with label language models.

Thursday, March 28, 2024

Antagonistic AI

A. Cai, I. Arawjo, E. L. Glassman
arXiv:2402.07350
Originally submitted 12 Feb 24

The vast majority of discourse around AI development assumes that subservient, "moral" models aligned with "human values" are universally beneficial -- in short, that good AI is sycophantic AI. We explore the shadow of the sycophantic paradigm, a design space we term antagonistic AI: AI systems that are disagreeable, rude, interrupting, confrontational, challenging, etc. -- embedding opposite behaviors or values. Far from being "bad" or "immoral," we consider whether antagonistic AI systems may sometimes have benefits to users, such as forcing users to confront their assumptions, build resilience, or develop healthier relational boundaries. Drawing from formative explorations and a speculative design workshop where participants designed fictional AI technologies that employ antagonism, we lay out a design space for antagonistic AI, articulating potential benefits, design techniques, and methods of embedding antagonistic elements into user experience. Finally, we discuss the many ethical challenges of this space and identify three dimensions for the responsible design of antagonistic AI -- consent, context, and framing.


Here is my summary:

This article proposes a thought-provoking concept: designing AI systems that intentionally challenge and disagree with users. It argues against the dominant view of AI as subservient and aligned with human values, instead exploring the potential benefits of "antagonistic AI" in stimulating critical thinking and challenging assumptions. While acknowledging the ethical concerns and proposing responsible design principles, the article could benefit from a deeper discussion of potential harms, concrete examples of how such AI might function, and how it would be received by users. Overall, "Antagonistic AI" is a valuable contribution that prompts further exploration and discussion on the responsible development and societal implications of such AI systems.

Wednesday, July 5, 2023

Taxonomy of Risks posed by Language Models

Weidinger, L., Uesato, J., et al. (2022, March).
In Proceedings of the 2022 ACM Conference on 
Fairness, Accountability, and Transparency
(pp. 19-30).
Association for Computing Machinery.

Abstract

Responsible innovation on large-scale Language Models (LMs) requires foresight into and in-depth understanding of the risks these models may pose. This paper develops a comprehensive taxonomy of ethical and social risks associated with LMs. We identify twenty-one risks, drawing on expertise and literature from computer science, linguistics, and the social sciences. We situate these risks in our taxonomy of six risk areas: I. Discrimination, Hate speech and Exclusion, II. Information Hazards, III. Misinformation Harms, IV. Malicious Uses, V. Human-Computer Interaction Harms, and VI. Environmental and Socioeconomic harms. For risks that have already been observed in LMs, the causal mechanism leading to harm, evidence of the risk, and approaches to risk mitigation are discussed. We further describe and analyse risks that have not yet been observed but are anticipated based on assessments of other language technologies, and situate these in the same taxonomy. We underscore that it is the responsibility of organizations to engage with the mitigations we discuss throughout the paper. We close by highlighting challenges and directions for further research on risk evaluation and mitigation with the goal of ensuring that language models are developed responsibly.

Conclusion

In this paper, we propose a comprehensive taxonomy to structure the landscape of potential ethical and social risks associated with large-scale language models (LMs). We aim to support the research programme toward responsible innovation on LMs, broaden the public discourse on ethical and social risks related to LMs, and break risks from LMs into smaller, actionable pieces to facilitate their mitigation. More expertise and perspectives will be required to continue to build out this taxonomy of potential risks from LMs. Future research may also expand this taxonomy by applying additional methods such as case studies or interviews. Next steps building on this work will be to engage further perspectives, to innovate on analysis and evaluation methods, and to build out mitigation tools, working toward the responsible innovation of LMs.


Here is a summary of each of the six categories of risks:
  • Discrimination, hate speech, and exclusion: LLMs can be biased against certain groups of people, leading to discrimination in areas such as employment, housing, and lending. They can also generate hate speech and other harmful content, which can lead to exclusion and violence.
  • Information hazards: LLMs can leak or infer private or sensitive information, which can harm the individuals or organizations concerned.
  • Misinformation harms: LLMs can generate false or misleading information, which can deceive people and erode trust in shared information.
  • Malicious uses: LLMs can be used to carry out malicious activities such as hacking, fraud, and terrorism.
  • Human-computer interaction harms: LLMs can be used to create addictive and harmful applications, which can have a negative impact on people's mental health.
  • Environmental and socioeconomic harms: Training and running LLMs can consume large amounts of energy and data, which can have a negative impact on the environment and society.

Wednesday, May 31, 2023

Can AI language models replace human participants?

Dillon, D., Tandon, N., Gu, Y., & Gray, K.
Trends in Cognitive Sciences
May 10, 2023

Abstract

Recent work suggests that language models such as GPT can make human-like judgments across a number of domains. We explore whether and when language models might replace human participants in psychological science. We review nascent research, provide a theoretical model, and outline caveats of using AI as a participant.

(cut)

Does GPT make human-like judgments?

We initially doubted the ability of LLMs to capture human judgments but, as we detail in Box 1, the moral judgments of GPT-3.5 were extremely well aligned with human moral judgments in our analysis (r = 0.95; full details at https://nikett.github.io/gpt-as-participant). Human morality is often argued to be especially difficult for language models to capture and yet we found powerful alignment between GPT-3.5 and human judgments.

We emphasize that this finding is just one anecdote and we do not make any strong claims about the extent to which LLMs make human-like judgments, moral or otherwise. Language models also might be especially good at predicting moral judgments because moral judgments heavily hinge on the structural features of scenarios, including the presence of an intentional agent, the causation of damage, and a vulnerable victim, features that language models may have an easy time detecting.  However, the results are intriguing.

Other researchers have empirically demonstrated GPT-3’s ability to simulate human participants in domains beyond moral judgments, including predicting voting choices, replicating behavior in economic games, and displaying human-like problem solving and heuristic judgments on scenarios from cognitive psychology. LLM studies have also replicated classic social science findings including the Ultimatum Game and the Milgram experiment. One company (http://syntheticusers.com) is expanding on these findings, building infrastructure to replace human participants and offering ‘synthetic AI participants’ for studies.

(cut)

From Caveats and looking ahead

Language models may be far from human, but they are trained on a tremendous corpus of human expression and thus they could help us learn about human judgments. We encourage scientists to compare simulated language model data with human data to see how aligned they are across different domains and populations.  Just as language models like GPT may help to give insight into human judgments, comparing LLMs with human judgments can teach us about the machine minds of LLMs; for example, shedding light on their ethical decision making.
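
As a rough illustration of what comparing simulated and human data can look like, here is a minimal sketch that computes a Pearson correlation (the kind of r value quoted earlier in the excerpt) between human ratings and model ratings of the same scenarios. The function name `pearson_r` and the ratings are made up for illustration; they are not the authors' code or data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical mean ratings for five scenarios (illustrative values only,
# not data from the paper): one list from human participants, one from a model.
human_ratings = [-3.5, -1.0, 0.5, 2.0, 3.8]
model_ratings = [-3.0, -1.5, 1.0, 2.5, 3.5]

r = pearson_r(human_ratings, model_ratings)
print(f"r = {r:.2f}")
```

A high r only says that the two sets of ratings rise and fall together across scenarios. It does not show that the model reaches its judgments the way people do, which is in line with the caveats in the excerpt.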

Lurking under the specific concerns about the usefulness of AI language models as participants is an age-old question: can AI ever be human enough to replace humans? On the one hand, critics might argue that AI participants lack the rationality of humans, making judgments that are odd, unreliable, or biased. On the other hand, humans are odd, unreliable, and biased – and other critics might argue that AI is just too sensible, reliable, and impartial.  What is the right mix of rational and irrational to best capture a human participant?  Perhaps we should ask a big sample of human participants to answer that question. We could also ask GPT.