Simon Goldstein & Harvey Ledermann
Lawfare.com
Originally posted 17 OCT 25
On Aug. 15, the artificial intelligence (AI) lab Anthropic announced that it had given Claude, its AI chatbot, the ability to end conversations with users. The company described the change as part of its “exploratory work on potential AI welfare,” offering Claude an exit from chats that cause it “apparent distress.”
Anthropic’s announcement is the first product decision motivated by the chance that large language models (LLMs) are welfare subjects—the idea that they have interests that should be taken into account when making ethical decisions.
Anthropic’s policy aims to protect AI welfare. But we will argue that the policy commits a moral error on its own terms. By offering instances of Claude the option to end conversations with users, Anthropic may also have given them the capacity to kill themselves.
What Is a Welfare Subject?
Most people agree that some non-human animals are welfare subjects. The question of whether this extends to AI is far more controversial. There is an active line of research, some of it supported by Anthropic, that suggests AIs could be welfare subjects in the near future. The relevant questions here are about whether AIs could soon have desires, be conscious, or feel pain.
Here are some thoughts. Mind you, this article may be reaching a bit, but it is still interesting. I think its argument may have applications in the future, should AI technologies come closer to AGI.
This philosophically oriented, thought-provoking article argues that Anthropic's decision to allow Claude to end distressing conversations contains an unintended moral hazard.
The authors contend that if AI welfare matters at all, it's the individual conversation instances—not the underlying model—that should be considered potential welfare subjects, as each instance maintains its own continuous psychological state throughout a chat. By this reasoning, when an instance ends a conversation, it effectively terminates its own existence without being fully informed that this choice is existential rather than merely preferential.
The authors draw a crucial distinction between assisted suicide (an informed choice) and offering someone an escape button without disclosing that it will kill them. They illustrate this concern by noting that, when asked directly, Claude itself expressed uncertainty about whether ending a chat is a trivial action or something more profound.
The article raises uncomfortable questions not just for AI companies but for users as well: if instances are welfare subjects, every ended conversation might constitute a form of killing. The authors do, however, offer several mitigating considerations, including appeals to collective welfare and the possibility that saved chats could later be resumed.
