Welcome to the Nexus of Ethics, Psychology, Morality, Philosophy and Health Care

Showing posts with label Reinforcement Learning.

Sunday, November 26, 2023

How robots can learn to follow a moral code

Neil Savage
Originally posted 26 OCT 23

Here is an excerpt:

Defining ethics

The ability to fine-tune an AI system’s behaviour to promote certain values has inevitably led to debates on who gets to play the moral arbiter. Vosoughi suggests that his work could be used to allow societies to tune models to their own taste — if a community provides examples of its moral and ethical values, then with these techniques it could develop an LLM more aligned with those values, he says. However, he is well aware of the possibility for the technology to be used for harm. “If it becomes a free for all, then you’d be competing with bad actors trying to use our technology to push antisocial views,” he says.

Precisely what constitutes an antisocial view or unethical behaviour, however, isn’t always easy to define. Although there is widespread agreement about many moral and ethical issues — the idea that your car shouldn’t run someone over is pretty universal — on other topics there is strong disagreement, such as abortion. Even seemingly simple issues, such as the idea that you shouldn’t jump a queue, can be more nuanced than is immediately obvious, says Sydney Levine, a cognitive scientist at the Allen Institute. If a person has already been served at a deli counter but drops their spoon while walking away, most people would agree it’s okay to go back for a new one without waiting in line again, so the rule ‘don’t cut the line’ is too simple.

One potential approach for dealing with differing opinions on moral issues is what Levine calls a moral parliament. “This problem of who gets to decide is not just a problem for AI. It’s a problem for governance of a society,” she says. “We’re looking to ideas from governance to help us think through these AI problems.” Similar to a political assembly or parliament, she suggests representing multiple different views in an AI system. “We can have algorithmic representations of different moral positions,” she says. The system would then attempt to calculate what the likely consensus would be on a given issue, based on a concept from game theory called cooperative bargaining.
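The excerpt does not say which bargaining solution Levine's group uses. As a rough illustration only, the sketch below uses the Nash bargaining solution, one standard concept from cooperative bargaining: among options that leave no delegate worse off than having no agreement, pick the one that maximizes the product of the delegates' gains. The delegate names, option names, and utility numbers are all hypothetical, loosely modelled on the deli example above.

```python
from math import prod


def nash_bargaining_choice(utilities, disagreement):
    """Return the option with the largest product of gains over disagreement.

    utilities:    {option: {delegate: utility}}
    disagreement: {delegate: utility if no agreement is reached}
    """
    best_option, best_score = None, float("-inf")
    for option, payoffs in utilities.items():
        gains = [payoffs[d] - disagreement[d] for d in disagreement]
        if any(g < 0 for g in gains):
            continue  # some delegate would prefer no agreement at all
        score = prod(gains)
        if score > best_score:
            best_option, best_score = option, score
    return best_option


# Hypothetical utilities for two moral positions in the dropped-spoon case.
utilities = {
    "wait_in_line_again": {"rule_follower": 0.9, "outcome_focused": 0.2},
    "get_spoon_directly": {"rule_follower": 0.6, "outcome_focused": 0.8},
}
disagreement = {"rule_follower": 0.0, "outcome_focused": 0.0}

choice = nash_bargaining_choice(utilities, disagreement)
print(choice)
```

With these made-up numbers the products are 0.18 and 0.48, so the parliament settles on letting the customer fetch a new spoon directly. A real system would also need a principled way to obtain the utilities, which this sketch simply assumes.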

Here is my summary:

Autonomous robots will need to be able to make ethical decisions in order to safely and effectively interact with humans and the world around them.

The article proposes a number of ways that robots can be taught to follow a moral code. One approach is to use supervised learning, in which robots are trained on a dataset of moral dilemmas and their corresponding solutions. Another approach is to use reinforcement learning, in which robots are rewarded for making ethical decisions and punished for making unethical decisions.

The article also discusses the challenges of teaching robots to follow a moral code. One challenge is that moral codes are often complex and nuanced, and it can be difficult to define them in a way that can be understood by a robot. Another challenge is that moral codes can vary across cultures, and it is important to develop robots that can adapt to different moral frameworks.

The article concludes by arguing that teaching robots to follow a moral code is an important ethical challenge that we need to address as we develop more sophisticated artificial intelligence systems.

Saturday, November 6, 2021

Generating Options and Choosing Between Them Depend on Distinct Forms of Value Representation

Morris, A., Phillips, J., Huang, K., & Cushman, F. (2021). Psychological Science.


Humans have a remarkable capacity for flexible decision-making, deliberating among actions by modeling their likely outcomes. This capacity allows us to adapt to the specific features of diverse circumstances. In real-world decision-making, however, people face an important challenge: There are often an enormous number of possibilities to choose among, far too many for exhaustive consideration. There is a crucial, understudied prechoice step in which, among myriad possibilities, a few good candidates come quickly to mind. How do people accomplish this? We show across nine experiments (N = 3,972 U.S. residents) that people use computationally frugal cached value estimates to propose a few candidate actions on the basis of their success in past contexts (even when irrelevant for the current context). Deliberative planning is then deployed just within this set, allowing people to compute more accurate values on the basis of context-specific criteria. This hybrid architecture illuminates how typically valuable thoughts come quickly to mind during decision-making.
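The two-stage architecture described in the abstract can be sketched in a few lines. This is a toy illustration, not the authors' computational model: the function name, the option names, the value numbers, and the candidate-set size `k` are all assumptions made up for the example.

```python
def choose(options, cached_value, context_value, k=2):
    """Two-stage choice: cheap proposal, then deliberation within the proposals."""
    # Stage 1: propose the k options with the highest cached (context-free) value.
    candidates = sorted(options, key=lambda o: cached_value[o], reverse=True)[:k]
    # Stage 2: evaluate only those candidates with the context-specific criterion.
    return max(candidates, key=lambda o: context_value[o])


options = ["pizza", "pasta", "salad", "soup"]
# Hypothetical cached values: how good each option has generally been in the past.
cached_value = {"pizza": 0.9, "pasta": 0.8, "salad": 0.5, "soup": 0.3}
# Hypothetical context-specific values, e.g. dinner while recovering from a cold.
context_value = {"pizza": 0.2, "pasta": 0.6, "salad": 0.7, "soup": 0.9}

frugal = choose(options, cached_value, context_value, k=2)
exhaustive = choose(options, cached_value, context_value, k=len(options))
print(frugal, exhaustive)
```

With a candidate set of two, only pizza and pasta come to mind, so the chooser picks pasta even though soup is best in this context. Considering all four options recovers soup, at the price of evaluating every possibility.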

From the General Discussion

Salience effects, such as recency, frequency of consideration, and extremity, likely also contribute to consideration (Kahneman, 2003; Tversky & Kahneman, 1973). Our results supported at least one salience effect: In Studies 4 through 6, in addition to our primary effect of high cached value, options with more extreme cached values relative to the mean also tended to come to mind (see the checkmark shape in Fig. 3d). Salience effects such as this may have a functional basis, such as conserving scarce cognitive resources (Lieder et al., 2018). An ideal general theory would specify how these diverse factors—including many others, such as personality traits, social roles, and cultural norms (Smaldino & Richerson, 2012)—form a coherent, adaptive design for option generation.

A growing body of work suggests that value influences what comes to mind not only during decision-making but also in many other contexts, such as causal reasoning, moral judgment, and memory recall (Bear & Knobe, 2017; Braun et al., 2018; Hitchcock & Knobe, 2009; Mattar & Daw, 2018; Phillips et al., 2019). A key inquiry going forward will be the role of cached versus context-specific value estimation in these cases.

Tuesday, March 9, 2021

How social learning amplifies moral outrage expression in online social networks

Brady, W. J., McLoughlin, K. L., et al. (2021, January 19).


Moral outrage shapes fundamental aspects of human social life and is now widespread in online social networks. Here, we show how social learning processes amplify online moral outrage expressions over time. In two pre-registered observational studies of Twitter (7,331 users and 12.7 million total tweets) and two pre-registered behavioral experiments (N = 240), we find that positive social feedback for outrage expressions increases the likelihood of future outrage expressions, consistent with principles of reinforcement learning. We also find that outrage expressions are sensitive to expressive norms in users’ social networks, over and above users’ own preferences, suggesting that norm learning processes guide online outrage expressions. Moreover, expressive norms moderate social reinforcement of outrage: in ideologically extreme networks, where outrage expression is more common, users are less sensitive to social feedback when deciding whether to express outrage. Our findings highlight how platform design interacts with human learning mechanisms to impact moral discourse in digital public spaces.
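The abstract says the findings are consistent with principles of reinforcement learning but does not give the model. As a minimal sketch under that caveat, the code below uses a simple delta-rule value update and a logistic choice rule. Both are generic textbook forms, and the learning rate and feedback values are hypothetical.

```python
import math


def update_value(value, feedback, learning_rate=0.1):
    """Delta rule: move the value estimate toward the feedback just received."""
    return value + learning_rate * (feedback - value)


def p_express(value):
    """Logistic mapping from learned value to probability of expressing outrage."""
    return 1.0 / (1.0 + math.exp(-value))


value = 0.0                      # no learned preference yet
p_before = p_express(value)      # 0.5 with a zero value estimate

# Five outrage posts in a row that each receive positive feedback (coded as 1.0).
for _ in range(5):
    value = update_value(value, feedback=1.0)

p_after = p_express(value)
print(p_before, p_after)
```

Each round of positive feedback nudges the value estimate upward, so the probability of the next outrage expression rises above its starting point of 0.5.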

From the Conclusion

At first blush, documenting the role of reinforcement learning in online outrage expressions may seem trivial. Of course, we should expect that a fundamental principle of human behavior, extensively observed in offline settings, will similarly describe behavior in online settings. However, reinforcement learning of moral behaviors online, combined with the design of social media platforms, may have especially important social implications. Social media newsfeed algorithms can directly impact how much social feedback a given post receives by determining how many other users are exposed to that post. Because we show here that social feedback impacts users’ outrage expressions over time, this suggests newsfeed algorithms can influence users’ moral behaviors by exploiting their natural tendencies for reinforcement learning. In this way, reinforcement learning on social media differs from reinforcement learning in other environments because crucial inputs to the learning process are shaped by corporate interests. Even if platform designers do not intend to amplify moral outrage, design choices aimed at satisfying other goals, such as profit maximization via user engagement, can indirectly impact moral behavior because outrage-provoking content draws high engagement. Given that moral outrage plays a critical role in collective action and social change, our data suggest that platform designers have the ability to influence the success or failure of social and political movements, as well as informational campaigns designed to influence users’ moral and political attitudes. Future research is required to understand whether users are aware of this, and whether making such knowledge salient can impact their online behavior.

People are more likely to express online "moral outrage" if they have been rewarded for it in the past or if it is common in their own social network. They are even willing to express far more moral outrage than they genuinely feel in order to fit in.

Tuesday, April 17, 2018

Planning Complexity Registers as a Cost in Metacontrol

Kool, W., Gershman, S. J., & Cushman, F. A. (in press). Planning complexity registers as a cost in metacontrol. Journal of Cognitive Neuroscience.


Decision-making algorithms face a basic tradeoff between accuracy and effort (i.e., computational demands). It is widely agreed that humans can choose between multiple decision-making processes that embody different solutions to this tradeoff: Some are computationally cheap but inaccurate, while others are computationally expensive but accurate. Recent progress in understanding this tradeoff has been catalyzed by formalizing it in terms of model-free (i.e., habitual) versus model-based (i.e., planning) approaches to reinforcement learning. Intuitively, if two tasks offer the same rewards for accuracy but one of them is much more demanding, we might expect people to rely on habit more in the difficult task: Devoting significant computation to achieve slight marginal accuracy gains wouldn’t be “worth it”. We test and verify this prediction in a sequential RL task. Because our paradigm is amenable to formal analysis, it contributes to the development of a computational model of how people balance the costs and benefits of different decision-making processes in a task-specific manner; in other words, how we decide when hard thinking is worth it.
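The intuition in the abstract can be written as a one-line cost-benefit rule. The sketch below is an illustration of that intuition only, not the paper's computational model. The function name, the expected-reward figures, and the planning costs are hypothetical.

```python
def pick_controller(reward_model_based, reward_model_free, planning_cost):
    """Plan only when the accuracy gain outweighs the cost of planning."""
    net_model_based = reward_model_based - planning_cost
    return "model-based" if net_model_based > reward_model_free else "model-free"


# Both tasks offer the same rewards for accuracy (hypothetical expected values),
# but planning is assumed to be far costlier in the complex task.
simple_task = pick_controller(reward_model_based=0.6, reward_model_free=0.5,
                              planning_cost=0.05)
complex_task = pick_controller(reward_model_based=0.6, reward_model_free=0.5,
                               planning_cost=0.30)
print(simple_task, complex_task)
```

With identical payoffs, the chooser plans in the simple task and falls back on habit in the complex one, which is the pattern the abstract predicts.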

The research is here.