Welcome to the Nexus of Ethics, Psychology, Morality, Philosophy and Health Care

Showing posts with label Alignment.

Friday, September 1, 2023

Building Superintelligence Is Riskier Than Russian Roulette

Tam Hunt & Roman Yampolskiy
nautil.us
Originally posted 2 August 2023

Here is an excerpt:

The precautionary principle is a long-standing approach for new technologies and methods that urges positive proof of safety before real-world deployment. Companies like OpenAI have so far released their tools to the public with no requirements at all to establish their safety. The burden of proof should be on companies to show that their AI products are safe—not on public advocates to show that those same products are not safe.

Recursively self-improving AI, the kind many companies are already pursuing, is the most dangerous kind, because it may lead to an intelligence explosion some have called “the singularity,” a point in time beyond which it becomes impossible to predict what might happen because AI becomes god-like in its abilities. That moment could happen in the next year or two, or it could be a decade or more away.
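
The explosive dynamic the authors describe can be illustrated with a toy recursive-improvement model. This is not from the article; the update rule, growth constant, and starting capability below are purely illustrative assumptions.

```python
# Toy model of recursive self-improvement (illustrative only; not from
# the article). Assumption: each update step improves capability in
# proportion to current capability, because the system is using its own
# intelligence to improve itself.

def recursive_improvement(c0=1.0, k=0.1, generations=20):
    """Return the capability trajectory over `generations` self-improvement steps."""
    trajectory = [c0]
    c = c0
    for _ in range(generations):
        c = c * (1.0 + k * c)  # capability gain compounds on itself
        trajectory.append(c)
    return trajectory

traj = recursive_improvement()
for gen in (0, 5, 10, 15, 20):
    print(f"generation {gen:2d}: capability ~ {traj[gen]:.3g}")
```

In a model like this, growth looks gentle for many steps and then runs away within a few, which is one intuition for why the timing of any such transition is so hard to predict.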

Humans won’t be able to anticipate what a far-smarter entity plans to do or how it will carry out its plans. Such superintelligent machines, in theory, will be able to harness all of the energy available on our planet, then the solar system, then eventually the entire galaxy, and we have no way of knowing what those activities will mean for human well-being or survival.

Can we trust that a god-like AI will have our best interests in mind? Similarly, can we trust that human actors using the coming generations of AI will have the best interests of humanity in mind? With the stakes so incredibly high in developing superintelligent AI, we must have a good answer to these questions—before we go over the precipice.

Because of these existential concerns, more scientists and engineers are now working toward addressing them. For example, the theoretical computer scientist Scott Aaronson recently said that he’s working with OpenAI to develop ways of implementing a kind of watermark on the text that the company’s large language models, like GPT-4, produce, so that people can verify the text’s source. It’s still far too little, and perhaps too late, but it is encouraging to us that a growing number of highly intelligent humans are turning their attention to these issues.
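
Aaronson has described that watermarking work only at a high level, so the sketch below is not his scheme. It is a minimal illustration of the general idea of a keyed statistical watermark: a secret key deterministically marks part of the vocabulary as "green" for each context, generation prefers green tokens, and a detector who knows the key counts them.

```python
import hashlib
import random

# Minimal sketch of a statistical text watermark (illustrative only; NOT
# the scheme Aaronson has described for OpenAI). A secret key plus the
# previous token marks roughly half the vocabulary as "green"; generation
# prefers green tokens, and detection counts how many tokens are green.

KEY = b"secret-watermark-key"  # hypothetical shared secret

def is_green(prev_token: str, token: str) -> bool:
    digest = hashlib.sha256(KEY + prev_token.encode() + token.encode()).digest()
    return digest[0] % 2 == 0

def generate(vocab, length=200, bias=0.9, seed=0):
    """Stand-in for an LLM: sample tokens, preferring green ones with probability `bias`."""
    rng = random.Random(seed)
    out = ["<s>"]
    for _ in range(length):
        green = [t for t in vocab if is_green(out[-1], t)]
        pool = green if (green and rng.random() < bias) else vocab
        out.append(rng.choice(pool))
    return out[1:]

def green_fraction(tokens, start="<s>"):
    prev, hits = start, 0
    for t in tokens:
        hits += is_green(prev, t)
        prev = t
    return hits / len(tokens)

vocab = [f"w{i}" for i in range(1000)]
print("watermarked:", green_fraction(generate(vocab, bias=0.9)))          # well above 0.5
print("unmarked:   ", green_fraction(generate(vocab, bias=0.0, seed=1)))  # close to 0.5
```

A real deployment would work over an actual model's token probabilities and use a proper statistical test, but the asymmetry is the same: text from the watermarked generator is detectable by anyone holding the key, while ordinary text is not.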

Philosopher Toby Ord argues, in his book The Precipice: Existential Risk and the Future of Humanity, that in our ethical thinking and, in particular, when thinking about existential risks like AI, we must consider not just the welfare of today’s humans but the entirety of our likely future, which could extend for billions or even trillions of years if we play our cards right. So the risks stemming from our AI creations need to be considered not only over the next decade or two, but for every decade stretching forward over vast amounts of time. That’s a much higher bar than ensuring AI safety “only” for a decade or two.
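
Ord's point can be made concrete with a back-of-the-envelope expected-value comparison. Every number below is an illustrative assumption; none comes from the article or the book.

```python
# Illustrative arithmetic only: all figures are assumptions, not Ord's.
present_population = 8e9         # people alive today
potential_future_lives = 1e16    # stylized guess if humanity persists for ~a billion years
risk_reduction = 0.001           # shaving 0.1 percentage points off extinction risk

# Expected lives at stake in that small risk reduction, counting only the
# present generation versus counting the long future Ord asks us to weigh.
print("present-only view:", risk_reduction * present_population)       # 8 million
print("longtermist view: ", risk_reduction * potential_future_lives)   # 10 trillion
```

On assumptions like these, even tiny changes in existential risk swamp the interests of the present generation alone, which is why the bar for AI safety rises so sharply once the long-run future is counted.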

Skeptics of these arguments often suggest that we can simply program AI to be benevolent, and if or when it becomes superintelligent, it will still have to follow its programming. This ignores the ability of superintelligent AI to either reprogram itself or to persuade humans to reprogram it. In the same way that humans have figured out ways to transcend our own “evolutionary programming”—caring about all of humanity rather than just our family or tribe, for example—AI will very likely be able to find countless ways to transcend any limitations or guardrails we try to build into it early on.


Here is my summary:

The article argues that building superintelligence is a risky endeavor, even riskier than playing Russian roulette. There is no way to guarantee that we will be able to control a superintelligent AI, and even if we could, it might not share our values. This could lead to the AI harming or even destroying humanity.

The authors propose that we pause current efforts to develop superintelligence and instead focus on understanding the risks involved. They argue that we need a better understanding of how to align AI with human values, and that we need safety mechanisms to prevent AI from harming humanity. (See Shelley's Frankenstein as a literary example.)

Thursday, April 7, 2022

How to Prevent Robotic Sociopaths: A Neuroscience Approach to Artificial Ethics

Christov-Moore, L., Reggente, N., et al.
https://doi.org/10.31234/osf.io/6tn42

Abstract

Artificial intelligence (AI) is expanding into every niche of human life, organizing our activity, expanding our agency and interacting with us to an increasing extent. At the same time, AI’s efficiency, complexity and refinement are growing quickly. Justifiably, there is increasing concern with the immediate problem of engineering AI that is aligned with human interests.

Computational approaches to the alignment problem attempt to design AI systems to parameterize human values like harm and flourishing, and avoid overly drastic solutions, even if these are seemingly optimal. In parallel, ongoing work in service AI (caregiving, consumer care, etc.) is concerned with developing artificial empathy, teaching AIs to decode human feelings and behavior, and evince appropriate, empathetic responses. This could be equated to cognitive empathy in humans.
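
The "parameterize human values and avoid overly drastic solutions" idea can be sketched as a penalized objective. The action features, weights, and numbers below are illustrative assumptions, not the authors' formulation.

```python
from dataclasses import dataclass

# Illustrative sketch of a "top-down" value-parameterized objective (not
# the authors' model): score actions on proxies for human values, and
# penalize drastic, high-impact actions even when they look optimal on
# the raw task objective.

@dataclass
class Action:
    task_reward: float    # how well the action achieves the stated goal
    expected_harm: float  # proxy for harm to humans
    impact: float         # how drastically the action changes the world

def score(a: Action, w_harm=10.0, w_impact=5.0) -> float:
    return a.task_reward - w_harm * a.expected_harm - w_impact * a.impact

drastic = Action(task_reward=100.0, expected_harm=0.5, impact=20.0)
modest = Action(task_reward=60.0, expected_harm=0.1, impact=1.0)

# The drastic plan is "seemingly optimal" on raw reward but loses once
# the harm and impact penalties are applied.
print(score(drastic), score(modest))  # -5.0 vs 54.0
```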

We propose that in the absence of affective empathy (which allows us to share in the states of others), existing approaches to artificial empathy may fail to produce the caring, prosocial component of empathy, potentially resulting in superintelligent, sociopath-like AI. We adopt the colloquial usage of “sociopath” to signify an intelligence possessing cognitive empathy (i.e., the ability to infer and model the internal states of others), but crucially lacking harm aversion and empathic concern arising from vulnerability, embodiment, and affective empathy (which permits shared experience). An expanding, ubiquitous intelligence that does not have a means to care about us poses a species-level risk.

It is widely acknowledged that harm aversion is a foundation of moral behavior. However, harm aversion is itself predicated on the experience of harm, within the context of the preservation of physical integrity. Following from this, we argue that a “top-down” rule-based approach to achieving caring, aligned AI may be unable to anticipate and adapt to the inevitable novel moral/logistical dilemmas faced by an expanding AI. It may be more effective to cultivate prosociality from the bottom up, baked into an embodied, vulnerable artificial intelligence with an incentive to preserve its real or simulated physical integrity. This may be achieved via optimization for incentives and contingencies inspired by the development of empathic concern in vivo. We outline the broad prerequisites of this approach and review ongoing work that is consistent with our rationale.
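
One way to read this bottom-up proposal is as an agent whose objective includes preserving its own real or simulated bodily integrity and sharing in the affect of those around it. The reward terms and weights below are one possible interpretation, not the authors' implementation.

```python
# Illustrative reading of the "bottom-up" proposal (our interpretation,
# not the authors' implementation): the agent's reward mixes task success
# with maintaining its own simulated bodily integrity (vulnerability) and
# a shared-affect term that makes others' distress costly to it.

def reward(task_success: float, own_integrity: float, others_distress: float,
           w_body: float = 1.0, w_empathy: float = 2.0) -> float:
    # own_integrity in [0, 1]: 1 = intact body, 0 = destroyed
    # others_distress >= 0: inferred distress of nearby agents
    return task_success + w_body * own_integrity - w_empathy * others_distress

# An action that completes the task but damages the agent and distresses a
# bystander can score worse than a gentler, partially successful one.
reckless = reward(task_success=1.0, own_integrity=0.3, others_distress=0.8)
careful = reward(task_success=0.7, own_integrity=1.0, others_distress=0.0)
print(reckless, careful)  # roughly -0.3 vs 1.7
```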

If successful, work of this kind could allow for AI that surpasses empathic fatigue and the idiosyncrasies, biases, and computational limits of human empathy. The scalable complexity of AI may allow it unprecedented capability to deal proportionately and compassionately with complex, large-scale ethical dilemmas. By addressing this problem seriously in the early stages of AI’s integration with society, we might eventually produce an AI that plans and behaves with an ingrained regard for the welfare of others, aided by the scalable cognitive complexity necessary to model and solve extraordinary problems.

Thursday, February 6, 2020

Taking Stock of Moral Approaches to Leadership: An Integrative Review of Ethical, Authentic, and Servant Leadership

G. James Lemoine, Chad A. Hartnell, and Hannes Leroy
Academy of Management Annals, Vol. 13, No. 1
Published online: 16 Jan 2019
https://doi.org/10.5465/annals.2016.0121

Abstract

Moral forms of leadership such as ethical, authentic, and servant leadership have seen a surge of interest in the 21st century. The proliferation of morally based leadership approaches has resulted in theoretical confusion and empirical overlap that mirror substantive concerns within the larger leadership domain. Our integrative review of this literature reveals connections with moral philosophy that provide a useful framework to better differentiate the specific moral content (i.e., deontology, virtue ethics, and consequentialism) that undergirds ethical, authentic, and servant leadership, respectively. Taken together, this integrative review clarifies points of integration and differentiation among moral approaches to leadership and delineates avenues for future research that promise to build complementary rather than redundant knowledge regarding how moral approaches to leadership inform the broader leadership domain.

From the Conclusion section

Although morality’s usefulness in the leadership domain has often been questioned (e.g., Mumford & Fried, 2014), our comparative review of the three dominant moral approaches (i.e., ethical, authentic, and servant leadership) clearly indicates that moral leadership behaviors positively impact a host of desirable organizationally relevant outcomes. This conclusion counters old critiques that issues of morality in leadership are unimportant (e.g., England & Lee, 1974; Rost, 1991; Thompson, 1956). To the contrary, moral forms of leadership have much potential to explain leadership’s influence in a manner substantially distinct from classical forms of leadership such as task-oriented, relationship-oriented, and change-oriented leadership (DeRue, Nahrgang, Wellman, & Humphrey, 2011; Yukl, Gordon, & Taber, 2002).

Thursday, November 30, 2017

Why We Should Be Concerned About Artificial Superintelligence

Matthew Graves
Skeptic Magazine
Originally published November 2017

Here is an excerpt:

Our intelligence is ultimately a mechanistic process that happens in the brain, but there is no reason to assume that human intelligence is the only possible form of intelligence. And while the brain is complex, this is partly an artifact of the blind, incremental progress that shaped it—natural selection. This suggests that developing machine intelligence may turn out to be a simpler task than reverse-engineering the entire brain. The brain sets an upper bound on the difficulty of building machine intelligence; work to date in the field of artificial intelligence sets a lower bound; and within that range, it’s highly uncertain exactly how difficult the problem is. We could be 15 years away from the conceptual breakthroughs required, or 50 years away, or more.

The fact that artificial intelligence may be very different from human intelligence also suggests that we should be very careful about anthropomorphizing AI. Depending on the design choices AI scientists make, future AI systems may not share our goals or motivations; they may have very different concepts and intuitions; or terms like “goal” and “intuition” may not even be particularly applicable to the way AI systems think and act. AI systems may also have blind spots regarding questions that strike us as obvious. AI systems might also end up far more intelligent than any human.

The last possibility deserves special attention, since superintelligent AI has far more practical significance than other kinds of AI.

AI researchers generally agree that superintelligent AI is possible, though they have different views on how and when it’s likely to be developed. In a 2013 survey, top-cited experts in artificial intelligence assigned a median 50% probability to AI being able to “carry out most human professions at least as well as a typical human” by the year 2050, and also assigned a 50% probability to AI greatly surpassing the performance of every human in most professions within 30 years of reaching that threshold.
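
Those two survey medians can be combined in a rough way. Treating them as independent and reading "within 30 years" literally are simplifications on our part, not claims made by the survey.

```python
# Rough combination of the 2013 survey medians quoted above; the
# independence assumption is a simplification, not part of the survey.
p_human_level_by_2050 = 0.5
p_superhuman_within_30_more_years = 0.5

p_superintelligence_by_about_2080 = (p_human_level_by_2050
                                     * p_superhuman_within_30_more_years)
print(p_superintelligence_by_about_2080)  # 0.25
```

Even on this crude reading, the experts surveyed put something like a one-in-four chance on AI that greatly surpasses humans in most professions arriving within this century.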

The article is here.