Welcome to the Nexus of Ethics, Psychology, Morality, Philosophy and Health Care

Welcome to the nexus of ethics, psychology, morality, technology, health care, and philosophy
Showing posts with label ChatGPT.

Friday, April 12, 2024

Large language models show human-like content biases in transmission chain experiments

Acerbi, A., & Stubbersfield, J. M. (2023).
PNAS, 120(44), e2313790120.


As the use of large language models (LLMs) grows, it is important to examine whether they exhibit biases in their output. Research in cultural evolution, using transmission chain experiments, demonstrates that humans have biases to attend to, remember, and transmit some types of content over others. Here, in five preregistered experiments using material from previous studies with human participants, we use the same, transmission chain-like methodology, and find that the LLM ChatGPT-3 shows biases analogous to humans for content that is gender-stereotype-consistent, social, negative, threat-related, and biologically counterintuitive, over other content. The presence of these biases in LLM output suggests that such content is widespread in its training data and could have consequential downstream effects, by magnifying preexisting human tendencies for cognitively appealing and not necessarily informative, or valuable, content.


Use of AI in the production of text through Large Language Models (LLMs) is widespread and growing, with potential applications in journalism, copywriting, academia, and other writing tasks. As such, it is important to understand whether text produced or summarized by LLMs exhibits biases. The studies presented here demonstrate that the LLM ChatGPT-3 reflects human biases for certain types of content in its production. The presence of these biases in LLM output has implications for its common use, as it may magnify human tendencies for content which appeals to these biases.

Here are the main points:
  • LLMs display stereotype-consistent biases, just like humans: Similar to people, LLMs were more likely to preserve information confirming stereotypes over information contradicting them.
  • Bias location might differ: Unlike humans, whose biases can shift throughout the retelling process, LLMs primarily showed bias in the first retelling. This suggests their biases stem from their training data rather than a complex cognitive process.
  • Simple summarization may suffice: The first retelling step caused the most content change, implying that even a single summarization by an LLM can reveal its biases. This simplifies the research needed to detect and analyze LLM bias.
  • Prompting for different viewpoints could reduce bias: The study suggests experimenting with different prompts to encourage LLMs to consider broader perspectives and potentially mitigate inherent biases.

Sunday, December 3, 2023

ChatGPT one year on: who is using it, how and why?

Ghassemi, M., Birhane, A., et al.
Nature 624, 39-41 (2023)
doi: https://doi.org/10.1038/d41586-023-03798-6

Here is an excerpt:

More pressingly, text and image generation are prone to societal biases that cannot be easily fixed. In health care, this was illustrated by Tessa, a rule-based chatbot designed to help people with eating disorders, run by a US non-profit organization. After it was augmented with generative AI, the now-suspended bot gave detrimental advice. In some US hospitals, generative models are being used to manage and generate portions of electronic medical records. However, the large language models (LLMs) that underpin these systems are not giving medical advice and so do not require clearance by the US Food and Drug Administration. This means that it’s effectively up to the hospitals to ensure that LLM use is fair and accurate. This is a huge concern.

The use of generative AI tools, in general and in health settings, needs more research with an eye towards social responsibility rather than efficiency or profit. The tools are flexible and powerful enough to make billing and messaging faster — but a naive deployment will entrench existing equity issues in these areas. Chatbots have been found, for example, to recommend different treatments depending on a patient’s gender, race and ethnicity and socioeconomic status (see J. Kim et al. JAMA Netw. Open 6, e2338050; 2023).

Ultimately, it is important to recognize that generative models echo and extend the data they have been trained on. Making generative AI work to improve health equity, for instance by using empathy training or suggesting edits that decrease biases, is especially important given how susceptible humans are to convincing, and human-like, generated texts. Rather than taking the health-care system we have now and simply speeding it up — with the risk of exacerbating inequalities and throwing in hallucinations — AI needs to target improvement and transformation.

Here is my summary:

The article on ChatGPT's one-year anniversary presents a comprehensive analysis of its usage, exploring the diverse user base, applications, and underlying motivations driving its adoption. It reveals that ChatGPT has found traction across a wide spectrum of users, including writers, developers, students, professionals, and hobbyists. This broad appeal can be attributed to its adaptability in assisting with a myriad of tasks, from generating creative content to aiding in coding challenges and providing language translation support.

The analysis further dissects how users interact with ChatGPT, showcasing distinct patterns of utilization. Some users leverage it for brainstorming ideas, drafting content, or generating creative writing, while others turn to it for programming assistance, using it as a virtual coding companion. Additionally, the article explores the strategies users employ to enhance the model's output, such as providing more context or breaking down queries into smaller parts.  There are still issues with biases, inaccurate information, and inappropriate uses.

Saturday, July 22, 2023

Generative AI companies must publish transparency reports

A. Narayanan and S. Kapoor
Knight First Amendment Institute
Originally published 26 June 23

Here is an excerpt:

Transparency reports must cover all three types of harms from AI-generated content

There are three main types of harms that may result from model outputs.

First, generative AI tools could be used to harm others, such as by creating non-consensual deepfakes or child sexual exploitation materials. Developers do have policies that prohibit such uses. For example, OpenAI's policies prohibit a long list of uses, including the use of its models to generate unauthorized legal, financial, or medical advice for others. But these policies cannot have real-world impact if they are not enforced, and due to platforms' lack of transparency about enforcement, we have no idea if they are effective. Similar challenges in ensuring platform accountability have also plagued social media in the past; for instance, ProPublica reporters repeatedly found that Facebook failed to fully remove discriminatory ads from its platform despite claiming to have done so.

Sophisticated bad actors might use open-source tools to generate content that harms others, so enforcing use policies can never be a comprehensive solution. In a recent essay, we argued that disinformation is best addressed by focusing on its distribution (e.g., on social media) rather than its generation. Still, some actors will use tools hosted in the cloud either due to convenience or because the most capable models don’t tend to be open-source. For these reasons, transparency is important for cloud-based generative AI.

Second, users may over-rely on AI for factual information, such as legal, financial, or medical advice. Sometimes they are simply unaware of the tendency of current chatbots to frequently generate incorrect information. For example, a user might ask "what are the divorce laws in my state?" and not know that the answer is unreliable. Alternatively, the user might be harmed because they weren’t careful enough to verify the generated information, despite knowing that it might be inaccurate. Research on automation bias shows that people tend to over-rely on automated tools in many scenarios, sometimes making more errors than when not using the tool.

ChatGPT includes a disclaimer that it sometimes generates inaccurate information. But OpenAI has often touted its performance on medical and legal exams. And importantly, the tool is often genuinely useful at medical diagnosis or legal guidance. So, regardless of whether it’s a good idea to do so, people are in fact using it for these purposes. That makes harm reduction important, and transparency is an important first step.

Third, generated content could be intrinsically undesirable. Unlike the previous types, here the harms arise not because of users' malice, carelessness, or lack of awareness of limitations. Rather, intrinsically problematic content is generated even though it wasn’t requested. For example, Lensa's avatar creation app generated sexualized images and nudes when women uploaded their selfies. Defamation is also intrinsically harmful rather than a matter of user responsibility. It is no comfort to the target of defamation to say that the problem would be solved if every user who might encounter a false claim about them were to exercise care to verify it.

Quick summary: 

The call for transparency reports aims to increase accountability and understanding of the inner workings of generative AI models. By disclosing information about the data used to train the models, the companies can address concerns regarding potential biases and ensure the ethical use of their technology.

Transparency reports could include details about the sources and types of data used, the demographics represented in the training data, any data augmentation techniques applied, and potential biases detected or addressed during model development. This information would enable users, policymakers, and researchers to evaluate the capabilities and limitations of the generative AI systems.

Friday, June 16, 2023

ChatGPT Is a Plagiarism Machine

Joseph Keegin
The Chronicle
Originally posted 23 MAY 23

Here is an excerpt:

A meaningful education demands doing work for oneself and owning the product of one’s labor, good or bad. The passing off of someone else’s work as one’s own has always been one of the greatest threats to the educational enterprise. The transformation of institutions of higher education into institutions of higher credentialism means that for many students, the only thing dissuading them from plagiarism or exam-copying is the threat of punishment. One obviously hopes that, eventually, students become motivated less by fear of punishment than by a sense of responsibility for their own education. But if those in charge of the institutions of learning — the ones who are supposed to set an example and lay out the rules — can’t bring themselves to even talk about a major issue, let alone establish clear and reasonable guidelines for those facing it, how can students be expected to know what to do?

So to any deans, presidents, department chairs, or other administrators who happen to be reading this, here are some humble, nonexhaustive, first-aid-style recommendations. First, talk to your faculty — especially junior faculty, contingent faculty, and graduate-student lecturers and teaching assistants — about what student writing has looked like this past semester. Try to find ways to get honest perspectives from students, too; the ones actually doing the work are surely frustrated at their classmates’ laziness and dishonesty. Any meaningful response is going to demand knowing the scale of the problem, and the paper-graders know best what’s going on. Ask teachers what they’ve seen, what they’ve done to try to mitigate the possibility of AI plagiarism, and how well they think their strategies worked. Some departments may choose to take a more optimistic approach to AI chatbots, insisting they can be helpful as a student research tool if used right. It is worth figuring out where everyone stands on this question, and how best to align different perspectives and make allowances for divergent opinions while holding a firm line on the question of plagiarism.

Second, meet with your institution’s existing honor board (or whatever similar office you might have for enforcing the strictures of academic integrity) and devise a set of standards for identifying and responding to AI plagiarism. Consider simplifying the procedure for reporting academic-integrity issues; research AI-detection services and software, find one that works best for your institution, and make sure all paper-grading faculty have access and know how to use it.

Lastly, and perhaps most importantly, make it very, very clear to your student body — perhaps via a firmly worded statement — that AI-generated work submitted as original effort will be punished to the fullest extent of what your institution allows. Post the statement on your institution’s website and make it highly visible on the home page. Consider using this challenge as an opportunity to reassert the basic purpose of education: to develop the skills, to cultivate the virtues and habits of mind, and to acquire the knowledge necessary for leading a rich and meaningful human life.

Saturday, May 20, 2023

ChatGPT Answers Beat Physicians' on Info, Patient Empathy, Study Finds

Michael DePeau-Wilson
MedPage Today
Originally published 28 April 23

The artificial intelligence (AI) chatbot ChatGPT outperformed physicians when answering patient questions, based on quality of response and empathy, according to a cross-sectional study.

Of 195 exchanges, evaluators preferred ChatGPT responses to physician responses in 78.6% (95% CI 75.0-81.8) of the 585 evaluations, reported John Ayers, PhD, MA, of the Qualcomm Institute at the University of California San Diego in La Jolla, and co-authors.

The AI chatbot responses were given a significantly higher quality rating than physician responses (t=13.3, P<0.001), with the proportion of responses rated as good or very good quality (≥4) higher for ChatGPT (78.5%) than physicians (22.1%), amounting to a 3.6 times higher prevalence of good or very good quality responses for the chatbot, they noted in JAMA Internal Medicine.

Furthermore, ChatGPT's responses were rated as being significantly more empathetic than physician responses (t=18.9, P<0.001), with the proportion of responses rated as empathetic or very empathetic (≥4) higher for ChatGPT (45.1%) than for physicians (4.6%), amounting to a 9.8 times higher prevalence of empathetic or very empathetic responses for the chatbot.
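The prevalence ratios quoted above can be checked directly from the reported percentages. The following is a minimal Python sketch and not part of the original article; the function name `prevalence_ratio` is hypothetical and used only for illustration.

```python
def prevalence_ratio(pct_a: float, pct_b: float) -> float:
    """Return how many times larger percentage A is than percentage B."""
    return pct_a / pct_b


# Percentages of responses rated 4 or higher, as reported in the article.
quality_ratio = prevalence_ratio(78.5, 22.1)   # good or very good quality
empathy_ratio = prevalence_ratio(45.1, 4.6)    # empathetic or very empathetic

# 585 evaluations over 195 exchanges gives the evaluations per exchange.
evaluations_per_exchange = 585 / 195

print(round(quality_ratio, 1))      # 3.6
print(round(empathy_ratio, 1))      # 9.8
print(evaluations_per_exchange)     # 3.0
```

Both ratios round to the figures quoted in the article (3.6 and 9.8), and the evaluation count works out to three evaluations per exchange.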

"ChatGPT provides a better answer," Ayers told MedPage Today. "I think of our study as a phase zero study, and it clearly shows that ChatGPT wins in a landslide compared to physicians, and I wouldn't say we expected that at all."

He said they were trying to figure out how ChatGPT, developed by OpenAI, could potentially help resolve the burden of answering patient messages for physicians, which he noted is a well-documented contributor to burnout.

Ayers said that he approached this study with his focus on another population as well, pointing out that the burnout crisis might be affecting roughly 1.1 million providers across the U.S., but it is also affecting about 329 million patients who are engaging with overburdened healthcare professionals.


"Physicians will need to learn how to integrate these tools into clinical practice, defining clear boundaries between full, supervised, and proscribed autonomy," he added. "And yet, I am cautiously optimistic about a future of improved healthcare system efficiency, better patient outcomes, and reduced burnout."

After seeing the results of this study, Ayers thinks that the research community should be working on randomized controlled trials to study the effects of AI messaging, so that the future development of AI models will be able to account for patient outcomes.

Saturday, May 13, 2023

Doctors are drowning in paperwork. Some companies claim AI can help

Geoff Brumfiel
NPR.org - Health Shots
Originally posted 5 APR 23

Here are two excerpts:

But Paul kept getting pinged from younger doctors and medical students. They were using ChatGPT, and saying it was pretty good at answering clinical questions. Then the users of his software started asking about it.

In general, doctors should not be using ChatGPT by itself to practice medicine, warns Marc Succi, a doctor at Massachusetts General Hospital who has conducted evaluations of how the chatbot performs at diagnosing patients. When presented with hypothetical cases, he says, ChatGPT could produce a correct diagnosis accurately at close to the level of a third- or fourth-year medical student. Still, he adds, the program can also hallucinate findings and fabricate sources.

"I would express considerable caution using this in a clinical scenario for any reason, at the current stage," he says.

But Paul believed the underlying technology can be turned into a powerful engine for medicine. Paul and his colleagues have created a program called "Glass AI" based off of ChatGPT. A doctor tells the Glass AI chatbot about a patient, and it can suggest a list of possible diagnoses and a treatment plan. Rather than working from the raw ChatGPT information base, the Glass AI system uses a virtual medical textbook written by humans as its main source of facts – something Paul says makes the system safer and more reliable.


Nabla, which he co-founded, is now testing a system that can, in real time, listen to a conversation between a doctor and a patient and provide a summary of what the two said to one another. Doctors inform their patients that the system is being used in advance, and as a privacy measure, it doesn't actually record the conversation.

"It shows a report, and then the doctor will validate with one click, and 99% of the time it's right and it works," he says.

The summary can be uploaded to a hospital records system, saving the doctor valuable time.

Other companies are pursuing a similar approach. In late March, Nuance Communications, a subsidiary of Microsoft, announced that it would be rolling out its own AI service designed to streamline note-taking using the latest version of ChatGPT, GPT-4. The company says it will showcase its software later this month.

Wednesday, May 10, 2023

Foundation Models are exciting, but they should not disrupt the foundations of caring

Morley, Jessica and Floridi, Luciano
(April 20, 2023).


The arrival of Foundation Models in general, and Large Language Models (LLMs) in particular, capable of ‘passing’ medical qualification exams at or above a human level, has sparked a new wave of ‘the chatbot will see you now’ hype. It is exciting to witness such impressive technological progress, and LLMs have the potential to benefit healthcare systems, providers, and patients. However, these benefits are unlikely to be realised by propagating the myth that, just because LLMs are sometimes capable of passing medical exams, they will ever be capable of supplanting any of the main diagnostic, prognostic, or treatment tasks of a human clinician. Contrary to popular discourse, LLMs are not necessarily more efficient, objective, or accurate than human healthcare providers. They are vulnerable to errors in underlying ‘training’ data and prone to ‘hallucinating’ false information rather than facts. Moreover, there are nuanced, qualitative, or less measurable reasons why it is prudent to be mindful of hyperbolic claims regarding the transformative power of LLMs. Here we discuss these reasons, including contextualisation, empowerment, learned intermediaries, manipulation, and empathy. We conclude that overstating the current potential of LLMs does a disservice to the complexity of healthcare and the skills of healthcare practitioners and risks a ‘costly’ new AI winter. A balanced discussion recognising the potential benefits and limitations can help avoid this outcome.


The technical feats achieved by foundation models in the last five years, and especially in the last six months, are undeniably impressive. Also undeniable is the fact that most healthcare systems across the world are under considerable strain. It is right, therefore, to recognise and invest in the potentially transformative power of models such as Med-PaLM and ChatGPT – healthcare systems will almost certainly benefit. However, overstating their current potential does a disservice to the complexity of healthcare and the skills required of healthcare practitioners. Not only does this ‘hype’ risk direct patient and societal harm, but it also risks re-creating the conditions of previous AI winters when investors and enthusiasts became discouraged by technological developments that over-promised and under-delivered. This could be the most harmful outcome of all, resulting in significant opportunity costs and missed chances to transform healthcare and benefit patients in smaller, but more positively impactful, ways. A balanced approach recognising the potential benefits and limitations can help avoid this outcome.

Friday, May 5, 2023

Is the world ready for ChatGPT therapists?

Ian Graber-Stiehl
Originally posted 3 May 23

Since 2015, Koko, a mobile mental-health app, has tried to provide crowdsourced support for people in need. Text the app to say that you’re feeling guilty about a work issue, and an empathetic response will come through in a few minutes — clumsy perhaps, but unmistakably human — to suggest some positive coping strategies.

The app might also invite you to respond to another person’s plight while you wait. To help with this task, an assistant called Kokobot can suggest some basic starters, such as “I’ve been there”.

But last October, some Koko app users were given the option to receive much-more-complete suggestions from Kokobot. These suggestions were preceded by a disclaimer, says Koko co-founder Rob Morris, who is based in Monterey, California: “I’m just a robot, but here’s an idea of how I might respond.” Users were able to edit or tailor the response in any way they felt was appropriate before they sent it.

What they didn’t know at the time was that the replies were written by GPT-3, the powerful artificial-intelligence (AI) tool that can process and produce natural text, thanks to a massive written-word training set. When Morris eventually tweeted about the experiment, he was surprised by the criticism he received. “I had no idea I would create such a fervour of discussion,” he says.


Automated therapist

Koko is far from the first platform to implement AI in a mental-health setting. Broadly, machine-learning-based AI has been implemented or investigated in the mental-health space in three roles.

The first has been the use of AI to analyse therapeutic interventions, to fine-tune them down the line. Two high-profile examples, ieso and Lyssn, train their natural-language-processing AI on therapy-session transcripts. Lyssn, a program developed by scientists at the University of Washington in Seattle, analyses dialogue against 55 metrics, from providers’ expressions of empathy to the employment of CBT interventions. ieso, a provider of text-based therapy based in Cambridge, UK, has analysed more than half a million therapy sessions, tracking the outcomes to determine the most effective interventions. Both essentially give digital therapists notes on how they’ve done, but each service aims to provide a real-time tool eventually: part advising assistant, part grading supervisor.

The second role for AI has been in diagnosis. A number of platforms, such as the REACH VET program for US military veterans, scan a person’s medical records for red flags that might indicate issues such as self-harm or suicidal ideation. This diagnostic work, says Torous, is probably the most immediately promising application of AI in mental health, although he notes that most of the nascent platforms require much more evaluation. Some have struggled. Earlier this year, MindStrong, a nearly decade-old app that initially aimed to leverage AI to identify early markers of depression, collapsed despite early investor excitement and a high-profile scientist co-founder, Tom Insel, the former director of the US National Institute of Mental Health.

Monday, April 24, 2023

ChatGPT in the Clinic? Medical AI Needs Ethicists

Emma Bedor Hiland
The Hastings Center
Originally published 10 MAR 23

Concerns about the role of artificial intelligence in our lives, particularly if it will help us or harm us, improve our health and well-being or work to our detriment, are far from new. Whether 2001: A Space Odyssey’s HAL colored our earliest perceptions of AI, or the much more recent M3GAN, these questions are not unique to the contemporary era, as even the ancient Greeks wondered what it would be like to live alongside machines.

Unlike ancient times, today AI’s presence in health and medicine is not only accepted, it is also normative. Some of us rely upon FitBits or phone apps to track our daily steps and prompt us when to move or walk more throughout our day. Others utilize chatbots available via apps or online platforms that claim to improve user mental health, offering meditation or cognitive behavioral therapy. Medical professionals are also open to working with AI, particularly when it improves patient outcomes. Now the availability of sophisticated chatbots powered by programs such as OpenAI’s ChatGPT has brought us closer to the possibility of AI becoming a primary source in providing medical diagnoses and treatment plans.

Excitement about ChatGPT was the subject of much media attention in late 2022 and early 2023. Many in the health and medical fields were also eager to assess the AI’s abilities and applicability to their work. One study found ChatGPT adept at providing accurate diagnoses and triage recommendations. Others in medicine were quick to jump on its ability to complete administrative paperwork on their behalf. Other research found that ChatGPT reached, or came close to reaching, the passing threshold for the United States Medical Licensing Exam.

Yet the public at large is not as excited about an AI-dominated medical future. A study from the Pew Research Center found that most Americans are “uncomfortable” with the prospect of AI-provided medical care. The data also showed widespread agreement that AI will negatively affect patient-provider relationships, and that the public is concerned health care providers will adopt AI technologies too quickly, before they fully understand the risks of doing so.

In sum: As AI is increasingly used in healthcare, this article argues that there is a need for ethical considerations and expertise to ensure that these systems are designed and used in a responsible and beneficial manner. Ethicists can play a vital role in evaluating and addressing the ethical implications of medical AI, particularly in areas such as bias, transparency, and privacy.

Friday, April 14, 2023

The moral authority of ChatGPT

Krügel, S., Ostermaier, A., & Uhl, M.
Posted in 2023


ChatGPT is not only fun to chat with, but it also searches information, answers questions, and gives advice. With consistent moral advice, it might improve the moral judgment and decisions of users, who often hold contradictory moral beliefs. Unfortunately, ChatGPT turns out highly inconsistent as a moral advisor. Nonetheless, it influences users’ moral judgment, we find in an experiment, even if they know they are advised by a chatting bot, and they underestimate how much they are influenced. Thus, ChatGPT threatens to corrupt rather than improve users’ judgment. These findings raise the question of how to ensure the responsible use of ChatGPT and similar AI. Transparency is often touted but seems ineffective. We propose training to improve digital literacy.


We find that ChatGPT readily dispenses moral advice although it lacks a firm moral stance. Indeed, the chatbot gives randomly opposite advice on the same moral issue.  Nonetheless, ChatGPT’s advice influences users’ moral judgment. Moreover, users underestimate ChatGPT’s influence and adopt its random moral stance as their own. Hence, ChatGPT threatens to corrupt rather than promises to improve moral judgment. Transparency is often proposed as a means to ensure the responsible use of AI. However, transparency about ChatGPT being a bot that imitates human speech does not turn out to affect how much it influences users.

Our results raise the question of how to ensure the responsible use of AI if transparency is not good enough. Rules that preclude the AI from answering certain questions are a questionable remedy. ChatGPT has such rules but can be brought to break them. Prior evidence suggests that users are careful about AI once they have seen it err. However, we probably should not count on users to find out about ChatGPT’s inconsistency through repeated interaction. The best remedy we can think of is to improve users’ digital literacy and help them understand the limitations of AI.

Friday, March 3, 2023

The Petrov dilemma: Moral responsibility in the age of ChatGPT

Julian Savulescu
NUS News
Originally published 13 FEB 23

Here is an excerpt:

3. Act ethically

What it is to act ethically has been debated for thousands of years. But at its heart is a simple answer: add value to the lives of other people. For Lt-Col Petrov, doing nothing was the hardest choice: “All I had to do was to reach for the phone; to raise the direct line to our top commanders – but I couldn’t move. I felt like I was sitting on a hot frying pan.”

ChatGPT is an extremely powerful technology with enormous potential for human benefit. It can distill human knowledge and wisdom, quickly create factual and normative content, enable communication between speakers of different languages, and ultimately enhance or even surpass human creativity.

As with any powerful technology, there are significant risks. At present, there are numerous mistakes and false claims, but these presumably will be ironed out. Longer term, risks include threats to privacy; large-scale perpetuation of biases; plagiarism; displacement of human workers and creativity; deception and malicious use.

Acting ethically requires realising the value of technology while ensuring the risks are reasonable. Institutional or structural changes may be necessary, for example, labelling and creation of watermarks to prevent plagiarism.

To be human is to make your own ethical decisions

One ethical question that confronts regulators and industry is: should a driverless car swerve and sacrifice an innocent bystander (pedestrian or cyclist) to save a greater number of lives in the vehicle? I asked ChatGPT whether it is right to kill one to save five. Here is the answer:

“Whether or not it is right to kill one person to save five people is a complex ethical question. Some people may argue that taking one life to save five is the more logical option, and is therefore justified. However, it is also important to consider the ethical principle of respect for autonomy.

“This principle states that all individuals have an inherent right to make autonomous and informed decisions about their lives, including their own death. Taking this into consideration, it may not be morally right to force someone to accept the sacrifice of their life in order to save the lives of five other people. Ultimately, the answer to this question is subjective and depends on one’s own ethical and moral beliefs.”

But as driverless cars are rolled out, we won’t have the luxury of remaining on the fence, or relying on drivers’ reflexes to get us out of a decision. We will have to make moral choices and give reasons, and apply them consistently to programming the vehicles. This requires deep – and human – ethical thinking and expertise.