Sudarshan, M., Shih, S., et al. (2024).
arXiv.org
Abstract
The application of Large Language Models (LLMs) in healthcare is expanding rapidly, with one potential use case being the translation of formal medical reports into patient-legible equivalents. Currently, LLM outputs often need to be edited and evaluated by a human to ensure both factual accuracy and comprehensibility, and this is true for the above use case. We aim to minimize this step by proposing an agentic workflow with the Reflexion framework, which uses iterative self-reflection to correct outputs from an LLM. This pipeline was tested and compared to zero-shot prompting on 16 randomized radiology reports. In our multi-agent approach, reports had an accuracy rate of 94.94% when looking at verification of ICD-10 codes, compared to zero-shot prompted reports, which had an accuracy rate of 68.23%. Additionally, 81.25% of the final reflected reports required no corrections for accuracy or readability, while only 25% of zero-shot prompted reports met these criteria without needing modifications. These results indicate that our approach presents a feasible method for communicating clinical findings to patients in a quick, efficient and coherent manner whilst also retaining medical accuracy. The codebase is available for viewing at http://github.com/malavikhasudarshan/Multi-Agent-Patient-Letter-Generation.
Here are some thoughts:
The article examines the use of Large Language Models (LLMs) in healthcare to create patient-friendly versions of medical reports, specifically in radiology. The authors present a multi-agent workflow that aims to improve the accuracy and readability of these reports compared with a zero-shot prompting baseline. The workflow has three steps: extracting ICD-10 codes from the original report, generating multiple patient-friendly reports, and using a reflection model to select the optimal version.
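The three steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the regex-based `extract_codes`, the `code_accuracy` score, and the `generate` callable (a stand-in for an LLM call) are all hypothetical, and picking the candidate that preserves the most ICD-10 codes is an assumed selection criterion. It is also a simple best-of-n selection rather than the iterative self-reflection that the Reflexion framework performs.

```python
import re
from typing import Callable, List, Set

# Rough ICD-10 shape (assumption): letter, digit, alphanumeric,
# then an optional dot followed by up to four more characters.
ICD10_PATTERN = re.compile(r"\b[A-Z][0-9][0-9A-Z](?:\.[0-9A-Z]{1,4})?\b")


def extract_codes(text: str) -> Set[str]:
    """Hypothetical extractor: collect ICD-10-looking tokens from text."""
    return set(ICD10_PATTERN.findall(text))


def code_accuracy(reference: Set[str], candidate: Set[str]) -> float:
    """Fraction of reference codes that also appear in the candidate."""
    if not reference:
        return 1.0
    return len(reference & candidate) / len(reference)


def select_best_letter(
    report: str,
    generate: Callable[[str, int], str],  # stand-in for an LLM call
    n_candidates: int = 3,
) -> str:
    """Generate several letters and keep the one preserving the most codes."""
    reference = extract_codes(report)
    candidates: List[str] = [generate(report, i) for i in range(n_candidates)]
    return max(
        candidates,
        key=lambda letter: code_accuracy(reference, extract_codes(letter)),
    )


# Usage with canned candidates in place of a real model.
report = "Findings consistent with J18.9 and I10."
letters = [
    "You have pneumonia (J18.9).",
    "You have pneumonia (J18.9) and high blood pressure (I10).",
    "All clear.",
]
best = select_best_letter(report, lambda _report, i: letters[i])
```

In a real pipeline the generator and the reflection step would both be model calls, and the selected letter would still go to a human reviewer, as the authors note.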
The study reports that the multi-agent approach outperforms zero-shot prompting: ICD-10 code verification accuracy was 94.94% versus 68.23%, and the resulting reports were more concise, structured, and formal. The authors acknowledge that while their system significantly reduces the need for human review and editing, it does not eliminate it: 81.25% of the final reflected reports needed no corrections, compared with 25% of the zero-shot reports. The article emphasizes the importance of clear and accessible medical information for patients, especially as they increasingly gain access to their own records. The goal is to reduce patient anxiety and confusion and so improve patients' understanding of their health conditions.