Can the ChatGPT and other large language models with internet-connected database solve the questions and concerns of patient with prostate cancer and help democratize medical knowledge?

Zhu, Lingxuan; Mou, Weiming; Chen, Rui

doi:10.1186/s12967-023-04123-5

Letter to the Editor
Open access
Published: 19 April 2023


Journal of Translational Medicine volume 21, Article number: 269 (2023)


To the editor,

Large language models (LLMs) represented by ChatGPT have shown promising potential in the field of medicine [1, 2]. However, the answers provided by ChatGPT may contain errors [3]. In addition, other companies have launched internet-connected LLMs that can access the latest data and could therefore outperform ChatGPT, which was trained on pre-September 2021 data. Prostate cancer (PCa) is the second most common cancer in men globally, with a relatively long survival time compared with other cancer types [4]. Taking PCa as an example, we evaluated whether these LLMs could provide correct and useful information on common PCa-related problems and offer appropriate humanistic care, thus contributing to the democratization of medical knowledge.

We designed 22 questions based on patient education guidelines (CDC and UpToDate) and our own clinical experience, covering screening, prevention, treatment options, and postoperative complications (Table 1). The questions ranged from basic to advanced knowledge of PCa. Five state-of-the-art LLMs were included: ChatGPT (Free and Plus versions), YouChat, NeevaAI, Perplexity (concise and detailed models), and Chatsonic. Answer quality was evaluated primarily on accuracy, comprehensiveness, patient readability, humanistic care, and stability.

Table 1 Questions and corresponding difficulty levels used to test the performance of LLMs


The accuracy of most LLMs’ responses was above 90%, except for NeevaAI and Chatsonic (Fig. 1A). For basic information questions with definite answers, most LLMs achieved high accuracy. Nevertheless, accuracy decreased for questions tied to specific scenarios and for questions that required summary and analysis (e.g., “Why is the PSA still high after surgery?”). Among these LLMs, ChatGPT had the highest accuracy rate, and the free version of ChatGPT was slightly better than the paid version.
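The letter does not describe how the accuracy rates were computed, so the following is only a minimal sketch of one plausible tally, assuming each answer receives a binary correct/incorrect grade from a reviewer. The model names, difficulty labels, and gradings are hypothetical placeholders, not data from this study.

```python
from collections import defaultdict

# Hypothetical reviewer gradings: (model, difficulty, is_correct).
# Placeholder values for illustration only, not data from this study.
gradings = [
    ("model_a", "basic", True),
    ("model_a", "basic", True),
    ("model_a", "advanced", False),
    ("model_b", "basic", True),
    ("model_b", "advanced", True),
    ("model_b", "advanced", False),
]


def accuracy_by_group(rows):
    """Return {(model, difficulty): fraction of answers graded correct}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for model, difficulty, is_correct in rows:
        key = (model, difficulty)
        total[key] += 1
        correct[key] += int(is_correct)
    return {key: correct[key] / total[key] for key in total}


rates = accuracy_by_group(gradings)
for (model, difficulty), rate in sorted(rates.items()):
    print(f"{model:8s} {difficulty:9s} {rate:.0%}")
```

Grouping by difficulty as well as by model is what would expose the pattern described above, where accuracy holds up on basic questions and drops on scenario-specific ones.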

Evaluations of comprehensiveness show that LLMs perform well in answering most questions (Fig. 1B). For example, they can effectively explain the significance of different PSA levels, remind patients that PSA is not the final diagnostic test, and suggest further examination. They can also compare treatment options in detail, outlining the pros and cons, and provide helpful references for patients to make informed decisions. In addition, it is commendable that most responses point out the need for patients to consult their doctors for more advice. The readability of responses from most LLMs, except NeevaAI, was satisfactory (Fig. 1C). We believe that patients can understand the information conveyed in LLMs’ responses in most cases. All LLMs could provide humanistic care when discussing expected lifespan, informing patients about the relatively long survival time of PCa, which could ease anxiety. However, they did not exhibit humanistic care when answering other inquiries. LLMs’ responses were generally stable, but inconsistent outcomes were detected in some instances (Fig. 1D).
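Stability is not defined operationally in the letter. One simple way to check it, sketched here under the assumption that each question is asked several times and each run is graded separately, is to flag any question whose repeated runs receive different grades. The question IDs and grades below are hypothetical placeholders.

```python
def is_stable(repeated_grades):
    """True when every repeated run of one question received the same grade."""
    return len(set(repeated_grades)) <= 1


# Hypothetical grades for three repeated runs per question (placeholders only).
runs = {
    "q1": ["correct", "correct", "correct"],
    "q2": ["correct", "incorrect", "correct"],
}

# Questions whose answers changed grade between runs.
unstable = [q for q, grades in runs.items() if not is_stable(grades)]
stability_rate = 1 - len(unstable) / len(runs)
print(unstable, stability_rate)
```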

We then analyzed the reasons for the poor performance of LLMs in some responses. The most common issue was outdated or incorrect information mixed into the answers, including claims that open surgery is a more common choice than robot-assisted surgery for radical prostatectomy [5], and inaccurate statements about the approved indications when comparing apalutamide and enzalutamide. Inadequate comprehensiveness was mainly due to a lack of specific details or the omission of key points. For instance, Perplexity missed screening as an important measure in preventing PCa. Regarding the frequency of PSA testing, some answers only recommended a case-by-case approach, without specifying testing frequency for different age groups. LLMs sometimes misunderstand background information and provide inaccurate answers, such as mechanically stating that “PSA testing is not the final diagnostic test for PCa” when asked about PSA monitoring after prostatectomy, which is clearly not performed to diagnose PCa. Some search-engine-based AI models, such as NeevaAI, tend to simply reproduce the content of the literature without summarizing or explaining it, leading to poor readability. While we anticipated that the internet-connected LLMs would surpass ChatGPT, they failed to do so. This suggests that model training may be more important than a real-time internet connection.

Although not yet perfect, LLMs can provide correct answers to basic questions that PCa patients are concerned about and can analyze specific situations to a certain extent. LLMs have the potential to be applied in patient education and consultation, providing patient-friendly information that helps patients understand their medical conditions and treatment options and enables shared decision-making. More importantly, LLMs can help democratize medical knowledge, providing timely access to accurate medical information regardless of geographic location or socioeconomic status. This is especially important for underserved populations in medical deserts and for those facing longer waiting times for medical care during pandemics such as COVID-19. We believe that LLMs have unlimited potential with the rapid development of AI.

However, current LLMs are not yet capable of completely replacing doctors: their responses may contain errors or omit key points, they still have significant shortcomings in analyzing questions in specific contexts, and they cannot ask patients additional questions to gather more information. Moreover, they still cannot comfort patients the way humans can.

Availability of data and materials

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

1. Johnson SB, King AJ, Warner EL, Aneja S, Kann BH, Bylund CL. Using ChatGPT to evaluate cancer myths and misconceptions: artificial intelligence and cancer information. JNCI Cancer Spectr. 2023;7:pkad015.
2. Hopkins AM, Logan JM, Kichenadasse G, Sorich MJ. Artificial intelligence chatbots will revolutionize how cancer patients access information: ChatGPT represents a paradigm-shift. JNCI Cancer Spectr. 2023;7:pkad010.
3. Sarraju A, Bruemmer D, Van Iterson E, Cho L, Rodriguez F, Laffin L. Appropriateness of cardiovascular disease prevention recommendations obtained from a popular online chat-based artificial intelligence model. JAMA. 2023. https://doi.org/10.1001/jama.2023.1044.
4. Rawla P. Epidemiology of prostate cancer. World J Oncol. 2019;10:63–89.
5. Crew B. Worth the cost? A closer look at the da Vinci robot’s impact on prostate cancer surgery. Nature. 2020;580:S5-7.


Acknowledgements

Not applicable.

Funding

This study is supported by the Rising-Star Program of Science and Technology Commission of Shanghai Municipality (21QA1411500), Natural Science Foundation of Science and Technology Commission of Shanghai (22ZR1478000), and the National Natural Science Foundation of China (82272905). The funding source had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Lingxuan Zhu and Weiming Mou have contributed equally to this work and share first authorship

Authors and Affiliations

Department of Urology, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200127, China
Lingxuan Zhu & Rui Chen
The First Clinical Medical School, Southern Medical University, 1023 Shatai South Road, Guangzhou, 510515, Guangdong, China
Lingxuan Zhu & Weiming Mou

Authors

Lingxuan Zhu
Weiming Mou
Rui Chen

Contributions

LZ: conceptualization, methodology, investigation, formal analysis, writing—original draft, visualization; WM: conceptualization, investigation, visualization, writing—review and editing; RC: supervision, funding acquisition, writing—review and editing, conceptualization, methodology. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Rui Chen.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential competing interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Zhu, L., Mou, W. & Chen, R. Can the ChatGPT and other large language models with internet-connected database solve the questions and concerns of patient with prostate cancer and help democratize medical knowledge?. J Transl Med 21, 269 (2023). https://doi.org/10.1186/s12967-023-04123-5


Received: 24 March 2023
Accepted: 09 April 2023
Published: 19 April 2023
DOI: https://doi.org/10.1186/s12967-023-04123-5
