When asked general guideline-based questions about urologic conditions, ChatGPT provided responses that misinterpreted the guidelines and omitted contextual information or appropriate references, according to findings from a recent study conducted by investigators at the University of Florida College of Medicine.1,2
The data were published in Urology.
For the study, the investigators posed 13 urological guideline-based questions to the chatbot 3 times each, because the chatbot can generate different responses to the same question. The questions spanned topics such as benign prostatic hyperplasia, overactive bladder, erectile dysfunction, kidney stones, Peyronie disease, and recurrent urinary tract infections (UTIs).
Each response was assessed for appropriateness and given a score based on the Brief DISCERN (BD) questionnaire, with a BD score of at least 16 indicating good-quality content. The BD score covers 6 domains: the content’s aims, whether the aims were achieved, relevance, the sources of the information, the date of those sources, and bias. The appropriateness of each response was determined based on its accordance with guidelines from the American Urological Association, the Canadian Urological Association, and/or the European Association of Urology.
In total, 59% of the responses provided by the chatbot were deemed appropriate. However, appropriateness was not consistent across repeated queries: 25% of the 13 question sets had discordant appropriateness determinations among the 3 responses. Responses deemed appropriate tended to have higher BD scores overall and in the relevance domain (both P < .01).
The average BD score across all responses was 16.8, though only 53.8% (7 of 13) of topics and 53.8% (21 of 39) of responses met the 16-or-greater threshold for good-quality content. Scores were highest for the questions regarding hypogonadism (average, 19.5) and erectile dysfunction (19.3), and lowest for the questions regarding Peyronie disease (15.1) and recurrent UTIs in women (14.0).
Among the 6 domains measured with the BD tool, the chatbot scored lowest on sources because it did not provide citations by default. When prompted to provide sources, 92.3% of ChatGPT’s responses contained at least 1 citation that was incorrect, misinterpreted, or nonfunctional.
“It provided sources that were either completely made up or completely irrelevant. Transparency is important so patients can assess what they’re being told,” said senior author Russell S. Terry, MD, in a news release on the findings.2 Terry is an assistant professor of urology at the University of Florida College of Medicine in Gainesville, Florida.
Further, only 1 response provided by ChatGPT indicated that it “cannot give medical advice.” However, the chatbot did suggest discussing or consulting with a doctor or medical provider in 24 of the 39 responses.
The authors concluded, “Additional training and modifications are needed before these AI models will be ready for reliable use by patients and providers.”
References
1. Whiles BB, Bird VG, Canales BK, DiBianco JM, Terry RS. Caution! AI bot has entered the patient chat: ChatGPT has limitations in providing accurate urologic healthcare advice. Urology. 2023;S0090-4295(23)00597-6. doi:10.1016/j.urology.2023.07.010
2. UF College of Medicine research shows AI chatbot flawed when giving urology advice. News release. University of Florida College of Medicine. August 25, 2023. Accessed September 8, 2023. https://ufhealth.org/news/2023/uf-college-of-medicine-research-shows-ai-chatbot-flawed-when-giving-urology-advice