Machine Learning in Female Pelvic Medicine and Reconstructive Surgery, Urology, and Beyond

By: Glenn T. Werneburg, MD, PhD, Cleveland Clinic, Ohio | Posted on: 17 Mar 2023

The artificial intelligence (AI) algorithm Deep Blue (IBM) defeated the reigning chess champion, Garry Kasparov, 25 years ago. More recently, AlphaGo (DeepMind, Alphabet Inc) outperformed a champion in “Go,” a game with more board configurations than atoms in the universe, and Cicero (Meta) demonstrated proficiency in the game “Diplomacy,” which requires natural language negotiation among 7 players. Beyond games, AI algorithms are also being developed and deployed with increasing success to accomplish or assist in high-stakes tasks including automobile driving, warplane operation, optimization of food production, and prediction of financial market outcomes.

Unsurprisingly, machine learning (ML), the branch of AI wherein computers learn from experience, is poised to transform medicine, too. Applications include image interpretation, documentation automation, and clinical decision support. One of the nearest-term medical applications is accurate treatment prognostication. In the field of female pelvic medicine and reconstructive surgery (FPMRS), multiple effective treatment options often exist for a given condition, and thus the specialty is well suited to ML implementation to improve counseling and treatment selection.

Overactive bladder (OAB) is a common constellation of symptoms including urinary urgency, frequency, and incontinence. Onabotulinumtoxin-A (OBTX-A) and sacral neuromodulation (SNM) are both effective options for medically refractory OAB, but their procedures and side effect profiles differ. Some patients have a clear preference or indication for one treatment over the other. For example, those with fecal incontinence may benefit from SNM (which has been approved for both urinary and fecal incontinence). However, the majority of patients with refractory OAB would be potential candidates for either OBTX-A or SNM. A subset of 16%-17% of patients who undergo OBTX-A or SNM does not respond to treatment.¹ The clinical identification of individuals who would respond vs not respond to a given OAB treatment remains elusive. Sparing a patient an unsuccessful therapeutic modality for OAB by preprocedural identification of nonresponders may result in lower health care expenditures, lower radiation exposure, fewer antibiotic doses, and improved outcomes.

ML methods were recently applied to predict treatment response to OBTX-A or SNM, and to compare those predictions with the predictions of clinicians with expertise in OAB diagnosis and management (Figure 1). Algorithms developed by our group were trained on the ROSETTA trial cohort,¹ and were accurate in predicting treatment response for both OBTX-A (area under the receiver operating characteristic curve [AUC] 0.95) and SNM (AUC 0.88) in a held-out subset of ROSETTA patients that the algorithms had not seen during training.² The algorithms were superior to expert clinicians in predicting OBTX-A response (Figure 2), and noninferior to clinicians in predicting SNM response (Figure 3). In a follow-up investigation, the algorithms demonstrated success in predicting patients’ perceived improvement in symptoms.³ One limitation of these studies was that the OBTX-A dose in the ROSETTA trial (200 U) was greater than the 100 U dose commonly used and recommended for nonneurogenic OAB.⁴ Hendrickson et al used a combination of data sets for training that included both the 200 U and 100 U OBTX-A regimens.⁵ They generated predictive models for treatment outcomes using a series of statistical techniques including ML. Interestingly, the group found that the higher dose was associated with shorter time to recurrence and less reduction in urge incontinence episodes. Algorithms for OAB medical therapy prognostication have also been reported. Sheyn et al generated algorithms to predict the likelihood of anticholinergic medication failure in individuals with OAB.⁶ Importantly, the group also validated their algorithm on a separate, prospectively collected data set from an outside institution, on which it achieved greater than 80% accuracy.
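For readers less familiar with the AUC values quoted above, the following is a minimal sketch of how such a figure can be computed on patients held out from training. It is not the code used in the cited studies. The function name `roc_auc`, the labels, and the scores are all hypothetical and chosen only for illustration. The sketch relies on the standard interpretation of AUC as the probability that a randomly chosen responder is ranked above a randomly chosen nonresponder.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    Returns the fraction of (responder, nonresponder) pairs in which the
    responder received the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one responder and one nonresponder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out patients: 1 = responded, 0 = did not respond.
held_out_labels = [1, 1, 1, 0, 1, 0, 0, 1]
# Hypothetical model-predicted probabilities of response for the same patients.
held_out_scores = [0.92, 0.81, 0.67, 0.70, 0.55, 0.30, 0.12, 0.88]

auc = roc_auc(held_out_labels, held_out_scores)
print(f"AUC on held-out patients: {auc:.2f}")
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect ranking, which is why evaluating on patients the algorithm has not seen is essential: scores computed on the training patients themselves would tend to overstate performance.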

Figure 1. The cover image of Neurourology & Urodynamics journal from March 2022 highlights the algorithms developed for the prediction of refractory overactive bladder treatment response. The cover image is based on the Clinical Article, Machine learning provides an accurate prognostication model for refractory overactive bladder treatment response and is noninferior to human experts by Glenn T. Werneburg et al (https://doi.org/10.1002/nau.24881).² The image, a balance between structure and randomness, was produced by the neural network described in the paper after training on uniform randomness. Cover Credit: Krein Space Interpolation Neural Network, Eric A. Werneburg, Glenn T. Werneburg. Reproduced with permission.

Figure 2. Machine learning algorithms accurately predict treatment response in individuals with medically refractory overactive bladder following onabotulinumtoxin-A (OBTX-A) injection and are superior to human experts. Algorithms include non-urodynamics (non-UDS) + urodynamics (UDS) variables (A), non-urodynamics only variables (B), and urodynamics only variables (C). Receiver operating characteristic curves are shown. Gold indicates the machine learning algorithm prediction, and blue and purple indicate the human expert predictions. The line of ignorance, which corresponds to random guessing, is indicated in red. Treatment response was defined as a greater than 50% reduction in urge urinary incontinence episodes on a voiding diary following treatment. Areas under the curves (AUC) and confidence intervals (CI) are shown in the legends. Reproduced with permission from Werneburg et al, Neurourol Urodyn. 2022;41(3):813-819.²

Figure 3. Machine learning algorithms accurately predict treatment response in individuals with medically refractory overactive bladder following sacral neuromodulation (SNM) and are noninferior to human experts. Algorithms include non-urodynamics (non-UDS) + urodynamics (UDS) variables (A), non-urodynamics only variables (B), and urodynamics only variables (C). Receiver operating characteristic curves are shown. Gold indicates the machine learning algorithm prediction, and blue and purple indicate the human expert predictions. The line of ignorance, which corresponds to random guessing, is indicated in red. Treatment response was defined as a greater than 50% reduction in urge urinary incontinence episodes on a voiding diary following treatment. Areas under the curves (AUC) and confidence intervals (CI) are shown in the legends. Reproduced with permission from Werneburg et al, Neurourol Urodyn. 2022;41(3):813-819.²
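The response definition given in the captions for Figures 2 and 3 (a greater than 50% reduction in urge urinary incontinence episodes on a voiding diary) can be written as a simple labeling rule. The sketch below is illustrative only. The function names and the example episode counts are hypothetical, and the cited studies may have handled edge cases such as a baseline of zero episodes differently.

```python
def percent_reduction(baseline_episodes: float, follow_up_episodes: float) -> float:
    """Percent reduction in diary-recorded episodes from baseline to follow-up."""
    if baseline_episodes <= 0:
        # Assumption: a reduction is undefined without baseline episodes.
        raise ValueError("baseline episode count must be positive")
    return 100.0 * (baseline_episodes - follow_up_episodes) / baseline_episodes


def is_responder(
    baseline_episodes: float,
    follow_up_episodes: float,
    threshold_pct: float = 50.0,
) -> bool:
    """True when the reduction is strictly greater than the threshold."""
    return percent_reduction(baseline_episodes, follow_up_episodes) > threshold_pct


# Hypothetical diary counts (episodes per day) before and after treatment.
print(is_responder(6, 2))  # about a 67% reduction
print(is_responder(6, 3))  # exactly 50%, which does not exceed the threshold
```

Binary labels of this kind are what the receiver operating characteristic curves are computed against.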

These compelling studies also highlight the need for additional work. In most of the investigations, the patient populations were homogeneous with well-defined inclusion and exclusion criteria. In reality, patients with OAB are a heterogeneous group with varied symptoms and histories, and different data obtained during workup. Further investigation is needed to determine whether results are generalizable beyond the training and validation cohorts used in each study. The most realistic challenge, necessary to determine true generalizability and clinical utility, is to task human experts and algorithms with making predictions regarding outcomes for patients in the prospective clinical setting. Here, expert humans would take a targeted history and obtain ancillary data according to clinical judgment, and make predictions in the context of their typical practice patterns.

ML is being effectively used to solve other problems in FPMRS as well. Two groups have recently shown the utility of ML for the identification of detrusor overactivity on urodynamics tracings. Wang et al trained algorithms on 799 traces from pediatric patients, and demonstrated good performance on 5-fold cross-validation of their training data (AUC 0.84).⁷ Hobbs et al trained their algorithms on a series of 805 urodynamics tracings, also in a pediatric population, and demonstrated high accuracy in the prediction of detrusor overactivity based on the tracings in their validation set (AUC 0.92).⁸ Notably, the most common false-positive results were movement related, and thus such algorithms may be found to have even greater accuracy in the adult population.
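As a rough illustration of the 5-fold cross-validation mentioned above, the sketch below partitions a set of traces into 5 folds so that each trace is held out exactly once while the algorithm is trained on the remaining four folds. This is a generic sketch, not the procedure of either cited study. The function name `k_fold_indices` and the random seed are hypothetical, and the count of 799 is borrowed from the text only to make the fold sizes concrete.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, k: int = 5, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, held_out_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out


splits = list(k_fold_indices(799, k=5))
for fold_number, (train, held_out) in enumerate(splits, start=1):
    # A model would be fit on `train` and scored on `held_out` here.
    print(f"fold {fold_number}: train={len(train)} held_out={len(held_out)}")
```

The per-fold scores are typically averaged to give a single estimate. When several traces come from the same patient, splitting by patient rather than by trace avoids the same patient appearing on both sides of a fold. The text above does not state how the folds were formed in the cited work.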

Weaver et al recently reviewed the literature on urodynamics predictors of upper tract deterioration, and noted several shortcomings.⁹ They called for utilization of ML for classification of patients with spina bifida into those with preserved vs decompensated bladder function. They have an ongoing investigation regarding use of deep learning on urodynamics for prediction of hydronephrosis and chronic kidney disease in patients with spina bifida.

ML approaches have the potential to optimize diagnosis and treatment regimens for complex functional conditions including interstitial cystitis, neurogenic lower urinary tract dysfunction, and recurrent urinary tract infections. ML could have utility in classifying lower urinary tract symptoms.¹⁰ Precision treatment based on symptom classification may translate to improved outcomes, and is an exciting future area for investigation in FPMRS. We believe that ML will soon effectively integrate available clinical and laboratory data together with genotypic data, and make accurate predictions across conditions and therapeutic modalities. Such capabilities will be applicable beyond FPMRS and urology. They will transform diverse areas throughout medicine by optimizing diagnosis and management, and reducing health care expenses for patients and society.

1. Amundsen CL, Richter HE, Menefee SA, et al. OnabotulinumtoxinA vs sacral neuromodulation on refractory urgency urinary incontinence in women: a randomized clinical trial. JAMA. 2016;316(13):1366-1374.
2. Werneburg GT, Werneburg EA, Goldman HB, Mullhaupt AP, Vasavada SP. Machine learning provides an accurate prognostication model for refractory overactive bladder treatment response and is noninferior to human experts. Neurourol Urodyn. 2022;41(3):813-819.
3. Werneburg GT, Werneburg EA, Goldman HB, Mullhaupt AP, Vasavada SP. Neural networks outperform expert humans in predicting patient impressions of symptomatic improvement following overactive bladder treatment. Int Urogynecol J. 2022;https://doi.org/10.1007/s00192-022-05291-6.
4. Lightner DJ, Gomelsky A, Souter L, Vasavada SP. Diagnosis and treatment of overactive bladder (non-neurogenic) in adults: AUA/SUFU guideline amendment 2019. J Urol. 2019;202(3):558-563.
5. Hendrickson WK, Xie G, Rahn DD, et al. Predicting outcomes after intradetrusor onabotulinumtoxinA for non-neurogenic urgency incontinence in women. Neurourol Urodyn. 2022;41(1):432-447.
6. Sheyn D, Ju M, Zhang S, et al. Development and validation of a machine learning algorithm for predicting response to anticholinergic medications for overactive bladder syndrome. Obstet Gynecol. 2019;134(5):946-957.
7. Wang HHS, Cahill D, Panagides J, Nelson CP, Wu HT, Estrada C. Pattern recognition algorithm to identify detrusor overactivity on urodynamics. Neurourol Urodyn. 2021;40(1):428-434.
8. Hobbs KT, Choe N, Aksenov LI, et al. Machine learning for urodynamic detection of detrusor overactivity. Urology. 2022;159:247-254.
9. Weaver J, Weiss D, Aghababian A, et al. Why are pediatric urologists unable to predict renal deterioration using urodynamics? A focused narrative review of the shortcomings of the literature. J Pediatr Urol. 2022;18(4):493-498.
10. Dallas KB, Chiang JN, Caron AT, Anger JT, Kaufman MR, Ackerman AL. Development and validation of machine learning algorithms to classify lower urinary tract symptoms. medRxiv. 2022;https://doi.org/10.1101/2022.12.25.22283168.
