Jung In Park, University of California, Irvine – Survival Machine Learning for Breast Cancer Patients

On University of California, Irvine Week: Racial disparities can exist even in machine learning.

Jung In Park, assistant professor in the Sue & Bill Gross School of Nursing, finds a way to tackle biases and make a difference for vulnerable populations.

Dr. Jung In Park has been an assistant professor at UCI’s school of nursing since 2019. Her research focuses on biomedical informatics, using large datasets and AI/machine learning approaches to provide scientific evidence for predicting patient outcomes. Prior to UCI, Dr. Park was a postdoctoral scholar in Biomedical Informatics at Stanford University. Dr. Park received her Ph.D. in Nursing Informatics from the University of Minnesota and her B.S. in Nursing from Seoul National University.

Survival Machine Learning for Breast Cancer Patients

Survival machine learning is widely accepted as a useful approach for forecasting future events. But there is one caveat – machine learning models can potentially create racial disparities through the data used to train them.

In healthcare, machine learning can be used to predict the survival outcomes of patients, which is a critical component of treatment. Knowing the likely survival outcomes helps practitioners choose proper treatment options and appropriate cancer care.

My team and I conducted this study to develop race- and ethnicity-specific survival machine learning models for Hispanic and Black women diagnosed with breast cancer. The goal was to examine whether race- and ethnicity-specific machine learning models outperform a general model trained on data from all racial and ethnic groups.

Using data from the National Cancer Institute’s cancer registries, we developed Hispanic-specific and Black-specific models and compared them with the general model, using the Cox proportional-hazards model, gradient-boosted trees, survival trees and survival support vector machines.

Our analysis included over 300,000 female patients who were diagnosed with breast cancer between 2000 and 2017. When comparing the race- and ethnicity-specific models to the general model, we found that the Hispanic-specific and Black-specific models outperformed the general model when predicting outcomes for their respective groups.
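To make this kind of comparison concrete, here is a minimal sketch of one common way to score survival models, the concordance index (Harrell's C). The segment does not say which metric the study used, so the metric is an assumption. The patient values and risk scores are made-up toy numbers, not study data, and the names `general_risk` and `specific_risk` are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable patient pairs in which the patient
    with the shorter observed survival was given the higher predicted risk."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped to keep the sketch simple
        # a is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy subgroup (illustrative values only): follow-up time in months and
# whether the event was observed (1) or the record was censored (0).
times = [5, 8, 12, 20, 30]
events = [1, 1, 0, 1, 0]

# Hypothetical risk scores from a general model and a group-specific model.
general_risk = [0.9, 0.4, 0.6, 0.5, 0.1]
specific_risk = [0.9, 0.8, 0.6, 0.5, 0.1]

c_general = concordance_index(times, events, general_risk)
c_specific = concordance_index(times, events, specific_risk)
print(f"general model C-index:  {c_general:.2f}")
print(f"specific model C-index: {c_specific:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so in this toy setup a higher score for the group-specific model on that group's patients would be read as better performance for that group.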

Predicting the individualized survival outcomes of breast cancer patients can further provide the evidence needed for identifying treatment options and high-quality cancer care for minority populations. Additionally, race- and ethnicity-specific machine learning models can help tackle representation bias and contribute to mitigating health disparities.