AI as a complementary diagnostic tool for women’s health


Professor Aftab Ahmad

CUNY John Jay College of Criminal Justice, New York NY, USA

Summary

In this article, we review current research in artificial intelligence (AI), especially deep learning algorithms, that can be instrumental in providing research data, privacy measures, early warning, detection, and management of breast cancer. AI can complement the work of radiologists, oncologists, caregivers, and researchers, among others. With the proposed combination of cloud storage and AI, the fight against a life-threatening disease like breast cancer can get much-needed assistance: in helping patients, providers, hospitals, and researchers handle data, in protecting patients’ privacy, and even in shortening the time it takes for research findings to loop back to the patient.

Introduction

It is fair to say that medical research has been biased in its emphasis on gender-related health issues [18]. While it is easy to attribute this bias to the predominance of men in research funding, the fact remains that women's representation in medical research is lower than their share of the population [23]. Studies of gender disparities clearly reveal gaps not only in women's access to same-gender physicians but also in receiving the same diagnosis for the same symptoms, which can lead to denial of social security benefits for health-related disabilities [4]. The good news is that the number of female physicians continues to rise, especially in developed countries [21]. There is awareness of the funding gap, and there are research initiatives on women's health, particularly in breast cancer, but the results are neither very encouraging nor applicable to women at all stages of life. Among menopausal women, mortality due to breast cancer has not decreased, as reported by the Women's Health Initiative (WHI) [5]. With artificial intelligence (AI) making ripples in every field of life, one may wonder whether women's health will receive a share of its benefits that makes up for this traditional neglect. Studies have already been conducted on AI chatbots, for example on the suitability of two versions of ChatGPT (GPT-3.5 and GPT-4) as a source of patient information for screening mammography [26]. While that study finds that both versions fell short of the gold standard, we leave this question open and instead address the more general question: what can be done to put AI in the service of women's health? There is an underlying assumption here that AI is beneficial. The reason for this assumption is that AI, particularly deep learning, has passed the initial test of its usability through the success of large language models (LLMs) [22]. What really needs to be determined is how it can benefit humanity in general, and women's health in particular.
In this article, the focus is on the application of deep learning as a complementary diagnostic tool for women's health, with breast cancer as the example.

In women, breast cancer constitutes 30% of all cancer diagnoses, making it the most prevalent form of cancer. A large number of women get regular mammograms because of family history. Sometimes controversies arise due to false positives [3,27], variation in the detection abilities and interpretations of physicians [7], the inability of computers to provide an accurate diagnosis [8], or outright doubts about the efficacy of mammograms in correctly predicting the onset of breast cancer [29], for any of the four reasons cited in that paper. Deep learning may already have advanced to the point that it can help in early and accurate diagnosis.

Of research, privacy, and early warning

In this section, we discuss the continuing need for mammogram data in research, the problem of possible breaches of patient privacy, and the need for warning systems that alert potential patients and their providers to new research findings. In the subsequent sections, we discuss how deep learning can contribute as an assistive technology by helping in research, protecting patient privacy, and generating alerts that get early attention while minimizing false positives.

Research data for breast cancer

Data for breast cancer research can be collected in many ways: through a general trial; from specific groups, in which women of a given race, with given symptoms, or with a family history participate in the study; or even from an individual, in which case mammograms of the same person are used for research into the precursors, onset, detection, and prognosis of the disease. Data centers can be used for sharing data with researchers. A breach of data from the general public or from a specific population group can leak personally identifiable information (PII) unless the data is properly anonymized. Differential privacy is a state-of-the-art privacy protection technology that has been proposed for genome data [13]. However, implementations of differential privacy are not immune to attacks [11]. For a single patient, it may be less challenging to protect the PII, but the amount of data may fall well short of what is needed to draw accurate conclusions, as reported in the individual participant data analysis for a study on the efficacy of digital breast tomosynthesis (DBT) in [17]. For each of the three cases, deep learning brings something new to the diagnostic methods and analysis. Thus AI, in collaboration with cloud storage and cloud computing (a relatively mature technology), delivers solutions that make it easy for patients and physicians to handle data, automate the supply of research data with embedded privacy measures, and create a warning system for immediate notification of the relevant parties. In the next section, we present a scheme for integrating these technologies and identify the points at which AI comes in.
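
To make the idea of differential privacy mentioned above more concrete, here is a minimal sketch of the Laplace mechanism applied to a counting query. It is an illustration only and not the method of [13]; the record fields, the helper names (laplace_noise, private_count), and the value of epsilon are all assumptions made for this example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Zero-mean Laplace sample, drawn as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Release a count with epsilon-differential privacy via the Laplace mechanism."""
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0  # adding or removing one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical, made-up records; the field names are illustrative only.
records = [
    {"age": 47, "recalled": False},
    {"age": 52, "recalled": True},
    {"age": 58, "recalled": False},
    {"age": 61, "recalled": True},
    {"age": 66, "recalled": True},
]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
noisy = private_count(records, lambda r: r["recalled"], epsilon=0.5, rng=rng)
print(f"noisy count of recalled patients: {noisy:.2f}")
```

A smaller epsilon adds more noise and gives stronger privacy at the cost of accuracy; a researcher sees only the noisy answer, never the individual records.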

An AI-native, cloud-based data system for research and warning generation

AI is nothing new. In fact, the Association for the Advancement of Artificial Intelligence (AAAI), founded in 1979 as the American Association for Artificial Intelligence, has long been active through contributions to original research, AI applications, and education. What has changed in the 21st century is a boost in processing hardware and the introduction of new algorithms in the realm of artificial neural networks (ANNs), of which deep learning is a subset. Three types of ANN algorithms have particularly revolutionized the field: the generative adversarial network (GAN), popularly known from deepfakes; the variational autoencoder (VAE), used to detect anomalies (outliers) in datasets; and the transformer, used in natural language processing (NLP). These and other algorithms have made it possible to make AI integral to the design of medical data systems, be it for the storage of electronic medical records (EMRs), privacy provision in data centers, the design of alarm systems for the onset of diseases, or diagnosis. Figure 1 shows a schematic in which AI-native systems can serve multiple parties, including patients, physicians, researchers, and public health officials.

As seen in Figure 1, data-originating facilities (hospitals, labs, clinics), patients, and researchers can all securely upload patient data to the cloud and access it there. Deep learning can be applied to the stored data to generate synthetic data (the so-called fake data). The synthetic data has the same statistical properties as the original data but contains no personally identifiable information (PII). This synthetic data can be shared with researchers. Patients and providers can access the actual data on the cloud, which is not shared with researchers; this protects the data from privacy breaches and eliminates the need to store data on portable media, such as a USB drive or a CD. Those who use the data for research may be required to upload their new findings back to the cloud storage as payback for using the data. Alarm systems can then inform providers about the potential impact of the new findings relevant to their patients, either automatically or through secure access. In this schematic, (i) the data is always available to patients, and all they need to do when changing providers is share their unique patient ID with the next provider, (ii) PII is never disclosed to data users who need the data for research, and (iii) the latest research results loop back to the providers and, through them, to patients. Figure 1 also shows exactly where the AI algorithms will be used in this arrangement.
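
As a toy illustration of what "same statistical properties, no original records" means, the sketch below fits a two-variable Gaussian to a handful of made-up rows and samples new rows from it. This is a deliberately simple stand-in for a deep generative model, and a fitted Gaussian by itself is not a privacy guarantee. The two columns (age and a hypothetical density score), the data values, and the function names are assumptions made for this example.

```python
import math
import random
import statistics


def fit_bivariate_gaussian(rows):
    """Estimate means, standard deviations, and correlation of two columns."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / len(rows)
    return mx, my, sx, sy, cov / (sx * sy)


def sample_synthetic(params, n, rng):
    """Draw n synthetic rows that follow the fitted means, spreads, and correlation."""
    mx, my, sx, sy, rho = params
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mx + sx * z1
        y = my + sy * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        rows.append((x, y))
    return rows


# Made-up (age, hypothetical density score) pairs standing in for real records.
real_rows = [(45, 0.62), (50, 0.55), (55, 0.50), (60, 0.41),
             (65, 0.38), (70, 0.30), (48, 0.60), (58, 0.47)]

params = fit_bivariate_gaussian(real_rows)
synthetic_rows = sample_synthetic(params, 5000, random.Random(0))
print("real      :", [round(p, 3) for p in params])
print("synthetic :", [round(p, 3) for p in fit_bivariate_gaussian(synthetic_rows)])
```

Researchers would receive only the synthetic rows, whose summary statistics track those of the real rows, while the real rows stay with patients and providers.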

Relevance to mammograms

The system depicted in Figure 1 is particularly beneficial for working with mammograms, for several reasons. First, a large number of women are advised to get mammograms on a yearly basis, thus generating a series of data spread over many years. For such a year-after-year series of mammograms, the proposed system can automatically look for precursors of cancer and, using AI-based anomaly detection algorithms, raise a warning before it is too late for the patient. It must be noted that more research is needed on determining the onset of an anomaly from a precursor dataset, but the good news is that research in this area is ongoing. See, for example, [12] for efforts in this direction and their applications in various fields. In that paper, the authors present deep learning solutions for irregular time series data, which may in fact be a good fit for mammograms because of the variation that occurs with aging. What cloud storage provides is cumulative data from a large number of subjects, and all this (synthetic) data can be fed to a single AI model to draw general conclusions. Secondly, in the case of a single subject, deep learning provides methods to augment data by applying various techniques, as pointed out in [31]. These include commonly used techniques, such as regenerating data using GANs, rotating, scaling, flipping, and cropping, as well as customizable algorithms tailored to given datasets (for example, augmenting part of an image instead of the whole image, focusing on various resolutions, etc.). One benefit of automating the data feed to research is that one can first use all the data to reach a conclusion and then apply the single subject’s augmented data to determine how well the multiple-subject inference holds for a single subject. In these two scenarios, the use of synthetic data can completely mask private information.
Thirdly, the success rate of cancer therapies can be determined by storing therapeutic techniques alongside their results as part of cloud storage. As pointed out in [28], at each stage there are multiple treatment choices. With stored datasets (images, biomarkers, techniques, frequencies, etc.) combined with subject-specific data (age, race, family history, other biomarkers), AI techniques can suggest a customized treatment plan for a given set of conditions. Lastly, it is not easy for a patient to manage many years of mammograms, both because of the inconvenience and because people now move more often for jobs and other reasons. With their mammograms accessible on the cloud, women do not have to deal with safekeeping and complex permission procedures. All they have to do is provide a special access key designed specifically for sharing with providers. The provider, by combining this access key with their own credentials, can access the images easily and securely.
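
The simple geometric augmentations listed above (flipping, rotating, cropping) can be sketched in a few lines. The example treats a grayscale image as a list of rows of pixel values; the function names are assumptions made for this illustration, scaling and GAN-based regeneration are left out, and a real pipeline would normally use an image library instead.

```python
def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def rotate_90_clockwise(img):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def crop(img, top, left, height, width):
    """Cut out a height x width window whose upper-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def augment(img):
    """Return several altered copies of one image to enlarge a small dataset."""
    return [flip_horizontal(img), flip_vertical(img), rotate_90_clockwise(img)]


# A tiny 2 x 3 "image" with made-up pixel values.
image = [[1, 2, 3],
         [4, 5, 6]]
for variant in augment(image):
    print(variant)
```

Each variant keeps the content of the original image while changing its orientation, which is what lets a model trained on a single subject's few images see more varied examples.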

Next, we discuss the relevant breakthroughs in deep learning.

Leading relevant deep learning algorithms

As mentioned earlier, hardware, in particular the graphics processing unit (GPU), and the aforementioned new algorithms are vital to the revolutionary surge in AI applications. These algorithms can be categorized into (i) generative AI, (ii) the variational autoencoder (VAE), and (iii) the transformer. VAEs are generative models too, but we keep them in a separate class because of their ability to detect outliers. Here is a brief description of these algorithms and their relevance to the topic of this article.

Generative AI for privacy and data augmentation

Generative AI has been made popular by generative adversarial networks (GANs), but it is not limited to them. The GAN was introduced by Goodfellow et al. [10] (see also the later version in [9]), who demonstrated that a target image can be generated from noise by repeatedly reducing the error in the noise-generated image through comparison with real images (see Figure 2 for an illustration).

As seen in Figure 2, a GAN is a cascade of two algorithms: a generator, which initially produces a random image from noise, and a discriminator, which compares generated images with real images and expresses how well it can tell them apart as a loss. In turn, the generator uses this loss to improve the generated images repeatedly until the discriminator can no longer tell the difference between the real and the generated images. In a diffusion model, another generative approach, the real data is converted into noise by adding a known amount of noise at each time step of the algorithm. Once the data looks like pure noise, a learned reverse process (denoising) can regenerate data resembling the original. For mammograms, the data consists of images with some important properties: they are grayscale, and the images from successive years must be treated as a time series, which makes the data a multivariate time series. To provide privacy to patients, the original data is used in a generative setup to produce synthetic data with the same statistical properties as the original data but without any personally identifiable information. Consequently, synthetic data can be used in any analysis other than inspection by the human eye, while a provider, such as a radiologist, can access the real data to confirm the results. In this way, generative AI can be employed both to protect patient privacy and to generate synthetic data for research. Synthetic data algorithms can also be used for data augmentation, as described in [6]. Work on both aspects of generative AI for breast cancer research is ongoing. See, for example, [19] for its application in data augmentation and [1] for the application of synthetic data in breast cancer research.
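
The tug-of-war between generator and discriminator described above can be written as two loss functions. The sketch below computes the standard GAN objectives from the discriminator's output probabilities only; it does not train any network, the function names are assumptions, and the generator loss shown is the commonly used non-saturating variant.

```python
import math

EPS = 1e-12  # guards against log(0)


def discriminator_loss(d_real, d_fake):
    """Loss the discriminator minimizes.

    d_real: probabilities it assigns to real images being real.
    d_fake: probabilities it assigns to generated images being real.
    """
    real_term = -sum(math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: small when fakes are judged to be real."""
    return -sum(math.log(p + EPS) for p in d_fake) / len(d_fake)


# Early in training: the discriminator easily spots the fakes.
print(discriminator_loss([0.95, 0.90], [0.05, 0.10]), generator_loss([0.05, 0.10]))

# At the ideal end point: the discriminator can only guess (probability 0.5).
print(discriminator_loss([0.5, 0.5], [0.5, 0.5]), generator_loss([0.5, 0.5]))
```

When the discriminator is reduced to guessing, its loss settles at 2 ln 2 and the generator's at ln 2, which corresponds to the point where real and generated images are indistinguishable.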

Deep learning for anomaly detection (diagnosis)

If a deep neural network is trained on a specific data type simply to reproduce the input data on which it is trained, interesting things can happen. Such an algorithm, called an autoencoder (AE), has many applications in recognizing new data in the form of handwritten text, audio-visual data, and images. It can also identify anomalies by indicating that the input data is slightly different from what it was trained on. This ability can be generalized by modeling the statistical properties of the input data and using generated probabilistic data to match the training data. Such an algorithm can tell that something is abnormal in the input data. One example is a VAE model trained on normal mammograms, so that it raises a red flag when it is given abnormal mammograms. The variational autoencoder (VAE) was shown to be a valuable generative model by Kingma and Welling (see, for example, [14]) and can be instrumental in identifying outliers in a dataset. Because of its utility in many areas, work on VAEs has been extended substantially, and it is also the subject of research on mammogram analysis, such as that reported in [32]. Figure 3 shows how a generic VAE algorithm for anomaly detection works.

After the algorithm has been trained on normal data, the distribution it has learned matches the characteristics of that data, and normal inputs are reconstructed with little error. When it is given abnormal data, the match fails, and data points outside the ‘normal’ range can be identified. This can lead to the detection of abnormalities in mammograms.
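
The detection principle described above, train on normal data and then flag inputs that the model reconstructs poorly, can be illustrated without a full VAE. The sketch below swaps in a much simpler model, a linear autoencoder with a one-dimensional bottleneck (equivalent to keeping the first principal component), fitted by power iteration. The four-feature toy data, the threshold rule, and all function names are assumptions made for this example; it is not the method of [14] or [32].

```python
import math
import random


def fit_linear_autoencoder(data, iters=50):
    """Fit a 1-D linear autoencoder (first principal component) by power iteration."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    w = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        # Multiply w by the covariance matrix: (1/n) * X^T (X w).
        proj = [sum(r[j] * w[j] for j in range(d)) for r in centered]
        new_w = [sum(proj[i] * centered[i][j] for i in range(n)) / n for j in range(d)]
        norm = math.sqrt(sum(v * v for v in new_w))
        w = [v / norm for v in new_w]
    return mean, w


def reconstruction_error(x, mean, w):
    """Encode x to one number, decode it back, and return the squared error."""
    c = [xj - mj for xj, mj in zip(x, mean)]
    z = sum(cj * wj for cj, wj in zip(c, w))   # encoder: project onto w
    recon = [z * wj for wj in w]               # decoder: map back to input space
    return sum((cj - rj) ** 2 for cj, rj in zip(c, recon))


# Toy "normal" data: every sample is a multiple of one pattern plus small noise.
rng = random.Random(0)
pattern = [1.0, 2.0, 3.0, 4.0]
normal_data = []
for _ in range(200):
    t = rng.gauss(0.0, 1.0)
    normal_data.append([t * p + rng.gauss(0.0, 0.05) for p in pattern])

mean, w = fit_linear_autoencoder(normal_data)
# Assumed rule: anything reconstructed worse than the worst training sample is flagged.
threshold = max(reconstruction_error(x, mean, w) for x in normal_data)

for sample in ([0.5, 1.0, 1.5, 2.0], [4.0, 3.0, 2.0, 1.0]):
    err = reconstruction_error(sample, mean, w)
    print(sample, "anomaly" if err > threshold else "normal")
```

A VAE replaces the linear projection with neural networks and a probabilistic latent space, but the final step, comparing a reconstruction-based score against a threshold learned from normal data, follows the same idea.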

Detection versus prediction

Detecting data outliers is an old and mature technique that does not owe its existence to AI or deep learning (see, for example, [15] for a nice presentation on the topic). Deep learning holds the promise of very high accuracy and fewer false positives. Besides, research is heading towards using precursors instead of easily detectable outliers, not just in mammograms but also in proteins and other molecular markers, including genetic data. Early warning could be made possible by analyzing datasets and training VAEs to identify the possibility of onset, something on which a number of women invest a lot of time and money. See, for example, [2] for an account of various factors possibly linked to breast cancer that can assist in creating precursor datasets. Having mammograms and genetic and other molecular biomarkers stored as synthetic data in the cloud and channeled into ongoing research can ultimately provide the fuel needed for VAE algorithms to produce results and send early warnings to providers and patients, so that treatments can be planned before the onset of cancer.
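
For contrast with the deep learning approach, here is a minimal sketch of a classical outlier test of the kind referred to above: the modified z-score based on the median and the median absolute deviation, with the commonly used cutoff of 3.5. The yearly measurement values and the function name are made up for this illustration.

```python
import statistics


def modified_z_outliers(values, cutoff=3.5):
    """Return the indices of values whose modified z-score exceeds the cutoff."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread around the median, so the score is undefined
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > cutoff]


# Made-up yearly readings of some measurement; the sixth one stands out.
yearly_readings = [10.1, 9.8, 10.3, 10.0, 9.9, 14.5, 10.2]
print(modified_z_outliers(yearly_readings))
```

Such a test only reacts once a value is already far from the rest, which is why the text above stresses precursors: the aim is to raise the flag before an obvious outlier appears.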

Autogenerated reports

The third major algorithm mentioned above, the transformer, has led to the advances in large language models (LLMs) that have already made AI a household name. All major AI companies offer transformer-based application programming interfaces (APIs) that are used in a number of natural-language applications. The transformer got a big boost from the simple idea that word and sub-word contexts are very important in natural languages [30], which resulted in modern LLMs. We are heading towards applying LLMs to individual sectors, including healthcare [20]. The ultimate impact will be AI-based reporting systems that are customized to individual subjects and that create alerts and on-demand reports for doctors and patients in any language. Work is ongoing specifically for breast cancer applications in this regard, as reported in [16, 24, 25].
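
The idea that each word's meaning depends on its context is implemented in the transformer [30] by scaled dot-product attention. The sketch below computes it for three toy token vectors in plain Python; the vectors and function names are assumptions made for this illustration, and the learned projection matrices and multiple heads of a real transformer are omitted.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        mixed = [sum(wt * v[j] for wt, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Three toy token vectors; in self-attention they act as queries, keys, and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
for row in weights:
    print([round(x, 3) for x in row])
```

Each row of weights shows how much one token draws on every token in the sequence, so the output for a token is a context-dependent blend rather than a fixed representation.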

Final word

The application of AI, particularly deep learning, in healthcare is unstoppable because of its enormous potential. The question is, ‘Can we make sure that women’s health is not ignored this time around?’ The awareness of the need to invest in research on women’s health, coupled with the fact that more and more women are joining the medical professions, means that it may be impossible to ignore their health. Granted that, how beneficial can AI technology be in providing relief to women in handling their data, keeping their privacy, and receiving timely warnings from new research? We contend through this article that all the components are in place for researchers, funding agencies, and governments to invest in women’s health through AI developments. We took the example of breast cancer, the leading cause of cancer death among women worldwide and the second leading cause in the USA. The goal was to show specific deep learning capabilities: providing secure access to data through the cloud, protecting patients’ privacy through generative AI, providing early warnings using precursor-based research, and offering detection capability along with the convenience of auto-generated reports based on natural language models. There is enough research work already in practice for investors and governments to fund AI for this purpose, and enough business potential for the healthcare industry to rise to the occasion.

Despite the recent strides in machine intelligence, human intelligence is irreplaceable. AI is an assistive technology with far-reaching benefits and its timely application can be a life saver. It cannot, however, replace human intelligence and the need for qualified supervision by a human is always going to be there.

Acknowledgment

I would like to express my sincere gratitude to Sharif Ahmad, Asad Khan and Dr. Qaiser Qayyum for their thoughtful review and invaluable feedback. Their expertise and guidance have played a pivotal role in refining and enhancing this article.

References

[1] Aytar, B., & Gündüç, S. (2024). Generation of Synthetic Data Using Breast Cancer Dataset and Classification with Resnet18. Karaelmas Fen ve Mühendislik Dergisi, 14(3), 74-85.

[2] Berlin, L., & Hall, F. M. (2010). More mammography muddle: emotions, politics, science, costs, and polarization. Radiology, 255(2), 311-316.

[3] Brewer, N. T., Salz, T., & Lillie, S. E. (2007). Systematic review: the long-term effects of false-positive mammograms. Annals of internal medicine, 146(7), 502-510.

[4] Cabral, M., & Dillender, M. (2024). Gender Differences in Medical Evaluations: Evidence from Randomly Assigned Doctors. American Economic Review, 114(2), 462-499.

[5] Chlebowski, R. T., Aragaki, A. K., Pan, K., Simon, M. S., Neuhouser, M. L., Haque, R., ... & Manson, J. E. (2024). Breast cancer incidence and mortality by metabolic syndrome and obesity: The Women’s Health Initiative. Cancer.

[6] Dijkstra, R. (2024). The Effects of Data Augmentation and Synthetic Data in Breast Cancer Detection.

[7] Elmore, J. G., Wells, C. K., Lee, C. H., Howard, D. H., & Feinstein, A. R. (1994). Variability in radiologists' interpretations of mammograms. New England Journal of Medicine, 331(22), 1493-1499.

[8] Ganesan, K., Acharya, U. R., Chua, C. K., Min, L. C., Abraham, K. T., & Ng, K. H. (2012). Computer-aided breast cancer detection using mammograms: a review. IEEE Reviews in biomedical engineering, 6, 77-98.

[9] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., ... & Bengio, Y. (2020). Generative adversarial networks. Communications of the ACM, 63(11), 139-144.

[10] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., ... & Bengio, Y. (2014). Generative adversarial nets. Advances in neural information processing systems, 27.

[11] Haeberlen, A., Pierce, B. C., & Narayan, A. (2011). Differential privacy under fire. In 20th USENIX Security Symposium (USENIX Security 11).

[12] Jhin, S. Y., Lee, J., & Park, N. (2023, August). Precursor-of-anomaly detection for irregular time series. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 917-929).

[13] Johnson, A., & Shmatikov, V. (2013, August). Privacy-preserving data exploration in genome-wide association studies. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 1079-1087).

[14] Kingma, D. P., & Welling, M. (2019). An introduction to variational autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307-392.

[15] Kriegel, H. P., Kröger, P., & Zimek, A. (2010). Outlier detection techniques. Tutorial at KDD, 10, 1-76.

[16] Lee, J. J., Zepeda, A., Arbour, G., Isaac, K. V., Ng, R. T., & Nichol, A. M. (2024). Automated Identification of Breast Cancer Relapse in Computed Tomography Reports Using Natural Language Processing. JCO Clinical Cancer Informatics, 8, e2400107.

[17] Libesman, S., Zackrisson, S., Hofvind, S., Seidler, A. L., Bernardi, D., Lång, K., ... & Houssami, N. (2022). An individual participant data meta-analysis of breast cancer detection and recall rates for digital breast tomosynthesis versus digital mammography population screening. Clinical Breast Cancer, 22(5), e647-e654.

[18] Merone, L., Tsey, K., Russell, D., & Nagle, C. (2022). Sex inequalities in medical research: a systematic scoping review of the literature. Women's Health Reports, 3(1), 49-59.

[19] Moreno-Barea, F. J., Jerez, J. M., Ribelles, N., Alba, E., & Franco, L. (2024, June). Data Augmentation to Improve Molecular Subtype Prognosis Prediction in Breast Cancer. In International Conference on Computational Science (pp. 19-27). Cham: Springer Nature Switzerland.

[20] Nerella, S., Bandyopadhyay, S., Zhang, J., Contreras, M., Siegel, S., Bumin, A., ... & Rashidi, P. (2024). Transformers and large language models in healthcare: A review. Artificial Intelligence in Medicine, 102900.

[21] Pickel, L., & Sivachandran, N. (2024). Gender trends in Canadian medicine and surgery: the past 30 years. BMC Medical Education, 24(1), 100.

[22] Raiaan, M. A. K., Mukta, M. S. H., Fatema, K., Fahad, N. M., Sakib, S., Mim, M. M. J., ... & Azam, S. (2024). A review on large Language Models: Architectures, applications, taxonomies, open issues and challenges. IEEE Access.

[23] Rankin, J., Bedrava, J., Covington, E., Johnson, J. L., Pollard-Larkin, J., Schipper, M. J., ... & Paradis, K. C. (2024). Women in the Medical Physics Workforce: Insights from Membership Trends of the American Association of Physicists in Medicine, 1993 to 2023. International Journal of Radiation Oncology, Biology, Physics.

[24] Sorin, V., Glicksberg, B. S., Barash, Y., Konen, E., Nadkarni, G., & Klang, E. (2023). Applications of Large Language Models (LLMs) in Breast Cancer Care. medRxiv, 2023-11.

[25] Solarte-Pabón, O., Montenegro, O., García-Barragán, A., Torrente, M., Provencio, M., Menasalvas, E., & Robles, V. (2023). Transformers for extracting breast cancer information from Spanish clinical narratives. Artificial Intelligence in Medicine, 143, 102625.

[26] Spuur, K., Currie, G., Al-Mousa, D., & Pape, R. (2024). Suitability of ChatGPT as a Source of Patient Information for Screening Mammography. Health Promotion Practice, 15248399241285060.

[27] Tosteson, A. N., Fryback, D. G., Hammond, C. S., Hanna, L. G., Grove, M. R., Brown, M., ... & Pisano, E. D. (2014). Consequences of false-positive screening mammograms. JAMA internal medicine, 174(6), 954-961.

[28] Trayes, K. P., & Cokenakes, S. E. (2021). Breast cancer treatment. American family physician, 104(2), 171-178.

[29] Van Dijck, J. A., Verbeek, A. L., Hendriks, J. H., & Holland, R. (1993). The current detectability of breast cancer in a mammographic screening program. A review of the previous mammograms of interval and screen‐detected cancers. Cancer, 72(6), 1933-1938.

[30] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.

[31] Wang, J., & Perez, L. (2017). The effectiveness of data augmentation in image classification using deep learning. Convolutional Neural Networks Vis. Recognit, 11(2017), 1-8.

[32] Zhang, Z., Patel, B., Patel, B., & Banerjee, I. (2024). Unsupervised Hybrid framework for ANomaly Detection (HAND)--applied to Screening Mammogram. arXiv preprint arXiv:2409.11534.
