Using hybrid pre-trained models for breast cancer detection

Sameh Zarif, Hatem Abdulkader, Ibrahim Elaraby, Abdullah Alharbi, Wail S. Elkilani, Paweł Pławiak


Breast cancer is a prevalent and life-threatening disease that affects women globally. Early detection and access to high-quality treatment are crucial in preventing fatalities from this condition. However, manual analysis of breast histopathology images is time-consuming and prone to errors. This study proposes a hybrid deep learning model (CNN+EfficientNetV2B3). The proposed approach combines convolutional neural networks (CNNs) with pre-trained models to identify positive invasive ductal carcinoma (IDC) and negative (non-IDC) tissue in whole slide images (WSIs), supporting pathologists in making more accurate diagnoses.


Breast cancer is the most commonly diagnosed form of cancer and ranks as the second leading cause of cancer-related deaths globally [1]. More than 3.8 million women are currently living with a breast cancer diagnosis, and it is the most common cancer diagnosed among women in the United States, indicating its high prevalence [2]. According to the American Cancer Society, incidence has been increasing by about 0.5% per year, and in 2023 there will be approximately 297,790 diagnoses of invasive breast cancer and 43,700 deaths due to breast cancer in the United States [3].

Materials and method

We focus on the dataset and proposed method used for breast cancer classification. The dataset used for the model is discussed in detail to provide insights into the model’s training and testing data. Additionally, the method suggested is presented comprehensively, including an overview of the integration of multiple CNN models and the approach used to overcome previous limitations. A graphical representation of the proposed method is also illustrated in Fig 1.


The models were implemented in Python using the TensorFlow and Keras libraries, which provide high-level tools for building the various layers. The three models were evaluated on the BC-IDC dataset, and the best-performing one was adopted as our proposed model and compared with models previously established in this field. The experiments were conducted on Google Colab, a product of Google Research that enables users to write and run Python code in a browser environment. Colab is free to use, and the experimental setup provided a Tesla K80 GPU and 12 GB of RAM, allowing for efficient and time-saving model building.
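The section above does not spell out the layer-level wiring of the hybrid CNN+EfficientNetV2B3 model, so the following Keras sketch shows one plausible two-branch construction: a small custom CNN branch fused with a frozen EfficientNetV2B3 feature extractor. The branch sizes and head layers are illustrative assumptions, not the authors' exact architecture, and `weights=None` is used here only to keep the sketch offline (the paper's setting would load ImageNet weights).

```python
import tensorflow as tf
from tensorflow.keras import layers, models


def build_hybrid_model(input_shape=(50, 50, 3), num_classes=2):
    """Illustrative hybrid classifier: custom CNN branch + EfficientNetV2B3 branch."""
    inputs = layers.Input(shape=input_shape)

    # Branch 1: small custom CNN capturing local texture features.
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(inputs)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
    x = layers.GlobalAveragePooling2D()(x)

    # Branch 2: EfficientNetV2B3 backbone as a frozen feature extractor.
    # In practice weights="imagenet" would be used, as in the paper;
    # weights=None avoids the download and keeps this sketch self-contained.
    backbone = tf.keras.applications.EfficientNetV2B3(
        include_top=False, weights=None, input_shape=input_shape)
    backbone.trainable = False
    y = backbone(inputs, training=False)
    y = layers.GlobalAveragePooling2D()(y)

    # Fuse both feature vectors and classify IDC vs. non-IDC patches.
    z = layers.Concatenate()([x, y])
    z = layers.Dense(128, activation="relu")(z)
    z = layers.Dropout(0.3)(z)
    outputs = layers.Dense(num_classes, activation="softmax")(z)
    return models.Model(inputs, outputs)
```

Freezing the backbone during the initial epochs is a common transfer-learning choice; the pre-trained filters are reused as-is while only the fusion head is trained on the histopathology patches.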


In Section 4, we saw that the first suggested approach, a combination of a custom CNN and EfficientNetV2B3, achieved higher accuracy on the histopathology images than the second (MobileNet combined with DenseNet121) and the third (MobileNetV2 combined with EfficientNetV2B0). We therefore adopted the first approach as our proposed model, as it proved the more stable and effective deep learning strategy.


In conclusion, this study highlights the novel CNN+EfficientNetV2B3 model, which achieves an outstanding 96.3% accuracy. The approach demonstrated remarkable performance by combining a custom CNN with EfficientNetV2B3 in a hybrid model for feature extraction from histopathology images. Because EfficientNetV2B3 is trained on the large ImageNet dataset, the model offers a valuable resource for researchers and practitioners, allowing the reuse of pre-trained model weights and the insights obtained from ImageNet.

Citation: Zarif S, Abdulkader H, Elaraby I, Alharbi A, Elkilani WS, Pławiak P (2024) Using hybrid pre-trained models for breast cancer detection. PLoS ONE 19(1): e0296912.

Editor: Samrat Kumar Dey, BOU: Bangladesh Open University, BANGLADESH

Received: August 21, 2023; Accepted: December 21, 2023; Published: January 22, 2024

Copyright: © 2024 Zarif et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The dataset used in this study was obtained from the Kaggle open access database.

Funding: Research Supporting Project number (RSP2024R444), King Saud University, Riyadh, Saudi Arabia.

Competing interests: The authors have declared that no competing interests exist.

Abbreviations: CNN, Convolutional Neural Networks; IDC, Invasive Ductal Carcinoma; WSI, Whole Slide Images; BC, Breast Cancer; MCC, Matthews Correlation Coefficient; ROC-AUC, The Area Under the Curve of a Receiver Operating Characteristic; AUPRC, The Area Under the Curve of the Precision-Recall Curve; CAD, Computer Aided Detection; FDA, Food and Drug Administration; AI, Artificial Intelligence; ML, Machine Learning; DL, Deep Learning; TTA, Test Time Augmentation; GRU, Gated Recurrent Unit; MSRCNN, Multi-Scale Residual Convolutional Neural Network.
