TransGAN-DX: A Hybrid Transformer-GAN Approach for Enhanced Cardiovascular Disease Diagnosis

^{1,2}Ali Bayani, ^{1,2}Masoud Kargar, ^{2}Parmida Kargar, ^{3}Ehsan Samadian
^{1}Department of Computer Engineering, Islamic Azad University, Tabriz Branch, Tabriz, Iran
^{2}Robotics and Soft Technologies Research Center, Islamic Azad University, Tabriz Branch, Tabriz, Iran
^{3}Department of Computer Engineering, Azerbaijan Charkhe Niloofari Higher Education Institute, Tabriz, Iran
Email: alibayani@iaut.ac.ir, kargar@iaut.ac.ir, p.kargar@iaut.ir, ehsan.samadian@chnaihe.ac.ir

Abstract

This research presents TransGAN-DX, a hybrid approach combining Transformer and Generative Adversarial Networks (GANs) for enhanced cardiovascular disease diagnosis. The model achieves high accuracy and robust interpretability by leveraging synthetic data generation and feature importance analysis. Calibration and clinical validation further ensure reliability, offering a scalable AI solution for healthcare.

I. INTRODUCTION

Problem Definition
Cardiovascular diseases (CVD) remain the leading cause of mortality worldwide, with nearly 18 million deaths each year. Early diagnosis is essential to improving patient outcomes, yet conventional diagnostic tools often fail to provide consistent precision [1]. These traditional methods face several limitations, such as an over-reliance on manual interpretation, inefficiencies in processing large volumes of complex clinical data, and a lack of capability to capture intricate relationships among various patient attributes [2]. Machine learning (ML) has emerged as a promising alternative, offering the potential to uncover hidden patterns in clinical data and deliver more accurate, rapid diagnostics. However, the widespread adoption of ML in healthcare faces significant hurdles. Two critical challenges are data imbalance and the lack of model interpretability [3].

Data imbalance arises when there are far fewer examples of diseased patients compared to healthy individuals, leading to biased model training that diminishes the accuracy of predictions for underrepresented classes [4]. Additionally, many ML models, particularly deep learning systems, are often seen as "black boxes" due to their complex and opaque nature, which hinders their acceptance in clinical environments where transparency and trust are paramount. In this context, there is a pressing need for models that can not only overcome these challenges but also offer interpretability to aid clinicians in making informed data-driven decisions [5].

Purpose

This research introduces TransGAN-DX, a hybrid deep learning framework designed to address these challenges. GANs have shown remarkable success in generating realistic synthetic data [6], while Transformers are well-known for their self-attention mechanisms that effectively capture relationships in sequential and tabular data [7]. Leveraging these strengths, our approach aims to improve diagnostic precision, reliability, and interpretability.

Innovation

Unlike traditional classifiers, TransGAN-DX tackles data imbalance by augmenting datasets with synthetic samples generated through GANs. It also employs Transformers for attention-based feature selection, making the decision-making process interpretable for clinicians. Furthermore, calibration techniques align the model's predictions with true probabilities, enhancing its reliability. This hybrid approach integrates the best of data augmentation, advanced classification, and interpretability for a robust diagnostic framework.

II. METHODOLOGY

A. Dataset Preparation
The research utilised the UCI Heart Disease dataset [8], which contains 303 patient records with 14 attributes, including age, cholesterol levels, and the presence of chest pain. These attributes are clinically significant as they represent primary risk factors and symptoms associated with CVD. For example, cholesterol levels correlate with arterial plaque buildup, while chest pain is a direct indicator of ischemic heart conditions. Data preprocessing steps included normalisation using a standard scaler and one-hot encoding for categorical variables. Missing values were handled via imputation, yielding a complete dataset for the subsequent augmentation and training stages. Figure 1 shows the overall structure of the methodology.

Figure 1. Schema of the Proposed Method.
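
The preprocessing steps of Section II-A can be illustrated with a minimal sketch. It is not the authors' pipeline: the paper does not state the imputation strategy, so mean imputation is assumed here, and the toy records, column names, and helper functions are hypothetical. The scaling uses the population standard deviation, which is what a standard scaler conventionally applies.

```python
from statistics import mean, pstdev

def impute_mean(column):
    """Replace None entries with the mean of the observed values (assumed strategy)."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]

def standardise(column):
    """Scale a numeric column to zero mean and unit variance."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]

def one_hot(column):
    """Map each category to an indicator vector, in sorted category order."""
    categories = sorted(set(column))
    return [[1 if v == c else 0 for c in categories] for v in column]

# Hypothetical toy records, not taken from the UCI dataset.
ages = [63, 41, 57, 50]
chol = [233, None, 250, 210]
chest_pain = ["typical", "atypical", "typical", "non-anginal"]

chol_filled = impute_mean(chol)            # the missing value becomes 231
age_scaled = standardise(ages)
chol_scaled = standardise(chol_filled)
chest_pain_encoded = one_hot(chest_pain)   # columns: atypical, non-anginal, typical
```

Imputation is applied before scaling so that the mean and standard deviation are computed over a complete column.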

B. Model Architecture

Generative Adversarial Network (GAN):
The GAN consists of two neural networks: the Generator and the Discriminator. The Generator creates synthetic samples designed to replicate the real data distribution, while the Discriminator evaluates these samples to distinguish between real and generated data [9]. This adversarial training process, where the Generator continuously improves in response to the Discriminator’s feedback, results in synthetic data that closely resembles the real data, effectively addressing the issue of class imbalance [10]. By generating more samples for underrepresented classes, the GAN ensures a more diverse and balanced dataset. This not only boosts model accuracy for minority classes but also reduces the risk of overfitting, a common challenge in imbalanced datasets [11]. Moreover, the iterative nature of this process allows the GAN to generate high-quality, realistic data, providing an enhanced and robust dataset for training the subsequent classifier, which further improves the model’s overall predictive power.
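
For reference, the adversarial game described above is usually written as the minimax objective of the original GAN formulation [6]. The paper does not state which loss variant TransGAN-DX uses, so the expression below is the standard form rather than a description of the authors' implementation. The symbols are the conventional ones and are assumptions here: G is the Generator, D the Discriminator, p_data the distribution of real patient records, and p_z the noise prior fed to the Generator.

```latex
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The Discriminator maximises V by assigning high probability to real records and low probability to generated ones, while the Generator minimises V by producing records that the Discriminator scores as real.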

Transformer-Based Classifier:
Transformers, initially designed for natural language processing, excel in modelling sequential dependencies [12]. In TransGAN-DX, a Transformer processes enriched data, focusing on key attributes to predict CVD presence. The self-attention mechanism not only highlights the most relevant features but also provides a transparent explanation of why certain attributes influence the prediction [13]. This interpretability allows clinicians to understand the decision-making process of the model, enabling them to trust the model’s reasoning and incorporate its predictions into their clinical workflows. By identifying and prioritizing the most impactful factors, such as cholesterol levels, age, or exercise-induced angina, TransGAN-DX offers actionable insights that align with established clinical guidelines, facilitating more informed and accurate diagnostic decisions. Moreover, this clarity supports model validation and ensures that clinicians can address potential biases, further reinforcing the reliability and usability of the model in real-world healthcare settings.
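
The self-attention mechanism referred to above can be sketched as scaled dot-product attention. This is a minimal illustration under stated assumptions, not the TransGAN-DX classifier: the learned query, key, and value projections of a real Transformer layer are omitted, and the three two-dimensional feature embeddings are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    weights = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(len(values[0]))]
        for row in weights
    ]
    return outputs, weights

# Hypothetical embeddings for three feature tokens (no learned projections).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
```

Each row of `weights` sums to one and indicates how strongly one feature token attends to the others; these are the quantities that attention-based feature analysis inspects.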

C. Model Training
GAN training involved generating synthetic data batches and evaluating them against real samples to ensure diversity and authenticity. The Transformer classifier was trained using cross-entropy loss, optimised through the Adam optimiser with a learning rate of 0.001. A validation set was used to monitor performance and prevent overfitting. Both models underwent extensive fine-tuning, ensuring optimal hyperparameter configurations for reliable outcomes.
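
The classifier's training step can be illustrated on a toy problem. The sketch below pairs a binary cross-entropy loss with a hand-written Adam update at the stated learning rate of 0.001. The remaining Adam hyperparameters are the commonly used defaults and are assumptions, as is the one-weight logistic model that stands in for the Transformer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(prob, label, eps=1e-12):
    """Cross-entropy loss for a single binary prediction."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; returns (param, m, v)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    return param - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

# Hypothetical stand-in model: p = sigmoid(w * x), one training example.
w, m, v = 0.0, 0.0, 0.0
x, y = 2.0, 1
loss_before = binary_cross_entropy(sigmoid(w * x), y)
for t in range(1, 101):
    grad = (sigmoid(w * x) - y) * x   # d(loss)/dw for sigmoid + cross-entropy
    w, m, v = adam_step(w, grad, m, v, t)
loss_after = binary_cross_entropy(sigmoid(w * x), y)
```

After 100 updates the loss on the toy example is lower than at initialisation; in practice the same loop would run over mini-batches, with the validation set checked periodically.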

D. Calibration and Interpretability
To enhance reliability, the calibration of the model's predictions was quantified using the Expected Calibration Error (ECE) and visualised with calibration curves. Additionally, feature importance analysis was conducted to identify the most influential predictors of cardiovascular disease, aiding in improved clinical interpretability. This analysis provided actionable insights into how patient attributes like cholesterol levels and age contributed to predictions, further validating the framework’s alignment with clinical knowledge.
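
As an illustration of the calibration measure, the sketch below computes one common binned formulation of ECE for binary predictions: predictions are grouped by predicted probability, and the gap between the mean predicted probability and the observed disease rate in each bin is averaged, weighted by bin size. The number of bins and the toy predictions are assumptions; the paper does not report its binning scheme.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE: weighted mean of |observed rate - mean predicted probability|."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[index].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(observed - mean_prob)
    return ece

# Hypothetical predictions: slightly under-confident at both ends.
probs = [0.1, 0.1, 0.9, 0.9]
labels = [0, 0, 1, 1]
ece = expected_calibration_error(probs, labels)
```

For these toy predictions each occupied bin is off by about 0.1, so the ECE is approximately 0.1. The per-bin pairs of mean predicted probability and observed rate are also the points plotted on a calibration curve.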

E. Evaluation Metrics
The performance of the model was rigorously assessed using industry-standard metrics to ensure its effectiveness and reliability in detecting cardiovascular disease:

Accuracy: Measures the proportion of correct classifications among all cases, providing an overall indication of model performance.
F1-Score: Represents the harmonic mean of precision and recall, making it particularly useful for evaluating models on imbalanced datasets because it penalises both false positives and false negatives.
ROC-AUC: Captures the area under the Receiver Operating Characteristic curve, reflecting the model's ability to discriminate between positive and negative cases effectively.
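
The three metrics above can be computed directly from labels, hard predictions, and scores. The following is a minimal sketch on hypothetical data; ROC-AUC is computed through its pairwise-ranking interpretation (the fraction of positive-negative pairs in which the positive case receives the higher score, counting ties as one half).

```python
def accuracy(y_true, y_pred):
    """Proportion of correct classifications."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels, hard predictions, and predicted scores.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

On this toy example the accuracy and F1-score are both 2/3, while the ROC-AUC is 8/9, which shows that the threshold-free ranking metric can differ markedly from the threshold-dependent ones.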

III. RESULTS

Performance Metrics
On the UCI Heart Disease dataset, TransGAN-DX achieved an accuracy of 89%, an F1-score of 86%, and a ROC-AUC of 88%. Table I compares its accuracy and F1-score with those of traditional classifiers, which often struggle with imbalanced datasets: TransGAN-DX leads on both metrics, ahead of the strongest baseline, the Support Vector Machine (0.85 on both). Benchmark classifiers such as logistic regression and random forests also achieved ROC-AUC values below 80%, highlighting the advantage of our methodology in handling minority class predictions.

Table I. Empirical Results of the Proposed Method Compared to Traditional Machine Learning Methods.

Models	Accuracy	F1-Score
Decision Tree Classifier	0.73	0.75
GaussianNB	0.78	0.79
K-Nearest Neighbors (KNN)	0.78	0.78
XGBoost Classifier	0.78	0.78
AdaBoost Classifier	0.78	0.79
Gradient Boosting Classifier	0.78	0.79
LGBM Classifier	0.80	0.80
CatBoost Classifier	0.82	0.81
Random Forest Classifier	0.82	0.82
Logistic Regression	0.84	0.84
Support Vector Machine (SVM)	0.85	0.85
TransGAN-DX	0.89	0.86

Feature Importance Analysis
The analysis of feature importance revealed that attributes such as cholesterol levels, age, and exercise-induced angina were the most influential in predicting disease presence. This insight was derived using statistical and model-based importance techniques, aligning with established clinical knowledge and validating the model’s interpretability.
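
The paper does not name the specific importance techniques used. As one widely used example, and not necessarily the authors' choice, permutation importance measures how much accuracy drops when a single feature column is shuffled. In the sketch below, the threshold rule standing in for the trained classifier, the toy rows, and the feature order (cholesterol, age) are all hypothetical.

```python
import random

def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, n_features, seed=0):
    """Accuracy drop after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + [c] + r[j + 1:] for r, c in zip(rows, column)]
        drops.append(baseline - accuracy(model, shuffled, labels))
    return drops

# Hypothetical stand-in for a trained classifier: uses cholesterol only.
def toy_model(row):
    return 1 if row[0] > 240 else 0

# Hypothetical rows of [cholesterol, age]; labels follow the toy rule exactly.
rows = [[200, 45], [260, 50], [230, 61], [290, 38], [210, 57], [275, 66]]
labels = [toy_model(r) for r in rows]
importances = permutation_importance(toy_model, rows, labels, n_features=2)
```

Because the stand-in model ignores age, shuffling that column leaves accuracy unchanged and its importance is exactly zero, whereas shuffling cholesterol can only lower accuracy.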

Calibration Curves
The calibration curves indicated a strong alignment between predicted probabilities and actual outcomes, supporting the reliability of the predictions. Figure 2 illustrates that the Expected Calibration Error (ECE) was significantly reduced, demonstrating well-calibrated confidence levels. In practical terms, among patients for whom the model predicts a 70% likelihood of disease, close to 70% actually have the disease, which enhances clinical trust in its outputs.

Figure 2. Calibration Curve for TransGAN-DX.

IV. DISCUSSION

TransGAN-DX addresses two pivotal challenges in cardiovascular diagnostics: data imbalance and lack of interpretability. By combining the synthetic data generation capabilities of GANs with the attention mechanisms of Transformers, the framework achieves a unique balance of accuracy and explainability.

The novelty lies in its hybrid approach: while GANs enhance dataset diversity, Transformers provide actionable insights into feature importance. This dual capability makes our approach suitable for real-world clinical applications, where interpretability is as critical as precision.

TransGAN-DX stands out from existing approaches by combining synthetic data generation and attention-based classification into a unified framework, addressing critical challenges in cardiovascular diagnostics. The integration of GANs enables the generation of high-quality synthetic samples, effectively mitigating class imbalance and improving the classification precision for underrepresented cases, a limitation in traditional machine learning models.

Furthermore, the Transformer’s self-attention mechanism enhances interpretability by identifying key predictors, such as cholesterol levels and exercise-induced angina, thereby empowering clinicians to make informed decisions based on actionable insights.

The model's computational efficiency is also a key advantage, as Transformers, compared to other deep learning models, require fewer parameters and are less computationally intensive for similar performance levels. Finally, the model’s reliability is demonstrated through calibration curves, which align predicted probabilities with actual outcomes, ensuring trustworthy and consistent predictions. These unique advantages position the model as a cutting-edge tool that balances accuracy, transparency, and practical utility in clinical applications.

By identifying key predictors of cardiovascular disease, such as cholesterol levels and exercise-induced angina, TransGAN-DX provides insights that align with clinical guidelines. This synergy enhances its potential for integration into healthcare workflows.

Despite its strengths, the framework's reliance on synthetic data introduces potential biases that require careful validation. Further research is necessary to evaluate its scalability across diverse patient populations and disease types. Additionally, future work could explore the integration of multimodal data, including imaging and genomic information, to enhance diagnostic capabilities.

V. CONCLUSION

Our model represents a significant advancement in AI-driven cardiovascular diagnostics. Its hybrid architecture bridges the gap between data augmentation and interpretability, offering a scalable solution for clinical applications. By achieving high accuracy, robust reliability, and actionable insights, the framework paves the way for next-generation diagnostic tools in healthcare.
As the healthcare industry continues to embrace AI, frameworks like TransGAN-DX will play a pivotal role in transforming patient outcomes. Future directions include integrating real-time patient monitoring data and exploring its application in rare cardiovascular conditions, broadening its utility in clinical practice.

References

[1] Taylan, O., et al., Early prediction in classification of cardiovascular diseases with machine learning, neuro-fuzzy and statistical methods. Biology, 2023. 12(1): p. 117.
[2] Ghosh, P., et al., Efficient prediction of cardiovascular disease using machine learning algorithms with relief and LASSO feature selection techniques. IEEE Access, 2021. 9: p. 19304-19326.
[3] Imrie, F., R. Davis, and M. van der Schaar, Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nature Machine Intelligence, 2023. 5(8): p. 824-829.
[4] Bayani, A. and M. Kargar, LDCNN: A new arrhythmia detection technique with ECG signals using a linear deep convolutional neural network. Physiological Reports, 2024. 12(17): p. e16182.
[5] Jin, D., et al., Explainable deep learning in healthcare: A methodological survey from an attribution view. WIREs Mechanisms of Disease, 2022. 14(3): p. e1548.
[6] Goodfellow, I., et al., Generative adversarial networks. Communications of the ACM, 2020. 63(11): p. 139-144.
[7] Zongren, L., et al., Focal cross transformer: Multi-view brain tumor segmentation model based on cross window and focal self-attention. Frontiers in Neuroscience, 2023. 17: p. 1192867.
[8] Janosi, A., W. Steinbrunn, M. Pfisterer, and R. Detrano, Heart Disease. UCI Machine Learning Repository, 1989.
[9] Song, Y., et al., Computational discovery of new 2D materials using deep learning generative models. ACS Applied Materials & Interfaces, 2021. 13(45): p. 53303-53313.
[10] Yang, H., et al., SPE-ACGAN: A resampling approach for class imbalance problem in network intrusion detection systems. Electronics, 2023. 12(15): p. 3323.
[11] Buda, M., A. Maki, and M.A. Mazurowski, A systematic study of the class imbalance problem in convolutional neural networks. Neural Networks, 2018. 106: p. 249-259.
[12] Du, H., et al., Feature-aware contrastive learning with bidirectional transformers for sequential recommendation. IEEE Transactions on Knowledge and Data Engineering, 2023.
[13] Rudin, C., Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 2019. 1(5): p. 206-215.