1 email: eexmli@ust.hk
2 Sun Yat-Sen University
3 Sun Yat-Sen Memorial Hospital
4 Chinese Academy of Sciences
Radiomics-Informed Deep Learning for Classification of Atrial Fibrillation Sub-Types from Left-Atrium CT Volumes
Abstract
Atrial Fibrillation (AF) is characterized by rapid, irregular heartbeats, and can lead to fatal complications such as heart failure. The disease is divided into two sub-types based on severity, which can be automatically classified through CT volumes for disease screening of severe cases. However, existing classification approaches rely on generic radiomic features that may not be optimal for the task, whilst deep learning methods tend to over-fit to the high-dimensional volume inputs. In this work, we propose a novel radiomics-informed deep-learning method, RIDL, that combines the advantages of deep learning and radiomic approaches to improve AF sub-type classification. Unlike existing hybrid techniques that mostly rely on naïve feature concatenation, we observe that radiomic feature selection methods can serve as an information prior, and propose supplementing low-level deep neural network (DNN) features with locally computed radiomic features. This reduces DNN over-fitting and allows local variations between radiomic features to be better captured. Furthermore, we ensure complementary information is learned by deep and radiomic features by designing a novel feature de-correlation loss. Combined, our method addresses the limitations of deep learning and radiomic approaches and outperforms state-of-the-art radiomic, deep learning, and hybrid approaches, achieving 86.9% AUC for the AF sub-type classification task. Code is available at https://github.com/xmed-lab/RIDL.
Keywords: Atrial Fibrillation · Radiomics · CT Imaging Analysis
1 Introduction
Atrial fibrillation (AF) is a cardiac disease characterized by rapid, irregular heartbeats [4]. The disease can lead to stroke and heart failure, and has a mortality rate of almost 20% [13, 5, 10]. AF is classified as either persistent atrial fibrillation (PeAF), where abnormal heart rhythms occur continuously for more than seven days, or paroxysmal atrial fibrillation (PaAF), where the heart rhythm returns to normal within seven days. Although AF can be treated through a procedure called catheter ablation, PeAF cases have high recurrence rates and often require re-intervention [8]. Accurate knowledge of the disease type is therefore highly valuable for treatment planning and has high prognostic value [22].
Clinical studies have discovered a strong relationship between AF and epicardial adipose tissue (EAT), a fat depot layer on the surface of the myocardium that can cause inflammation and disrupt cardiac function [15, 3]. Recent works have shown that automatic classification of AF sub-types can be done using CT volumes of the left atrium and surrounding EAT, which can be used to screen for patients with high risk of PeAF. Huber et al. [7] showed that EAT volume, approximated from left-atrium CT images, can be used as a predictor for AF recurrence. Yang et al. [22] trained a random forest model to classify AF sub-type based on radiomic features and volume measurements, achieving 85.3% AUC. Although these methods demonstrate the usefulness of radiomic features for AF sub-type classification, such features are generic and not specific to the task, which can limit model performance [12]. Radiomic features also rely on summary statistics such as entropy or homogeneity to obtain global descriptors, and these have limited effectiveness when capturing local feature variations [16].
Deep learning has achieved outstanding results on medical imaging analysis tasks, largely due to its ability to learn task-specific features and complex relations between them [17]. Naïvely using deep neural networks (DNNs) to predict AF sub-types from CT volumes yields poor results, however, due to over-fitting on high-dimensional volume inputs (see results for DNN in Table 1). Existing works have attempted to combine deep and radiomic features through methods such as direct concatenation [2, 19], attention modules [14], or contrastive learning between feature types [24]. Although these methods propose different ways of using both approaches, they do not explicitly address the limitations of either approach or explore ways to combine their complementary advantages.

In this work, we propose a novel approach to atrial fibrillation sub-type classification from CT volumes by integrating radiomic and deep learning methods. We note that textural radiomic features identified by feature selection methods can serve as an information prior to supplement low-level features from DNNs, since they are designed to capture low-level context and have predictive power [23]. To this end, we locally calculate radiomic features based on patches surrounding each voxel, and perform feature fusion with low-level DNN features. This provides the DNN with pre-defined features known to be relevant to the task, which reduces over-fitting, and also allows spatial relations between radiomic features to be learned. Furthermore, we design a novel feature de-correlation loss that encourages the DNN to learn features complementary to radiomic features, yielding more comprehensive signals. The overall framework, which we term Radiomics-Informed Deep Learning (RIDL), is illustrated in Fig. 1. Unlike existing works, our method is designed to directly address the limitations of both deep learning and radiomic approaches, and achieves state-of-the-art performance on AF sub-type classification. To summarize our key contributions:
- We propose a novel radiomics-informed deep learning (RIDL) method for AF sub-type classification from CT volumes, which achieves state-of-the-art results and can be used to screen for patients with high risk of PeAF.
- Our method uses a novel approach of fusing locally computed radiomic features with low-level DNN features to improve capturing of local context.
- Furthermore, we enforce feature de-correlation using a novel feature-bank design to ensure complementary deep and radiomic features are extracted.
2 Methodology
We combine radiomic and deep learning approaches using two novel components: 1) feature fusion of local radiomic features and low-level DNN features to improve local context, and 2) encouraging complementary deep and radiomic features through feature de-correlation. These are illustrated in Fig. 2 and explained in detail below. Our dataset includes $N$ samples of input $x$ and binary label $y$, where 0 indicates PaAF and 1 indicates PeAF. Each input $x$ has two channels, one consisting of the 3D CT volume centered around the left atrium and the other the binary region-of-interest (ROI) mask indicating EAT. The ROI is obtained through Hounsfield value thresholding such that all voxels valued between -250 and 0 are identified as EAT [7, 22].
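As a rough illustration of the input construction described above, the following sketch builds a two-channel array from a CT volume given in Hounsfield units. The function names `eat_mask` and `build_input` are hypothetical, and treating both thresholds as inclusive is an assumption; the text only states that voxels valued between -250 and 0 are identified as EAT.

```python
import numpy as np


def eat_mask(ct_hu, low=-250.0, high=0.0):
    """Binary ROI mask: 1.0 where the Hounsfield value lies in [low, high].

    Inclusive bounds are an assumption made for this sketch.
    """
    return ((ct_hu >= low) & (ct_hu <= high)).astype(np.float32)


def build_input(ct_hu):
    """Stack the CT volume and its EAT mask into a (2, H, W, D) array."""
    ct = np.asarray(ct_hu, dtype=np.float32)
    return np.stack([ct, eat_mask(ct)], axis=0)
```

Intensity normalization and resizing are deliberately left out here, since the sketch is only meant to show the channel layout.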


2.1 Feature fusion of locally computed radiomic features with low-level DNN features for improved local context
Under the radiomics pipeline, a large set of features, typically more than a thousand, is first extracted by performing calculations over the volume and ROI input $x$. Feature selection methodologies such as mutual information (MI), principal component analysis (PCA), or LASSO regularization are then used to identify predictive features for classification [23]. Radiomic features are classified into shape features, first-order statistical features, and texture features. Texture features are designed to capture local variations and use measures such as Gray-level Co-occurrence Matrices (GLCM) to reflect second-order textural distributions. Conventional statistics such as entropy and correlation are then used to summarize these measures [25], but these tend to be limited in their ability to capture local heterogeneity, such as the varying textures on the surface of a tumor. Although DNNs are more effective at capturing local variations, they can over-fit without sufficient data for training [17].
Unlike existing works that naïvely concatenate radiomic and deep features before the classification layer [2, 19], we observe that textural features selected through radiomics feature selection algorithms are known to be predictive and can be used as prior knowledge to improve low-level DNN features. Given radiomic feature extractor $f_r(\cdot)$, the global radiomic feature, $z_r$, for input $x$ is represented by:
$$z_r = f_r(x) \qquad (1)$$
Our method applies feature calculations locally to cubic patches centered around each voxel, such that features are obtained on a voxel basis and reflect the statistics of the neighbouring region. For a cubic patch with radius $k$ and input $x$, the local feature at location $(i, j, l)$, denoted by $z^{k}_{i,j,l}$, is obtained by applying the radiomic feature extractor $f_r$ to the cubic patch in $x$ centered around $(i, j, l)$:
$$z^{k}_{i,j,l} = f_r\left(x_{[i-k:i+k,\; j-k:j+k,\; l-k:l+k]}\right) \qquad (2)$$
where the input of $f_r$ is the cubic sub-volume. This process is illustrated in Fig. 2a. Local features can be calculated for multiple texture features and patch sizes $k$, which are then concatenated to obtain $Z_l \in \mathbb{R}^{C \times H \times W \times D}$, where $C$ is the total number of features used and $H$, $W$, and $D$ are the original input dimensions. We note that only texture radiomic features are used for local calculation since they are specifically intended to capture local context.
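The voxel-wise computation described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: `local_feature_map` and `stack_local_features` are hypothetical names, the feature is passed in as a plain callable (for example `np.std` as a stand-in for a texture feature), and edge padding at the volume border is an assumption.

```python
import numpy as np


def local_feature_map(volume, feature_fn, k):
    """Apply feature_fn to the (2k+1)^3 patch centred on every voxel."""
    # Assumption: replicate border values so that every voxel has a full patch.
    padded = np.pad(volume, k, mode="edge")
    H, W, D = volume.shape
    out = np.empty((H, W, D), dtype=np.float64)
    for i in range(H):
        for j in range(W):
            for l in range(D):
                patch = padded[i:i + 2 * k + 1,
                               j:j + 2 * k + 1,
                               l:l + 2 * k + 1]
                out[i, j, l] = feature_fn(patch)
    return out


def stack_local_features(volume, feature_fns, radii):
    """Concatenate one map per (feature, radius) pair into shape (C, H, W, D)."""
    maps = [local_feature_map(volume, fn, k)
            for fn in feature_fns for k in radii]
    return np.stack(maps, axis=0)
```

The triple loop is written for clarity rather than speed; a practical version would likely vectorize the patch extraction or pre-compute the maps offline.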
The local radiomic feature tensor $Z_l$ is then concatenated with low-level DNN features, $Z_d$, to supplement the DNN with local radiomic features. To effectively fuse the features, we apply a channel attention module, $A(\cdot)$, following the design in [20]:
$$Z_f = A([Z_l, Z_d]) \otimes [Z_l, Z_d] \qquad (3)$$
where $Z_f$ is the fused feature, $[\cdot, \cdot]$ is channel concatenation, and $\otimes$ is element-wise multiplication. The learned attention tensor has dimensions $C' \times 1 \times 1 \times 1$, where $C'$ is the number of channels after concatenation, and is broadcast along the volume dimensions, such that attention is applied channel-wise and spatial feature distributions are preserved.
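To make the fusion step concrete, the sketch below applies a channel attention of the kind described in [20] (average- and max-pooled channel descriptors passed through a shared two-layer MLP, followed by a sigmoid) to the concatenated features. It is a NumPy forward pass only, under stated assumptions: the names `channel_attention` and `fuse` and the weight matrices `w1` and `w2` are hypothetical, and a trained model would learn these weights inside the network.

```python
import numpy as np


def channel_attention(feat, w1, w2):
    """Channel attention weights of shape (C, 1, 1, 1) for feat of shape (C, H, W, D)."""
    avg = feat.mean(axis=(1, 2, 3))  # average-pooled descriptor, shape (C,)
    mx = feat.max(axis=(1, 2, 3))    # max-pooled descriptor, shape (C,)

    def mlp(v):
        # Shared two-layer MLP with ReLU in between.
        return w2 @ np.maximum(w1 @ v, 0.0)

    logits = mlp(avg) + mlp(mx)
    att = 1.0 / (1.0 + np.exp(-logits))  # sigmoid
    return att.reshape(-1, 1, 1, 1)


def fuse(z_local, z_dnn, w1, w2):
    """Concatenate along channels, then rescale each channel by its attention weight."""
    cat = np.concatenate([z_local, z_dnn], axis=0)
    return channel_attention(cat, w1, w2) * cat  # broadcast over H, W, D
```

Because the attention tensor has a single value per channel, the multiplication rescales whole channels and leaves the spatial layout of each feature map unchanged.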
2.2 Encouraging complementary deep and radiomic features through feature de-correlation
Global radiomic features are also included in our model by concatenation with high-level DNN features before the classification layer. Unlike existing approaches, however, we encourage our DNN to learn features complementary to radiomic features by enforcing de-correlation between the two. This ensures that different variations are captured, which provides a more comprehensive signal to the classification layer.
Accurate approximation of correlation requires large batch sizes, however, which in turn require large GPU memory and can affect model convergence [9]. We instead propose a novel feature-bank implementation with exponential weighting to estimate sample statistics. Every iteration, we save DNN and global radiomic features in a feature bank, which holds up to a fixed number of features in a first-in first-out queue. After a warm-up period, we calculate the sample correlation using an exponential weighting scheme. Given a weight parameter and the normalized deep and radiomic features from the feature bank, we calculate the feature de-correlation loss as:
(4) |
The first samples in the feature bank, equal in number to the batch size, belong to the training samples of the current iteration, and their losses are back-propagated to encourage deep features to have zero correlation with radiomic features. This process is illustrated in Fig. 2b. Although feature banks have been used in techniques such as contrastive learning to address batch size limitations [6, 21], we are the first to formulate this technique for feature de-correlation.
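The general idea of a first-in first-out feature bank with exponentially weighted correlation can be sketched as below. This is one possible form under explicit assumptions and should not be read as the exact loss used in this work: the class `FeatureBank`, the function `weighted_decorrelation_loss`, the per-sample decay `alpha ** age`, and the use of the mean squared cross-correlation as the penalty are all choices made for this illustration. It is a NumPy forward computation only; in training, gradients would flow only through the newest samples.

```python
from collections import deque

import numpy as np


class FeatureBank:
    """FIFO store of (deep, radiomic) feature pairs, newest first."""

    def __init__(self, capacity):
        # With maxlen set, appendleft drops the oldest entry from the right end.
        self.items = deque(maxlen=capacity)

    def push(self, deep_batch, radiomic_batch):
        for d, r in zip(deep_batch, radiomic_batch):
            self.items.appendleft((np.asarray(d, float), np.asarray(r, float)))

    def stacked(self):
        deep = np.stack([d for d, _ in self.items])
        rad = np.stack([r for _, r in self.items])
        return deep, rad


def weighted_decorrelation_loss(deep, rad, alpha):
    """Mean squared weighted cross-correlation between deep and radiomic dims.

    Assumption: sample at index n (0 = newest) receives weight alpha ** n.
    """
    n = deep.shape[0]
    w = alpha ** np.arange(n)
    w = w / w.sum()
    d = deep - (w[:, None] * deep).sum(axis=0)  # weighted centering
    r = rad - (w[:, None] * rad).sum(axis=0)
    cov = (w[:, None, None] * d[:, :, None] * r[:, None, :]).sum(axis=0)
    sd = np.sqrt((w[:, None] * d ** 2).sum(axis=0))
    sr = np.sqrt((w[:, None] * r ** 2).sum(axis=0))
    corr = cov / (sd[:, None] * sr[None, :] + 1e-8)
    return float((corr ** 2).mean())
```

A penalty of this kind is zero when every deep feature dimension is uncorrelated with every radiomic dimension and approaches one when they are perfectly linearly related.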
2.3 Overall framework
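The DNN model uses raw CT volumes concatenated with ROI masks as input. Global and local radiomic features are pre-computed for input into the feature layer. Binary cross-entropy is used for the AF sub-type classification loss $\mathcal{L}_{cls}$:
$$\mathcal{L}_{cls} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\right] \qquad (5)$$
where $\hat{y}_i$ is the model prediction for sample $x_i$, $y_i$ is its label, and $N$ is the number of samples. The model is trained together with the feature de-correlation loss $\mathcal{L}_{decor}$ and its loss weighting, $\lambda$. To provide further regularization and prevent over-fitting, we perform an additional self-reconstruction task, using loss $\mathcal{L}_{rec}$, which we describe in more detail in the supplementary materials. The overall loss function is then:
$$\mathcal{L} = \mathcal{L}_{cls} + \lambda \mathcal{L}_{decor} + \mathcal{L}_{rec} \qquad (6)$$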
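For readers who prefer code, the loss combination described above can be written as a short sketch. The function names are hypothetical, averaging the cross-entropy over samples is an assumption, and adding the self-reconstruction term with unit weight is also an assumption, since the text only mentions a weighting for the de-correlation term.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def overall_loss(l_cls, l_decor, l_rec, lam):
    """Classification loss plus weighted de-correlation and reconstruction terms."""
    # Assumption: the reconstruction term enters with weight 1.
    return l_cls + lam * l_decor + l_rec
```

For example, an uninformative prediction of 0.5 for every sample gives a classification loss of ln 2, regardless of the labels.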
3 Experiments
3.1 Implementation Details
3.1.1 Dataset
We use a dataset of 172 patients containing 94 PaAF and 78 PeAF cases collected from the Sun Yat-Sen Memorial Hospital in China. CT volumes are centered on the left atrium and normalized to between -1 and 1. ROI masks for EAT are obtained through Hounsfield value thresholding between -250 and 0. Volumes are resized to the same aspect ratio to ensure consistent dimensions across samples. We use an input size of 96x128x128 voxels and apply zero padding for smaller volumes. We use five-fold cross-validation and report average test performance across folds. Cross-validation is implemented by splitting the dataset into five equal subsets and using three subsets for training, one subset for validation, and one subset for testing. A rolling scheme is used such that different validation and test subsets are used for each of the five folds. Data acquisition procedures and statistics are given in the supplementary materials.
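The rolling cross-validation scheme described above can be sketched as follows. This is an illustrative split only: the function name `rolling_folds` is hypothetical, the choice of the next subset as the validation set is an assumption, and no shuffling or class stratification is applied, since the text does not specify either.

```python
def rolling_folds(n_samples, n_folds=5):
    """Three subsets for training, one for validation, one for testing per fold."""
    indices = list(range(n_samples))
    # Interleaved assignment gives subsets whose sizes differ by at most one.
    subsets = [indices[i::n_folds] for i in range(n_folds)]
    folds = []
    for f in range(n_folds):
        test_id = f
        val_id = (f + 1) % n_folds  # assumption: validation is the next subset
        train = [i for s in range(n_folds)
                 if s not in (test_id, val_id)
                 for i in subsets[s]]
        folds.append({"train": sorted(train),
                      "val": subsets[val_id],
                      "test": subsets[test_id]})
    return folds
```

With this rotation every sample appears in exactly one test subset and exactly one validation subset across the five folds.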
3.1.2 Setup
We use the PyRadiomics package [18] to extract radiomic features from the input volumes and masks. Using the cross-validation splits, we perform feature selection and classification using LASSO-regularized logistic regression. LASSO regularization consistently selects four radiomic features as the ones with the most significant predictive power: maximum 3D diameter, maximum 2D diameter, maximum voxel value, and normalized inverse difference of GLCM (glcm_Idn). The texture feature glcm_Idn is calculated locally over cubic patches to obtain the local radiomic features.
For our DNN network, we use a modified 3D U-Net [1] (abbreviated as m3DUNet) with skip connections between the encoder and decoder removed to enhance bottle-neck feature compression. Bottle-neck features are averaged across spatial dimensions for classification, whilst decoder outputs are used for self-reconstruction regularization. The model is trained using the Adam optimizer with learning rate for 100 epochs and 0.1 decay at 30 epochs. We use batch size , feature bank size , and warm-up period of one epoch. We use for de-correlation loss weighting, which was chosen based on the validation splits. Mean and standard deviation of ten runs are reported. Additional experiments and details are included in the supplementary materials.
3.2 Comparison with State-of-the-art Methodologies
We compare our method with alternative state-of-the-art approaches based on radiomics, deep learning, and hybrid techniques. Deep and hybrid volume-based classification methods [14, 24, 11] are adapted to our task since there are no existing works for AF sub-type classification. We use the same encoder for all deep architectures for fair comparison, except for methods that are architecture specific. A naïve feature concatenation method is used as our baseline for the hybrid approach. Radiomic features for the hybrid approach are selected through LASSO regularization as it is the most effective. Results are shown in Table 1.
Type | Selector∗ | Classifier | AUC (%) | MAP (%) | F1 (%) | Acc. (%) |
---|---|---|---|---|---|---|
Radiomic | Mutual Information | SVM | 74.2 | 65.2 | 71.1 | 74.7 |
Radiomic | Mutual Information | Random Forest [22] | 72.4 | 63.4 | 70.5 | 72.4 |
Radiomic | Mutual Information | Logistic Regression | 68.7 | 59.1 | 65.7 | 69.3 |
Radiomic | LASSO | Logistic Regression | 83.4 | 81.7 | 69.1 | 73.9 |

Type | Method | Model | AUC (%) | MAP (%) | F1 (%) | Acc. (%) |
---|---|---|---|---|---|---|
Deep Learning | Lee et al. [11] | f-rMC5 | 63.3 ± 6.3 | 64.8 ± 3.9 | 31.5 ± 7.6 | 63.3 ± 4.3 |
Deep Learning | DNN⋆ | m3DUNet | 77.2 ± 1.5 | 73.7 ± 1.4 | 68.4 ± 1.4 | 70.7 ± 1.8 |
We can see that hybrid methods generally outperform radiomic and deep methods. Our method, RIDL, nevertheless achieves the best results across all metrics, improving AUC by 1.1 percentage points over the baseline method (86.9% vs. 85.8%) and by 3.5 percentage points over the best radiomics approach (86.9% vs. 83.4%).
3.3 Ablation
3.3.1 Component analysis
We perform ablation experiments to demonstrate improvements from using local radiomic features, global radiomic features, and feature de-correlation loss. Results are shown in Table 2.
Method | Local radiomic features | Global radiomic features | De-correlation loss | AUC (%) | MAP (%) |
---|---|---|---|---|---|
DNN⋆ | | | | 77.2 ± 1.5 | 73.7 ± 1.4 |
DNN⋆ + Local Radiomic Features | ✓ | | | 78.8 ± 1.1 | 74.9 ± 1.1 |
Baseline† | | ✓ | | 85.8 ± 0.5 | 84.5 ± 0.6 |
Baseline† + Local Radiomic Features | ✓ | ✓ | | 86.3 ± 0.9 | 84.9 ± 8.0 |
RIDL (ours) | ✓ | ✓ | ✓ | 86.9 ± 0.6 | 86.3 ± 0.6 |
We can see that including local radiomic features improves performance, raising AUC by up to 1.6 percentage points when added to a standard DNN (78.8% vs. 77.2%). Using feature de-correlation further boosts performance and leads to the best overall results.
3.3.2 Effectiveness of radiomic feature selection
To demonstrate the effectiveness of radiomic feature selection as prior knowledge for feature fusion, we compare with results from using features discarded by radiomics feature selection. We randomly select three discarded features to generate local features as input whilst keeping other components constant. Results are shown in Table 3.
Feature used for local calculation | Selected | AUC (%) | MAP (%) |
---|---|---|---|
gldm_DependenceNonUniformityNormalized | ✗ | 86.1 ± 0.8 | 85.5 ± 0.9 |
glrlm_LongRunEmphasis | ✗ | 86.4 ± 0.8 | 85.4 ± 0.8 |
gldm_LargeDependenceEmphasis | ✗ | 85.6 ± 0.7 | 84.6 ± 0.9 |
glcm_Idn (ours) | ✓ | 86.9 ± 0.6 | 86.3 ± 0.6 |
We can see that using discarded features leads to worse performance in general. Given the large set of radiomic features, it is possible that some discarded features may outperform selected features due to differences between global and local computation. Nevertheless, our results indicate that the radiomic feature selection process serves as a reasonable information prior. Our work is the first to propose fusing locally computed radiomic features with low-level DNN features, and we leave detailed local feature selection methods to future work.
4 Conclusion
In this work, we propose a new approach to atrial fibrillation sub-type classification from CT volumes by integrating radiomic and deep learning approaches through a radiomics-informed deep learning method, RIDL. Our method is based on two key ideas: feature fusion of locally computed radiomic features with low-level DNN features to improve local context, and encouraging complementary deep and radiomic features through feature de-correlation. Unlike existing hybrid approaches, our method specifically addresses the advantages and limitations of both techniques to improve feature extraction. We achieve state-of-the-art results on AF sub-type classification and outperform existing radiomic, deep learning, and hybrid methods.
Future improvements to RIDL can be made by introducing more sophisticated selection methods for local radiomic features, given the large set of features to choose from. Experiments on larger datasets or alternative tasks can also be done to provide more empirical support, since current results show only slight improvements over the baseline. These issues may be addressed in future work. Overall, our method is a novel way of combining radiomic and deep learning approaches, and can be used to improve the accuracy of PeAF screening from CT volumes for better preventive care of high-risk patients.
Acknowledgement
This work was supported in part by grants from Hong Kong Innovation and Technology Commission (Project no. ITS/030/21 & Project no. PRP/041/22FX), and by Foshan HKUST Projects under FSUST21-HKUST10E and FSUST21-HKUST11E.
References
- [1] Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O.: 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: MICCAI. pp. 424–432. Springer (2016)
- [2] Cui, Y., Zhang, J., Li, Z., Wei, K., Lei, Y., Ren, J., Wu, L., Shi, Z., Meng, X., Yang, X., et al.: A CT-based deep learning radiomics nomogram for predicting the response to neoadjuvant chemotherapy in patients with locally advanced gastric cancer: A multicenter cohort study. EClinicalMedicine 46, 101348 (2022)
- [3] Gaeta, M., Bandera, F., Tassinari, F., Capasso, L., Cargnelutti, M., Pelissero, G., Malavazos, A.E., Ricci, C.: Is epicardial fat depot associated with atrial fibrillation? a systematic review and meta-analysis. Europace 19(5), 747–752 (2017)
- [4] Go, A.S., Hylek, E.M., Phillips, K.A., Chang, Y., Henault, L.E., Selby, J.V., Singer, D.E.: Prevalence of diagnosed atrial fibrillation in adults: national implications for rhythm management and stroke prevention: the anticoagulation and risk factors in atrial fibrillation (ATRIA) study. JAMA 285(18), 2370–2375 (2001)
- [5] Gomez-Outes, A., Lagunar-Ruiz, J., Terleira-Fernandez, A.I., Calvo-Rojas, G., Suárez-Gea, M.L., Vargas-Castrillon, E.: Causes of death in anticoagulated patients with atrial fibrillation. Journal of the American College of Cardiology 68(23), 2508–2521 (2016)
- [6] He, K., Fan, H., Wu, Y., Xie, S., Girshick, R.: Momentum contrast for unsupervised visual representation learning. In: CVPR. pp. 9729–9738 (2020)
- [7] Huber, A.T., Fankhauser, S., Chollet, L., Wittmer, S., Lam, A., Baldinger, S., Madaffari, A., Seiler, J., Servatius, H., Haeberlin, A., et al.: The relationship between enhancing left atrial adipose tissue at CT and recurrent atrial fibrillation. Radiology 305(1), 56–65 (2022)
- [8] January, C.T., Wann, L.S., Alpert, J.S., Calkins, H., Cigarroa, J.E., Cleveland, J.C., Conti, J.B., Ellinor, P.T., Ezekowitz, M.D., Field, M.E., et al.: 2014 AHA/ACC/HRS guideline for the management of patients with atrial fibrillation: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines and the Heart Rhythm Society. Journal of the American College of Cardiology 64(21), e1–e76 (2014)
- [9] Keskar, N.S., Mudigere, D., Nocedal, J., Smelyanskiy, M., Tang, P.T.P.: On large-batch training for deep learning: Generalization gap and sharp minima. arXiv preprint arXiv:1609.04836 (2016)
- [10] Lee, H.Y., Yang, P.S., Kim, T.H., Uhm, J.S., Pak, H.N., Lee, M.H., Joung, B.: Atrial fibrillation and the risk of myocardial infarction: a nation-wide propensity-matched study. Scientific reports 7(1), 12716 (2017)
- [11] Lee, J., Oh, J., Shin, I., Kim, Y.s., Sohn, D.K., Kim, T.s., Kweon, I.S.: Moving from 2D to 3D: volumetric medical image classification for rectal cancer staging. In: MICCAI. pp. 780–790. Springer (2022)
- [12] Li, Q., Bai, H., Chen, Y., Sun, Q., Liu, L., Zhou, S., Wang, G., Liang, C., Li, Z.C.: A fully-automatic multiparametric radiomics model: towards reproducible and prognostic imaging signature for prediction of overall survival in glioblastoma multiforme. Scientific reports 7(1), 14331 (2017)
- [13] Pastori, D., Pignatelli, P., Angelico, F., Farcomeni, A., Del Ben, M., Vicario, T., Bucci, T., Raparelli, V., Cangemi, R., Tanzilli, G., et al.: Incidence of myocardial infarction and vascular death in elderly patients with atrial fibrillation taking anticoagulants: relation to atherosclerotic risk factors. Chest 147(6), 1644–1650 (2015)
- [14] Saeed, N., Sobirov, I., Al Majzoub, R., Yaqub, M.: TMSS: An end-to-end transformer-based multimodal network for segmentation and survival prediction. In: MICCAI. pp. 319–329. Springer (2022)
- [15] Shamloo, A.S., Dagres, N., Dinov, B., Sommer, P., Husser-Bollmann, D., Bollmann, A., Hindricks, G., Arya, A.: Is epicardial fat tissue associated with atrial fibrillation recurrence after ablation? a systematic review and meta-analysis. IJC Heart & Vasculature 22, 132–138 (2019)
- [16] Sun, Q., Lin, X., Zhao, Y., Li, L., Yan, K., Liang, D., Sun, D., Li, Z.C.: Deep learning vs. radiomics for predicting axillary lymph node metastasis of breast cancer using ultrasound images: don’t forget the peritumoral region. Frontiers in oncology 10, 53 (2020)
- [17] Truhn, D., Schrading, S., Haarburger, C., Schneider, H., Merhof, D., Kuhl, C.: Radiomic versus convolutional neural networks analysis for classification of contrast-enhancing lesions at multiparametric breast MRI. Radiology 290(2), 290–297 (2019)
- [18] Van Griethuysen, J.J., Fedorov, A., Parmar, C., Hosny, A., Aucoin, N., Narayan, V., Beets-Tan, R.G., Fillion-Robin, J.C., Pieper, S., Aerts, H.J.: Computational radiomics system to decode the radiographic phenotype. Cancer research 77(21), e104–e107 (2017)
- [19] Wang, S., Dong, D., Li, L., Li, H., Bai, Y., Hu, Y., Huang, Y., Yu, X., Liu, S., Qiu, X., et al.: A deep learning radiomics model to identify poor outcome in COVID-19 patients with underlying health conditions: A multicenter study. IEEE Journal of Biomedical and Health Informatics 25(7), 2353–2362 (2021)
- [20] Woo, S., Park, J., Lee, J.Y., Kweon, I.S.: CBAM: Convolutional block attention module. In: Proceedings of the European conference on computer vision (ECCV). pp. 3–19 (2018)
- [21] Wu, Z., Xiong, Y., Yu, S.X., Lin, D.: Unsupervised feature learning via non-parametric instance discrimination. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 3733–3742 (2018)
- [22] Yang, M., Cao, Q., Xu, Z., Ge, Y., Li, S., Yan, F., Yang, W.: Development and validation of a machine learning-based radiomics model on cardiac computed tomography of epicardial adipose tissue in predicting characteristics and recurrence of atrial fibrillation. Frontiers in Cardiovascular Medicine 9 (2022)
- [23] Zhang, X., Zhang, Y., Zhang, G., Qiu, X., Tan, W., Yin, X., Liao, L.: Deep learning with radiomics for disease diagnosis and treatment: challenges and potential. Frontiers in Oncology 12 (2022)
- [24] Zhao, Z., Yang, G.: Unsupervised contrastive learning of radiomics and deep features for label-efficient tumor classification. In: MICCAI. pp. 252–261. Springer (2021)
- [25] Zwanenburg, A., Vallières, M., Abdalah, M.A., Aerts, H.J., Andrearczyk, V., Apte, A., Ashrafinia, S., Bakas, S., Beukinga, R.J., Boellaard, R., et al.: The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology 295(2), 328–338 (2020)