
Pengbo Liu, Hu Han, Heqin Zhu, Yinhao Li, Feng Gu, Jun Li, Li Xiao, S. Kevin Zhou: Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
E-mail: s.kevin.zhou@gmail.com
Honghu Xiao, Chunpeng Zhao, Xinbao Wu: Beijing Jishuitan Hospital, Beijing, China
Yuanqi Du: George Mason University, Virginia, USA
Feng Gu: Beijing Electronic Science and Technology Institute, Beijing, China

Deep Learning to Segment Pelvic Bones: Large-scale CT Datasets and Baseline Models

Pengbo Liu · Hu Han · Yuanqi Du · Heqin Zhu · Yinhao Li · Feng Gu · Honghu Xiao · Jun Li · Chunpeng Zhao · Li Xiao · Xinbao Wu · S. Kevin Zhou
(Received: date / Accepted: date)
Abstract

Purpose: Pelvic bone segmentation in CT has always been an essential step in clinical diagnosis and surgery planning of pelvic bone diseases. Existing methods for pelvic bone segmentation are either hand-crafted or semi-automatic and achieve limited accuracy when dealing with image appearance variations due to multi-site domain shift, the presence of contrasted vessels, coprolith and chyme, bone fractures, low dose, metal artifacts, etc. Due to the lack of a large-scale pelvic CT dataset with annotations, deep learning methods have not been fully explored.

Methods: In this paper, we aim to bridge the data gap by curating a large pelvic CT dataset pooled from multiple sources, including 1,184 CT volumes with a variety of appearance variations. Then we propose, for the first time to the best of our knowledge, to learn a deep multi-class network for segmenting the lumbar spine, sacrum, left hip, and right hip from multi-domain images simultaneously, so as to obtain more effective and robust feature representations. Finally, we introduce a post-processor based on the signed distance function (SDF).

Results: Extensive experiments on our dataset demonstrate the effectiveness of our automatic method, achieving an average Dice of 0.987 for metal-free volumes. The SDF post-processor yields a decrease of 15.1% in Hausdorff distance compared with the traditional post-processor.

Conclusion: We believe this large-scale dataset will promote the development of the whole community. We open source the images, annotations, codes, and trained baseline models at https://github.com/ICT-MIRACLE-lab/CTPelvic1K.

Keywords:
CT dataset · Pelvic segmentation · Deep learning · SDF post-processing

1 Introduction

The pelvis is an important structure connecting the spine and lower limbs and plays a vital role in maintaining the stability of the body and protecting the internal organs of the abdomen. Abnormalities of the pelvis, like hip dysplasia HipDysplasia and pelvic fractures 2018pelvicFracture , can have a serious impact on physical health. For example, pelvic fractures are among the most severe and life-threatening bone injuries; they can wound other organs at the fracture site, and the mortality rate can reach 45% in the most severe situation, open pelvic fractures 2020openpelvic . Medical imaging zhou2021review ; zhou2019handbook plays an important role in the whole process of diagnosis and treatment of patients with pelvic injuries. Compared with X-ray images, CT preserves the actual anatomic structure, including depth information, providing more details about the damaged site to surgeons, so it is often used for 3D reconstruction in follow-up surgical planning and the evaluation of postoperative effects. In these applications, accurate pelvic bone segmentation is crucial for assessing the severity of pelvic injuries and helping surgeons make correct judgments and choose appropriate surgical approaches. In the past, surgeons segmented the pelvis manually from CT using software like Mimics (https://en.wikipedia.org/wiki/Mimics), which is time-consuming and non-reproducible. To address these clinical needs, we here present an automatic algorithm that can accurately and quickly segment pelvic bones from CT.

Refer to caption
Figure 1: Pelvic CT image examples with various conditions.

Existing methods for pelvic bone segmentation from CT mostly use simple thresholding gaussianThreshold_a6 , region growing wavelet_a7 , and handcrafted models, including deformable models deformable_a8 ; activeContour_a9 , statistical shape models shapeModel_a10 ; 3DshapeModel_sta1 , watershed keyframes_a11 and others mriPelvicStruc_tra1 ; knowledgeOrganSpecificStrategies_tra2 ; femurSeg_tra3 ; femurXray_tra4 ; RFregressionVoting_rfr1 ; RF&HsparseShape_rfr2 . These methods focus on local gray-level information and have limited accuracy because of the density difference between cortical and trabecular bone; moreover, trabecular bone resembles the surrounding tissues in texture and intensity. Bone fractures, if present, further lead to weak edges. Recently, deep learning-based methods FCN_a13 ; unet_a12 ; FabianNNUnet_a4 ; zhao2017pyramid_dl1 ; atrousSeparableConvolution_dl2 ; 3dUNet_dl3 ; pancreaticCyst_dl4 ; DenseVNet_dl5 have achieved great success in image segmentation; however, their effectiveness for CT pelvic bone segmentation is not fully known. Although there are some datasets related to the pelvic bones lee2012virtual_pelvicBoneDataset1 ; wu2016segmentation_pelvicBoneDataset2 ; hemke2020deep_pelvicBoneDataset3 ; chandar2016segmentation_pelvicBoneDataset4 , only a few of them are open-sourced, and those are small (fewer than 5 images or 200 slices), far smaller than datasets for other organs kits19_url3 ; MSD . Although hemke2020deep_pelvicBoneDataset3 conducted deep learning experiments, the reported accuracy was limited (Dice = 0.92) on a dataset of only 200 CT slices. For a robust deep learning method, it is essential to have a comprehensive dataset that covers as many real scenarios as possible. In this paper, we bridge this gap by curating a large-scale CT dataset and exploring the use of deep learning on this task, which marks, to the best of our knowledge, the first real attempt in this area with greater statistical significance and reference value.

To build a comprehensive dataset, we have to deal with diverse image appearance variations due to differences in imaging resolution and field-of-view (FOV), domain shift arising from different sites, the presence of contrasted vessels, coprolith and chyme, bone fractures, low dose, metal artifacts, etc. Fig. 1 gives some examples of these conditions. Among the above-mentioned appearance variations, metal artifacts are the most difficult to handle. Further, we aim at a multi-class segmentation problem that separates the pelvis into multiple bones, including the lumbar spine, sacrum, left hip, and right hip, instead of simply segmenting the whole pelvis out of CT.

The contributions of this paper are summarized as follows:

  • A pelvic CT dataset pooled from multiple domains and different manufacturers, comprising 1,184 CT volumes (over 320K CT slices) of diverse appearance variations (including 75 CTs with metal artifacts). Their multi-bone labels are carefully annotated by experts. We open source it to benefit the whole community;

  • Learning a deep multi-class segmentation network FabianNNUnet_a4 to obtain more effective representations for joint lumbar spine, sacrum, left hip, and right hip segmentation from multi-domain labeled images, thereby yielding desired accuracy and robustness;

  • A fully automatic analysis pipeline that achieves high accuracy, efficiency, and robustness, thereby enabling its potential use in clinical practices.

2 Our Dataset

Table 1: Overview of our large-scale pelvic CT dataset. ‘#’ denotes the number of 3D volumes. ‘Tr/Val/Ts’ denotes the training/validation/testing split. Ticks [✓] indicate sub-datasets for which the manufacturer [M] information of the CT acquisition equipment is accessible. Due to the difficulty of labeling CLINIC-metal, it is excluded from our supervised training phase.

Dataset name [M]    #      Mean spacing (mm)    Mean size         # of Tr/Val/Ts    Source and Year
ABDOMEN             35     (0.76, 0.76, 3.80)   (512, 512, 73)    21/7/7            Public, 2015
COLONOG [✓]         731    (0.75, 0.75, 0.81)   (512, 512, 323)   440/146/145       Public, 2008
MSD_T10             155    (0.77, 0.77, 4.55)   (512, 512, 63)    93/31/31          Public, 2019
KITS19              44     (0.82, 0.82, 1.25)   (512, 512, 240)   26/9/9            Public, 2019
CERVIX              41     (1.02, 1.02, 2.50)   (512, 512, 102)   24/8/9            Public, 2015
CLINIC [✓]          103    (0.85, 0.85, 0.80)   (512, 512, 345)   61/21/21          Collected, 2020
CLINIC-metal [✓]    75     (0.83, 0.83, 0.80)   (512, 512, 334)   0(61)/0/14        Collected, 2020
Our Datasets        1,184  (0.78, 0.78, 1.46)   (512, 512, 273)   665(61)/222/236   -

2.1 Data Collection

To build a comprehensive pelvic CT dataset that replicates practical appearance variations, we curate a large dataset of pelvic CT images from seven sources, two of which come from a clinic and five from existing CT datasets COLONOGRAPHY ; kits19_url3 ; MSD ; matlas . An overview of our large dataset is shown in Table 1. These seven sub-datasets are curated separately from different sites and sources with different characteristics often encountered in the clinic. From these sources, we exclude some cases of very low quality or without the pelvic region, and we remove the unrelated areas outside the pelvis in our current dataset.

Among them, the raw data of COLONOG, CLINIC, and CLINIC-metal are stored in DICOM format, so additional information, such as the scanner manufacturer, can be accessed. More details about our dataset are given in Online Resource 1 (https://drive.google.com/file/d/115kLXfdSHS9eWxQmxhMmZJBRiSfI8_4_/view).

We reformat all DICOM images to NIfTI to simplify data processing and de-identify images, meeting the institutional review board (IRB) policies of contributing sites. All existing sub-datasets are under Creative Commons license CC-BY-NC-SA at least and we will keep the license unchanged. For CLINIC and CLINIC-metal sub-datasets, we will open-source them under Creative Commons license CC-BY-NC-SA 4.0.
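As a concrete illustration of this reformatting step, the sketch below converts one DICOM series to NIfTI with SimpleITK; the paths are illustrative, and relying on the dropped DICOM header for de-identification is our simplifying assumption, not a description of the exact procedure used.

import SimpleITK as sitk

def dicom_dir_to_nifti(dicom_dir: str, out_path: str) -> None:
    reader = sitk.ImageSeriesReader()
    # Collect the sorted slice files of the DICOM series in this directory.
    reader.SetFileNames(reader.GetGDCMSeriesFileNames(dicom_dir))
    volume = reader.Execute()            # 3D volume, geometry preserved
    # Writing NIfTI keeps spacing/orientation but drops the DICOM header,
    # which also removes patient-identifying metadata.
    sitk.WriteImage(volume, out_path)    # e.g. "CLINIC_0001.nii.gz"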

Refer to caption
Figure 2: The designed annotation pipeline based on an AID (Annotation by Iterative Deep Learning) strategy. In Step I, two senior experts first manually annotate 40 cases of data as the initial database. In Step II, we train a deep network based on the human annotated database and use it to predict new data. In Step III, initial annotations from the deep network are checked and modified by human annotators. Step II and Step III are repeated iteratively to refine a deep network to a more and more powerful ‘annotator’. This deep network ‘annotator’ also unifies the annotation standards of different human annotators.

2.2 Data Annotation

Given the scale of thousands of cases in our dataset and the fact that annotation itself is a subjective and time-consuming task, we introduce a strategy of Annotation by Iterative Deep Learning (AID) AIL to speed up our annotation process. In the AID workflow, we first train a deep network with a few precisely annotated data. The deep network is then used to automatically annotate more data, followed by revision from human experts. The human-corrected annotations and their corresponding images are added to the training set to retrain a more powerful deep network. These steps are repeated iteratively until we finish our annotation task. In the end, only minimal modification by human experts is needed.
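At a high level, Steps II and III form the loop sketched below, where train, predict, and refine are placeholders for the network training run, network inference, and human revision; this is a schematic of the workflow, not code from our repository.

from typing import Callable, List

def aid_annotation(initial: List, pool: List, train: Callable,
                   predict: Callable, refine: Callable, batch_size: int = 100):
    database = list(initial)                # Step I: 40 expert-annotated cases
    model = None
    while pool:
        model = train(database)             # Step II: retrain on current database
        batch, pool = pool[:batch_size], pool[batch_size:]
        drafts = [predict(model, case) for case in batch]
        database += refine(batch, drafts)   # Step III: human revision and checks
    return database, model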

The annotation pipeline is shown in Fig. 2. In Step I, two senior experts are invited to precisely annotate, pixel-wise, 40 cases of the CLINIC sub-dataset as the initial database, starting from the results of a simple thresholding method and using the ITK-SNAP (Philadelphia, PA) software. All annotations are performed in the transverse plane; the sagittal and coronal planes are used to assist the judgment in the transverse plane. The reason for starting from the CLINIC sub-dataset is that the cancellous bone and surrounding tissues exhibit similar appearances at the fracture site, which requires more prior-knowledge guidance from doctors. In Step II, we train a deep network on the updated database and make predictions on 100 new cases selected randomly at a time. In Step III, junior annotators refine the labels based on the prediction results, each being responsible for a portion of the 100 new cases. A coordinator checks the quality of the refinements by all junior annotators. For easy cases, the annotation process ends at this stage; for hard cases, senior experts are invited to make more precise annotations. Steps II and III are repeated until we finish annotating all images in our dataset. Finally, we conduct another round of scrutiny for outliers and mistakes and make necessary corrections to ensure the final quality of our dataset. In Fig. 2, the ‘Junior annotators’ are graduate students in the field of medical image analysis, the ‘Coordinator’ is a medical image analysis practitioner with many years of experience, and the ‘Senior experts’ are cooperating doctors in the partner hospital, one of the best orthopedic hospitals in our country.

In total, we have annotations for 1,109 metal-free CTs and 14 metal-affected CTs. The remaining 61 metal-affected CTs are left unannotated and are planned for use in unsupervised learning.

3 Segmentation Methodology

The overall pipeline of our deep learning approach for segmenting pelvic bones is illustrated in Fig. 3. The input is a 3D CT volume. (i) First, the input is sent to our segmentation module, a plug-and-play (PnP) module that can be replaced at will. (ii) After segmentation, the multi-class 3D prediction is sent to an SDF post-processor, which removes false predictions and outputs the final multi-bone segmentation result.

Refer to caption
Figure 3: Overview of our pelvic bone segmentation system, which learns effective and robust representations from multi-domain CT images. The 3D U-Net cascade is used here to exploit more spatial information in 3D CT images. SDF is introduced into our post-processor to add a distance constraint besides the size constraint used in the traditional MCR-based method.

3.1 Segmentation Module

Based on our large-scale dataset collected from multiple sources together with annotations, we use a fully supervised method to train a deep network to learn an effective representation of the pelvic bones. The deep learning framework we choose here is the 3D U-Net cascade version of nnU-Net FabianNNUnet_a4 , a robust state-of-the-art deep learning-based medical image segmentation method. The 3D U-Net cascade contains two 3D U-Nets: the first is trained on downsampled images (stage 1 in Fig. 3), and the second on full-resolution images (stage 2 in Fig. 3). A 3D network can better exploit the useful 3D spatial information in 3D CT images. Training on downsampled images first enlarges the size of the patches relative to the image, enabling the 3D network to learn more contextual information; training on full-resolution images then refines the segmentation predicted by the former U-Net.
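nnU-Net implements this cascade internally; the sketch below only illustrates the two-stage data flow of Fig. 3, assuming coarse_unet and fine_unet are the two trained 3D U-Nets and using a nominal downsampling factor of 0.5 (the actual factor is chosen by nnU-Net's self-configuration).

import numpy as np
from scipy.ndimage import zoom

def cascade_predict(volume: np.ndarray, coarse_unet, fine_unet,
                    scale: float = 0.5) -> np.ndarray:
    # Stage 1: segment a downsampled volume to capture global context.
    low_res = zoom(volume, scale, order=1)
    coarse_seg = coarse_unet(low_res)
    # Upsample the coarse labels back to full resolution (nearest neighbour).
    factors = np.array(volume.shape) / np.array(coarse_seg.shape)
    coarse_up = zoom(coarse_seg, factors, order=0)
    # Stage 2: refine at full resolution, conditioned on the coarse prediction
    # (nnU-Net concatenates it, one-hot encoded, with the CT intensities).
    return fine_unet(np.stack([volume, coarse_up]))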

3.2 SDF Post Processor

Post-processing is useful for a stable system in clinical use, as it prevents mispredictions in complex scenes. In segmentation tasks, current systems usually decide whether to remove outliers according to the size of each connected region. However, in pelvic fracture scenes, broken bones may also be removed as outliers. To this end, we introduce SDF sdf_url1 filtering as our post-processing module to add a distance constraint besides the size constraint. We calculate the SDF based on the maximum connected region (MCR) of each anatomical structure in the prediction result, obtaining a 3D distance map that increases from the bone border toward the image boundary; this map helps determine whether an ‘outlier prediction’, as defined by the traditional MCR-based method, should be removed. A sketch of this rule follows.
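The following minimal sketch illustrates the filtering rule under two assumptions: the per-class prediction is a binary mask, and the SDF outside the MCR is approximated by an unsigned Euclidean distance transform. The threshold tau (in voxels) corresponds to the SDF(5)/SDF(15)/SDF(35)/SDF(55) settings evaluated in Table 2; the function and variable names are illustrative, not our released code.

import numpy as np
from scipy import ndimage

def sdf_filter(pred_mask: np.ndarray, tau: float = 35.0) -> np.ndarray:
    """Keep connected components lying within tau voxels of the maximum
    connected region (MCR), instead of keeping the MCR alone."""
    labeled, n = ndimage.label(pred_mask)
    if n <= 1:
        return pred_mask.astype(bool)
    # Identify the maximum connected region by voxel count.
    sizes = ndimage.sum(pred_mask, labeled, index=range(1, n + 1))
    mcr_label = int(np.argmax(sizes)) + 1
    mcr = labeled == mcr_label
    # Distance of every voxel to the MCR surface (zero inside the MCR,
    # increasing toward the image boundary).
    dist = ndimage.distance_transform_edt(~mcr)
    keep = np.zeros_like(mcr)
    for lab in range(1, n + 1):
        comp = labeled == lab
        # Keep the MCR itself and any fragment close to it (e.g. broken bones).
        if lab == mcr_label or dist[comp].min() <= tau:
            keep |= comp
    return keep

Setting tau to 35 corresponds to the SDF(35) configuration, which gives the best average HD in Table 2 (b).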

4 Experiments

4.1 Implementation Details

We implement our method based on the open source code of nnU-Net (https://github.com/mic-dkfz/nnunet) FabianNNUnet_a4 . We also used MONAI (https://monai.io/) during algorithm development. For details, please refer to Online Resource 1. For our metal-free dataset, we randomly select 3/5, 1/5, and 1/5 of the cases in each sub-dataset as the training, validation, and testing sets, respectively, and keep this data partition unchanged in both the all-dataset and sub-dataset experiments; a sketch of the split follows.
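As a small illustration, the per-sub-dataset 3/5 : 1/5 : 1/5 split can be fixed with a seeded shuffle so that the partition stays unchanged across experiments; the seed value below is an assumption, not the one used in our experiments.

import random

def split_cases(case_ids, seed=42):
    ids = sorted(case_ids)
    random.Random(seed).shuffle(ids)
    n_tr, n_val = round(len(ids) * 3 / 5), round(len(ids) / 5)
    # Training / validation / testing subsets, in that order.
    return ids[:n_tr], ids[n_tr:n_tr + n_val], ids[n_tr + n_val:]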

Table 2: (a) DC and HD results for different models tested on the ‘ALL’ dataset. (b) Effect of different post-processing methods on the ‘ALL’ dataset. ‘ALL’ refers to the six metal-free sub-datasets. ‘Average’ is the mean DC/HD over the four anatomical structures. ‘Whole’ treats the sacrum, left hip, right hip, and lumbar spine as a single bone. The top three numbers in each part are marked in bold, red and blue.

Exp   Test/Post-proc.   Model                Whole        Sacrum       Left hip     Right hip    Lumbar spine   Average
      (Dataset)         (Dataset)            Dice/HD      Dice/HD      Dice/HD      Dice/HD      Dice/HD        Dice/HD
(a)   ALL               Φ_ALL(2.5D)          .988/9.28    .979/9.34    .990/3.58    .990/3.44    .978/8.32      .984/6.17
      ALL               Φ_ALL(3D)            .988/11.38   .984/8.13    .988/4.99    .990/4.26    .982/7.80      .986/6.30
      ALL               Φ_ALL(3D_cascade)    .989/10.23   .984/7.24    .989/4.24    .991/3.03    .984/7.49      .987/5.50
(b)   w/o Post          Φ_ALL                .988/36.27   .984/38.36   .988/35.43   .991/28.70   .983/11.25     .987/28.43
      MCR               Φ_ALL                .988/12.93   .984/7.50    .989/4.24    .991/3.72    .978/10.46     .986/6.48
      SDF(5)            Φ_ALL                .988/12.02   .984/7.24    .989/4.24    .991/3.51    .980/9.54      .986/6.13
      SDF(15)           Φ_ALL                .989/10.40   .984/7.24    .989/4.24    .991/3.35    .984/7.61      .987/5.61
      SDF(35)           Φ_ALL                .989/10.23   .984/7.24    .989/4.24    .991/3.03    .984/7.49      .987/5.50
      SDF(55)           Φ_ALL                .989/10.78   .984/7.24    .989/4.52    .991/3.38    .984/7.49      .987/5.66

4.2 Results and Discussion

4.2.1 Segmentation Module

To prove that learning from our large-scale pelvic bones CT dataset is helpful to improve the robustness of our segmentation system, we conduct a series of experiments in different aspects.

Performance of baseline models. Firstly, we test the performance of models of different dimensions on our entire dataset. Exp (a) in Table 2 shows the quantitative results. Φ_ALL denotes a deep network model trained on the ‘ALL’ dataset. Following the conventions in most literature, we use the Dice coefficient (DC) and Hausdorff distance (HD) as the metrics for quantitative evaluation. All results are tested on our testing set. As discussed in Sect. 3.1, Φ_ALL(3D_cascade) shows the best performance, achieving an average DC of 0.987 and an HD of 5.50 voxels, which means the 3D U-Net cascade learns the semantic features of the pelvic anatomy better than the 2D/3D U-Net. As the following experiments are all trained with the 3D U-Net cascade, the subscript (3D_cascade) of Φ_ALL(3D_cascade) is omitted for notational clarity.
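For reference, a generic implementation of both metrics on binary masks is sketched below, with HD computed in voxels via distance transforms, matching the units in Table 2; this is an illustration, not the exact evaluation script used in our experiments.

import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    pred, gt = pred.astype(bool), gt.astype(bool)
    return 2.0 * np.logical_and(pred, gt).sum() / (pred.sum() + gt.sum())

def hausdorff(pred: np.ndarray, gt: np.ndarray) -> float:
    pred, gt = pred.astype(bool), gt.astype(bool)
    # Surface voxels of each mask.
    pred_s = pred & ~binary_erosion(pred)
    gt_s = gt & ~binary_erosion(gt)
    # HD = max over either surface of the distance to the other surface.
    d_to_gt = distance_transform_edt(~gt_s)
    d_to_pred = distance_transform_edt(~pred_s)
    return max(d_to_gt[pred_s].max(), d_to_pred[gt_s].max())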

Table 3: The ‘Average’ DC and HD results for different models tested on different datasets. Please refer to Online Resource 1 for details. The top three numbers in each part are marked in bold, red and blue.

Average Dice/HD on each test dataset:

Model              ALL           ABDOMEN       COLONOG       MSD_T10       KITS19        CERVIX        CLINIC
Φ_ABDOMEN          .604/92.81    .979/5.84     .577/104.04   .980/3.74     .360/158.02   .305/92.07    .342/148.12
Φ_COLONOG          .985/5.84     .975/3.29     .989/5.65     .979/4.41     .982/8.41     .969/5.17     .974/9.24
Φ_MSD_T10          .534/96.14    .979/2.97     .501/106.86   .987/3.36     .245/170.39   .085/111.72   .261/151.61
Φ_KITS19           .704/68.29    .255/120.31   .746/70.75    .267/121.57   .986/5.65     .973/5.14     .977/9.25
Φ_CERVIX           .973/14.75    .969/4.30     .974/18.74    .967/6.55     .979/7.78     .973/4.49     .974/10.17
Φ_CLINIC           .692/69.89    .275/117.09   .728/71.93    .254/126.66   .985/11.16    .968/9.69     .983/7.27
Φ_ALL              .987/5.50     .979/2.88     .989/5.87     .987/3.11     .985/5.77     .972/5.01     .982/7.42
Φ_ex sub-dataset   -             .978/2.77     .986/7.37     .984/3.37     .982/8.33     .975/4.92     .975/8.87

Refer to caption
(a) mean DC in Table 3
Refer to caption
(b) mean HD in Table 3
Figure 4: Heat map of DC & HD results in Table 3. The vertical axis represents different sub-datasets and the horizontal axis represents different models. In order to show the normal values more clearly, we clip some outliers to the boundary value, i.e., 0.95 in DC and 30 in HD. The values out of range are marked in the grid. The cross in the lower right corner indicates that there is no corresponding experiment.

Generalization across sub-datasets. Secondly, we train six deep networks, one per single sub-dataset (Φ_ABDOMEN, etc.), and test each of them on every sub-dataset. Quantitative and qualitative results are shown in Table 3 and Fig. 5, respectively. We also calculate the performance of Φ_ALL on each sub-dataset. For a fair comparison, cross-testing of the sub-dataset networks is also conducted on each sub-dataset’s testing set. We observe that the evaluation metrics of model Φ_ALL are generally better than those of the models trained on a single sub-dataset. The models trained on a single sub-dataset rarely perform consistently well in other domains, except Φ_COLONOG, whose training set originally contains the largest amount of data from various sources. This observation implies that the domain gap problem does exist and that collecting data directly from multiple sources is an effective solution. More intuitively, we show the ‘Average’ values as heat maps in Fig. 4.

Furthermore, we implement leave-one-out cross-validation over these six metal-free sub-datasets to verify the generalization ability of this solution. The models are denoted Φ_ex ABDOMEN, etc. The results of Φ_ex COLONOG demonstrate that training with multi-source data achieves good results on data never seen before: although the models trained separately on the other five sub-datasets cannot achieve good results on COLONOG, aggregating these five sub-datasets yields a result comparable to Φ_ALL while using only one third of the data. More data from multiple sources can be seen as additional constraints on model learning, prompting the network to learn better feature representations of the pelvic bones and the background. Fig. 5 illustrates the above discussion with qualitative results.

Others. For more experimental results and discussions, e.g., ‘Generalization across manufacturers’ and ‘Limitations of the dataset’, please refer to Online Resource 1.

Refer to caption
Figure 5: Visualization of segmentation results from six datasets tested on different models. Among them, the white, green, blue and yellow parts of the segmentation results represent the sacrum, left hip bone, right hip bone and lumbar spine, respectively.

4.2.2 SDF post-processor

Exp (b) in Table 2 shows the effect of the post-processing module. The SDF post-processor yields decreases of 80.7% and 15.1% in HD compared with no post-processing and the MCR post-processor, respectively. For details, please refer to Online Resource 1. The visual effects on two cases are displayed in Fig. 6. Large fragments near the anatomical structure are kept by SDF post-processing but removed by the MCR method.

Refer to caption
Figure 6: Comparison between post-processing methods: traditional MCR and the proposed SDF filtering.

5 Conclusion

To benefit the pelvic surgery and diagnosis community, we curate and open source a large-scale pelvic CT dataset pooled from multiple domains, including 1,184 CT volumes (over 320K CT slices) of various appearance variations, and present a deep learning-based pelvic segmentation system, which, to the best of our knowledge, marks the first attempt in the literature. We train a multi-class network for segmentation of the lumbar spine, sacrum, left hip, and right hip using multi-domain images to obtain more effective and robust features. SDF filtering further improves the robustness of the system. This system lays a solid foundation for our future work. We plan to test the significance of our system in real clinical practice and to explore more options based on our dataset, e.g., devising a module for metal-affected CTs and a domain-independent pelvic bone segmentation algorithm.

Declarations

Funding

This research was supported in part by the Youth Innovation Promotion Association CAS (grant 2018135).

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Availability of data and material

Please refer to https://github.com/ICT-MIRACLE-lab/CTPelvic1K.

Code availability

Please refer to https://github.com/ICT-MIRACLE-lab/CTPelvic1K.

Ethical approval

We have obtained approval from the Ethics Committee of the cooperating clinical hospital.

Informed consent

Not applicable.

References

  • (1) Aguirre-Ramos, H., Avina-Cervantes, J.G., Cruz-Aceves, I.: Automatic bone segmentation by a gaussian modeled threshold. In: AIP Conference Proceedings, vol. 1747, p. 090009. AIP Publishing LLC (2016)
  • (2) Barratt, R.C., Bernard, J., Mundy, A.R., Greenwell, T.J.: Pelvic fracture urethral injury in males—mechanisms of injury, management options and outcomes. Translational Andrology and Urology 7(Suppl 1), S29 (2018)
  • (3) Landman, B., Xu, Z., Iglesias, J.E., Styner, M., Langerak, T.R., Klein, A.: 2015 MICCAI multi-atlas labeling beyond the cranial vault – workshop and challenge (2015). DOI 10.7303/syn3193805
  • (4) Chandar, K.P., Satyasavithri, T.: Segmentation and 3d visualization of pelvic bone from ct scan images. In: IACC, pp. 430–433. IEEE (2016)
  • (5) Chen, C., Zheng, G.: Fully automatic segmentation of AP pelvis x-rays via random forest regression and hierarchical sparse shape composition. In: Intl. Conf. on Computer Analysis of Images and Patterns, pp. 335–343. Springer (2013)
  • (6) Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: ECCV, pp. 801–818 (2018)
  • (7) Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O.: 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: MICCAI, pp. 424–432. Springer (2016)
  • (8) Ding, F., Leow, W.K., Howe, T.S.: Automatic segmentation of femur bones in anterior-posterior pelvis x-ray images. In: International Conference on Computer Analysis of Images and Patterns, pp. 205–212. Springer (2007)
  • (9) Gibson, E., Giganti, F., Hu, Y., Bonmati, E., Bandula, S., Gurusamy, K., Davidson, B., Pereira, S.P., Clarkson, M.J., Barratt, D.C.: Automatic multi-organ segmentation on abdominal CT with dense v-networks. IEEE Transactions on Medical Imaging 37(8), 1822–1834 (2018)
  • (10) Guo, Q., Zhang, L., Zhou, S., Zhang, Z., Liu, H., Zhang, L., Talmy, T., Li, Y.: Clinical features and risk factors for mortality in patients with open pelvic fracture: A retrospective study of 46 cases. Journal of Orthopaedic Surgery 28(2), 2309499020939830 (2020)
  • (11) Haas, B., Coradi, T., Scholz, M., Kunz, P., Huber, M., Oppitz, U., Andre, L., Lengkeek, V., Huyskens, D., Van Esch, A., Reddick, R.: Automatic segmentation of thoracic and pelvic CT images for radiotherapy planning using implicit anatomic knowledge and organ-specific segmentation strategies. Physics in Medicine & Biology 53(6), 1751 (2008)
  • (12) Heller, N., Sathianathen, N., Kalapara, A., Walczak, E., Moore, K., Kaluzniak, H., Rosenberg, J., Blake, P., Rengel, Z., Oestreich, M., Dean, J., Tradewell, M., Shah, A., Tejpaul, R., Edgerton, Z., Peterson, M., Raza, S., Regmi, S., Papanikolopoulos, N., Weight, C.: The kits19 challenge data: 300 kidney tumor cases with clinical context, CT semantic segmentations, and surgical outcomes. arXiv:1904.00445 (2019)
  • (13) Hemke, R., Buckless, C.G., Tsao, A., Wang, B., Torriani, M.: Deep learning for automated segmentation of pelvic muscles, fat, and bone from CT studies for body composition assessment. Skeletal Radiology 49(3), 387–395 (2020)
  • (14) Isensee, F., Jäger, P.F., Kohl, S.A., Petersen, J., Maier-Hein, K.H.: Automated design of deep learning methods for biomedical image segmentation. arXiv preprint arXiv:1904.08128 (2019)
  • (15) Johnson, C.D., Chen, M.H., Toledano, A.Y., Heiken, J.P., Dachman, A., Kuo, M.D., Menias, C.O., Siewert, B., Cheema, J.I., Obregon, R.G., Fidler, J.L., Zimmerman, P., Horton, K.M., Coakley, K., Iyer, R.B., Hara, A.K., Halvorsen, R.A., Casola, G., Yee, J., Herman, B.A., Burgart, L.J., Limburg, P.J.: Accuracy of CT colonography for detection of large adenomas and cancers. New England Journal of Medicine 359(12), 1207–1217 (2008)
  • (16) Kainmueller, D., Lamecker, H., Zachow, S., Hege, H.C.: Coupling deformable models for multi-object segmentation. In: ISBI, pp. 69–78. Springer (2008)
  • (17) Kotlarsky, P., Haber, R., Bialik, V., Eidelman, M.: Developmental dysplasia of the hip: What has changed in the last 20 years? World Journal of Orthopedics 6(11), 886 (2015)
  • (18) Lamecker, H., Seebass, M., Hege, H.C., Deuflhard, P.: A 3D statistical shape model of the pelvic bone for segmentation. pp. 1341–1351. International Society for Optics and Photonics (2004)
  • (19) Lee, P.Y., Lai, J.Y., Hu, Y.S., Huang, C.Y., Tsai, Y.C., Ueng, W.D.: Virtual 3D planning of pelvic fracture reduction and implant placement. Biomedical Engineering: Applications, Basis and Communications 24(03), 245–262 (2012)
  • (20) Lindner, C., Thiagarajah, S., Wilkinson, J.M., arcOGEN Consortium, Wallis, G.A., Cootes, T.F.: Accurate fully automatic femur segmentation in pelvic radiographs using regression voting. In: MICCAI, pp. 353–360. Springer (2012)
  • (21) Lindner, C., Thiagarajah, S., Wilkinson, J.M., Wallis, G.A., Cootes, T.F.: Fully automatic segmentation of the proximal femur using random forest regression voting. IEEE Transactions on Medical Imaging 32(8), 1462–1472 (2013)
  • (22) Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: CVPR, pp. 3431–3440 (2015)
  • (23) Pasquier, D., Lacornerie, T., Vermandel, M., Rousseau, J., Lartigau, E., Betrouni, N.: Automatic segmentation of pelvic structures from magnetic resonance images for prostate cancer radiotherapy. International Journal of Radiation Oncology* Biology* Physics 68(2), 592–600 (2007)
  • (24) Perera, S., Barnes, N., He, X., Izadi, S., Kohli, P., Glocker, B.: Motion segmentation of truncated signed distance function based volumetric surfaces. In: WACV, pp. 1046–1053. IEEE (2015)
  • (25) Philbrick, K.A., Weston, A.D., Akkus, Z., Kline, T.L., Korfiatis, P., Sakinis, T., Kostandy, P., Boonrod, A., Zeinoddini, A., Takahashi, N., Erickson, B.J.: Ril-contour: a medical imaging dataset annotation tool for and with deep learning. Journal of Digital Imaging 32(4), 571–581 (2019)
  • (26) Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: MICCAI, pp. 234–241. Springer (2015)
  • (27) Seim, H., Kainmueller, D., Heller, M., Lamecker, H., Zachow, S., Hege, H.C.: Automatic segmentation of the pelvic bones from CT data based on a statistical shape model. VCBM 8, 93–100 (2008)
  • (28) Simpson, A.L., Antonelli, M., Bakas, S., Bilello, M., Farahani, K., Van Ginneken, B., Kopp-Schneider, A., Landman, B.A., Litjens, G., Menze, B., Ronneberger, O., Summers, R.M., Bilic, P., Christ, P.F., Do, R.K., Gollub, M., Golia-Pernicka, J., Heckers, S.H., Jarnagin, W.R., McHugo, M.K., Napel, S., Vorontsov, E., Maier-Hein, L., Cardoso, M.J.: A large annotated medical image dataset for the development and evaluation of segmentation algorithms. arXiv:1902.09063 (2019)
  • (29) Truc, P.T., Lee, S., Kim, T.S.: A density distance augmented chan-vese active contour for CT bone segmentation. In: Intl. Conf. of the IEEE Engineering in Medicine and Biology Society, pp. 482–485. IEEE (2008)
  • (30) Vasilache, S., Ward, K., Cockrell, C., Ha, J., Najarian, K.: Unified wavelet and gaussian filtering for segmentation of CT images; application in segmentation of bone in pelvic CT images. BMC Medical Informatics and Decision Making 9(1), 1–8 (2009)
  • (31) Wu, J., Hargraves, R.H., Najarian, K., Belle, A., Ward, K.R.: Segmentation and fracture detection in CT images (2016). US Patent 9,480,439
  • (32) Yu, H., Wang, H., Shi, Y., Xu, K., Yu, X., Cao, Y.: The segmentation of bones in pelvic CT images based on extraction of key frames. BMC medical imaging 18(1), 18 (2018)
  • (33) Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: CVPR, pp. 2881–2890 (2017)
  • (34) Zhou, S.K., Greenspan, H., Davatzikos, C., Duncan, J.S., van Ginneken, B., Madabhushi, A., Prince, J.L., Rueckert, D., Summers, R.M.: A review of deep learning in medical imaging: Imaging traits, technology trends, case studies with progress highlights, and future promises. Proceedings of the IEEE (2021)
  • (35) Zhou, S.K., Rueckert, D., Fichtinger, G.: Handbook of Medical Image Computing and Computer Assisted Intervention. Academic Press (2019)
  • (36) Zhou, Y., Xie, L., Fishman, E.K., Yuille, A.L.: Deep supervision for pancreatic cyst segmentation in abdominal CT scans. In: MICCAI, pp. 222–230. Springer (2017)