Reliably obtaining a lumen segmentation and stenosis grading of the carotid bifurcation from computed tomography angiography (CTA) data is relevant in clinical practice. This evaluation framework provides a large-scale, standardized evaluation methodology and reference database for the quantitative evaluation of carotid bifurcation lumen segmentation and stenosis grading algorithms. Using this framework, different methods can be compared in an objective, standardized way.
Well-defined evaluation measures are presented, and a multi-site, multi-vendor database containing 56 carotid CTA datasets with a corresponding reference standard is described and made available. Several methods are provided to extract statistics from the evaluation results.
Using this framework is simple; just follow this recipe:
More details about the evaluation framework (data, reference standard, measures, scores and ranking) can be found below and in this document.
The CLS 2009 framework was tested during one of the challenges of the 3rd MICCAI Workshop in the series "3D Segmentation in the Clinic: a Grand Challenge III", which was held on 24 September 2009 at the 12th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI). Proceedings of this workshop can be found at the MIDAS Journal website. The framework is open for new submissions.
The framework can be used to evaluate methods that perform:
Each team can participate in either one of the tasks, or in both. This page briefly describes, in turn, the tasks to be performed, the data used, the manual annotation, the reference standard, and the evaluation criteria.
The Common Carotid Artery (CCA) and Internal Carotid Artery (ICA), see Fig. 1, are clinically the most relevant arteries of the Carotid Bifurcation. Therefore, the segmentation evaluation focuses on these two arteries. A small part of the External Carotid Artery (ECA) is also included, to prevent evaluation issues at the location where the ECA bifurcates from the ICA. Additionally, it allows us to include a complete bifurcation in the evaluation.
The goal of this category is to accurately segment the lumen of the Carotid Bifurcation in a Computed Tomography Angiography (CTA) dataset. There are two versions: a fully automated version, and a semi-automated version where three initial points are provided.
The region to be segmented is defined around the bifurcation slice, which we define as the first (caudal to cranial) slice where the lumen of the CCA appears as two separate lumens: the lumen of the ICA and the lumen of the ECA. The segmentation must contain the CCA, starting at least 20 mm caudal of the bifurcation slice, the ICA, up to at least 40 mm cranial of the bifurcation slice, and the ECA, up to between 10 and 20 mm cranial of the bifurcation slice, see also Fig. 1.
The performance measures are determined only over the region of interest specified above. However, the bifurcation slice is not communicated to the participant. Participants should therefore make sure that their segmentation at least includes this region. Our definition of the bifurcation slice, together with the specified regions, should be sufficient to determine a suitable region of interest for the segmentations.
For the External Carotid Artery, the segmented lumen should be cut between 10 and 20 mm cranial of the bifurcation slice. To allow for some flexibility in cutting of the ECA, the region around the ECA between 10 and 20 mm cranial of the bifurcation slice is a "masked" region, where the evaluation measures will not be evaluated, see also Fig. 1.
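The slice ranges above can be sketched in code. The helper below is hypothetical (not part of the framework); it simply converts the stated distances into slice indices, given the bifurcation slice index and the z-spacing of the scan:

```python
def roi_slices(bif_slice, z_spacing_mm):
    """Slice index ranges (caudal -> cranial) implied by the region
    definition: CCA from >= 20 mm caudal of the bifurcation slice, ICA
    up to >= 40 mm cranial, and a masked ECA cut zone 10-20 mm cranial."""
    mm = lambda d: int(round(d / z_spacing_mm))
    return {
        "cca": (bif_slice - mm(20), bif_slice),
        "ica": (bif_slice, bif_slice + mm(40)),
        "eca_mask": (bif_slice + mm(10), bif_slice + mm(20)),
    }

# e.g. bifurcation at slice 100 in a scan with 1.0 mm slice spacing
print(roi_slices(100, 1.0))
```

Since the exact bifurcation slice is not communicated, a method would apply such bounds conservatively, segmenting at least this region.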
The input for the participant is:
Two different stenosis grades have to be determined for each ICA that needs to be segmented.
We use the following NASCET-like definitions for stenosis grading: the diameter stenosis is 100% x (1 - d/D), and the area stenosis is 100% x (1 - a/A). In these formulations, d and a are the minimal lumen diameter and area within the stenosis, and D and A are the lumen diameter and area at a healthy part of the ICA distal to the stenosis.
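As a minimal sketch, assuming the standard NASCET-style ratio (narrowing at the stenosis relative to a healthy distal reference), both grades reduce to the same formula applied to either diameters or areas:

```python
def stenosis_grade(at_stenosis, distal_reference):
    """NASCET-like stenosis grade in percent: 100 * (1 - s / d), where
    s is the minimal lumen measurement (diameter or area) within the
    stenosis and d the same measurement at a healthy distal part."""
    return 100.0 * (1.0 - at_stenosis / distal_reference)

print(stenosis_grade(2.0, 4.0))  # a lumen narrowed to half its diameter -> 50.0
```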
Figure 2: Minimal diameter lines for various cross-sectional contours.
The second stenosis grade is determined using minimal diameters. The minimal diameter of a cross-section is defined as the shortest straight line that divides the contour into two equal-sized areas; see Fig. 2 for examples of minimal diameters for various contour shapes.
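A brute-force sketch of this definition, for a contour given as a polygon: test every chord between contour vertices, keep those that split the enclosed area (approximately) in half, and return the shortest. This is an illustration of the definition, not the framework's implementation.

```python
import math

def shoelace(pts):
    # signed polygon area via the shoelace formula
    s = 0.0
    n = len(pts)
    for k in range(n):
        x1, y1 = pts[k]
        x2, y2 = pts[(k + 1) % n]
        s += x1 * y2 - x2 * y1
    return s / 2.0

def minimal_diameter(contour, tol=0.02):
    """Shortest chord between contour vertices that splits the enclosed
    area into two (approximately) equal halves; `tol` is the allowed
    relative imbalance between the two halves."""
    n = len(contour)
    total = abs(shoelace(contour))
    best = None
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # wrap-around neighbours: not a real chord
            half = abs(shoelace(contour[i:j + 1]))
            if abs(half - total / 2.0) <= tol * total:
                d = math.dist(contour[i], contour[j])
                if best is None or d < best:
                    best = d
    return best

# An ellipse with semi-axes 2 and 1: every equal-area chord passes
# through the centre, so the minimal diameter is the minor axis (2).
ellipse = [(2.0 * math.cos(2 * math.pi * k / 120),
            math.sin(2 * math.pi * k / 120)) for k in range(120)]
print(minimal_diameter(ellipse))
```

This matches Fig. 2: for a circle any diameter qualifies, while for an elongated contour the minimal diameter is the short equal-area chord.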
Similar to the lumen segmentation task, there are two versions of the stenosis grading task: a fully automated version, where the stenosis grading uses only the CTA dataset and the specification of whether the left or right side needs to be graded, and a semi-automated version, where the algorithm may also use the three points in each of the arteries of the bifurcation (as supplied with the data). The input data available for this task is identical to that for the lumen segmentation task.
Evaluation measures and ranking
The partial volume lumen segmentations will be evaluated using the following four performance measures:
All distance measures are symmetric, and all measures are evaluated only in the region of interest specified in 2.1. Furthermore, the mask for the distal part of the ECA is also applied in all the above measures. These measures yield one performance value per participant for each dataset and each performance measure. Per dataset and per performance measure, a ranking of the participants is made; i.e., with N datasets and 4 measures, N * 4 rankings are obtained. The final ranking for a participant is obtained by averaging that participant's ranks over all N * 4 rankings.
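The rank-averaging scheme described above can be sketched as follows (team names and scores are made up; lower scores are better, and ties are broken naively by sort order in this sketch):

```python
def average_rank(scores):
    """scores[team][d][m]: error of `team` on dataset d for measure m
    (lower is better). Returns each team's final score: the average of
    its ranks over all dataset/measure rankings."""
    teams = sorted(scores)
    n_datasets = len(scores[teams[0]])
    n_measures = len(scores[teams[0]][0])
    ranks = {t: [] for t in teams}
    for d in range(n_datasets):
        for m in range(n_measures):
            # one ranking per (dataset, measure) pair: N * 4 in total
            ordered = sorted(teams, key=lambda t: scores[t][d][m])
            for r, t in enumerate(ordered, start=1):
                ranks[t].append(r)
    return {t: sum(r) / len(r) for t, r in ranks.items()}

# Hypothetical errors for 2 datasets x 2 measures:
scores = {
    "TeamA": [[0.1, 0.8], [0.2, 0.5]],
    "TeamB": [[0.3, 0.4], [0.1, 0.9]],
}
print(average_rank(scores))  # each team wins half the rankings -> both 1.5
```

Averaging ranks rather than raw measure values keeps measures with different units and scales comparable.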
Stenosis grading
The evaluation of the stenosis grades is straightforward: the absolute difference between the reference standard value and the value determined by a participant is the error in stenosis grade. As revealing the (exact) error per dataset would also more or less reveal the reference stenosis grades, the stenosis errors are not reported per dataset, but only per ensemble (testing or on-site). The same holds for the ranking. The final ranking, however, is determined by averaging the (hidden) errors per dataset and per stenosis grade (diameter and area).
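The per-ensemble error reporting can be sketched as follows: only the mean absolute error over all datasets in an ensemble is reported, so the individual reference grades stay hidden (the values below are made up):

```python
def ensemble_error(reference_grades, submitted_grades):
    """Mean absolute stenosis-grade error over an ensemble of datasets;
    reporting only this aggregate keeps per-dataset reference grades hidden."""
    errors = [abs(r - s) for r, s in zip(reference_grades, submitted_grades)]
    return sum(errors) / len(errors)

print(ensemble_error([50.0, 70.0, 30.0], [55.0, 65.0, 30.0]))
```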
Information on the data format and submission format can be found in this document.
The website can be used to upload processed data. Use the submit button on the Download/Submit page to upload processed data. Uploaded data should be in the format described in this document: one subdirectory per challenge, named according to the input CTA data, and the appropriate data files (roi and partial volume for lumen, and area and diameter stenosis for stenosis grading). In addition to adhering to the directory structure and file naming conventions, note the following:
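A local pre-flight check in the spirit of the server-side validation can be sketched as below. The file names passed in are the caller's assumptions, not the official naming conventions; consult the format document for the real ones.

```python
import os

def check_submission(root, expected_files):
    """Very basic completeness check: every dataset subdirectory under
    `root` must contain each expected file. Returns a list of
    (dataset, missing_file) pairs; an empty list means the layout is ok."""
    errors = []
    for name in sorted(os.listdir(root)):
        sub = os.path.join(root, name)
        if not os.path.isdir(sub):
            continue  # stray files at the top level are ignored here
        for fname in expected_files:
            if not os.path.isfile(os.path.join(sub, fname)):
                errors.append((name, fname))
    return errors
```

Running such a check before uploading avoids a round-trip through the (possibly hours-long) server-side processing queue just to discover a missing file.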
Checking your submission
The processing of the submitted data includes some very basic checks on the input data, such as checking whether all files are present and whether your segmentation overlaps with the region of interest of the reference standard. This check is performed for all possible input datasets, both training and testing. After processing (which is usually finished within half an hour for lumen submissions and a few minutes for stenosis submissions, but can take a few hours under heavy load), your submissions can be viewed by clicking the submissions button on the Download/Submit page.
For each submission, the number of errors is listed. As we try to process all possible datasets, you will get errors on the training data if you only submitted test data, and vice versa. These errors should be ignored; you should only check whether the data you submitted was processed successfully. If an error has been detected, you can either upload a new set of data or mark the dataset as failed. If you upload a new set of data, make it a complete set again, as there is no way to combine submission results. Marking a dataset as failed will always rank that dataset as worst, but its performance measures will not be taken into account when averaging the performance measures over all datasets. If you do not mark your dataset as failed, it will get default values for the performance measures, which will be much worse than your average performance. Note that, as the final ordering is based on ranks, marking as failed does not affect the ranking of your method.
For the training data (if submitted), you can inspect the performance measures immediately after processing is finished. They should be similar to the values that result from applying the evaluation software provided (contact the organizers if you detect large differences!).
Confirming your submission
If you are confident that the submitted testing data is fine, you can confirm your submission by sending an e-mail to cls2009@bigr.nl with the subject "cls2009 submission confirmation". In the body, clearly state your team name and which submission you want to confirm, by providing the date/time and/or label of the submission, and whether you confirm the lumen segmentation, the stenosis grading, or both. Note that you can confirm only once; after confirmation you can NOT upload a new set of data and get it confirmed.
Viewing your results
Once we have acknowledged your confirmation by e-mail, you can view your results. We have also submitted the results of the three observers (ObserverA is the best observer, ObserverC the worst) and show these results together with yours. This shows how you perform with respect to our manual observers.
For the lumen segmentation, both average values and a complete list of performance measures for each dataset are provided. For the stenosis grading, only the aggregate values are given.
The tables can be sorted by clicking on the header of a column. If you want to sort the table with detailed lumen scores, first click on the header of the measure you want to sort on, and then click on the header of the column with dataset ids to regroup the rows per dataset.
Results on website
If you want to include your results in a publication and want them to be visible to everyone, you have to send us a copy of the paper or a link to it, and we will make the results viewable for everyone. If you are a commercial company and do not want to disclose your method, you should provide us with the exact software version number and a precise description of how the results were obtained.