
Int J CARS (2010) 5:527–535
DOI 10.1007/s11548-009-0403-1

ORIGINAL ARTICLE

Rapid image recognition of body parts scanned in computed tomography datasets

Volker Dicken · B. Lindow · L. Bornemann · J. Drexl · A. Nikoubashman · H.-O. Peitgen

Received: 10 January 2008 / Accepted: 17 December 2009 / Published online: 30 May 2010
© CARS 2010

Abstract

Aim Automatic CT dataset classification is important to efficiently create reliable database annotations, especially when large collections of scans must be analyzed.

Method An automated segmentation and labeling algorithm was developed based on a fast patient segmentation and the extraction of statistical density class features from the CT data. The method also delivers classifications of image noise level and patient size. The approach is based on image information only and uses an approximate patient contour detection and statistical features of the density distribution. These are obtained from a slice-wise analysis of the areas filled by various materials related to certain density classes and the spatial spread of each class. The resulting families of curves are subsequently classified using rules derived from knowledge about features of the human anatomy.

Results The method was successfully applied to more than 5,000 CT datasets. Evaluation was performed via expert visual inspection of screenshots showing classification results and detected characteristic positions along the main body axis. Accuracy per body region was very satisfactory in the trunk (lung/liver >99.5% detection rate, presence of abdomen >97%, pelvis >95.8%); improvements are required for zoomed scans.

Conclusion The method performed very reliably. A test on 1,860 CT datasets collected from an oncological trial showed

JCARS paper submission to CARS 2008, 2nd revision of 27 October 2009.

V. Dicken (B) · B. Lindow · L. Bornemann · J. Drexl · A. Nikoubashman · H.-O. Peitgen
Fraunhofer MEVIS, Institute for Medical Image Computing, Bremen, Germany
e-mail: [email protected]
URL: http://www.mevis.fraunhofer.de/

that the method is feasible, efficient, and promising as an automated tool for image post-processing.

Keywords Classification · Computed tomography · Anatomy · Body region · Database · Segmentation · Content-based image retrieval

Introduction

Our method was developed to address the problem of classifying a very large collection of several thousand CT image datasets with respect to the depicted body region. This was motivated by the lack or inconsistency of the available documentation in cases collected over many years in various research projects with international partners, and by the need to search efficiently for test and evaluation cases required in ongoing research projects. The aim is to rely as much as possible on image data and to largely avoid the use of DICOM tag information, which was found to be unreliable, inconsistent, or unavailable due to anonymization in many cases. It turned out that DICOM tag information is hard to classify for a collection of data originating from multiple countries where different languages were used for entering DICOM tags. This is especially true for contrast use, which is often recorded, if at all, using a large variety of abbreviations. Body region classification based on DICOM tag values also faces difficulties, as the respective DICOM tag “BODY REGION” is often not set correctly, and scan protocol names such as “biph. Thorax^Abdomen” give no reliable clue whether chest or abdomen is actually depicted. Frequently, certain universal protocols are clinically used to scan body regions not directly related to the protocol name. We aim for a coarse automatic classification on a level that a human with basic knowledge of CT


Fig. 1 Principal data flow of the classification method

Table 1 Density ranges and corresponding color codes in the subsequent figures and curves

Material class | Density range | Color code
-------------- | ------------- | ----------
Background (voxels below −1,000 HU; areas outside of the patient were set to below −1,000 HU) | Below −1,000 HU | Grey
Air or other gases (mostly within the body after patient segmentation) | Up to −910 HU | Violet
Lung parenchyma (or foam in positioning aids if not removed) | Up to −300 HU | Dark blue
Fatty tissue | Up to −50 HU | Dark green
Water-dense areas | Up to +30 HU | Brown
Non-enhanced soft tissue | Up to +60 HU | Orange
Enhanced soft tissue | Up to +100 HU | Yellow
Contrasted vessels (may include strongly contrasted liver and partial volume voxels within or near bones) | Up to +250 HU | Pink
Highly contrasted areas and most bones | Up to +500 HU | Light blue
Harder bone parts and concentrated digestive contrast | Up to +1,200 HU | Light green
Solid parts of large bones, skull and metal implants | Above +1,200 HU | White

Another class used is the collection of all dense materials (every voxel >−300 HU; curves in olive). The peak value of this curve along the scan length provides a reliable estimate of the patient size for images containing torso cross-sections.
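The per-voxel classification induced by Table 1 amounts to a simple threshold cascade over the class upper bounds. The following is a minimal sketch of such a classifier (our illustration, not the authors' published code), assuming NumPy arrays of HU values:

    import numpy as np

    # Upper bounds (HU) of the density classes of Table 1, in ascending order.
    CLASS_BOUNDS = [-1000, -910, -300, -50, 30, 60, 100, 250, 500, 1200]
    CLASS_NAMES = [
        "background", "air or other gases", "lung parenchyma", "fatty tissue",
        "water dense", "non-enhanced soft tissue", "enhanced soft tissue",
        "contrasted vessels", "highly contrasted/most bones",
        "harder bone/digestive contrast", "solid bone/metal",
    ]

    def classify_slice(hu_slice: np.ndarray) -> np.ndarray:
        # Map each voxel to one of the 11 class indices; with right=True a
        # voxel falls into the first class whose upper bound it does not exceed.
        return np.digitize(hu_slice, CLASS_BOUNDS, right=True)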

imaging could easily perform within seconds. However, we require a fast and robust method that allows processing of a CT scan of some 100 MB in a few seconds once images are loaded. Stability and careful memory management are required to run the procedure overnight on large image collections comprising several terabytes of collected data. Our aim is somewhat similar to the objective of the content-based image retrieval work by Greenspan [1,2] or Lehmann [3,4] and others cited there working on image database retrieval. However, unlike these publications on 2D imaging, we are dealing with often very large 3D datasets and require a very fast determination of a coarse-level classification. The closest related work appears to be by a Japanese group at Gifu University [5].

Materials and Methods

The method comprises a fast slice-wise patient segmentation that attempts to remove most of the background (air, table, clothing, positioning aids), followed by a slice-wise extraction of statistical features for a range of density classes on which the classification is based (Fig. 1).

Extraction of statistical density class features

The statistical feature extraction provides, for selected slices (e.g. every 3 mm), estimates of the area filled with certain materials for a number of density classes (e.g., lung parenchyma, fat, soft tissue, bones, c.f. Table 1). Further, the spatial spread of each class is computed per slice as the mean position (center of gravity) and standard deviation of the X and Y coordinates of those voxels classified to belong to the corresponding density class. The starting point of the body region classification is the family of curves showing the effective diameter of the area filled per sampled slice (i.e., the diameter of an area-equivalent disk) as well as the spread parameters for each investigated material class, plotted over the table position (Z-axis). In order to differentiate various levels of contrasted tissue and normal soft tissue, a fine classification of individual voxels into 11 materials was employed (Fig. 2).
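As an illustration of these per-slice statistics, the following sketch (assuming NumPy; function and field names are ours) computes the filled area, the effective diameter, and the X/Y spread for each density class of one classified slice:

    import numpy as np

    def slice_features(class_map: np.ndarray, voxel_area_mm2: float, n_classes: int = 11):
        # Per-slice statistics behind the curve families: filled area, effective
        # diameter (diameter of an area-equivalent disk), and spatial spread
        # (mean and standard deviation of the X/Y coordinates) per density class.
        yy, xx = np.indices(class_map.shape)
        feats = []
        for c in range(n_classes):
            mask = class_map == c
            n = int(mask.sum())
            area_mm2 = n * voxel_area_mm2
            d_eff_cm = 2.0 * np.sqrt(area_mm2 / np.pi) / 10.0   # mm -> cm
            if n > 0:
                cx, cy = float(xx[mask].mean()), float(yy[mask].mean())  # center of gravity
                sx, sy = float(xx[mask].std()), float(yy[mask].std())    # spatial spread
            else:
                cx = cy = sx = sy = float("nan")
            feats.append({"area_mm2": area_mm2, "d_eff_cm": d_eff_cm,
                          "cx": cx, "cy": cy, "sx": sx, "sy": sy})
        return feats

Evaluating this for every sampled slice and stacking the results over the table position yields the curve families used below.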

Noise level estimation and noise removal

To reduce the influence of image noise, especially in lung or bone CT reconstructions, and to estimate the noise level, a local average smoothing was employed prior to the


Fig. 2 a Classified materials at nine selected body cross sections from the CT component of a PET/CT scan. b Family of curves showing the effective diameter (0–34 cm) of the area filled by each density class over the table position (0–100 cm); for the color codes per material class c.f. Table 1

classification of individual voxels. For each material class, a noise level was computed from the mean of the absolute difference between original and locally smoothed voxel values. The mean was computed separately for each material over all voxels whose 3 × 3 kernel average falls into the respective density class. We selected the noise level of the non-enhanced soft tissue fraction (e.g., muscles, heart, brain, liver without contrast) as the basis for a classification of the noise level, since many other material classes fill only a small area or include a high fraction of partial volume with high gradients that yield high differences after smoothing, and some classes are not present in every region of the body. The noise classification distinguishes data with (a) little image noise, (b) normal noise as in most CTs scanned with normal radiation dose and reconstructed for high soft tissue contrast, (c) high noise as is typical for lung and bone reconstructions, and (d) very high noise, almost exclusively present in low-dose scans of the thorax.
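A sketch of this noise estimate (our reading of the description, assuming NumPy/SciPy and the CLASS_BOUNDS list from the earlier sketch):

    import numpy as np
    from scipy.ndimage import uniform_filter

    CLASS_BOUNDS = [-1000, -910, -300, -50, 30, 60, 100, 250, 500, 1200]

    def noise_per_class(hu_slice: np.ndarray, n_classes: int = 11):
        # Noise per material class: mean absolute difference between the
        # original and 3x3 locally averaged voxel values; class membership
        # is taken from the smoothed values, as described in the text.
        smoothed = uniform_filter(hu_slice.astype(np.float32), size=3)
        diff = np.abs(hu_slice.astype(np.float32) - smoothed)
        classes = np.digitize(smoothed, CLASS_BOUNDS, right=True)
        return [float(diff[classes == c].mean()) if np.any(classes == c)
                else float("nan") for c in range(n_classes)]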

Patient segmentation

Our algorithm to determine the body region from the statistical density class feature curves starts with a search for lung tissue within the scan. In order to avoid erroneously taking the CT table, clothing, blankets, positioning aids, or partial volume voxels near the skin for lung tissue, it is necessary to remove most of the background prior to the feature extraction. This patient segmentation step is implemented on a per-slice basis to integrate efficiently with the statistical feature extraction, which also works on a per-slice basis for selected positions only. Hence, an arbitrarily large dataset may be analyzed keeping only one slice at a time in memory. The main idea of the patient segmentation consists of a column-wise search for the patient contour, defined as the first voxel exceeding a minimum tissue density (−600 HU from the front, −300 HU from the back) followed by a minimum length (4 mm from the chest, 10 mm from the back) of voxels

exceeding the threshold density. The search will usually run over clothing or thin tubes on the patient's front and the plastic shell of the CT's patient table. The different treatment of front and back was found to be advantageous to at least partially keep thin structures like ears and nose, while being able to remove the table and thicker blankets at the back. Some heuristic modifications were added to the basic idea to avoid major jumps in the contour due to mistaking the edge of a thicker amount of soft-tissue-dense material within the table (e.g. for holding positioning aids attached to the table) for the patient contour. Furthermore, a special case is required to avoid removing lung tissue near the image boundary in zoomed images of the heart because it was confused with foam in the table or positioning aids. For proper segmentation of the patient also in noisy low-dose data, an inexpensive cumulative line-by-line smoothing was integrated with the segmentation:

smoothed value(i) := (original value(i) + smoothed value(i − 1)) / 2    (1)

Patient segmentation may fail, e.g., in scans with a circular reconstruction area when the reconstruction is zoomed on the lungs, or if thicker dense material, e.g., stereotaxy equipment, is present (c.f. Fig. 3).
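The column-wise contour search, including the cumulative smoothing of Eq. (1), might look as follows (a sketch under stated assumptions; function and parameter names are ours):

    def find_contour_start(column_hu, threshold_hu, min_run_mm, spacing_mm):
        # Walk along one image column and return the index of the first voxel
        # exceeding threshold_hu that is followed by at least min_run_mm of
        # voxels above the threshold -- the definition of the patient contour.
        # Front search: threshold -600 HU, minimum run 4 mm;
        # back search: threshold -300 HU, minimum run 10 mm.
        min_run = max(1, int(round(min_run_mm / spacing_mm)))
        smoothed = float(column_hu[0])               # seed the recursion
        run = 0
        for i, v in enumerate(column_hu):
            smoothed = 0.5 * (float(v) + smoothed)   # Eq. (1)
            if smoothed > threshold_hu:
                run += 1
                if run >= min_run:
                    return i - run + 1               # first voxel of the dense run
            else:
                run = 0
        return None                                  # no contour in this column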

Performance issues

In order to speed up the calculation, it was empirically determined that the curve families providing the basis of the subsequent classification will hardly change if only a small subset of the available data is processed. It turned out that it suffices to smooth and classify, e.g., only every third voxel in every third row of one slice every 3 mm of data initially down-sampled to 256² resolution using nearest neighbor sampling. Due to smoothing with a 3 × 3 kernel averaging, every voxel of the


Fig. 3 Errors resulting from limitations of our patient contouring or zoomed views. Top row pseudo lung near the feet resulting from failed complete removal of a support cushion on top of a dense thick board, a configuration somewhat similar to a chest wall; classified voxels in the feet assumed to be patient (left), resulting effective diameter curves of the area filled by different material classes (right; for color codes c.f. Table 1, dark blue lung, grey background). Bottom row pseudo lung resulting from mistaking a stereotaxy device for the patient, with imaging artifacts increasing the air density in the intermediate space (left); erroneously removed lung, mistaken as foam in the table, in a zoomed view of a shoulder (right)

256² intermediate images will still contribute to the result. For the actual classification of the smoothed voxel values, a pre-computed look-up table was employed. Thus, computing the initial statistical feature curves from the CT data may be performed in under 1 s even for CT data with several hundred images exceeding 200 MB, as the computational effort only depends on the length of the scan range, and not on the actual number of images in the CT series. With these down-sampling and local smoothing settings, the computation is mainly limited by the hard disk or network speed that restricts the number of volume dataset slices that may be analyzed within a given time.
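A pre-computed look-up table of this kind reduces per-voxel classification to a single array indexing operation; a sketch (our illustration, with an assumed HU range):

    import numpy as np

    CLASS_BOUNDS = [-1000, -910, -300, -50, 30, 60, 100, 250, 500, 1200]
    HU_MIN, HU_MAX = -1024, 3071     # assumed clinically relevant HU range

    # Precompute the class index for every representable HU value once.
    LUT = np.digitize(np.arange(HU_MIN, HU_MAX + 1), CLASS_BOUNDS,
                      right=True).astype(np.uint8)

    def classify_with_lut(hu_slice: np.ndarray) -> np.ndarray:
        # One table lookup per voxel instead of a cascade of comparisons.
        hu = np.clip(hu_slice, HU_MIN, HU_MAX).astype(np.int32)
        return LUT[hu - HU_MIN]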

Classification of body regions

The classification employs a complex set of rules with parameters selected heuristically, derived either from knowledge about the human anatomy or selected after experiments on limited selections of initially failed classifications. Core decisions follow the pseudo code:

1. Start: Analyze Lung related curves
2. IF (no Lung found) OR (upper Lung found)
     THEN Analyze Head related curve features
3. IF (lower Lung found) OR (no Lung AND no Head)
     THEN Search Pelvis+Hip related features in bone related curves
4. IF (lower Lung AND dataset continues caudally for at least 8 cm)
     OR (Pelvis+Hip found AND dataset continues cranially for min. 20 cm)
     THEN Analyze Liver features (muscle dense and enhanced tissue)
5. IF (no Lung, Head or Pelvis+Hip found AND scan range exceeds 40 cm)
     OR (Pelvis+Hip found AND dataset continues caudally for min. 20 cm)
     OR (lower Lung found AND at least 50 cm below diaphragm in dataset)
     THEN Analyze Knee related features in dense tissue related curves
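Expressed in Python, the cascade might read as follows (a hypothetical driver; each analyze_*/search_* helper stands for one of the curve-feature tests described in the following subsections and is not shown):

    def classify_regions(curves, z_extent_cm):
        found = {}
        lung = analyze_lung(curves)                       # step 1: None, "upper", "lower", or "complete"
        if lung is None or lung == "upper":               # step 2
            found["head"] = analyze_head(curves)
        if lung == "lower" or (lung is None and not found.get("head")):
            found["pelvis_hip"] = search_pelvis_hip(curves)   # step 3
        if ((lung == "lower" and continues_caudally(curves, 8)) or
                (found.get("pelvis_hip") and continues_cranially(curves, 20))):
            found["liver"] = analyze_liver(curves)        # step 4
        if ((lung is None and not found.get("head") and
                not found.get("pelvis_hip") and z_extent_cm > 40) or
                (found.get("pelvis_hip") and continues_caudally(curves, 20)) or
                (lung == "lower" and cm_below_diaphragm(curves) >= 50)):
            found["knees"] = analyze_knees(curves)        # step 5
        found["lung"] = lung
        return found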


The respective steps per body region searched are explained in some detail below.

Lung

The method starts with searching for the presence of normal lung tissue. An estimate of the lung volume is derived from the area per slice filled with lung-dense voxels. In case this exceeds 250 ml and has an area with an effective diameter of at least 8 cm on some slices, we search for the presence of a steep increase/decrease of the lung fraction along the body axis, starting from the slice depicting the maximum of lung tissue, to determine positions near the diaphragm or lung apex. If both are found, we classify the presence of a complete lung and determine the lung center as the mean of the two approximate end positions; otherwise, a partial lower or upper lung scan is detected. If the lung end positions can be determined, the search for other body regions is greatly simplified. The presence of lung apex or diaphragm restricts the further search for other body regions to the appropriate fraction of the scan. For example, there will be no pelvis above the diaphragm and no head below the apex. Furthermore, a scan depicting a certain length above/below the lung must contain certain other body regions in humans of common size, even though specific features of those body parts cannot be identified in the computed families of curves.
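A sketch of this lung analysis on the lung-area curve (our simplification: the steep increase/decrease test is replaced by walking outward from the area maximum until the area collapses to a small, illustrative fraction of its peak):

    import numpy as np

    def analyze_lung_curve(lung_area_cm2: np.ndarray, z_mm: np.ndarray):
        dz_cm = np.abs(np.gradient(z_mm)) / 10.0
        volume_ml = float(np.sum(lung_area_cm2 * dz_cm))    # cm^2 x cm = ml
        d_eff_cm = 2.0 * np.sqrt(lung_area_cm2 / np.pi)
        if volume_ml < 250.0 or d_eff_cm.max() < 8.0:       # thresholds from the text
            return None                                     # no (usable) lung found
        peak = int(np.argmax(lung_area_cm2))
        small = 0.05 * lung_area_cm2[peak]                  # illustrative fraction
        end_a = next((i for i in range(peak, len(z_mm))
                      if lung_area_cm2[i] < small), None)   # one lung end
        end_b = next((i for i in range(peak, -1, -1)
                      if lung_area_cm2[i] < small), None)   # the other lung end
        kind = "complete" if end_a is not None and end_b is not None else "partial"
        return {"volume_ml": volume_ml, "ends": (end_b, end_a), "kind": kind}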

Head and neck

A head scan can be recognized from a ratio of spread in Y direction (patient anterior–posterior) to spread in X direction (patient right–left) exceeding 1.5 for the bony parts in the skull, combined with a size estimate of the patient cross section below the area of a 20-cm disk. The base of the skull is commonly seen near the maximum of the gradient of a combination of the bony tissue fractions in the range above the lung. A position near the brain center is detected at the maximum of the soft tissue fraction above the base of the skull. If any of the mentioned positions are found or the scan extends for some length above the lung apex, the presence of neck or head is classified.
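The core head test translates into a short predicate (sketch; parameter names are ours, with spreads and area as computed by the per-slice feature extraction):

    import math

    def looks_like_head(bone_spread_y: float, bone_spread_x: float,
                        patient_area_cm2: float) -> bool:
        # Anterior-posterior (Y) to right-left (X) spread ratio of the bony
        # voxels above 1.5, combined with a patient cross section smaller
        # than a disk of 20 cm diameter (~314 cm^2).
        disk_20cm_area = math.pi * (20.0 / 2.0) ** 2
        return (bone_spread_y / bone_spread_x) > 1.5 and patient_area_cm2 < disk_20cm_area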

Pelvis and legs

Views of the lower bone sections show that pelvis and hip bone mainly consist of voxels with HU values between 100 and 500. Therefore, the classification analyzes the sum of the curves for “contrasted vessels” and “inner bone”, which in most pelvis scans has close twin peaks related to the pelvic bone and the hip position. To robustly find the twin peaks, the processed interval is limited to the section between the diaphragm and the lower end of the scan, or a position somewhere in the thigh if present. This lower end of the search range in the thigh is defined as the highest point below the diaphragm where the effective diameter of the area filled with tissues above −300 HU falls below 20 cm. A curve derived from the effective diameter of all bone-dense areas is searched, starting at the lower end of the search interval, for the first local maximum exceeding the heuristically determined threshold of 10 cm. When one is found, a search for another local maximum between 4 and 14 cm away from the first is performed. Depending on additional rules with heuristically determined parameters, a differentiation is attempted between peaks related to hip and pelvic bone and peaks present in the bone-related part of the statistical feature extraction due to the use of stomach or colon CT contrast agent. The decision whether any, and if so which, of the local maxima were related to hip and pelvic bone was implemented in two ways.
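A sketch of the twin-peak search on the bone-diameter curve (our illustration; the additional rules against contrast-agent peaks are omitted):

    def find_twin_peaks(bone_d_eff_cm, z_mm, lo, hi):
        # Within the index range [lo, hi], find a first local maximum above
        # 10 cm and a second one 4-14 cm away (thresholds from the text).
        peaks = [i for i in range(max(lo, 1), min(hi, len(bone_d_eff_cm) - 1))
                 if bone_d_eff_cm[i] >= bone_d_eff_cm[i - 1]
                 and bone_d_eff_cm[i] >= bone_d_eff_cm[i + 1]
                 and bone_d_eff_cm[i] > 10.0]
        for a in peaks:
            for b in peaks:
                if b > a and 4.0 <= abs(z_mm[b] - z_mm[a]) / 10.0 <= 14.0:
                    return a, b          # candidate pelvic-bone / hip pair
        return None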

Liver

A position near the center of the liver is identified from the maximum of the sum of the areas filled by material in the categories non-enhanced soft tissue, enhanced soft tissue, and moderately contrasted vasculature. Such a maximum is only searched for between a position 4 cm above the mean diaphragm position and the lower end of the data, or the pelvis position if this was identified in the step before.
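In curve terms this is a windowed arg-max (sketch with hypothetical names; the summed curve combines the three material classes named above):

    import numpy as np

    def find_liver_center(soft_curves_sum: np.ndarray,
                          search_top: int, search_bottom: int):
        # Maximum of the summed soft-tissue/vessel areas, searched only
        # between 4 cm above the diaphragm estimate and the lower end of
        # the data (or the pelvis position, if identified).
        window = soft_curves_sum[search_top:search_bottom]
        return search_top + int(np.argmax(window)) if len(window) else None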

Knees

The presence of the legs at the lower end of the scan is assumed if the sum of dense material has an effective diameter below 20 cm. Within the leg range, starting at the upper end, a search for the highest local maximum of a smoothed curve of the sum of areas filled by material in the 100–500 HU range is performed. This search stopped at the knee joint in all datasets depicting the knees.

Evaluation

Performance and robustness

The method was implemented using the development platform MeVisLab [6] and tuned to run successfully on multiple collections of about 5,000 distinct CT scans. Key algorithmic components such as the background removal and the statistical feature extraction, as well as the analysis of curve features, were implemented as C++ MeVisLab modules. The evaluation framework was implemented employing the MeVisLab graphical network programming approach and a scripting layer using Python scripts and the MeVis GUI definition language MDL. Evaluations were performed on state-of-the-art (as of 2007) Windows PC workstations, but also run on current Mac OS and Linux systems due to


MeVisLab's cross-platform design. After algorithmic optimization, run time was mainly determined by hard disk and network performance, as the distributed datasets had to be accessed over the institute network.

The cases used during development and evaluation mostly had more or less severe pathologies. Most came from patients with lung disease or final-stage metastatic cancer, and some, e.g., from patients with a scan of the pelvis and the lower extremities for vascular disease.

However, since pathologies in most cases only affect a volume smaller than 100 ml, they did not influence the statistical parameters substantially enough to cause problems. Even severe diffuse lung disease left most of the lung within the broad density range we considered as typical lung tissue.

For each CT dataset, the statistical density class features were derived with 3 mm resolution (or coarser with thicker CT slices) along the CT table, with moderate local smoothing, on datasets downsized with an in-plane down-sampling factor of 2. In order to preserve the original image histogram shape as much as possible, and to avoid long computations and loading all slices over the network, nearest neighbor resampling was employed along the Z axis. For the statistical feature extraction, an in-plane sampling further reduced computations to every third voxel in every third line of an internally 3 × 3 smoothed version of these 256 × 256 CT slices. This smoothing prior to voxel-wise classification was essential to become robust with respect to high noise. Effects related to noise strongly depend on scan protocols (e.g. low dosage, high resolution) and patient size. Their influence on the classification was greatly reduced by the averaging employed prior to our voxel-wise classification of density values.

For each dataset a classification was derived, and a screenshot like the one in Fig. 4b was automatically generated. Including the times for transferring the data over our network, resampling the data, and generating multiple reformatted views in sagittal and coronal orientation from the resampled data for documenting the findings, an average processing rate of one screenshot per 11 s was obtained. The density class feature extraction and classification itself accounted for only a small fraction (1–2 s) of the processing time.

Verification

Correctness was confirmed by visual inspection of screenshots depicting the annotated sagittal, coronal, and axial views, and by judging the positions of the markers drawn for the various features identified in the curve family analysis on the slice with the Z coordinate related to the position where a feature was identified. Examples of such features are the position of the maximum lung area and the diaphragm, defined as the highest position below the lung maximum where the lung area falls below a percentage of the maximum lung area. This percentage was adjusted on a set of test cases (not part of the evaluation database): low enough to yield a position usually near the middle of the range of slices depicting the

Fig. 4 a (Left) Coronal view with indicators of characteristic positions determined from the curve families generated by the statistical feature extraction. b Example of a screenshot from a MeVisLab application for visual verification of the landmark position and classification results


caudal lung boundary, but high enough to avoid classifying a position in the colon, e.g., in a distended colonography scan, as the diaphragm. Other markers evaluated were, e.g., the center of the liver, which was searched for over a combination of all material classes potentially found in abdominal scans of not extremely cirrhotic livers, for CT studies performed native as well as with low or high doses of contrast agent in either arterial, venous, or late contrast phases.

Further possible marker positions are, e.g., the center of the brain, the lung apex, the pair of peaks along the body axis in bony fractions related to pelvis and hip, and the knees. When during evaluation a marker was drawn on a slice depicting the respectively labeled region, the underlying feature recognition was considered correct.

We are convinced that the task of visually evaluating the recognition of major organs and joints in CT scans does not require an expert radiologist. Evaluation was therefore based on visual judgment by those authors sufficiently experienced in the identification of the relevant structures in CT scans. Screenshots from nightly runs were stored under a systematic name derived from our classification results and basic DICOM information such as place and time of the image acquisition, patient birth year and sex, CT model name, geometry, and parameters of the reconstruction. With this approach, collections of similarly classified data yielded screenshots that were grouped by classification result and allowed efficient evaluation of changes with respect to earlier rated runs. This allowed fast verification of the results (about 2–3 s per screenshot on average) because mismatches in the classification stood out from sets of similar screenshots. Thus, it became possible to visually check the classification findings of runs over up to 5,000 cases within a few hours per run, and to identify the problematic cases needed to further refine the implemented rules and tune the internal parameters for a lower rate of false classifications. Tuning and refinement of the rules did not involve the data employed in the evaluation reported below.

Recognition rate and remaining problems

On the level of recognized depicted body regions, a very high success rate was obtained. More than 98% of scans showing a complete body cross-section were correctly classified with regard to the imaged region. Some problems were identified, in particular with cases where large foam positioning aids were not properly removed in the patient segmentation step. These were subsequently erroneously identified as equally dense lung tissue, e.g., at the height of the head or feet in the image. Further limitations include zoomed details of the patient, e.g., reconstructions showing only a field of view (FOV) of about 15 cm around the heart, brain, spine, or major joints; however, due to their geometry they may still be correctly labeled as “zoomed”, and the common special case of limited field-of-view heart studies can often be recognized from nearby lung on either side. On the level of marked positions along the body main axis, results may still be improved in a substantial fraction of cases. Whereas lung-related markers are very reliable (only rarely erroneous due to corrupt data (e.g. missing slices) or failure in the patient contouring), positions derived from bony structures, such as the markers for mandible, hip, or pelvis, were sometimes a few centimeters away from their proper position. In the abdomen, this was often caused by the presence of colon contrast agent and was most noticeable in large patients.

Evaluation of the noise level classification

The noise classification allows a very robust differentiation between images reconstructed for reading with a focus on soft tissue findings (typically with a soft tissue noise level, as described above, around 15 HU and at most 30 HU) and images reconstructed with high noise and high spatial detail with a focus on findings related to lung and bones, typically with a soft tissue noise level in the 40–80 HU range. The latter may also be differentiated from the low-dose scans commonly used in lung cancer studies, with a minimum soft tissue noise level exceeding 80 HU. So far, no wrong classifications with regard to the distinction between soft tissue and lung/bone reconstructions (including low-dose scans) were observed. The noise levels were computed only over voxels of muscle density after moderate local 3 × 3-neighbor smoothing on images down-sampled by a factor of 2 using nearest neighbor resampling. The mentioned somewhat arbitrary thresholds were selected after initial experiments because they gave a very good separation of the described noise level classes, which may in many cases also be identified from an inspection of the DICOM information by experts familiar with the various CT scanning parameters influencing the noise level (e.g. reconstruction kernel names, X-ray dose and voltage, as well as voxel size) and their coding in DICOM by different vendors. Our noise level classification indirectly also incorporates patient-specific aspects (in particular patient weight/size) influencing image quality that are not commonly found reliably in DICOM tags.
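Mapped to the four classes, the thresholds quoted above suggest a classifier of this shape (a sketch; the boundary between classes (a) and (b) is not stated in the text and is our assumption):

    def noise_class(soft_tissue_noise_hu: float) -> str:
        # Thresholds follow the 15/30/80 HU figures quoted in the text.
        if soft_tissue_noise_hu < 15.0:
            return "a: little noise"
        if soft_tissue_noise_hu <= 30.0:
            return "b: normal (soft tissue reconstruction)"
        if soft_tissue_noise_hu <= 80.0:
            return "c: high (lung/bone reconstruction)"
        return "d: very high (low-dose scan)"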

Unbiased evaluation on clinical trial data

The recognition rates were evaluated on a particular collection of 1,884 scans from an international multicenter oncological trial, provided by Bayer Vital, which was not used during development of the methods. The results were the following:

From the total collection, 24 scans had missing slices filled with constant values or incorrect DICOM tags for scaling gray values to HU. These yielded corrupted results from the statistical feature extraction and were discarded. Another 12 scans received no classification because they depicted only a very short scan range (5–7 cm) within the abdomen, which did not provide sufficient context for the classification. In some cases this was caused by an erroneous splitting of a larger scan range into multiple short scans based on the available DICOM information during image import.

Head 43 (91.5%) of 47 scans showing the head were identified; the 4 remaining cases showed arms beside the head. This situation was not anticipated when implementing the rules and will require additional detection rules.

Neck All 28 CT scans (100%) covering the neck region were correctly identified.

Lung 1,771 (99.7%) of 1,775 scans containing a substantial part of the lung were correctly classified as either upper, lower, or complete lung. The four failures were due to preprocessing problems with the slice-wise background removal obscuring the statistical features at certain positions.

Liver On 1,536 (99.5%) of 1,543 scans depicting most of the liver, a position near the liver center was correctly marked.

Abdomen The presence of the abdominal region, defined as the body part between the center of the liver and a position between the lower end of the kidneys and the highest point of the pelvic bone, was correctly identified in 936 (97.7%) of 958 scans ranging over some part of the abdomen. An attempted distinction between scans rated as upper abdomen only and full or lower abdomen was not very robust; however, such a distinction also lacks a clear anatomical definition.

Pelvis Detecting the pelvis region in the density class feature curves was possible, but it was the least reliable part of our experiments. The search for twin peaks in the bony area curves correctly identified 955 (95.8%) of 997 scans showing the pelvis and hips with the most successful implementation so far of an admissible peak pair detection algorithm. The algorithm searches for two peaks with plausible values for (a) the bony and soft tissue area for a hip or pelvis cross-section, (b) the distance to the lung if present, and (c) the distance between the pelvis and hip peaks. However, the positions marked as the center of the pelvis and the hip joint were somewhat variable over similar scans or between variants of the algorithm, and frequently not at the proper location. Most of the failures were due to colon contrast agent mistakenly considered as pelvic bone, especially in large patients.

Lower extremities Legs and knees were not scanned in the oncological trial. During development, knees were detected in all arterial run-off and PET/CT data presented to the algorithm whenever the knees were in the scanned range.

Discussion

Problems

A crucial step in our method is the patient contour determination. Its rare failures lead to the inclusion of foam material near the head or legs, or to partial removal of the lung in zoomed thorax data, resulting in absurd lung end positions and consequently useless results.

A principal problem is CT data that have erroneous DICOM information about rescaling gray level information to Hounsfield units, or data with some missing slices filled with constant values, both yielding useless results from the statistical feature extraction.

Furthermore, scans over only a short range frequently fail to obtain a proper classification due to missing context. Attempts to classify individual slices based on features from the statistical feature extraction may help to improve the results, but initial work on classification without the use of context information has so far found no sufficiently robust features to recognize the depicted region from single slices based on the computed statistical parameters alone. More features, or an analysis using advanced classification based on rules extracted from training sets rather than implemented rules derived from human anatomical knowledge, may improve this approach. However, we believe this would require very large training sets to cover the broad variability seen in body shapes and different uses of CT contrast material.

The principal challenge to our method is posed by scans depicting only an ROI inside the patient (e.g. near the heart, spine, or major joints, or one extremity only). These may be recognized as zoomed scans simply from the small depicted area not showing a head, but a detailed classification in the spirit of our approach remains a challenge, as many different cases are possible. A reasonable recognition rate was so far only achieved with special treatment of zoomed data for certain common types of cardiac scans.

Another current limitation is the lack of support for non-axial data, which is becoming clinically more frequent with the spread of multislice CT scanners. This limitation may be overcome at some computational cost by reformatting the data prior to analysis, or by a straightforward alternative implementation of the statistical feature extraction in the case of coronally or sagittally reconstructed CT data. Such scans are easily recognizable as reformatted data from reliable geometric DICOM information on image orientation. Oblique reformatted CT data are also possible; these would require a more refined approach, but will remain rare in clinical use.

The exact location of a determined position outside the lung will sometimes vary by a few centimeters between different reconstructions of the same CT scan, between repeated scans of the patient, or when considering a sub-range of the scan rather than the full scan. Furthermore, results might differ


after minor modifications of algorithmic parameters. Sometimes certain positions are correctly detected in one scan but not in a very similar one of the same patient. Nevertheless, in the great majority of cases the obtained overall classification of the depicted body regions remains the same, due to some redundancy in the classification. More robustness of the detection of non-lung landmark positions to consistently separate body regions along the Z-axis may require more elaborate image analysis for more specific landmarks. Such analysis may be performed efficiently when restricted to a small ROI that can be obtained from the regions identifiable from the statistical features discussed above.

Conclusion

We believe the presented method provides a robust and novel concept for a fast, yet reliable coarse-level classification of CT databases. This holds the potential to foster the evaluation of various algorithms on the substantial case numbers needed to obtain confidence in the robustness of image processing methods, which is required for their implementation in clinical use. Furthermore, the classification may aid post-processing or reading software in performing its task with algorithmic or visualization parameters optimized with respect to the present patient's size, image noise level, and contrast use. In addition, it might help to establish standardized basic quantifications, such as an air–lung ratio most likely correlating with severe emphysema, or an abdominal body fat ratio, which may be automatically computed at almost no cost from the density class feature curves on all scans where the computation is appropriate. Such quantifications may be employed, e.g., to retrieve cases for an evaluation of more refined emphysema quantification methods.

References

1. Greenspan H, Goldberger J, Ridel L (2001) A continuous probabilistic framework for image matching. J Comput Vis Image Underst 84(3):384–406

2. Pinhas A, Greenspan H (2004) A continuous and probabilistic framework for medical image representation and categorization. In: Proceedings of SPIE, vol 5371, pp 230–238

3. Lehmann T, Wein B, Keysers D, Kohnen M, Schubert H (2002) A mono-hierarchical multi-axial classification code for medical images in content-based retrieval. In: Proceedings of the 1st IEEE International Symposium on Biomedical Imaging, pp 313–316

4. Deselaers T, Müller H, Clough P, Ney H, Lehmann TM (2007) The CLEF 2005 automatic medical image annotation task. Int J Comput Vis 74(1):51–58

5. Zhou X, Kamiya N, Hara T, Fujita H, Yokoyama R, Kiryu T, Hoshi H (2004) Automated recognition of human structure from torso CT images. In: International Conference on Visualization, Imaging and Image Processing (VIIP), Marbella, Spain

6. MeVisLab, Software for Medical Image Processing and Visualiza-tion. http://www.mevislab.de
