
Data Analysis using Scale-space filtering and Bayesian Probabilistic Reasoning

Deepak Kulkarni

Sterling Federal Systems

AI Research Branch, Mail Stop 244-17

NASA Ames Research Center

Moffett Field, CA 94035

Kiriakos Kutulakos

Computer Sciences Department

University of Wisconsin

Madison, WI 53706

Peter Robinson

RECOM Technologies

AI Research Branch, Mail Stop 244-17

NASA Ames Research Center

Moffett Field, CA 94035

Ames Research Center

Artificial Intelligence Research Branch

Technical Report FIA-91-05

March, 1991

https://ntrs.nasa.gov/search.jsp?R=19920016106


Data Analysis using Scale-space filtering

and Bayesian Probabilistic Reasoning

DEEPAK KULKARNI

STERLING FEDERAL SYSTEMS

AI Research Branch, Mail Stop 244-17

NASA Ames Research Center

Moffett Field, CA 94035

KIRIAKOS KUTULAKOS

COMPUTER SCIENCES DEPARTMENT

University of Wisconsin

Madison, WI 53706

PETER ROBINSON

RECOM TECHNOLOGIES

AI Research Branch, Mail Stop 244-17

NASA Ames Research Center

Moffett Field, CA 94035

ABSTRACT

This paper describes a program for analyzing the output curves of a Differential Thermal Analyzer (DTA). The program first extracts probabilistic qualitative features from the DTA curve of a soil sample, and then uses Bayesian probabilistic reasoning to infer the minerals in the soil. The qualifier module employs a simple and efficient extension of scale-space filtering suitable for handling DTA data. We have observed that points can vanish from contours in the scale-space image when filtering operations are not highly accurate. To handle the problem of vanishing points, perceptual organization heuristics are used to group the points into lines. Next, these lines are grouped into contours by using additional heuristics. Probabilities are associated with these contours using domain-specific correlations. A Bayes tree classifier processes the probabilistic features to infer the presence of different minerals in the soil. Experiments show that the algorithm that uses domain-specific correlations to infer qualitative features outperforms a domain-independent algorithm that does not.

AREA : Perception and Signal Understanding

CONTACT: Deepak Kulkarni

ELECTRONIC MAIL: KULKARNI@PTOLEMY.ARC.NASA.GOV

PHONE NUMBER: (415) 604-4869

1 Introduction

Many data interpretation programs extract qualitative features from data and carry out further anal-

ysis using them. Scale-space filtering [1,2] is a technique that can produce qualitative descriptions

of continuous curves. This paper describes an extension of this technique that has been applied to

the analysis of Differential Thermal Analyzer (DTA) curves. DTA is an instrument that geologists

use to analyze the contents of a soil sample. An investigator uses DTA to heat the soil sample and

an inert reference material at the same time, and measures the difference in the temperature of the

two at discrete temperature intervals. DTA output is a graph of this temperature difference vs. the

temperature of the reference material. Figure 1 shows an example of DTA output. As a soil is heated

in a DTA, minerals in the soil sample undergo phase transitions and reactions at various stages.

An exothermic reaction produces heat and an endothermic reaction consumes heat. An exothermic

reaction results in a hill in the output curve. In contrast, an endothermic reaction results in a valley

in it. The temperature at which a reaction occurs is diagnostic of minerals present in the soil. For

example, an endothermic reaction at 570°C would indicate the presence of quartz. A number of

reactions may occur simultaneously and may correspond to small and large features superimposed

on each other. The resultant curve is complex, and a domain expert cannot identify features in the

curve with certainty. For example, in Figure 1, experts judge that there is some chance of an

endotherm at 907°C at point A. We have developed an expert system that can first extract probabilistic

qualitative features from the DTA output of a sample and use them in recognizing its contents. The

system uses a simple and efficient extension of scale-space filtering suitable for handling DTA data

and for producing a probabilistic scale-space description. The system also has a Bayes tree classifier

that can use probabilistic qualitative features in the curve to recognize the contents of the sample.


Figure 1: DTA output for a sample

2 Generating Qualitative Descriptions of DTA Curves

Scale-space filtering has proved useful in the analysis and representation of one-dimensional signals [1,2,3,4]. This is especially the case in the analysis of curves from DTA experiments, because of the special cause-effect relationships between the thermal events (i.e. reactions) taking place in the experiments and the resulting curve variations: (a) if a thermal event is fully contained in a temperature interval and it is the only event taking place in that interval, then the event gives rise to a single peak or valley in the DTA curve; (b) the effects of thermal events fully contained in disjoint temperature intervals are independent; and (c) if two or more thermal events overlap, their effects on the DTA curve are additive.¹ These properties allow us to characterize the thermal events taking place in a DTA experiment by identifying and localizing peaks, valleys or overall trends in the DTA curve. In the case of curve valleys and peaks, the associated extremum lies between a pair of inflection points. This pair of points therefore identifies a temperature interval over which a curve variation of the above types is, in general, known to take place (Clark [6] discusses how one can determine whether or not a particular point of inflection corresponds to some signal valley or peak). The main contribution of the scale-space filtering approach is in identifying these temperature intervals by pairing the detected points of inflection.

2.1 Scale-Space Filtering

The theory of scale-space filtering [1,2,5] addresses the problem of qualitatively describing an input signal $f(z)$. The approach is based on the assumption that the variations present in $f(z)$ are caused by the interaction of several physical processes [1,2]; the signal's final description is a characterization of these processes. Signal variations are detected by finding the points of inflection in $f(z)$. Since the points of inflection are detected by applying derivative operators, the number and position of these points depend not only on $f(z)$ but also on the spatial extent (i.e. the scale) of the operators. No single scale is assumed correct; rather, each scale gives rise to a number of points of inflection. The goal of the approach is to group, across all scales, the points of inflection caused by the same physical process [2]. Therefore the curve can be described not by the individual points of inflection detected but by the groups they form.

As described in [3,2], we can analyze the input signal $f(z)$ at various scales by first convolving it with Gaussian kernels of varying standard deviation $\sigma$. For fixed $\sigma$, the points of inflection in the smoothed signal $f(z, \sigma)$ are the points where $\partial^2 f(z, \sigma) / \partial z^2$ crosses zero. As the original signal gets increasingly smoothed, the only zero-crossing points remaining are those that correspond to the most prominent variations in $f(z)$. When plotted on the $(z, \sigma)$-plane, the points of inflection of $f(z, \sigma)$ form contours similar to the ones shown in Figure 2a, creating the so-called scale-space image for $f(z)$. Yuille and Poggio [5] proved that these contours are 'well-behaved' only in the case of the Gaussian kernel; no new contours are created as $\sigma$ increases and the contours form continuous, smooth curves with a single extremum. In this case, all the zero-crossing points in the scale-space image can be traced by starting at the finest level of detail (i.e. the unsmoothed signal) and terminating at the first scale in which $f(z, \sigma)$ has no points of inflection. Motivated by Witkin's [2] assumption that each contour in the scale-space image corresponds to a single physical event, two zero-crossing points in the (unsmoothed) signal are paired only if they belong to the same zero-crossing contour. The paired zero-crossing points define the interval within which the event takes place. This observation, applied to the DTA curve analysis problem, gives a way to determine the temperature intervals over which thermal events take place in DTA experiments.

¹ We are developing a separate module that handles exceptional situations that are not consistent with these properties.

Figure 2: (a) Typical scale-space image of a DTA curve. Note that contours A and B are intersecting. (b) The computed scale-space image contains disconnected contours (e.g. X), due to approximation errors in the derivative and zero-crossing calculations. The images are superimposed on the signals.

2.2 Implementation

A major concern in the implementation of the above approach was the appearance of the zero-crossing contours in the computed scale-space image. In the scale-space images of DTA curves, contour intersections are not uncommon (Figure 2a). In addition, the use of finite-size approximations to the Gaussian filter for smoothing the curve implies that the ideal case of smooth and continuous zero-crossing contours cannot be assumed. Therefore, simply tracking the zero-crossing points from lower scales to higher ones is inadequate to group them into contours. Furthermore, as the standard deviation $\sigma$ of the smoothing filters grows, the possibility of numerical errors in the derivative calculations becomes increasingly important. If $g(T, \sigma)$ is a Gaussian operator of standard deviation $\sigma$ (defined as in [1]) and $dta(T)$, $dta(T, \sigma)$ are the original and smoothed curves respectively, then we have

$$
\frac{\partial^2}{\partial T^2}\, dta(T, \sigma) = \frac{\partial^2}{\partial T^2} \left[ dta(T) * g(T, \sigma) \right] = dta(T) * \left[ \frac{\partial^2}{\partial T^2}\, g(T, \sigma) \right].
$$

The magnitude of $\frac{\partial^2}{\partial T^2} g(T, \sigma)$ decreases rapidly as $\sigma$ increases, to values very close to zero. Since numerical errors can affect the detection of zero-crossing points in the scale-space image, it is important to be able to process scale-space images that are not taken to be completely accurate. Figure 2b shows a typical scale-space image in which some contours appear to be disconnected.
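The rate at which the second-derivative filter shrinks can be made explicit with a short derivation. It assumes the usual unit-area Gaussian; the paper defers the definition of $g$ to [1], so the normalization used here is an assumption.

```latex
\begin{aligned}
g(T,\sigma) &= \frac{1}{\sigma\sqrt{2\pi}}\, e^{-T^2/(2\sigma^2)}, \\
\frac{\partial}{\partial T}\, g(T,\sigma) &= -\frac{T}{\sigma^2}\, g(T,\sigma), \\
\frac{\partial^2}{\partial T^2}\, g(T,\sigma) &= \frac{T^2-\sigma^2}{\sigma^4}\, g(T,\sigma), \\
\max_T \left| \frac{\partial^2}{\partial T^2}\, g(T,\sigma) \right| &= \frac{1}{\sqrt{2\pi}\,\sigma^3}
\quad \text{(attained at } T = 0\text{)}.
\end{aligned}
```

Under this normalization the largest filter coefficient falls off as $1/\sigma^3$, so doubling $\sigma$ reduces it by a factor of eight, and fixed-size rounding errors weigh more heavily at higher scales.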

The theory of scale-space filtering assumes that the variance of the Gaussian smoothing filters varies continuously in the interval $(0, \infty)$, creating a continuous scale-space image. In any implementation this image must be sampled to give a discrete approximation. The density of the samples greatly affects performance, since the CPU requirements of the convolution operations become large as the scale of the filter increases. It is therefore important to be able to handle scale-space images that are only sparsely sampled. A typical sampled image is shown in Figure 3. Note that because of the sparse sampling of the image, the scale-space contours appear split into two lines starting at $\sigma = 0$ and extending to higher scales in a near-vertical fashion. Also note that several sampled contours have points missing at some scales.

Figure 3: The sampled scale-space image of Figure 2. Note that the sampling rate is larger at lower scales.

With the above issues in mind, qualitative descriptions of DTA curves are generated in four steps: (i) a sparsely-sampled scale-space image is computed for the curve (Figure 4a), (ii) zero-crossing points in the image are grouped into lines starting at the lowest scale and extending to higher scales (Figures 4b,c), (iii) the lines are paired to form scale-space contours (Figure 4d), and (iv) the scale-space contours detected are used as a description of the underlying thermal events generating the DTA curve.

The discrete scale-space image is computed by first convolving the signal with finite-size approximations to $\frac{\partial^2}{\partial T^2} g(T, \sigma)$. The scales of the filters are taken from a fixed set of values, sampling the image at larger $\sigma$-intervals as $\sigma$ grows. Our experimental results show that the use of a fixed number of scales is adequate for our analysis task. Approximations to $\frac{\partial^2}{\partial T^2} g(T, \sigma)$ are created by truncating the infinite-size filters at a point where the area under the truncated part of the filters becomes less than an error $\epsilon$. Therefore, when $\epsilon$ is fixed the standard deviation determines the size of the truncated filter. Instead of defining zero-crossing thresholds based on this error [7,8], we chose to use the simple approach of slightly distorting the filter so as to force its total mass to be zero. The zero-crossing points can then be found by detecting all negative-to-positive or positive-to-negative transitions in the filtered curve (Figure 3).
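The filtering step just described can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the authors' code: the function names, the unit sample spacing, the unit-area Gaussian, the edge padding, and the choice of subtracting the mean to force zero mass are all hypothetical. The truncation point uses the closed form $2 R\, g(R) / \sigma^2$ for the area of the second-derivative filter lying beyond $|T| > R$.

```python
import numpy as np

def second_derivative_kernel(sigma, eps=1e-4):
    """Truncated, zero-mass samples of d^2 g / dT^2 for a unit-area Gaussian g."""
    def gauss(t):
        return np.exp(-t**2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))

    # For R > sigma, the area of d^2 g / dT^2 discarded beyond |T| > R
    # is 2 * R * g(R) / sigma^2; grow R until that area drops below eps.
    half = int(np.ceil(sigma)) + 1
    while 2.0 * half * gauss(half) / sigma**2 >= eps:
        half += 1

    t = np.arange(-half, half + 1, dtype=float)
    kernel = (t**2 - sigma**2) / sigma**4 * gauss(t)
    # Assumption: distort the filter by a constant so its total mass is zero.
    return kernel - kernel.mean()

def zero_crossings(signal, sigma, eps=1e-4):
    """Indices i where the filtered curve changes sign between samples i and i+1."""
    kernel = second_derivative_kernel(sigma, eps)
    half = len(kernel) // 2
    # Assumption: repeat the end samples so the output has the input's length.
    padded = np.pad(np.asarray(signal, dtype=float), half, mode="edge")
    filtered = np.convolve(padded, kernel, mode="valid")
    signs = np.sign(filtered)
    return np.nonzero(signs[:-1] * signs[1:] < 0)[0]

def scale_space_image(signal, sigmas, eps=1e-4):
    """Map each sampled scale to the zero-crossing positions found at that scale."""
    return {s: zero_crossings(signal, s, eps) for s in sigmas}

# Synthetic example: a single hill centered at sample 50.
T = np.arange(101, dtype=float)
dta = np.exp(-(T - 50.0)**2 / (2.0 * 10.0**2))
# Scales sampled at larger intervals as sigma grows.
image = scale_space_image(dta, sigmas=[1.0, 2.0, 4.0, 8.0])
```

Each row of `image` holds the inflection points detected at one scale; for the single synthetic hill, the two crossings drift apart as the scale increases, which is the behavior the grouping stage has to follow.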

In order to efficiently and reliably group the zero-crossing points into lines in situations similar to the one in Figure 5, we use the perceptual organization techniques developed by Lowe and Binford [9,10,11]. These techniques try to estimate the likelihood of particular relations between points or lines happening by accident. Since the points of a zero-crossing contour create two almost straight lines that eventually converge at some scale, we are interested in finding groups of collinear points.


Figure 4: The sequence of steps taken in the analysis of DTA curves. (a) A discrete scale-space

image is computed for a curve C. (b) The points in scale-space are grouped into segments. (c)

The segments are grouped into lines. (d) The lines are paired to form the scale-space contours.

The curve is described as the result of two reactions, one exotherm reaction taking place over the

temperature interval X and one endotherm reaction taking place over the interval N.


Figure 5: The density of zero-crossing contours at lower scales results in a large number of points that must be reliably grouped into lines. This figure is a magnified part of a typical scale-space image for DTA curves.

In this case, a number of points would be accidentally grouped together if some of the points do not belong to the same contour (Figures 6a-c). Perceptual organization then allows us to group the zero-crossing points into lines while minimizing the likelihood of these groupings being accidental.

The points are grouped by scanning the scale-space image row-by-row, starting from the lowest scale. Each point at the lowest scale (i.e. the bottom row of the scale-space image) will belong to a different line. When a point is found at the next row, it is placed in the group most likely to contain the newly-found point. This group is found by computing the likelihood that the newly-found point is put in the group by accident. When the top row of the image is reached, all points in the image will be in some group. These groups represent the lines forming contours in the scale-space image. In order to quantify the probability that a particular point $p$ is placed in a group $G$ of points by accident, we assume that the points in the group lie on a background of uniformly distributed points. Also, we are given a measure of point collinearity and the density of the randomly distributed background points. The probability of $p$ being placed in $G$ by accident is equal to the expected number of background points that have stronger collinearity relations with the points in $G$ than $p$ (with respect to the defined collinearity measure). Figure 6d gives an example of such a measure in the case where $G$ contains only two points [9]. We have made several improvements to the above basic algorithm in order to reduce the search conducted by the grouping process, including an initial stage where points in the image are first traced to form small segments (Figure 4b); the resulting segments are then grouped into lines using perceptual organization techniques similar to the ones described above [10] (Figures 4c and 7).

Figure 6: Using perceptual organization to group points. (a) Points belonging to two zero-crossing lines. (b) A likely grouping of the points. (c) A less plausible grouping of the points. (d) Estimating the likelihood of C being accidentally aligned with A and B: this likelihood can be expressed using the angle between AC and AB.

Figure 7: Detecting the lines in the scale-space image of Figure 2.
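A minimal sketch of the basic row-by-row grouping may help fix ideas. It is not the authors' implementation: the function names, the greedy assignment order, the vertical starting direction for one-point groups, and the sector-area collinearity measure (angle times squared distance, scaled by a background density) are assumptions made for illustration. The segment-tracing improvements mentioned above are omitted.

```python
import math

def accidental_count(group, p, density):
    """Expected number of background points at least as collinear with the group as p."""
    a = group[-1]                                   # most recent point on the line
    if len(group) >= 2:
        b = group[-2]
        direction = (a[0] - b[0], a[1] - b[1])      # extrapolate the line's last step
    else:
        direction = (0.0, 1.0)                      # assumption: lines start out vertical
    step = (p[0] - a[0], p[1] - a[1])
    r = math.hypot(step[0], step[1])
    dot = direction[0] * step[0] + direction[1] * step[1]
    cos_theta = dot / (math.hypot(direction[0], direction[1]) * r)
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    # Assumed measure: background points inside a sector of half-angle theta, radius r.
    return density * theta * r * r

def group_points(rows, density=0.01):
    """rows[k] lists the temperatures of the zero-crossings found at the k-th scale."""
    groups = [[(t, 0)] for t in rows[0]]            # one line per point at the lowest scale
    for k, row in enumerate(rows[1:], start=1):
        for t in row:
            # A line may take at most one point per row.
            open_groups = [g for g in groups if g[-1][1] < k]
            if not open_groups:
                continue                            # stray point: dropped in this sketch
            best = min(open_groups, key=lambda g: accidental_count(g, (t, k), density))
            best.append((t, k))
    return groups

# Two converging lines, sampled at four scales.
lines = group_points([[10, 20], [11, 19], [12, 18], [13, 17]])
# The same lines with one point missing at the second scale.
gappy = group_points([[10, 20], [11], [12, 18]])
```

Because the density only scales every count by the same factor, it does not change which group wins here; it would matter if a threshold on the accidental count were used to reject implausible groupings.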

Lines in the scale-space image are of one of two (opposite) types: maxtype, which contain zero-crossings corresponding to maxima of $\frac{\partial}{\partial T} dta(T, \sigma)$, and mintype, which contain zero-crossings corresponding to minima of $\frac{\partial}{\partial T} dta(T, \sigma)$. If two lines terminate at scales $\sigma_1$ and $\sigma_2$, we define their scale-ratio to be the ratio of the larger scale divided by the smaller scale. Thus, if lines $L_1$ and $L_2$ have maximum scales of 5 and 10 respectively, then their scale-ratio is 2. When a line of maxtype is paired with another line of mintype at a higher temperature (i.e. the mintype line is on the right of the maxtype line), the two correspond to a hill. When a line of maxtype is paired with another line of mintype at a lower temperature, the two correspond to a valley. The system employs the following procedure to identify partner lines for a line $L$ having maximum scale $\sigma_L$ and whose lower endpoint hits the $\sigma = 0$ axis at temperature $T$. It considers as potential candidates all lines with type opposite to that of $L$ and with distance from $L$ less than a threshold defined as a function of $\sigma_L$. It identifies the line that has the minimum scale-ratio with $L$ among the candidates with lower endpoints at temperatures greater than $T$, and defines it as the hill partner. Similarly, it identifies the line that has the minimum scale-ratio with $L$ among the candidates with lower endpoints at temperatures lower than $T$, and defines it as the valley partner. Line $L$ is paired with its hill partner to form a hill, and with its valley partner to form a valley. The system then associates a probability with every hill and valley it finds. To do this, it uses domain-specific statistics that are correlations between the behavior of the function at the extrema in the derivatives, the second derivative at the peak and the probability of its being a real contour. For example, if the value of the function is decreasing at the minima in the derivative, increasing at the maxima in the derivative, and the second derivative at the peak of an endotherm is high, then the probability of the peak being an endotherm is 0.9. The output of the qualifier is shown in Figure 8.

Figure 8: The output of the qualifier. Associated probabilities are not shown in the figure.
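The partner-selection procedure can be sketched as follows for a maxtype line $L$. The `Line` record, the function names, and in particular the distance threshold `reach * max_scale` are hypothetical: the paper says only that the threshold is a function of $\sigma_L$ and does not give its form.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Line:
    kind: str         # "max" or "min"
    t0: float         # temperature where the line meets the sigma = 0 axis
    max_scale: float  # scale at which the line terminates

def scale_ratio(a, b):
    """Larger terminating scale divided by the smaller one."""
    return max(a.max_scale, b.max_scale) / min(a.max_scale, b.max_scale)

def partners(line, lines, reach=10.0):
    """Return (hill_partner, valley_partner) for a maxtype line; either may be None."""
    limit = reach * line.max_scale          # assumed form of the distance threshold
    candidates = [c for c in lines
                  if c.kind != line.kind and abs(c.t0 - line.t0) < limit]

    def best(group):
        return min(group, key=lambda c: scale_ratio(line, c), default=None)

    hill = best([c for c in candidates if c.t0 > line.t0])    # partner on the right
    valley = best([c for c in candidates if c.t0 < line.t0])  # partner on the left
    return hill, valley

L1 = Line("max", 100.0, 5.0)
L2 = Line("min", 140.0, 10.0)   # scale-ratio 2 with L1, as in the example in the text
L3 = Line("min", 60.0, 40.0)
L4 = Line("min", 120.0, 30.0)   # closer to L1 than L2, but scale-ratio 6
hill_partner, valley_partner = partners(L1, [L1, L2, L3, L4])
```

Here `L2` is chosen as the hill partner over the nearer `L4` because its terminating scale is closer to that of `L1`, which is the point of ranking candidates by scale-ratio rather than by distance.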

3 Classifier

In this section, we describe how the classifier module uses the qualified description to recognize the contents of a sample. The module has a Bayes tree rooted in each mineral node. Figure 9 shows an example of a tree for the mineral 'Smectite'. The terminal nodes in the tree are predicates of the form: data has an endotherm / exotherm with peak temperature between a and b. They are tested against the features defined by the qualifier. The intermediate nodes represent phase transitions and reactions. Nodes are connected to each other by arrows. An arrow from node A to node B represents the relation that A causes B. Thus in Figure 9, smectite causes water loss, dehydroxylation and mullite nucleation. Furthermore, water loss causes an endotherm between 80 and 200, dehydroxylation causes an endotherm between 675 and 725 and mullite nucleation causes an endotherm between 900 and 1000. The root, the intermediate nodes and the terminal nodes can take the values true and false. A link from node A to node B in the tree stores four probability values: $P(a \mid b)$, $P(\neg a \mid b)$, $P(a \mid \neg b)$, and $P(\neg a \mid \neg b)$. As we described earlier, the qualifier associates probabilities with each feature in the curve. The classifier uses this information to assign probabilities to the terminal nodes in the tree. Given probabilities in the terminal nodes, a standard Bayes tree propagation algorithm [12] is used to calculate the probabilities of all non-terminal nodes including the root. For the curve in figure, the system finds only two minerals - glycine and smectite - to have probability greater than 0.5.

Figure 9: A Bayes tree for the mineral Smectite.
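As an illustration of the propagation step, the sketch below computes the posterior of the root of a small tree shaped like the smectite example. It is a simplified upward pass in the spirit of tree propagation [12], not the system's classifier. It rests on several assumptions: each link is parameterized by P(child | parent) and P(child | not parent), the qualifier's feature probability is treated as soft evidence on the leaf, only the root posterior is computed, and every number (conditional probabilities, prior, feature probabilities) is invented for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    name: str
    evidence: Optional[float] = None    # leaf only: qualifier's probability for the feature
    # Each entry is (child, P(child true | node true), P(child true | node false)).
    children: list = field(default_factory=list)

def likelihood(node):
    """Return weights proportional to P(evidence below node | node true / false)."""
    if not node.children:
        p = 0.5 if node.evidence is None else node.evidence
        return p, 1.0 - p               # assumption: feature probability as soft evidence
    lam_true, lam_false = 1.0, 1.0
    for child, p_if_true, p_if_false in node.children:
        c_true, c_false = likelihood(child)
        lam_true *= p_if_true * c_true + (1.0 - p_if_true) * c_false
        lam_false *= p_if_false * c_true + (1.0 - p_if_false) * c_false
    return lam_true, lam_false

def posterior(root, prior):
    """Posterior probability that the root (mineral) is present."""
    lam_true, lam_false = likelihood(root)
    weighted = prior * lam_true
    return weighted / (weighted + (1.0 - prior) * lam_false)

def reaction(name, feature, evidence):
    """One intermediate node with a single terminal feature (illustrative numbers)."""
    return Node(name, children=[(Node(feature, evidence=evidence), 0.9, 0.1)])

smectite = Node("smectite", children=[
    (reaction("water loss", "endotherm 80-200", 0.9), 0.9, 0.1),
    (reaction("dehydroxylation", "endotherm 675-725", 0.8), 0.9, 0.1),
    (reaction("mullite nucleation", "endotherm 900-1000", 0.6), 0.9, 0.1),
])
p_smectite = posterior(smectite, prior=0.2)
```

With these made-up numbers the three moderately confident features raise the probability of smectite from the prior of 0.2 to roughly 0.69. A feature probability of 0.5 leaves the prior unchanged, which is the expected behavior for uninformative evidence.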

4 Experimental results

Our line-pairing scheme is based on the conjecture that an algorithm that uses domain-specific correlations should perform better than one that does not. We will now describe an experiment that we carried out to test this. For the sake of clarity, we will refer to the line-pairing algorithm described at the end of Section 2.2 as A, and another algorithm that pairs the lines with minimum Manhattan distance between their upper endpoints in the scale-space image as M. These two algorithms were tested on a random sample of 20 curves from the data bank in the DTA laboratory at NASA Ames Research Center. Algorithm M detected 49% of the peaks reported by an expert. In contrast, algorithm A detected 89% of the peaks identified by an expert as having probability greater than 0.5. In this evaluation we compared the performance of the algorithms with that of an expert. In a number of domains, the features identified by an expert differ from those identified by a non-expert. In the DTA domain, our algorithm can identify 70% of the peaks identified by a non-expert, but 89% of the peaks identified by an expert. While the curve will be given a unique interpretation by a non-expert, it can have different interpretations in different domains. The qualifier described in this paper can produce different qualitative descriptions of a curve from correlations specific to different domains. It also meets the real-time requirements of the DTA task, as it takes 360.2 seconds of CPU time on a SPARC workstation to extract the features.
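For comparison, the baseline algorithm M can be sketched in a few lines. The record type and function name are hypothetical. Restricting candidates to lines of the opposite type is an assumption, as is adding temperature and scale differences without rescaling; the paper states only that M pairs lines by minimum Manhattan distance between their upper endpoints.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScaleSpaceLine:
    kind: str         # "max" or "min"
    top_t: float      # temperature of the line's upper endpoint
    top_scale: float  # scale of the line's upper endpoint

def manhattan_partner(line, lines):
    """Pick the opposite-type line whose upper endpoint is nearest in Manhattan distance."""
    def distance(other):
        return abs(other.top_t - line.top_t) + abs(other.top_scale - line.top_scale)
    candidates = [c for c in lines if c.kind != line.kind]
    return min(candidates, key=distance, default=None)

a = ScaleSpaceLine("max", 100.0, 5.0)
b = ScaleSpaceLine("min", 104.0, 6.0)    # distance 4 + 1 = 5
c = ScaleSpaceLine("min", 101.0, 20.0)   # distance 1 + 15 = 16
partner = manhattan_partner(a, [a, b, c])
```

Unlike algorithm A, this rule uses no information about hills, valleys, or the statistics of real DTA features, which is the distinction the experiment is meant to test.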

The integrated system was tested on 16 known soil samples, and the system was able to identify the contents of the samples. 75% of the terminal nodes of the minerals present in these samples were assigned a probability greater than 0.5 by the classifier. Furthermore, out of 121 terminal nodes, 64 of the terminal nodes belonged to minerals not present in the soil. The reason for this is that a number of different minerals share common symptoms.

5 Conclusion

This paper described an extension of the scale-space filtering method that is useful for extracting probabilistic qualitative features from data sampled at discrete time intervals.

The system groups lines in the scale-space image into contours by examining the maximum scale of lines in the scale-space graph. It uses domain-specific correlations to associate probabilities with these contours. Experiments show that this scheme outperforms a domain-independent scheme. We also showed that a Bayes tree algorithm can make use of the probabilistic description produced by the qualifier in interpreting the curve. We plan to carry out a systematic study of how the CPU time and the classification performance change as the sampling of the scale-space image becomes more and more sparse. This would determine at which point the algorithms break down and how much performance can be gained by filtering the curve the smallest number of times.

6 Acknowledgements

We would like to thank Rocco Mancinelli and Lisa White, who provided the DTA test data. Our

research has benefitted from numerous discussions with Dave Thompson, Rich Levinson, and Pat Langley.

References

[1] Witkin, A., Scale-Space Filtering, Proc. 8th Int. Joint Conf. Artificial Intelligence, 1983, 1019-1021.

[2] Witkin, A., Scale-Space Methods, in Encyclopedia of Artificial Intelligence, S. Shapiro, ed., Wiley, New York, 1987, 973-980.

[3] Mokhtarian, F., and Mackworth, A., Scale-Based Description and Recognition of Planar Curves and Two-Dimensional Shapes, IEEE Trans. Pattern Analysis Machine Intelligence 8, 1986, 34-43.

[4] Hildreth, E., Edge Detection, in Encyclopedia of Artificial Intelligence, S. Shapiro, ed., Wiley, New York, 1987, 257-267.

[5] Yuille, A., and Poggio, T., Scaling Theorems For Zero Crossings, IEEE Trans. Pattern Analysis Machine Intelligence 8, 1986, 18-25.

[6] Clark, J., Authenticating Edges Produced by Zero-Crossing Algorithms, IEEE Trans. Pattern Analysis Machine Intelligence 11, 1989, 43-57.

[7] Lu, Y., and Jain, R., Behavior of Edges in Scale Space, IEEE Trans. Pattern Analysis Machine Intelligence 11, 1989, 337-356.

[8] Zuerndorfer, B., and Wakefield, G., Extensions of Scale-Space Filtering to Machine-Sensing Systems, IEEE Trans. Pattern Analysis Machine Intelligence 12, 1990, 868-882.

[9] Lowe, D., and Binford, T., Perceptual Organization As A Basis For Visual Recognition, Proc. AAAI-83, 1983, 255-260.

[10] Lowe, D., Three-Dimensional Object Recognition from Single Two-Dimensional Images, Artificial Intelligence 31, 1987, 355-395.

[11] Lowe, D., The Viewpoint Consistency Constraint, International Journal of Computer Vision 1, 1987, 57-72.

[12] Pearl, J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, 1988, 144-200.
