CLASSIFICATION, DETECTION AND PREDICTION OF ADVERSE AND
ANOMALOUS EVENTS IN MEDICAL ROBOTS
by
Feng Cao
Submitted in partial fulfillment of the requirements
For the degree of Master of Science
Thesis Advisor: Dr. Soumya Ray
Department of Electrical Engineering and Computer Science
CASE WESTERN RESERVE UNIVERSITY
August, 2012
CASE WESTERN RESERVE UNIVERSITY
SCHOOL OF GRADUATE STUDIES
We hereby approve the thesis of
Feng Cao
candidate for the Master of Science degree*.
(signed) Soumya Ray
(chair of the committee)
Andy Podgurski
M. Cenk Cavusoglu
Gultekin Ozsoyoglu
(date) May 2, 2012
*We also certify that written approval has been obtained for any proprietary material
contained therein.
TABLE OF CONTENTS
Page
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
LIST OF ABBREVIATIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Robotic surgery systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Background and Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 Current validation practices and limitations . . . . . . . . . . . . . . . . . . . 5
2.2 Safety of RoS systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3 Robot fault detection and isolation . . . . . . . . . . . . . . . . . . . . . . . . 8
2.4 Monitoring and modeling safety-critical software systems . . . . . . . . . . . . 8
2.5 Reviews of statistical learning models . . . . . . . . . . . . . . . . . . . . . . 8
2.5.1 Dynamic Bayesian networks . . . . . . . . . . . . . . . . . . . . . . . 9
2.5.2 Linear Gaussian model . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.5.3 Regression tree model . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3 Phased Validation Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1 The SABiR robot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2 Robot simulation and environment . . . . . . . . . . . . . . . . . . . . . . . . 14
3.3 Software architecture and data collection . . . . . . . . . . . . . . . . . . . . . 17
3.3.1 State variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.4 Modeling the Hardware/Software state with DBN . . . . . . . . . . . . . . . . 20
4 Empirical Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.1 Sampling normal trajectories . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.2 Trajectories of A&A events . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4.3 Modeling normal trajectories . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.3.1 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.4 Classification, detection, and prediction of A&A events . . . . . . . . . . . . . 29
4.4.1 A&A event classification . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.4.2 A&A event detection . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.4.3 A&A event prediction . . . . . . . . . . . . . . . . . . . . . . . . . . 34
5 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
APPENDICES
Appendix A: State variables and DBN structures . . . . . . . . . . . . . . . . . . 39
Appendix B: Workspace, environment, and parameters . . . . . . . . . . . . . . . 42
Appendix C: DBN models for normal trajectories . . . . . . . . . . . . . . . . . . 44
LIST OF REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
LIST OF TABLES
Table Page
3.1 State variables in DBNs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.2 Set of predictable variables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.1 Average Test r2 for normal state trajectories. . . . . . . . . . . . . . . . . . . . . 28
4.2 Comparison of models on classifying and detecting different A&A events. . . . . 37
4.3 Comparison of models on predicting “sweep” event. . . . . . . . . . . . . . . . . 37
Appendix Table
A.1 All state variables in DBNs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
A.2 Relevant variable sets for DBN structures. See Table A.3 for DBN structures. . . . 40
A.3 DBN structures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
C.1 r2 values of different models on predictable hardware state variables. HS: models using hardware/software variables. HS10: models using hardware/software variables with 10-step prediction. HWOnly: models using only hardware variables. . . . 44
C.2 r2 values of different models on predictable software state variables. HS: models using hardware/software variables. HS10: models using hardware/software variables with 10-step prediction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
LIST OF FIGURES
Figure Page
1.1 A “phased” validation framework for robotic surgery systems. . . . . . . . . . . . 3
2.1 A schematic dynamic Bayesian network, where St is the state at time t, and X, Y, and Z are state variables. Note that there is a conditional probability distribution (CPD) associated with each node (not shown in the graph). . . . . . . . . . . . . . 9
3.1 SABiR: The robotic system for image-guided needle-based interventions on small animals. Left: CAD model of the system. Right: A prototype system built in our lab. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.2 A normal trajectory (dashed line) in the simulation environment consisting of two “tissue” blocks (the red and blue cubes). . . . . . . . . . . . . . . . . . . . . . . 15
3.3 Software architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.4 Schematic of the dynamic Bayesian network models we use, showing linear Gaussian and regression tree CPDs. . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.1 Trajectories of different scenarios (2D view). Top left: normal trajectory. Top right: encoder failure trajectory. Bottom left: sweep trajectory. Bottom right: out-of-workspace trajectory. The latter three correspond to A&A events. . . . . . 24
4.2 Regression tree trained for actual needle tip position on the Z axis. In each node are the r2, mean, variance, and training data size of the associated linear Gaussian model. The green nodes are leaves. The others are root and internal nodes, which have the variable name of the splitting test specified. Attached to each edge is the outcome of the test on the corresponding branch. . . . . . . . . . . . . . . . . . . 26
4.3 Regression tree trained for total force on needle on the Z axis. The content has the same meaning as in Figure 4.2. . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.4 ROC graph for classification of “sweep” A&A event. . . . . . . . . . . . . . . . 29
4.5 ROC graph for classification of “OOW” A&A event. . . . . . . . . . . . . . . . 30
4.6 ROC graph for detection of “encoder failure” A&A event. . . . . . . . . . . . . 32
4.7 ROC graph for detection of “sweep” A&A event. . . . . . . . . . . . . . . . . . 32
4.8 ROC graph for detection of “OOW” A&A event. . . . . . . . . . . . . . . . . . 33
4.9 ROC graph for prediction of “sweep” A&A event with 10 steps looking ahead. . . 34
4.10 ROC graph for prediction of “sweep” A&A event with 100 steps looking ahead. . 35
4.11 ROC graph for prediction of “sweep” A&A event with 1000 steps looking ahead. 35
Appendix Figure
B.1 Workspace of the SABiR simulator. . . . . . . . . . . . . . . . . . . . . . . . . . 43
B.2 Normal trajectory. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
ACKNOWLEDGMENTS
I would like to thank my advisor, Dr. Soumya Ray, for his great patience and careful guidance over the past two years. He is a knowledgeable person with great passion for everything he works on. It has been a pleasure to meet with him weekly and discuss research as well as life. He always granted me plenty of flexibility in scheduling my time, and gave me insightful advice on the difficulties I encountered. It has been my privilege to work with Soumya during my graduate studies.
I am thankful to the group of people with whom I worked on this project: Dr. Andy Podgurski, Dr. M. Cenk Cavusoglu, Zhuofu Bai, and Mark Renfrew. This work would have been impossible without their teamwork. Thanks also go to Ye Fang, who helped revise my thesis during her busy finals period.
I would also like to give special thanks to my sister, who has always supported every decision I made and unreservedly shared her thoughts with me about life, career, family, and everything else. I thank my parents for supporting my studies and respecting my decisions. Without them I would not have had the chance to enjoy these two wonderful years at Case!
LIST OF ABBREVIATIONS
A&A Events: Adverse and Anomalous System Events
API: Application Programming Interface
CPD: Conditional Probability Distribution
DBN: Dynamic Bayesian Network
FDA: Food and Drug Administration
FPR: False Positive Rate
GUI: Graphical User Interface
OOW: Out Of Workspace
ROC Graph: Receiver-Operating Characteristic Graph
RoS Systems: Robotic Surgery Systems
SABiR: Small Animal Biopsy Robot
TPR: True Positive Rate
Classification, Detection and Prediction of Adverse and Anomalous Events in Medical Robots
Abstract
by
FENG CAO
In this project we propose a framework to model the behavior and evaluate the reliability and
safety of robotic surgery systems. A software simulator and associated user interface tools are
used to generate simulated hardware/software data of a robotic system performing interventions on small animals. The main contributions of this work are the use of Dynamic Bayesian Networks (DBNs) to model both the software and hardware dynamics of the robotic surgery system, and the detection of adverse and anomalous (A&A) events in the system. We show empirically
that the model can accurately capture aspects of the software/hardware dynamics. Further-
more, we show that the models are able to accurately classify, detect and predict certain kinds
of A&A events. Finally, comparison between different models demonstrates the usefulness of
modeling both hardware and software state, as opposed to using hardware state alone.
Chapter 1
Introduction
With the increasing number of minimally invasive surgical (MIS) procedures, attention is
being directed to the development of robotic surgery (RoS) systems [1]. The nature of the
application makes these systems safety-critical. Further, these systems are also complex, both
in terms of hardware and software. In this work, we propose a phased validation system to
measure the reliability of RoS systems. As part of this framework, we develop models to
detect and predict adverse and anomalous system events (A&A events) [2].
1.1 Robotic surgery systems
Robotic surgery systems are cyber-physical systems that use robots to aid in surgical pro-
cedures, specifically MIS procedures. Over the years, the number of MIS procedures being
performed has increased. Such procedures require access to places and scales that are not eas-
ily accessed with manual instruments and conventional techniques. These needs may be met
by mechanical instruments that are easier to control and are more precise in terms of perfor-
mance. Moreover, with the help of RoS systems, the surgeon can conduct the operation in
front of a console, instead of directly over the bedside, which makes remote surgery possible.
In addition to the precision and miniaturization they bring to surgery, RoS systems can also improve patient health by enabling smaller incisions with less blood loss and less pain. This results in shorter patient recovery times, shorter hospital stays, and fewer side effects [3].
RoS systems operate in a safety-critical environment. Generally, they have three compo-
nents: the surgeon interface (console), the bedside robotic system, and the imaging processor.
The surgeon interface is the place where surgeons operate. The bedside robotic system is
the component that directly interacts with patients. The imaging processor provides video from the bedside robotic system to the surgeon interface. Each of the above components may
malfunction, due to either hardware failure or software bugs. Furthermore, the surgeon might
also misoperate the system, because such systems often have steep learning curves [4]. Indeed,
accidents have already occurred, as evidenced by a number of adverse event reports filed by
manufacturers with the Food and Drug Administration (FDA). One such report [5] contains the
following description of an accident involving the da Vinci™ S Surgical System [6]:
Event Description: It was reported that during a da Vinci’s beating heart dou-
ble vessel coronary artery bypass graft procedure at the hospital, there was an
unexplained movement on the system arm which had the endowrist stabilizer in-
strument attached to it. The unexpected movement caused the feet at the distal end
of the endowrist stabilizer instrument to tip downward resulting in damage to the
myocardium of the patient’s left ventricle. ...
Manufacturer Narrative: The investigation conducted by an isu field service en-
gineer found the system to [have] successfully completed all verification tests and
to function as designed. No system errors related to the cause of this event were
found. Additional investigations conducted by clinical and software engineering
were inconclusive as a root cause could not be determined based on the event de-
scriptions reported by several eye witnesses and the review of the system’s event
logs. ...
The possibility of such adverse and anomalous events motivates a framework for empiri-
cally assessing and monitoring the reliability and safety of RoS systems. Our framework, as
illustrated in Figure 1.1, takes into account how surgeons use a RoS system, how the system
interacts with patients, how operating conditions affect the system, and how observable sys-
tem behaviors are related to the system’s hardware and software dynamics. The framework
[Figure 1.1 depicts the phases: simulation-based testing, simulation tests with tissue phantoms, clinical trials, and postmarket surveillance. Adverse and anomalous event reports feed into confirmation using collected data, fault diagnosis and localization, and revalidation testing, followed by further improvement.]
Figure 1.1 A “phased” validation framework for robotic surgery systems.
should be able to reveal any serious underlying safety problems by analyzing previously de-
tected anomalous but seemingly harmless system behaviors. Furthermore, the framework will
allow us to characterize a reliable region of operation that we call the “usage envelope”. In this
work we introduce a limited prototype of the framework that is built and evaluated based on
simulations of a robot that are concurrently being developed by us for small-animal biopsies.
A major goal of this work is to effectively classify, detect, predict, and respond to adverse
and anomalous system events (A&A events) that threaten patient safety. The following steps
are essential in achieving our goal:
• Designing a robust simulation platform to reproduce system behavior;
• Developing efficient means of collecting pertinent hardware and software execution data,
as well as user feedback, from simulation platform and software controller;
• Building statistical machine learning models to effectively relate the collected data to the
occurrence of A&A events;
• Using models in real system and developing clinically effective response strategies.
In this work we focus on the third step, building on the tools developed for the first two steps.
These tools include a simulator of the robotic system, and a supervisory software controller and
GUI for data collection. We learn dynamic Bayesian network (DBN) models of the system’s
behavior from the collected data.1 In the evaluation, we test the ability of these models to
classify, detect and predict A&A events. Specifically, we compare models which learn by only
using hardware data with models which learn by using both hardware and software data. Our
experimental results show that both kinds of models perform well on our tasks, with one performing slightly better than the other on some of them. The results are encouraging and indicate that our
approach provides a solid foundation for completing the proposed framework.
1My work focuses on building DBN models of the system’s behavior; Mark Renfrew developed the simulator, and Zhuofu Bai implemented the software controller and GUI.
Chapter 2
Background and Related Work
In this chapter, we describe current validation practices for RoS systems, and related work on the safety of
RoS systems. We also review the statistical learning models that are used to model the behavior
of RoS systems.
2.1 Current validation practices and limitations
RoS systems are subjected to federal regulation by the Food and Drug Administration
(FDA). The FDA has established classifications for different generic types of devices and
grouped them into three regulatory classes based on the level of control necessary to assure
the safety and effectiveness of the device [7]. We believe RoS systems should be considered
Class III devices, which are “usually those that support or sustain human life, are of substantial
importance in preventing impairment of human health, or which present a potential, unrea-
sonable risk of illness or injury”. However, since 2000, the FDA has assigned RoS systems
as Class II devices, which are subjected to less stringent regulations than Class III devices.
This seems problematic since the requirements for Class II devices are inadequate to assure
the safety and effectiveness of RoS systems. On the other hand, the FDA has required RoS
systems to be evaluated in clinical trials [8], which is more typical for Class III devices.
The FDA currently does not have specific regulations or even guidelines for the design and
validation of RoS systems. The regulations that RoS systems fall under are the guidance documents provided by the FDA for manufacturers on proper development and validation practice for medical devices [9] and associated software [10]. Also, the Good Manufacturing
Practice (GMP) requirements of the Quality System Regulation (contained in Title 21 Part 820
of the Code of Federal Regulation) have established the general quality requirements for med-
ical device manufacturers [11]. These documents call for applications of standard procedures
for the development and validation of safety critical systems.
Although the FDA provides general guidelines for the design and validation of medical
devices, the guidelines are not sufficient to ensure the reliability and safety of a cyber-physical
medical system as complex as RoS systems. Essentially, there are two issues in the current
validation practices for the safety and reliability of RoS systems. One is the large state space,
and the other is long-term reliability of software systems. The combined hardware/software
state space of RoS systems is so large that we can only explore a small portion of it during
validation. A much larger portion of the state space is visited when the system is deployed
in the field and used for a long period. Moreover, the state space can change over the life of the
system because of the changes in usage patterns. Consequently, it is likely that latent defects
will trigger system failures after deployment as previously unexplored regions of the state space
are encountered.
The very large state space of RoS systems poses significant challenges for even the most promis-
ing validation techniques, such as model checking [12], model based testing [13, 14, 15], and
directed random testing [16]. Model checking and model based testing are based on abstract
models of program behavior. They abstract away a significant part of the system’s actual complexity
in order to make analysis tractable. This, however, makes the techniques vulnerable to errors and incompleteness in the models. Directed random testing explores a program’s control
structure by combining random testing with symbolic program evaluation and automatic con-
straint solving. While these methods are useful when applied to models or system components
of limited size, it is not clear if they can be made to scale to large systems. Also, because they
do not consider operational usage, none of them provide objective assessment of a system’s
operational (field) reliability and safety.
In our work, we use statistical learning models, i.e. dynamic Bayesian networks (DBNs),
to compactly represent the dynamics of the hardware/software state space. These models are
trained based on a sample of the whole state space. Further, because the system is engineered
to exhibit regular behavior, we expect that observing a sample will allow us to capture the
significant elements of the system dynamics. In this way we believe these models can be
accurate without exhaustively covering the state space.
There is also a large body of research on modeling the long-term reliability of software and
software-controlled systems, which does consider operational system usage [17]. Most of the
work uses reliability growth models, which assume that the reliability of a software system
tends to grow over its lifetime due to debugging. With reliability growth models, an estimate
of system reliability is updated after system maintenance, based on a model of how debugging
affects reliability. However, since the immediate effects of attempted debugging on reliability
are extremely variable, reliability growth models are not appropriate for estimating the current
reliability of critical systems such as RoS systems, although they may be useful for long-term
planning. On the other hand, it is also possible for our system to update the model based on
newly observed system behaviors and dynamics, though we have not considered this in our
work.
2.2 Safety of RoS systems
Most of the studies on safety of RoS systems primarily focus on the design of intrinsically
safe systems [18, 19, 20, 21, 22, 23, 24]. In other words, the system is designed to fail in a safe
manner and come to a controlled halt so that it can be removed, and the procedure completes
manually. This is typically achieved by using actuators with limited power and speed, current
limiters, redundant sensors, safety monitors at the sensor, servo and supervisory levels, and
watchdogs monitoring activity of the control system. Some other studies lay out approaches
based on identification of potential hazards and mitigating them throughout the development
lifecycle [15, 25, 26, 27]. All of these design practices are basically complementary to our
approach, which uses statistical learning models to analyze system behavior.
2.3 Robot fault detection and isolation
Apart from the work on safety of RoS systems, there is also a lot of work on online detec-
tion and isolation of faults in general robotics and other hybrid systems [28, 29, 30, 31]. The most commonly used approaches employ probabilistic sequence models to represent the system state and perform inference in real time to detect or predict when A&A events happen. Our
framework aligns with these approaches in the sense that we also use probabilistic sequence
models, i.e., dynamic Bayesian networks, to capture the system’s behavior and dynamics. However, prior work mostly focuses on modeling the hardware state transitions and devotes attention to efficient inference algorithms to account for the online setting. In contrast, our work not only incorporates the dynamics of both hardware and software state, but also considers both online detection and offline classification of A&A events.
2.4 Monitoring and modeling safety-critical software systems
Related work also exists in software engineering. Prior work has analyzed safety-critical
systems, such as spacecraft [32], and recommended the use of runtime monitoring to detect
faults. There is also a large body of work that uses probabilistic models for software testing
and fault localization (e.g., [33, 34]). Unlike ours, however, this work is typically not in the context of robotic systems or medical robots.
2.5 Reviews of statistical learning models
Several statistical learning algorithms are used in our approach. Dynamic Bayesian networks (DBNs) are used to represent the dynamic relationships among hardware state variables, software
state variables, and parameters. Conditional probability distributions of each node in the DBN
are represented with either regression trees or linear Gaussian models.
Figure 2.1 A schematic dynamic Bayesian network, where St is the state at time t, and X, Y, and Z are state variables. Note that there is a conditional probability distribution (CPD) associated with each node (not shown in the graph).
2.5.1 Dynamic Bayesian networks
Dynamic Bayesian networks (DBNs) [35] are first order Markov models1 that represent,
in a factored manner, the probability of the next state given the current one, i.e., $\Pr(S_{t+1} \mid S_t)$, where $S_t$ denotes the state at time $t$ and is described by a vector of state variables. We further consider structures that have no "symmetric" edges, i.e., no edges from $X_{t+1}$ to $Y_{t+1}$, where $X$ and $Y$ are state variables. In this case the value of a state variable $X_{t+1}$ in state $S_{t+1}$ only depends on the variables in $S_t$ from which there is an edge to $X_{t+1}$. Figure 2.1 shows a schematic
1It is possible for them to be higher order; we only consider first order models in this work.
DBN structure. The factored state transition distribution according to the DBN is defined by:
$$\Pr(S_{t+1} \mid S_t) = \Pr(V_{t+1} \mid V_t) = \prod_{i=1}^{n} \Pr(V^i_{t+1} \mid V_t) = \prod_{i=1}^{n} \Pr\left(V^i_{t+1} \mid V^{par(i)}_t\right) \quad [2.1]$$
where $V_t = \{V^i_t\}$ denotes all of the variables at time $t$, and $V^{par(i)}_t$ denotes the variables at time $t$ that have an edge to $V^i_{t+1}$.
Each node (or variable) $V^i$ in the DBN is associated with a conditional probability distribution (CPD), denoted $\Pr(V^i_{t+1} \mid V^{par(i)}_t)$, which is the probability distribution of $V^i$ given all the nodes in the previous time step that have an edge to it. From Equation 2.1, the probability of the current state given the previous state equals the product of the probabilities of each variable in the current state given the relevant variables in the previous state; in other words, it is the product of the CPDs of the variables in the current state.
When we model a problem with a DBN, the CPDs of the state variables are normally unknown a priori. We estimate the CPDs from observed trajectories of hardware/software states. There are many ways to represent a CPD. If all the relevant variables are discrete, one could store the CPD explicitly as a table. However, this is not always feasible, since the size of the table grows exponentially as the number of variables increases. Moreover, when the variables are continuous, we cannot store all the probabilities in a table at all. Alternatively, one can use a parametric or nonparametric model to approximate the CPD. In the following two subsections, we briefly review the two models we use to represent CPDs in our approach.
2.5.2 Linear Gaussian model
A linear Gaussian model is an approach to modeling the relationship between a dependent
variable Y and a set of independent variables X. It assumes that the conditional distribution of
Y given X is normal, with a mean that depends linearly on X:
$$Y \mid X; w \sim \mathcal{N}(w^T X, \sigma^2) \quad [2.2]$$
The learning problem for a linear Gaussian model is formalized as follows: given a set of training data, i.e., $(x, y)$ pairs, find the parameters $w$ that maximize the likelihood of the training data under the model:
$$w^* = \arg\max_w \prod_i \Pr(y_i \mid x_i; w) \quad [2.3]$$
Note that the variance $\sigma^2$ is implicit in the equation. However, the choice of $w$ does not depend on the value of $\sigma^2$, and indeed the same $w^*$ is obtained even if $\sigma^2$ is unknown.
2.5.3 Regression tree model
A regression tree [36] is a type of decision tree that predicts the value of a target variable based on several input variables, handling the case where the target variable is continuous. A regression tree partitions the input data into disjoint groups, each of which corresponds to a leaf node. Each internal node of a regression tree is associated with a test on the value of an input variable, and each branch corresponds to an outcome of the test. Usually, a test compares the value of an input variable with a constant. A new data point starts at the root node and descends to a leaf according to the test outcomes. A regression model, e.g., a linear Gaussian model, is assigned to each leaf node.
The learning problem for a regression tree can be formalized as follows: given a set of training data, i.e., $(x, y)$ pairs, find the tree structure, including the test at each internal node and the regression model at each leaf node. Standard methods for regression tree training and pruning are introduced in [36]. The tree is constructed recursively, top down. First, a test is picked for the root node according to a suitable splitting criterion, and a branch is made for each possible outcome of the test. This splits the training data into subsets, one per outcome. The process is repeated recursively for each branch until no further splitting is needed. In problems where most variables are continuous, $r^2$ is used as the splitting and stopping criterion. First, a linear Gaussian model is built for each node (both internal and leaf nodes) using the associated data. The $r^2$ of the model then measures its goodness of fit; $r^2$ is defined as follows:
$$r^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} \quad [2.4]$$
where $y_i$ is the true value of the $i$th data point, $\hat{y}_i$ is the model's estimate of $y_i$, and $\bar{y}$ is the mean of the $y_i$'s. At each node, the goodness of each candidate test is calculated as:
$$r^2_{split} = \sum_i \frac{N_i}{N} \, r^2_i \quad [2.5]$$
where $N$ is the number of data points at the current node, $N_i$ is the number of data points that go down the $i$th branch after splitting, and $r^2_i$ is the $r^2$ of the model for the $i$th branch. The split with the largest improvement in $r^2$ is used. If no such split exists, the current node becomes a leaf, and all the data reaching this node are used to train a linear Gaussian model. Pruning techniques can also be applied to keep the generated tree from overfitting. In this work, we additionally use a modified version, introduced in the next chapter, that takes advantage of the available simulator.
Chapter 3
Phased Validation Framework
The big picture behind this work is the framework shown in Figure 1.1. This is a phased
validation strategy where the RoS system is tested in several ways: first through accurate simu-
lations, then in hardware with tissue “phantoms” that simulate real tissue, then in clinical trials
and finally when it is marketed. At each point data is collected and analyzed to determine
possible A&A events. Reports of A&A events are also collected from users. If confirmed, di-
agnosis and fault localization techniques can be used to identify and correct the malfunctioning
hardware/software. The modified system is then revalidated using the same phased validation
pipeline.
In this work, we focus on the simulation-based testing phase. Specifically, statistical learning models are built for the behavior of the simulator and are then used to classify, detect, and predict A&A events. In this chapter, we describe a simulation of our robot, the supervisory software we have developed, and the statistical methods we use to analyze the collected data.1
3.1 The SABiR robot
In this work, we use the Small Animal Biopsy Robot (SABiR) [37]. Figure 3.1 shows
an image of the robot. It is a five-degree-of-freedom parallel robotic manipulator designed to take biopsies or deliver therapeutic drugs at targets in live small animal subjects with an accuracy better than 250 µm. Its parallel design gives it low inertia. The robot has high position resolution and can realize dexterous alignment of the needle before
¹Work on the simulation environment and the supervisory software system was done by Mark Renfrew and Zhuofu Bai, respectively.
Figure 3.1 SABiR: The robotic system for image-guided needle-based interventions on small animals. Left: CAD model of the system. Right: A prototype system built in our lab.
insertion. The design is lightweight and has high motion bandwidth, so that biological motion (e.g., breathing, heartbeat) at the target can be canceled while the needle is inserted in tissue.
The robot consists of a needle mechanism held by two 5-bar linkage mechanisms, referred
to as the front and rear stages. The front stage has two degrees of freedom (up/down, left/right)
and the rear stage has three degrees of freedom (up/down, left/right, rotate forward/rotate backward). The stages are driven by five tendon-mechanism motors, and the joint angles are measured by encoders. The robot's state is characterized by its five joint angles, and there is a one-to-one correspondence between any position and orientation that the needle tip can reach and a set of joint angles.
3.2 Robot simulation and environment
In section 1.1, we described several subproblems that need to be solved in order to achieve
our goal. The first step is to design a robust simulation platform to reproduce system behavior.
We use the simulation of the robot rather than the physical system because building our models
Figure 3.2 A normal trajectory (dashed line) in the simulation environment, consisting of two “tissue” blocks (the red and blue cubes).
requires a significant amount of data, which would be costly and time-consuming to collect from the actual system. However, to ensure that the models we build remain applicable to the real system, we make the simulation as accurate as possible, as described below.
In prior work, models for the kinematics and inverse kinematics for SABiR have been de-
veloped [38]. Here we use them to create a simulation of the robot, implemented in Simulink [39],
in which the robot’s motors are each represented as third-order transfer functions. The simula-
tor is designed to be a modular component of the system, in the sense that it can be seamlessly
swapped with the controller of the actual robot.
The environment of the simulated robot consists of two gel blocks with different charac-
teristics placed in the workspace, one contained within the other (Figure 3.2). A needle force
model, which assumes a stiff non-deformable needle, is used to provide a resistive force caused
by the combined cutting, frictional, and stiffness forces produced when the needle is inside the
gel block. The cutting force is caused by the needle tip piercing the gel block and provides
a resistance to the needle’s motion during insertion into the gel block. The frictional force is
produced by the friction between the needle body and the walls of the channel in the gel block,
and resists the needle during insertion and extraction. The stiffness force is caused by the gel
block’s tendency to resist sideways motion of the needle, i.e., any motion not in the direction
the needle is pointing. In this way, realistic and distinguishable forces can be produced by any
possible motion of the needle. The needle model is described in detail in [40].
We use a simple low-level controller to control the simulation. After calibration, the robot's initial location is called its “home position”. The controller can then be given a new position
and orientation to move the needle to. If the new position can be reached, a “reference tra-
jectory” is computed using linear interpolation. Velocities along this trajectory are set to be
mainly constant, with fast acceleration and deceleration at the start and end (subject to a de-
sired maximum acceleration). The simulator then uses a PD controller to follow this trajectory
to guide the needle to the end position.
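The reference-trajectory computation described above can be sketched as follows. This is an illustrative reconstruction, not the actual controller code: the straight line between endpoints is traversed with a trapezoidal speed profile, and the limits `v_max`, `a_max`, and the sampling period `dt` are assumed values, not the robot's real ones.

```python
import numpy as np

def reference_trajectory(p0, p1, v_max=1.0, a_max=5.0, dt=0.001):
    """Straight-line reference from p0 to p1 with a trapezoidal speed
    profile: accelerate at a_max, cruise at v_max, decelerate at a_max.
    Returns an array of waypoints sampled every dt seconds."""
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    dist = np.linalg.norm(p1 - p0)
    if dist == 0:
        return p0[None, :]
    t_acc = v_max / a_max                # time to reach cruise speed
    d_acc = 0.5 * a_max * t_acc ** 2     # distance covered while accelerating
    if 2 * d_acc > dist:                 # triangular profile: never reach v_max
        t_acc = np.sqrt(dist / a_max)
        d_acc = dist / 2.0
        v_max = a_max * t_acc
    t_cruise = (dist - 2 * d_acc) / v_max
    t_total = 2 * t_acc + t_cruise
    ts = np.arange(0.0, t_total + dt, dt)
    s = np.empty_like(ts)                # arc length travelled at each sample
    for i, t in enumerate(ts):
        if t < t_acc:
            s[i] = 0.5 * a_max * t ** 2
        elif t < t_acc + t_cruise:
            s[i] = d_acc + v_max * (t - t_acc)
        else:
            td = t_total - t
            s[i] = dist - 0.5 * a_max * max(td, 0.0) ** 2
    return p0 + (s / dist)[:, None] * (p1 - p0)
```

The resulting waypoint list plays the role of the reference trajectory that the PD controller then tracks.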
The single insertion task for the robot is to insert the needle tip to a specific target position in the inner block, at a specific target orientation. To complete an insertion task, the needle starts from a predefined home position. The needle is moved towards a ready position, which is outside but near the surface of the outer block; at the same time, it is rotated to the specified insertion orientation. The motors then work together to insert the needle into the blocks until the target point is reached. Next, the needle is extracted from the blocks back to the ready position. Finally, the needle is guided back to the home position, returning to the state it started in. The four steps described here divide the whole task into four consecutive macro actions: move-needle, insert-needle, extract-needle, and move-home. The dashed line in Figure 3.2
Figure 3.3 Software architecture
shows a normal trajectory of the single insertion task. Note that the trajectories of the last two actions overlap with those of the first two here.²
Three sets of parameters are used to specify a single insertion task: the ready position, the ready orientation (the same as the target orientation), and the insert distance (the distance between the ready position and the target position). Note that position and orientation are each specified with 3 values, one per spatial dimension.
3.3 Software architecture and data collection
A supervisory software system is built on top of the low-level controller. The user interacts
with the robot through this software. This system has three components: a GUI, a task dele-
gator, and a robot proxy. Figure 3.3 shows the information flow between them when the robot
performs a high level insert/extract needle operation.
The graphical user interface (GUI) makes it easy for the user to pass commands to the robot through the task delegator. For this work, we only need to collect data, from both the software and the hardware. The data flow is as follows: the commands (i.e., the parameters specifying the target location and orientation) are passed by calling the application
²See Appendix B for a more detailed description of the environment and parameters.
Type       Name                  Num  Description
Parameter  Reference Trajectory   18  Reference position, starting position, end position
           Ready Position          6  Position and orientation where insertion begins
           Insert Distance         1  Insert distance into tissue from ready position
Software   Action                  1  High-level action, e.g. “Insert Needle”
           Insert Depth            1  Depth of the needle inside tissue
           Needle Force            3  Force on the needle
           Needle Torque           3  Torque on the needle
           Motor Position Error    5  Error between actual position and reference position
           Motor Speed Error       5  Error between actual speed and reference speed
Hardware   Needle Position         6  Position and orientation of needle tip
           Motor Position          5  Positions of the 5 motors
           Motor Torque            5  Torques of the 5 motors

Table 3.1 State variables in DBNs.
programming interfaces (APIs) provided by the task delegator. The task delegator first checks the validity of the input parameters for the specified operation; for example, it ensures that target locations are within the robot's workspace. It then decomposes a complex task into a set of basic needle motions that can be accomplished by calls to the robot's API (in this case, the simulator's). The delegator is equipped with different schemas to decompose different high-level tasks. It then invokes the robot API to handle these decomposed tasks. If an error occurs, it is responsible for stopping the current action and returning an error signal. When the robot proxy gets an API call, it communicates with the real robot (again, in this case, the simulator), issuing low-level operations and collecting low-level sensor data from the robot (simulator).
Variable Name Num
Needle Position 6
Motor Position 5
Motor Torque 5
Insert Depth 1
Needle Force 3
Needle Torque 3
Table 3.2 Set of predictable variables.
3.3.1 State variables
There are three types of variables that are part of the system state, namely, parameters,
hardware variables, and software variables. Variables such as the reference trajectory to be
followed by the robot are “parameters”. These variables are inputs to the system and do not
change over time. While we use these parameters to help predict future state trajectories, they
are not themselves predicted since they are constant. Other variables such as the x position of
the needle tip are “hardware variables”. The values for these variables are obtained by sensors
on the robot or by direct hardware measurements of various sorts. The third type of variables is
“software variables”. These variables include flags set by subroutines in the software, denoting
which high level motion is being executed. Software variables also include variables such as
“force on the needle” which cannot be directly sensed in the hardware but can be estimated or
derived in software indirectly from other variables. Such estimates can be used to estimate yet
other variables such as “depth of needle in tissue”. The set of variables describing the state is
shown in Table 3.1. For each kind of variable, “Num” refers to the number of variables of that kind; e.g., there are 3 needle forces, one for each direction.³
Variables can also be categorized in terms of whether we build models for them or not,
i.e. “predictable” and “observable”. Predictable variables are variables whose values are only
³For a full description of all variables and DBN structures, please refer to Appendix A.
Figure 3.4 Schematic of the dynamic Bayesian network models we use, showing linear Gaussian and regression tree CPDs.
available in real time. They include the hardware variables and some derived software variables, such as Needle Force. Observable variables are variables whose values are available before a task is executed; examples include the reference trajectory and the other parameters. Table 3.2 lists the predictable variables. The remaining variables are observable.
3.4 Modeling the Hardware/Software state with DBN
We use the data collected to build models representing the time evolution of the joint soft-
ware/hardware state space of the system. We use dynamic Bayesian networks (DBNs) to rep-
resent the probability of the next state given the current one, i.e., Pr(St+1|St), where each state
St is described by a vector of variables. A schematic DBN structure is shown in Figure 3.4.
The parameters of these probability distributions are learned in the way described below.
We represent the conditional probability distributions (CPDs) in the DBN in two ways.
Certain state variables, such as the x position of the needle tip, vary at a near constant rate from
t to t+1, because (in this case) the robot controllers are designed to maintain a constant velocity
as far as possible. For such variables, we use linear Gaussian models. For other variables, we
employ regression tree models for their CPDs.
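For a linear Gaussian CPD, maximum-likelihood estimation reduces to ordinary least squares, with the noise variance given by the mean squared residual. A minimal sketch (not the thesis code; the helper names are illustrative):

```python
import numpy as np

def fit_linear_gaussian(X, y):
    """MLE for a linear Gaussian CPD p(y | x) = N(w.x + b, var):
    the weights come from ordinary least squares and var is the
    mean squared residual on the training data."""
    X1 = np.column_stack([X, np.ones(len(X))])   # append bias column
    w, *_ = np.linalg.lstsq(X1, y, rcond=None)
    var = np.mean((y - X1 @ w) ** 2)
    return w, var

def nll(w, var, x, y):
    """Negative log likelihood of one observation under the fitted CPD."""
    mu = np.append(np.atleast_1d(x), 1.0) @ w
    return 0.5 * np.log(2 * np.pi * var) + (y - mu) ** 2 / (2 * var)
```

The same likelihood form is reused at the leaves of a regression tree, where each leaf holds its own linear Gaussian model.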
We learn DBNs to model “normal” state transitions, and then use them to estimate the
likelihoods of the actual state transitions. State transitions with low likelihoods are classified
as “abnormal”, and thus considered to relate to A&A events. We do not build models for abnormal state transitions in this work, because A&A events happen rarely and unexpectedly: it is impossible to collect data for all potential A&A events. Moreover, different A&A events may not share the same state transitions, since they have different causes. Given this, it would be improper either to build a separate model for each type of A&A state transition or to build one model for all of them. As a result, our models cannot distinguish between different types of A&A events. On the other hand, models of “normal” state transitions are more robust to the “rare” states of the system, because enough data can be obtained from the simulator to ensure that the trained models are accurate.
We generate sequences of normal trajectories from our simulation and estimate the CPDs
for these variables from them. CPD parameters are estimated using maximum likelihood;
for linear Gaussian models, this is equivalent to linear regression and yields simple closed
form solutions. For regression tree models, we use a standard greedy top down recursive
decomposition approach. At each point, each variable and its associated values are evaluated
to estimate the benefit of splitting on that (variable, value) pair. All splits are binary. The
benefit is computed using a weighted-average r2 metric which compares the r2 goodness-of-fit
value before and after splitting (the weight is the fraction of points that go to each partition
after a candidate split). The (variable, value) pair with the maximum gain is selected as the
internal node. We make one modification to this standard tree construction procedure. In a
normal case, the number of data points decreases as we go deeper into the tree because of the
recursive partitioning. However, since we have a simulator, we use the simulator as an oracle to
generate data points as needed. These points are generated using a form of rejection sampling;
a random trajectory is sampled and a point on it is evaluated to see if it satisfies the checks
at the internal nodes. If so, it is kept; otherwise it is discarded. This procedure ensures that we have enough points at each node in the tree to make decisions about the choice of splits. Further,
to prevent overfitting, we prune the produced regression tree for each variable using a fixed
validation set. We use a standard greedy post-pruning approach to do this [41].
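The oracle-based sampling step can be sketched as rejection sampling against the simulator. Here `simulate` and the predicate list are stand-ins for the actual simulator interface and for the tests on the path from the root to the current node:

```python
import random

def sample_node_data(simulate, checks, n_needed, max_tries=100000):
    """Rejection sampling against a simulator oracle: draw random
    points and keep those satisfying every internal-node test on the
    path to the current node. `simulate` returns one sampled state;
    `checks` is a list of predicates, one per ancestor split."""
    kept = []
    for _ in range(max_tries):
        point = simulate()
        if all(check(point) for check in checks):
            kept.append(point)
            if len(kept) == n_needed:
                break
    return kept
```

Deep nodes with tight checks reject most samples, which is why a cap such as `max_tries` is needed in practice.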
Chapter 4
Empirical Evaluation
In this chapter, we evaluate how well our models can represent the robot dynamics of
conducting a single insertion task, and how accurately they can classify, detect, and predict
certain kinds of A&A events. All the experiments are performed with our simulator, which is
an accurate simulation of the robot. We first describe the experimental methodology, and then
present our results.
4.1 Sampling normal trajectories
Ideally, in order to build a model to capture the normal behavior of the system, all the
possible trajectories should be used as training data. However, this is impossible since there
are infinitely many normal trajectories. In practice, we uniformly sample from the set of all
normal trajectories, and then use them as representatives for the whole data set. The way
we do this is by sampling parameters from the feasible space. The feasible parameter space is defined such that the ready position and orientation lie close to the surface of the outer block, and the insert distance places the target position somewhere in the inner block.
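Uniform sampling of task parameters from the feasible space might look like the following sketch. The bounds are purely illustrative, since the real feasible region depends on the gel-block geometry and the robot workspace:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical bounds on the task parameters (not the real region).
READY_POS_LO, READY_POS_HI = np.array([-10., -10., 0.]), np.array([10., 10., 5.])
READY_ORI_LO, READY_ORI_HI = np.array([-0.2, -0.2, -0.2]), np.array([0.2, 0.2, 0.2])
INSERT_LO, INSERT_HI = 5.0, 15.0

def sample_task_parameters():
    """Uniformly sample one single-insertion task: a ready pose near
    the outer block's surface and an insert distance reaching into
    the inner block."""
    return {
        "ready_position": rng.uniform(READY_POS_LO, READY_POS_HI),
        "ready_orientation": rng.uniform(READY_ORI_LO, READY_ORI_HI),
        "insert_distance": rng.uniform(INSERT_LO, INSERT_HI),
    }
```

Each sampled parameter set is then handed to the simulator to produce one normal trajectory.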
However, not all the trajectories specified by sets of parameters are reachable in the robot’s
workspace. One could imagine asking the robot to reach a target position from an infeasible orientation, say, from the other side of the blocks. This is impossible, and when given such parameters, the simulator will generate abnormal trajectories. Fortunately, our simulator does provide a way, though an incomplete one, to rule out most of these infeasible parameters.
Figure 4.1 Trajectories of different scenarios (2D view). Top left: normal trajectory. Top right: encoder-failure trajectory. Bottom left: sweep trajectory. Bottom right: out-of-workspace trajectory. The latter three correspond to A&A events.
4.2 Trajectories of A&A events
An adverse event is any untoward medical occurrence in a patient or clinical investigation subject administered a pharmaceutical product; it does not necessarily have a causal relationship with the treatment. In our case, A&A events can be triggered by either hardware problems or software bugs. Our goal here is to use our models to detect adverse and anomalous events. We consider three kinds of A&A events (as shown in Figure 4.1):
• Encoder Failure: This event happens at some point within the trajectory, when the ele-
ment that reports a motor’s position fails, so the system can no longer track that motor’s
position. As a result, the controllers can no longer effectively guide the needle tip to
follow the reference trajectories.
• Sweep: This event happens when, prior to needle insertion, the needle tip strays and grazes the tissue surface. This can occur when part of the reference trajectory of a move-needle or move-home action passes very close to the surface: as the actual trajectory follows it but deviates slightly, the needle might touch the surface (skin) and injure the subject.
• Out-of-workspace (OOW): This is an event triggered by software bugs. The reference trajectories in this case are normal, but the actual trajectories fail to follow them. The reason, as we determined and fixed, is that the expected trajectory lies outside the feasible workspace of the robot: in the buggy code, parameters were not validated before being input to the simulator/robot system.
We generate several trajectories for each such event from the simulator. Since actual A&A events are rare, we restrict the ratio of trajectories for each A&A event to normal trajectories in our dataset to 1.25%.
4.3 Modeling normal trajectories
We build DBN models with normal trajectories, so that they can be used to classify, detect,
and predict A&A events by observing that their trajectories are of low likelihood according to
the DBNs. We consider and evaluate three DBNs: a model using all the variables in Table 3.1
(HS), a model using only the “parameter” and “hardware” variables (HWOnly), and a model
using all variables but making 10-step predictions (i.e. modeling Pr(St+10|St)) (HS10).
Models are trained using the algorithms described in Section 3.4. We sample 400 normal trajectories from the simulator, and from them, 5,000 (st, st+1) pairs are sampled as initial training data for building the regression tree that models the CPD of each predictable variable. More
Figure 4.2 Regression tree trained for the actual needle tip position on the Z axis. Each node shows the R², mean, variance, and training-data size of the associated linear Gaussian model. The green nodes are leaves; the others are the root and internal nodes, which specify the variable of the splitting test. Attached to each edge is the outcome of the test on the corresponding branch.
[Tree: the root splits on SActionName (R² = 0.845, 2000 points); the leaves are extract-needle (R² = 0.658), insert-needle (R² = 0.789), and move-home and move-needle, where the force is identically zero.]
Figure 4.3 Regression tree trained for the total force on the needle on the Z axis. The content has the same meaning as in Figure 4.2.
pairs are sampled when they are needed to split an internal node. Some of the resulting models are shown in Figures 4.2 and 4.3. Linear Gaussian models are also trained, with 5,000 pairs of data points randomly sampled from the 400 normal trajectories.
By comparing the r² of each model, we observe that for some variables, like the needle tip positions, linear Gaussian models can perfectly capture the dynamics, although regression trees are almost as good; we use linear Gaussian models for those variables. However, some other variables, like the needle forces, have different dynamics in different actions. For example, during the move-needle and move-home actions, the forces are always 0, while during the insert-needle and extract-needle actions, they are not. A simple linear Gaussian model cannot capture this relationship because it is no longer linearly representable. A regression tree, on the other hand, can model this behavior perfectly by creating a branch for each of these cases. Figure 4.3 shows the resulting model for the needle force on the Z axis.
4.3.1 Results
We first evaluate how well our DBNs can model normal trajectories. We randomly sample 20,000 (st, st+1) pairs from 400 normal trajectories (distinct from the training set). Using the st values, we then produce the predicted state st+1 and compute an r² metric that represents the
Model    Hardware  Software  All
HS       0.784     0.415     0.670
HS10     0.773     0.397     0.648
HWOnly   0.725     N/A       N/A

Table 4.1 Average test r² for normal state trajectories.
accuracy of these predictions in the following way:
$$ r^2_v = 1 - \frac{\sum_i \left( v_i(t+1) - v^{pred}_i(t+1) \right)^2}{\sum_i \left( v_i(t+1) - \bar{v}(t+1) \right)^2} $$  [4.1]

$$ r^2 = \frac{1}{|V|} \sum_{v \in V} r^2_v $$  [4.2]
where $V$ is the set of predictable variables in the DBN, $v_i(t+1)$ is the value of variable $v$ at time $t+1$ in the $i$th pair, $v^{pred}_i(t+1)$ is the value predicted by the model, and $\bar{v}(t+1)$ is the mean of the $t+1$ values of $v$ in the training set. The overall $r^2$ is the average of all the $r^2_v$'s.
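Equations 4.1 and 4.2 can be computed directly from arrays of actual and predicted next-state values. In this sketch the test-set mean stands in for the training-set mean $\bar{v}(t+1)$:

```python
import numpy as np

def average_test_r2(V_true, V_pred):
    """Per-variable r^2 of one-step predictions (Eq. 4.1), then the
    average over all predictable variables (Eq. 4.2). V_true and
    V_pred are (n_pairs, n_vars) arrays of actual and predicted
    next-state values; the column mean approximates the training mean."""
    r2s = []
    for v in range(V_true.shape[1]):
        resid = np.sum((V_true[:, v] - V_pred[:, v]) ** 2)
        total = np.sum((V_true[:, v] - V_true[:, v].mean()) ** 2)
        r2s.append(1.0 - resid / total)
    return float(np.mean(r2s)), r2s
```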
These results are shown in Table 4.1¹. From these results, we observe that the DBNs are quite good at modeling the time evolution of the hardware variables. The software variables are more difficult to predict, though there is a lot of variability. Some, such as the insert depth, can be predicted very well (r² = 0.99) because they vary at a nearly constant rate, as the robot controllers are designed to maintain a constant velocity as far as possible. Others, such as the force on the needle in the y direction, are hard to predict (r² = 0.4) since their dynamics are less linear. However, as we show below, this still often allows us to effectively classify, detect, and predict A&A events. Finally, we observe that when using the HWOnly model, the prediction accuracy on the hardware variables is lower than when using the HS model. This suggests that the software variables add value to the DBN, and using them results in more accurate predictions.
¹See Appendix C for r² values of different models for all predictable variables.
4.4 Classification, detection, and prediction of A&A events
Now we evaluate the effectiveness of our model on classifying, detecting, and predicting
the three types of A&A events we introduced.
4.4.1 A&A event classification
Figure 4.4 ROC graph for classification of “sweep” A&A event.
We first consider the question of classifying trajectories according to whether they contain
an A&A event. This may be useful if execution data is logged on a remote system and later
fetched for analysis.
We consider a test set (distinct from the train set) of 400 normal trajectories and 5 trajec-
tories each of the three A&A events described above. For each point st on each trajectory, we
use the DBN models to compute the negative log likelihood (NLL) of st+1 (or st+10 for HS10).
We record the maximum negative log likelihood over the trajectory, and use this score to rank
Figure 4.5 ROC graph for classification of “OOW” A&A event.
trajectories. The negative log likelihood² for a state $s_{t+1}$ given its previous state $s_t$ is computed as follows:

$$ NLL(s_{t+1} \mid s_t) = \sum_{i \in Pred} \frac{\left( v^{pred}_i(t+1) - v_i(t+1) \right)^2}{Var(i)} $$  [4.3]
where $Pred$ is the set of indices of predictable variables in the DBN, $v^{pred}_i(t+1)$ is the predicted value for variable $i$ at time $t+1$, $v_i(t+1)$ is the observed value for variable $i$ at time $t+1$, and $Var(i)$ is the variance of the CPD (in our case either a regression tree or a linear Gaussian model) over the training data. Having defined the NLL for a state, we then define the “score” of a trajectory as follows:
$$ score_C = \max_{t=1}^{T-1} NLL(s_{t+1} \mid s_t) $$  [4.4]
where $T$ is the length of the trajectory; a higher value of $score_C$ indicates a higher likelihood that the trajectory contains an A&A event. Since we already know the ground truth for each trajectory, i.e., whether it contains an A&A event or not, we can then create a receiver-operating
²The NLL also has a $\sum \frac{1}{2} \log(2\pi\sigma^2)$ term. Since we use the NLL to rank predictions and this term is a constant offset, we omit it.
characteristic (ROC) graph from this ranked list of predictions. ROC graphs show how the
true positive rate (TPR) varies with the false positive rate (FPR) as a threshold is moved over a
confidence measure (i.e. the score in our case).
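A sketch of this pipeline: score each trajectory by its maximum per-transition NLL (Equation 4.4), then sweep a threshold over the ranked scores to trace out the ROC curve. This is an illustrative implementation, not the evaluation code used in the thesis:

```python
import numpy as np

def score_trajectory(nll_per_step):
    """Eq. 4.4: a trajectory's classification score is the maximum
    per-transition NLL along it."""
    return max(nll_per_step)

def roc_curve(scores, labels):
    """Sweep a threshold over the ranked scores and record (FPR, TPR)
    points; labels are 1 for trajectories containing an A&A event."""
    order = np.argsort(scores)[::-1]          # highest score first
    labels = np.asarray(labels)[order]
    pos = labels.sum()
    neg = len(labels) - pos
    tpr = np.cumsum(labels) / pos             # true positive rate
    fpr = np.cumsum(1 - labels) / neg         # false positive rate
    return fpr, tpr
```

The area under the returned curve is the AUC statistic reported below.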
Results for classifying the “encoder failure” A&A event are perfect for all three models, in the sense that the area under the ROC curve is 1 for each. Results for the other two A&A events are shown in Figures 4.4 and 4.5, respectively. We can see that a trajectory with an encoder failure or a software bug (here generating an “OOW” trajectory) is quite easy to classify for all our models. This is probably because these kinds of events have a very strong signature among the hardware variables, which any model incorporating those variables can capture. The sweep event, on the other hand, is harder to classify if software variables are not used. This is because one of the variables in the HS model is the force on the needle tip, which behaves very abnormally in such cases and causes a large deviation from the expected behavior. Even though HWOnly is relatively less accurate, all of the DBNs are quite accurate in absolute terms and reach 100% TPR within 3% FPR. Also, note that for the “OOW” event, the HWOnly model performs perfectly while the others do not. This is because the models for some software variables are not good enough, so there can be large deviations between predictions and actual values. As we take the maximum of the NLLs across the whole trajectory, it is possible to get a large $score_C$ for normal trajectories, although the chance is very small.
4.4.2 A&A event detection
Next we consider how quickly our models can detect A&A events after the event has oc-
curred. Of course, we would like not just to detect these events but to predict them, and we will
discuss prediction in the next section. However, some A&A events, such as our encoder failure
and software bugs, may be unpredictable in that the trajectory appears completely normal until
the point when the event happens. Therefore it is still valuable to ask, given that an A&A event
has happened, how quickly a model such as ours can detect it. To measure this, we again use a test set of 400 normal trajectories and 5 trajectories for each of the three A&A events. As before, we use the DBNs to check every (st, st+1) pair, but in this case we do not aggregate
Figure 4.6 ROC graph for detection of “encoder failure” A&A event.
Figure 4.7 ROC graph for detection of “sweep” A&A event.
Figure 4.8 ROC graph for detection of “OOW” A&A event.
the prediction over trajectories. Rather, we associate each point with a smoothed NLL score,
where the smoothing is done over a window of 50 (chosen arbitrarily) previous time steps.
The smoothing helps to reduce error in intermediate short regions where the DBN’s estimate
is poor. The “score” of a state (point) in a trajectory is defined as follows:
$$ score_D(i) = \sum_{t=\max\{i-50,\,0\}}^{i-1} \frac{NLL(s_{t+1} \mid s_t)}{\min\{i,\,50\}} $$  [4.5]
where NLL is as defined in Equation 4.3, and again, a higher score indicates a higher probability of an A&A event at this time point. In this case, the “ground truth” is set as follows: every point from an A&A event's onset until its end receives the label “positive”, while every other point is labeled “negative”. We then use the smoothed NLL score to construct ROC graphs, as shown in Figures 4.6, 4.7, and 4.8.
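The windowed smoothing of Equation 4.5 can be sketched as follows, with the window size as a parameter (50 in our experiments):

```python
import numpy as np

def detection_scores(nll_per_step, window=50):
    """Eq. 4.5: smooth the per-transition NLLs with a trailing window
    so that short regions of poor DBN fit do not trigger false
    detections. Score i averages the NLLs of the previous
    min(i, window) transitions."""
    scores = []
    for i in range(1, len(nll_per_step) + 1):
        lo = max(i - window, 0)
        scores.append(float(np.mean(nll_per_step[lo:i])))
    return scores
```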
From these results we observe that while all the models are good at detecting encoder failures, the HS10 model has a small advantage. For the other two A&A events as well, HS10 does slightly better than the other models. This is probably due to the built-in “lookahead” in this model.
4.4.3 A&A event prediction
Figure 4.9 ROC graph for prediction of “sweep” A&A event with 10 steps looking ahead.
Finally, we consider the task of predicting A&A events. Imagine that the model is being
used in an online setting where at every step it can make a determination as to whether an
A&A event is likely to occur in the next k steps. To do this, at every point t, we obtain k points
st+1, ..., st+k from our DBNs conditioned on st. These k points are the means of the associated
Gaussian distributions, so this is the most likely trajectory conditioned on st. From these points
we pick the needle tip positions and measure the average NLL score (Equation 4.6) relative to the reference trajectory provided as input. (We use only the needle tip position because that is the only information in the reference trajectory.) Thus we are evaluating, given the current state, how likely it is that the most likely needle tip trajectory over the next k steps will differ significantly from the reference. Each st is then associated with this average NLL score. The
Figure 4.10 ROC graph for prediction of the "sweep" A&A event with a 100-step lookahead.
Figure 4.11 ROC graph for prediction of the "sweep" A&A event with a 1000-step lookahead.
NLL score for the predicted needle tip position of state s_{t+1}, based on the previous state and compared with the reference needle tip position for state s_{t+1}, is defined as follows:

\mathrm{NLL}(p_{t+1} \mid s_t) = \frac{(v^{pred}_{actX}(t+1) - v_{refX}(t+1))^2}{\mathrm{Var}(actX) + \mathrm{Var}(refX - actX)} + \frac{(v^{pred}_{actY}(t+1) - v_{refY}(t+1))^2}{\mathrm{Var}(actY) + \mathrm{Var}(refY - actY)} + \frac{(v^{pred}_{actZ}(t+1) - v_{refZ}(t+1))^2}{\mathrm{Var}(actZ) + \mathrm{Var}(refZ - actZ)} \qquad [4.6]
where v^{pred}_{actX}(t+1) is the predicted value of the actual needle tip position on the x axis, v_{refX}(t+1) is the reference needle tip position on the x axis, which is provided as a parameter a priori, Var(actX) is the variance of the CPD (in our case either the regression tree model or the linear Gaussian model) of variable actX over the training data, and Var(refX − actX) is the variance of the difference between refX and actX over the training data. The analogous definitions apply to the Y and Z terms in the equation. Having defined NLL(p_{t+1} | s_t), we can now define a score for a state i in the prediction task as follows:

\mathrm{score}_P(i) = \frac{1}{k} \sum_{t=i}^{i+k-1} \mathrm{NLL}(p_{t+1} \mid s_t) \qquad [4.7]

where k is a user-specified parameter indicating how far to look ahead. Again, a higher score indicates a higher probability that an A&A event occurs at the corresponding time point. The ground truth for each s_t is set as follows: if an A&A event happens within the next k steps, it is labeled "positive"; otherwise it is labeled "negative". We then, as before, construct an ROC graph from these predictions.
Experimental results are shown in Figure 4.9, Figure 4.10, and Figure 4.11, with k = 10 (0.01 s), k = 100 (0.1 s), and k = 1000 (1 s) for the sweep event, respectively. From the results, we observe that HS and HWOnly both do very well, while HS10 lags behind. The reason the HWOnly models perform as well as the HS models is that only (a subset of) the hardware variables are evaluated to calculate the NLL scores (because the reference trajectory contains only those variables). As a result, the advantage of predicting the software variables is limited in this setting. On the other hand, the HS10 models are not as good at prediction. This is probably because predicted values of software variables are used to further predict hardware variables
A&A event       | Classification     | Detection
----------------|--------------------|--------------------
Encoder failure | All perfect        | HS10 > HS > HWOnly
Sweep           | HS > HS10 > HWOnly | HS10 > HS ≈ HWOnly
OOW             | All perfect        | HS10 > HS ≈ HWOnly

Table 4.2 Comparison of models on classifying and detecting different A&A events.

Lookahead (ms) | Prediction
---------------|--------------------
10             | HS ≈ HWOnly > HS10
100            | HS ≈ HWOnly > HS10
1000           | HS ≈ HWOnly > HS10

Table 4.3 Comparison of models on predicting the "sweep" event.
when looking ahead, and the HS10 models for those software variables are not all good enough.
Table 4.2 and Table 4.3 summarize the empirical results. To conclude, all of the DBNs we test perform well on the different tasks in our experiments. They are all good at classifying trajectories when there is a strong signature among the hardware variables, while HWOnly is not as good at classification when only some software variables behave abnormally during the A&A event. For the detection and prediction tasks, HS10 is the best at detecting A&A events after the fact, while HS and HWOnly are best at predicting A&A events before they happen. However, we also see that even where it is not the best, HS is a close second, indicating that this model is a good compromise across all three tasks we consider. These results also indicate that DBNs of the type we use can successfully identify A&A events, and that modeling elements of the software state can help in classifying and detecting these events.
Chapter 5
Conclusion and Future Work
In this work, we have proposed a framework to improve the safety and reliability of medical robotic surgery systems. We have described a prototype that addresses some of the subproblems and presented encouraging preliminary results that indicate the feasibility of building on this foundation. Specifically, we design DBNs to model the behavior and dynamics of a simulator of an RoS system, SABiR. We show empirically that the DBNs can effectively relate the collected data to the occurrence of A&A events.
Despite the good performance of the DBN models, much remains to be done for the whole framework to be realized. The simulation and controller can be improved by adding path planning and a simulation of image guidance; the software architecture can be improved to enable more complex high-level actions and to collect more data about the software execution; and the statistical analysis can be extended by computing "usage envelopes" that represent regions of the state space where the robot can operate reliably, without many A&A events. It is also important to consider alternative A&A events in different categories. Last but not least, it is important to integrate with the real hardware, evaluate how well our models detect A&A events there, and use those results to refine the software/simulation framework.
Appendix A: State variables and DBN structures
Type      | Name                 | Components                                                         | Num
----------|----------------------|--------------------------------------------------------------------|----
Parameter | Reference Trajectory | Reference position X, Y, Z                                         | 3
          |                      | Reference orientation X, Y, Z                                      | 3
          |                      | Begin position X, Y, Z                                             | 3
          |                      | Begin orientation X, Y, Z                                          | 3
          |                      | End position X, Y, Z                                               | 3
          |                      | End orientation X, Y, Z                                            | 3
          | Ready Point          | Ready position X, Y, Z                                             | 3
          |                      | Ready orientation X, Y, Z                                          | 3
          | Insert Distance      | Insert distance                                                    | 1
Software  | Action               | Action name: move-needle, insert-needle, extract-needle, move-home | 1
          | Insert Depth         | Depth of needle inside tissue                                      | 1
          | Needle Force         | Force on the needle X, Y, Z                                        | 3
          | Needle Torque        | Torque on the needle X, Y, Z                                       | 3
          | Motor Position Error | Errors between actual and reference motor joint angles             | 5
          | Motor Speed Error    | Errors between actual and reference motor speeds                   | 5
Hardware  | Needle Position      | Needle tip position X, Y, Z                                        | 3
          |                      | Needle tip orientation X, Y, Z                                     | 3
          | Motor Position       | Motors' joint angles                                               | 5
          | Motor Torque         | Motors' torques                                                    | 5
Total: 59

Table A.1 All state variables in the DBNs.
Set       | Variable Name        | Num
----------|----------------------|----
RVSet All | Needle Position      | 6
          | Motor Position       | 5
          | Motor Torque         | 5
          | Insert Depth         | 1
          | Needle Force         | 3
          | Needle Torque        | 3
          | Reference Trajectory | 18
          | Ready Point          | 6
          | Insert Distance      | 1
          | Action               | 1
          | Motor Position Error | 5
          | Motor Speed Error    | 5
          | Total:               | 59
RVSet 1   | Needle Position      | 6
          | Motor Position       | 5
          | Motor Torque         | 5
          | Insert Depth         | 1
          | Needle Force         | 3
          | Needle Torque        | 3
          | Reference Trajectory | 18
          | Ready Point          | 6
          | Insert Distance      | 1
          | Action               | 1
          | Total:               | 49

Table A.2 Relevant variable sets for DBN structures. See Table A.3 for DBN structures.
Type        | Variable Name        | Num | Relevant Variable Set
------------|----------------------|-----|----------------------
Predictable | Needle Position      | 6   | RVSet 1
            | Motor Position       | 5   | RVSet All
            | Motor Torque         | 5   | RVSet All
            | Insert Depth         | 1   | RVSet 1
            | Needle Force         | 3   | RVSet 1
            | Needle Torque        | 3   | RVSet 1
            | Total:               | 23  | -
Observable  | Reference Trajectory | 18  | -
            | Ready Point          | 6   | -
            | Insert Distance      | 1   | -
            | Action               | 1   | -
            | Motor Position Error | 5   | -
            | Motor Speed Error    | 5   | -
            | Total:               | 36  | -

Table A.3 DBN structures.
Appendix B: Workspace, environment, and parameters
This appendix contains information about the workspace of SABiR and the simulator. As
shown in Figure B.1, there is a home position and two nested gels.
B.1 Parameters
The coordinates of the initial needle tip position (home position) are1:
(-10.7505, -206.2838, 330.8692)
The initial needle tip orientation is:
(-0.0327, 0.0020, 0.9995)
The coordinates of the corners of the outer gel are:
(0 -240 360), (0 -180 360)
(0 -240 340), (0 -180 340)
(-20 -240 360), (-20 -180 360)
(-20 -240 340), (-20 -180 340)
The coordinates of the corners of the inner gel are:
(-7 -220 350), (-7 -200 350)
(-7 -220 345), (-7 -200 245)
(-12 -220 350), (-12 -200 350)
(-12 -220 345), (-12 -200 245)
1All coordinates here are in millimeters.
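Since both gels are axis-aligned boxes, a simple containment test tells whether a needle tip position lies inside a gel. The sketch below is a hypothetical helper; the bounds are read off the corner lists above (taking the inner gel's z range as 345–350, as most of its corners suggest):

```python
# Axis-aligned (min, max) bounds in mm for each coordinate, from the corners above.
OUTER_GEL = ((-20, 0), (-240, -180), (340, 360))   # (x, y, z) ranges
INNER_GEL = ((-12, -7), (-220, -200), (345, 350))

def inside(point, box):
    """True if the (x, y, z) point lies inside the axis-aligned box."""
    return all(lo <= p <= hi for p, (lo, hi) in zip(point, box))
```

For instance, the home position (-10.7505, -206.2838, 330.8692) lies outside both gels, consistent with the needle starting retracted above the tissue.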
Figure B.1 Workspace of the SABiR simulator.
Figure B.2 Normal trajectory.
Appendix C: DBN models for normal trajectories
Variables HS HS10 HWOnly
Needle Tip Position X: 1 1 1
Needle Tip Position Y: 1 1 1
Needle Tip Position Z: 1 1 1
Needle Tip Orientation X: 1 1 1
Needle Tip Orientation Y: 1 1 1
Needle Tip Orientation Z: 1 1 1
Motor Joint Angle 1: 1 1 1
Motor Joint Angle 2: 1 1 1
Motor Joint Angle 3: 1 1 1
Motor Joint Angle 4: 1 1 1
Motor Joint Angle 5: 1 1 1
Motor Torque 1: 0.363 0.305 0.161
Motor Torque 2: 0.359 0.297 0.111
Motor Torque 3: 0.289 0.194 0.054
Motor Torque 4: 0.377 0.322 0.162
Motor Torque 5: 0.374 0.258 0.107
Average 0.784 0.773 0.725
Table C.1 r2 values of different models on predictable hardware state variables. HS: models using hardware/software variables. HS10: models using hardware/software variables with 10-step prediction. HWOnly: models using only hardware variables.
Variables HS HS10
Insert Depth: 1 1
Needle Tip Force X: -0.202 0.227
Needle Tip Force Y: 0.402 0.452
Needle Tip Force Z: 0.792 0.810
Needle Tip Torque X: 0.075 0.173
Needle Tip Torque Y: -0.029 0.344
Needle Tip Torque Z: 0.866 -0.231
Average 0.415 0.397
Table C.2 r2 values of different models on predictable software state variables. HS: models using hardware/software variables. HS10: models using hardware/software variables with 10-step prediction.