Clarkson University
3D Organ Modeling
A thesis by
AJAY V. SONAR
Department of Electrical and Computer Engineering
Submitted in fulfillment of the requirements
for the degree of
MASTER OF SCIENCE
(ELECTRICAL ENGINEERING)
DATE
Accepted by the Graduate School
Date Dean
The undersigned have examined the thesis entitled:
3D Organ Modeling
presented by Ajay Sonar, a candidate for the degree of Master of Science
and hereby certify that it is worthy of acceptance.
__________          ________________________
Date                Advisor

Dr. James J. Carroll
Associate Professor
Electrical and Computer Engineering Department

Examining Committee:

_________________________
Dr. Sunil Kumar
Assistant Professor
Electrical and Computer Engineering Department

_________________________
Dr. Robert J. Schilling
Professor
Electrical and Computer Engineering Department
Abstract

A great deal of research has been carried out in the field of 3D modeling in recent years. Image-based reconstruction from multiple views is a challenging problem with applications in various fields, one of which is medicine. 3D models of body organs and parts assist in understanding their mechanisms far better than conventional 2D MRI or CT scan images or gross pathologic examination. This ongoing study takes a biomechanical approach to understanding the mechanisms involved in Abdominal Aortic Aneurysm (AAA) pathogenesis, in order to improve the ability to identify aneurysms at high risk of rupture and thereby aid clinical management. Rupture of an AAA is the 13th leading cause of death in the United States. AAA is a disease that affects the abdominal aorta, the large blood vessel in the abdomen. In some patients these vessels begin to bloat and continue to do so until they are either surgically repaired (by implanting an artificial tube in their place) or until they rupture. The project ultimately has two goals: 1) understand AAA disease and 2) develop ways to predict when an AAA will rupture. This is possible by applying numerical techniques (e.g., the finite element method) to a 3D model of the organ to estimate the distribution of stress in the walls of different aneurysms. Such studies can aid greatly in understanding how the disease progresses.
Contents

1. INTRODUCTION
1.1. Overview
1.2. Related Work
2. CAMERA CALIBRATION
2.1. Introduction to Camera Calibration
2.2. Parameters
2.2.1. Intrinsic Parameters
2.2.2. Extrinsic Parameters
2.3. Calibration Steps
2.4. Setup
3. IMAGE ACQUISITION
3.1. Background Subtraction and Silhouette Extraction
4. 3D RECONSTRUCTION
4.1. Introduction to Voxel Carving
4.2. Voxel Carving by Silhouette Extraction
4.3. Voxel Carving by Coloring
4.3.1. Color Invariants
4.3.2. Ordinal Visibility Constraint
4.3.3. Voxel Coloring by Layered Scene Decomposition
4.3.4. Single Pass Algorithm
4.4. Surface Reconstruction
4.4.1. Problems Associated with Marching Cube Algorithm
4.4.1.1. Ambiguous Faces
4.4.1.2. Internal Ambiguities
4.4.2. Resolving Ambiguities
4.4.2.1. Resolving Ambiguities on the Face
4.4.2.2. Resolving Internal Ambiguities
5. RESULTS
5.1. On the Phantom Model
5.1.1. Reconstruction Accuracy
5.2. On the Actual Specimen
6. CONCLUSION AND FUTURE WORK
6.1. Ways to Improve Results
6.1.1. Extended Lookup Table for the Marching Cube Algorithm
6.1.2. Voronoi-based Surface Reconstruction
6.2. Other Applications
REFERENCES
APPENDIX 1
APPENDIX 2
APPENDIX 3
List of figures

Figure 1: Calibration grid of one of the calibration images
Figure 2: The Setup
Figure 3(a): Reference Background
Figure 3(b): Image of the Object
Figure 3(c): Image after Background Subtraction
Figure 3(d): Image after Thresholding
Figure 4: Reconstruction from three views
Figure 5: The Voxel space.
Figure 6: Projecting a voxel on to the camera image plane
Figure 7: Bounding Box
Figure 8: Projection of voxels on the silhouette on one of the camera views
Figure 9: Carved Voxels after silhouette intersection test
Figure 10: Voxel Coloring. Given a set of basis images and a grid of voxels, color values
to voxels have to be assigned in a way that is consistent with all images
Figure 11: Example of Spatial ambiguity. Both voxel colorings appear identical from
these two viewpoints, despite having no colored voxels in common
Figure 12: Example of Color ambiguity. Both voxel colorings appear identical from these
two viewpoints. But the second row, center voxel has different color assignment in the
two scenes
Figure 13: Each of the six voxels has the same color in every consistent scene in which it
is contained. The collection of all such color invariants forms a consistent voxel coloring
denoted by S
Figure 14: Compatible camera configurations. (a) An overhead inward-facing camera
moving 360 degrees around the object. (b) An array of outward facing cameras.
Figure 15: 2D Layered scene traversal. Voxels can be partitioned into a series of layers of
increasing distance from the camera volume
Figure 16: 3D Layered Scene Traversal. Starts with L0 through L3
Figure 17: Result of voxel coloring algorithm alone
Figure 18: Final carved voxels
Figure 19: Triangulate Voxels
Figure 20: Indexing convention of the vertices and edges of a voxel
Figure 21: Vertex 1 is inside the surface and the rest of the vertices are outside the
surface
Figure 22: Final Surface Reconstructed model
Figure 23: Formation of holes in the surface
Figure 24: Two different configurations of triangulation with the same set of intersection points
Figure 25: Extended Lookup Table
Figure 26: Resolving the ambiguity on a face
Figure 27: Two configurations of case 4
Figure 28(a-d): Four Different Views of the Phantom Model
Figure 29(a-c): Three Different Views of the Actual Specimen
Figure 30: Projection of the Carved Voxels on the Silhouettes
Figure 31: 3D point-cloud obtained from the Marching Cube algorithm
1. INTRODUCTION

1.1. Overview

The problem of acquiring a 3D model from a set of input images is a challenging task. Due to new graphics-oriented applications like tele-presence, virtual walkthroughs and virtual view synthesis, it has attracted attention in the computer vision community. Different approaches have been adopted to achieve this depending on the application: reconstruction from stereo images [2], [3], [4], [5], or from multiple images from a single camera [6], [7], [14]. In this project a combination of voxel carving from silhouettes and the voxel coloring method [1], [6] is used to reconstruct a 3D model by taking multiple images of the phantom model of the stomach affected by aortic aneurysm. The foreground, which contains the image of the organ, is separated from the background by calculating an appropriate background model. A voxel model is generated from those images, and a triangulated surface model is generated from the voxel model. Finite element analysis will then be performed on the model to study the mechanisms involved in the AAA disease.
1.2. Related work

Different methods have been tried to reconstruct 3D graphical models of real objects. Reconstruction by silhouette-based volume intersection is one of these methods [7], [14]. The silhouette-based method constructs an approximate visual hull of the object. Some excess volume is produced in this approximated visual hull due to concavities on the object and insufficient camera viewing angles. Rather than using binary silhouette images, shape from photo-consistency employs additional photometric (color) information [1], [6]. The method is similar to the Space-Sweep approach presented in [8] and [9], which performs an analogous scene traversal. In [9] a plane is swept through the scene volume and votes are accumulated for points on the plane that project to edge features in the images. Scene features are identified by modeling the statistical likelihood of accidental accumulation and thresholding the votes to achieve a desired false-positive rate. This approach is useful in the case of limited occlusions, but does not provide a general solution to the visibility problem.
A similar edge-based voting technique that uses linear subspace intersections
instead of a plane sweep to obtain feature correspondences is described in [28]. In this
approach, each point or line feature in an image “votes” for the scene subspace that
projects to that feature. Votes are accumulated when two or more subspaces intersect,
indicating the possible presence of a point or line feature in the scene. A restriction of this
technique is that it detects correspondences only for features that appear in all input
images.
In [10] and [11] a dome of cameras was built to capture real-world dynamic scenes. The intensity images and depth maps from each camera view at each time instant are combined to form a Visible Surface Model (VSM) using a multibaseline stereo algorithm. A VSM encodes the structure of the scene visible to a camera (view dependent). By merging the depth maps from different cameras in a common volumetric space, a Complete Surface Model (CSM, view independent) is generated. Generating a CSM from VSMs works well when the VSMs to be merged are individually accurate. A disadvantage is that the original images are not used in the merging process, so it is difficult to assess the photo integrity of the reconstruction.
2. CAMERA CALIBRATION

2.1. Introduction to Camera Calibration

The objective of camera calibration is to determine a set of camera parameters that describe the mapping between 3D reference coordinates and 2D image coordinates. Camera calibration in the context of 3-dimensional machine vision is the process of determining the internal camera geometric and optical characteristics (internal parameters) and the 3D position and orientation of the camera reference frame relative to a certain world coordinate system (external parameters). The overall performance of the system strongly depends on the accuracy of the camera calibration.
The pinhole camera model is used throughout the calibration procedure. This model is based on the principle of collinearity, where each point in the object space is projected by a straight line through the projection center onto the image plane. The pinhole model is only an approximation of the real camera projection. It is not sufficient when high accuracy is required, and therefore a more comprehensive camera model must be used. The pinhole model is a basis that is extended with corrections for the distorted image coordinates. The most commonly used correction is for radial lens distortion, which causes the actual image point to be displaced radially in the image plane. Also, the centers of curvature of the lens surfaces are not always strictly collinear, which introduces another common distortion type, decentering distortion, which has both radial and tangential components. A proper camera model for accurate calibration can be derived by combining the pinhole model with corrections for the radial and tangential distortion components [12, 13].

The calibration is done in two steps: initialization and then nonlinear optimization. The initialization step computes a closed-form solution for the calibration parameters, not including any lens distortion. The nonlinear optimization step minimizes the total reprojection error over all the calibration parameters. The optimization is done by iterative gradient descent with an explicit computation of the Jacobian matrix.
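The optimization step can be illustrated with a deliberately simplified sketch: recovering a single intrinsic parameter (the focal length) by gradient descent on the total squared reprojection error, with the Jacobian of the residual written out explicitly. This is a toy stand-in under assumed synthetic data, not the toolbox's actual implementation, which optimizes all parameters jointly.

```python
import numpy as np

def reproject(f, pts_cam, cc):
    """1D pinhole projection u = f * x/z + cc (distortion omitted for clarity)."""
    return f * pts_cam[:, 0] / pts_cam[:, 2] + cc

def calibrate_focal(pts_cam, u_obs, cc, f0=500.0, lr=1.0, iters=200):
    """Gradient descent on the total squared reprojection error over f only."""
    f = f0
    n = len(u_obs)
    for _ in range(iters):
        residual = reproject(f, pts_cam, cc) - u_obs
        jac = pts_cam[:, 0] / pts_cam[:, 2]   # d(residual)/df, computed explicitly
        f -= lr * 2.0 * (residual @ jac) / n  # step along the negative gradient
    return f
```

With noise-free synthetic correspondences the estimate converges to the true focal length; in the real procedure the same idea is applied jointly to focal length, principal point, skew and distortion coefficients.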
2.2. Parameters

2.2.1. Intrinsic Parameters

The internal camera model is very similar to that used by Heikkilä [13]. The internal parameters of the camera are:

• Focal length (fc): focal length in pixels
• Principal point (cc): the center of the camera image plane in pixels
• Skew coefficient (alpha_c): angle between the x and y pixel axes
• Distortions (kc): the image distortion coefficients (radial and tangential distortions)
Definition of the intrinsic parameters:

Let P be a point in space with coordinate vector XX_C = [X_C; Y_C; Z_C] in the camera reference frame. The projection of P onto the image plane is obtained from the intrinsic parameters as follows.
Let x_n be the normalized pinhole image projection:

    x_n = [X_C/Z_C ; Y_C/Z_C] = [x ; y]                                      (1)

Let

    r^2 = x^2 + y^2                                                          (2)

After including the lens distortion, the new normalized point coordinate x_d is defined as follows:

    x_d = [x_d(1) ; x_d(2)] = (1 + kc(1) r^2 + kc(2) r^4 + kc(5) r^6) x_n + dx    (3)

where dx is the tangential distortion vector:

    dx = [ 2 kc(3) x y + kc(4) (r^2 + 2 x^2) ;
           kc(3) (r^2 + 2 y^2) + 2 kc(4) x y ]                               (4)

Once the distortion is applied, the final pixel coordinates x_pixel = [x_p ; y_p] and the normalized coordinate vector x_d are related to each other through the linear equation:

    [x_p ; y_p ; 1] = KK * [x_d(1) ; x_d(2) ; 1]                             (5)

where KK is known as the camera matrix:

    KK = [ fc(1)   alpha_c * fc(1)   cc(1)
           0       fc(2)             cc(2)
           0       0                 1     ]                                 (6)
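Equations (1) through (6) can be sketched as a single projection routine. The parameter values used in the usage example below are made up for illustration; real values come from the calibration.

```python
import numpy as np

def project_point(Xc, fc, cc, alpha_c, kc):
    """Project a 3D point in the camera frame to pixel coordinates, eqs (1)-(6)."""
    x, y = Xc[0] / Xc[2], Xc[1] / Xc[2]           # eq (1): normalized projection
    r2 = x * x + y * y                            # eq (2)
    # eq (4): tangential distortion vector (kc is 0-indexed here: kc[2] = kc(3), ...)
    dx = np.array([2 * kc[2] * x * y + kc[3] * (r2 + 2 * x * x),
                   kc[2] * (r2 + 2 * y * y) + 2 * kc[3] * x * y])
    # eq (3): radial distortion factor applied to the normalized point
    radial = 1 + kc[0] * r2 + kc[1] * r2 ** 2 + kc[4] * r2 ** 3
    xd = radial * np.array([x, y]) + dx
    # eqs (5)-(6): the camera matrix KK maps distorted coordinates to pixels
    KK = np.array([[fc[0], alpha_c * fc[0], cc[0]],
                   [0.0,   fc[1],           cc[1]],
                   [0.0,   0.0,             1.0]])
    u, v, _ = KK @ np.array([xd[0], xd[1], 1.0])
    return u, v
```

For example, with zero distortion and zero skew, the point [0.1; 0.2; 1.0] with fc = (600, 600) and cc = (320, 240) projects to (380, 360).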
2.2.2. Extrinsic Parameters

The extrinsic parameters are the rotation and translation matrices. Consider the calibration grid of one of the calibration images.
Figure 1: Calibration grid of one of the calibration images, showing the grid reference frame (O, X, Y, Z), image points (+) and reprojected grid points (o)
Let P be a point in space with coordinate vector XX = [X; Y; Z] in the grid reference frame (Figure 1). Let XX_C = [X_C; Y_C; Z_C] be the coordinate vector of P in the camera reference frame. Then XX and XX_C are related to each other through the following rigid motion equation:

    XX_C = R_C * XX + T_C                                                    (7)

The translation vector T_C is the coordinate vector of the origin of the grid pattern (O) in the camera reference frame, and the third column of the matrix R_C is the surface normal vector of the plane containing the planar grid in the camera reference frame.
2.3. Calibration Steps

For the calibration process, the Camera Calibration Toolbox for Matlab developed by the vision group at Caltech [12] is used. A checkerboard pattern is used to calculate the intrinsic and extrinsic parameters of the camera (see Appendix 3). The calibration is done in two steps: initialization and nonlinear optimization. The initialization step computes a closed-form solution for the calibration parameters which does not include any lens distortion. The nonlinear optimization step minimizes the total reprojection error over all the calibration parameters. The optimization is done by iterative gradient descent.
2.4. Setup

A Scorpion B&W CCD camera with a resolution of 640 by 480 is used to capture the images. The camera has a 12-pin GPIO interface which can be used to trigger the camera to capture images (see Appendix 1). An RT-12 indexed rotary positioning table is used to capture the images of the object from different angles. A programmable controller is used to drive the MDrive 23 stepper motor, which controls the position of the rotary table in precise steps (see Appendix 2). The trigger signal on the GPIO pin can be synchronized with the position of the turntable such that the camera is triggered when the turntable rotates by certain fixed step angles and stops. By doing so we can be sure that the turntable is in exactly the same position while capturing the image of the object and while capturing the image of the checkerboard pattern to calculate the extrinsic parameters. Figure 2 shows the setup.
Figure 2: The Setup
3. IMAGE ACQUISITION

Once all the required calibration parameters are acquired, the next step is to take images of the object to be modeled. The object space is illuminated by a set of lights and is assumed to be an approximately Lambertian scene with constant illumination. A set of 20 images is captured without the object of interest in the space by a camera (in this case a Scorpion B&W CCD camera with a resolution of 640 by 480 was used). The average of these is taken to remove any salt-and-pepper noise that might be introduced in the image acquisition process. This image is used as the reference background.

A programmable controller is used to drive a motor which controls the indexed turntable in precise steps, e.g., 10 degrees. The object to be modeled is placed at the center of the turntable. Images of this object are captured at the predefined intervals, e.g., 36 images for 10-degree steps.
3.1. Background Subtraction and Silhouette Extraction
The problem of extracting an object from an image or a video sequence is a
fundamental and crucial problem of many vision systems that include video surveillance,
object detection and tracking or human-machine interface. Typically, the approach for
discriminating objects from the background scene is background subtraction. The idea of
background subtraction is to subtract the current image from a reference image, which is
acquired from a static background during a period of time. The subtraction leaves only
non-stationary or new objects, which include the object's entire silhouette region. This
technique has been used for several years in many vision systems as a preprocessing step
for object detection [15].
One problem with generating silhouettes by background subtraction is removing shadows effectively. In our setup the object space is illuminated by a set of lights, which results in an approximately Lambertian space, that is, each point in the scene has approximately constant illumination. Hence there are no shadows cast by the object. An appropriate threshold is chosen and the images are segmented into background and foreground.
A background model is created by averaging 20 frames taken at regular intervals, as shown in Figure 3(a). This is done to cancel the effects of ambient light flickering and salt-and-pepper noise that may be introduced in the images. Figure 3(b) shows the image of the object placed at the center of the turntable. Background subtraction is performed to extract the object from the entire scene. The extracted object is shown in Figure 3(c).
This image is thresholded to get a binary image. An appropriate value for thresholding is
chosen by trial and error for one of the images. Since the light intensity remains almost
the same throughout the image acquisition process, the same value is used for the rest of
the images. The resulting image is the silhouette image as shown in Figure 3(d). A
silhouette image is a binary image, with the value at a point indicating whether or not the
visual ray from the optical center through that image point intersects an object surface in
the scene. Thus each pixel is either a silhouette point or a background point.
Figure 3: (a) Reference background; (b) image of the object; (c) image after background subtraction; (d) image after thresholding
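The background-modeling and thresholding steps above can be sketched as follows; the threshold value in the usage example is arbitrary, since in practice it is chosen by trial and error as described in the text.

```python
import numpy as np

def background_model(frames):
    """Average a stack of background frames to suppress noise and flicker."""
    return np.mean(np.stack(frames), axis=0)

def silhouette(image, background, threshold):
    """Binary silhouette: 1 where the image differs enough from the background."""
    diff = np.abs(image.astype(float) - background)
    return (diff > threshold).astype(np.uint8)
```

Applying `silhouette` to each of the 36 object images with the same threshold yields the binary silhouette images used for voxel carving.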
4. 3D RECONSTRUCTION

4.1. Introduction to Voxel Carving

Reconstructing a 3D shape using 2D silhouettes from multiple images is also called voxel carving, volume intersection or shape from silhouettes. The intersection of the cones associated with a set of camera views defines a volume of scene space in which the object is guaranteed to lie. The volume only approximates the true 3D shape, depending on the number of views, the position of the viewpoints, and the complexity of the object. Since concave patches are not observable in any silhouette, a silhouette-based reconstruction encloses the true volume. Laurentini [16] characterized the best approximation, obtainable from an infinite number of silhouettes captured from all viewpoints outside the convex hull of the object, as the visual hull.
Many methods have been developed for constructing volumetric models from a
set of silhouette images [7, 16, 17, 18, 19, 21, 22]. Starting from a bounding volume that
is known to enclose the entire scene, the volume is discretized into voxels and the task is
to create a voxel occupancy description corresponding to the intersection of back-
projected silhouette cones. The main step in these algorithms is the intersection test.
Some methods back project the silhouettes, creating an explicit set of cones that are then
intersected either in 3D [18, 19], or in 3D after projecting voxels into the images [20, 21].
Alternatively, it can be determined whether each voxel is in the intersection by projecting it into all of the images and testing whether it is contained in every silhouette [22].
In practice only a finite number of silhouettes are combined to reconstruct the scene, resulting in an approximation that includes the visual hull as well as other scene points. Figure 4 shows an example of the volume reconstructed from three silhouettes. The generalized cones associated with the three images result in a reconstruction that includes the object (black), points in concavities that are not visible from any viewpoint (brick texture), and points that are not visible from any of the three given views (gray).
Figure 4: Reconstruction from three views
Shapes reconstructed from silhouettes have been used successfully in a variety of applications, including virtual reality [23], real-time human motion modeling [24, 25] and building an initial coarse scene model [26]. In applications such as real-time image-based rendering of dynamic scenes, where an explicit scene model is not an essential intermediate step, new views can be rendered directly using a visual ray intersection test [27].
4.2. Voxel Carving by Silhouette Extraction

The volume of interest is divided into 80 × 80 × 180 equal-sized voxels, each a 1 mm cube. Figure 5 illustrates the voxel space created.
Figure 5: The Voxel space
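Constructing this voxel space amounts to enumerating the centers of an 80 × 80 × 180 grid of 1 mm cubes. A minimal sketch, with an arbitrary origin offset (in practice the grid is placed around the turntable center):

```python
import numpy as np

def make_voxel_grid(nx=80, ny=80, nz=180, size=1.0, origin=(0.0, 0.0, 0.0)):
    """Return an (N, 3) array of voxel-center coordinates in millimetres."""
    ix, iy, iz = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(nz),
                             indexing="ij")
    # Stack the integer indices, scale by voxel size, and shift to cell centers.
    centers = np.stack([ix, iy, iz], axis=-1).reshape(-1, 3) * size
    return centers + np.asarray(origin) + size / 2.0
```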
The voxels are projected on a particular image plane using the intrinsic and
extrinsic parameters of the camera calculated as described in section 2. The projection of
one voxel on a plane results in 8 points corresponding to the 8 vertices of the voxel cube.
A bounding box containing all the 8 points is calculated. Figure 6 illustrates the process
of projecting a particular voxel on to the camera image plane.
Figure 6: Projecting a voxel on to the camera image plane [7]
The following algorithm explains the process of projecting each of the voxels onto a camera image plane and calculating the bounding box associated with that voxel.

Internal parameters: fc is the focal length of the camera, cc is the image plane center, alpha_c is the skew coefficient, and kc is a 1×5 matrix containing the radial and tangential distortion coefficients.

External parameters: Tc_ext is the translation matrix and Rc_ext is the rotation matrix.
    cam_coordinates = Rc_ext * voxel_coordinates + Tc_ext                    (8)

    Xc = cam_coordinates(1)
    Yc = cam_coordinates(2)
    Zc = cam_coordinates(3)                                                  (9)

    normalized_projection = [Xc/Zc ; Yc/Zc]                                  (10)

    x = normalized_projection(1)
    y = normalized_projection(2)
    r^2 = x^2 + y^2                                                          (11)

    dx = [ 2*kc(3)*x*y + kc(4)*(r^2 + 2*x^2) ;
           kc(3)*(r^2 + 2*y^2) + 2*kc(4)*x*y ]                               (12)

    distortion_coordinates = (1 + kc(1)*r^2 + kc(2)*r^4) * [x ; y] + dx      (13)

    xd = distortion_coordinates(1)
    yd = distortion_coordinates(2)                                           (14)

    pixel_x = round(fc(1)*(xd + alpha_c*yd) + cc(1))
    pixel_y = round(fc(2)*yd + cc(2))                                        (15)
The above algorithm is applied to each of the 8 vertices of every voxel. pixel_x and pixel_y contain the x and y coordinates of the projection of a voxel's vertex onto the image plane.

From the eight points obtained from the above algorithm, the maximum and minimum values of the x and y coordinates are calculated. From these a bounding box that contains all 8 projected points is constructed.
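The bounding-box step can be sketched directly: given the eight projected vertices of one voxel in pixel coordinates, take the per-axis minima and maxima.

```python
import numpy as np

def bounding_box(pixel_xy):
    """pixel_xy: (8, 2) array of projected voxel vertices.
    Returns (xmin, ymin, xmax, ymax) of the axis-aligned enclosing box."""
    pts = np.asarray(pixel_xy)
    xmin, ymin = pts.min(axis=0)
    xmax, ymax = pts.max(axis=0)
    return xmin, ymin, xmax, ymax
```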
Figure 7 shows the bounding box calculated from the projected voxel vertices on one of the camera image planes from one particular view. The red dots are the 8 vertices of a voxel and the blue square is the bounding box that encloses the projected voxel vertices on the image plane.
Figure 7: Bounding Box
Each voxel is classified as either inside or outside the silhouette by checking for an overlapping region between the bounding box associated with that voxel and the silhouette. This is called the voxel intersection test. The voxels that are outside are discarded. Figure 8 shows the projection of three voxels on the silhouette of the object taken at zero degrees. One of them is completely inside the silhouette, the second voxel is on the surface, and the third voxel is completely outside. The first two voxels are kept and the third voxel is discarded.
Figure 8: Projection of voxels on the silhouette on one of the camera views
This process is repeated for all the camera views. At the end the voxels that are
retained represent the visual hull of the object. The result is shown in Figure 9.
Figure 9: Carved Voxels after silhouette intersection test
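The intersection test and the per-view carving loop can be sketched as follows: the fraction of silhouette pixels inside a voxel's bounding box determines whether the voxel is inside, on the surface, or outside, and a voxel survives only if no view classifies it as outside.

```python
import numpy as np

def classify_voxel(silhouette, box):
    """silhouette: binary image (rows = y); box: (xmin, ymin, xmax, ymax) in pixels."""
    xmin, ymin, xmax, ymax = box
    patch = silhouette[ymin:ymax + 1, xmin:xmax + 1]
    frac = patch.mean() if patch.size else 0.0   # fraction of silhouette pixels
    if frac == 1.0:
        return "inside"
    if frac == 0.0:
        return "outside"
    return "surface"

def voxel_survives(silhouettes, boxes):
    """Keep a voxel only if it is not outside the silhouette in any view."""
    return all(classify_voxel(s, b) != "outside"
               for s, b in zip(silhouettes, boxes))
```

Repeating this for every voxel and every camera view leaves exactly the voxels of the visual hull, as in Figure 9.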
4.3. Voxel Carving by Coloring

As explained in section 4.1, the voxel carving algorithm generates only the visual hull of the object. This is an approximation of the actual object. The algorithm cannot handle any concavities that might be present on the surface of the object.
Scene reconstruction by voxel coloring [1] is another technique, different from other approaches in its ability to cope with large changes in visibility and occlusions. The voxel coloring problem is to assign colors (radiances) to voxels in a 3D volume so as to achieve consistency with a set of basis images, as illustrated in Figure 10. It is assumed that the scene is composed of approximately Lambertian surfaces under fixed illumination. Under these conditions, the radiance at each point is isotropic and can therefore be described by a scalar value called color. A 3D scene S is represented as a set of opaque Lambertian voxels, each of which occupies a finite homogeneous scene volume centered at a point V ∈ S and has an isotropic radiance color(V, S). It is assumed that the scene is entirely contained within a known, finite bounding volume. The set of all voxels in the bounding volume is referred to as the voxel space and denoted by ν. An image is specified by the set I of all its pixels, each centered at a point p ∈ I and having irradiance color(p, S).
Figure 10: Voxel Coloring. Given a set of basis images and a grid of voxels, color values
to voxels have to be assigned in a way that is consistent with all images [1]
Given an image pixel p ∈ I and scene S, we refer to the voxel V ∈ S that is visible in I and projects to p as V = S(p). A scene S is said to be complete with respect to a set of images if, for every image I and every pixel p ∈ I, there exists a voxel V ∈ S such that V = S(p). A complete scene is said to be consistent with a set of images if, for every image I and every pixel p ∈ I,

    color(p, I) = color(S(p), S)                                             (16)
If N denotes the set of all consistent scenes, then the voxel coloring problem can be defined as:

• Given a set of basis images I0, …, In of a static Lambertian scene and a voxel space ν, determine a subset S ⊂ ν and a coloring color(V, S), such that S ∈ N.
Two issues have to be addressed in this case:

• Uniqueness: multiple voxel colorings may be consistent with a given set of images
• Computation: how to compute a voxel coloring from a set of input images without combinatorial search

A consistent voxel coloring exists, corresponding to the set of points and colors on surfaces of the true Lambertian scene. But the voxel coloring is rarely unique, given that a set of images can be consistent with more than one 3D scene. By spatial ambiguity, a voxel contained in one scene may not be contained in another, as illustrated in Figure 11. And by color ambiguity, a voxel may be contained in two consistent scenes but have different colors in each, as illustrated in Figure 12. Hence additional constraints are needed to make the problem well defined.
Figure 11: Example of Spatial ambiguity. Both voxel colorings appear identical from
these two viewpoints (S and S’), despite having no colored voxels in common [1]
Figure 12: Example of Color ambiguity. Both voxel colorings appear identical from these
two viewpoints (S and S’). But the second row, center voxel has different color
assignment in the two scenes [1]
4.3.1. Color Invariants

The only way to recover intrinsic scene information is through invariants: properties that are satisfied by every consistent scene. For instance, consider the set of voxels that are contained in every consistent scene. Laurentini [29] described how these invariants, called hard points, could be recovered by volume intersection from silhouette images. Hard points provide absolute information about the true scene but are relatively rare; some images may yield none. A more frequently occurring type of invariant is related to color rather than shape. A voxel V is said to be color invariant with respect to a set of images if:

• V is contained in a scene consistent with the images
• For every pair of consistent scenes S and S′, V ∈ S ∩ S′ implies color(V, S) = color(V, S′).
Unlike shape invariance, color invariance does not require that a point be
contained in every consistent scene. As a result, color invariants are more prevalent than
hard points. The union of all color invariants itself yields a consistent scene, i.e., a
complete voxel coloring as illustrated in Figure 13.
Figure 13: Each of the six voxels has the same color in every consistent scene in which it
is contained. The collection of all such color invariants forms a consistent voxel coloring
denoted by S [1]
4.3.2. Ordinal Visibility Constraint

Color invariants are defined with respect to the combinatorial space N of all consistent scenes. In order to compute the color invariants by a single pass through the voxel space, the input camera configurations must satisfy the ordinal visibility constraint, which is stated as:

• There exists a real non-negative function D: ℝ³ → ℝ such that, for all scene points P and Q and all input images I, P occludes Q in I only if D(P) < D(Q).

Figure 14 shows two possible camera configurations satisfying the constraint. In our experiments the configuration shown in Figure 14(a) is used.
Figure 14: Compatible camera configurations. (a) An overhead inward-facing
camera moving 360 degrees around the object. (b) An array of outward facing
cameras [1]
4.3.3. Voxel Coloring by Layered Scene Decomposition

The ordinal visibility constraint limits the possible camera view configurations, but it simplifies the visibility relationships. It becomes possible to partition the scene into a series of voxel layers that obey a visibility relationship, i.e., for every input image, voxels only occlude other voxels that are in subsequent layers. Hence the visibility relationships are resolved by evaluating voxels one layer at a time.

To formalize the idea of visibility ordering, the following partition of the 3D space into voxel layers of uniform distance from the camera volume is defined:
    ν_d = { V | D(V) = d },        ν = ⋃_{i=1}^{r} ν_{d_i}                   (17)

where d_1, …, d_r is an increasing sequence of distances from the camera volume.
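The partition of equation (17) can be sketched as grouping voxel indices by the value of the distance function D and returning the groups in increasing order:

```python
def layers(voxels, D):
    """Group voxel indices into layers of increasing distance D, as in eq. (17)."""
    by_distance = {}
    for i, v in enumerate(voxels):
        by_distance.setdefault(D(v), []).append(i)
    # Return the layers ordered by increasing distance from the camera volume.
    return [by_distance[d] for d in sorted(by_distance)]
```

For the linear camera arrangement discussed next, D is simply the orthogonal distance to the line of cameras.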
For the sake of illustration, consider a set of views positioned along a line facing a two-dimensional scene, as shown in Figure 15. Choosing D to be the orthogonal distance to the line gives rise to a series of parallel linear layers that move away from the cameras. Notice that for any two voxels P and Q, P can occlude Q from a basis viewpoint only if Q is in a higher layer than P. The linear case is easily generalized to any set of cameras satisfying the ordinal visibility constraint.
Figure 15: 2D Layered scene traversal. Voxels can be partitioned into a series of layers of
increasing distance from the camera volume [1]
Decomposition of a 3D scene can be done in a similar manner. In the 3D case the
layers become surfaces that expand outward from the camera volume as shown in Figure
16.
Figure 16: 3D Layered Scene Traversal. Starts with L0 through L3
L0
L3
d2 d3
21
To compensate for the effects of image quantization and noise, suppose that the
images are discretized on a grid of finite non-overlapping pixels. If a voxel V is not fully
occluded in image I_j, its projection overlaps a nonempty set of image pixels, π_j.
Without noise or quantization effects, a consistent voxel should project to a set of pixels
with equal color values. In the presence of these effects, a correlation measure λ_V of the
pixel colors is evaluated to assess the likelihood of voxel consistency. In this case, λ_V
was chosen to be the standard deviation of the image pixels onto which the voxel V
projects.
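The consistency measure described above can be sketched as follows; the pixel colour sets and the threshold are made-up illustrative values, and averaging the per-channel standard deviations is an assumed detail.

```python
import numpy as np

def voxel_consistency(pixel_colors, threshold):
    """pixel_colors: list of (n_j, 3) arrays, one per image I_j,
    holding the RGB values of the unoccluded pixels pi_j.
    Returns (lambda_V, consistent)."""
    all_pixels = np.vstack(pixel_colors)   # pool pi_j over all images
    lam = all_pixels.std(axis=0).mean()    # average per-channel std dev
    return lam, lam < threshold

# A voxel seeing nearly the same colour in two views is consistent:
view1 = np.array([[200, 10, 10], [202, 12, 9]], dtype=float)
view2 = np.array([[201, 11, 10]], dtype=float)
lam, ok = voxel_consistency([view1, view2], threshold=5.0)
```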
4.3.4. Single Pass Algorithm

To evaluate the consistency of a voxel V, first the set of pixels π_j that
overlap V's projection in I_j is calculated. Neglecting occlusions, it is straightforward to
compute a voxel's image projection from the voxel's shape and the known camera
configuration (intrinsic and extrinsic parameters). The term footprint [36] is used to
denote this projection, corresponding to the intersection with the image plane of all rays
from the camera center intersecting the voxel. Accounting for occlusions is more
difficult: only those images, and the pixel positions within them, from which V is
visible should be included. This difficulty is resolved by using the ordinal visibility
constraint to visit voxels in an occlusion-compatible order and marking pixels as they
are accounted for.
Initially all pixels are unmarked. When a voxel V is visited, π_j is defined to be the
set of unmarked pixels that overlap V's footprint. When a voxel is evaluated and found to
be consistent, all pixels in π_j are marked. Because of the occlusion-compatible order of
voxel evaluation, this strategy is sufficient to ensure that π_j contains only the pixels from
which each voxel is visible. By assumption, voxels within a layer do not occlude each
other. The complete single-pass voxel coloring algorithm is given in [1].
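A sketch of the single-pass procedure, with the layer decomposition, footprint and consistency test supplied as toy stand-ins rather than the thesis's implementation:

```python
def voxel_coloring(layers, images, footprint, consistent):
    """layers    : list of voxel lists, nearest layer first
       images    : list of 2D pixel-colour grids
       footprint : (voxel, image_index) -> list of (row, col) pixels
       consistent: list of colours -> (colour, bool)
       Returns a dict of coloured (kept) voxels."""
    marked = [set() for _ in images]          # pixels already accounted for
    colored = {}
    for layer in layers:                      # occlusion-compatible order
        for v in layer:
            pixels, hits = [], []
            for j, img in enumerate(images):
                pj = [p for p in footprint(v, j) if p not in marked[j]]
                pixels += [img[r][c] for (r, c) in pj]
                hits.append(pj)               # pi_j for this image
            if not pixels:
                continue                      # fully occluded in all views
            color, ok = consistent(pixels)
            if ok:                            # voxel kept: mark its pixels
                colored[v] = color
                for j, pj in enumerate(hits):
                    marked[j].update(pj)
    return colored

# Toy run: two 1-pixel views agreeing on colour 5. Voxel "A" in the near
# layer is coloured and marks the pixel; voxel "B" behind it is occluded.
images = [[[5]], [[5]]]
footprint = lambda v, j: [(0, 0)]
consistent = lambda px: (px[0], max(px) - min(px) <= 0)
result = voxel_coloring([["A"], ["B"]], images, footprint, consistent)
```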
The threshold corresponds to the maximum allowable correlation error. A very
small value results in an accurate but incomplete reconstruction; a large value yields a
more complete reconstruction but includes some erroneous voxels. Instead of
thresholding the correlation error, it is possible to optimize for model completeness.
A completeness threshold may be chosen that specifies the minimum allowable
percentage of image pixels that must be accounted for by the reconstruction. For
instance, a completeness threshold of 75% requires that at least 3/4 of the image pixels
correspond to the projection of the colored voxels.
The result of applying the voxel coloring algorithm alone to the voxel space to
carve out the object is shown below in Figure 17. Some extra voxels are present in the
final carved result. This is because the algorithm assumes that the scene is composed of
Lambertian surfaces under fixed illumination. This is an ideal condition that does not
hold in a practical scenario, as the surface of the object may reflect light specularly and
the scene may be illuminated by ambient light. The extra voxels are the result of these
effects. Better results are achieved by first obtaining the visual hull of the object by
silhouette intersection and then applying the voxel coloring algorithm to the voxel set
representing that hull. Figure 18 shows the result of carving by first applying the
silhouette intersection test and then the voxel coloring algorithm.
Figure 17: Result of voxel coloring
algorithm alone
Figure 18: Final carved
voxels
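The silhouette-intersection pre-test described above can be sketched as follows; the projection function and silhouette masks here are toy stand-ins, not the calibrated projection used in the thesis.

```python
import numpy as np

def silhouette_test(voxels, silhouettes, project):
    """voxels     : (N, 3) voxel centres
       silhouettes: list of boolean masks, one per view
       project    : (points, view_index) -> (N, 2) integer pixel coords
       Returns the boolean keep-mask: inside every silhouette."""
    keep = np.ones(len(voxels), dtype=bool)
    for j, sil in enumerate(silhouettes):
        uv = project(voxels, j)
        inside = (uv[:, 0] >= 0) & (uv[:, 0] < sil.shape[0]) \
               & (uv[:, 1] >= 0) & (uv[:, 1] < sil.shape[1])
        hit = np.zeros(len(voxels), dtype=bool)
        hit[inside] = sil[uv[inside, 0], uv[inside, 1]]
        keep &= hit                  # reject on the first view that misses
    return keep

def project(p, j):
    # toy orthographic projection: drop z for view 0, drop y for view 1
    return p[:, [0, 1]].astype(int) if j == 0 else p[:, [0, 2]].astype(int)

sil = np.zeros((4, 4), dtype=bool)
sil[1:3, 1:3] = True                 # a 2x2 silhouette in both views
vox = np.array([[1, 1, 1], [1, 1, 3], [0, 0, 0]], dtype=float)
keep = silhouette_test(vox, [sil, sil], project)
```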
4.4. Surface Reconstruction

The next step in the process is constructing a surface model from the voxel-carved
model generated in the previous steps. Depending on the application, several
approaches to the 3D surface generation problem have been proposed [30, 31, 32, 33].
The most popular approach for generating triangular surfaces from sampled scalar data
structured on a cubical grid is Marching Cubes [30, 31].
Marching Cubes uses a divide and conquer approach to locate the surface in a
logical cube (voxel). The algorithm determines how the surface intersects this voxel, then
moves (or marches) to the next voxel. To find the surface intersection within a voxel, a
value of zero is assigned to a voxel vertex if the data value at that vertex is below the
value of the surface being constructed; these vertices are inside the surface. Voxel
vertices whose values exceed (or equal) the surface value receive a value of one and are
outside (or on) the surface. The surface intersects those voxel edges where one vertex is
inside the surface (zero) and the other is outside (one). With this classification, the
surface topology within a voxel is determined.
Since there are 8 vertices in each voxel and two states, inside and outside, there
are 2^8 = 256 ways a surface can intersect the cube. By enumerating these 256 cases, a
lookup table that stores the surface-edge intersections is created. The table contains the
edges intersected for each case.
Triangulating the 256 cases is possible but tedious and error prone. Two different
symmetries of the cube reduce the number of cases from 256 to 14 patterns. Figure 19
shows the triangulation of the 14 patterns. Permutation of these 14 basic patterns using
complementary and rotational symmetry produces the 256 cases.
Figure 19: Triangulation of the 14 basic voxel patterns
The indexing convention used to number the edges and vertices in our algorithm
is shown below in Figure 20.
Figure 20: Indexing convention of the vertices and edges of a voxel
If vertex 1 is below, i.e. inside, the isosurface (value zero) and all other vertices
are above it (value one), then a triangular surface that cuts edges 1, 4 and 9 is created,
as shown in Figure 21. The exact positions of the vertices of this triangular surface
depend on the values at vertices 1, 2, 4 and 5.
Figure 21: Vertex 1 is inside the surface and the rest of the vertices are outside the
surface.
Depending on the user-specified threshold value, each vertex of the voxel is
classified as either inside or outside the surface, and an 8-bit binary number, the voxel
index, is generated accordingly. For example, if vertex 1 is inside the surface and the
other vertices are outside, the voxel index is 11111110, where a value of zero indicates
that the vertex is inside and a value of one indicates that it is outside the surface; each
bit position corresponds to a vertex number, with vertex 1 in the least significant bit.
Thus if vertices 1, 2, 4 and 8 are inside, the voxel index is 01110100. A lookup table of
the intersecting edges is built. Given the voxel index, the corresponding entry in the edge
table gives the edges that will be intersected by the triangulated surface. For example, if
the voxel index is 11111110, the corresponding entry in the edge table is 000100001001;
that is, if vertex 1 is inside the surface then edges 1, 4 and 9 are intersected by the
surface. From this information, a triangulated surface model is generated according to
the convention shown in Figure 19.
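The voxel-index and edge-table lookup just described can be sketched as follows; only the single table entry quoted in the text is filled in, since a real implementation tabulates all 256 cases.

```python
def voxel_index(vertex_values, threshold):
    """Build the 8-bit index: bit k is 0 if vertex k+1 is inside
    (value below threshold), 1 if outside; vertex 1 is the LSB."""
    idx = 0
    for k, v in enumerate(vertex_values):      # vertices 1..8
        if v >= threshold:                     # outside the surface
            idx |= 1 << k
    return idx

# Partial edge table: voxel index -> 12-bit mask of intersected edges,
# edge 1 in the LSB. 0b000100001001 sets edges 1, 4 and 9.
EDGE_TABLE = {0b11111110: 0b000100001001}

def intersected_edges(idx):
    mask = EDGE_TABLE[idx]
    return [e + 1 for e in range(12) if mask >> e & 1]

# Vertex 1 below the threshold, vertices 2..8 above:
idx = voxel_index([0.1, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9], 0.5)
```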
The intersection points of the surface with the edges of the voxels can be calculated
by linear interpolation. If P1 and P2 are the vertices of a cut edge and V1 and V2 are the
scalar values at these vertices, the intersection point P is given by

P = P1 + (threshold - V1)(P2 - P1) / (V2 - V1)    - (18)
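Equation (18) reads directly as code; the edge endpoints and scalar values below are made-up examples.

```python
def edge_intersection(P1, P2, V1, V2, threshold):
    """Linear interpolation of the surface crossing along a cut edge.
    P1, P2: endpoint coordinates; V1, V2: scalar values there."""
    t = (threshold - V1) / (V2 - V1)
    return tuple(p1 + t * (p2 - p1) for p1, p2 in zip(P1, P2))

# A surface at level 0.5 crossing an edge running from value 0.0 to 1.0
# intersects it midway:
P = edge_intersection((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0.0, 1.0, 0.5)
```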
The last part of the algorithm involves forming the correct facets from the
positions where the surface intersects the edges of the voxel. Again a table is used,
indexed by the same voxel index, that gives the vertex sequence for as many triangular
facets as are necessary to represent the surface within the voxel. Figure 22 shows the
final surface-reconstructed model.
Figure 22: Final Surface Reconstructed model
4.4.1. Problems Associated with the Marching Cubes Algorithm

The main problems with Marching Cubes are the ambiguities inherent to the
data sampling. These ambiguities can appear on a face or inside a voxel and may lead
to small holes appearing in the reconstructed surface [34, 35].
4.4.1.1. Ambiguous Face

An ambiguity arises when a face has two diagonally opposite vertices inside the
surface (with value 0) and the other two diagonally opposite vertices outside the surface
(with value 1). For ambiguous faces, the information at the vertices is insufficient to
decide how to connect the intersection points on the edges. One such example is shown
in Figure 23: when two adjacent voxels, one of the form of case 3 and the other of the
form of case 6, are joined, a hole forms between them.
Figure 23: Formation of hole in the surfaces
4.4.1.2. Internal Ambiguities

The same set of intersection points on the edges may lead to different
configurations of tiling. One such example is shown in Figure 24.
Figure 24: Two different configurations of triangulation with the same set of
intersection points
4.4.2. Resolving Ambiguities

To resolve the ambiguities, the basic lookup table is extended with additional
cases, as shown in Figure 25, and the correct topology is selected by solving for the
ambiguities as explained in [35].
Figure 25: Extended Lookup Table
4.4.2.1. Resolving Ambiguities on the Face

For each configuration in the lookup table the Marching Cubes method uses only
one isosurface topology, while the trilinear interpolant over the voxel often permits
several different variants. The trilinear function is given below.

F(q,s,t) = F000(1-q)(1-s)(1-t) + F100 q(1-s)(1-t) + F010(1-q)s(1-t) + F110 qs(1-t)
         + F001(1-q)(1-s)t + F101 q(1-s)t + F011(1-q)st + F111 qst    - (19)
where q, s and t represent the local coordinates of the voxel, varying from 0 to 1,
and F000 ... F111 represent the values at the vertices of the voxel. The function F varies
bilinearly over a face or any plane parallel to a face. Fixing one of the variables, for
example setting q = q0, the equation takes the form

F(s,t) = A(1-s)(1-t) + B s(1-t) + C st + D(1-s)t    - (20)

where

A = (1-q0)F000 + q0 F100,    B = (1-q0)F010 + q0 F110,
C = (1-q0)F011 + q0 F111,    D = (1-q0)F001 + q0 F101
On a face of the voxel itself, A, B, C and D are equal to the values at the corners
of that face. On an ambiguous face, let A and C be inside the surface and B and D be
outside. To determine which nodes are joined, it is sufficient to compare the two
products AC and BD: if AC > BD, the nodes that are inside are joined and the outside
nodes are separated; otherwise the nodes that are outside are joined, as shown in Figure
26.
Figure 26: Resolving the ambiguity on a face
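The face test above can be sketched as follows; the corner values are made-up examples, and the corner-to-coefficient assignment follows Eq. (20).

```python
def face_corners(F, q0):
    """Corner values of the face q = q0, per Eq. (20).
    F: dict mapping (i, j, k) in {0,1}^3 to vertex values F_ijk."""
    A = (1 - q0) * F[0, 0, 0] + q0 * F[1, 0, 0]
    B = (1 - q0) * F[0, 1, 0] + q0 * F[1, 1, 0]
    C = (1 - q0) * F[0, 1, 1] + q0 * F[1, 1, 1]
    D = (1 - q0) * F[0, 0, 1] + q0 * F[1, 0, 1]
    return A, B, C, D

def inside_corners_joined(A, B, C, D):
    """Asymptotic decider: with A, C inside and B, D outside,
    join the inside corners iff A*C > B*D."""
    return A * C > B * D

# Example on a face of the voxel itself (q0 = 0), with the diagonal
# corners A and C high and B and D low:
F = {(i, j, k): 0.0 for i in (0, 1) for j in (0, 1) for k in (0, 1)}
F[0, 0, 0] = 0.9; F[0, 1, 1] = 0.8        # A and C
F[0, 1, 0] = 0.1; F[0, 0, 1] = 0.2        # B and D
A, B, C, D = face_corners(F, 0.0)
```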
4.4.2.2. Resolving Internal Ambiguities

There are different methods of solving internal ambiguities. One of them is the
comparison of the hyperbolas on the opposite faces of the voxel where the internal
ambiguity exists. If two regions of the same sign (inside or outside the surface) are
joined inside the voxel, then the projections of the hyperbolas must intersect each other.
Figure 27 shows two different configurations of case 4 [34, 35].
Figure 27: Two configurations of case 4
5. RESULTS

The reconstruction algorithm has been tried on both the phantom model and the
actual specimen. Section 5.1 shows the results for the phantom model and section 5.2
shows the results for the actual specimen.
5.1. On the Phantom Model

Figure 28 (a-d) shows the AAA phantom image (left), the voxel-carved model
(middle) and the surface model (right).
(a)
(b)
(c)
(d)
Figure 28: Four Different Views of the Phantom Model
5.1.1. Reconstruction Accuracy

Voxel Size (mm)      Volume of the Phantom Model (cc)   Calculated Volume of the Voxel Model (cc)   % Error
2mm x 2mm x 2mm      ~180                               191.48                                      6.37%
1mm x 1mm x 1mm      ~180                               182.773                                     1.54%
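For reference, the calculated volumes above follow from the voxel count times the single-voxel volume; the voxel count below is inferred from the 1 mm row of the table, and the ~180 cc nominal phantom volume is taken as the error reference.

```python
def voxel_volume_cc(n_voxels, voxel_mm):
    """Total volume of a carved model built from cubic voxels."""
    mm3 = n_voxels * voxel_mm ** 3
    return mm3 / 1000.0                     # 1 cc = 1000 mm^3

def percent_error(measured, reference):
    return abs(measured - reference) / reference * 100.0

# The 1 mm grid row of the table: 182,773 voxels of 1 mm^3 each
vol = voxel_volume_cc(182773, 1.0)
err = percent_error(vol, 180.0)
```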
5.2. On the Actual Specimen

Figure 29 (a-c) shows the results of voxel carving on the actual specimen.
(a)
(b)
(c)
Figure 29: Three Different Views of the Actual Specimen
Accurate reconstruction was not achieved on these images because of errors in the
calibration parameters, induced by movement of the specimen with respect to the
calibration grid over the sequence of images taken.

The projection of the final carved voxels onto the object silhouettes is shown in
Figure 30. These images show the error in the reconstruction.
Figure 30: Projection of the Carved Voxels on the Silhouettes
6. CONCLUSION AND FUTURE WORK

A surface model of the AAA phantom has been successfully reconstructed. The
model has to be refined to obtain a watertight surface.
6.1. Ways to Improve Results

Two different approaches can be taken to improve the results obtained.
6.1.1. Extended Lookup Table for the Marching Cubes Algorithm

The holes in the surface could be patched by using an extended lookup table for
the marching cubes algorithm, as explained in sections 4.4.1 and 4.4.2.
6.1.2. Voronoi-based Surface Reconstruction

Another method reconstructs a smooth surface from a finite set of unorganized
sample points by using 3-dimensional Voronoi diagrams [37]. The algorithm is based on
the 3-dimensional Voronoi diagram and Delaunay triangulation. The output of the
Marching Cubes algorithm is also a set of sample points; Figure 31 shows the 3D
point cloud obtained from it. Thus Voronoi-based surface reconstruction could be
applied to such a dataset. A thorough literature survey of this technique has yet to be
done.
Figure 31: 3D point-cloud obtained from the Marching Cube algorithm
6.2. Other Applications

Influence of Microstructure on Conditions for Vertebral Compression Fractures

The scope of this research is to examine the influence of microstructure in
creating conditions favorable for the occurrence of vertebral compression fractures. The
investigation will be conducted using synthetic trabecular bone microstructure core
samples and synthetic vertebrae, manufactured from microCT scans of human trabecular
bone and vertebrae using stereolithography rapid prototyping equipment.
The objectives are:

• Develop a process for designing, manufacturing and testing synthetic vertebrae
and vertebral trabecular bone specimens, including development of a systematic,
repeatable process for creating osteoporosis-affected microstructure from healthy
microstructure.

• Use the above processes to conduct a systematic study of the influence of
microstructural variability and deterioration on the mechanical response of
vertebrae and vertebral trabecular bone, with emphasis on quantifying and
understanding the microstructural characteristics and deformation mechanisms that
lead to vertebral compression fractures.

• Examine the applicability of a theoretical bifurcation approach to strain
localization for predicting the occurrence of vertebral compression fractures.
One aspect of the project is to provide a 3-dimensional representation of the
microstructure from microCT scans. The 3-dimensional surface reconstruction can be
achieved by building a voxel model and applying the marching cubes surface
triangulation algorithm [38].
References

[1] Steven M. Seitz and Charles R. Dyer, “Photorealistic Scene Reconstruction by
Voxel Coloring”, Int. Journal of Computer Vision, Vol. 35, No. 2, pp. 151-173,
1999.
[2] Paul E. Debevec, Camillo J. Taylor, and Jitendra Malik, “Modeling and rendering
architecture from photographs: A hybrid geometry and image based approach”, In
Proc. SIGGRAPH 96, pp. 11-20, 1996.
[3] P. J. Narayanan, Peter W. Rander, and Takeo Kanade, “Constructing virtual
worlds using dense stereo”, Proc, Sixth Int. Conf. on Computer Vision, pp. 3-10,
Jan 1998.
[4] Daniel Scharstein, “Stereo vision for view synthesis”, In Proc. Computer Vision
and Pattern Recognition Conf, pp. 852-858, 1996.
[5] H. Fuchs, G. Bishop, K. Arthur, L. McMillan, R. Bajcsy, S. Lee, H. Farid, and T.
Kanade, “Virtual Space Teleconferencing Using a Sea of Cameras”, Proc. First
Int. Conf. on Medical Robotics and Computer Assisted Surgery, June, 1994, pp.
161-167.
[6] W. Bruce Culbertson, Thomas Malzbender and Gregory G. Slabaugh,
“Generalized Voxel Coloring”, Proc. International Workshop on Vision
Algorithms, Sep 1999, pp. 100-115.
[7] Adem Yasar Mulayim, Ulas Yilmaz and Volkan Atalay, “Silhouette based 3D
Model Reconstruction from Multiple Images”, IEEE Transactions on Systems,
Man and Cybernetics, Part B.
[8] Robert T. Collins, “A Space-Sweep Approach to True Multi-image Matching”,
Proc. Computer Vision and Pattern Recognition Conf., pp. 358-363, 1997.
[9] Robert T. Collins, “Multi-image Focus of Attention for Rapid Site Model
Construction”, Proc. Computer Vision and Pattern Recognition Conf., pp. 575-
581, 1997.
[10] P. J. Narayanan, Peter W. Rander and Takeo Kanade, “Constructing Virtual
Worlds Using Dense Stereo”, Proc. Sixth IEEE Int. Conf. on Computer Vision.
pp. 3-10, 1998.
[11] Takeo Kanade, Peter Rander and P. J. Narayanan. “Virtualized Reality:
Constructing Virtual Worlds from Real Scenes”, IEEE Multimedia, 4(1): pp. 34-
46, 1997.
[12] Jean-Yves Bouguet, ‘Camera Calibration toolbox for Matlab’,
http://www.vision.caltech.edu/bouguetj/calib_doc/
[13] J. Heikkilä and O. Silvén, “A Four-step Camera Calibration Procedure with
Implicit Image Correction”, Proc. Computer Vision and Pattern Recognition
Conf., 1997.
[14] W. Niem and J. Wingbermühle, “Automatic reconstruction of 3D objects using a
mobile monoscopic camera”, In International Conf. on Recent Advances in 3D
Imaging and Modeling, May 1997, pp. 173-180.
[15] Horprasert, T., Harwood, D., and Davis, L.S. “A statistical approach for real-time
robust background subtraction and shadow detection”. In Proc. IEEE ICCV’99
FRAME-RATE Workshop, Kerkyra, Greece.
[16] A. Laurentini. “The visual hull concept for silhouette based image
understanding”, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 16,
No. 2. pp. 150-162, 1994.
[17] A. Laurentini. “How far 3D shapes can be understood from 2D silhouettes”, IEEE
Trans. Pattern Analysis and Machine Intelligence, Vol. 17, No. 2, pp. 188-195,
1995.
[18] H. Noborio, S. Fukada, and S. Arimoto. “Construction of the octree
approximating three-dimensional objects by using multiple views”, IEEE Trans.
on Pattern Analysis and Machine Intelligence, 10(6):769-782, 1988.
[19] S. K. Srivastava and N. Ahuja. “Octree generation from object silhouettes in
perspective views”, Computer Vision, Graphics and Image Processing, 49:68-84,
1990.
[20] T. H. Hong and M. Shneier. “Describing a robot’s workspace using a sequence of
views from a moving camera”, IEEE Trans. Pattern Analysis and Machine
Intelligence, 7:721-726, 1985.
[21] M. Potmesil. “Generating octree models of 3D objects from their silhouettes in a
sequence of images”, Computer Vision, Graphics and Image Processing, 40:1-20,
1987.
[22] R. Szeliski. “Rapid octree construction from image sequences”, Computer Vision,
Graphics and Image Processing: Image Understanding, 58(1):23-32, 1993.
[23] S. Moezzi, L-C. Tai, and P. Gerard. “Virtual view generation for 3D digital
video”, IEEE Multimedia, 4(1):18-26, 1997.
[24] G. K. M. Cheung, T. Kanade, J-Y. Bouguet, and M. Holler. “A real time system
for robust 3D voxel reconstruction of human motions”, In Proc. Computer Vision
and Pattern Recognition Conf. Vol 2. pp. 714-720, 2000.
[25] S. Moezzi, A. Katkere, D. Kuramura, and R. Jain. Reality modeling and
visualization from multiple video sequences. IEEE Computer Graphics and
Applications, 16(6):58–63, 1996.
[26] G. Cross and A. Zisserman. Surface reconstruction from multiple views using
apparent contours and surface texture. In A. Leonardis, F. Solina, and R. Bajcsy,
editors, Confluence of Computer Vision and Computer Graphics, pages 25–47.
Kluwer, 2000.
[27] W. Matusik, C. Buehler, R. Raskar, S. J. Gortler, and L. McMillan. Image-based
visual hulls. In Proc. SIGGRAPH 2000, pages 369–374, 2000.
[28] Steven M. Seitz and Charles R. Dyer. “Complete structure from four point
correspondences”. In Proc. Fifth Int. Conf. on Computer Vision, pages 330–337,
1995.
[29] Aldo Laurentini. “How far 3D shapes can be understood from 2D silhouettes”,
IEEE Trans. on Pattern Analysis and Machine Intelligence, 17(2):188–195, 1995.
[30] W. Lorensen, H. Cline. “Marching Cubes: A high resolution 3D surface
construction algorithm”, ACM Computer Graphics, Vol. 21, No. 4, pages: 163-
170, July 1987.
[31] C. Montani, R. Scateni, R. Scopigno. “Discretized Marching Cubes”,
Proceedings, Visualization 1994.
[32] B. Wünsche, J. Z. Lin. “An Efficient Topologically Correct Polygonisation
Algorithm for Finite Element Data Sets”, Proc. of IVCNZ, Nov. 2003.
[33] C-C. Ho, F-C. Wu, B-Y. Chen, Y-Y. Chuang, M. Ouhyoung. “Cubical Marching
Squares: Adaptive Feature Preserving Surface Extraction from Volume Data”,
EUROGRAPHICS, Vol. 24, Nov 2005.
[34] T. Lewiner, H. Lopes, A. W. Vieira and G. Tavares. “Efficient Implementation of
Marching Cubes’ cases with Topological Guarantees”, Journal of Graphics Tools,
Vol. 8, No. 2, pp. 1-15. 2003.
[35] E. V. Chernyaev. “Marching Cubes 33: Construction of topologically correct
Isosurfaces”, Technical Report CN/95-17, CERN, 1995
[36] Lee Westover. “Footprint evaluation for volume rendering”. In Proc. SIGGRAPH
90, pages 367–376, 1990.
[37] N. Amenta, M. Bern, M. Kamvysselis. “A New Voronoi-Based Surface
Reconstruction Algorithm”, Computer Graphics Proceedings, SIGGRAPH 98.
[38] R. Müller, T. Hildebrand and P. Rüegsegger. “Non-invasive bone biopsy: a new
method to analyse and display the three-dimensional structure of trabecular bone”,
Phys. Med. Biol., Vol. 39, 1994.
Appendix 1: Scorpion PGR 1394 Camera Model: SCOR-03NS
Figure 1: Picture of Scorpion Camera Module
Camera Specifications:
Table 1: Camera Specifications
Sensor Specifications:
Table 2: Sensor Specification
General Purpose Input/Output (GPIO) Pins:

The Scorpion has a set of 8 GPIO pins that can be accessed through the Hirose
HR10 (12-pin) external interface. These IO pins can be configured to accept an input
signal to externally trigger the camera, or to send an output signal or strobe to an
external device. To configure the GPIO pins, consult the PGR IEEE-1394 Digital
Camera Register Reference.
GPIO Connector Pin Layout:

The following diagram shows the pin layout for the Hirose HR10 12-pin female
circular connector (Manufacturer Part Number: HR10A-10R-12SB) used on all Scorpion
models. The male counterpart's Manufacturer Part Number is HR10A-10P-12P.
Figure 2: GPIO Pin Layout
GPIO Electrical Characteristics:

The Scorpion GPIO pins are TTL 3.3V pins protected by two diodes to +3.3V and
GND in parallel. There is also a 10K resistor in series to limit current. When configured
as inputs, the pins can be directly driven from a 3.3V or 5V logic output. For output,
each GPIO pin has almost no drive strength (they are high impedance) and needs to be
buffered with a transistor or driver to lower its impedance. The IO pins are protected
from both over- and under-voltage.
Appendix 2:

• Rotary Positioning Table
Figure 3: RT-12 Rotary Positioning Table
The RT-12 rotary positioning table can be used to position a variety of payloads
such as cameras or test fixtures. The 12” diameter aluminum top plate has 24 tapped
holes to attach the application. The RT-12 has a home switch that provides feedback to
the motion control system which tells the exact position of the table.
• MDrive 23 Stepper Motor

Mechanical Specifications:
Figure 4: Rotary MDrive23 Mechanical Specifications
Electrical Specifications:
Table 3: Electrical Specifications
Appendix 3: Camera Calibration

Generate the calibration pattern

Generate a checkerboard pattern and paste it on a flat panel. Measure the X and
Y dimensions of the squares. To make these values the defaults, the dX_default and
dY_default values in click_calib.m and click_calib_no_read.m (in the calib folder) can
be changed.
Camera Calibration Steps:

Start the main Matlab calibration function by typing calib_gui. In the standard
mode, all the calibration images are loaded into memory once and are not read from
disk again, which increases speed by reducing the number of disk accesses. However, if
the images are large or the number of images is high, Matlab may run out of memory. In
such cases the calibration toolbox offers a memory-efficient mode, in which the images
are loaded into memory one by one; this takes more time than the standard mode since
the disk is accessed multiple times. The two modes of operation are fully compatible
and interchangeable.
The mode of operation can be specified at the Matlab command prompt as
calib_gui(0) for standard mode or calib_gui(1) for memory-efficient mode.

Capture any number of images of the checkerboard pattern held in different
positions and orientations and store them in a common folder. The image names must
have a common base name followed by numbers in sequence, e.g. imgint1.jpg,
imgint2.jpg, imgint3.jpg, imgint4.jpg and so on. A few of the images are shown below.
Figure 5: Images of checker board pattern taken for camera calibration.
Corner extraction and calibration

Once the images are captured and stored, start the calibration function by typing
calib_gui(1) at the Matlab command prompt. This opens a GUI, shown below.
Figure 6: Matlab GUI for calibration.
Click on ‘Read Images’ button and when prompted enter the base name of the
image and the image format, in this case it would be ‘imgint’ and ‘jpg’ respectively.
Basename camera calibration images (without number nor suffix): imgint
Image format: ([]='r'='ras', 'b'='bmp', 't'='tif', 'p'='pgm', 'j'='jpg', 'm'='ppm') >> j
Checking directory content for the calibration images (no global image loading in
memory efficient mode)
Found images:
1...4...5...6...7...8...10...11...12...13...14...16...17...18...20...22...24...25...26...28...29...30...
done
To display the thumbnail images of all the calibration images, you may run
mosaic_no_read (may be slow)
Click on the ‘Extract grid corners’ button. At the prompt, press enter without any
arguments to select all images. Choose the default window size for the corner finder,
i.e. wintx = winty = 5, by pressing enter without any arguments.
Extraction of the grid corners on the images
Number(s) of image(s) to process ([] = all images) =
Window size for corner finder (wintx and winty):
wintx ([] = 5) =
winty ([] = 5) =
Window size = 11x11
Do you want to use the automatic square counting mechanism (0=[]=default)
or do you always want to enter the number of squares manually (1,other)?
The corner extraction algorithm includes an automatic mechanism for counting
the number of squares in the grid. This is useful when working with a large number of
images. In some cases the code may not predict the exact number of squares; this can
happen when calibrating lenses with extreme distortion. When prompted for automatic
square counting, press enter to choose the default option. At the end, images with
problems can be reprocessed by counting the squares manually.
The images are then displayed on the screen for corner extraction. The first image
is shown below.
Processing image 1...
Loading image imgint1.jpg...
Using (wintx,winty)=(5,5) - Window size = 11x11 (Note: To reset the window size,
run script clearwin)
Click on the four extreme corners of the rectangular complete pattern (the first
clicked corner is the origin)...
Figure 7: first calibration image.
The first clicked point is associated with the origin of the reference frame attached
to the grid. The other three points can be selected in any order. Selecting the first point is
especially important while calibrating multiple cameras in space. While dealing with
multiple cameras, the same grid pattern reference frame needs to be selected for different
camera images, i.e. grid points need to correspond across all the different camera views.
An example of how to select the corner points is shown below.
Figure 8: extraction of corner
Once all four corners are extracted, the algorithm prompts for the size of the
squares on the checkerboard pattern. The default value is 30mm, which matches the
pattern used here, so press enter without entering any arguments.
Size dX of each square along the X direction ([]=30mm) =
Size dY of each square along the Y direction ([]=30mm) =
The algorithm makes an initial guess of the corners and automatically counts the
number of squares in both the dimensions (shown in the figure below with red crosses)
Figure 9: initial guess for corner extraction.
If the predicted corners are close to the actual corners, the next step (entering an
initial guess for distortion) may be skipped, and the corners are extracted using those
positions as the initial guess.
If the guessed grid corners (red crosses on the image) are not close to the actual
corners,
it is necessary to enter an initial guess for the radial distortion factor kc (useful for
subpixel detection)
Need of an initial guess for distortion? ([]=no, other=yes)
Corner extraction...
Done
The final extracted corners are shown below. The origin of the reference frame is
marked with ‘O’.
Figure 10: extracted corners.
Sometimes the predicted corners are not close enough to the actual corners to
allow for an effective corner extraction. In such cases it is necessary to refine the
predicted corners by entering a guess for lens distortion coefficient. An example is shown
below.
Figure 11: The predicted Corners are not close to the real corners
If the predicted corners are too far from the real grid corners, the extraction may
fail; the main cause of this is image distortion. To help the system make a better guess
of the corner locations, the user can manually enter a guess for the first-order lens
distortion coefficient kc (the first entry of the full distortion coefficient vector kc). To do
so, enter a non-empty string (for example, 1) at the question “Need of an initial guess
for distortion?”, then enter a distortion coefficient such as kc = -0.3 (in practice this
number is typically between -1 and 1).
If the guessed grid corners (red crosses on the image) are not close to the actual
corners,
it is necessary to enter an initial guess for the radial distortion factor kc (useful for
subpixel detection)
Need of an initial guess for distortion? ([]=no, other=yes) 1
Use number of iterations provided
Use focal provided
Estimated focal: 2727.4256 pixels
Guess for distortion factor kc ([]=0): -0.3
Satisfied with distortion? ([]=no, other=yes) 1
Corner extraction... The red crosses should be on the grid corners...
Figure 12: New predicted corners
Figure 13: Final extracted corners
After corner extraction, the Matlab data file calib_data.mat is automatically
generated. This file contains all the information gathered throughout the corner
extraction stage (image coordinates, corresponding 3D grid coordinates, grid sizes).
During calibration, when there is a large amount of distortion in the image, the
program may not be able to automatically count the number of squares in the grid. In
that case, the numbers of squares in the X and Y directions have to be entered manually.
Main Calibration Step:

After the corner extraction, click the Calibration button on the Camera
Calibration Tool to run the main camera calibration procedure. Calibration is done in
two steps: initialization and nonlinear optimization. The initialization step computes a
closed-form solution for the calibration parameters, not including lens distortion. The
nonlinear optimization step then minimizes the total reprojection error over all the
calibration parameters. The optimization is done by iterative gradient descent with an
explicit computation of the Jacobian matrix.
Aspect ratio optimized (est_aspect_ratio = 1) -> both components of fc are estimated
(DEFAULT).
Principal point optimized (center_optim=1) - (DEFAULT). To reject principal
point, set center_optim=0
Skew not optimized (est_alpha=0) - (DEFAULT)
Distortion not fully estimated (defined by the variable est_dist):
Sixth order distortion not estimated (est_dist(5)=0) - (DEFAULT) .
Main calibration optimization procedure - Number of images: 22
Gradient descent iterations: 1…2…3…4…5…6…7…8…9…10…11...done
Estimation of uncertainties...done
Calibration results after optimization (with uncertainties):
Focal Length: fc = [ 1051.16340 1048.19912 ] ± [ 13.85102 13.49534 ]
Principal point: cc = [ 356.27700 237.85069 ] ± [ 20.24375 19.47471 ]
Skew: alpha_c = [0.00000] ± [ 0.00000 ] => angle of pixel axes = 90.00000 ±
0.00000 degrees
Distortion: kc = [ -0.30324 0.31284 -0.00228 0.00140 0.00000 ] ± [ 0.04881
0.33390 0.00318 0.00321 0.00000 ]
Pixel error: err = [ 0.27289 0.28010 ]
Note: The numerical errors are approximately three times the standard deviations
(for reference).
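As an illustration (not part of the toolbox output), the estimated parameters above can be used to project a camera-frame 3D point to a pixel with the Bouguet toolbox distortion model (radial terms kc1, kc2, kc5 and tangential terms kc3, kc4); the test point is a made-up example.

```python
fc = (1051.16340, 1048.19912)               # focal length (pixels)
cc = (356.27700, 237.85069)                 # principal point
kc = (-0.30324, 0.31284, -0.00228, 0.00140, 0.00000)

def project(X, Y, Z):
    """Project a camera-frame point to pixel coordinates (zero skew)."""
    x, y = X / Z, Y / Z                     # normalised coordinates
    r2 = x * x + y * y
    radial = 1 + kc[0] * r2 + kc[1] * r2 ** 2 + kc[4] * r2 ** 3
    dx = 2 * kc[2] * x * y + kc[3] * (r2 + 2 * x * x)   # tangential
    dy = kc[2] * (r2 + 2 * y * y) + 2 * kc[3] * x * y
    xd, yd = x * radial + dx, y * radial + dy
    return fc[0] * xd + cc[0], fc[1] * yd + cc[1]

# A point on the optical axis projects to the principal point:
u, v = project(0.0, 0.0, 1000.0)
```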
Observe that only 11 gradient descent iterations are required to reach the
minimum, i.e. only 11 evaluations of the reprojection function plus Jacobian
computation and inversion. The reason for this fast convergence is the quality of the
initial guess for the parameters computed by the initialization procedure.

The 3D position of the calibration grid with respect to the camera (Figure 14) and
the position of the camera with respect to the grid reference frame (Figure 15) are
calculated from the obtained parameters.
Figure 14: Position of the grid with respect to the camera
Figure 15: Position of the camera with respect to the grid reference frame