Optimal Electric Field Estimation and Control for Coronagraphy
Tyler D. Groff
A Dissertation
Presented to the Faculty
of Princeton University
in Candidacy for the Degree
of Doctor of Philosophy
Recommended for Acceptance
by the Department of
Mechanical and Aerospace Engineering
Adviser: Dr. N. Jeremy Kasdin
September 2012
© Copyright by Tyler D. Groff, 2013.
All Rights Reserved
Abstract
Detecting and characterizing extrasolar planets has become a very relevant field in astrophysics.
There are several methods to achieve this, but by far the most difficult and potentially
most rewarding approach is direct imaging of the planets. Coronagraphs can be
used to image the area surrounding a star with sufficient contrast to detect orbiting planets.
However, coronagraphs exhibit an extreme sensitivity to optical aberrations, which cause
starlight to leak into the search area. To solve this problem we use deformable mirrors to
correct the field, recovering a small search area of high contrast (commonly referred to as a
“dark hole”) where we can once again search for planets.
These coronagraphs require focal plane wavefront control techniques to achieve the necessary
contrast levels. These correction algorithms are iterative and the control methods
require an estimate of the electric field at the science camera, and producing this estimate
consumes nearly all of the images taken for the correction. In order to maximize science time
the amount of time required for correction must be minimized, which means reducing the
number of exposures required for correction. Given the large number of images required for
estimation, the ideal choice is to use fewer exposures to estimate the electric field. With a
more efficient monochromatic estimation in hand, we also seek to apply this correction over
as broad a bandwidth as possible. This allows us to spectrally characterize a target without
having to repair the field for every wavelength.
This thesis derives and demonstrates an optimal estimator that uses prior knowledge to
create the estimate of the electric field. In this way we can optimally estimate the electric field
by minimizing the number of exposures required to estimate under an error constraint. With
an optimal estimator in place for monochromatic light, we also demonstrate a controller that
can suppress the field over a bandwidth when provided with this monochromatic estimate.
The challenges, current levels of performance, and future directions of this work are discussed
in detail.
Acknowledgements
Many people have contributed towards my successes in life over a very long period of time.
First, I thank my adviser, Dr. N. Jeremy Kasdin. You have been an excellent mentor over
the years and I consider you a great friend. Thank you for trusting me to use your laboratory;
with a resource like that it is hard not to be successful. Your enthusiasm for this work has
kept me engaged and made working with you very enjoyable. I greatly appreciate the time
I have spent debating and discussing all technical aspects of our work. I have learned a lot
from you and it has most certainly left a great impression.
I also thank my committee and readers, Drs. David Spergel, Robert Vanderbei, Mike
Littman, and Craig Arnold. Your continued advice over the past five years has been greatly
appreciated, as has the unique perspective each of you has taken on the work I present
here. I will always be grateful for your continuous presence and willingness to discuss any
challenge that I have faced. In addition to my adviser and committee, the faculty here at
Princeton have been incredibly supportive. Dr. Dick Miles’ continued support and interest
in my graduate career has been to my great benefit. Much of my understanding of optics
can be credited to him. Dr. Robert Stengel has taught me a great deal about estimation and
control, and I have very much appreciated his ongoing interest in my research. I also thank
Ed Turner for providing so many opportunities to work at the Subaru telescope, which has
played a substantial role in my career. I also thank the support staff in the department,
particularly Jessica O’Leary, Jill Ray, and Candy Reed, who always seem to be able to solve
any problem.
The post-docs in our group over my time have been limited to Mike McElwain and Alexis
Carlotti. I have enjoyed working with both of you, and look forward to more. In addition
to great friendship, I owe a debt of gratitude to the older students from my research group.
Drs. Amir Give’on, Laurent Pueyo, Jason Kay, Eric Cady, and Dmitry Savransky have all
contributed to my understanding and development in this field. Amir, we did not overlap at
Princeton but your continued presence at JPL has afforded us the opportunity to compare
notes and work together several times. Laurent, I love getting to banter about math and
wavefront control with you. I always walk away knowing more, and I look forward to seeing
you whenever I can. Jason Kay, it feels like an eternity since I walked in the door and you
taught me everything about the lab, and I miss the loud banter and hatred for equipment
crashes. I will never forget our train ride to Boston. Eric Cady, I value all of our work
and conversations together and I am happy to see you so often as a colleague and friend.
Best enabler ever. Dmitry Savransky, I also appreciate our continued friendship and I am
grateful to see you on a regular basis. Thanks for tolerating my general impatience with
computers. Our symbiotic approach to understanding optics and computers is sorely missed.
To the younger students in our research group, it’s been fun having you around. A J, you
are picking up all the little quirks in the lab quickly and you have made my job much easier.
It’s been nice working with someone in there again and I’m glad that isn’t over.
One nice aspect of this field is its closeness and friendliness. As a result I have spent
a great deal of time with many individuals outside of Princeton and I consider them my
extended research group. Drs. Olivier Guyon, Ruslan Belikov, Remi Soummer, Frantz
Martinache, and Stuart Shaklan in particular have all made significant contributions to the
quality of the work presented here, and have given me much to think about. They have
been very generous in lending advice, providing thoughtful conversation, and have quickly
become people I consider to be very good friends.
I have many friends from my time at Princeton that are not part of my group; Mac Haas,
Mike Burke, Andy Stewart, Josh Proctor, James Michael, Will Larrison, Katie Quaranta,
my fellow fifth years, the bonfire attendees, the softball team, and rock climbing crews have
all made my time at Princeton quite enjoyable.
From my college years at Tufts University I would like to thank the ME faculty there,
particularly my advisers Dr. Gary Leisk and Dr. Robert White. They have gone above and
beyond, and I am glad to stay in contact with them to this day. I also would like to thank the
folks at DFM Engineering, where I truly learned how to do mechanical design and learned
to love building telescopes. They are truly a family, and I have learned so much from them.
I especially thank Dr. Frank Melsheimer and Kate Melsheimer for opening their home to
me and taking me under their wing. I also thank the Astrogeeks of OELS. Steve Lee, Dave
Olson, Ben Reed, and Poti Doukas have been a constant point of support. Steve, Dave,
and Ben have been some of the most important and constant mentors in my entire life, and
Poti quickly joined their ranks. I also thank all Astrogeeks and the entirety of the Outdoor
Education Laboratory Schools. It was through this program that I discovered astronomy,
and my continued participation has helped keep my wonder of the universe alive. It is in
this spirit that I consider the work presented here to merely be part of a life series entitled
“I Wonder....”
I end by thanking my wonderful family. My Grandparents Roy, Sally, Shirley, Jim, Dean.
My Parents Dean and Lauri. Your unwavering support through my entire life has gotten me
to this point, and I could not have done it without you. My Brother Shawn. Thank you
for serving our country, particularly in a time of war. Keep throwing rocks down hills, just
make sure nobody is at the bottom... To the rest of my family, aunts and uncles and in-laws,
I thank you for your support as well, and for adding richness to my life.
My beautiful and intelligent wife, Kimberley. Your support has been unparalleled by
anyone. Your constancy, kindness, and intelligence have made my life (and this thesis) so
much better. I like hanging out with you, and I want you to know that you are the love of
my life.
None of the work would have been possible without substantial NASA funding. This
work was funded by NASA Grant #NNX09AB96G and the NASA Earth and Space Science
Graduate Fellowship.
This dissertation carries the number T-3243 in the records of the Department of Mechanical
and Aerospace Engineering.
To the love of my life, Kimberley.
This would all be meaningless without you.
Contents
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
List of Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii
1 Introduction 1
1.1 Science Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Coronagraphs and Wavefront Control . . . . . . . . . . . . . . . . . . . . . . 3
1.3 The High Contrast Imaging Laboratory . . . . . . . . . . . . . . . . . . . . . 6
1.4 Two Deformable Mirrors in Series . . . . . . . . . . . . . . . . . . . . . . . . 10
1.5 Fourier Optics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.5.1 Propagation: Fresnel Transform . . . . . . . . . . . . . . . . . . . . . 14
1.5.2 Imaging: Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . 16
1.6 Controllability of Amplitude and Phase . . . . . . . . . . . . . . . . . . . . . 20
1.6.1 Pupil Plane Controllability: Angular Spectrum . . . . . . . . . . . . 21
1.6.2 Image Plane Controllability: The Propagation Factor . . . . . . . . . 25
1.7 Numerical Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.8 Thesis Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
1.9 Chapter Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2 Focal Plane Wavefront Control 42
2.1 Monochromatic Wavefront Control . . . . . . . . . . . . . . . . . . . . . . . 43
2.2 Wavelength Dependence of the Image Plane . . . . . . . . . . . . . . . . . . 53
2.3 Continuous Bandwidth Constraint . . . . . . . . . . . . . . . . . . . . . . . . 55
2.4 Windowed Stroke Minimization . . . . . . . . . . . . . . . . . . . . . . . . . 57
2.5 Extrapolating Estimates in Wavelength . . . . . . . . . . . . . . . . . . . . . 61
2.6 Chapter Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3 Batch Process Electric Field Estimation 68
3.1 Linearity of the Electric Field . . . . . . . . . . . . . . . . . . . . . . . . . . 69
3.2 Pairwise Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.3 DM Diversity: Batch Process Estimation . . . . . . . . . . . . . . . . . . . . 71
3.4 Probe Shapes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4 Kalman Filter Estimation 77
4.1 Constructing the Optimal Filter . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.2 Sensor and Process Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.3 Iterative Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.4 Optimal Probes: Using the Control Signal . . . . . . . . . . . . . . . . . . . 93
4.5 Chapter Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5 Laboratory Results 96
5.1 Monochromatic Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.1.1 DM Diversity Performance . . . . . . . . . . . . . . . . . . . . . . . . 97
5.1.2 Kalman Filter Performance . . . . . . . . . . . . . . . . . . . . . . . 98
5.2 Broadband Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.2.1 Prior to Single Mode Photonic Crystal Fiber . . . . . . . . . . . . . . 102
5.2.2 Photonic Crystal Single Mode Fiber Upgrade . . . . . . . . . . . . . 104
5.3 Final Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6 Sources and Propagation of Error 115
6.1 Precision of a Contrast Measurement . . . . . . . . . . . . . . . . . . . . . . 116
6.2 Estimation Algorithms and Propagating Error . . . . . . . . . . . . . . . . . 120
6.3 Accuracy of Wavelength Extrapolation . . . . . . . . . . . . . . . . . . . . . 126
6.4 DM Controllable Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.5 Experiment Stability and Laser Power . . . . . . . . . . . . . . . . . . . . . 127
6.6 Stability of Laser Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
6.7 Final Remarks on Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
7 Conclusions and Future Directions 132
7.1 Parameter Adaptive Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . 133
7.2 Dual Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
7.3 Including Alternate Sensors . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
7.3.1 Establishing a Reference . . . . . . . . . . . . . . . . . . . . . . . . . 136
7.3.2 Applying Reference to the Time Update . . . . . . . . . . . . . . . . 136
7.4 Bias Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
7.5 Final Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Bibliography 139
List of Tables
1.1 Coordinates in each plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.1 Definition of all Kalman Filter Matrices . . . . . . . . . . . . . . . . . . . . . 86
4.2 Definition of Kalman Filter Vectors . . . . . . . . . . . . . . . . . . . . . . . 88
List of Figures
1.1 Telescope Diffraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Atmospheric Adaptive Optics . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 HCIL Optical Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4 Ideal vs. Aberrated PSF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 HCIL Filter Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.6 Single DM FPWC Experiments . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.7 Light Propagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.8 Fourier Imaging Schematic . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.9 Angular Spectrum Schematic . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.10 DM Nominal Shapes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.11 Controllability of Amplitude Aberrations . . . . . . . . . . . . . . . . . . . . 30
1.12 Numerical Dimension of Planes . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.1 DM Actuation Over Control History . . . . . . . . . . . . . . . . . . . . . . 51
2.2 Extrapolating Estimates in Wavelength . . . . . . . . . . . . . . . . . . . . . 65
4.1 Feedback Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.2 Detector Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
5.1 Monochromatic Correction with DM-Diversity . . . . . . . . . . . . . . . . . 97
5.2 Monochromatic Correction with Kalman Filter: 4 Image Pairs . . . . . . . . 98
5.3 Monochromatic Correction with Kalman Filter: 3 Image Pairs . . . . . . . . 99
5.4 Monochromatic Correction with Kalman Filter: 2 Image Pairs . . . . . . . . 100
5.5 Monochromatic Correction with Kalman Filter: 1 Image Pair . . . . . . . . . 101
5.6 Broadband Correction: Pre-PCSM Extrapolated Results . . . . . . . . . . . 103
5.7 Broadband Correction: Pre-PCSM Extrapolate Individual Filters . . . . . . 110
5.8 Broadband Correction: Extrapolated Results . . . . . . . . . . . . . . . . . . 111
5.9 Broadband Correction: Extrapolated Estimate Individual Filters . . . . . . . 112
5.10 Broadband Correction: Direct Estimate Results . . . . . . . . . . . . . . . . 113
5.11 Broadband Correction: Direct Estimate Individual Filters . . . . . . . . . . . 114
6.1 Phase of Propagation Uncertainty . . . . . . . . . . . . . . . . . . . . . . . . 122
6.2 Interferometric Measurement of Superposition . . . . . . . . . . . . . . . . . 125
6.3 Interferometric Measurement of the Influence Function . . . . . . . . . . . . 125
6.4 Contrast Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
6.5 Performance Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
7.1 Sensor Schematic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
List of Symbols
i Imaginary Number . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
λ Wavelength . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
E The complex valued electric field . . . . . . . . . . . . . . . . . . . . . . . 13
n Normal Vector, any vector normal to the Optical Axis . . . . . . . . . . . 13
q Point from which the field is known . . . . . . . . . . . . . . . . . . . . . 13
p Point of the unknown field . . . . . . . . . . . . . . . . . . . . . . . . . . 13
rp/q Vector from the unknown to the known field . . . . . . . . . . . . . . . . 13
Σ The Surface of Integration in the Rayleigh-Sommerfeld Integral . . . . . . 13
Σ′ The Surface of Integration in the Rayleigh-Sommerfeld Integral . . . . . . 13
S Differential Surface Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
z Free space propagation distance . . . . . . . . . . . . . . . . . . . . . . . 13
L{·} The lens operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
f Focal length of an imaging optic . . . . . . . . . . . . . . . . . . . . . . . 17
ξ First coordinate in an Intermediate Plane . . . . . . . . . . . . . . . . . . 17
η Second coordinate in an Intermediate Plane . . . . . . . . . . . . . . . . . 17
u First coordinate in the Pupil Plane . . . . . . . . . . . . . . . . . . . . . . 18
v Second coordinate in the Pupil Plane . . . . . . . . . . . . . . . . . . . . 18
x First coordinate in the Image Plane . . . . . . . . . . . . . . . . . . . . . 18
y Second coordinate in the Image Plane . . . . . . . . . . . . . . . . . . . . 18
F{·} The Fourier Transform Operator . . . . . . . . . . . . . . . . . . . . . . . 18
A Amplitude Distribution, Typically at a Pupil Plane . . . . . . . . . . . . . 21
φ Phase Difference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
D The diameter of the pupil . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
· Shorthand for the Fourier Transform . . . . . . . . . . . . . . . . . . . . . 34
pup Subscript Indicating the Pupil Plane . . . . . . . . . . . . . . . . . . . . . 44
g(u, v) Arbitrary Aberrated Field (Complex) . . . . . . . . . . . . . . . . . . . . 44
im Subscript Indicating the Image Plane . . . . . . . . . . . . . . . . . . . . 45
C{·} Arbitrary Linear Operator . . . . . . . . . . . . . . . . . . . . . . . . . . 45
I Intensity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
DH Subscript Indicating the Dark Hole . . . . . . . . . . . . . . . . . . . . . . 45
IDH A Scalar, Average Dark Hole Intensity . . . . . . . . . . . . . . . . . . . . 45
ℜ Real Part of a Complex Variable . . . . . . . . . . . . . . . . . . . . . . . 46
λ0 The Central Wavelength Being Estimated . . . . . . . . . . . . . . . . . . 47
aq Amplitude for a Single DM Actuator . . . . . . . . . . . . . . . . . . . . . 48
h Deformable Mirror Physical Height Change . . . . . . . . . . . . . . . . . 48
ℑ Imaginary Part of a Complex Variable . . . . . . . . . . . . . . . . . . . . 48
u Vector of DM Actuation Signals . . . . . . . . . . . . . . . . . . . . . . . 48
M Matrix Mapping DM Actuation to Image Plane Intensity . . . . . . . . . 48
b Matrix Operator Mapping Deformable Mirror-Aberrated Field to Image
Plane Intensity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
d Inner Product of the Aberrated Field . . . . . . . . . . . . . . . . . . . . 48
J Cost Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
µ Lagrange Multiplier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
uopt The Optimal DM Command . . . . . . . . . . . . . . . . . . . . . . . . . 49
w(λ) Intensity Weight as a Function of Wavelength . . . . . . . . . . . . . . . . 54
∆λ Bandwidth [Meters] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
λ1 The Lower Bounding Wavelength . . . . . . . . . . . . . . . . . . . . . . . 57
λ2 The Upper Bounding Wavelength . . . . . . . . . . . . . . . . . . . . . . 57
δ Scalar Weight on the Lagrange Multiplier . . . . . . . . . . . . . . . . . . 58
α Amplitude of the Aberrated Field at the Pupil . . . . . . . . . . . . . . . 62
β Phase of the Aberrated Field at the Pupil . . . . . . . . . . . . . . . . . . 62
z Noisy Sensor Measurement . . . . . . . . . . . . . . . . . . . . . . . . . . 72
x The Current State, or Electric Field . . . . . . . . . . . . . . . . . . . . . 72
n Sensor Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
H Observation Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
k Current Discrete Point in Time in Estimation . . . . . . . . . . . . . . . . 78
P Covariance of the Electric Field . . . . . . . . . . . . . . . . . . . . . . . . 78
R Sensor Noise Covariance Matrix . . . . . . . . . . . . . . . . . . . . . . . 79
K Gain Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Φ Time Update in a Discrete Time Filter . . . . . . . . . . . . . . . . . . . 83
Γ Linear Propagation of Control to Image Plane Electric Field . . . . . . . . 83
w Process Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Λ Linear Propagation of Process Noise to Image Plane Electric Field . . . . 83
Q Process Noise Covariance Matrix . . . . . . . . . . . . . . . . . . . . . . . 85
Notation
• Search Area: The area in the image plane where the coronagraph has been designed
to produce high contrast for the detection of dim companions.
• Dark Hole: A region in the aberrated image where wavefront control has been used to
recover high contrast.
• Focal Plane Wavefront Control (Controller): This terminology refers to the control law
being used in the correction algorithm.
• Focal Plane Wavefront Correction: This encompasses the entire algorithm used to
correct the wavefront, including the state estimator and control law.
• ⟨·, ·⟩ is the inner product of any matrix and is used to evaluate the intensity, I_Σ,
of the electric field in a given area of the image plane.
• Matrix Inner Product: an inner product that produces a scalar value, intended to
describe a scalar value of the intensity in the dark hole, I_DH.
• Scalar Inner Product: an element by element inner product of each value in a vector,
intended to find the intensity distribution of the electric field in the dark hole, I_im(x, y).
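The distinction between the two inner products above can be illustrated with a minimal sketch. It assumes the dark-hole field is stored as a flat list of complex samples, and it assumes that the scalar I_DH is normalized by the number of pixels (the List of Symbols calls I_DH an average, but the exact normalization is not stated in this section). The function names are hypothetical.

```python
def intensity_distribution(field):
    """Element-by-element product conj(E) * E: one intensity value per pixel.

    Corresponds to the "Scalar Inner Product" entry, which yields I_im(x, y).
    """
    return [(e.conjugate() * e).real for e in field]


def average_dark_hole_intensity(field):
    """Inner product <E, E> collapsed to a single scalar.

    Corresponds to the "Matrix Inner Product" entry, which yields I_DH.
    Dividing by the pixel count is an assumption made for this sketch.
    """
    return sum(intensity_distribution(field)) / len(field)


if __name__ == "__main__":
    # Three hypothetical complex field samples inside the dark hole.
    dark_hole_field = [1 + 1j, 0.5j, -2 + 0j]
    print(intensity_distribution(dark_hole_field))
    print(average_dark_hole_intensity(dark_hole_field))
```

The first function keeps the spatial structure of the intensity, while the second reduces the same field to the single number that a controller would drive down.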
Chapter 1
Introduction
1.1 Science Motivation
Detecting and characterizing extrasolar planets has become a major point of focus in the
astrophysical community. Directly imaging solar systems opens up a parameter space
unavailable to current indirect detection methods such as radial velocity and transit photometry.
These methods are highly successful, but biased towards large planets very close to their
parent star. Even with impressive advances in these detection limits, they are only sensitive
to orbits that intersect our line of sight with the parent star. Apart from astrometry, direct
detection is the only method sensitive to face-on orbits that do not cross our line of sight. It
is also capable of taking reliable spectral measurements of the planet and does not require
an orbital fit to certify the planet’s existence. So long as we can efficiently detect planets
through direct imaging, we can dramatically increase the number of detectable systems.
Much like galactic astronomy, planetary science relies on a large number of observations of
many systems to build up our understanding of the time evolution in solar systems. Models
that describe the formation of solar systems and their major orbiting bodies need
as large and as detailed a data set as possible. The increased parameter space and the
spectral information that direct imaging has to offer make it a very compelling method for
this purpose.
Planets are generally classified as jovian or terrestrial bodies, a classification that in our solar
system also correlates with their mass and proximity to the Sun. It is expected that the same mass
correlation will hold, making current detection methods more sensitive to jovian planets. In
fact, over 700 planets have been detected, but the vast majority are between 100 and 10,000
times the mass of Earth [1]. This is largely due to the fact that most detection methods are
more sensitive to higher mass planets with a small orbital period. These detection schemes
do not directly image the orbiting body, but measure the effect of the planet on the stellar
signal in the form of a periodic Doppler shift or drop in intensity. The Kepler mission uses
transit photometry to detect orbiting bodies, and has been highly successful [13, 12]. It is
even capable of detecting Earth analogs (and has already come close) [4], but is incapable
of spectrally characterizing the planet in any way. A major disadvantage of the indirect
detection techniques is that they are fundamentally finding a best fit solution of periodic
data to a Keplerian orbit. This requires observations over at least one orbit to make a
detection, which means these observations are biased to high mass planets very close to
their parent star. Directly imaging a planet does not exhibit the same sensitivity to mass
(though reflected area does play a role in the intensity of the planet) and only requires
enough observations to rule out the possibility of the body being a background star. Direct
imaging also opens a new parameter space of orbits that are observed to be face-on, which are
undetectable by the indirect techniques. Since we are gathering light directly from the planet
we can also directly measure the spectrum of a planet, and directly observe the projection of
its separation from the parent star.
To spectrally characterize any detectable planet, we seek imaging methods that are capable
of directly imaging Earth-analogs. The Terrestrial Planet Finder (TPF) telescopes of
the late 1990s to early 2000s were NASA’s original space-based concepts for such a mission.
One was an imaging interferometer with a satellite constellation to produce a long baseline,
commonly referred to as TPF-I [2]. The second was a coronagraphic imager based on a
4 × 8 m elliptical mirror, referred to as TPF-C. These did not become funded missions, but
the science requirements developed for them are still used as a baseline for today’s mission
concepts. More recently, another concept has been developed where the telescope is flown in
formation with a second satellite mounted to a starshade, or occulter, designed to create a
diffraction limited shadow from the star, allowing the off-axis planet to pass by unobstructed
into the telescope [16, 71]. Simultaneously, more advanced coronagraphic imaging concepts
have been developed, making these the two leading concepts for a direct imaging mission
[67, 42, 38, 73, 33, 37, 15]. Each mission has its own set of challenges, but the main objective
of each is to mitigate the diffractive effects of the telescope’s finite aperture. In all likelihood,
a combined mission concept will maximize performance with regard to the number of
targets that can be detected and characterized, and will mitigate the risk involved with the
mission. Of the two missions, the coronagraph applies most broadly to both ground and space
instrumentation. This thesis focuses on the technology development for the coronagraph
concept.
1.2 Coronagraphs and Wavefront Control
The two primary obstacles to imaging very dim objects orbiting extremely close to their
parent star are diffraction effects from the telescope and aberrations to the field from
imperfections in the optical system. The finite aperture of the telescope results in a diffraction
pattern that leaks starlight into the region where a dim companion would otherwise be seen.
As shown in Fig. 1.1, this is not an issue of angular resolution. Neglecting any errors, an
aperture larger than approximately 2 meters is capable of distinguishing two objects separated
by 1 astronomical unit (AU) within 10 parsecs of the Earth. Fig. 1.1 shows that it
is the relative intensity of the planet and the diffracted light that limits the detectability
of a planet. This typically limits the detectable intensity of a companion to roughly one or
two orders of magnitude dimmer than its parent star.
[Figure 1.1 plots, panels "2 meter Telescope" and "10 meter Telescope": normalized intensity (log scale, 10^-15 to 10^0) versus angular separation in λ/D.]
Figure 1.1: Intensity profile of an image from a circular aperture with diameter (a) 2 meters, and (b) 10 meters. The off-axis source with unitary amplitude (red-dashed line) indicates that the object is resolvable by either telescope. The off-axis source with a peak intensity 10^-10 lower than the star’s peak intensity (solid-green line) shows the relative power of an Earth like planet and the diffracted energy from the star. If we were to solve the diffraction problem by making the telescope larger the aperture would have to be greater than 1 km in diameter.
Most generally, a coronagraph can
be defined as an optical system contained within the telescope that modifies the diffraction
pattern imposed by the telescope’s finite aperture. By attenuating the diffracted light at
small angular separations, the coronagraph lowers the detection limit of a dim companion.
The degree of suppression is quantified as contrast, a dimensionless parameter in the image
plane. The contrast of any point in the image plane is defined as a fraction of the peak
power in the point spread function (PSF) of the unobstructed aperture. A region of high
contrast is commonly referred to as the search area and the closest point to the star’s centroid
that achieves the targeted contrast level is defined as the inner working angle (IWA) of the
coronagraph. As the designed contrast and IWA decrease, the coronagraph becomes more
difficult to manufacture and simultaneously becomes more sensitive to optical aberrations.
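The two claims in this section (that a 2 meter aperture resolves 1 AU at 10 parsecs, and that diffracted starlight rather than resolution is the limit) can be checked with a short calculation. The sketch below is illustrative only: it assumes an ideal, unaberrated circular aperture (an Airy pattern), the Rayleigh criterion of 1.22 λ/D, and a wavelength of 633 nm chosen purely as an example. All function names are hypothetical.

```python
import math

AU_M = 1.495978707e11            # astronomical unit [m]
PC_M = 3.0856775814913673e16     # parsec [m]
ARCSEC_PER_RAD = 180.0 / math.pi * 3600.0


def separation_rad(sep_au, dist_pc):
    """Small-angle star-planet separation in radians."""
    return (sep_au * AU_M) / (dist_pc * PC_M)


def rayleigh_limit_rad(wavelength_m, diameter_m):
    """Rayleigh resolution criterion for a circular aperture."""
    return 1.22 * wavelength_m / diameter_m


def bessel_j1(x, n=400):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt,
    evaluated with the midpoint rule."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi


def airy_intensity(r):
    """Airy pattern normalized to its peak; r is the separation in lambda/D."""
    if r == 0.0:
        return 1.0
    x = math.pi * r
    return (2.0 * bessel_j1(x) / x) ** 2


if __name__ == "__main__":
    wavelength = 633e-9   # assumed example wavelength [m]
    diameter = 2.0        # aperture diameter [m]
    theta = separation_rad(1.0, 10.0)
    print("separation [arcsec]:", theta * ARCSEC_PER_RAD)
    print("Rayleigh limit [arcsec]:",
          rayleigh_limit_rad(wavelength, diameter) * ARCSEC_PER_RAD)
    r = theta / (wavelength / diameter)
    print("separation [lambda/D]:", r)
    print("diffracted starlight relative to peak:", airy_intensity(r))
```

Under these assumptions the planet sits at 0.1 arcsec, outside the Rayleigh limit of the 2 meter aperture, yet the diffracted starlight at that location is on the order of 10^-2 of the stellar peak. That is many orders of magnitude above a companion at 10^-10, consistent with the point made in Fig. 1.1.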
Since a coronagraph is fundamentally sensitive to perturbations in the incident field (e.g.
a second slightly off-axis source) it exhibits an extreme sensitivity to optical aberrations that
4
distort the field, as demonstrated in Fig.1.4. These can be aberrations that are incident on
the telescope (as is the case when considering atmospheric turbulence) or they can be due to
imperfections in the optical system itself. In either case we seek to correct these wavefront
errors using deformable mirrors (DMs), computer controlled mirrors with high precision
actuators bonded to the back surface. There are many DM technologies, with varying levels
of actuator stroke, density, and precision. In all cases the purpose is to produce achromatic
phase shifts that vary arbitrarily across the plane (limited by the number and spacing of the
actuators).
Adaptive optics (AO) attempts to correct the atmospheric distortion incident on the
telescope. The challenges of atmospheric AO are in the speed of correction (≈ 100s to
1000s of Hz), correcting over a large field of view, and the potential for a high degree of
nonlinearity [53, 51]. As shown in Fig. 1.2, a typical AO system samples the beam going
Figure 1.2: Diagram of a generic ground-based atmospheric adaptive optics system. The wavefront sensor and DM are capable of correcting high speed aberrations that appear as phase at the DM (commonly a pupil) plane. All static and quasi-static speckle uncontrollable by the AO system leaves a residual that must be corrected with focal plane wavefront control techniques.
to the science instrument using a dichroic mirror to measure the field at the pupil plane
with some form of wavefront sensing device. For the purpose of this thesis we will consider
the atmospheric AO problem to be solved by an upstream AO system, or by considering a
space-based observatory. To correct imperfections in the optical system, we are trying to
repair the wavefront to a significantly higher degree of precision (since we are conceptually
trying to repair residual static errors). This generally allows us to make approximations that
make the problem linear, but the correction time is much slower. As a result, we require
common-path techniques that account for the aberrations that reach the image plane. This
thesis focuses on model based methods to estimate the electric field at the image plane
using only the science camera, and control laws based on the estimated image plane
electric field rather than on pupil plane measurements. This estimation and control problem
is commonly referred to as focal plane wavefront correction (FPWC).
1.3 The High Contrast Imaging Laboratory
The High Contrast Imaging Laboratory (HCIL) at Princeton tests coronagraphs and wave-
front control algorithms for quasi-static speckle suppression. The collimating optic is a six
inch off-axis parabola (OAP) followed by two first generation Boston Micromachines kilo-
DMs in series and a shaped pupil coronagraph, which is imaged with a second six inch OAP
(Figure 1.3). We use a shaped pupil coronagraph, shown in Figure 1.4(a), and described
in detail in Belikov et al. [5]. This coronagraph produces a discovery space with a theoretical
contrast of $3.3 \times 10^{-10}$ in two 90° regions as shown in Figure 1.4(b). At the Princeton
HCIL, the aberrations in the system result in an uncorrected average contrast of approximately
$1 \times 10^{-4}$ in the area immediately surrounding the core of the point spread function
(PSF), which agrees with the simulations shown in Figure 1.4(d). Since the coronagraph is
a binary mask, its contrast performance is fundamentally achromatic, subject only to the
physical scaling of the PSF with wavelength. The lab can be configured with either a 635 nm
monochromatic laser diode input, or a Koheras supercontinuum source. As shown in Fig. 1.5,
Figure 1.3: Optical layout of the Princeton HCIL. Collimated light is incident on two DMs in series and then propagates through a shaped pupil; the core of the PSF is removed with an image plane mask, and the 90° search areas are reimaged on the final camera.
before the supercontinuum source is injected into the laboratory experiment, it is first collimated
by a 90° off-axis parabolic element designed specifically for collimating/coupling of
polychromatic light from a fiber. After the light is collimated it passes through a filter wheel
where a set of interference filters allows us to sample narrow bandwidths in a $\Delta\lambda/\lambda_0 = 20\%$
range around $\lambda_0 = 635$ nm. After the light passes through the filter wheel it is recoupled
with a second off-axis element into a second fiber made by Koheras, which is designed to be
continuously single mode over the entire visible and near-infrared spectrum. This allows us
to reproduce the broadband nature of light coming from a star, the importance of which
is discussed in Ch. 5. Since the collimating/coupling elements rigidly attach the fiber tips
to the 90° OAPs, alignment of the beam is determined entirely by tip-tilt variation of the
collimated beam. To precisely recouple the light back into the delivery fiber, the collimating
element is rigidly mounted to the filter wheel and the coupling element is mounted to a
[Figure 1.4 panels: (a) Shaped Pupil, titled “Shaped Pupil”, axes in mm; (b) Ideal PSF, titled “Normalized PSF from Ripple3”, axes in λ0/D; (c) Aberrated Pupil, titled “Shaped Pupil After DMs”, axes in mm; (d) Aberrated PSF, titled “Aberrated PSF from Ripple3”, axes in λ0/D. Color bars span −10 to 0.]
Figure 1.4: Example of the effect of an aberrated field incident on a Shaped Pupil coronagraph. The aberrations are simulated by Fresnel propagating the measured nominal shapes of the HCIL DMs to the pupil plane. Other sources of aberrations are not included because they have not been measured. (d) The PSF of the shaped pupil with the simulated aberrations. The figures are in a log scale, and the log of contrast is shown in the colorbars.
tip-tilt stage. To eliminate ghosts, all interference filters have a small wedge between their
exterior surfaces. To guarantee a quality alignment for all of the filters for a fixed tip-tilt,
they must all be clocked inside the filter wheel so that when they are positioned within the
beam, the wedge is aligned in the same direction. To guarantee stability of the coupling
(which is sensitive on the sub-micron level) the entire optical train is sealed from the outside
environment, eliminating any air flow through the system. With the system very compact
and light, sealed, and highly rigid (since the tip-tilt mechanism is very stiff) we observe that
the coupling is reliable over a period of weeks to months once it has been aligned. Since
Figure 1.5: Optical Layout of the Princeton HCIL’s Filter Wheel. The light from the SuperK supercontinuum fiber is collimated by a Thorlabs reflective collimator (c, passes through a filter wheel which contains narrow band interference filters, and is recoupled into a Koheras Photonic Crystal continuously Single Mode (PCSM) fiber with another reflective coupler (RC08FC). The system is rigid with the exception of the tip-tilt mechanism, which is used to align the beam for coupling back into the PCSM fiber that delivers light onto the bench.
the original HCIL experiment had proved to be limited by the stability of its old HeNe laser
and its free space coupling into a fiber, this stability was a critical design parameter for the
filtering scheme.
The two source configurations allow for testing of control algorithms in both monochromatic
and broadband light (typically ∼ 10-20% of the central wavelength). The monochromatic
experiments allow us to test controller performance very quickly, while leaving the
results independent of any chromatic effects. Once a particular algorithm has been proven
in monochromatic light, we can use the polychromatic configuration to test its performance
over a larger bandwidth.
1.4 Two Deformable Mirrors in Series
Focal plane wavefront control techniques have primarily been developed and tested at the
Jet Propulsion Laboratory (JPL) High Contrast Imaging Testbed (HCIT) [24], the Subaru
telescope’s Phase Induced Amplitude Apodization (PIAA) testbed (which has been decon-
structed) [33], more recently at the NASA Ames Coronagraph Experiment (ACE) [6], and
Princeton’s HCIL described in §1.3. JPL’s HCIT is the only experiment in vacuum, and
has tested several coronagraphs using the Electric Field Conjugation (EFC) algorithm. The
primary goal at the HCIT is to test the limit of ultimate achievable contrast and IWA of
each coronagraph and estimation scheme. The Subaru telescope’s PIAA testbed was used
for the initial verification of the PIAA coronagraph [33], which uses a pair of highly aspheric
mirrors as a pupil remapping system to achieve nearly lossless apodization of the pupil for
high contrast imaging at low inner working angle. This experiment has moved to JPL’s
HCIT [34] and progress with the PIAA coronagraph at the Subaru telescope has shifted
to the Subaru Coronagraphic Extreme Adaptive Optics (SCExAO) system [50]. The ACE
experiment focuses on low inner working angle coronagraphy using the PIAA coronagraph,
primarily as a technology demonstrator for critical hardware required in an Explorer class
mission. Princeton’s HCIL is unique compared to these in that we focus on the development
of estimation and control schemes, their efficacy for a true observatory environment, and
their ability to relax coronagraphic tolerance. One of the most unique components of the
HCIL, the two DMs in series, means that both amplitude and phase aberrations are fully
controllable over the entire image plane [55]. The experiments at JPL, Ames, and Subaru
only use one DM. This means they are only capable of correcting phase perturbations on
both sides of the image plane, and energy from amplitude aberrations can only be shifted
from one side of the PSF to the other. As a result they are only capable of reaching high
levels of contrast on a single side of the image plane, as shown in Fig. 1.6. The ability to
correct symmetrically in the image plane allows us to double the discovery space for planets,
but makes the control problem (particularly in broadband light) much more challenging. The
Figure 1.6: Single DM FPWC results from (a) JPL’s HCIT [23], (b) Subaru PIAA testbed [33], and (c) NASA’s ACE [7]. All three facilities use a single DM, which is why all the results only exhibit a dark hole on a single side of the image plane. As will be shown in Ch. 5, at least two DMs are needed to achieve any amount of amplitude controllability for symmetric dark hole correction.
presence of two DMs at planes that are not conjugate to the pupil plane will be an under-
lying theme to the mathematical development for wavefront estimation and control, as well
as many of the experimental challenges addressed in this thesis. By doubling the discovery
space, two DMs in series increase the likelihood of detecting an exoplanet in any mission
and add redundancy in the wavefront control system. As a result, many wavefront control
architectures for planet finding missions assume a 2-DM system [43, 69, 31, 45, 44, 70, 14].
With Princeton’s HCIL being the only laboratory with this capability, the limitations and
results of our work provide a unique and relevant body of information for future coronagraphic
instrumentation. Pueyo [59] proved that, to first order, two DMs in series could
correct both amplitude and phase, showed it was achievable over a bandwidth [54], and has
indicated its necessity for coronagraphy on segmented apertures [56]. Kay [40] developed a
DM independent estimation scheme to avoid the compounding effect of optical model error
in a two-DM system and used this to generate symmetric dark holes over the largest pub-
lished areas of the image plane, albeit at more modest contrast levels. We have also begun to
develop algorithms that are capable of creating symmetric dark holes over finite bandwidths
[29]. Overall, these experiments represent the only body of work dedicated to symmetric
dark hole generation and this thesis is a continuation of that effort.
1.5 Fourier Optics
All of the control software and coronagraph designs that appear in this thesis were produced
using Fourier optics. To more fully appreciate their validity and limitations, the relevant inte-
grals are derived here beginning with the Rayleigh-Sommerfeld diffraction integral. Looking
Figure 1.7: Relevant coordinates, vectors, and frames to propagate light from Σ to Σ′.
at Fig. 1.7, we begin by assuming a field on an arbitrary surface, Σ, originating from the
point O. We wish to propagate this field to a plane centered about O′, a distance z away.
Using a vector notation from Kasdin and Paley [36], we define the chief ray from O to O′ as
\[
\hat{\mathbf{n}} = \frac{\mathbf{r}_{O'/O}}{\|\mathbf{r}_{O'/O}\|}. \tag{1.5.1}
\]
The Rayleigh-Sommerfeld integral evaluates the incident electric field on an arbitrary point,
p, in the second plane, Σ′, from every point, q, in the first plane, Σ. We define the vector
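from O to q and from O to p as
\begin{align}
\mathbf{r}_{q/O} &= u_0 \hat{\mathbf{e}}_u + v_0 \hat{\mathbf{e}}_v, \quad \text{and} \tag{1.5.2} \\
\mathbf{r}_{p/O} &= u \hat{\mathbf{e}}_u + v \hat{\mathbf{e}}_v + z \hat{\mathbf{e}}_z. \tag{1.5.3}
\end{align}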
Thus, we evaluate the vector from q to p as
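\begin{align}
\mathbf{r}_{p/q} &= \mathbf{r}_{p/O} - \mathbf{r}_{q/O} \tag{1.5.4} \\
&= (u - u_0)\hat{\mathbf{e}}_u + (v - v_0)\hat{\mathbf{e}}_v + z\hat{\mathbf{e}}_z. \tag{1.5.5}
\end{align}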
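With all propagation vectors in place, the Rayleigh-Sommerfeld diffraction integral [25] describes
the field at point p as
\begin{align}
E(p) &= \frac{1}{i\lambda} \int_{\Sigma} E(q) \frac{\cos(\hat{\mathbf{n}}, \mathbf{r}_{p/q})}{\|\mathbf{r}_{p/q}\|} e^{i\frac{2\pi}{\lambda}\|\mathbf{r}_{p/q}\|} \, dS \tag{1.5.6} \\
&= \frac{1}{i\lambda} \int_{\Sigma} E(q) \frac{\hat{\mathbf{n}} \cdot \hat{\mathbf{r}}_{p/q}}{\|\mathbf{r}_{p/q}\|} e^{i\frac{2\pi}{\lambda}\|\mathbf{r}_{p/q}\|} \, dS, \tag{1.5.7}
\end{align}
where
\[
\hat{\mathbf{r}}_{p/q} = \frac{\mathbf{r}_{p/q}}{\|\mathbf{r}_{p/q}\|} \tag{1.5.8}
\]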
is the unit vector pointing from q to p. Even though the result is scalar, the integral is
vector-based and quite complicated to solve. We simplify Eq. 1.5.7 by first applying the
paraxial approximation. This assumes that lateral deviation from the chief ray is so small
that every ray vector has propagated the same distance as the chief ray, making
\[
\|\mathbf{r}_{p/q}\| \approx z. \tag{1.5.9}
\]
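We also assume that for any combination of p and q, the unit vector $\hat{\mathbf{r}}_{p/q}$ is parallel to $\hat{\mathbf{n}}$, which means
\[
\hat{\mathbf{n}} \cdot \hat{\mathbf{r}}_{p/q} \approx 1. \tag{1.5.10}
\]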
Applying these simplifications to the terms preceding the exponential in Eq. 1.5.7, the
Rayleigh-Sommerfeld diffraction integral simplifies to
\[
E(p) = \frac{1}{i\lambda z} \int_{\Sigma} E(q) e^{i\frac{2\pi}{\lambda}\|\mathbf{r}_{p/q}\|} \, dS, \tag{1.5.11}
\]
which is the vector form of the Huygens-Fresnel integral. The application of the paraxial
approximation means that Eq. 1.5.11 is limited to narrow-field imaging. Fortunately in
coronagraphy, we are concerned with very small angular separations.
1.5.1 Propagation: Fresnel Transform
It may at first appear strange that we do not apply the approximation of Eq. 1.5.9 to the
exponential and further simplify Eq. 1.5.11. This is because the 1/λ term is of order $10^6$ to $10^7$ m$^{-1}$,
which amplifies small errors due to the approximation and causes rapid 2π periodic errors
in the phase of the integrand. We instead make the Fresnel approximation by taking a first
order expansion of the ||rp/q|| term in the exponential. In scalar coordinates, this becomes
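\begin{align}
\|\mathbf{r}_{p/q}\| &= z\sqrt{1 + \left(\frac{u-u_0}{z}\right)^2 + \left(\frac{v-v_0}{z}\right)^2} \tag{1.5.12} \\
&\approx z\left[1 + \frac{1}{2z^2}\left[(u-u_0)^2 + (v-v_0)^2\right] - \mathcal{O}(2) + \dots\right]. \tag{1.5.13}
\end{align}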
With this approximation we find that the Fresnel Integral is
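\begin{align}
E(u,v,z) &= \frac{e^{i\frac{2\pi}{\lambda}z}}{i\lambda z} \iint_{-\infty}^{+\infty} E(u_0,v_0) e^{i\frac{\pi}{\lambda z}\left[(u-u_0)^2+(v-v_0)^2\right]} \, du_0 \, dv_0 \tag{1.5.14} \\
&= \mathcal{F}_z\{E(u_0,v_0)\}. \tag{1.5.15}
\end{align}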
These approximations only hold for a narrow field, and the following equations will not
accurately reflect the image distortion at large angular separation. The validity of the Fresnel
approximation can be determined by evaluating the next highest term in Eq. 1.5.13. The
propagation distance required for this term to contribute << 1 radian in the exponential of
Eq. 1.5.15 is
\[
z_{\min}^3 \gg \frac{\pi}{4\lambda}\left[(u-u_0)^2 + (v-v_0)^2\right]^2. \tag{1.5.16}
\]
Goodman [25] points out that this is a conservative estimate and makes arguments for
a softer constraint if the aperture has fine structure and is illuminated by uniform plane
waves. Otherwise it is safer to apply Eq. 1.5.16 to determine the validity of the integral.
In the presence of deformable mirrors and shaped pupil coronagraphs the HCIL does have
sinusoidally varying transmission and phase across virtually any plane, so we
must in fact check that Eq. 1.5.16 is satisfied. Thus, for the HCIL’s standard 10 mm pupil
and 550 nm light (the shortest wavelength used in the laboratory), a Fresnel transform only
accurately reflects the diffraction pattern beyond a distance of zmin = 38.5 cm. The shortest
free space propagation in the laboratory is between the two DMs, a distance of ∼ 47 cm, which
exceeds zmin by a factor of 1.2. Knowing this is a conservative calculation, this is likely large
enough, particularly since the incident field on DM1 is a uniform plane wave and there are
no amplitude variations. The distance from DM2 to the pupil plane, which is the last plane
to which the Fresnel integral must be applied, is ∼ 1.13 m ≈ 3zmin. Since the amplitude
non-uniformity at progressive planes is a direct result of phase to amplitude mixing, larger
distances would actually make the non-uniformity in the field worse. Thus it is likely that
larger propagation distances for such a small aperture would not solve the problem, and the
only solution is to use a higher fidelity integral if more precision is required. To date, the
Fresnel integral has not been found to be a limiting factor, but this should be taken as a point
of caution since two DMs in series have never produced dark holes below the levels reported in
this thesis.
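The distances quoted above can be reproduced from Eq. 1.5.16. The sketch below assumes that the worst-case lateral offset is a full 10 mm in both u and v (the diagonal of a square bounding the pupil); this choice is an assumption made here because it reproduces the quoted 38.5 cm, and the function name `fresnel_zmin` is illustrative.

```python
import numpy as np

def fresnel_zmin(wavelength, max_du, max_dv):
    """Distance at which the first neglected term of Eq. 1.5.13 contributes 1 rad.

    Implements z_min^3 = pi / (4 lambda) * [(u-u0)^2 + (v-v0)^2]^2 (Eq. 1.5.16).
    """
    rho_sq = max_du**2 + max_dv**2
    return (np.pi / (4.0 * wavelength) * rho_sq**2) ** (1.0 / 3.0)

wavelength = 550e-9   # m, shortest wavelength used in the laboratory
pupil = 10e-3         # m, HCIL pupil diameter
z_min = fresnel_zmin(wavelength, pupil, pupil)   # assumed worst-case offsets

dm1_to_dm2 = 0.47     # m, approximate DM1 to DM2 separation
dm2_to_pupil = 1.13   # m, approximate DM2 to pupil separation

print(f"z_min             = {100 * z_min:.1f} cm")
print(f"DM1-DM2 / z_min   = {dm1_to_dm2 / z_min:.2f}")
print(f"DM2-pupil / z_min = {dm2_to_pupil / z_min:.2f}")
```

Because z_min scales as the 4/3 power of the lateral offset, a less conservative choice of maximum offset (for example the pupil diameter alone) would give a noticeably shorter distance.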
1.5.2 Imaging: Fourier Transform
The advantage of Fourier optics is the simplicity with which we can relate the pupil and
image planes. To show this, we will consider a field incident on a lens, Ein, and evaluate
the field one focal length downstream of the optic. We first define a lens operator, which
describes the field exiting the lens as
\[
E_{out} = \mathcal{L}\{E_{in}\}. \tag{1.5.17}
\]
In a true optic with finite thickness (even a mirror), the operator would be nonlinear and
require a continuous integral in z over the entire displacement of the lens surface, or sag.
This complicates computing the diffractive effect of a lens, so we seek an approximation to
simplify the computation. We will first assume that the lens has a parabolic profile. In
ray optics, a parabolic lens has the merit of maintaining an equal path length between the
focal point and collimated plane of the optic, regardless of the ray we choose. This implies
a constant phase change over the entire field at the output, but to see a similar advantage
in diffractive optics we must assume that the optic is infinitely thin. With its effect being
confined to a plane, its parabolic shape only contributes a phase to the incident field. Since
the primary imaging optic is an F/10 parabolic surface with a 6 inch diameter, and we underfill
the mirror by a factor of ≈ 15, the sag of the optic is less than a millimeter (< 10% of the
relevant aperture). Additionally, we are observing sufficiently on-axis that any effect on the
image from the mirror curvature will not be significant, making the infinitely thin, parabolic
phase assumption valid in our system. Defining this parabolic phase to be a function of the
focal length of the optic, f , the lens operator simplifies to
\[
\mathcal{L}\{E_{in}(\xi,\eta)\} = E_{in}(\xi,\eta) e^{-i\frac{\pi}{\lambda f}(\xi^2+\eta^2)}. \tag{1.5.18}
\]
The thin lens approximation is increasingly valid for larger F/# = f/D systems because the
sag becomes smaller. To propagate the output field from the lens to the image plane using Eq. 1.5.15,
we will also assume that the optic is of infinite extent. The most rigorous way to quantify
this is to show that nearly 100% of the energy is contained within the optic, which typically
requires that the imaging optic be oversized compared to the incident beam by at least a
factor of two. Since we overfill by a factor of 15, we are in fact containing nearly all of the
energy from the pupil. Applying the Fresnel integral (Eq. 1.5.15) to the field L{E1(ξ, η)},
we find the field one focal length downstream of the optic (Fig. 1.8) to be
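\begin{align}
E_{im}(x,y,f) &= \frac{e^{i\frac{2\pi}{\lambda}f}}{i\lambda f} \iint_{-\infty}^{+\infty} \mathcal{L}\{E_1(\xi,\eta)\} e^{i\frac{\pi}{\lambda f}\left[(x-\xi)^2+(y-\eta)^2\right]} \, d\xi \, d\eta \tag{1.5.19} \\
&= \frac{e^{i\frac{2\pi}{\lambda}f}}{i\lambda f} \iint_{-\infty}^{+\infty} E_1(\xi,\eta) e^{-i\frac{\pi}{\lambda f}(\xi^2+\eta^2)} e^{i\frac{\pi}{\lambda f}\left[(x-\xi)^2+(y-\eta)^2\right]} \, d\xi \, d\eta \notag \\
&= \frac{e^{i\frac{2\pi}{\lambda}f}}{i\lambda f} e^{i\frac{\pi}{\lambda f}(x^2+y^2)} \iint_{-\infty}^{+\infty} E_1(\xi,\eta) e^{-i\frac{2\pi}{\lambda f}[\xi x+\eta y]} \, d\xi \, d\eta \notag \\
&= \frac{e^{i\frac{2\pi}{\lambda}f}}{i\lambda f} e^{i\frac{\pi}{\lambda f}(x^2+y^2)} \mathcal{F}\{E_1(\xi,\eta)\}. \tag{1.5.20}
\end{align}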
Neglecting the piston in phase, this shows that the electric field at this final plane is a
quadratic phase factor multiplied by the Fourier transform, F{·}, of the incident field on
an infinitely thin parabolic lens of infinite spatial extent. We define this plane as the image
plane, which corresponds to the focal point of the optic in a ray trace.
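The cancellation of the quadratic phase in ξ and η between Eq. 1.5.19 and Eq. 1.5.20 can be checked numerically. The sketch below is a one-dimensional, discretized version of that step with an arbitrary test field and arbitrary grids (both are assumptions made for illustration). It compares the lens-plus-Fresnel-kernel form against the quadratic-phase-times-Fourier-kernel form and omits the common prefactor.

```python
import numpy as np

wavelength = 635e-9   # m
f = 1.524             # m, focal length of the imaging optic

xi = np.linspace(-5e-3, 5e-3, 401)   # coordinate at the optic (assumed grid), m
x = np.linspace(-1e-3, 1e-3, 101)    # image plane coordinate (assumed grid), m
d_xi = xi[1] - xi[0]

# Arbitrary test field incident on the lens (assumption: tilted Gaussian beam)
E1 = np.exp(-(xi / 2e-3) ** 2) * np.exp(1j * 2 * np.pi * 300.0 * xi)

k = np.pi / (wavelength * f)
X, XI = np.meshgrid(x, xi, indexing="ij")   # rows index x, columns index xi

# Integrand of Eq. 1.5.19: lens phase multiplied by the Fresnel kernel
lens_then_fresnel = (
    E1 * np.exp(-1j * k * XI**2) * np.exp(1j * k * (X - XI) ** 2)
).sum(axis=1) * d_xi

# Form of Eq. 1.5.20: quadratic phase in x multiplied by a Fourier kernel
fourier_form = np.exp(1j * k * x**2) * (
    E1 * np.exp(-2j * k * X * XI)
).sum(axis=1) * d_xi

max_difference = np.max(np.abs(lens_then_fresnel - fourier_form))
```

The two discretized sums agree to floating-point precision because the step is an algebraic identity in the exponent; the discretization itself says nothing about how well either sum approximates the continuous integral.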
Now we will consider the special case shown in Fig. 1.8, where we generate the incident
Figure 1.8: The relevant planes for Fourier imaging in astronomical optics are the field E0 at an arbitrary plane a distance z prior to the imaging optic, and the field immediately incident on the plane of the optic, E1. In both cases the image plane electric field, Eim, is located one focal length, f, downstream of the optic.
field on the optic, E1, by Fresnel propagating an arbitrary field, E0, from one focal length
upstream of the lens. In order to describe the image plane field in Eq. 1.5.20 as a function
of E0 we will relate it to the field E1 via the Fresnel integral (Eq. 1.5.15). It is pointed out
by Goodman [25] (and many optics textbooks) that this is simply the convolution
\[
E_1(\xi,\eta) = E_0 * h = \iint_{-\infty}^{+\infty} E_0(u,v) h(\xi-u, \eta-v) \, du \, dv, \tag{1.5.21}
\]
where the kernel of this integral is
\[
h(\xi,\eta) = \frac{e^{i\frac{2\pi}{\lambda}f}}{i\lambda f} e^{i\frac{\pi}{\lambda f}\left[\xi^2+\eta^2\right]}. \tag{1.5.22}
\]
Recognizing that we ultimately seek the Fourier transform of E1(ξ, η), not E1(ξ, η) itself,
we take the Fourier transform of Eq. 1.5.21 directly. By applying the Fourier convolution
theorem, we find
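\begin{align}
\mathcal{F}\{E_1(\xi,\eta)\} &= \mathcal{F}\{E_0\} \, \mathcal{F}\{h\} \tag{1.5.23} \\
&= \frac{e^{i\frac{2\pi}{\lambda}f}}{i\lambda f} \mathcal{F}\{E_0\} \iint_{-\infty}^{+\infty} e^{i\frac{\pi}{\lambda f}\left[\xi^2+\eta^2\right]} e^{-i\frac{2\pi}{\lambda f}[\xi x+\eta y]} \, d\xi \, d\eta \notag \\
&= \frac{e^{i\frac{2\pi}{\lambda}f}}{i\lambda f} \mathcal{F}\{E_0\} e^{-i\frac{\pi}{\lambda f}(x^2+y^2)} \iint_{-\infty}^{+\infty} e^{i\frac{\pi}{\lambda f}\left[(x-\xi)^2+(y-\eta)^2\right]} \, d\xi \, d\eta \notag \\
&= e^{i\frac{2\pi}{\lambda}f} e^{-i\frac{\pi}{\lambda f}(x^2+y^2)} \mathcal{F}\{E_0\}. \tag{1.5.24}
\end{align}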
Applying Eq. 1.5.24 to Eq. 1.5.20, the image plane field becomes
\[
E_{im}(x,y,f) = \frac{e^{i\frac{4\pi}{\lambda}f}}{i\lambda f} \mathcal{F}\{E_0(u,v)\}. \tag{1.5.25}
\]
Thus, apart from a piston phase term, Eq. 1.5.25 shows that the electric field one focal length
downstream of the optic, defined earlier as the image plane, is an exact Fourier transform of
the electric field incident one focal length upstream of the optic. We define this as the pupil
plane. Throughout this thesis the image and pupil are defined as exact Fourier conjugates
of one another. For the purposes of FPWC we will use the image plane as a reference,
requiring that a pupil be exactly one focal length in front of an imaging optic. The constant
phase term in Eq. 1.5.25 is of no consequence since the original field is based off an arbitrary
reference field with zero phase, and is often neglected in many texts. The pupil plane at
the HCIL is 10 mm in diameter, designed to inscribe the kilo-DM aperture. We use a 152.4
mm diameter OAP with a 1.524 meter focal length to form the primary image. The system
is F/152.4 at the primary image, and the overfill factor of the imaging optic is 15.24, more
than adequate to use the infinite optics approximation in the control algorithms.
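The system parameters quoted above follow from simple ratios, reproduced in the short sketch below. The physical size of one λ0/D at the image plane is an additional derived quantity computed here for the 635 nm source; it is not quoted in the text, and all variable names are illustrative.

```python
pupil_diameter = 10.0e-3   # m
oap_diameter = 152.4e-3    # m (six inch OAP)
focal_length = 1.524       # m
wavelength = 635e-9        # m, monochromatic laser diode

f_number = focal_length / pupil_diameter   # F/# at the primary image
overfill = oap_diameter / pupil_diameter   # optic diameter over beam diameter
lambda_over_d = wavelength * f_number      # size of one lambda/D at the image, m

print(f"F/# = {f_number:.1f}, overfill = {overfill:.2f}, "
      f"lambda/D = {1e6 * lambda_over_d:.1f} um")
```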
1.6 Controllability of Amplitude and Phase
Before we start producing control algorithms, we must prove that by placing two DMs in
non-conjugate planes we make both amplitude and phase aberrations controllable, allowing
us to create symmetric dark holes in the image plane. We will do so in two different ways.
First, we follow the work of Pueyo [59] and make an argument based in the pupil plane that
two DMs in series are capable of correcting both amplitude and phase. Since we ultimately
seek controllability at the image plane, we will also prove controllability by directly modeling
the effect on the image plane electric field from an arbitrary perturbation at a non-conjugate
plane. In addition to proving the controllability of both amplitude and phase at the image
plane, the result will also provide a model that we can use in the estimation and control
algorithms of Chs. 2, 3, and 4. In both cases we will treat the modulation of the DM shape
as a perturbation to the electric field at an arbitrary plane, p, upstream of the pupil, as
shown in Fig. 1.9. We account for the fact that the DM is of finite size by including the
aperture function, Ap, but the perturbation induced by the DM is exclusively in phase.
Figure 1.9: Phase perturbations at plane p are propagated to the pupil plane, defined as being one focal length away from the imaging optic. The lens acts as a Fourier transforming device, producing the image plane electric field one focal length after the optic.
1.6.1 Pupil Plane Controllability: Angular Spectrum
Beginning with the controllability proof by Pueyo [59], we decompose the intermediate plane,
p, shown in Fig. 1.9 into a Fourier series. We then Fresnel propagate this field to the pupil
plane and apply the pupil function, Apup, to find the pupil plane electric field, Epup. Since
we are ultimately decomposing the solution to the Fresnel integral into a Fourier series, this
is commonly referred to as the angular spectrum approximation. Given a unitary input field
incident on a DM at plane p that is inducing a phase perturbation, φ, over the aperture, Ap,
the electric field is given by
\[
E_p(\xi,\eta) = A_p(\xi,\eta) e^{i\phi(\xi,\eta)}. \tag{1.6.1}
\]
Next we expand the exponential as a Taylor series and take a first order approximation,
assuming that the phase perturbations are small enough that the second order term in the
expansion is negligible. The linearized field is
\[
E_p(\xi,\eta) \approx A_p(\xi,\eta)\left[1 + i\phi(\xi,\eta)\right]. \tag{1.6.2}
\]
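To get a feel for when the first-order expansion in Eq. 1.6.2 is adequate, the sketch below compares exp(iφ) with 1 + iφ for a few phase amplitudes. The sample values of φ are assumptions chosen for illustration, not laboratory measurements.

```python
import numpy as np

phi = np.array([0.01, 0.05, 0.1, 0.3, 1.0])   # assumed phase amplitudes, rad

exact = np.exp(1j * phi)
linear = 1.0 + 1j * phi
error = np.abs(exact - linear)   # magnitude of everything the linearization drops
second_order = phi**2 / 2.0      # leading neglected term of the Taylor series

for p, e, s in zip(phi, error, second_order):
    print(f"phi = {p:4.2f} rad   |error| = {e:.2e}   phi^2/2 = {s:.2e}")
```

The error is bounded by φ²/2, so the linear model degrades quadratically as the DM phase perturbation grows.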
We now decompose the phase perturbations at plane p (Fig. 1.9), φ(ξ, η), into a sum of spatial
frequencies. Following notation similar to Pueyo [59], we describe φ as a Fourier series in
Cartesian coordinates. Summing over the integers (n,m) ∈ [−∞,∞] with amplitude bm,n,
the linearized phase perturbation induced by the DM is
\[
\phi(\xi,\eta) = \sum_{m,n} b_{m,n} e^{i\frac{2\pi}{D}(m\xi+n\eta)}. \tag{1.6.3}
\]
Applying Eq. 1.6.3 to Eq. 1.6.2, the linearized field at plane p becomes
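\begin{align}
E_p(\xi,\eta) &\approx A_p(\xi,\eta) + iA_p(\xi,\eta)\sum_{m,n} b_{m,n} e^{i\frac{2\pi}{D}(m\xi+n\eta)} \tag{1.6.4} \\
&\triangleq E_{nom} + E_{pert}, \tag{1.6.5}
\end{align}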
where we have defined the unperturbed, or nominal, component of the field as Enom and
the perturbed component of the field induced by the DM, Epert. The field due to the DM
perturbation can now be propagated a distance z to a second plane via a Fresnel Trans-
formation. For simplicity in notation we assume that this is the pupil plane, as shown in
Fig. 1.9. Applying Eq. 1.5.15, the effect of the perturbation, Epert, at the pupil plane is
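\begin{align}
E_{pert,pup}(u,v) &= i\sum_{m,n} b_{m,n} \mathcal{F}_z\left\{A_p(\xi,\eta) e^{i\frac{2\pi}{D}(m\xi+n\eta)}\right\} \tag{1.6.6} \\
&= i\sum_{m,n} b_{m,n} \frac{e^{i\frac{2\pi}{\lambda}z}}{i\lambda z} \iint_{-\infty}^{+\infty} A_p(\xi,\eta) e^{i\frac{2\pi}{D}(m\xi+n\eta)} e^{i\frac{\pi}{\lambda z}\left[(u-\xi)^2+(v-\eta)^2\right]} \, d\xi \, d\eta. \tag{1.6.7}
\end{align}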
Applying the coordinate transformations,
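\begin{align}
(u-\xi)^2 &= \left(u-\xi-\frac{m\lambda z}{D}\right)^2 - \left(\frac{m\lambda z}{D}\right)^2 + 2u\left(\frac{m\lambda z}{D}\right) - 2\xi\left(\frac{m\lambda z}{D}\right) \notag \\
(v-\eta)^2 &= \left(v-\eta-\frac{n\lambda z}{D}\right)^2 - \left(\frac{n\lambda z}{D}\right)^2 + 2v\left(\frac{n\lambda z}{D}\right) - 2\eta\left(\frac{n\lambda z}{D}\right), \notag
\end{align}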
the sum of spatial frequencies can be pulled out of the integral, making the field
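\begin{align}
E_{pert,pup}(u,v) = {} & i\sum_{m,n} b_{m,n} e^{i\frac{2\pi}{D}(mu+nv)} e^{-i\frac{\pi\lambda z}{D^2}(m^2+n^2)} \notag \\
& \cdot \frac{e^{i\frac{2\pi}{\lambda}z}}{i\lambda z} \iint_{-\infty}^{+\infty} A_p(\xi,\eta) e^{i\frac{\pi}{\lambda z}\left[\left(u-\xi-\frac{m\lambda z}{D}\right)^2+\left(v-\eta-\frac{n\lambda z}{D}\right)^2\right]} \, d\xi \, d\eta. \tag{1.6.8}
\end{align}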
The result is the angular spectrum approximation. It is the same perturbation expansion
as Eq. 1.6.7, but simplifies the integrand of the Fresnel transform by using shifted output
coordinates,
\begin{align}
u' &= u - \frac{m\lambda z}{D} \tag{1.6.9} \\
v' &= v - \frac{n\lambda z}{D}. \tag{1.6.10}
\end{align}
We indicate this coordinate shift in our operator notation with a subscript, Fz{·}(u′,v′).
Applying this notation, Eq. 1.6.8 is written as
\[
E_{pert,pup}(u,v) = i\sum_{m,n} b_{m,n} e^{i\frac{2\pi}{D}(mu+nv)} e^{-i\frac{\pi\lambda z}{D^2}(m^2+n^2)} \mathcal{F}_z\{A_p(\xi,\eta)\}_{(u',v')}. \tag{1.6.11}
\]
Eq. 1.6.11 shows that the field perturbed at plane p can be computed at the pupil as a sum of
shifted and complex weighted copies of the Fresnel transformed nominal field, Ap. Each term is
shifted by an amount set by the corresponding spatial frequency in the summation. Examining the
value of this shift for the given laboratory parameters,
λmax ≈ 700 nm
nmax,mmax = 16 cycles/aperture
zmax ≈ 1.6 m
D = 10.8 mm,
we see that the maximum shift relevant for wavefront control is
\[
\frac{n_{\max}\lambda_{\max}z_{\max}}{D} = 1.65 \text{ mm}. \tag{1.6.12}
\]
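The shift in Eq. 1.6.12 is straightforward to evaluate for other configurations. The sketch below uses the rounded parameters listed above, so its result can differ in the last digit from the quoted value. For the 10 cycles/aperture, 635 nm case it reuses the same propagation distance z ≈ 1.6 m, which is an assumption made here; the function name is illustrative.

```python
def angular_spectrum_shift(cycles, wavelength, z, aperture):
    """Lateral shift n*lambda*z/D of one Fourier component (Eqs. 1.6.9-1.6.10)."""
    return cycles * wavelength * z / aperture

D = 10.8e-3   # m
z_max = 1.6   # m

shift_max = angular_spectrum_shift(16, 700e-9, z_max, D)       # worst case
shift_nominal = angular_spectrum_shift(10, 635e-9, z_max, D)   # assumed same z

print(f"maximum shift: {1e3 * shift_max:.2f} mm")
print(f"nominal shift: {1e3 * shift_nominal:.2f} mm")
```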
In our nominal control configuration at 10 cycles/aperture in 635 nm light this value becomes
less than 1 mm. Even so, this is a significant fraction of the 10 mm pupil diameter used
in the laboratory so these shifts should not be neglected if this transformation were to be
used to compute the field for control. Interestingly, the sharp edges of a shaped pupil mean
that we cannot truncate the series to a small set for fear of introducing a Gibbs effect into the
numerical model. This precludes the utility of the angular spectrum factor for control, but it
is convenient for rigorously proving that (to first order) two DMs can control both amplitude
and phase aberrations. Recalling from Eq. 1.6.4 that the entire field incident on the pupil is
a sum of the nominal field and the perturbed component, we write the incident field on the
pupil plane as
\[
E_{pup}(u,v) = \mathcal{F}_z\{A_p(\xi,\eta)\} + i\sum_{m,n} b_{m,n} e^{i\frac{2\pi}{D}(mu+nv)} e^{-i\frac{\pi\lambda z}{D^2}(m^2+n^2)} \mathcal{F}_z\{A_p(\xi,\eta)\}_{(u',v')}. \tag{1.6.13}
\]
Applying the pupil function, Apup(u, v), to the incident field we find
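\begin{align}
E_{pup}(u,v) = {} & A_{pup}(u,v)\mathcal{F}_z\{A_p(\xi,\eta)\} \notag \\
& + i\sum_{m,n} b_{m,n} e^{i\frac{2\pi}{D}(mu+nv)} e^{-i\frac{\pi\lambda z}{D^2}(m^2+n^2)} A_{pup}(u,v)\mathcal{F}_z\{A_p(\xi,\eta)\}_{(u',v')}. \tag{1.6.14}
\end{align}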
We will make one more simplification to Eq. 1.6.14 by assuming that the aperture function
of the DM, Ap(ξ, η), sufficiently overfills the pupil plane aperture, Apup, such that
\[
A_{pup}(u,v)\mathcal{F}_z\{A_p\}_{(u,v)} \approx A_{pup}\mathcal{F}_z\{A_p\}_{(u',v')} \approx A_{pup}(u,v). \tag{1.6.15}
\]
This assumption is non-intuitive because we have decomposed the field into spatial frequen-
cies, but it is equivalent to assuming that the Fresnel ringing from the edges of the DM is
negligible because it is blocked by the aperture at the pupil plane, Apup(u, v). Under this
assumption the perturbed field at the pupil plane, Eq. 1.6.14, simplifies to
\[
E_{pup}(u,v) \approx A_{pup}(u,v)\left[1 + i\sum_{m,n} b_{m,n} e^{i\frac{2\pi}{D}(mu+nv)} e^{-i\frac{\pi\lambda z}{D^2}(m^2+n^2)}\right]. \tag{1.6.16}
\]
To prove controllability of amplitude and phase, Pueyo [59] first points out that if the phase
perturbations are small enough we may approximate the aberrated electric field, Eabr, with
arbitrary amplitude aberrations, Aabr, and phase aberrations, φabr, as
\begin{align}
E_{abr}(u,v) &= A_{abr}(u,v) e^{i\phi_{abr}(u,v)} \tag{1.6.17} \\
&\approx A_{abr}(u,v)\left(1 + i\phi_{abr}(u,v)\right). \tag{1.6.18}
\end{align}
Thus, in the linear approximation, phase aberrations are purely imaginary and amplitude
aberrations are purely real in the pupil. He then points out that the quadratic exponential
in Eq. 1.6.16 effectively rotates the phase of each term in the series, making the contribution
of each term a mixed complex value rather than a purely imaginary term. By linearizing
this quadratic term and assuming there is a second DM exactly at the pupil plane, Pueyo
[59] used a phasor representation to show that both real and imaginary aberrations can be
perfectly conjugated at the pupil plane if we use a second DM at plane p to perturb the
field. Since each component of the series summation can exactly conjugate an aberration at
the pupil, it follows that these aberrations have been suppressed in the image plane [59].
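The phase rotation described above can be made concrete. The sketch below evaluates the complex weight i·exp(−iπλz(m²+n²)/D²) that multiplies each Fourier component in Eq. 1.6.16. The values of λ, z, and D are borrowed from the laboratory parameters listed for Eq. 1.6.12 and serve only as an illustrative assumption; the function name is hypothetical.

```python
import numpy as np

wavelength = 635e-9   # m
z = 1.6               # m, assumed distance from the DM plane to the pupil
D = 10.8e-3           # m

def pupil_weight(m, n):
    """Complex factor multiplying b_{m,n} exp(i 2 pi (m u + n v) / D) in Eq. 1.6.16."""
    rotation = np.exp(-1j * np.pi * wavelength * z * (m**2 + n**2) / D**2)
    return 1j * rotation

for m in (0, 1, 5, 10):
    w = pupil_weight(m, 0)
    print(f"m = {m:2d}: real (amplitude-like) = {w.real:+.3f}, "
          f"imaginary (phase-like) = {w.imag:+.3f}")
```

Setting z = 0 (a DM conjugate to the pupil) makes every weight purely imaginary, which is the linearized statement that such a DM acts on phase only.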
1.6.2 Image Plane Controllability: The Propagation Factor
The argument made in Pueyo [59] was the first rigorous proof that DMs at non-conjugate
planes were capable of controlling both amplitude and phase, but it required several approx-
imations and linearizations of field perturbations at both the intermediate and pupil planes.
We would like to understand the effect of a truly arbitrary perturbation from a DM at plane
p on the image plane because this is ultimately where we want to control the field. Doing
so necessitates propagating a non-conjugate plane all the way to the image plane so that we
can compute the control effect, and evaluate how well we can correct a complex-valued field
as a function of location in the image plane.
[Figure 1.10 panels: (a) DM1 Nominal Shape, titled “DM1 Nominal Surface”, color bar “Surface Profile (nm)” from 0 to 400; (b) DM2 Nominal Shape, titled “DM2 Nominal Surface”, color bar “Surface Profile (nm)” from −300 to 300. Axes are x (mm) and y (mm), each spanning −5 to 5.]
Figure 1.10: Interferometric measurement of the uncontrolled shape of the HCIL’s two kilo-DMs. Note the complex structure of the nominal surface. The amplitude of the high spatial frequency component is ≈ 20-30 nm on each surface. Low order spherical and cylindrical modes have amplitudes on the order of 100’s of nanometers.
Since we are also trying to produce a numerical model for control, we begin by assessing
the accuracy of numerically propagating the DM surfaces via the Fresnel integral. As
shown in Fig. 1.10, the nominal surface of each DM is quite complex and contains high
spatial frequency errors with amplitudes of approximately 5% of the wavelength used in the
experiment. Indeed, the high resolution of the DM actuation also implies commands that
oscillate rapidly across the aperture. Capturing this well would require very high sampling
when we solve the Fresnel integral numerically, dramatically increasing our sensitivity to
numerical error and making the computation very slow (a bad thing for real-time control).
We seek to simplify our transformation from plane p to the image plane so that we will be
less sensitive to the discretization of our numerical integrator. This simplification will also
clearly demonstrate how the image plane electric field, Eim(x, y), changes as we move the
DM a distance z from the pupil.
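Beginning with Eq. 1.6.1, we Fresnel propagate an arbitrary field at plane p to the pupil
plane. Applying the pupil function Apup, the total field at the pupil plane is
\begin{align}
E_{pup}(u,v) &= A_{pup}(u,v)\,\mathcal{F}_z\{A_p(\xi,\eta)e^{i\phi(\xi,\eta)}\} \tag{1.6.19}\\
&= A_{pup}(u,v)\,\frac{e^{i\frac{2\pi}{\lambda}z}}{i\lambda z}\iint_{-\infty}^{+\infty} A_p(\xi,\eta)e^{i\phi(\xi,\eta)}e^{i\frac{\pi}{\lambda z}[(u-\xi)^2+(v-\eta)^2]}\,d\xi\,d\eta \tag{1.6.20}\\
&= \left[A_{pup}A_p e^{i\phi}\right] * h(u,v), \tag{1.6.21}
\end{align}
where the kernel to the convolution is
\begin{equation}
h(u,v) = \frac{e^{i\frac{2\pi}{\lambda}z}}{i\lambda z}\,e^{i\frac{\pi}{\lambda z}(u^2+v^2)}. \tag{1.6.22}
\end{equation}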
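Using the Fourier transform relationship between the image and pupil plane given by Eq. 1.5.25,
the field at the image plane is
\begin{align}
E_{im}(x,y) &= \frac{e^{i\frac{4\pi}{\lambda}f}}{i\lambda f}\,\mathcal{F}\{\left[A_{pup}A_p e^{i\phi}\right] * h(u,v)\} \tag{1.6.23}\\
&= \frac{e^{i\frac{4\pi}{\lambda}f}}{i\lambda f}\,\mathcal{F}\{A_{pup}A_p e^{i\phi}\}\,\mathcal{F}\{h(u,v)\} \tag{1.6.24}\\
&= -\frac{e^{i\frac{2\pi}{\lambda}(2f+z)}}{\lambda^2 f z}\,\mathcal{F}\{A_{pup}A_p e^{i\phi}\}\iint_{-\infty}^{+\infty} e^{i\frac{\pi}{\lambda z}(u^2+v^2)}e^{-i\frac{2\pi}{\lambda f}[ux+vy]}\,du\,dv \tag{1.6.25}\\
&= -\frac{e^{i\frac{2\pi}{\lambda}(2f+z)}}{\lambda^2 f z}\,\mathcal{F}\{A_{pup}A_p e^{i\phi}\}\iint_{-\infty}^{+\infty} e^{-i\frac{\pi z}{\lambda f^2}(x^2+y^2)}e^{i\frac{\pi}{\lambda z}\left[(u-\frac{z}{f}x)^2+(v-\frac{z}{f}y)^2\right]}\,du\,dv \tag{1.6.26}\\
&= \frac{e^{i\frac{2\pi}{\lambda}(2f+z)}}{i\lambda f}\,\mathcal{F}\{A_{pup}A_p e^{i\phi}\}\,e^{-i\frac{\pi z}{\lambda f^2}(x^2+y^2)}. \tag{1.6.27}
\end{align}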
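The last step, from Eq. 1.6.26 to Eq. 1.6.27, can be filled in with the standard Fresnel (complex Gaussian) integral. The following is a sketch of that step; the shifted variables u' = u − zx/f and v' = v − zy/f are introduced here for illustration and are not part of the original notation. The quadratic phase in (x, y) does not depend on the integration variables and can be pulled outside, leaving

```latex
\begin{align*}
\iint_{-\infty}^{+\infty} e^{i\frac{\pi}{\lambda z}\left(u'^2+v'^2\right)}\,du'\,dv'
  &= \left(\int_{-\infty}^{+\infty} e^{i\frac{\pi}{\lambda z}u'^2}\,du'\right)^{2}
   = \left(\sqrt{i\lambda z}\right)^{2} = i\lambda z, \\
-\frac{1}{\lambda^{2} f z}\cdot i\lambda z
  &= -\frac{i}{\lambda f} = \frac{1}{i\lambda f}.
\end{align*}
```

The second line recovers the prefactor of Eq. 1.6.27, so the only trace of the propagation distance z that remains is the constant phase and the quadratic phase across the image plane.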
By applying a quadratic phase term at the image plane, Eq. 1.6.27 accounts for the prop-
agation of an arbitrary field from plane p to the pupil plane. We refer to this quadratic term
as the propagation factor. For control, the power of this result lies in the fact that we do not
have to integrate the field twice to compute the field at the image plane from a perturba-
tion at plane p. Note that in most scenarios, particularly with a shaped pupil coronagraph,
we can simplify Eq. 1.6.27 by recognizing that the aperture function will underfill the DM
aperture, making ApupAp ≈ Apup. This is not a critical simplification, but makes the result
slightly easier to understand since the PSF is dominated by the coronagraph.
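As a minimal numerical sketch of how Eq. 1.6.27 might be used in a model, the snippet below evaluates the propagation factor on a small image-plane grid and applies it, together with the constant prefactor, to a precomputed transform. The function names and all parameter values (wavelength, focal length, distance z, grid extent) are illustrative assumptions, not HCIL values or the author's code.

```python
import numpy as np

def propagation_factor(x, y, z, wavelength, focal_length):
    """Quadratic image-plane phase of Eq. 1.6.27:
    exp(-i * pi * z * (x^2 + y^2) / (lambda * f^2))."""
    r2 = x**2 + y**2
    return np.exp(-1j * np.pi * z * r2 / (wavelength * focal_length**2))

def image_field_from_plane_p(pupil_transform, x, y, z, wavelength, focal_length):
    """Apply the prefactor and propagation factor of Eq. 1.6.27 to an
    already-computed transform F{A_pup A_p exp(i phi)}."""
    prefactor = (np.exp(1j * 2 * np.pi * (2 * focal_length + z) / wavelength)
                 / (1j * wavelength * focal_length))
    return prefactor * pupil_transform * propagation_factor(
        x, y, z, wavelength, focal_length)

# Assumed, illustrative values (not taken from the experiment).
wavelength = 635e-9    # m
focal_length = 1.5     # m
z = 0.3                # m, distance of the DM plane from the pupil
coords = np.linspace(-1e-3, 1e-3, 5)   # image-plane coordinates in m
X, Y = np.meshgrid(coords, coords)

P = propagation_factor(X, Y, z, wavelength, focal_length)
E = image_field_from_plane_p(np.ones_like(X), X, Y, z, wavelength, focal_length)
```

Because the factor has unit modulus everywhere, it only rotates the control effect between real and imaginary parts; a single transform followed by this pointwise multiplication replaces a second numerical integration.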
The propagation factor in Eq. 1.6.27 mixes the effect of the DM’s perturbation to the
electric field, φ, between real and imaginary parts in the image plane. One DM at a non-
conjugate plane cannot correct both amplitude and phase aberrations by itself, but rather
corrects a specific mixture of the two dictated by the propagation factor. Having established
one DM a distance z = zp away from the pupil, we will include a second DM at yet another
plane, q, a distance zq away from the pupil. By assuming that the contribution of each DM
at the image plane electric field is additive for sufficiently small stroke, as proven in Pueyo
and Kasdin [54] and Pueyo et al. [55], we aim to show sufficient coverage of the real and
imaginary parts of the electric field in the image plane. It is now important to note that
showing independent and simultaneous control of the real and imaginary parts of the field
is equivalent to demonstrating control over amplitude and phase. The choice of one over
the other is a matter of convenience in representing the field. We will find in the estimation
and control chapters that it is more convenient to represent the image plane in real and
imaginary parts, so for the sake of consistency we will also prove controllability of the image
plane electric field in the same manner.
The choice of the propagation distance for each DM, zp and zq, to the pupil is not critical
except to guarantee enough phase to amplitude mixing once the field reaches the pupil. In a
conventional system, one DM would be conjugate to the pupil and the second non-conjugate
to the pupil. However, this has the disadvantage of requiring additional re-imaging optics to
conjugate one mirror to the pupil. We will show that conjugating one DM to the pupil also
reduces our coverage of the image plane, meaning that we will have poor controllability over
both real and imaginary parts of the field across the entire image plane. To clarify this second
point, we examine the contribution of a DM in Eq. 2.1.5 when it is non-conjugate to the
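pupil. Applying Euler’s formula to Eq. 1.6.27, we find the real and imaginary components
of the field to be
\begin{equation}
E_{im}(x,y) = \frac{e^{i\frac{2\pi}{\lambda}(2f+z)}}{i\lambda f}\,\mathcal{F}\{[A_{pup}A_p e^{i\phi}]\}\cdot\left[\sin\left(\frac{\pi z}{\lambda f^2}(x^2+y^2)\right) + i\cos\left(\frac{\pi z}{\lambda f^2}(x^2+y^2)\right)\right]. \tag{1.6.28}
\end{equation}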
We now see that the contribution of the propagation factor for a single DM oscillates between
being purely real and purely imaginary across the image plane. The consequence is that one
DM non-conjugate to the pupil exhibits regions in the image plane with poor mixing between
real and imaginary components of the field. Using relevant parameters from the HCIL, and
choosing zp = z1 for DM1 and zq = z2 for DM2, Fig. 1.11 shows the overall effect this has in
the image plane by plotting the magnitude of the real term in Eq. 1.6.28, $\left|\sin\left(\frac{\pi z}{\lambda f^2}(x^2+y^2)\right)\right|$.
Fig. 1.11(a) and Fig. 1.11(b) show that there is a rapid oscillation in the degree to which
the control effect becomes either purely real or purely imaginary, even at low to mid spatial
frequencies. By choosing z2 such that its propagation factor oscillates with a different period
in the image plane, we have the ability to simultaneously control the real and imaginary
parts of the field over the entire image plane. For example, Fig. 1.11(a) shows that the
magnitude of the real component of the first DM’s propagation factor is very close to zero
at approximately 10λ0/D. Looking at Fig. 1.11(b), we see that the magnitude of the real
term for the second DM’s propagation factor is much larger. The fact that the mixing from
each propagation factor is different means that we can arbitrarily correct either the real or
imaginary part of the field at this location of the image plane. Another way to see this is
to actuate the same spatial frequency, w, on each DM, each with its own amplitude, b1 and
b2. Scaling by the wavelength, λ, the aperture size, D, and the focal length of the imaging
optic, f, we choose DM1’s perturbation to be
\begin{equation}
\phi_1(\xi,\eta) = b_1\cos\left(\frac{2\pi w D\xi}{\lambda f}\right), \tag{1.6.29}
\end{equation}
Figure 1.11: Effect of the DM propagation on the real and imaginary parts of the electric field at the image plane. Panels: (a) DM1 propagation factor (real term for DM1); (b) DM2 propagation factor (real term for DM2); (c) real part, overlay of both DMs; (d) imaginary part, overlay of both DMs. Both axes are in units of λ0/D from −10 to 10, with a color scale from 0 to 1. (a) and (b) show the magnitude of the real part due to the angular spectrum factor. (c) and (d) overlay the contribution of both DMs to the real and imaginary parts of the field respectively, indicating that there is good coverage of both real and imaginary terms in the search area up to the control limit of the DM. This indicates that the system has a high degree of controllability over both amplitude and phase in monochromatic light.
where ξ is the physical coordinate in the DM1 plane. We also apply the same shape for
DM2, making its perturbation to the field
\begin{equation}
\phi_2(\sigma,\tau) = b_2\cos\left(\frac{2\pi w D\sigma}{\lambda f}\right), \tag{1.6.30}
\end{equation}
where σ is the physical coordinate in the DM2 plane. Assuming that b1 and b2 are
small enough that
\begin{align}
A_1(\xi,\eta)e^{i\phi_1(\xi,\eta)} &\approx A_1(\xi,\eta)\left(1 + i\phi_1(\xi,\eta)\right) \tag{1.6.31}\\
A_2(\sigma,\tau)e^{i\phi_2(\sigma,\tau)} &\approx A_2(\sigma,\tau)\left(1 + i\phi_2(\sigma,\tau)\right), \tag{1.6.32}
\end{align}
we can use Eq. 1.6.27 to write the perturbation incident on the image plane from DM1 and
DM2 as
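\begin{align}
E_{pert,DM1} &= \frac{e^{i\frac{2\pi}{\lambda}(2f+z_1)}}{i\lambda f}\,\mathcal{F}\left\{iA_{pup}b_1\cos\left(\frac{2\pi w D\xi}{\lambda f}\right)\right\}e^{-i\frac{\pi z_1}{\lambda f^2}(x^2+y^2)} \tag{1.6.33}\\
&= \frac{b_1}{2}\frac{e^{i\frac{2\pi}{\lambda}(2f+z_1)}}{i\lambda f}\,e^{-i\frac{\pi z_1}{\lambda f^2}(x^2+y^2)}\,\mathcal{F}\{A_{pup}\} * \left(\delta(x-wD) + \delta(x+wD)\right) \tag{1.6.34}\\
&= \frac{b_1}{2}\frac{e^{i\frac{2\pi}{\lambda}(2f+z_1)}}{i\lambda f}\,e^{-i\frac{\pi z_1}{\lambda f^2}(x^2+y^2)}\left[\mathcal{F}\{A_{pup}\}(x-wD) + \mathcal{F}\{A_{pup}\}(x+wD)\right], \tag{1.6.35}
\end{align}
and
\begin{align}
E_{pert,DM2} &= \frac{e^{i\frac{2\pi}{\lambda}(2f+z_2)}}{i\lambda f}\,\mathcal{F}\left\{iA_{pup}b_2\cos\left(\frac{2\pi w D\sigma}{\lambda f}\right)\right\}e^{-i\frac{\pi z_2}{\lambda f^2}(x^2+y^2)} \tag{1.6.36}\\
&= \frac{b_2}{2}\frac{e^{i\frac{2\pi}{\lambda}(2f+z_2)}}{i\lambda f}\,e^{-i\frac{\pi z_2}{\lambda f^2}(x^2+y^2)}\,\mathcal{F}\{A_{pup}\} * \left(\delta(x-wD) + \delta(x+wD)\right) \tag{1.6.37}\\
&= \frac{b_2}{2}\frac{e^{i\frac{2\pi}{\lambda}(2f+z_2)}}{i\lambda f}\,e^{-i\frac{\pi z_2}{\lambda f^2}(x^2+y^2)}\left[\mathcal{F}\{A_{pup}\}(x-wD) + \mathcal{F}\{A_{pup}\}(x+wD)\right] \tag{1.6.38}
\end{align}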
respectively. With the same spatial frequency applied to both DMs, we apply two shifted
copies of the PSF at the same locations in the image plane. They will each have an amplitude
chosen by the amplitude of the perturbation applied to the DM, and a phase dictated by
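their propagation factor. Since their total contribution to the image plane is
\begin{align}
E_{pert} &= E_{pert,DM1} + E_{pert,DM2} \tag{1.6.39}\\
&= \frac{e^{i\frac{4\pi}{\lambda}f}}{i2\lambda f}\left[b_1 e^{i\frac{2\pi}{\lambda}z_1}e^{-i\frac{\pi z_1}{\lambda f^2}(x^2+y^2)} + b_2 e^{i\frac{2\pi}{\lambda}z_2}e^{-i\frac{\pi z_2}{\lambda f^2}(x^2+y^2)}\right]\cdot\left[\mathcal{F}\{A_{pup}\}(x-wD) + \mathcal{F}\{A_{pup}\}(x+wD)\right], \tag{1.6.40}
\end{align}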
we can vary b1 and b2 relative to each other to produce whatever ratio of real and imaginary
parts we like at the image plane location (x, y) = (±wD, 0). If we describe the perturbation
from each DM surface as the sum of all controllable spatial frequencies we can extend this
effect to any point in the image plane. Thus, the coverage provided by complementary
propagation factors gives us sufficient degrees of freedom to simultaneously correct real and
imaginary components of the field. As seen in Eq. 1.6.40, it is this ability that allows us to
create symmetric dark holes in the image plane, as opposed to the single sided correction
shown in Fig. 1.6. To demonstrate the quality of coverage over the entire image plane,
Fig. 1.11(c) and Fig. 1.11(d) overlay the contribution of both DMs to the real and imaginary
parts of the field, respectively. Their relative separations favor coverage for the real part
of the field at small working angles, but both real and imaginary components exhibit good
coverage and never go to zero (note that inside the core of the PSF the value does not
matter). In the example shown, the only nulls are nearly at the 16λ/D controllable limit of
the HCIL DMs (a consequence of the fact that the maximum spatial frequency a DM
can directly command is half of the number of actuators across the aperture).
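The argument around Eq. 1.6.40 can be illustrated numerically. In the sketch below, p1 and p2 stand for the two bracketed phasors of Eq. 1.6.40 evaluated at a single image-plane point, and we solve a 2×2 real system for the amplitudes b1 and b2 that produce a requested complex value. The function name and the phase values are made up for illustration; they are not derived from the HCIL geometry.

```python
import numpy as np

def solve_dm_amplitudes(p1, p2, target):
    """Find real amplitudes (b1, b2) such that b1*p1 + b2*p2 = target.

    p1, p2 : complex phasors of the two DMs at one image-plane point
             (hypothetical stand-ins for the bracketed terms in Eq. 1.6.40).
    target : desired complex field contribution at that point.
    """
    # Split the complex equation into its real and imaginary rows.
    M = np.array([[p1.real, p2.real],
                  [p1.imag, p2.imag]])
    rhs = np.array([target.real, target.imag])
    return np.linalg.solve(M, rhs)

# Illustrative phases: the two propagation factors differ at this point.
p1 = np.exp(1j * 0.3)
p2 = np.exp(1j * 1.7)
target = 0.2 - 0.5j

b1, b2 = solve_dm_amplitudes(p1, p2, target)
achieved = b1 * p1 + b2 * p2
```

The determinant of the 2×2 matrix is the sine of the phase difference between p1 and p2, so the system becomes ill-conditioned exactly where the two propagation factors line up. This is the numerical counterpart of the coverage argument: distinct propagation factors keep the real and imaginary parts independently reachable.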
We have now shown that the absolute value of zp and zq matters much less for con-
trollability than their relative magnitudes. Looking to Eq. 1.6.28, if zp/zq were a multiple
of 2π they would have identical propagation factors and we would have poor coverage in
certain areas of the image plane. This effect was also recognized by Shaklan and Green [63],
but in the pupil plane. They point out that a particular spatial frequency of the field can
effectively be reconstructed if it is propagated by the Talbot distance, $z_t = 2D^2/\lambda w^2$ in our
notation. This is equivalent to the argument made here for controllability in the image plane.
Furthermore, had we chosen the DM at plane q to be conjugate to the pupil by including
re-imaging optics, zq would be zero and the DM at plane q would not be able to compensate
for regions where we have poor control with the non-conjugate DM at plane p.
Having demonstrated good controllability of both the real and imaginary parts of the
image plane electric field over the entire controllable space, we can guarantee good con-
trollability of both amplitude and phase aberrations incident on the system. Additionally,
Eq. 1.6.27 gives us a model for computing the control effect of a DM at a non-conjugate
plane. We will use this propagation factor extensively in lieu of applying multiple numerical
transformations to produce our estimation and control algorithms.
1.7 Numerical Transform
Since the purpose of the DMs is to produce arbitrary surfaces that correct an arbitrary set
of aberrations at the image plane (within the controllable limit of the DM), we cannot rely
on analytical integration to find solutions for the DM shapes. Thus, our control laws require
that we use numerical integration techniques to relate the DM, pupil, and image planes using
the Fourier optics techniques described in §1.5. To that end we will use a matrix formulation
to take the two-dimensional Fourier transform. For the purposes of defining zero at the
center of each plane we will consider the dimension of the pupil and image as a quarter
plane, defined by the number of elements (Npup,Mpup) and (Nim,Mim) respectively. The
dimension of each plane is defined in Table 1.1.
Pupil Coordinates    Dimension
u                    (2 ∗ Npup) × 1
v                    (2 ∗ Mpup) × 1

Image Coordinates    Dimension
x                    (2 ∗ Nim + 1) × 1
y                    (2 ∗ Mim + 1) × 1

Table 1.1: Coordinates in each plane
Keeping in mind the dimension of Table 1.1, we discretize the integration along the
coordinates (u, v) in the pupil as (uk1 , vk2), where (k1, k2) are integers from −Npup : Npup and
−Mpup : Mpup respectively. Likewise, we discretize the image plane coordinates, (x, y), as
(xj1 , yj2), where (j1, j2) are integers from −Nim : 0 : Nim and −Mim : 0 : Mim respectively.
Using this notation the pupil field, A, is discretized with indices $A_{k_1,k_2}$. Using the discretized
pupil and image plane coordinates, we write the Fourier transform in Eq. 1.5.25 as a finite
sum. Following a method similar to [72, 15], we describe the discrete form of the Fourier
transform, denoted by $\hat{\cdot}$, at a particular pixel location in the image plane as
\begin{equation}
\hat{A}_{j_1,j_2} = \sum_{k_1=-N_{pup}}^{N_{pup}}\;\sum_{k_2=-M_{pup}}^{M_{pup}} e^{-i\frac{2\pi}{\lambda f}v_{k_2}y_{j_2}}\,A_{k_1,k_2}\,e^{-i\frac{2\pi}{\lambda f}u_{k_1}x_{j_1}}. \tag{1.7.1}
\end{equation}
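Using the column vectors defined in Table 1.1, each element of the image $\hat{A}_{j_1,j_2}$ can
be directly encoded into a two-dimensional matrix by writing the elements of the Fourier
integrand as
\begin{equation}
\hat{A} = e^{-i\frac{2\pi}{\lambda f}(y\cdot v^T)}\cdot A\cdot e^{-i\frac{2\pi}{\lambda f}(u\cdot x^T)}\cdot\frac{du\cdot dv}{\lambda f}, \tag{1.7.2}
\end{equation}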
where du and dv are the physical dimension of each pixel in the pupil plane, λ is the wave-
length under consideration, and f is the focal length of the imaging optic. Describing an
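individual element of A as $a_{k_1,k_2}$, the elements of the resultant matrices are
\begin{align}
e^{-i\frac{2\pi}{\lambda f}(y\cdot v^T)} &= \exp\left(-i\frac{2\pi}{\lambda f}
\begin{bmatrix}
y_{-M_{im}}v_{-M_{pup}} & \dots & y_{-M_{im}}v_{M_{pup}}\\
\vdots & \ddots & \vdots\\
y_{M_{im}}v_{-M_{pup}} & \dots & y_{M_{im}}v_{M_{pup}}
\end{bmatrix}\right) \tag{1.7.3}\\
A &=
\begin{bmatrix}
a_{-M_{pup},-N_{pup}} & \dots & a_{-M_{pup},N_{pup}}\\
\vdots & \ddots & \vdots\\
a_{M_{pup},-N_{pup}} & \dots & a_{M_{pup},N_{pup}}
\end{bmatrix} \tag{1.7.4}\\
e^{-i\frac{2\pi}{\lambda f}(u\cdot x^T)} &= \exp\left(-i\frac{2\pi}{\lambda f}
\begin{bmatrix}
u_{-N_{pup}}x_{-N_{im}} & \dots & u_{-N_{pup}}x_{N_{im}}\\
\vdots & \ddots & \vdots\\
u_{N_{pup}}x_{-N_{im}} & \dots & u_{N_{pup}}x_{N_{im}}
\end{bmatrix}\right) \tag{1.7.5}
\end{align}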
where 1/λf scales the amplitude of the result and du · dv scales the resultant image plane
in physical units. Note that the constant phase factor and imaginary term (which itself is
only a −π/2 phase shift across the plane) have been left out because they are unnecessary for the
control code. The first summation of Eq. 1.7.1 is found for every element along the entire
first index by multiplying the last two matrices of Eq. 1.7.2. We define this intermediate
matrix as
\begin{equation}
G(j_1,k_2) = A\cdot e^{-i\frac{2\pi}{\lambda f}(u\cdot x^T)}\cdot\frac{du\cdot dv}{\lambda f} \tag{1.7.6}
\end{equation}
The rows of G(j1, k2) encode the values of the pupil index along the second dimension, k2, and
the columns encode the values of the image plane index along the first dimension, j1. Thus,
we index an individual element of G as gj1,k2 . The final matrix multiplication completes the
summation, making the two matrix multiplications equivalent to evaluating
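\begin{align}
g_{j_1,k_2} &= \sum_{k_1=-N_{pup}}^{N_{pup}} A_{k_1,k_2}\,e^{-i\frac{2\pi}{\lambda f}u_{k_1}x_{j_1}} \tag{1.7.7}\\
\hat{A}_{j_1,j_2} &= \sum_{k_2=-M_{pup}}^{M_{pup}} e^{-i\frac{2\pi}{\lambda f}v_{k_2}y_{j_2}}\,g_{j_1,k_2} \tag{1.7.8}
\end{align}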
for every discrete location in the image plane, (j1, j2). Comparing the two matrix multiplica-
tions in Eq. 1.7.2 to the scalar equations (Eq. 1.7.7, 1.7.8), each element described by gj1,k2
is encoded in the dimensionality of the matrix in G(j1, k2). The final multiplication will
encode the scalar value described by Aj1,j2 in each pixel of the matrix, thereby construct-
ing the two-dimensional Fourier transform. The computational merit of this method is well
described and compared to other numerical methods in [72]. One of the key advantages of
this method over a standard FFT is that the physical dimension of the second plane can
be changed, which is critical to astronomical imaging where the extent of the pupil plane is
orders of magnitude larger than the image plane. The flexibility of the numerical transform
with regard to dimension in Eq. 1.7.2 makes it much simpler to sample the image plane
appropriately.
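The matrix form in Eq. 1.7.2 translates almost directly into array code. The following is a minimal NumPy sketch under stated assumptions: the function name, the square test aperture, and every numerical value are illustrative choices, and, as in the text, the constant phase factor and the 1/i term are omitted.

```python
import numpy as np

def matrix_fourier_transform(A, u, v, x, y, wavelength, focal_length):
    """Two matrix products implementing Eq. 1.7.2.

    A    : pupil field, shape (len(v), len(u)), rows indexed by v as in Eq. 1.7.4
    u, v : pupil-plane coordinate vectors (uniformly spaced)
    x, y : image-plane coordinate vectors (any extent and sampling)
    Returns the image-plane field with shape (len(y), len(x)).
    """
    du = u[1] - u[0]
    dv = v[1] - v[0]
    k = 2 * np.pi / (wavelength * focal_length)
    left = np.exp(-1j * k * np.outer(y, v))    # shape (len(y), len(v))
    right = np.exp(-1j * k * np.outer(u, x))   # shape (len(u), len(x))
    return left @ A @ right * (du * dv / (wavelength * focal_length))

# Assumed example: a filled square aperture of width D on an even grid.
wavelength = 635e-9      # m
focal_length = 1.0       # m
D = 10e-3                # m
n_pup = 16               # even number of pupil samples
du = D / n_pup
u = (np.arange(n_pup) - n_pup / 2 + 0.5) * du
v = u.copy()
A = np.ones((n_pup, n_pup))

n_im = 10                                  # quarter-plane size, 2*n_im + 1 samples
dx = 0.25 * wavelength * focal_length / D  # four samples per lambda*f/D
x = np.arange(-n_im, n_im + 1) * dx
y = x.copy()

A_hat = matrix_fourier_transform(A, u, v, x, y, wavelength, focal_length)
psf = np.abs(A_hat) ** 2 / np.abs(A_hat[n_im, n_im]) ** 2  # normalized to the peak
```

Note that the image-plane sampling dx is chosen independently of the pupil sampling, which is the flexibility the text contrasts with a standard FFT. The cost is two dense matrix products rather than an O(N log N) transform, which is usually acceptable when only a small region of the image plane is needed.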
With regard to choosing the spatial sampling in each plane, there are several factors to
consider. In the image plane, we will seek to normalize to the peak value of the PSF, which is
located at (0, 0). For this reason, the peak value may not be accurately reflected in an even set
of pixels since the exact center is not contained within any one pixel. Thus, an odd number
of pixels is chosen to sample the image plane. In the pupil plane, the choice is the opposite.
In many cases, we seek to exploit symmetry principles to simplify the computation of the
transform as in [15]. Two more practical reasons apply to the computation of the DM shapes
and manufacturing of the pupil. The DMs we use at the Princeton HCIL have an even set of
actuators across the surface which means that the functions it can generate are fundamentally
based on an even set across the aperture. Thus we can more accurately represent their surface
with an even data set. More importantly, real world manufacturing processes define shapes
based on the physical coordinates of a boundary, the precision of which is embedded in
the numerical precision of the coordinates. There is a subtle difference between defining
the aperture of the coronagraph as a set of boundaries (as in the manufacturing process)
and defining the aperture as a two-dimensional set of values (as in the design process using
numerical transforms). Defining an odd set in the manner shown in Fig. 1.12(a) artificially
oversizes the pupil by half a pixel in each direction. This can be repaired, but at the
cost of shifting the definition of the center within a particular discrete value. No matter
how you define the odd set, the edge and the center cannot share a common reference. If
one is edge defined, the other is guaranteed to be center defined. To solve this, an even
set of pixels is chosen, but Fig. 1.12(b) demonstrates that this can also be done incorrectly.
In this case the definition of the physical coordinates has resulted in a half pixel shift of
the pupil, which will artificially impose a tip-tilt discrepancy in the phase computed for
the image plane electric field. Fig. 1.12(c) demonstrates the proper definition of physical
coordinates for an even discretization in the pupil plane. This set produces pixels whose
physical coordinates are edge defined. Unlike the odd case, both the center and edges of the
pupil are edge defined, dramatically simplifying the physical mapping of the pixel values to
the boundary coordinates required to manufacture the pupil.
In summary, the appropriate method for discretizing these planes is to create an edge-
defined even set for the pupil (or any intermediary plane) and a center-defined odd set in the
final image plane. The matrix representation of the numerical Fourier transform shown in
this section is highly reliable and efficient while maintaining a degree of flexibility necessary
to dimension each plane appropriately for astronomical imaging.
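The discretization rule summarized above can be sketched with two small helpers. This is one possible reading of the edge-defined even set and the center-defined odd set, not the author's code; the function names and the numerical values are assumptions made for illustration.

```python
import numpy as np

def pupil_coordinates(n_quarter, width):
    """Edge-defined even grid: 2*n_quarter pixel centers whose outer pixel
    edges land exactly on -width/2 and +width/2, with the optical axis on
    the boundary between the two central pixels."""
    n = 2 * n_quarter
    du = width / n
    return (np.arange(n) - n_quarter + 0.5) * du

def image_coordinates(n_quarter, pixel_size):
    """Center-defined odd grid: 2*n_quarter + 1 samples, with the middle
    sample sitting exactly at 0 so the PSF peak falls on a pixel."""
    return np.arange(-n_quarter, n_quarter + 1) * pixel_size

# Assumed example values.
u = pupil_coordinates(4, 10e-3)    # 8 samples across a 10 mm pupil
x = image_coordinates(3, 13e-6)    # 7 samples with 13 micron pixels
```

With this convention the even pupil grid never samples u = 0, while the odd image grid always does, which matches the reasoning in the text: the aperture boundary and the optical axis share an edge-based reference, and the normalization pixel in the image plane is centered on the peak.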
Figure 1.12: (a) Odd set: oversized pupil. (b) Bad even set: shifted pupil. (c) Good even set: edge defined. (a) and (b) represent poor choices for the numerical dimension of a pupil plane, whereas (c) demonstrates the ideal way to define an aperture for producing a coronagraph.
1.8 Thesis Overview
In this thesis we will use the tools developed in this chapter to develop FPWC algorithms
to correct aberrations from the system optics that degrade the high contrast regions of our
coronagraphic image. By doing so, we will be able to recover small regions of high contrast
where we may once again search for planets. Ch. 2 develops the control laws and strategies
that suppress the aberrated field in both monochromatic and broadband light. To reduce
the number of exposures required for wavefront estimation we also develop an extrapolation
technique that uses a single monochromatic estimate in the broadband wavefront control
algorithm. In Ch. 3 we introduce the DM diversity estimation algorithm, the most common
estimation scheme used for FPWC at any laboratory. It is a batch process estimator that
computes the field via a left pseudo-inverse, which provides least squares minimal error on
the field. The measurements are differential measurements at the image plane with conjugate
DM shapes. We also address how to choose the probe shapes so that the problem is well
posed and we guarantee measurements with high signal-to-noise. In Ch. 4 we develop a new
estimation scheme to replace the DM diversity estimator, utilizing a Kalman filter. The
filter still uses the same probe scheme, but uses fewer measurements since we are able to
close the loop on the state estimate. We demonstrate how this allows us to operate more
efficiently and robustly, since we can rely on measurements taken when the aberrated field
was brighter. Finally we discuss a method by which we can use the control signal itself
to probe the field, enabling closed loop estimation using only a single measurement each
iteration. Ch. 5 and Ch. 6 show the experimental results using these estimation and control
schemes, and then discuss the modeling and experimental limitations that limit our ability
to suppress the field. In Ch. 7 we make our final conclusions and point to possible future
directions of this work, particularly in the area of electric field estimation.
As might be surmised in this overview, the challenge of this problem is not just the op-
tics; the level of precision we must adhere to makes this an inherently challenging estimation
and control problem. Beyond that our fundamental measurement of error, the accuracy and
precision with which we can manipulate the electric field, is completely unobservable. As
such, our criteria for estimator and controller performance are not simply quantified by the
ultimate achievable contrast, but the efficiency and robustness of the correction algorithm
as well. At any point, this could be limited by the experiment, our model of the exper-
iment from which we derive the estimation and control laws, or the correction algorithm
itself. Improving our efficiency and robustness drives many of the choices we make in the
mathematical development in this thesis because without improvements in all three of these
categories a wavefront control system will never work in a real space observatory.
1.9 Chapter Assumptions
The following assumptions were made in the derivations of this chapter.
§1.5 - From Rayleigh-Sommerfeld to Huygens-Fresnel:
• The paraxial approximation; the magnitude of any propagation vector is equal to that
of the chief ray.
• The on-axis approximation, which assumes that the dot product of any ray with the
chief ray is equal to one.
§1.5.1 - From Huygens-Fresnel to the Fresnel Integral:
• A first order binomial expansion of the propagation distance in the exponential, the
validity of which is determined by Eq. 1.5.16.
§1.5.2 - From Fresnel to Fourier Transform Imaging:
• Infinitely thin optics of infinite extent (or large enough to effectively capture all of the
energy)
• The optic is modeled as only applying a quadratic phase to the field incident on the
optic
§1.6.1 - Pupil Plane Controllability: Angular Spectrum:
• The phase perturbations at the pth plane upstream of the pupil are small enough that
a first order linearization may be taken in the exponential.
• The pupil aperture is undersized enough compared to the aperture at the pth plane
that its Fresnel propagation (and the shifted Fresnel integral) have no effect on the
field at the pupil plane. In principle this need only be undersized enough to cover the
Fresnel ringing induced by the propagation of the aperture at the pth plane.
§1.6.2 - Image Plane Controllability: The Propagation Factor:
• The contribution of each DM perturbation is additive in the image plane.
• For the analytical proof of controllability using a single spatial frequency, we assume
that the control amplitudes are small enough that a first order linearization of each
DM plane is valid and that the DM can be described as a superposition of spatial
frequencies. This does not rule out controllability for larger stroke; in this scenario an
argument was made based on coverage of phase mixing.
• A continuous set of spatial frequencies is controllable up to the controllable limit of
the DM.
Chapter 2
Focal Plane Wavefront Control
The goal of any wavefront correction algorithm in high contrast imaging is to reduce the
intensity of the aberrated field to a level that makes a planet detectable. Quantifying a
detection limit is in itself complicated because in addition to the residual aberrated field
we must account for photon and detector noise, background from the exozodiacal light,
and integration time. For simplicity in the control algorithms and in quantifying their
performance we simply quote directly measured values of the normalized intensity relative
to the peak power of the PSF.
In contrast to conventional adaptive optics systems where a non-common path wavefront
sensor is used to control large perturbations in the field in a fast feedback loop, focal plane
wavefront control (FPWC) does not include a wavefront sensor. The non-common path
of the wavefront sensor in conventional AO makes it impossible to accurately measure the
electric field at the science camera to the required level of precision (better than a part in $10^5$
for an Earth-like planet). Extreme-AO systems currently under development will attempt to
calibrate the non-common path to reach contrast levels of $10^{-6}$ to $10^{-7}$, but this has yet to be
proven [48, 8]. In order to eliminate all non-common path elements the field at the science
camera must be measured directly. Since the science detector can only measure intensity the
field must be estimated, which is the topic of Chs. 3 and 4. We also seek a control law for the
DMs, which are upstream of the pupil (Fig. 1.3), based on the estimated electric field at the
image plane. To do so requires that we map the effect of the DMs on the image plane electric
field, which will rely on the propagation equations derived in Ch. 1. As a result the control
laws derived in this chapter are model based, requiring that we have a good measurement of
critical physical parameters such as aperture sizes, focal lengths, and propagation distances
between critical planes. Most critical is the DM model, which describes the mirror shape as
a function of the applied voltages.
2.1 Monochromatic Wavefront Control
There are a variety of control laws we may choose for this problem. Energy Minimization is
one of the first attempts at a model-based control law; it computes actuator commands that
minimize the intensity in a region of the image plane [49, 11]. This controller proved to be
numerically unstable because the field at the image plane must be inverted to compute the
DM commands. Since this matrix is driven towards zero it comes close to being singular,
making the computed control highly inaccurate. Electric field conjugation regularizes the
inversion by driving the field to a targeted value (typically the theoretical PSF) rather than
attempting to drive the field to zero [23, 24]. This guarantees inversion, but is highly sensitive
to the linearity of the control since it does not regulate the actuator strength. There is also
no guarantee that the targeted field is reachable. Rather than using feedback control in this
manner, we will seek to optimize the control effect in some fashion. Rather than using our
control signal from the DM to minimize the average contrast in our dark hole, $I_{DH}$, we seek
a solution that minimizes the deformation across the DM’s surface under the constraint that
this achieve our targeted contrast level, $10^{-C}$. The algorithm developed by Pueyo et al. [55]
achieves this by minimizing the sum of the squares for all the actuator strengths subject to
the constraint that it achieve a specified average contrast in the area where we seek to create a
dark hole. To compute the effect of a DM’s control signal and the incident aberrations on the
average contrast value in the image plane, we must propagate the resulting pupil plane field
to the image plane via Fourier and Fresnel transformations developed in Ch. 1. Assuming
an aberrated field being added to the nominal field incident on the pupil function, A(u, v),
the pupil plane electric field is given by
\begin{equation}
E_{pup}(u,v) = A(u,v)\left(1 + g(u,v)\right)e^{i\phi(u,v)}, \tag{2.1.1}
\end{equation}
where g(u, v) is the complex aberrated electric field and φ(u, v) is the total phase perturbation
induced by the DM. The DM perturbation will be added to a pre-existing (also referred to as
“nominal”) DM shape, φ0(u, v), about which we will ultimately linearize the phase induced
by the DM. Recall that by applying the propagation factor, Eq. 1.6.27, we can account
for the fact that the DMs are not conjugate to the pupil after we have computed IDH .
Assuming that the phase perturbation from the DM, φ, is adequately small we take a first
order approximation of the exponential. Linearizing the pupil plane electric field about the
phase induced by the nominal DM surface, φ0, we find
\begin{equation}
E_{pup}(u,v) \cong A(u,v)\left(1 + g(u,v)\right)e^{i\phi_0}\left(1 + i(\phi(u,v) - \phi_0)\right). \tag{2.1.2}
\end{equation}
Additionally, we assume that the product g(φ−φ0) is negligible compared to the other terms
since they will both be small and of approximately the same magnitude. Thus
\begin{equation}
E_{pup}(u,v) \cong A(u,v)e^{i\phi_0}\left(1 + g(u,v) + i(\phi(u,v) - \phi_0(u,v))\right). \tag{2.1.3}
\end{equation}
If we assume that we are starting from φ0 = 0, we eliminate the exponential making
\begin{equation}
E_{pup}(u,v) = A(u,v)\left(1 + g(u,v) + i\phi(u,v)\right). \tag{2.1.4}
\end{equation}
Applying Eq. 1.5.25, the linearized form of the image plane electric field about φ0(u, v) = 0
is
\begin{equation}
E_{im}(x,y) = \mathcal{F}\{A(u,v)\} + \mathcal{F}\{A(u,v)g(u,v)\} + i\mathcal{F}\{A(u,v)\phi(u,v)\}. \tag{2.1.5}
\end{equation}
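A quick numerical check can make the validity range of this linearization concrete. The sketch below compares the exact field, A(1 + g)e^{iφ}, with the additive form of Eq. 2.1.5 and confirms that the discrepancy shrinks quadratically as the aberration and DM phase are scaled down together. An FFT is used here only as a stand-in for the transform F, and the random aberration, the square pupil, and the scale values are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32

# Hypothetical square pupil and unit-scale random perturbations.
A = np.zeros((n, n))
A[8:24, 8:24] = 1.0
g_unit = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
phi_unit = rng.standard_normal((n, n))

def image_fields(eps):
    """Return (exact, linearized) image-plane fields for perturbations of size eps."""
    g = eps * g_unit        # complex aberration, as in Eq. 2.1.1
    phi = eps * phi_unit    # DM phase about phi_0 = 0
    exact = np.fft.fft2(A * (1 + g) * np.exp(1j * phi))
    # Additive form of Eq. 2.1.5: three separate transforms.
    linear = np.fft.fft2(A) + np.fft.fft2(A * g) + 1j * np.fft.fft2(A * phi)
    return exact, linear

def linearization_error(eps):
    exact, linear = image_fields(eps)
    return np.linalg.norm(exact - linear)

# Halving the perturbation should cut the error by roughly a factor of four.
ratio = linearization_error(1e-2) / linearization_error(5e-3)
```

The leading neglected terms are −φ²/2 and igφ, both second order, which is why the error ratio sits near four. In practice this suggests monitoring the size of each control step, since the additive model degrades quickly once the commanded phase is no longer small.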
The linearization has simplified our representation of the image plane electric field by
making the effect of each component additive. Potentially the most useful outcome of this
is the flexibility with which we can account for a non-conjugate DM. In this event, we may
generalize the Fourier transform of φ in Eq. 2.1.5 as a linear operator, C{·}, that includes
the propagation of the DM by applying the propagation factor in Eq. 1.6.27. Knowing this
field, we now compute the image plane intensity to be
\begin{equation}
I_{im}(x,y) = \left|\mathcal{C}\{A(u,v)\} + \mathcal{C}\{A(u,v)g(u,v)\} + i\mathcal{C}\{A(u,v)\phi(u,v)\}\right|^2. \tag{2.1.6}
\end{equation}
We then integrate the image plane intensity over the dark hole to find
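\begin{equation}
I_{DH} = \iint_{DH} \left|\mathcal{C}\{A\} + \mathcal{C}\{Ag\} + i\mathcal{C}\{A\phi\}\right|^2 dx\,dy. \tag{2.1.7}
\end{equation}
With the scalar intensity, $I_{DH}$, in place we seek to describe this in a way that will be useful
for control. We first discretize the integral in Eq. 2.1.7 as
\begin{equation}
I_{DH} = \sum_{i,j}\left|\mathcal{C}\{A\}_{(i,j)} + \mathcal{C}\{Ag\}_{(i,j)} + i\mathcal{C}\{A\phi\}_{(i,j)}\right|^2 \Delta x\,\Delta y, \tag{2.1.8}
\end{equation}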
where the pair (i, j) describes a discrete point in the two-dimensional plane. The physical
dimension of each point, (∆x,∆y), matches the pixel size of our detector. We now rewrite
the quadratic sum Eq. 2.1.8 as an inner product:
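\begin{align}
I_{DH} ={}& \langle \mathcal{C}\{A\}, \mathcal{C}\{A\}\rangle + \langle \mathcal{C}\{Ag\}, \mathcal{C}\{Ag\}\rangle + \langle i\mathcal{C}\{A\phi\}, i\mathcal{C}\{A\phi\}\rangle \nonumber\\
&+ 2\Re\{\langle \mathcal{C}\{A\} + \mathcal{C}\{Ag\}, i\mathcal{C}\{A\phi\}\rangle\} + 2\Re\{\langle \mathcal{C}\{A\}, \mathcal{C}\{Ag\}\rangle\}. \tag{2.1.9}
\end{align}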
Eq. 2.1.9 now gives us the ability to consider the contribution of each component to the
intensity. The middle three terms will fully describe the interaction of the DM actuation
with the aberrated field. Conventional wisdom says that by design the coronagraph will
achieve the required contrast levels when there are no aberrations. Thus by design, C{A}
should be negligible compared to the aberrated field and the DM actuation. Consequently,
the effect of < C{A}, C{A} >, < C{A}, C{Ag} >, and < C{A}, iC{Aφ} > on the value of
IDH are negligible compared to the other three terms in Eq. 2.1.9. Here we must note that
efforts are being made to improve coronagraph performance (e.g. IWA and throughput) by
relaxing the contrast level of the nominal PSF to be equal with the amplitude aberrations
present in the system [39]. In this case, the DMs are being used to reach contrast levels
below the PSF (being equivalent to pupil mapping [57, 58]) which requires that this term
be accounted for in the controller. As will be shown in Ch. 3, the estimator includes the
nominal PSF in the state estimate of the image plane electric field. If we compute the scalar
contrast value from a single image this will also include the contribution of the nominal PSF
in the measurement. Thus, we do in fact account for the contribution of the nominal field
in the cost function and the controller will be capable of suppressing beyond the nominal
value of the PSF. Reordering Eq. 2.1.9 so that each term reflects a measurable quantity in
the laboratory, the intensity to be used for our cost function is
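\[
\begin{aligned}
I_{DH} ={}& \langle C\{A\phi\}, C\{A\phi\} \rangle + 2\Re\{\langle C\{A(1+g)\}, iC\{A\phi\} \rangle\} \\
& + \langle C\{A(1+g)\}, C\{A(1+g)\} \rangle.
\end{aligned}
\tag{2.1.10}
\]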
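The regrouping from Eq. 2.1.9 to Eq. 2.1.10 can be written out explicitly. The following is only a sketch of the intermediate steps, assuming the linearity of C{·} stated above and a standard Hermitian inner product:

\[
\begin{aligned}
\langle C\{A\}, C\{A\} \rangle + \langle C\{Ag\}, C\{Ag\} \rangle + 2\Re\{\langle C\{A\}, C\{Ag\} \rangle\}
&= \langle C\{A\} + C\{Ag\}, C\{A\} + C\{Ag\} \rangle \\
&= \langle C\{A(1+g)\}, C\{A(1+g)\} \rangle, \\
\langle iC\{A\phi\}, iC\{A\phi\} \rangle &= |i|^2 \langle C\{A\phi\}, C\{A\phi\} \rangle = \langle C\{A\phi\}, C\{A\phi\} \rangle, \\
2\Re\{\langle C\{A\} + C\{Ag\}, iC\{A\phi\} \rangle\} &= 2\Re\{\langle C\{A(1+g)\}, iC\{A\phi\} \rangle\}.
\end{aligned}
\]

Summing the three right-hand sides reproduces the three terms of Eq. 2.1.10.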
Now we impose a matrix inner product so that we can describe the electric field and the
DM control effect, our state and control variables, as column matrices. These are formed
by stacking the columns of the two-dimensional fields to create a single column matrix, an
approach used throughout this thesis. The state in our control algorithm is the column
matrix describing the aberrated field at the image plane, C{Ag}. However, we have yet to
parameterize the DM commands into a control matrix, u. At the moment we could solve
for C{Aφ} but what we seek are actuation commands for the DM, not its field at the image
plane. One control approach might be to solve for C{Aφ}, compute the inverse transform,
and use an arbitrary physical model of the DM, relating the surface height to the voltage
commands, to compute the actuator strengths. This requires real-time transformations
during the control loop, which makes the algorithm slower and more computationally
demanding. Instead, we will introduce a physical model of the DM so that the
optimization can directly solve for the amplitude of each actuator on the DM. This will put
less demand on the computer, and each control step will be faster. Letting H(x, y) be the
height of the DM surface, the resulting phase perturbation induced by the DM, φ(u, v), is
\[
\phi(u, v) = \frac{2\pi}{\lambda_0} H(x, y). \tag{2.1.11}
\]
For control, we wish to describe DM surface height, H(x, y), as a combination of the two-
dimensional height maps imposed by each actuator. Since we are using a DM with a con-
tinuous face sheet, the contribution of any actuator will be highly localized but will still
deform the entire DM surface. As a result, we must describe the contribution of the qth
actuator as a two-dimensional phase map over the entire plane of the DM surface, hq(x, y).
The continuous membrane means that the combination of all actuators is nonlinear, and very
complicated to compute [9]. However, we operate in an extremely low
stroke regime: we will show in Ch. 6 that even with actuation levels nearly 4 times larger
than our peak-to-valley actuator commands, the combination of actuators is close to linear
(Fig. 6.2). Thus, we can describe H(x, y) as a superposition of hq(x, y) over all actuators,
Nact. The phase contribution of the DM can then be described as
\[
\phi(u, v) = \frac{2\pi}{\lambda_0} \sum_{q=1}^{N_{act}} h_q(u, v). \tag{2.1.12}
\]
Finally, we wish to make our control matrix, u, a column matrix made up of the control
signal from each actuator, uq. To do so we describe hq(u, v) as a characteristic shape with
unitary amplitude, commonly referred to as an influence function, fq(u, v). To find hq(u, v)
we simply multiply fq(u, v) by the control amplitude, aq. Describing hq(u, v) with influence
functions, the phase perturbation induced by the DM is
\[
\phi(u, v) = \frac{2\pi}{\lambda_0} \sum_{q=1}^{N_{act}} a_q f_q(u, v), \tag{2.1.13}
\]
which sums the qth 2-D phase map, or influence function fq(u, v), for all Nact actuators to
reconstruct φ(u, v). The strength of each influence function is determined by aq.
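The superposition in Eq. 2.1.13 can be illustrated with a short Python sketch. The Gaussian influence function, the actuator positions, the width, and the wavelength below are hypothetical stand-ins chosen for illustration; the measured influence functions of the DM are not Gaussian in general.

```python
import math

def gaussian_influence(u, v, uc, vc, width):
    """Hypothetical unit-amplitude influence function centred on (uc, vc)."""
    return math.exp(-((u - uc) ** 2 + (v - vc) ** 2) / (2.0 * width ** 2))

def dm_phase(u, v, amplitudes, centres, width, lam0):
    """phi(u, v) = (2*pi/lam0) * sum_q a_q f_q(u, v), following Eq. 2.1.13."""
    height = sum(a * gaussian_influence(u, v, uc, vc, width)
                 for a, (uc, vc) in zip(amplitudes, centres))
    return 2.0 * math.pi / lam0 * height

lam0 = 633e-9                       # assumed wavelength in metres
centres = [(0.0, 0.0), (1.0, 0.0)]  # two actuators, one pitch apart
amps = [2e-9, -1e-9]                # actuator amplitudes a_q in metres
width = 0.5                         # influence width in units of the pitch

# Phase from both actuators together, and from each actuator on its own.
phi_both = dm_phase(0.3, 0.1, amps, centres, width, lam0)
phi_each = sum(dm_phase(0.3, 0.1, [a], [c], width, lam0)
               for a, c in zip(amps, centres))
print(phi_both, phi_each)
```

The two values agree because the model is linear in the amplitudes a_q, which is exactly the low-stroke assumption made above.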
Substituting Eq. 2.1.13 into Eq. 2.1.10, we write C{Aφ} = C{Af}u, where u is a column
matrix of actuator strengths, u = [a1 . . . ak]T , and f is a matrix describing the perturbation
of each influence function, fq, at the pupil. In this formulation, we are specifically applying
a matrix inner product and C{Af} can be written as a matrix of dimension Npix×Nact. To
simplify the notation we define this matrix as
\[
G = C\{Af\} \in [N_{pix} \times N_{act}]. \tag{2.1.14}
\]
This allows us to write
\[
\langle C\{A\phi\}, C\{A\phi\} \rangle = u^T G^* G u. \tag{2.1.15}
\]
Applying the matrix form of the control amplitudes to Eq. 2.1.10 we find
\[
I_{DH}(\lambda_0) = \frac{4\pi^2}{\lambda_0^2} u^T M_0 u + \frac{4\pi}{\lambda_0} u^T \Im\{b_0\} + d_0. \tag{2.1.16}
\]
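Where
\begin{align}
M_0 &= \langle C\{Af\}, C\{Af\} \rangle = G^* G \tag{2.1.17} \\
b_0 &= \langle C\{A(1+g)\}, C\{Af\} \rangle = G^* C\{A(1+g)\} \tag{2.1.18} \\
d_0 &= \langle C\{A(1+g)\}, C\{A(1+g)\} \rangle = C\{A(1+g)\}^* C\{A(1+g)\}. \tag{2.1.19}
\end{align}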
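To make the role of M0, b0, and d0 concrete, the following Python sketch builds them from a small made-up control effect matrix and checks that the quadratic form of Eq. 2.1.16 matches a direct evaluation of the image plane intensity. The matrix `G`, the field `e0` standing in for C{A(1 + g)}, and the wavelength units are all assumptions for illustration.

```python
import math

lam0 = 1.0                                   # wavelength in arbitrary units
G = [[0.2 + 0.1j, -0.1 + 0.3j],              # made-up C{Af}: 3 pixels x 2 actuators
     [0.0 - 0.2j, 0.4 + 0.0j],
     [0.3 + 0.3j, 0.1 - 0.1j]]
e0 = [0.05 - 0.02j, -0.01 + 0.04j, 0.03 + 0.01j]   # made-up C{A(1+g)}
u = [0.01, -0.02]                            # real actuator amplitudes a_q
n_pix, n_act = len(G), len(G[0])

# Eq. 2.1.17 - 2.1.19: M0 = G* G, b0 = G* e0, d0 = e0* e0.
M0 = [[sum(G[p][i].conjugate() * G[p][j] for p in range(n_pix))
       for j in range(n_act)] for i in range(n_act)]
b0 = [sum(G[p][q].conjugate() * e0[p] for p in range(n_pix))
      for q in range(n_act)]
d0 = sum(abs(e) ** 2 for e in e0)

# Eq. 2.1.16, noting that 4*pi/lam0 = 2k with k = 2*pi/lam0.
k = 2.0 * math.pi / lam0
quad = sum(u[i] * M0[i][j] * u[j]
           for i in range(n_act) for j in range(n_act)).real
I_quadratic = k ** 2 * quad + 2.0 * k * sum(a * b.imag for a, b in zip(u, b0)) + d0

# Direct evaluation: sum over pixels of |e0 + i k (G u)|^2.
Gu = [sum(G[p][q] * u[q] for q in range(n_act)) for p in range(n_pix)]
I_direct = sum(abs(e + 1j * k * g) ** 2 for e, g in zip(e0, Gu))
print(I_quadratic, I_direct)
```

Since u is real, only the real (symmetric) part of M0 contributes to the quadratic term.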
Conceptually, d0 is the intensity contribution from the aberrated field,
b0 is a column matrix representing the interaction of the DM electric field with the aberrated field,
and M0 describes the additive contribution of the DM to the image plane intensity. Having
represented IDH in a quadratic form with regard to a control matrix, we can use Eq. 2.1.16
to produce an optimal control strategy. Recalling that the targeted contrast is 10^{-C}, the
optimization problem in monochromatic light is stated as
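\[
\begin{aligned}
\text{minimize} \quad & \sum_{k=1}^{N} a_k^2 = u^T u \\
\text{subject to} \quad & I_{DH}(\lambda_0) \le 10^{-C}.
\end{aligned}
\tag{2.1.20}
\]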
To solve the optimization problem we create a cost function, J, incorporating the constraint
for the central wavelength into the minimization via a Lagrange multiplier, µ0, which yields
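\[
\begin{aligned}
J &= u^T u + \mu_0 \left( I_{DH} - 10^{-C} \right) \\
  &= u^T u + \mu_0 \left( \frac{4\pi^2}{\lambda_0^2} u^T M_0 u + \frac{4\pi}{\lambda_0} u^T \Im\{b_0\} + d_0 - 10^{-C} \right) \\
J &= u^T \left( I + \mu_0 \frac{4\pi^2}{\lambda_0^2} M_0 \right) u + \mu_0 \frac{4\pi}{\lambda_0} u^T \Im\{b_0\} + \mu_0 \left( d_0 - 10^{-C} \right).
\end{aligned}
\tag{2.1.21}
\]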
The cost function is quadratic in form, guaranteeing a single minimum. Recognizing that
M = M^T, we take the partial derivative to find the actuation that minimizes the cost function.
Evaluating
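\[
\left. \frac{\partial J}{\partial u^T} \right|_{u_{opt}} = 2 \left( I + \mu_0 \frac{4\pi^2}{\lambda_0^2} M \right) u_{opt} + \mu_0 \frac{4\pi}{\lambda_0} \Im\{b\} = 0 \tag{2.1.22}
\]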
and solving for the optimal control input, we find
\[
u_{opt} = -\mu_0 \left( \frac{\lambda_0}{2\pi} I + \mu_0 \frac{2\pi}{\lambda_0} M_0 \right)^{-1} \Im\{b_0\}. \tag{2.1.23}
\]
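The control law of Eq. 2.1.23 can be sketched in a few lines of Python. This is a minimal illustration and not the laboratory implementation: the toy `G` and `e0`, the target value, and the doubling search on µ0 (one simple way to find a multiplier that satisfies the constraint) are all assumptions, and M0 is taken to be real and symmetric because u is real.

```python
import math

def solve(A, rhs):
    """Solve the small real system A x = rhs by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [list(map(float, A[i])) + [float(rhs[i])] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def u_opt(mu, M_real, b_imag, lam0):
    """Eq. 2.1.23: u = -mu ((lam0/2pi) I + mu (2pi/lam0) M)^-1 Im{b}."""
    n, k = len(b_imag), 2.0 * math.pi / lam0
    A = [[(1.0 / k if i == j else 0.0) + mu * k * M_real[i][j]
          for j in range(n)] for i in range(n)]
    return [-mu * x for x in solve(A, b_imag)]

def dark_hole_intensity(u, M_real, b_imag, d, lam0):
    """Eq. 2.1.16 for a real command vector u."""
    n, k = len(u), 2.0 * math.pi / lam0
    quad = sum(u[i] * M_real[i][j] * u[j] for i in range(n) for j in range(n))
    return k * k * quad + 2.0 * k * sum(a * b for a, b in zip(u, b_imag)) + d

# Toy system: real G (3 pixels x 2 actuators) and a made-up aberrated field.
lam0 = 1.0
G = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
e0 = [0.02j, -0.01j, 0.03j]
M_real = [[sum(G[p][i] * G[p][j] for p in range(3)) for j in range(2)]
          for i in range(2)]
b_imag = [sum(G[p][q] * e0[p].imag for p in range(3)) for q in range(2)]
d0 = sum(abs(e) ** 2 for e in e0)

# Larger multipliers buy lower intensity at the cost of more stroke.
intensities = [dark_hole_intensity(u_opt(mu, M_real, b_imag, lam0),
                                   M_real, b_imag, d0, lam0)
               for mu in (0.0, 0.1, 1.0, 10.0, 100.0)]

# Assumed search: double mu until the (hypothetical) target is met.
target, mu = 2e-4, 1e-3
while dark_hole_intensity(u_opt(mu, M_real, b_imag, lam0),
                          M_real, b_imag, d0, lam0) > target:
    mu *= 2.0
u_found = u_opt(mu, M_real, b_imag, lam0)
print(intensities, mu, u_found)
```

In this toy the target must lie above the least-squares residual of the field, otherwise no finite µ0 satisfies the constraint and the loop would not terminate.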
To find the value of uopt, all that is left is to find the value of µ0 that minimizes the cost
function, Eq. 2.1.21. This is typically done via a line search on µ0, evaluating u with
Eq. 2.1.23 for each value of µ0 until Eq. 2.1.21 reaches a minimum. Pueyo et al. [55] have
rigorously shown that this optimization is in fact a quadratic subprogram to the full nonlinear
problem. They have also shown that for a single iteration, the controller is guaranteed to
achieve the targeted contrast level, 10^{-C}, provided the electric field is known perfectly and
the DM control magnitude is small enough that it remains within the bounds of the current
linearization. Since the sub-program is convex, we can reach its global minimum if we re-
linearize about the new DM shape at each iteration of the correction algorithm (or every
time we apply a new DM command). This is computationally expensive, so we tend not to
do this in the experiment. However the nature of the controller is such that the solution
will not deviate dramatically from the optimization we get by re-linearizing each time. Since
the controller is trying to minimize stroke, the control tends to remain in the regime of a
particular linearization for as long as possible. Additionally, the magnitude of actuation is
proportional to the contrast it is trying to suppress. Since each control step is operating
on lower contrast levels the actuation magnitude will also tend to decrease, making the
deviation from the last control shape smaller with each iteration. As an example, Fig. 2.1
shows the evolution of the mode, median, and absolute peak to valley deformation for DM1
and DM2 as a function of the control history. The vast majority of the stroke is used to
eliminate strong aberrations in the first 5 iterations. After this, the extrema and mode
reduce dramatically, frequently using a factor of 10 less stroke than the first iteration. Throughout
the entire control history the median never deviates far from zero, meaning that a DC drift
never develops.

[Figure 2.1, panels (a) DM1 Actuation Per Iteration and (b) DM2 Actuation Per Iteration: height change (nm) versus iteration for the mode, median, and max − min.]
Figure 2.1: Time history of the mode, median, and peak-to-valley actuation levels for DM1 and DM2. The starting contrast is 1 × 10^{-4} and final contrast is 2.3 × 10^{-7} at the 30th iteration of the control algorithm.

Perhaps the best balance is to re-linearize when the contrast performance is
poor (requiring large stroke), and discontinue re-linearizing as the contrast improves (since
the second order term neglected in Eq. 2.1.3 will become less significant). Neither of these
techniques guarantees that we reach the global optimum for the subprogram, but for a space
telescope it is probably worth the computational savings since it will not deviate significantly.
Pueyo et al. [55] also show how to account for multiple DMs at non-conjugate planes. As
shown in §1.6.2, using two DMs in planes non-conjugate to the pupil we can take advantage
of their relative propagation to produce phase induced amplitude distributions at the pupil.
This makes both the real and imaginary parts of the image plane controllable, allowing
the creation of symmetric dark holes about the PSF. Thus, we can generate a dark hole
within the entire search area made available by the coronagraph. By virtue of the linearity
and small value approximations made to reach Eq. 2.1.4 and Eq. 2.1.5, the effect is purely
additive since the cross-terms, or cross-talk, between the DMs will be negligible. Including
the propagation factors from Eq. 1.6.27 into the transform for DM1, C1{·}, and DM2, C2{·},
the image plane electric field with two DMs in series is
\[
E_{im}(x, y) = \mathcal{F}\{A(u, v)(1 + g(u, v))\} + iC_1\{A(u, v)\phi_1\} + iC_2\{A(u, v)\phi_2\}. \tag{2.1.24}
\]
Applying the inner product, we find the intensity at the image plane to be
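\[
\begin{aligned}
I_{DH} ={}& \langle C_1\{A\phi_1\}, C_1\{A\phi_1\} \rangle + \langle C_2\{A\phi_2\}, C_2\{A\phi_2\} \rangle \\
& + \langle C_1\{A\phi_1\}, C_2\{A\phi_2\} \rangle + \langle C_2\{A\phi_2\}, C_1\{A\phi_1\} \rangle \\
& + 2\Re\{\langle C\{A(1+g)\}, iC_1\{A\phi_1\} \rangle\} + 2\Re\{\langle C\{A(1+g)\}, iC_2\{A\phi_2\} \rangle\} \\
& + \langle C\{A(1+g)\}, C\{A(1+g)\} \rangle.
\end{aligned}
\tag{2.1.25}
\]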
Since IDH is a scalar value, we can maintain the same form for Eq. 2.1.16 and Eq. 2.1.21
by simply augmenting the control matrix, u = [u_{DM1} u_{DM2} . . . u_{DMi}]^T. Using the same
superposition principle and influence function set for the second DM, we define their control
effect matrices, GDM1 and GDM2, as
\begin{align}
G_{DM1} &= C_1\{Af_1\} \tag{2.1.26} \\
G_{DM2} &= C_2\{Af_2\}. \tag{2.1.27}
\end{align}
With two DMs, the matrices in the control law given by Eq. 2.1.21 and Eq. 2.1.23 now take
the form
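\begin{align}
M_0 &= \begin{bmatrix} G_{DM1}^* G_{DM1} & G_{DM1}^* G_{DM2} \\ G_{DM2}^* G_{DM1} & G_{DM2}^* G_{DM2} \end{bmatrix} \tag{2.1.28} \\
b_0 &= \begin{bmatrix} G_{DM1}^* C\{A(1+g)\} \\ G_{DM2}^* C\{A(1+g)\} \end{bmatrix} \tag{2.1.29} \\
d_0 &= C\{A(1+g)\}^* C\{A(1+g)\}. \tag{2.1.30}
\end{align}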
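The block structure of Eq. 2.1.28 follows directly from stacking the two control effect matrices side by side. The Python sketch below checks this on made-up matrices (one actuator on DM1, two on DM2, three pixels); the values and the helper name `gram` are illustrative assumptions.

```python
def gram(GA, GB):
    """Return GA^* GB for matrices stored as lists of rows."""
    n_pix = len(GA)
    return [[sum(GA[p][i].conjugate() * GB[p][j] for p in range(n_pix))
             for j in range(len(GB[0]))] for i in range(len(GA[0]))]

# Made-up control effect matrices: rows are pixels, columns are actuators.
G1 = [[0.2 + 0.1j], [0.0 - 0.3j], [0.1 + 0.0j]]
G2 = [[0.1j, 0.3 + 0.0j], [0.2 + 0.2j, -0.1j], [0.0j, 0.4 - 0.1j]]

# Augmented matrix [G_DM1  G_DM2], acting on u = [u_DM1; u_DM2].
G = [row1 + row2 for row1, row2 in zip(G1, G2)]
M0 = gram(G, G)

# The same matrix assembled block by block, as in Eq. 2.1.28.
top = [a + b for a, b in zip(gram(G1, G1), gram(G1, G2))]
bottom = [a + b for a, b in zip(gram(G2, G1), gram(G2, G2))]
M0_blocks = top + bottom
print(M0 == M0_blocks)
```

The off-diagonal blocks carry the coupling between the two DMs in the image plane; nothing else in the control law changes apart from the dimension of u.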
Thus, the computation remains the same regardless of the number of DMs. The only change
is in the dimension of u, M0, and b0, which encode the different propagation factors for each
DM. Using this control law is not equivalent to other multi-DM concepts such as multi-
conjugate AO (MCAO) [52], woofer-tweeter concepts [48], or wide field AO [61]. As was
shown in §1.6, it is the different propagation distances that give two DMs in series their
power in the control algorithm. It is also worth mentioning that in the event of a DM failure
this provides redundancy in the mission, mitigating the risk involved in flying an unproven
technology in space.
Having developed a monochromatic wavefront control algorithm we now have confidence
in its controllability with regard to both amplitude and phase aberrations in the image plane.
Once we are provided with an electric field measurement we may solve the optimal control
problem to suppress the field below a specified level on both sides of the image plane.
2.2 Wavelength Dependence of the Image Plane
The Stroke Minimization algorithm in §2.1 only operates on a single wavelength, λ0, which
means that suppression of the field is neither optimal nor guaranteed over a bandwidth.
Pueyo [59] showed through simulation that the bandwidth must be less than ≈ 1 − 2% of
the central wavelength. Practically, we can also define a “single” wavelength by our image
plane resolution, requiring that the difference in plate scale between the maximum and min-
imum wavelengths is less than a pixel (≈ 0.1 pixel width). However, the primary purpose
of directly imaging an exoplanet is to measure its spectrum. Using the monochromatic control
algorithm, we would have to correct for the aberrations at each wavelength separately
to obtain a full spectrum of the planet, which would take prohibitively long. Instead, we
would like to make the control algorithm effective over a bandwidth. This has the potential
to improve the efficiency of spectral characterization and enables detection in a broadband
image, critical in a photon limited system. Moving to broadband algorithms requires that
we have a good understanding of the nature of coherence from starlight. We already un-
derstand that the star is spatially coherent because of the large propagation distances; the
radius of curvature of the wavefronts from the star is so large and the “point sources” from different
locations on the star are so close that we effectively image parallel, plane wavefronts. We
also understand that the light from a star has extremely short coherence length, requiring
equal path interferometers to interfere the light with itself. What we must address is how
to integrate in wavelength to compute the broadband image. Knowing that the emission of
a star is effectively from random radiators (in that they are not in phase with one another)
we can assume that each wavelength will not interfere with the other. As a result we may
integrate the intensity over wavelength to compute what our broadband image should be,
making it very simple to augment the optimization problem to accommodate the additional
wavelengths. Eq. 2.1.16 shows which terms depend on the specified wavelength in
monochromatic light, λ0. For arbitrary wavelength, λ, the dark hole intensity is
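\[
I_{DH}(\lambda) = w(\lambda) \frac{4\pi^2}{\lambda^2} u^T M_\lambda u + w(\lambda) \frac{4\pi}{\lambda} u^T \Im\{b_\lambda\} + w(\lambda) d_\lambda. \tag{2.2.1}
\]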
We have included a normalization function, w(λ), in Eq. 2.2.1 to account for the fact that
the relative intensity of each wavelength will vary. To simplify the normalization, w(λ) is
defined so that w(λ0) = 1, where λ0 will be centered in the control bandwidth, ∆λ. It is
also important to note that in addition to the chromaticity of the coronagraph (if any) and
aberrations, Mλ, bλ, and dλ vary in wavelength because of the transform, C{·}, making
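\begin{align}
M_\lambda &= \langle G_\lambda, G_\lambda \rangle \tag{2.2.2} \\
b_\lambda &= \langle C_\lambda\{A(1 + g_\lambda)\}, G_\lambda \rangle \tag{2.2.3} \\
d_\lambda &= \langle C_\lambda\{Ag_\lambda\}, C_\lambda\{Ag_\lambda\} \rangle, \tag{2.2.4}
\end{align}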
where Mλ is simply the transformation from u to image plane intensity. Every control effect
matrix, Mλ, may be precomputed for each wavelength, assuming it remains within the linear
regime of the DM shape. However, bλ requires a measurement of the aberrated field to be
computed. Since dλ is the intensity distribution of the aberrated field, this simply requires
an exposure of the aberrated field at that wavelength. With bλ and dλ requiring an estimate
of the current state, we will require more exposures if the field is to be evaluated at multiple
wavelengths.
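The bookkeeping implied by Eq. 2.2.1 can be sketched as follows. The per-wavelength values in `band` are hypothetical: in practice M_λ would be precomputed from the DM model, while b_λ and d_λ would come from a field estimate at each wavelength. Summing the per-wavelength intensities reflects the incoherence assumption made above.

```python
import math

def dark_hole_intensity(u, lam, M_real, b_imag, d, w=1.0):
    """Evaluate Eq. 2.2.1 for a real command vector u at wavelength lam."""
    n = len(u)
    quad = sum(u[i] * M_real[i][j] * u[j] for i in range(n) for j in range(n))
    lin = sum(ui * bi for ui, bi in zip(u, b_imag))
    return w * (4.0 * math.pi ** 2 / lam ** 2 * quad
                + 4.0 * math.pi / lam * lin + d)

# Hypothetical data, wavelengths in units of lam0:
# lam -> (w(lam), Re{M_lam}, Im{b_lam}, d_lam)
band = {
    0.95: (0.9, [[2.2, 1.1], [1.1, 2.2]], [0.052, 0.021], 1.5e-3),
    1.00: (1.0, [[2.0, 1.0], [1.0, 2.0]], [0.050, 0.020], 1.4e-3),
    1.05: (1.1, [[1.8, 0.9], [0.9, 1.8]], [0.048, 0.019], 1.3e-3),
}

u = [-4.0e-3, 5.0e-4]   # an arbitrary small command vector
per_wavelength = {lam: dark_hole_intensity(u, lam, M, b, d, w)
                  for lam, (w, M, b, d) in band.items()}

# Incoherent light: intensities add, fields do not.
broadband = sum(per_wavelength.values())
print(per_wavelength, broadband)
```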
2.3 Continuous Bandwidth Constraint
Knowing the wavelength dependence of the electric field in the image plane, we now seek
an optimization that suppresses the field to a specified contrast level over a bandwidth, ∆λ,
centered about our central wavelength, λ0. Thus, our statement of the problem becomes
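\[
\begin{aligned}
\text{minimize} \quad & \sum_{k=1}^{N_{act}} a_k^2 = u^T u \\
\text{subject to} \quad & \frac{1}{\Delta\lambda} \int_{\lambda_0 - \Delta\lambda/2}^{\lambda_0 + \Delta\lambda/2} I_{DH}(\lambda) \, d\lambda \le 10^{-C}.
\end{aligned}
\tag{2.3.1}
\]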
There are two problems with this formulation. First, we have produced a numerically intractable
problem, where the numerical minimization requires that a continuous integral be
evaluated many times. Second, it requires an electric field measurement (still assumed to
be perfectly known) over the full integral. Since the functional dependence in wavelength is
unknown (despite our assumption to this point that the field is provided to us) this drives the
number of required field measurements to infinity. Thus, the optimization problem should
be solved for a discrete set of N_λ wavelengths, changing the optimization problem to
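\[
\begin{aligned}
\text{minimize} \quad & \sum_{k=1}^{N_{act}} a_k^2 = u^T u \\
\text{subject to} \quad & \frac{1}{N_\lambda} \sum_{i=1}^{N_\lambda} I_{DH}(\lambda_i) \le 10^{-C}
\end{aligned}
\tag{2.3.2}
\]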
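The difference between the continuous constraint of Eq. 2.3.1 and the sampled constraint of Eq. 2.3.2 can be seen on a toy example. The quadratic chromatic residual `toy_intensity` below is purely an assumption for illustration and is not a model of the testbed.

```python
def band_average_integral(intensity, lam0, dlam, n_steps=2000):
    """(1/dlam) * integral of intensity over the band, by the midpoint rule."""
    h = dlam / n_steps
    start = lam0 - dlam / 2.0
    total = sum(intensity(start + (k + 0.5) * h) for k in range(n_steps))
    return total * h / dlam

def band_average_discrete(intensity, wavelengths):
    """(1/N_lambda) * sum_i intensity(lambda_i), the constraint in Eq. 2.3.2."""
    return sum(intensity(lam) for lam in wavelengths) / len(wavelengths)

lam0, dlam = 1.0, 0.1   # a 10% band, wavelengths in units of lam0

def toy_intensity(lam):
    """Made-up residual: best at lam0, degrading quadratically away from it."""
    return 1e-7 + 4e-5 * (lam - lam0) ** 2

continuous = band_average_integral(toy_intensity, lam0, dlam)
three_point = band_average_discrete(toy_intensity, [0.95, 1.00, 1.05])
two_point = band_average_discrete(toy_intensity, [0.95, 1.05])
print(continuous, three_point, two_point)
```

In this particular toy the sampled averages overestimate the band average, but that ordering is a property of the assumed residual; for a residual that peaks between the samples the sampled constraint would be optimistic instead.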
In this formulation there is an implicit assumption that if we constrain a discrete set of
wavelengths to fall under a targeted contrast value this will correspond to suppression of all
wavelengths between them. For example, if we are provided two monochromatic estimates
that bound a bandwidth we can find the cost function for Eq. 2.3.2 in the same manner as
§2.1 to find a set of DM commands that suppress those two wavelengths. However, there is no
guarantee that when we image the entire bandwidth we will maintain the targeted contrast
level. We must discretize the wavelengths in the optimization but we lose our ability to
guarantee suppression over the full bandwidth. In the end, successfully suppressing the
field depends upon the size of the band and the number of discrete wavelengths chosen. To
help guarantee suppression between the optimized wavelengths, we appeal to the concept of
maintaining small phase shifts so that there are no dramatic changes in the wavefront. For
any two wavelengths in the optimization, (λ1, λ2), the bandwidth between them should be
small enough that λ2 < 2λ1. Since we target bandwidths of ∆λ/λ0 = 10 − 20% there isn’t
a dramatic shift in the relative phase for these wavelengths, and our expectation is that the
contrast should be maintained across the entire band.
Looking further, the optimization does not guarantee a particular contrast level for each
wavelength, only that their average be below 10^{-C}. In other words, formulating the problem in
this way does not give us the freedom to weight the contribution of each wavelength to the
optimization. For characterization we need the speckles to be suppressed equally well over
all wavelengths, but suppression at one wavelength may be more important than at others.
For example, if we bound a spectral feature that is much dimmer we would require higher
contrast at this wavelength. While Eq. 2.3.2 is undoubtedly the optimal solution with regard
to suppressing a bandwidth below a particular value, it will not necessarily guarantee our
ability to obtain a spectral measurement. Therefore we will continue to further constrain
the problem to get the desired properties out of the controller.
2.4 Windowed Stroke Minimization
As discussed in section 2.3 we seek to make the problem of correcting over a bandwidth
computationally tractable by discretizing the integral in Eq. 2.3.1 to a summation of finite
wavelengths. We cannot avoid the problem of guaranteeing suppression between wavelengths,
but we can try to guarantee suppression of each wavelength in Eq. 2.3.2 by using multiple
constraints in the optimization. Rather than simply summing the intensities we will impose
a separate constraint for each wavelength. In this formulation, we will choose three wave-
lengths, one at the center of the bandwidth, λ0, and two more providing the boundaries for
the problem, λ1, λ2, to define a window over which the correction will be made. Applying
three separate constraints, the optimization becomes
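\[
\begin{aligned}
\text{minimize} \quad & \sum_{k=1}^{N_{act}} a_k^2 = u^T u \\
\text{subject to:} \quad & I_{DH}(\lambda_0) \le 10^{-C_{\lambda_0}}, \\
& I_{DH}(\lambda_1) \le 10^{-C_{\lambda_1}}, \\
& I_{DH}(\lambda_2) \le 10^{-C_{\lambda_2}} \\
\text{where} \quad & \lambda_1 = \gamma_1 \lambda_0 \\
& \lambda_2 = \gamma_2 \lambda_0.
\end{aligned}
\tag{2.4.1}
\]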
Now we will find the optimal control law by augmenting the minimization with three Lagrange
multipliers, one for each discrete value of the intensity. Following the same procedure as
in §2.1, we write the cost function as
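\[
\begin{aligned}
J ={}& u^T u + \mu_0 \left( I_{DH}(\lambda_0) - 10^{-C_{\lambda_0}} \right) + \mu_1 \left( I_{DH}(\lambda_1) - 10^{-C_{\lambda_1}} \right) + \mu_2 \left( I_{DH}(\lambda_2) - 10^{-C_{\lambda_2}} \right) \\
J ={}& u \left[ I + \frac{4\pi^2}{\lambda_0^2} \left( \mu_0 M_{\lambda_0} + \mu_1 \frac{w(\lambda_1)}{\gamma_1^2} M_{\lambda_1} + \mu_2 \frac{w(\lambda_2)}{\gamma_2^2} M_{\lambda_2} \right) \right] u^T \\
& + \frac{4\pi}{\lambda_0} \left[ \mu_0 \Im\{b_{\lambda_0}\} + \mu_1 \frac{w(\lambda_1)}{\gamma_1} \Im\{b_{\lambda_1}\} + \mu_2 \frac{w(\lambda_2)}{\gamma_2} \Im\{b_{\lambda_2}\} \right] u^T \\
& + \left[ \mu_0 \left( d_{\lambda_0} - 10^{-C_{\lambda_0}} \right) + \mu_1 w(\lambda_1) \left( d_{\lambda_1} - 10^{-C_{\lambda_1}} \right) + \mu_2 w(\lambda_2) \left( d_{\lambda_2} - 10^{-C_{\lambda_2}} \right) \right].
\end{aligned}
\tag{2.4.2}
\]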
Taking the partial derivative of the resulting cost function yields the optimal DM command
for a subset of wavelengths spanning the entire bandwidth ∆λ. As before, the optimal
command is determined by performing a line search on µ. We now have an optimization
across three variables, complicating the task of minimizing the function. We still have the
same problem as in §2.3 that the globally optimal solution in the three dimensional space
(µ0, µ1, µ2) does not necessarily guarantee the targeted suppression at all three wavelengths.
However, with three Lagrange multipliers we can guarantee suppression of all three wave-
lengths by restricting the search to a single dimension. We write the Lagrange multipliers of
the two bounding wavelengths as weighted values of the first so that µ1 = δ1µ0 and µ2 = δ2µ0.
Applying this relationship, Eq. 2.4.2 becomes
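\[
\begin{aligned}
J ={}& u \left[ I + \mu_0 \frac{4\pi^2}{\lambda_0^2} \left( M_{\lambda_0} + \delta_1 \frac{w(\lambda_1)}{\gamma_1^2} M_{\lambda_1} + \delta_2 \frac{w(\lambda_2)}{\gamma_2^2} M_{\lambda_2} \right) \right] u^T \\
& + \mu_0 \frac{4\pi}{\lambda_0} \left[ \Im\{b_{\lambda_0}\} + \delta_1 \frac{w(\lambda_1)}{\gamma_1} \Im\{b_{\lambda_1}\} + \delta_2 \frac{w(\lambda_2)}{\gamma_2} \Im\{b_{\lambda_2}\} \right] u^T \\
& + \mu_0 \left[ \left( d_{\lambda_0} - 10^{-C_{\lambda_0}} \right) + \delta_1 w(\lambda_1) \left( d_{\lambda_1} - 10^{-C_{\lambda_1}} \right) + \delta_2 w(\lambda_2) \left( d_{\lambda_2} - 10^{-C_{\lambda_2}} \right) \right].
\end{aligned}
\tag{2.4.3}
\]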
Taking the partial derivative and evaluating at zero,
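\[
\begin{aligned}
\left. \frac{\partial J}{\partial u^T} \right|_{u_{opt}} ={}& 0 \\
={}& u_{opt} \left[ 2I + 2\mu_0 \frac{4\pi^2}{\lambda_0^2} \left( M_{\lambda_0} + \delta_1 \frac{w(\lambda_1)}{\gamma_1^2} M_{\lambda_1} + \delta_2 \frac{w(\lambda_2)}{\gamma_2^2} M_{\lambda_2} \right) \right] \\
& + \mu_0 \frac{4\pi}{\lambda_0} \left[ \Im\{b_{\lambda_0}\} + \delta_1 \frac{w(\lambda_1)}{\gamma_1} \Im\{b_{\lambda_1}\} + \delta_2 \frac{w(\lambda_2)}{\gamma_2} \Im\{b_{\lambda_2}\} \right],
\end{aligned}
\tag{2.4.4}
\]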
we can use the value of µ0 that minimizes Eq. 2.4.3 to compute the following optimal com-
mand:
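\[
\begin{aligned}
u_{opt} ={}& -\mu_0 \left[ \Im\{b_{\lambda_0}\} + \delta_1 \frac{w(\lambda_1)}{\gamma_1} \Im\{b_{\lambda_1}\} + \delta_2 \frac{w(\lambda_2)}{\gamma_2} \Im\{b_{\lambda_2}\} \right] \cdot \\
& \left[ \frac{\lambda_0}{2\pi} I + \mu_0 \frac{2\pi}{\lambda_0} \left( M_{\lambda_0} + \delta_1 \frac{w(\lambda_1)}{\gamma_1^2} M_{\lambda_1} + \delta_2 \frac{w(\lambda_2)}{\gamma_2^2} M_{\lambda_2} \right) \right]^{-1}.
\end{aligned}
\tag{2.4.5}
\]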
By parameterizing the three Lagrange multipliers we can weigh their effect on the cost
function, thus allowing us to control the degree to which each constraint is satisfied. If
we choose δ1 = δ2 = 1 we have made each contrast target equally important, making the
problem equivalent to solving Eq. 2.3.2. We can also control the degree to which achieving
the bandwidth affects the optimization. If we need a very soft correction outside of the
central wavelength then we can choose δ1 and δ2 to be less than one. We may also find
that we need to preferentially weight one side of the bandwidth to accommodate variance in
absorption and emission from the planet.
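The effect of the weights (δ1, δ2) on the command of Eq. 2.4.5 can be explored with a small Python sketch. It is written with u as a column vector, which for symmetric real matrices gives the same numbers as the product in Eq. 2.4.5. The per-wavelength matrices, the weights w(λ), and the value of µ0 are made-up illustrative inputs.

```python
import math

def solve(A, rhs):
    """Solve the small real system A x = rhs by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [list(map(float, A[i])) + [float(rhs[i])] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def windowed_u_opt(mu0, lam0, terms):
    """Eq. 2.4.5. Each term is (delta, w, gamma, Re{M_lam}, Im{b_lam});
    the central wavelength uses delta = w = gamma = 1."""
    n, k = len(terms[0][4]), 2.0 * math.pi / lam0
    M_sum = [[sum(dl * w / g ** 2 * M[i][j] for dl, w, g, M, _ in terms)
              for j in range(n)] for i in range(n)]
    b_sum = [sum(dl * w / g * b[i] for dl, w, g, _, b in terms)
             for i in range(n)]
    A = [[(1.0 / k if i == j else 0.0) + mu0 * k * M_sum[i][j]
          for j in range(n)] for i in range(n)]
    return [-mu0 * x for x in solve(A, b_sum)]

def with_delta(term, delta):
    """Return a copy of a wavelength term with its weight delta replaced."""
    return (delta,) + term[1:]

lam0 = 1.0
centre = (1.0, 1.0, 1.00, [[2.0, 1.0], [1.0, 2.0]], [0.050, 0.020])
lower = (1.0, 0.9, 0.95, [[2.2, 1.1], [1.1, 2.2]], [0.052, 0.021])
upper = (1.0, 1.1, 1.05, [[1.8, 0.9], [0.9, 1.8]], [0.048, 0.019])

u_equal = windowed_u_opt(1.0, lam0, [centre, lower, upper])
u_soft = windowed_u_opt(1.0, lam0, [centre, with_delta(lower, 0.5),
                                    with_delta(upper, 0.5)])
u_mono = windowed_u_opt(1.0, lam0, [centre, with_delta(lower, 0.0),
                                    with_delta(upper, 0.0)])
print(u_equal, u_soft, u_mono)
```

Setting both weights to zero recovers the monochromatic command of Eq. 2.1.23, which gives a convenient check when tuning (δ1, δ2).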
Eq. 2.4.3 and Eq. 2.4.5 have simplified the problem of optimal broadband correction by
writing a cost function with a single Lagrange multiplier, while leaving the degree of freedom
to weight the required performance of each wavelength. This parameterization constrains
the path of the original 3D optimization to lie along a vector. The direction of the vector
is arbitrary, set by the user based on the values chosen for (δ1, δ2). To evaluate this cost
function we must account for the wavelength dependence of each matrix, which triples our
computational cost. The matrices describing the impact of the DMs on the electric field,
Mλ, can be precomputed since this is simply a linear map from unitary DM actuation to
image plane intensity. The only difference between the three is the wavelength used in the
transform, Cλ{·}. They only need to be re-evaluated when the system is re-linearized. Since
bλ and dλ require a measurement of the current electric field, this cannot be pre-computed
and must be measured for each wavelength.
Practically, a windowed optimization allows us to seek commands to suppress over a
given filter set in the instrument. In the most conservative case the bounding wavelengths
would be the edges of the filter. In a more aggressive control mode the window could be
chosen to span the smallest and largest wavelengths across a set of filters. The disadvantage
of this approach is that it does not solve the problem of guaranteeing correction between
the intermediate wavelengths. This does, however, provide the freedom to arbitrarily weight
the contrast performance separately for each wavelength. This degree of freedom is actually
rather useful, since the relative intensity of the planet to its parent star is a function of
wavelength. For example, in the visible spectrum Earth is 10^{-10} times dimmer than the sun
but in the infrared it is only 10^{-6} times dimmer, and Des Marais et al. [17] show that we
can expect order of magnitude fluctuations in the reflectance of a terrestrial body with an
atmosphere within a ≈20% band. Thus we can use our weighting values (δ1, δ2) to relax the
DM commands with respect to wavelengths that we do not expect to require such stringent
contrast levels, purely based on the blackbody spectrum of the parent star. With regard
to the control problem, we will find in the following sections that there is a significant
amount of error introduced in the estimates at bounding wavelengths, regardless of whether
they are provided by a direct estimate or an extrapolation method (§ 2.5). The ability to
underweight the bounding wavelengths in the cost function gives us the ability to soften
the effect of those errors. In fact, we often found the best performance from the results in
Ch. 5 when the bounding wavelengths were slightly underweighted. Noting this behavior,
an interesting adaptive control scheme would be to modulate δ1 and δ2 based on an estimate
of the error introduced by the extrapolated fields at the bounding wavelengths. This could
go so far as using these values to adjust the functional form of the extrapolation that we
will derive in §2.5.
2.5 Extrapolating Estimates in Wavelength
The Windowed Stroke Minimization algorithm of §2.4 solves the bandwidth problem, but
its implementation can be quite complicated because its wavelength dependence requires
multiple transforms and field estimates. The wavelength dependent matrices, bλ and dλ,
represent the component of the electric field from coupling between aberrations and the
DM-induced perturbations and the intensity distribution of the aberrated field respectively.
Computing both requires an estimate of the electric field at each iteration of the quadratic
subprogram. In FPWC the estimate is what drives the correction time, not the controller.
If we could measure the field directly, one iteration of the correction algorithm would only
require one image per wavelength to measure the field plus one more to see the control effect.
However, as will be discussed in Ch. 3, the electric field cannot be measured directly but
must be estimated. This involves taking many exposures to estimate the field (the number
of which is a major topic in Ch. 3), between two and eight exposures per wavelength. Once
provided with an electric field estimate, we still only require one image to measure the
control effect. As such, we only save one exposure per wavelength (two for Windowed Stroke
Minimization) if we have to directly estimate each field. Thus we gain very little in controller
performance by going to such a complicated algorithm. In terms of correction efficiency we
would almost be better off correcting each wavelength individually using the monochromatic
control law in §2.1.
If we can eliminate our need to estimate every wavelength, we can reduce the number
of exposures per iteration by as much as ≈ 60%. We still require a field estimate at a
single wavelength, but from that we will attempt to extrapolate what the field is at other
wavelengths. To do this, we will first make some assumptions about the electric field to
write a functional relationship describing how the aberrated field evolves in wavelength as
it deviates from the estimate at the central wavelength, λ0. To include the wavelength
dependence of the transform, Cλ{·}, we will characterize the variance of the aberrations at
the pupil plane. An arbitrary aberrated field at the pupil may be described as
\[
E_{pup,abr} = \alpha(u, v, \lambda) e^{i\beta(u, v, \lambda)}. \tag{2.5.1}
\]
For the sake of computational efficiency, we will describe the functional form of α(u, v, λ)
and β(u, v, λ) by assuming that the errors are induced by optics and that these errors are
effectively located at the pupil plane. Uniform amplitude variation in wavelength (such
as the change in reflectivity of a mirror as a function of wavelength) will be absorbed by
the normalization of the field, since it is analogous to intensity fluctuation in wavelength.
Thus α(u, v, λ) describes the spatial variation of amplitude across the pupil as a function
of wavelength. If we assume the amplitude variations due to system optics are analogous
to reflectivity variations, i.e. coating errors across the surface of the mirror, the amplitude
aberrations become independent of wavelength, making Eq. 2.5.1
\[
E_{pup,abr} = \alpha(u, v) e^{i\beta(u, v, \lambda)}. \tag{2.5.2}
\]
This assumption does not necessarily hold in the presence of reflectivity variations that exist
at non-conjugate planes because of phase induced amplitude errors that arrive at the pupil.
We acknowledge this as a limitation, but we will keep the assumption to maintain a simple
functional relationship since this computation must be made during the control loop. Our
task will be to see how effective this assumption is in the experiment.
Moving to β(u, v, λ), we assume that the phase errors are from shaping errors within
the system optics. We assume that these errors exist at the pupil plane. Assuming a
particular height perturbation to the shape of the optic, h(u, v), we can write the phase errors
β as
\[
\beta(u, v, \lambda) = \frac{2\pi}{\lambda} h(u, v). \tag{2.5.3}
\]
This means that the phase errors in the pupil plane are inversely proportional to wavelength.
This makes intuitive sense since a fixed perturbation induced by an optic applies a smaller
phase disturbance as the wavelength increases. We define the incident phase at our estimated
wavelength, λ0, as
\[
\beta_0(u, v) = \frac{2\pi}{\lambda_0} h(u, v). \tag{2.5.4}
\]
We can rewrite Eq. 2.5.3 as a function of β0(u, v), λ, and λ0. The phase perturbation at the
pupil then becomes
\[
\begin{aligned}
\beta(u, v, \lambda) &= f(\beta_0(u, v), \lambda) \\
&= \frac{\lambda_0}{\lambda} \beta_0(u, v).
\end{aligned}
\tag{2.5.5}
\]
Applying Eq. 2.5.5 and Eq. 2.5.2 to Eq. 2.5.1, we find that the wavelength dependent
aberrations at the pupil can be approximated as
\[
E_{pup,abr}(u, v, \lambda_0) = \alpha(u, v) e^{i \frac{\lambda_0}{\lambda} \beta_0(u, v)}. \tag{2.5.6}
\]
Assuming an estimate of the electric field at the image plane, Eest(x, y, λ0), that is only
from pupil plane aberrations, our estimate of the pupil plane aberrations is given by
\[
g_0(u, v) = \mathcal{F}^{-1}_{\lambda_0}\{E_{est}(x, y, \lambda_0)\}. \tag{2.5.7}
\]
We now equate g0(u, v) to Eq. 2.5.6, making the phase of the pupil estimate
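\[
\begin{aligned}
e^{i\beta_0} &= \frac{g_0}{\alpha} \\
i\beta_0 &= \ln\left( \frac{g_0}{\alpha} \right).
\end{aligned}
\tag{2.5.8}
\]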
Applying Eq. 2.5.5, we shift the phase found in Eq. 2.5.8 to get
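\[
\begin{aligned}
i\beta_\lambda &= i \frac{\lambda_0}{\lambda} \beta_0 = \frac{\lambda_0}{\lambda} \ln\left( \frac{g_0}{\alpha} \right) \\
\alpha(u, v) e^{i\beta_\lambda} &= \alpha e^{\frac{\lambda_0}{\lambda} \ln\left( \frac{g_0}{\alpha} \right)} \\
&= \alpha e^{\frac{\lambda_0}{\lambda} \left( \ln(g_0) - \ln(\alpha) \right)} \\
&= \alpha^{1 - \frac{\lambda_0}{\lambda}} g_0^{\frac{\lambda_0}{\lambda}} \\
\alpha(u, v) e^{i\beta_\lambda} &= \frac{g_0^{\lambda_0/\lambda}}{|g_0|^{\lambda_0/\lambda - 1}}.
\end{aligned}
\tag{2.5.9}
\]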
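The pointwise scaling in Eq. 2.5.9 is easy to check numerically for a single pupil sample. In the Python sketch below the amplitude, phase, and wavelengths are made-up values, and `extrapolate_pupil` is a hypothetical helper name. The complex power uses the principal branch, so the sketch implicitly assumes the phase at λ0 is small enough not to wrap, and that g0 is nonzero.

```python
import cmath

def extrapolate_pupil(g0, lam0, lam):
    """Eq. 2.5.9 for one pupil sample: g0**(lam0/lam) / |g0|**(lam0/lam - 1)."""
    p = lam0 / lam
    return g0 ** p / abs(g0) ** (p - 1.0)

lam0, lam = 633e-9, 600e-9        # assumed central and target wavelengths (m)
alpha, beta0 = 0.98, 0.2          # made-up amplitude and phase (rad) at lam0
g0 = alpha * cmath.exp(1j * beta0)

g_lam = extrapolate_pupil(g0, lam0, lam)

# What Eq. 2.5.2 and Eq. 2.5.5 predict: same amplitude, phase scaled by lam0/lam.
expected = alpha * cmath.exp(1j * beta0 * lam0 / lam)
print(g_lam, expected)
```

The amplitude is unchanged and only the phase is rescaled, which is the wavelength-independent amplitude assumption of Eq. 2.5.2 in action.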
Reapplying the linear transform for the new wavelength, Fλ{·}, we compute the extrapolated
field from λ0 to λ to be
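\[
E_{extrap}(x, y, \lambda) = \mathcal{F}_\lambda \left\{ \frac{ \mathcal{F}^{-1}_{\lambda_0}\{E_{est}(x, y, \lambda_0)\}^{\lambda_0/\lambda} }{ \left| \mathcal{F}^{-1}_{\lambda_0}\{E_{est}(x, y, \lambda_0)\} \right|^{\lambda_0/\lambda - 1} } \right\}. \tag{2.5.10}
\]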
Using Eq. 2.5.10 we now have the ability to extrapolate an estimate made at λ0 to bounding
wavelengths. To minimize the bandwidth between estimates, we choose to estimate at the
central wavelength and extrapolate the field for the bounding wavelengths. In applying
the extrapolation, we run into a numerical complication. Since the estimate is finite, the
inverse transform of its shape will convolve with the field we seek, imposing itself on the
extrapolation. Fortunately the area being estimated and controlled is typically smaller than
the image, so we can mitigate the effect by filling in the unknown area with the square
root of the intensity found in the image measuring the control effect. Not knowing the
phase, we are only adding partial information in this region but it serves to soften the
effect of the finite area on the extrapolation. Fig. 2.2 shows an example of an estimate
extrapolation over a ∆λ = 10% bandwidth. Overlaid on the images are boxes defining the
estimation area, inside of which a complex field is provided in Fig. 2.2(b). We then use
Eq. 2.5.10 to compute the field at the bounding wavelengths, Fig. 2.2(a) and Fig. 2.2(c).
[Figure 2.2, panels (a) |Eextrap(λ1)|^2, Lower Wavelength; (b) |Eest(λ0)|^2, Central Wavelength; (c) |Eextrap(λ2)|^2, Upper Wavelength. Both axes are in λ0/D from −10 to 10; the colour scale runs from −6 to −4.]
Figure 2.2: Example of wavelength extrapolation using Eq. 2.5.10 in a bandwidth of ∆λ = 10%. The central wavelength (b) is used to extrapolate the field at a wavelength at the bottom (a) and top (c) of the window. The evolution of the aberrations is more complicated than a physical scaling law.
The extrapolated field estimates for the upper and lower wavelengths are then taken from
the area inside the boxes that defines the correction area. At this stage, it is possible to
mitigate our uncertainty in the phase outside of the estimation area with a Gerchberg-
Saxton like algorithm. We would recursively transform to each wavelength, replacing the
areas outside of the estimate with the newly computed complex field. However, the search
area defined by the image plane mask and the camera resolution may limit the accuracy
of a Gerchberg-Saxton loop. These algorithms have been shown to require upwards of 200
cycles, each involving multiple 2D Fourier transforms, and are very costly with regard to
computation time [40]. Additionally, the level of accuracy gained by such a technique is lost
by the time evolution of the aberrations. This would be a function of computational power
and the time scale of the aberrations. In a space observatory the computational power is
low and on a ground telescope the speckles evolve quickly, making speckle evolution a very
real possibility in either scenario. The idea is certainly worth pursuing in a highly stable
laboratory environment to test if it can improve performance, but is unlikely to be of benefit
as a true observation mode.
By extrapolating (bλ1 , dλ1), (bλ2 , dλ2), Eq. 2.5.10 provides all the necessary information for
the Windowed Stroke Minimization algorithm, Eq. 2.4.3 and Eq. 2.4.5. We can now attempt
broadband suppression using only a single monochromatic estimate. However, the simpli-
fication made for both amplitude and phase requires that errors in non-conjugate planes
have a negligible effect on the aberrations in the pupil. Reflectivity variation across any
mirror is generally very low since chemical vapor deposition is a very stable and reliable
process. These errors are typically so low that they only become a limiting factor in extreme
interferometric problems where the null must be very deep, such as the visible nuller coro-
nagraph [46]. In this case the variations in amplitude from non-conjugate surfaces, such as
the DMs (Fig. 1.3), will be negligible and Eq. 2.5.5 will be relatively accurate. However, by
virtue of the fact that we use two DMs to correct amplitude via the propagation of phase
deformations we cannot say the same for the assumption made on amplitude. The phase
induced amplitude aberrations from DM1 and DM2 are significant due to the large nominal
phase errors present on these surfaces. Accounting for these errors will add higher order
wavelength dependence to α(u, v, λ), complicating the form of the transformation we found
in Eq. 2.5.10. Since the spatial frequencies of these aberrations are mostly of very high order
we will continue with our original assumption and hope that this error does not contribute
significantly at the low spatial frequencies we are considering.
2.6 Chapter Assumptions
The following assumptions were made in the derivations of this chapter.
§2.1 - Monochromatic Wavefront Control:
• Linear approximation made for the DM field.
• The control effect relies on the angular spectrum factor when a DM is non-conjugate
to the pupil.
• g(u, v) and φ(u, v) are both small and the product gφ is negligible.
• For clarity, the control laws are written in a form that assumes φ0 = 0.
• By design, < C{A}, C{A} >, < C{A}, C{Ag} >, and < C{A}, iC{Aφ} > are negligible.
• The DM response is small enough that its response to voltage is linear, and superpo-
sition of influence functions holds.
• Re-linearization of the control matrices is not necessary at each control step because
of the rapid reduction of actuation levels during control.
• The effect of multiple DMs is additive, a consequence of the first three assumptions.
§2.3,§2.4,§2.5 - Broadband Wavefront Control and Extrapolation:
• The bandwidth, ∆λ/λ0, is small enough that wavelengths inside those constraining the
controller will also be suppressed.
• Amplitude variations are wavelength independent and fixed to the pupil plane.
• Phase variations scale as λ0/λ and are fixed to the pupil plane.
Chapter 3
Batch Process Electric Field
Estimation
The control algorithms developed in Ch. 2 require that we provide an estimate of the electric
field at the image plane. The accuracy and precision that focal plane
wavefront correction demands of the electric field estimate generally preclude using a
separate wavefront sensor because it introduces non-common path errors. However, the
final science camera is only capable of imaging the magnitude squared of the electric field.
All phase information of the complex field is lost. Therefore we must modulate the field
in some manner to make both amplitude and phase observable at the science detector.
To accomplish this, there are generally two levels of algorithms that have been developed.
Nonlinear estimation schemes based on Gerchberg-Saxton algorithms can accommodate large
phase deformations (many multiples of the wavelength), but their uncertainty is too large
for the controllers of Ch. 2 to reach extremely high levels of contrast [40, 20, 10, 60, 3].
To create dark holes in high contrast images we require the second type, high precision
estimation schemes that only operate in the regime of small phase perturbations. One
approach is to image multiple planes and converge on the estimate using a more precise
version of a Gerchberg-Saxton type algorithm [40, 18, 19, 22]. Another approach is to use
algorithms that modulate the aberrations with the deformable mirror itself. This requires
a model accurate and precise enough to predict the DM’s effect on the image plane at
intensity levels equal to or lower than our desired contrast level, and must be measured
fast enough for the control to be effective. The speed is directly tied to the stability of the
field, which is in turn dependent on the instrument stability. As will be shown in §6.5, the
stability must be quantified as a function of the contrast and we will see that this affects the
performance of our broadband experiments in Ch. 5. For precision estimation, we have used
the DM-Diversity estimation scheme as a baseline to provide the electric field estimate to the
stroke minimization controller [11, 23] because of its widespread use and success in multiple
laboratories [7, 24, 28]. This chapter derives the algorithm and addresses its advantages and
limitations.
3.1 Linearity of the Electric Field
To produce the model for this estimation scheme, we begin as we did in Ch. 2 with a model
relating the electric field at the DM/pupil plane to the electric field at the image plane. Using
an arbitrary linear operator, C{·}, to account for the DM being at a plane non-conjugate to
the pupil we rewrite Eq. 2.1.5 as
Eim(x, y) = C{A(u, v)}+ C{A(u, v)g(u, v)}+ iC{A(u, v)φ(u, v)}. (3.1.1)
In the end we will still use matrix forms to compute the intensity distribution, but rather
than applying a matrix inner product to describe a single scalar value as in Eq. 2.1.9, we seek
the intensity at each pixel in the image. This requires calculating the magnitude squared of
each element in the image, so we will be evaluating the inner product for each scalar value in
the image. Given a particular DM shape, +φ, the intensity distribution at the image plane
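is given by
I^{+} = \langle C\{A\}, C\{A\} \rangle + \langle C\{Ag\}, C\{Ag\} \rangle + \langle iC\{A\phi\}, iC\{A\phi\} \rangle
+ 2\Re\{\langle C\{A\} + C\{Ag\}, iC\{A\phi\} \rangle\} + 2\Re\{\langle C\{A\}, C\{Ag\} \rangle\}. (3.1.2)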
We can now describe the interaction of DM actuation and aberrations over the entire control
area where we intend to create a dark hole. The only approximation made in this intensity
distribution is the linearization of the DM shape. Thus, while we have not actually lin-
earized about the aberrated field it must be small enough that the second order term in the
linearization used to produce Eq. 2.1.5 is negligible (in a single control step). Correspond-
ingly, we will find in the following sections that the estimate of the aberrated field is directly
dependent on the linearization of the DM shape.
3.2 Pairwise Images
In Ch. 2 we linearized about the DM shape so that we might create a quadratic cost function
to solve for the optimal control law provided a field. The goal of this chapter is to estimate
the electric field in the image plane given an image described by Eq. 3.1.2. We will do this
by modulating the DM and measuring the effect on the intensity distribution of the image
plane. The additive component of the DM in Eq. 3.1.2 is of no help, but the cross term of the
DM effect with the aberrated field will tell us how the modulation of the DM interacts with
the aberrated field to change the intensity distribution. To make this interaction the sole
observable quantity, we must eliminate all the other terms since they will add bias and noise
to the measurement. As pointed out by Borde and Traub [11], we cannot simply subtract
the image taken prior to applying the probe (making φ = 0) because this does not eliminate
the additive component of φ. Instead, following their approach, we also apply the negative
of the DM shape, −φ, for which the intensity distribution becomes
I^{-} = \langle C\{A\}, C\{A\} \rangle + \langle C\{Ag\}, C\{Ag\} \rangle + \langle iC\{A\phi\}, iC\{A\phi\} \rangle
- 2\Re\{\langle C\{A\} + C\{Ag\}, iC\{A\phi\} \rangle\} + 2\Re\{\langle C\{A\}, C\{Ag\} \rangle\}. (3.2.1)
If we subtract Eq. 3.2.1 from Eq. 3.1.2 we find the residual to be
I^{+} - I^{-} = 4\Re\{\langle C\{A\} + C\{Ag\}, iC\{A\phi\} \rangle\}. (3.2.2)
Thus taking difference images leaves us with the product of the DM probe field, C{Aφ}, with
the aberrated and nominal field. Difference imaging has the added benefit of removing any
static incoherent light sources, such as detector bias, stray light, and planet light, leaving
only the coherent component of the field to be measured. The only residual left in the
measurement is the interaction of the DM probe with the nominal field, < C{A}, iC{Aφ} >.
When the aberrations are much larger than the nominal field this will have a negligible
effect. Once the aberrations have been suppressed to a level close to that of the nominal
field this will become significant. However, recent progress showing that wavefront control
is equivalent to computing the profile for a pupil mapping coronagraph [30, 73] has shown
that we should in principle be capable of using our control to go below the nominal field the
coronagraph is designed for [39]. In this case it is not a residual because we want to include
this component of the field in the estimate so we can suppress it.
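The cancellation in Eq. 3.2.2 is easy to check numerically. The sketch below assumes NumPy and uses randomly generated per-pixel fields as stand-ins: `E_ab` plays the role of C{A} + C{Ag}, `probe` plays the role of iC{Aφ}, and `incoherent` is a hypothetical static background. None of these values come from the experiment; they only illustrate why the conjugate pair isolates the cross term while a single probed image does not.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pix = 16

# Hypothetical per-pixel fields: E_ab stands in for C{A} + C{Ag},
# probe stands in for i C{A phi}.
E_ab = 1e-3 * (rng.standard_normal(n_pix) + 1j * rng.standard_normal(n_pix))
probe = 1e-3 * (rng.standard_normal(n_pix) + 1j * rng.standard_normal(n_pix))
incoherent = 1e-6 * rng.random(n_pix)   # static background, e.g. stray light

I_zero = np.abs(E_ab) ** 2 + incoherent            # no probe applied
I_plus = np.abs(E_ab + probe) ** 2 + incoherent    # probe +phi
I_minus = np.abs(E_ab - probe) ** 2 + incoherent   # probe -phi

# Per-pixel real part of the inner product <E_ab, probe>.
cross = np.real(np.conj(E_ab) * probe)

pair_difference = I_plus - I_minus    # Eq. 3.2.2: equals 4 * cross
naive_difference = I_plus - I_zero    # still contains |probe|^2
```

Here `pair_difference` matches `4 * cross` to rounding error, whereas `naive_difference` equals `2 * cross + abs(probe)**2`, so the additive probe term survives when only the unprobed image is subtracted.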
3.3 DM Diversity: Batch Process Estimation
The final step is to manipulate Eq. 3.2.2 to separate out the aberrated field in a matrix
form. To do so, we recognize that we only want the real part of the scalar inner product
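between the two quantities and rewrite Eq. 3.2.2 as
I^{+} - I^{-} = 4\left(\Re\{C\{A\} + C\{Ag\}\} \cdot \Re\{iC\{A\phi\}\} + \Im\{C\{A\} + C\{Ag\}\} \cdot \Im\{iC\{A\phi\}\}\right)
= 4 \begin{bmatrix} \Re\{iC\{A\phi\}\} & \Im\{iC\{A\phi\}\} \end{bmatrix} \begin{bmatrix} \Re\{C\{A\} + C\{Ag\}\} \\ \Im\{C\{A\} + C\{Ag\}\} \end{bmatrix}. (3.3.1)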
Eq. 3.3.1 separates the probe and aberrated fields into independent matrices but this equation
alone still leaves us with an underdetermined system, meaning the solution is non-unique.
The estimate will have a minimal norm, ||x||min, solution via the right pseudo-inverse instead
of providing an estimate with minimal least-squares error. To complete the DM-Diversity
estimator developed by Give’on et al. [23], we must produce an overdetermined system so
that we can write it as an unweighted batch process that produces an estimate of the aber-
rated field at the current control iteration with least-squares minimal error. The linearized
interaction of the DM probe and the aberrated field, Eq. 3.3.1, can be augmented by taking
multiple difference images using j pre-determined shapes. The image I+j is taken with one
deformable mirror shape, φj, while I−j is the image taken with the negative of that shape,
−φj, applied to the deformable mirror. The difference of each conjugate pair is then used to
construct a matrix of noisy measurements,
z = \begin{bmatrix} I_1^{+} - I_1^{-} \\ \vdots \\ I_j^{+} - I_j^{-} \end{bmatrix}. (3.3.2)
Defining x as the image plane electric field state, we write z as a linear equation in x and
include additive noise, n,
z = Hx+ n (3.3.3)
which defines H as the observation matrix that relates the observed quantity to the state
we seek to estimate. By writing x as the real and imaginary parts of the electric field at a
specific pixel,
x = \begin{bmatrix} \Re\{C\{Ag\}\} \\ \Im\{C\{Ag\}\} \end{bmatrix}, (3.3.4)
we can construct H so that it contains the real and imaginary parts of the jth DM pertur-
bation, C{Aφj}, in each row. With multiple pairs of images it takes the form
H = 4 \begin{bmatrix} \Re\{C\{A\phi_1\}\} & \Im\{C\{A\phi_1\}\} \\ \vdots & \vdots \\ \Re\{C\{A\phi_j\}\} & \Im\{C\{A\phi_j\}\} \end{bmatrix}. (3.3.5)
The product Hx will then match the intensity distribution in the measurement z. With at
least three measurements, j ≥ 3, we can take a left pseudo-inverse to solve for the estimate
of the real and imaginary parts of the aberrated field at each pixel in the image plane with
least-squares minimal error:
\hat{x} = (H^T H)^{-1} H^T z. (3.3.6)
To write the system in full matrix form, the state x is stacked vertically for each pixel and
the observation matrix for a single pixel, H, is ordered into a larger block diagonal matrix.
In most cases, there are enough pixels in the dark hole that the dimension of H becomes very
large and too cumbersome for most mathematical programs to handle the matrix inverse.
So we must construct H as shown here to construct x pixel by pixel so that enough memory
is left to manage the experiment.
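The pixel-by-pixel solve described above can be sketched as follows. This is a minimal illustration assuming NumPy; the function name `estimate_pixel`, the probe values, and the noise level are all hypothetical. It uses `np.linalg.lstsq` in place of the explicit pseudo-inverse of Eq. 3.3.6, which returns the same least-squares solution when H has full column rank.

```python
import numpy as np

def estimate_pixel(probe_fields, diff_images):
    """Least-squares estimate of [Re(E), Im(E)] at one pixel (Eq. 3.3.6).

    probe_fields: complex array, shape (j,), probe field C{A phi_j} at the pixel.
    diff_images:  real array, shape (j,), measured I_j^+ - I_j^- at the pixel.
    """
    # One row per probe pair, as in Eq. 3.3.5.
    H = 4.0 * np.column_stack([probe_fields.real, probe_fields.imag])
    x_hat, *_ = np.linalg.lstsq(H, diff_images, rcond=None)
    return x_hat

# Hypothetical single-pixel example with four probe pairs.
rng = np.random.default_rng(2)
x_true = np.array([3e-4, -2e-4])                     # [Re, Im] of the field
probes = 1e-3 * (rng.standard_normal(4) + 1j * rng.standard_normal(4))
H = 4.0 * np.column_stack([probes.real, probes.imag])
z = H @ x_true + 1e-10 * rng.standard_normal(4)      # noisy difference images
x_hat = estimate_pixel(probes, z)
```

Looping this function over the pixels of the dark hole avoids ever forming the large block diagonal observation matrix.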
As we see from Eq. 3.3.6, the DM-Diversity algorithm is simply a least-squares batch
process estimator [68]. The pseudo-inverse minimizes the error because it is effectively aver-
aging the elements of H when the inverse is taken. The power of this algorithm also comes
from the difference imaging of conjugate probe shapes that are applied to the DM. We are
left only with the time-varying camera noise in our measurement, meaning that the sensor
noise n will follow a zero-mean Poisson distribution [35]. The problem becomes invertible
using two image pairs to construct z and H, but a minimum of 3 image pairs must be used
to create an overdetermined system that will produce a unique estimate with least-squares
minimal error from the available data [68]. Practically, we find that 4 image pairs must be
used to get a good enough estimate at the Princeton HCIL, largely to average model errors
and detector noise. Consequently, 8 images are taken per iteration to estimate the electric
field with the DM diversity algorithm. The algorithm has the advantage of being simple,
and relatively robust. The disadvantage is that the algorithm is fundamentally limited by
DM model uncertainty. The robustness of the algorithm comes with a high cost of exposures
that must be repeated every time, a major disadvantage in a system where the time required
for detection will be exposure limited.
3.4 Probe Shapes
With the DM-Diversity estimator in place we can explore the choice of the probe shapes, φj.
They must be chosen to modulate the estimation area well, otherwise the difference between
I+j and I−j would be so small that z would come close to zero. Even worse, the observation
matrix is constructed by computing the probe effect in the estimation area. If we choose
a probe that does not modulate the field in the estimation area well we will get rows that
are effectively zero, making H poorly conditioned. While our choice in the probe shapes is
somewhat arbitrary we must take care that they modulate the estimation area well enough
to produce a well posed problem in Eq. 3.3.6. We guarantee this by choosing shapes based on
analytical functions for which we know the Fourier transform. The DMs being non-conjugate
to the pupil plane will have little effect on this computation since we have shown that the
angular spectrum factor will simply add an additional phase distribution in the image plane.
Following Give’on et al. [23], we will simplify the problem of coverage/shape of the dark hole
by choosing two symmetric rectangular regions that span the region we wish to estimate.
Mathematically we produce a rectangle of width wx and height wy by multiplying two rect
functions, one for each dimension. Applying the inverse Fourier transform, the DM shape
required to produce this rectangle in the image plane is
F−1{ rect(wxx) rect(wyy)} = sinc(wxu) sinc(wyv). (3.4.1)
We offset the rectangle from the center by a distance a in the x dimension and a distance b in
the y direction by convolving it with two pairs of delta functions, one set for each coordinate.
The inverse Fourier transform of two symmetric delta functions is
F^{-1}\left\{ \frac{1}{2}[\delta(x - a) + \delta(x + a)] * \frac{1}{2}[\delta(y - b) + \delta(y + b)] \right\} = \cos(au)\cos(bv). (3.4.2)
Applying an arbitrary amplitude, c, and the pupil function, A, the two offset rectangles in
the image plane generated by the DM shape φ are
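F\{A\phi\} = F\{A\} * c\,\mathrm{rect}(w_x x)\,\mathrm{rect}(w_y y) * [\delta(x - a) + \delta(x + a)] * [\delta(y - b) + \delta(y + b)] (3.4.3)
= F\{cA\,\mathrm{sinc}(w_x u)\} * F\{cA\,\mathrm{sinc}(w_y v)\} * F\{cA\cos(au)\} * F\{cA\cos(bv)\}
= F\{cA\,\mathrm{sinc}(w_x u)\,\mathrm{sinc}(w_y v)\cos(au)\cos(bv)\}. (3.4.4)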
Inverse transforming, the shape we would like the DM to approximate is
φ = c sinc(wxu) sinc(wyv) cos(au) cos(bv). (3.4.5)
The coordinate offset for the delta functions and the width of the rect function is equal to
the frequency of the cosine and sinc functions respectively. Assuming linearity of the DM
actuation, Eq. 2.1.4, we have produced a phase distribution for one DM that results in a
unitary amplitude in two rectangular regions of the image plane. As discussed in Give’on
et al. [23], we must keep in mind that the true distribution at the image plane includes a
convolution with the nominal PSF, F{A}. The PSF will alter the field so that the field
from a DM shape given by Eq. 3.4.5 will not have exact unit amplitude and the edges of
the rectangle will extend by one radius of the PSF. The distribution will still be relatively
uniform, so we are guaranteed to modulate the area under consideration with a reasonable
expectation that each pixel in the dark hole will also be modulated.
If we take the magnitude squared of Eq. 3.4.3, we see that the intensity provided by the
probe shape will be proportional to the square of the amplitude, c, of the shape produced
by Eq. 3.4.5. Generally, the amplitude of these shapes is prescribed by the normalized
intensity of the aberrated field. When we probe, we want to make a significant effect so that
there is a good signal in z, but we do not want to actuate so strongly that we wash out the
aberrations. Thus we choose an actuation amplitude equal to the square root of the average
contrast in the previous iteration [11].
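As a concrete illustration of Eq. 3.4.5 and the amplitude rule above, the following sketch evaluates a probe shape on a grid of actuator coordinates. It assumes NumPy; the grid size, widths, offsets, and contrast value are hypothetical, and `np.sinc` is the normalized sinc, sin(πt)/(πt), which may differ from the convention intended in the text by a factor of π in the arguments.

```python
import numpy as np

def probe_shape(u, v, wx, wy, a, b, c):
    """DM probe of Eq. 3.4.5: c sinc(wx u) sinc(wy v) cos(a u) cos(b v).

    np.sinc is the normalized sinc, sin(pi t)/(pi t); whether that matches
    the convention of the text is an assumption of this sketch.
    """
    return c * np.sinc(wx * u) * np.sinc(wy * v) * np.cos(a * u) * np.cos(b * v)

# Hypothetical actuator grid and probe parameters, in arbitrary pupil units.
n_act = 32
coords = np.linspace(-0.5, 0.5, n_act)
u, v = np.meshgrid(coords, coords)

mean_contrast = 1e-6            # average contrast of the previous iteration
c = np.sqrt(mean_contrast)      # probe amplitude, following the rule above
phi = probe_shape(u, v, wx=8.0, wy=16.0, a=2 * np.pi * 6.0, b=0.0, c=c)
```

The conjugate probe of each pair is simply `-phi`, and different probe pairs can be generated by varying the offsets `a` and `b` or by swapping the roles of the two axes.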
The experimental results using probe pairs with the DM Diversity estimation algorithm
[23] are found in Ch. 5. This estimator is used to supply an estimate to both the windowed
and monochromatic forms of the stroke minimization algorithm developed in Ch. 2.
Chapter 4
Kalman Filter Estimation
The DM-Diversity algorithm described in Ch. 3 is quite effective, but it is limited by the
fact that it is only a batch process method. As shown in Fig. 4.1, it does not close the
loop on the state estimate. Therefore all state estimate information, x, acquired about the
electric field in the prior control step is lost. Thus we start over at each iteration, requiring
that we take a full set of estimation images to estimate the field again.
Figure 4.1: Block diagram of a standard FPWC control loop. At any time step, k, only the intensity measurements, zk, provide any feedback to estimate the current state, xk, for control. The red dashed lines show additional feedback from the prior electric field (or state) estimate, xk, and the control signal, uk, used to suppress it.
In addition to being very costly with regard to exposures, the measurements will become progressively noisier as
we reach higher contrast levels. If we include feedback of the state estimate we will have
a certain degree of robustness to new, noisy measurements by including information from
prior measurements with better signal-to-noise. Since we already have demonstrated a model
based controller, we should be able to use this model to predict the change in the electric
field after the controller has applied a DM command. In doing so we want to consider
the relative effect of process and detector noise to optimally combine an extrapolation of the
state estimate with new measurement updates. This is exactly the problem a discrete time
Kalman filter solves.
4.1 Constructing the Optimal Filter
A Kalman filter includes prior state estimate history by extrapolating a new estimate of the
state using a model, then optimally updating the estimate with a sequence of measurements
taken at discrete intervals. For the time being, the Kalman filter estimator will still use DM
probe pairs as described in §3 for the measurement update. In this way we are not testing
whether there is a better way to obtain information regarding the electric field, but rather
changing the way we use the information from the applied probe shapes to reconstruct the
electric field estimate. In particular, we use the prior information to improve our ability
to estimate the field in later iterations. This will allow us to use fewer measurements to
reconstruct the electric field.
Following the notation used in Stengel [68], we begin by assuming we have a state,
defined as the electric field estimate from the prior iteration, xk−1(+). The plus indicates
that the estimate was updated at the prior iteration with some measurement, be it from an
initialization or a prior estimation and control step. Prior to any additional measurements,
we will extrapolate from xk−1(+) to the current time step, xk(−), by an arbitrary function
(to be defined later in the chapter). We also seek a metric to gauge the uncertainty in
the estimate. Following Stengel [68], we define the extrapolated state estimate covariance,
Pk(−), as the expected value of the error between the estimate and the true state, xk:
P_k(-) = E[(\hat{x}_k(-) - x_k)(\hat{x}_k(-) - x_k)^T]. (4.1.1)
We now seek to optimally include new measurements to improve the state and covariance
estimates. These noisy measurements, zk = yk + nk, will still be difference images of probe
pairs. As discussed in Ch. 3, the conjugate pairs allow us to construct a linear observation
matrix, Hk, which stems from Eq. 2.1.9. If we were not in a low aberration regime the
observer would have to be nonlinear. This is not impossible for a Kalman filter, but can
make it highly biased [68] and computationally expensive. As we decide how to optimally
update the field, we must also have an estimate of the measurement noise covariance, which
we define as
R_k = E[n_k n_k^T]. (4.1.2)
To properly demonstrate the conditions under which the Kalman filter is truly optimal,
we would have to show that propagating the state estimate is a Gauss-Markov sequence
(indicating that optimality requires white Gaussian inputs and Gaussian initial conditions)
[68]. This is rather tedious and can be found in many textbooks that discuss the Kalman
filter, so it will not be re-derived here. However, it is worth demonstrating that like the batch
process method the Kalman filter does produce an estimate with least-squares minimal error.
Since the Kalman filter operates on the estimate in closed loop, the weighted cost function
used to derive the batch process solution,
J = \frac{1}{2}[H_k x_k - z_k]^T R_k^{-1} [H_k x_k - z_k], (4.1.3)
will not adequately represent the error contributions in the system. We must also include an
estimate of the state covariance, since this will also propagate error in the estimate update.
Defining the error as both the difference between the noisy observation and the estimated
observation, Hkxk(+)− z, and the difference between the current estimate and the estimate
extrapolation, (xk − xk(−)), we write the quadratic cost function as
J = \frac{1}{2}[x_k - \hat{x}_k(-)]^T P_k(-)^{-1} [x_k - \hat{x}_k(-)] + \frac{1}{2}[H_k x_k - z_k]^T R_k^{-1} [H_k x_k - z_k]. (4.1.4)
We can formulate the cost in matrix form as
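J = \frac{1}{2} \begin{bmatrix} x_k - \hat{x}_k(-) \\ H_k x_k - z_k \end{bmatrix}^T \begin{bmatrix} P_k(-) & 0 \\ 0 & R_k \end{bmatrix}^{-1} \begin{bmatrix} x_k - \hat{x}_k(-) \\ H_k x_k - z_k \end{bmatrix} (4.1.5)
= \frac{1}{2} \left( \begin{bmatrix} I \\ H_k \end{bmatrix} x_k - \begin{bmatrix} \hat{x}_k(-) \\ z_k \end{bmatrix} \right)^T \begin{bmatrix} P_k(-) & 0 \\ 0 & R_k \end{bmatrix}^{-1} \left( \begin{bmatrix} I \\ H_k \end{bmatrix} x_k - \begin{bmatrix} \hat{x}_k(-) \\ z_k \end{bmatrix} \right)
= \frac{1}{2} (\tilde{H}_k x_k - \tilde{z}_k)^T \tilde{R}_k^{-1} (\tilde{H}_k x_k - \tilde{z}_k), (4.1.6)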
where we have now defined a new set of augmented matrices as
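\tilde{H}_k = \begin{bmatrix} I \\ H_k \end{bmatrix}, (4.1.7)
\tilde{z}_k = \begin{bmatrix} \hat{x}_k(-) \\ z_k \end{bmatrix}, (4.1.8)
\tilde{R}_k = \begin{bmatrix} P_k(-) & 0 \\ 0 & R_k \end{bmatrix}. (4.1.9)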
This allows us to write an analog to the weighted cost function used to compute the batch
process estimator. Taking the partial derivative with respect to the state estimate xk and
evaluating at the optimal update, xk(+), we find
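\left. \frac{\partial J(\tilde{z}_k)}{\partial x_k}^T \right|_{\hat{x}_k(+)} = \tilde{H}_k^T \tilde{R}_k^{-1} \tilde{H}_k \hat{x}_k(+) - \tilde{H}_k^T \tilde{R}_k^{-1} \tilde{z}_k. (4.1.10)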
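Setting the partial derivative to zero, the optimal state update is
\hat{x}_k(+) = (\tilde{H}_k^T \tilde{R}_k^{-1} \tilde{H}_k)^{-1} \tilde{H}_k^T \tilde{R}_k^{-1} \tilde{z}_k (4.1.11)
= \left[ P_k(-)^{-1} + H_k^T R_k^{-1} H_k \right]^{-1} \left[ P_k(-)^{-1} \hat{x}_k(-) + H_k^T R_k^{-1} z_k \right]
= \left[ P_k(-) - P_k(-) H_k^T (H_k P_k(-) H_k^T + R_k)^{-1} H_k P_k(-) \right] \cdot \left[ P_k(-)^{-1} \hat{x}_k(-) + H_k^T R_k^{-1} z_k \right]
= \hat{x}_k(-) + P_k(-) H_k^T \left[ H_k P_k(-) H_k^T + R_k \right]^{-1} [z_k - H_k \hat{x}_k(-)]. (4.1.12)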
From Eq. 4.1.12, we define the optimal gain to be
K_k = P_k(-) H_k^T [H_k P_k(-) H_k^T + R_k]^{-1}. (4.1.13)
Eq. 4.1.13 optimally combines the prior estimate history with measurement updates
to minimize the total error contributions based on the expected state and measurement
covariance. Much like the batch process method the Kalman filter produces a solution
that minimizes a quadratic cost function, Eq. 4.1, but it is also subject to the constraining
dynamic equations given by xk(−) and Pk(−). However, looking at Eq. 4.1.6 there is a
major advantage of the Kalman filter in its minimization of the cost function. For the augmented
observation matrix of Eq. 4.1.7 to be overdetermined, we only require a single measurement. Thus, at a fundamental level the
Kalman filter is formulated in such a way that it solves a least squares, left pseudo-inverse
problem, regardless of the number of measurements taken. This gives us the freedom to
minimize the number of exposures required to estimate the field to a precision adequate for
suppressing the field to the target contrast level.
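The equivalence between the augmented least-squares solution (Eq. 4.1.11) and the gain form of the update (Eqs. 4.1.12 and 4.1.13) can be verified numerically, including for a single measurement. The sketch below assumes NumPy and uses made-up values in arbitrary normalized units for a single pixel with two states; none of the numbers are taken from the experiment.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical single-pixel problem: 2 states, one probe-pair measurement.
x_prior = np.array([2.0, -1.0])            # extrapolated estimate x_k(-)
P_prior = np.diag([1.0, 4.0])              # extrapolated covariance P_k(-)
H = rng.standard_normal((1, 2))            # H_k, one row per probe pair
R = np.array([[0.25]])                     # measurement noise covariance R_k
z = np.array([0.5])                        # difference image z_k

# Augmented least squares, Eqs. 4.1.7-4.1.11.
H_aug = np.vstack([np.eye(2), H])          # 3 x 2: overdetermined with one measurement
z_aug = np.concatenate([x_prior, z])
R_aug = np.block([[P_prior, np.zeros((2, 1))],
                  [np.zeros((1, 2)), R]])
R_aug_inv = np.linalg.inv(R_aug)
normal_matrix = H_aug.T @ R_aug_inv @ H_aug
x_ls = np.linalg.solve(normal_matrix, H_aug.T @ R_aug_inv @ z_aug)

# Gain form, Eqs. 4.1.12-4.1.13.
K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)
x_kf = x_prior + K @ (z - H @ x_prior)
```

The two estimates `x_ls` and `x_kf` agree to rounding error, and the augmented matrix has three rows for two unknowns even though only one probe pair was used.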
While the forms of the cost functions are the same, they evaluate different criteria.
Consequently, we cannot use the cost functions to directly compare their optimality.
Instead we look to the only other metric of comparison, the covariance at each iteration. To
update the state covariance estimate, Pk(+), we continue to use the augmented matrices in
Eqs. 4.1.7-4.1.9. Beginning with the expected value function shown in Table 4.1, we write
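P_k(+) = E[(\hat{x}_k(+) - x_k)(\hat{x}_k(+) - x_k)^T] (4.1.14)
= E\left[ \left[ (\tilde{H}_k^T \tilde{R}_k^{-1} \tilde{H}_k)^{-1} \tilde{H}_k^T \tilde{R}_k^{-1} \tilde{n}_k \right] \left[ (\tilde{H}_k^T \tilde{R}_k^{-1} \tilde{H}_k)^{-1} \tilde{H}_k^T \tilde{R}_k^{-1} \tilde{n}_k \right]^T \right]
= \left[ (\tilde{H}_k^T \tilde{R}_k^{-1} \tilde{H}_k)^{-1} \tilde{H}_k^T \tilde{R}_k^{-1} \right] E[\tilde{n}_k \tilde{n}_k^T] \left[ (\tilde{H}_k^T \tilde{R}_k^{-1} \tilde{H}_k)^{-1} \tilde{H}_k^T \tilde{R}_k^{-1} \right]^T
= (\tilde{H}_k^T \tilde{R}_k^{-1} \tilde{H}_k)^{-1}
= [P_k(-)^{-1} + H_k^T R_k^{-1} H_k]^{-1}. (4.1.15)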
For comparison, we evaluate the covariance of a weighted form of the batch process method
described in Ch. 3, which is
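P = E[(\hat{x} - x)(\hat{x} - x)^T] (4.1.16)
= E\left[ \left( (H^T R^{-1} H)^{-1} H^T R^{-1} n \right) \left( (H^T R^{-1} H)^{-1} H^T R^{-1} n \right)^T \right]
= (H^T R^{-1} H)^{-1} H^T R^{-1} E[n n^T] \left( (H^T R^{-1} H)^{-1} H^T R^{-1} \right)^T
= (H^T R^{-1} H)^{-1}. (4.1.17)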
As shown in Eq. 4.1.17, the state covariance of the batch process method resets after every
control step, and is tied to the noise in that particular set of measurements. However, the
covariance of the Kalman filter is also a function of the prior state covariance. Looking at
Eq. 4.1.15, H_k^T R_k^{-1} H_k is guaranteed to be positive semidefinite (and positive definite when H_k has full column rank). Thus additional measurements
taken at each iteration will act to reduce the magnitude of the covariance since additional
measurements can do nothing but make the inversion smaller.
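The covariance comparison can be illustrated with a small numerical check. The sketch below assumes NumPy and uses arbitrary, made-up matrices for one pixel with three probe pairs; it evaluates Eq. 4.1.17 and Eq. 4.1.15 for the same measurements and confirms that the Kalman covariance is no larger than either the prior covariance or the batch covariance, in the sense that both differences have non-negative eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical single-pixel setup: 3 probe pairs, 2 states, normalized units.
H = rng.standard_normal((3, 2))
R = 0.5 * np.eye(3)
information = H.T @ np.linalg.inv(R) @ H          # H^T R^-1 H

P_batch = np.linalg.inv(information)              # Eq. 4.1.17
P_prior = np.diag([0.2, 0.3])                     # P_k(-), carried from earlier steps
P_kalman = np.linalg.inv(np.linalg.inv(P_prior) + information)   # Eq. 4.1.15

# Eigenvalues of the differences; non-negative values mean the Kalman
# covariance is no larger than the matrix it is compared against.
gap_to_prior = np.linalg.eigvalsh(P_prior - P_kalman)
gap_to_batch = np.linalg.eigvalsh(P_batch - P_kalman)
```

With a poor measurement (large R) the batch covariance grows without bound, while `P_kalman` can never exceed `P_prior`, which is the stabilizing behavior described in the text.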
We can use the contrast normalization for the measurements, I00, to get an idea of the
estimator’s robustness. Looking ahead to §6.1, we see that I00 is a function of exposure time.
Thus if we do not take a long enough exposure in the probe images Rk will become quite
large, indicating a poor signal to noise ratio. This exposure time is based on the detection
limit for a given laser power, as discussed in §6.1. For the 2 mW laser power used in the
monochromatic experiments, this is on the order of 100 ms to detect 1 × 10^{-7} contrast levels.
In the broadband experiments, the power levels are on the order of a microwatt for any given
wavelength, which means we require exposure times on the order of tens of seconds. In the
batch-process estimator, we are stuck with these measurements and will receive an estimate
with large covariance. In this case the control will not be effective, which is why we often see
jumps in contrast when using this estimator once we reach low contrast levels. In the case of
the Kalman filter, this high covariance is dampened by the contribution of prior covariance
estimates via Pk(−), stabilizing the state estimate and its covariance in the event that we
take a bad measurement. Since we cannot guarantee that a probe will provide good signal,
particularly at low contrast levels, this is an extremely attractive component of the Kalman
filter estimator. To complete the filter we need to propagate the prior estimate, xk−1(+),
to the current time step. The filter extrapolates to the current state estimate, xk(−), by
applying a time update to the prior state estimate via the state transition matrix, Φk−1, and
numerically propagating the control output from stroke minimization at the prior iteration,
uk−1, via a linear transformation described by Γk−1. We also have a disturbance from the
process noise, wk−1, which is propagated to the current state of the electric field via the
linear transformation, Λk−1. Assuming these components are additive, the state estimate
extrapolation is
xk(−) = Φk−1xk−1(+) + Γk−1uk−1 + Λk−1wk−1. (4.1.18)
We will apply the linearized optical model used to develop the batch process estimation
method and both control algorithms described in Ch. 2 and Ch. 3. Using a linearized model
avoids generating arbitrary bias in the estimate at each pixel, a common problem with a
nonlinear filter [21]. The first step in propagating the state forward in time is to update any
dynamic variation between the discrete time steps with the state transition matrix, Φk−1.
In this system, Φk−1 captures any variation of the field due to temperature fluctuations,
vibration, or air turbulence that perturb the optical system. To simplify the model, we
recognize that there is no reliable way to measure or approximate small changes in the
optical system over time with alternate sensors; we assume that the state remains constant
between control steps, making the state transition matrix the identity, Φk−1 = Φ = I.
The process noise is any disturbance input into the system. The dominant contributor to
this will be errors in our expectation of the DM actuation, which will be discussed in greater
detail following this derivation. For the time being, we make the standard assumption that
the process noise is Gaussian white noise, which means its expected value is zero. Thus, the
expected value of the state when we extrapolate is
xk(−) = Φk−1xk−1(+) + Γk−1uk−1. (4.1.19)
The covariance of the process noise will be handled in the covariance extrapolation. For
the same reason that we may treat Φ as a constant matrix, the optical system is assumed
to be stable enough that the linearized propagation of the control uk−1 is constant, making
Γk−1 = Γ. This matrix must map the control effect of the DM actuators to the image
plane electric field, but we need to sort it such that every pair of rows in Γu is the real
and imaginary parts of a particular pixel. To begin, we look to the control effect matrices
produced in Ch. 2. Recalling Eq. 2.1.14, we can produce a vector of complex values via
Gu = C{iAφ}. (4.1.20)
To produce Γ, we simply need to sort the control effect into the real and imaginary
parts per pixel. We have to take the real and imaginary parts pixel by pixel so that each
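block element of the matrix forms as
\begin{bmatrix} \Re\{G\}_{n,:} \\ \Im\{G\}_{n,:} \end{bmatrix} = \begin{bmatrix} \Re\{G_{DM1}\}_{n,:} & \Re\{G_{DM2}\}_{n,:} \\ \Im\{G_{DM1}\}_{n,:} & \Im\{G_{DM2}\}_{n,:} \end{bmatrix}, (4.1.21)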
where (n, :) indicates that we have taken all columns of the nth row in G. This block,
Gn,:, maps the effect of every DM actuator onto the nth pixel. We must reorganize the
control matrix in this manner for the sake of Hk. If we were to reorganize the state and
control vectors so that the real and imaginary components were stacked such that x =
[ℜ{C{Ag}} ℑ{C{Ag}}]^T, Hk would be arranged in a sparse form rather than as a block
diagonal matrix. Thus, each submatrix for Γ shown in Table 4.1 is of dimension 2× 2NDM
and represents the control effect on a single pixel of the matrix.
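The per-pixel row ordering described above can be illustrated with a short NumPy sketch. This is only an illustration under assumed inputs: the function name `interleave_real_imag`, the random stand-ins for G_DM1 and G_DM2, and the tiny array sizes are hypothetical and are not taken from the laboratory model.

```python
import numpy as np

def interleave_real_imag(G):
    """Stack Re and Im of each row of complex G so rows (2n, 2n+1) belong to pixel n."""
    n_pix, n_act = G.shape
    gamma = np.empty((2 * n_pix, n_act))
    gamma[0::2, :] = G.real   # even rows: real part for each pixel
    gamma[1::2, :] = G.imag   # odd rows: imaginary part for each pixel
    return gamma

# Hypothetical sizes and random stand-ins for the control effect matrices.
rng = np.random.default_rng(0)
n_pix, n_dm = 3, 4
G_dm1 = rng.normal(size=(n_pix, n_dm)) + 1j * rng.normal(size=(n_pix, n_dm))
G_dm2 = rng.normal(size=(n_pix, n_dm)) + 1j * rng.normal(size=(n_pix, n_dm))

G = np.hstack([G_dm1, G_dm2])        # DM1 columns first, then DM2
Gamma = interleave_real_imag(G)      # shape (2 * n_pix, 2 * n_dm)

u = rng.normal(size=2 * n_dm)        # DM1 actuators stacked on top of DM2
field = G @ u                        # complex field change at each pixel
x_change = Gamma @ u                 # the same change in interleaved real form
```

With this ordering, entries 2n and 2n+1 of `x_change` are the real and imaginary parts of the field change at pixel n, which is the arrangement that keeps Hk block diagonal.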
Following Stengel [68], we use the state transition matrix to propagate the prior covariance
estimate forward. Applying an additive term for the process noise, Qk−1, the extrapolated
covariance estimate is
\[
P_k(-) = \Phi_{k-1} P_{k-1}(+) \Phi_{k-1}^T + Q_{k-1}, \tag{4.1.22}
\]
where
\[
Q_k = E[w_k w_k^T]. \tag{4.1.23}
\]
The details of how we formulate Qk, and the sensor noise, Rk, will be addressed in more
detail in §4.2. Combining Eq. 4.1.12, Eq. 4.1.13, and Eq. 4.1.15 with the extrapolation
equations used in the cost function, we have the discrete time Kalman filter. This form of
the filter consists of five equations that describe the state estimate extrapolation, covariance
estimate extrapolation, filter gain computation, state estimate update, and covariance
estimate update at the kth iteration [68]:
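\begin{align}
x_k(-) &= \Phi_{k-1} x_{k-1}(+) + \Gamma_{k-1} u_{k-1} \tag{4.1.24} \\
P_k(-) &= \Phi_{k-1} P_{k-1}(+) \Phi_{k-1}^T + Q_{k-1} \tag{4.1.25} \\
K_k &= P_k(-) H_k^T \left[ H_k P_k(-) H_k^T + R_k \right]^{-1} \tag{4.1.26} \\
x_k(+) &= x_k(-) + K_k \left[ z_k - H_k x_k(-) \right] \tag{4.1.27} \\
P_k(+) &= \left[ P_k(-)^{-1} + H_k^T R_k^{-1} H_k \right]^{-1} \tag{4.1.28}
\end{align}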
A fundamental property of the Kalman filter is that the optimal gain, Eq. 4.1.26, is
not based on measurements, but rather estimates of the state covariance, Pk(−), process
noise from the actuation Qk−1, and sensor noise Rk. This means that the optimality of the
estimate is closely related to the accuracy and form of these matrices; this will be discussed
at length in §4.2. The gain matrix, Kk, is ultimately what balances uncertainty in the prior
state estimate against uncertainty in the measurements zk when computing the final state
estimate update, xk(+).
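As a minimal sketch of how Eqs. 4.1.24 through 4.1.28 fit together, the following NumPy function performs one extrapolation and update with Φ = I. The function name `kalman_step` and the toy one-pixel numbers are assumptions made for illustration; this is not the laboratory code, and it assumes Pk(−) and Rk are symmetric and invertible.

```python
import numpy as np

def kalman_step(x_prev, P_prev, u_prev, z, Gamma, H, Q, R):
    """One extrapolation and update, assuming Phi = I as in the text."""
    x_minus = x_prev + Gamma @ u_prev                 # Eq. 4.1.24
    P_minus = P_prev + Q                              # Eq. 4.1.25
    S = H @ P_minus @ H.T + R                         # bracketed term of Eq. 4.1.26
    K = np.linalg.solve(S, H @ P_minus).T             # P_minus H^T S^-1 (S and P_minus symmetric)
    x_plus = x_minus + K @ (z - H @ x_minus)          # Eq. 4.1.27
    info = np.linalg.inv(P_minus) + H.T @ np.linalg.solve(R, H)
    P_plus = np.linalg.inv(info)                      # Eq. 4.1.28
    return x_plus, P_plus, K

# Toy problem: one pixel (two states), one measurement, no control applied.
x0 = np.zeros(2)
P0 = np.eye(2)
Gamma = np.zeros((2, 1))
H = np.array([[1.0, 0.0]])
Q = 0.1 * np.eye(2)
R = np.array([[0.5]])
x1, P1, K1 = kalman_step(x0, P0, np.zeros(1), np.array([1.6]), Gamma, H, Q, R)
```

Note that the gain `K1` depends only on `P0`, `Q`, `H`, and `R`, never on the measurement value itself, which is the property discussed above.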
Matrix                            Dimension
Φ = I                             (2·Npix) × (2·Npix)
Γ = [Γ_1; … ; Γ_n]                (2·Npix) × (2·NDM)
Λ = Γ                             (2·Npix) × (2·NDM)
P0 = E[(x0 − x̂0)(x0 − x̂0)^T]      (2·Npix) × (2·Npix)
Qk = Λ E[wk wk^T] Λ^T             (2·Npix) × (2·Npix)
Hk = diag(H_1, … , H_n)           (Npix·Npairs) × (2·Npix)
Rk = E[nk nk^T]                   (Npix·Npairs) × (Npix·Npairs)
Kk is computed                    (2·Npix) × (Npix·Npairs)
Per-pixel blocks (semicolons separate rows): Γ_n = [ℜ{G_DM1} ℜ{G_DM2}; ℑ{G_DM1} ℑ{G_DM2}]_n, and H_n has one row [ℜ{G_DM2 φ_kj} ℑ{G_DM2 φ_kj}] for each of the j probe pairs at pixel n.
Table 4.1: Definition of all filter matrices. NDM is the number of actuators on a single DM, Npix is the number of pixels in the area targeted for dark hole generation, and Npairs is the number of image pairs taken while applying positive and negative shapes to the deformable mirror.
With Hk, zk, Γ, x, and uk constructed, the dimension and form of the rest of the filter
follows. Table 4.1 and Table 4.2 define all the matrices and vectors in the filter equations
for this problem and provide their dimensionality for clarity. The initialization of the
covariance, P0, is critical for the performance of the filter. In our system this cannot be
measured, so we must initialize with a reasonable guess. We can use the final covariance
matrix from a prior control attempt (to maximum achievable contrast) to initialize the filter
in the future so that its form might be more accurate. We compute the process noise
assuming a standard zero-mean variance on the actuation of the DMs, w. The sensor noise
is determined statistically from the readout noise that the detector exhibits when taking
dark frames. The focal plane measurements zk are identical to those of Ch. 3, and
are constructed into a vertical stack of difference images taken in a “pair-wise” fashion to
produce j measurements for n pixels. Likewise Hk takes on a similar form, and is a matrix
constructed from the effect of a specific deformable mirror shape φj on the real and imaginary
parts of the electric field in the image plane. Finally, we compute the covariance update,
Pk(+), based on the added noise from the new measurements. The estimated state is a
vertical stack of the real and imaginary parts of the electric field at each pixel of the dark
hole in the image plane. The control signal u is a vertical stack of the actuators of each DM,
with DM1 being stacked on top of DM2. Since we are only considering process noise at
the DMs, the process disturbance w is a vertical stack of the variance expected from each
actuator.
Recalling Eq. 3.3.1, Hk is constructed by separating the real and imaginary parts of
the DM probe field. Thus it will be underdetermined unless at least 2 pairs of images
are used in the measurement, one of the major limitations of the batch process method in
Ch. 3. This will result in a non-unique solution to the state when using a batch-process,
and will only provide the solution with the smallest quadratic norm since it must be solved
via the right pseudo-inverse. On the other hand, the Kalman filter only requires a single
measurement as an update to the state. Therefore it isn’t necessary for the matrix to be
square or overdetermined, and we maintain a favorable dimensionality when updating the
state.
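The dimensionality argument can be checked numerically with a small sketch. The helper `build_H` and the random probe fields below are hypothetical, and any constant scale factor from Eq. 3.3.1 is omitted; the point is only the shape and rank of Hk for one versus two image pairs.

```python
import numpy as np

def build_H(probe_fields):
    """probe_fields[j, n] is the complex probe field of pair j at pixel n."""
    n_pairs, n_pix = probe_fields.shape
    H = np.zeros((n_pix * n_pairs, 2 * n_pix))
    for n in range(n_pix):
        rows = slice(n * n_pairs, (n + 1) * n_pairs)   # measurements of pixel n
        H[rows, 2 * n] = probe_fields[:, n].real        # multiplies Re{E_n}
        H[rows, 2 * n + 1] = probe_fields[:, n].imag    # multiplies Im{E_n}
    return H

rng = np.random.default_rng(0)
n_pix = 5
probes = rng.normal(size=(2, n_pix)) + 1j * rng.normal(size=(2, n_pix))

H_one = build_H(probes[:1])    # single image pair: 5 x 10
H_two = build_H(probes)        # two image pairs: 10 x 10
rank_one = np.linalg.matrix_rank(H_one)
rank_two = np.linalg.matrix_rank(H_two)
```

With a single pair the rank is Npix, half the number of unknowns, so a batch solve is underdetermined; the Kalman update can still use this Hk because the prior covariance supplies the missing information.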
Variable                                                        Dimension
z = [z_1; … ; z_n], with z_n = [I1+ − I1−; … ; Ij+ − Ij−]_n     (Npix·Npairs) × 1
x = [ℜ{E1}; ℑ{E1}; … ; ℜ{En}; ℑ{En}]                            (2·Npix) × 1
u = [DM1; DM2]                                                  (2·NDM) × 1
w = [σDM1; σDM2]                                                (2·NDM) × 1
Table 4.2: Definition of filter vectors (semicolons separate stacked rows). NDM is the number of actuators on a single DM, Npix is the number of pixels in the area targeted for dark hole generation, and Npairs is the number of image pairs taken while applying positive and negative shapes to the deformable mirror.
In the scenario of coronagraphic imaging, we are photon limited, which typically means
exposure time is by far the limiting factor when estimating the field. However, there are a
lot of mathematical operations involved with a Kalman filter. It is worth looking at the
computation time required to compute the update since it will ultimately limit the speed
with which we may estimate the field. The number of mathematical operations follows from
the dimension of the matrices given in Table 4.1 and Table 4.2. Thus, the computation is
directly dependent on the size of the dark hole, the number of actuators on the DMs, and
the number of measurements taken at each step. For a fixed dark hole size and a set number
of actuators, all we can do is attempt to minimize the number of measurement updates per
iteration. Presumably we could bin the camera to reduce the number of pixels, Npix, but
this will not benefit a true observatory since the image plane is Nyquist sampled (implying
a loss of necessary spatial information if we bin the detector). The number of actuators is
not a limiting factor in any space or ground telescope design to date, but this does bring to
light a general control and estimation challenge for the next generation of extremely large
telescopes (ELTs). AO studies for ELTs are investigating DMs with ≈ 40,000 actuators or
more, meaning there will be a numerical challenge for any estimation and control scheme
(even a conventional atmospheric AO scheme)[41]. In this case, we would likely have to come
up with a way to reduce the dimension of the problem. However, in current observatory
scenarios the highest available actuator count is limited to 4,000 actuators. In this case
we will certainly be limited by exposure time even if we estimate the field over the entire
controllable area of the DM (since exposure times suitable for high contrast detection are
on the order of minutes to hours). In the laboratory experiment, with 2 mW of laser power
using the Ripple3 coronagraph[5] the computation time for the estimation step is on the
order of seconds, which is much faster than the field variability at the 1 × 10^−7 level. Since
readout takes approximately 0.6 seconds on the Starlight Xpress SXV-M9 camera, exposure
time is a significant fraction of the time required to estimate the field, even in the case of
high laser power levels. Thus, the speed of the Kalman filter computation is not limiting our
current achievable contrast levels.
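To make the scaling concrete, the matrix sizes in Table 4.1 can be tabulated for a given problem size. The sketch below is a bookkeeping aid only: the function name `filter_shapes` is hypothetical, `n_dm = 1024` corresponds to a 32×32 actuator array, and `n_pix = 500` is a made-up dark hole size rather than a laboratory value.

```python
def filter_shapes(n_pix, n_dm, n_pairs):
    """Return the matrix shapes listed in Table 4.1 as (rows, cols) tuples."""
    n_state = 2 * n_pix          # real and imaginary part per pixel
    n_meas = n_pix * n_pairs     # one difference image value per pixel per pair
    return {
        "Phi": (n_state, n_state),
        "Gamma": (n_state, 2 * n_dm),
        "P": (n_state, n_state),
        "Q": (n_state, n_state),
        "H": (n_meas, n_state),
        "R": (n_meas, n_meas),
        "K": (n_state, n_meas),
    }

for n_pairs in (1, 2, 4):
    shapes = filter_shapes(n_pix=500, n_dm=1024, n_pairs=n_pairs)
    inverted_side = shapes["R"][0]   # side length of the matrix inverted in Eq. 4.1.26
    print(n_pairs, shapes["H"], inverted_side)
```

The matrix inverted in the gain computation grows with Npix·Npairs, which is why reducing the number of measurement pairs is the main lever once the dark hole size and actuator count are fixed.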
4.2 Sensor and Process Noise
Two important design parameters for the performance of the filter are the process noise,
Qk−1, and the sensor noise, Rk. In order for the filter to operate optimally in the laboratory
we must make reasonable assumptions for the values that exist in the laboratory. Rather
than running a simulation to find the most likely values, we appeal to physical scaling of the
two largest known sources of error in the system. Our sensor noise will be the dark current
and read noise inherent to the detector. Process noise will largely come from errors in the
actuation shape.
Defining the control perturbation for the Kalman filter such that it is treated as an
additive error allows us to appeal to the value of the process noise Q physically, since we do
not have a way to measure it. For a disturbance, wk, we define the covariance of the process
noise as
\[
Q'_k = E[w_k w_k^T]. \tag{4.2.1}
\]
Following the definitions for Q in Stengel [68], the process noise at the image plane is
\[
Q_k = \Lambda Q'_k \Lambda^T, \tag{4.2.2}
\]
where Λ propagates the process noise to the image plane. To propagate process noise onto
the image plane, Qk = ΛE[wwT]ΛT, we assume w arises solely from the variance in DM actuation.
This allows us to choose Λk−1 = Λ = Γ. We can trace errors of the actuation shape to two
sources. The first is the resolution of the digital-to-analog (D/A) converter, which is 14
bits over a 250 Volt range. This gives us a resolution of approximately 0.015 Volts, which
corresponds to approximately 0.03 nm vertical resolution. This is much more precise than
the surface knowledge, which is the true limitation. Poor knowledge of the surface comes
from the inherent nonlinearity in the voltage-to-actuation gain as a function of voltage,
the variance in this gain from actuator to actuator, and the accuracy of the superposition
model used to construct the mirror surface that covers the 32×32 actuator array of the
Boston Micromachines kilo-DM. Physical models, such as those found in Blain et al. [9],
have been constructed to produce a more accurate surface prediction over the full 1.5 µm
stroke range with an rms error of ≈ 10 nanometers. Since we operate in a low actuation
regime, superposition is still a relatively safe assumption as evidenced by current laboratory
success. The Kalman filter presents an elegant solution where we can treat actuation errors
as additive process noise and include them in the estimator in a statistical fashion, rather
than deterministically in a physical model. Since there is no physical reasoning to justify
varying Qk at each iteration, it will be kept constant throughout the entire control history
(Q′k = Q′ = constant). Two versions of Q′ can be considered in this case. The first is where
we simply have no correlation between actuators, giving a purely diagonal matrix with a
magnitude corresponding to the actuation variance, σu^2 (the square of the actuation uncertainty σu). Note that while Q′ is
diagonal the process noise at the image plane, Q, will not be diagonal because of the linear
transformation to propagate this variance to the image plane via Γ. The second version of
Q′ which we may consider is one that has symmetric off-diagonal elements. This will treat
uncertainty due to inter-actuator coupling and errors in the superposition model statistically.
As a first step, we will not consider inter-actuator coupling to help avoid a poorly conditioned
matrix. This helps guarantee that the Kalman filter itself will be well behaved. Thus the
process noise for the filter will be
\[
Q = \sigma_u^2 \, \Gamma I \Gamma^T. \tag{4.2.3}
\]
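The two numerical points in this discussion, the D/A step size and the non-diagonal form of Q, can be reproduced with a few lines of NumPy. The matrix Γ below is a random stand-in, the helper name `process_noise` is hypothetical, and the value of `sigma_u` is illustrative only (here treated as a standard deviation, which is an assumption about the meaning of σu).

```python
import numpy as np

# D/A resolution quoted in the text: 14 bits spread over a 250 V range.
volts_per_count = 250.0 / 2**14            # about 0.015 V

def process_noise(Gamma, sigma_u):
    """Q = sigma_u^2 * Gamma I Gamma^T for uncorrelated, equal-variance actuators."""
    n_act = Gamma.shape[1]
    Q_prime = sigma_u**2 * np.eye(n_act)   # diagonal actuator-space covariance Q'
    return Gamma @ Q_prime @ Gamma.T

rng = np.random.default_rng(0)
Gamma = rng.normal(size=(6, 8))            # hypothetical: 3 pixels, 2 DMs x 4 actuators
Q = process_noise(Gamma, sigma_u=0.1)
off_diagonal = Q - np.diag(np.diag(Q))     # nonzero even though Q' is diagonal
```

Even with a diagonal Q′, the product ΓΓ^T couples pixels that share actuator influence, so `off_diagonal` is generally nonzero.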
Following Howell [35] the noise from both the incident light and dark noise in a CCD
detector follows a Poisson distribution. This will lead to an estimator that is not truly
optimal for short exposures, but the Poisson distribution will more closely approximate
a Gaussian distribution with non-zero mean for an adequately long exposure. We could
subtract a median dark frame, but differencing pairwise images allows us to construct a
linear observer in matrix form. The noise in each measurement will have zero-mean and will
become more Gaussian as the exposure time increases.
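The effect of differencing on the noise statistics can be seen in a small simulation. This is a sketch under simplifying assumptions: both exposures are drawn from the same Poisson distribution, and `mean_counts` is a made-up value rather than a measured HCIL dark level.

```python
import numpy as np

rng = np.random.default_rng(1)
mean_counts = 50.0                    # hypothetical mean counts per pixel
n_samples = 100_000

frame_plus = rng.poisson(mean_counts, size=n_samples)
frame_minus = rng.poisson(mean_counts, size=n_samples)
difference = frame_plus - frame_minus

diff_mean = difference.mean()         # close to 0: the common mean cancels
diff_var = difference.var()           # close to 2 * mean_counts: the variances add
```

The difference has zero mean regardless of the dark level, at the cost of doubling the noise variance relative to a single frame.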
The statistics for the HCIL detector at relevant exposure times are shown in Fig. 4.2.
Here we simplify the noise statistics by assuming the noise is uncorrelated and constant from pixel
to pixel, making Rk a diagonal matrix of the mean pixel covariance in units of contrast.
Since we do not have the ability to measure this variance actively, the assumption is that the
CCD and laser source have been thermally stabilized, which will keep the variance constant
over time, thus defining the sensor noise as
\[
R = \frac{\sigma_{CCD}}{I_{00}} \, I_{N_{pairs} \times N_{pairs}}. \tag{4.2.4}
\]
[Figure 4.2 panels, histograms of occurrence versus counts: (a) Distribution of Dark Frame for 50 ms; (b) Distribution of Dark Frame for 200 ms.]
Figure 4.2: Average counts across the detector for a dark frame. At short exposure times this follows a Poisson distribution. At much longer exposure times it should be well approximated by a Gaussian distribution, but is still slightly Poisson at 200 ms.
Having appealed to physical scaling in the HCIL, we now have close approximations of
the true process and sensor noise exhibited in the experiment. With an appropriate, non-
zero, initialization of the covariance we will be able to produce an effective optimal gain that
will leverage the data at each iteration as much as possible to produce a new state estimate
update.
4.3 Iterative Kalman Filter
An additional advantage of the Kalman filter is that we may apply the filter iteratively,
feeding the newly computed state xk(+) and covariance update Pk(+) back into the filter
again, setting uk−1 to zero. For sufficiently small control this will help account for nonlin-
earity in the actuation and better filter noise in the system, limited only by the accuracy of
the observation matrix, Hk. With no control update, the control signal will be set to zero
when we iterate the filter. Following a notation similar to Gelb et al. [21], the jth iteration
of feedback into the iterative Kalman filter at the kth control step is
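\begin{align}
x_{j,k}(-) &= x_{j-1,k-1}(+) \tag{4.3.1} \\
P_{j,k}(-) &= \Phi_{j-1,k-1} P_{j-1,k-1}(+) \Phi_{j-1,k-1}^T + Q_{j-1,k-1} \tag{4.3.2} \\
K_{j,k} &= P_{j,k}(-) H_{j,k}^T \left[ H_{j,k} P_{j,k}(-) H_{j,k}^T + R_{j,k} \right]^{-1} \tag{4.3.3} \\
x_{j,k}(+) &= x_{j,k}(-) + K_{j,k} \left[ z_{j,k} - H_{j,k} x_{j,k}(-) \right] \tag{4.3.4} \\
P_{j,k}(+) &= \left[ P_{j,k}(-)^{-1} + H_{j,k}^T R_{j,k}^{-1} H_{j,k} \right]^{-1}. \tag{4.3.5}
\end{align}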
The power of iterating the filter lies in what we are fundamentally trying to achieve. For
a successful control signal, we will have suppressed the field. This means that the magnitude
of the probe signal will be lower than the control perturbation. This guarantees that Hk
will better satisfy the linearity condition than Γu. As a result, if we iterate the filter on
itself during a given control step we can use the discrepancy between the image predicted by
Hkxk(+) and the measurements, zk, in Eq. 4.3.4 to filter out any error due to nonlinear terms
not accounted for in Γ. In this way, we can accommodate a small amount of nonlinearity in
the extrapolation of the state without having to resort to a nonlinear, or extended, Kalman
filter. This means that we don’t have to re-linearize about xk(+), as would be the case for
an iterative extended Kalman filter (IEKF). It also avoids having to concern ourselves with
any bias introduced into the estimate by a nonlinear filter.
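A minimal sketch of the iteration in Eqs. 4.3.1 through 4.3.5 is given below, assuming Φ = I, zero control, and that the same measurement zk and observation matrix Hk are reused on every pass (our reading of the equations above). The function name `iterate_filter` and all numerical values are hypothetical.

```python
import numpy as np

def iterate_filter(x_plus, P_plus, z, H, Q, R, n_iter):
    """Feed the updated state and covariance back through the filter n_iter times."""
    for _ in range(n_iter):
        x_minus = x_plus                                  # Eq. 4.3.1 (zero control, Phi = I)
        P_minus = P_plus + Q                              # Eq. 4.3.2
        S = H @ P_minus @ H.T + R
        K = np.linalg.solve(S, H @ P_minus).T             # Eq. 4.3.3
        x_plus = x_minus + K @ (z - H @ x_minus)          # Eq. 4.3.4
        info = np.linalg.inv(P_minus) + H.T @ np.linalg.solve(R, H)
        P_plus = np.linalg.inv(info)                      # Eq. 4.3.5
    return x_plus, P_plus

# Toy one-pixel example with a single measurement.
H = np.array([[1.0, 0.5]])
R = np.array([[0.2]])
Q = 0.01 * np.eye(2)
z = np.array([1.0])
x0, P0 = np.zeros(2), np.eye(2)

x1, _ = iterate_filter(x0, P0, z, H, Q, R, n_iter=1)
x3, _ = iterate_filter(x0, P0, z, H, Q, R, n_iter=3)
res0 = np.linalg.norm(z - H @ x0)
res1 = np.linalg.norm(z - H @ x1)
res3 = np.linalg.norm(z - H @ x3)
```

In this toy case each pass shrinks the discrepancy between Hx and z, which is the mechanism used to absorb small errors in the extrapolated state without re-linearizing.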
4.4 Optimal Probes: Using the Control Signal
In §3.4 we discussed the choice of probe shapes to create a well posed problem. In principle,
we have found shapes that adequately probe the field by perturbing the field as uniformly as
possible. However, nobody has ever looked deeply into the true merit of these functions or
how to choose the “best” shapes to probe the dark hole. In any dark hole there are discrete
aberrations that are much brighter than others, requiring that we apply more amplitude to
those spatial frequencies. Conversely, the bright speckles raise the amplitude of the probe
shape, which is then too bright to take a good measurement of dimmer speckles. Excluding
this issue, we also cannot truly generate the analytical functions described in §3.4. Even the
DM with the highest actuator density available, the Boston Micromachines 4K-DM©, can
only approximate each function with 64 actuators. We account for the true shape in the
model but this shape does not truly probe each pixel in the dark hole with equal weight,
which was the primary advantage of the analytical function for a probe shape in the first
place. Fortunately, we can once again appeal to the mathematical model for estimation and
control to help determine an adequate probe shape. Once one estimate has been provided,
the control law determines a shape to suppress all the speckles in the dark hole. Since it
has suppressed the field, this control shape necessarily probes the aberrated field in the
dark hole. If we apply the conjugate of the control shape we will increase the energy of
the aberrated field. Thus, we can rely on the controller to compute shapes that optimally
probe the aberrated field and automatically choose amplitudes appropriate for the intensity
at each pixel in the dark hole.
4.5 Chapter Assumptions
§4.1 Filter Construction:
• The linear form of the filter derived here requires the linearized models of the image
plane electric field developed in Ch. 1 and Ch. 2. The merit of this linearization over
an extended Kalman filter and our ability to accommodate any nonlinearities via filter
recursion is discussed in this section.
§4.2 - Sensor and Process Noise:
• All process noise is limited to uncertainty in the DM actuation, and does not account
for inter-actuator coupling.
• All sensor noise is limited to the dark current of the CCD. The noise has zero-mean,
and over long exposures the Poisson distribution describing this noise will closely
approximate white Gaussian noise.
Chapter 5
Laboratory Results
In Chapters 2, 3, and 4 we developed a number of estimation and control algorithms that
make up the focal plane wavefront correction algorithm designed to recover regions of high
contrast in finite areas of the image plane. In this chapter we present the results of experiments
testing these correction algorithms at the Princeton HCIL and their ability to produce
symmetric dark holes in both monochromatic and broadband light. Since the purpose is to
develop and test the performance of the controller itself, we do not apply any post-processing
techniques to remove so-called “incoherent” components of the electric field [33]. This is not
done because it is possible that model error will look like incoherent light and will be falsely
subtracted, adding uncertainty to our performance. More importantly, our purpose is to test
the performance of the correction algorithms, which is different than testing our ability to
detect a planet. The values reported in this thesis demonstrate a situational performance
during an observation, rather than relying on a post-processing technique to achieve a higher
contrast level.
5.1 Monochromatic Performance
To begin, we demonstrate the monochromatic performance of the Princeton HCIL using
the stroke minimization correction algorithm derived in §2.1. This allows us to compare
the performance of the DM Diversity and Kalman filter estimators in the simplest scenario
where we do not have to consider the effect of bandwidth on their performance. Overall, we
demonstrate our ultimate achievable contrast and the ability of the Kalman filter to more
efficiently suppress the field by requiring fewer estimation exposures.
5.1.1 DM Diversity Performance
As discussed in Ch. 3, the DM diversity estimator can produce a unique solution with as
few as two measurements, each a difference image of conjugate DM shapes. This is the
number of measurements used at the JPL HCIT for estimation at each iteration, but at the
Princeton HCIL there is enough uncertainty in the system that we require a minimum of
three measurements (6 images) to take advantage of the averaging effect of the left pseudo-
inverse. Practically, we find that the DM diversity estimator requires four measurements (8
images total) to reach our ultimate achievable contrast levels. This is the baseline image set
for comparing to the performance of the Kalman filter in §5.1.2. The laboratory starts at
an initial contrast of 1.23 × 10^−4 (Fig. 5.1(a)). Using the least-squares estimation technique
it is capable of reaching an average contrast of 2.3 × 10^−7 in a (7–10) × (−2–2) λ/D region
within 30 iterations (Fig. 5.1(c)) on both sides of the image plane, a unique capability that is
a result of the two deformable mirrors in the system. The size of the dark hole is limited by
our certainty in the DM shape. As we increase the outer working angle of the dark hole, we
need better certainty of the DM at higher spatial frequencies to maintain the same contrast
level. This is compounded by the fact that we have two DMs in series, but this enables us
to create symmetric dark holes in the image plane.
[Figure 5.1 panels: (a) Initial Image, Diversity Estimator; (b) Contrast Plot of average, left, and right contrast versus iteration; (c) Best Contrast, Diversity Estimator. Image axes are in λ0/D; color scale is log10(contrast).]
Figure 5.1: Experimental results of sequential DM correction using the DM-Diversity estimation algorithm. The dark hole is a square opening from 7–10 × -2–2 λ/D on both sides of the image plane. (a) The aberrated image. (b) Contrast plot. (c) The corrected image. Image units are log(contrast).
5.1.2 Kalman Filter Performance
In this section we correct the field using the Kalman filter from Ch. 4 for estimation using
four, three, two, and one pair of images as a measurement update to assess the degradation
in performance as information is lost. We begin with four measurements (four image pairs),
the same number used by the batch process estimator to produce the results of §5.1.1.
Using 4 pairs, the filter achieved a contrast of
4.0 × 10^−7 in (7–10) × (−2–2) λ/D symmetric dark holes within 20 iterations of the controller,
shown in Fig. 5.2. Note that this used a total of 160 estimation images, which is the same
amount of information available to the batch process method in §5.1.1 when it achieved a
contrast of 3.5 × 10^−7 in 20 iterations.
[Figure 5.2 panels: (a) Initial Image; (b) Contrast Plot of average, left, and right contrast versus iteration; (c) Final Image: 4 Pairs. Image axes are in λ0/D.]
Figure 5.2: Experimental results of sequential DM correction using the discrete time extended Kalman filter with 4 image pairs to build the image plane measurement, zk. The dark hole is a square opening from 7–10 × -2–2 λ/D on both sides of the image plane. (a) The aberrated image. (b) Contrast plot. (c) The corrected image. Image units are log(contrast).
When the number of image pairs was reduced to three, the correction algorithm was still
able to reach a contrast level of 5.0 × 10^−7 using only 120 estimation images, as shown in
Fig. 5.3.
[Figure 5.3 panels: (a) Initial Image; (b) Contrast Plot: 3 Pairs, showing average, left, and right contrast versus iteration; (c) Final Image: 3 Pairs. Image axes are in λ0/D.]
Figure 5.3: Experimental results of sequential DM correction using the discrete time extended Kalman filter with 3 image pairs to build the image plane measurement, zk. The dark hole is a square opening from 7–10 × -2–2 λ/D on both sides of the image plane. (a) The aberrated image. (b) Contrast plot. (c) The corrected image. Image units are log(contrast).
Having proven that we can successfully reach very close to the same limits with fewer
exposures, we now tune the covariance initialization and noise matrices and attempt using
only two pairs of images. By reducing the number of image pairs to two, we are using half
as many images as the correction shown in §5.1.1 and have reached a point where the batch
process method will no longer provide a solution that takes advantage of the averaging effect
of the left pseudo-inverse solution. After further tuning the covariance and noise matrices,
the contrast achieved after 30 iterations of the correction algorithm was 2.3 × 10^−7, shown
in Fig. 5.4. Note that this is better than the case which used three pairs because we have
improved the covariance initialization and increased the number of times the filter is iterated
in a single control step. In fact it should be noted that making the filter iterative is critical
to its performance since it accounts for nonlinearity, particularly in the propagation of the
control.
[Figure 5.4 panels: (a) Aberrated Field; (b) Contrast Plot of average, left, and right contrast versus iteration; (c) Corrected Contrast: 2.3 × 10^−7. Image axes are in λ0/D.]
Figure 5.4: Experimental results of sequential DM correction using the discrete time extended Kalman filter with 2 image pairs to build the image plane measurement, zk. The dark hole is a square opening from 7–10 × -2–2 λ/D on both sides of the image plane. (a) The aberrated image. (b) Contrast plot. (c) The corrected image. Image units are log(contrast).
Reducing the number of measurements to a single pair we find a very interesting result.
The quality of the measurement at any particular time step of the algorithm is now dependent
on the quality of that particular probe shape. As a result, if the probe does not happen to
modulate the field well, the field estimate gets worse. It is also important to cycle through
the probe shapes. A single probe may not modulate a specific location of the field well, so we
must choose a different probe shape to guarantee that we adequately cover the entire dark
hole. Starting from an aberrated field with an average contrast of 9.418 × 10^−5, Fig. 5.5(a),
we achieved a contrast of 3.1 × 10^−7 in 30 iterations and 2.5 × 10^−7 in 43 iterations of control,
Fig. 5.5(c). Looking at the contrast plot in Fig. 5.5(b), the sensitivity of a single measurement
update to the quality of the probe is very clear. What is interesting, however, is that the
modulation damps out over the control history. While we do not suppress as quickly in earlier
iterations, as in the case with more probes, we achieve our ultimate contrast levels in almost
the exact same number of iterations. This is a direct result of developing good coverage
across the dark hole over time by changing the probe shape at each iteration. Thus, even
with one measurement update at each iteration the prior state estimate history stabilizes the
estimate in the presence of the measurement update’s poor signal-to-noise at high contrast
levels. What is further encouraging is that this is still applying the arbitrary probe shapes
derived in Ch. 3. If we were to intelligently choose our probes, as discussed in Ch. 4, we would
expect to see a dramatic improvement in the rate of convergence for a single measurement update.
[Figure 5.5 panels: (a) Aberrated Field; (b) Contrast Plot of average, left, and right contrast versus iteration; (c) Corrected Contrast 2.5 × 10^−7. Image axes are in λ0/D; color scale is log10(contrast).]
Figure 5.5: Experimental results of sequential DM correction using the discrete time extended Kalman filter with one image pair to build the image plane measurement, zk. The dark hole is a square opening from 7–10 × -2–2 λ/D on both sides of the image plane. (a) The aberrated image. (b) Contrast plot. (c) The corrected image. Image units are log(contrast).
A very promising aspect of this estimation scheme is that its performance did not degrade
significantly as the amount of measurement data was reduced. With only 86 estimation
images it was capable of reaching the same final contrast (within measurement uncertainty)
achieved by the DM diversity algorithm in §5.1.1, which achieved a contrast of 2.5 × 10^−7
in 30 iterations. The batch process required 240 images to maintain an estimate of the
entire control history, achieving a contrast of 2.3 × 10^−7. Thus by making the estimation
method more dependent on a model we were able to reduce our need to measure deterministic
perturbations in the image plane electric field.
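The image counts quoted in this section follow from taking two exposures (positive and negative probe) per measurement pair at every control iteration. The short check below reproduces them; the function name `estimation_images` is ours, and the pair and iteration counts are the ones stated in §5.1.1 and §5.1.2.

```python
def estimation_images(n_pairs, n_iterations):
    """Total estimation exposures: two images per probe pair, every iteration."""
    return 2 * n_pairs * n_iterations

batch_4_pairs_30 = estimation_images(4, 30)    # DM diversity baseline, Sec. 5.1.1
kalman_4_pairs_20 = estimation_images(4, 20)   # Kalman filter, four pairs
kalman_2_pairs_30 = estimation_images(2, 30)   # Kalman filter, two pairs
kalman_1_pair_43 = estimation_images(1, 43)    # Kalman filter, single pair

print(batch_4_pairs_30, kalman_4_pairs_20, kalman_2_pairs_30, kalman_1_pair_43)
```

By the same relation, the 120 images quoted for the three-pair case correspond to 20 iterations.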
5.2 Broadband Performance
To take spectra of any planets we discover in our dark hole, we need to extend the experimental
results of §5.1 to broadband light so that the planet is detectable in more than one
wavelength. In §2.4 we developed the Windowed Stroke Minimization algorithm with estimate
extrapolation to accomplish this task. Here we show the results of these experiments,
and point to an interesting laboratory limitation that required an upgrade of the optical fiber
used as the point source in the experiment. The results in §5.2.1 are prior to this upgrade,
and are included for the sake of comparison. §5.2.2 shows the most current results from the
Princeton HCIL producing symmetric dark holes in a targeted 10% band around λ0 = 633
nm. In all cases, the results we present are in a dark hole region with dimension (7–10) ×
(−2–2) λ0/D. The contrast measurement is pinned to the central wavelength so that we can
pin the performance to a fixed sky angle, α, defined as α = tan^−1(nλ0/D). In a 10% band
the physical shift is less than one pixel at the HCIL, and the controller corrects an area of
(6–11) × (−3–3) λ0/D. In this way we do not have areas that the controller has not corrected
leaking light into the dark hole, skewing our measurement of the controller performance.
5.2.1 Prior to Single Mode Photonic Crystal Fiber
In the first broadband experiments at the HCIL, the output fiber shown in Fig. 1.5 was
simply a 633 nm single mode fiber. The correction was performed at 620 nm rather than 633
nm as in the other experiments, partially because of filter availability. In this experiment
we have tested the performance of the Windowed Stroke Minimization algorithm of §2.4
over a 10% bandwidth. The estimate for the filters bounding the 10% target bandwidth
are computed using the estimate extrapolation technique developed in §2.5. Starting at
an average contrast of 1.2740 × 10^−4 (Fig. 5.6(c)) over the five filters spanning our 10%
bandwidth (600, 620, 633, 640, 650 nm), Fig. 5.6(d) shows an average contrast of 6.15 × 10^−6
when we use the filter extrapolation technique. Note that while the central wavelength of the
650 nm filter does not exactly reach 5% above our central wavelength, it has a relatively wide
bandwidth that reaches 558 nm at its FWHM. Starting at a contrast level of 1.0529 × 10^−4
over the full bandwidth, the controller reached a contrast limit of 1.842 × 10^−5.
[Figure 5.6 panels: (a) Pre-PCSM Extrapolation Results, showing average, left, right, and initial contrast versus λ; (b) Pre-PCSM Full Bandwidth 1.84 × 10^−5; (c) Pre-PCSM 10% Mean Initial = 1.27 × 10^−4; (d) Pre-PCSM 10% Mean Contrast = 6.15 × 10^−6. Image axes are in λ0/D; color scale is log10(contrast).]
Figure 5.6: Pre-PCSM Extrapolated results
Looking at the wavelength performance, we see that even the central wavelength is not
suppressed particularly well. The dark holes exhibit a good average contrast, but there is a
lot of variance within them and their edges are not well defined. Additionally, we see a rapid
degradation of the dark hole field as a function of wavelength to the point where it is virtually
indistinguishable when we reach the bounding wavelengths in the optimization at 600 and
650 nm. Compared to a typical monochromatic experiment, these images depict an abnormally
high amount of structure in the dark hole and appear to be highly sensitive to variance in low
to mid-spatial frequency aberrations. The chromatic dependence of these errors, particularly
at the shorter wavelengths, indicates that the 633 nm single mode fiber is inadequate for
the broadband experiments. This is a result of the multimode output (primarily TEM01
and TEM10 modes) at shorter wavelengths and the high degree of attenuation at longer
wavelengths. We chose to reproduce the results in this section after upgrading the fiber in
the laboratory.
103
5.2.2 Photonic Crystal Single Mode Fiber Upgrade
Given the non-single mode nature of the output beam at shorter wavelengths (and our
sensitivity to such aberrations), the poor coupling efficiency, and the high attenuation of the
633 nm single mode fiber at longer wavelengths, we chose to upgrade the fiber delivery to a
Koheras Photonic Crystal continuously Single Mode (PCSM) fiber. We chose a fiber option
that fully spanned the bandwidth we operate over, with a 5 µm core. This provided the
smallest mode field diameter available, 4.5 – 4.7 ± 0.5 µm, with a numerical aperture
(NA) of ≈0.1 – 0.14 across the visible spectrum (NA being the sine of the divergence half-angle).
This is comparable to the 4.3 – 4.6 µm mode field diameter and NA 0.10 – 0.14 of
a single mode 633 nm fiber between 633 and 680 nm. Overall, the PCSM fiber has a lower
level of attenuation, is continuously single mode, and roughly matches the beam divergence
angle expected from the original fiber, which we have found in the past to approximate
a point source well. Since the field from a star is effectively planar, our ability to provide
single-mode light at all wavelengths allows us to more accurately demonstrate the controller under
conditions true to a real observation. Fig. 5.8 shows the overall results of applying the same
extrapolation technique after the new fiber had been installed. Fig. 5.8(a) shows marked
contrast improvement at all wavelengths, the out of band wavelengths improving on the order
of 3 × 10−5. As we would have expected, the shorter wavelengths improved more than the
longer wavelengths because their output no longer contains higher TEM modes. Comparing
Fig. 5.8(b) to Fig. 5.6(b) we also see that we have a slight improvement in the inner working
angle of the dark hole, which is consistent with the fact that we eliminated very low order
modes, such as TEM01 and TEM10, by upgrading to the new fiber. Very little of the energy
in Fig. 5.6(d) is (intentionally) below the cutoff wavelength for single mode output of the
633 nm SM fiber, which is why the IWA improvement is not as evident when comparing to
Fig. 5.8(d). Looking at the progression of the final dark hole in wavelength, Fig. 5.9, we see
that the central wavelength is deeply suppressed while the intensity of the dark hole rises
rapidly away from it. For the filters inside the ≈ 10% optimization bandwidth (600, 620, 640, and 650
nm) we see that the contrast degradation is a result of small scale aberrations growing in
intensity. Outside of these wavelengths the dark hole degrades rapidly to the point that it is
not distinguishable in the 550 and 740 nm images. While the average contrast does degrade
from the slight shift in the dark hole location, it is also due to speckles within the dark hole
increasing in intensity. This indicates that we are somewhat limited by the accuracy of our
extrapolation, which tends to introduce fine structure into the dark hole. Note, however,
that when comparing Fig. 5.9(e) with Fig. 5.7(d) we see that the fine structure at the central
wavelength is gone. This is entirely due to the fiber upgrade, since no other modifications
were made to the experiment.
The accuracy of the functional relationship of the phase and amplitude among wave-
lengths will ultimately bound the achievable bandwidth; therefore, as a metric, these results
are also compared to estimating each wavelength separately. As discussed in §2.5, improv-
ing this functional relationship requires that we establish a higher order relationship of the
electric field that captures more of the system model. For the time being, we compare the
performance of the simplest (and fastest) extrapolation technique we may physically mo-
tivate to multiple estimates, which will be slower but presumably more accurate at longer
wavelengths.
Fig. 5.10 shows the overall performance of multiple estimates vs. single estimates. When
estimating each wavelength separately the contrast reaches 5.67 × 10−6 in a ∼ 10% band
(Fig. 5.10(d)) and 1.364× 10−5 over the full bandwidth (Fig. 5.10(b)). There is no improve-
ment compared to the 5.48×10−6 contrast achieved in the 10% band and 1.298×10−5 contrast
over the full spectrum using the estimate extrapolation technique. Shaklan et al. [65] show
that the ultimate achievable contrast is a function of the correction bandwidth. They show
that this limitation is from propagation induced amplitude distributions in the field from
surface figure errors on the optics, and the fact that we have a finite controllable bandwidth
using two DMs in series (or a Michelson configuration). If we assume that the DM surfaces
(Fig. 1.10) are the worst figures in our system and apply this to the derivation in Shaklan
et al. [65], the HCIL optical system should be capable of reaching at least 1 × 10−6 over a
20% bandwidth, indicating that both methods are well above the fundamental limitations
of this optical system (Figs. 5.1(b), 5.10(a)) and these results are largely limited by higher
sensitivity to estimation error and system stability. Comparing the contrast as a function of
wavelength in Fig. 5.8(a) and Fig. 5.10(a), the bandwidth has been suppressed much more
uniformly when multiple estimates are used in lieu of the extrapolation technique. Since the
bounding wavelengths were only slightly underweighted in the optimization (µ = 0.75) we
expected a relatively uniform suppression as in Fig. 5.10(a). This indicates that the accuracy
of the extrapolation was the limiting factor in allowing the controller to evenly suppress the
bandwidth. However, the ultimate contrast of the central wavelength is not nearly as low
with direct estimates as it was when applying the estimate extrapolation. Comparing the dark holes at
the central wavelength, we see that the dark hole using direct measurements (Fig. 5.11(e))
exhibits much more residual structure than the one using estimate extrapolation (Fig. 5.9(e)). However,
Fig. 5.11(c)–Fig. 5.11(g) show that the region bounding the corrected area persists better
across these wavelengths than it does in the extrapolation case, Figs. 5.9(c)–5.9(g).
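As a minimal sketch relating the per-filter images to the band averages compared above, the short script below averages the contrast values printed in the panel titles of Figs. 5.7, 5.9, and 5.11 for the five filters in the ≈10% band. The use of an unweighted arithmetic mean, and the names PER_FILTER and band_mean_contrast, are assumptions made for illustration; this is not the laboratory analysis code.

```python
# Per-filter dark hole contrasts copied from the panel titles of
# Figs. 5.7, 5.9, and 5.11 (wavelength in nm -> contrast).
PER_FILTER = {
    "pre_pcsm_extrapolated": {600: 9.4195e-06, 620: 2.3727e-06, 633: 5.1867e-06,
                              640: 5.7065e-06, 650: 8.0541e-06},
    "extrapolated": {600: 9.5157e-06, 620: 5.7468e-06, 633: 8.1868e-07,
                     640: 4.5843e-06, 650: 6.7336e-06},
    "direct": {600: 6.3108e-06, 620: 4.8845e-06, 633: 4.0046e-06,
               640: 6.5068e-06, 650: 6.6701e-06},
}


def band_mean_contrast(contrasts):
    """Unweighted arithmetic mean over the filters (an assumed averaging rule)."""
    return sum(contrasts.values()) / len(contrasts)


BAND_MEANS = {name: band_mean_contrast(vals) for name, vals in PER_FILTER.items()}

for name, value in BAND_MEANS.items():
    print(f"{name}: {value:.3e}")
```

Under this assumption the three means agree, to three significant figures, with the 6.15 × 10^−6, 5.48 × 10^−6, and 5.67 × 10^−6 values quoted for the 10% band.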
Since both reached roughly the same average contrast in the 10% band, we may have
fundamentally bottomed out the achievable contrast for symmetric dark holes (at the HCIL)
over that bandwidth. In other words, we can either have all five filters at a modest con-
trast level or we can have one wavelength highly suppressed at the cost of worse contrast
in the others. This would be related to the inherent chromaticity of our pupil from highly
aberrated, non-conjugate planes. This could be beyond the effective bandwidth achievable
using only two DMs in series. However, another distinct possibility is that we have reached
a stability limit in the experimental setup. Since we required three individual estimates to
achieve the results shown in Fig. 5.10, the estimation step took roughly three times longer
than in the extrapolation case. The low power of the filtered broadband light requires exposure
times of ≈40 seconds. With 8 exposures required per estimate using the batch process
estimator, the estimation step went from ≈5 to ≈15 minutes per iteration. As
will be shown in §6.5, the system is only stable to ≈2–5 × 10^−7 over such a long period
(independent of power fluctuations). Thus, the extrapolation method reached the limit of
system variance over a 5 minute interval at the central wavelength, but at the cost of less
accurate estimates over the bandwidth due to an inaccurate extrapolation. On the other
hand, the longer time frame required to take multiple estimates meant that we compromised
the stability of the experiment but we were able to more evenly suppress the field over the
bandwidth. As a result, we cannot prove that we have reached a fundamental limit in the
laboratory without getting more laser power or improving system stability. The sensitivity of
the correction algorithm to laboratory stability demonstrates the power of the extrapolation
technique. To take full advantage of an observatory’s stability, we clearly want to reduce
the time required to produce estimates of the electric field over the optimization bandwidth.
Furthermore, the advantage of establishing an augmented cost function and using extrapolated
wavelengths is that it automatically extends the optimal estimator developed in Ch. 4
to broadband light because this method only requires a single monochromatic estimate. It
is therefore worthwhile to continue pursuing more accurate and sophisticated extrapolation
techniques. The most promising direction we currently see is to augment the Kalman filter
to include the extrapolation. This potentially allows us to produce estimates of multiple
wavelengths using incomplete measurements at every wavelength. Thus, the estimator would
not produce an independent estimate at each wavelength, but would instead average the uncertainty in the
wavelength dependence across all three estimates.
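The timing argument above can be checked with a few lines of arithmetic. The sketch below uses only the ≈40 second exposure time and 8 exposures per estimate quoted in this section; it counts exposure time alone and ignores readout and DM settling, which is an assumption. The function name estimation_time_minutes is hypothetical.

```python
EXPOSURE_TIME_S = 40.0       # approximate exposure time quoted in the text
EXPOSURES_PER_ESTIMATE = 8   # batch process estimator


def estimation_time_minutes(n_estimates,
                            exposures=EXPOSURES_PER_ESTIMATE,
                            exposure_time_s=EXPOSURE_TIME_S):
    """Exposure time per control iteration, ignoring readout and settling."""
    return n_estimates * exposures * exposure_time_s / 60.0


EXTRAPOLATION_MIN = estimation_time_minutes(1)  # one monochromatic estimate
DIRECT_MIN = estimation_time_minutes(3)         # three separate estimates

print(f"extrapolation: {EXTRAPOLATION_MIN:.1f} min per iteration")
print(f"direct:        {DIRECT_MIN:.1f} min per iteration")
```

These come out near 5 and 16 minutes, consistent with the ≈5 and ≈15 minute figures given for the two approaches.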
5.3 Final Remarks
In this chapter we have demonstrated the Kalman filter estimator, the windowed stroke
minimization algorithm, an estimate extrapolation technique, and the ability to create sym-
metric dark holes via two DMs in series. The Kalman filter was tested against the original
DM-Diversity batch process method using the monochromatic Stroke Minimization algo-
rithm developed by Pueyo et al. [55]. The experiments show that the Kalman filter’s ability
to optimally estimate the field, balancing prior state estimate feedback with new measurements,
dramatically boosts the efficiency of the estimation stage and stabilizes the estimate
at higher contrast levels. This efficiency is a function of both the number of exposures and
the exposure time. As it stands, the HCIL suppresses three orders of magnitude, which is
well within the dynamic range of our camera. As a result, exposure time does not affect the
efficiency of the results shown in this section. However, as we continue to higher contrast
levels (∼ 10^4 for a 16 bit camera) the exposure time at each iteration will also contribute
to the efficiency since we will have to change the exposure time. In this case state estimate
feedback will become even more important because the reliance on prior estimates will
reduce our dependence on extremely long exposure times at very high contrast levels. Overall,
the results presented here show that half the number of exposures is required for estimation
without sacrificing achievable contrast or dark hole area. Additionally, thanks largely to
model improvements, the convergence rate to the ultimate achievable contrast has increased
dramatically. The monochromatic wavefront suppression has matured greatly.
The Windowed Stroke Minimization algorithm is the first controller written so that it
explicitly suppresses a bandwidth, and the initial results are promising. The extrapolation
technique allows us to further improve the efficiency of the estimation stage by removing
the requirement that we obtain a field estimate for every wavelength in the optimization.
The ultimate achievable contrast in a 10% band is a little more than one order of magnitude
worse than the best performance demonstrated with the monochromatic algorithm, but this
is within a factor of two of the original symmetric dark hole results using monochromatic
light presented in Pueyo et al. [55]. There is still a great deal of work to be done to directly
optimize the bandwidth and improve the estimate extrapolation, but given the simplicity of
the physically motivated computations the results shown in §5.2.2 are very promising.
Figure 5.7: Pre-PCSM Extrapolate Individual Filters. Dark hole images (axes in λ0/D, color scale log10(contrast) from −6 to −4) with the contrast given in each panel title: (a) λ = 550 nm, 2.4183e-05; (b) λ = 577 nm, 1.4965e-05; (c) λ = 600 nm, 9.4195e-06; (d) λ0 = 620 nm, 2.3727e-06; (e) λ = 633 nm, 5.1867e-06; (f) λ = 640 nm, 5.7065e-06; (g) λ = 650 nm, 8.0541e-06; (h) λ = 670 nm, 1.3266e-05; (i) λ = 694 nm, 1.8555e-05; (j) λ = 740 nm, 2.3959e-05.
Figure 5.8: Extrapolated results. (a) δ1 = δ2 = 0.75 Contrast Plot: contrast vs. λ (average, left, right, and initial contrast). (b) Extrapolated Full Bandwidth, 1.298 × 10^−5. (c) Extrapolated 10% Mean Initial = 1.02 × 10^−4. (d) Extrapolated 10% Mean Contrast = 5.48 × 10^−6. Image axes are in λ0/D; the color scale is log10(contrast).
Figure 5.9: Extrapolated Estimate Individual Filters. Dark hole images (axes in λ0/D, color scale log10(contrast) from −6 to −4) with the contrast given in each panel title: (a) λ = 550 nm, 1.6352e-05; (b) λ = 577 nm, 9.1212e-06; (c) λ = 600 nm, 9.5157e-06; (d) λ = 620 nm, 5.7468e-06; (e) λ0 = 633 nm, 8.1868e-07; (f) λ = 640 nm, 4.5843e-06; (g) λ = 650 nm, 6.7336e-06; (h) λ = 670 nm, 1.6055e-05; (i) λ = 694 nm, 1.5587e-05; (j) λ = 720 nm, 1.8526e-05; (k) λ = 740 nm, 2.7381e-05.
Figure 5.10: Direct Estimate results. (a) Contrast Plot, Multiple Estimates 10% Band: contrast vs. λ (average, left, right, and initial contrast). (b) Direct Full Bandwidth, 1.364 × 10^−5. (c) Direct 10% Mean Initial = 9.83 × 10^−5. (d) Direct 10% Mean Contrast = 5.67 × 10^−6. Image axes are in λ0/D; the color scale is log10(contrast).
Figure 5.11: Direct Estimate Individual Filters. Dark hole images (axes in λ0/D, color scale log10(contrast) from −6 to −4) with the contrast given in each panel title: (a) λ = 550 nm, 1.9162e-05; (b) λ = 577 nm, 1.3395e-05; (c) λ = 600 nm, 6.3108e-06; (d) λ = 620 nm, 4.8845e-06; (e) λ0 = 633 nm, 4.0046e-06; (f) λ = 640 nm, 6.5068e-06; (g) λ = 650 nm, 6.6701e-06; (h) λ = 670 nm, 1.4555e-05; (i) λ = 694 nm, 1.777e-05; (j) λ = 720 nm, 2.4958e-05; (k) λ = 740 nm, 4.0198e-05.
Chapter 6
Sources and Propagation of Error
To fully understand the performance of the estimation and control algorithms demonstrated
in Ch. 5, we must identify the sources of error and determine how they affect the contrast
performance. Since this becomes an issue of accuracy and precision, it affects both the
ultimate achievable contrast and the rate of convergence. For example, in §5.1.2 the average
contrast in the dark hole for the first three iterations was 1.3934× 10−4, 5.303× 10−5, and
1.827 × 10−5. Within 30 iterations we reached a contrast floor of 2.3 × 10−7. We need to
understand what limited our ultimate achievable contrast, but we also need to understand
what limited our convergence rate. We did not have to re-linearize the controller to achieve
this contrast level, so presumably we should have been able to control to a greater level of
precision in the initial iterations to achieve our ultimate contrast in fewer iterations. We
need to understand what limited the accuracy and precision of our estimate at 1.3934×10−4
so that we only reached a contrast level of 5.303 × 10−5 in the first control step, rather
than being able to get close to our ultimate achievable contrast in a single iteration of the
correction algorithm. In this chapter we will investigate experimental limitations that affect
our ultimate achievable contrast, as well as those that will limit our performance in a single
iteration. These limitations can either be model dependent or a limitation of the actual
experiment. To maximize the efficiency of the controller, we would eventually like to reach
a point where each iteration is limited only by the precision of the experiment.
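To make the convergence question concrete, the sketch below computes the suppression factor achieved by each of the first two control steps from the three contrast values quoted above, and the factor still separating the third value from the 2.3 × 10^−7 floor. It is only arithmetic on the quoted numbers; the names CONTRAST_HISTORY and suppression_factors are hypothetical.

```python
# Average dark hole contrast for the first three iterations (from §5.1.2)
CONTRAST_HISTORY = [1.3934e-4, 5.303e-5, 1.827e-5]
CONTRAST_FLOOR = 2.3e-7  # contrast floor reached within 30 iterations


def suppression_factors(history):
    """Ratio of contrast before and after each control step."""
    return [before / after for before, after in zip(history, history[1:])]


FACTORS = suppression_factors(CONTRAST_HISTORY)
REMAINING = CONTRAST_HISTORY[-1] / CONTRAST_FLOOR

print("per-step suppression:", [round(f, 2) for f in FACTORS])
print("factor remaining to the floor:", round(REMAINING, 1))
```

Each early step gains less than a factor of three, while nearly two orders of magnitude remain between the third value and the floor, which is the gap in single-iteration performance this chapter sets out to explain.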
6.1 Precision of a Contrast Measurement
Throughout this thesis we have measured the intensity distribution in the image plane using
dimensionless units that we commonly refer to as contrast because the values are normalized to
the peak value of the star’s PSF. In this section we will develop how we calibrate the image
by normalizing the peak of the PSF, and discuss the implications of such a normalization on
the estimator when the intensity is discretized by the detector. Knowing the detector bias,
$b$, and the peak counts, $q_p$, of the PSF’s main lobe, the contrast of any pixel with $q$ counts in a
single image is given by
\[ 10^{-C} = \frac{q - b}{q_p - b}. \tag{6.1.1} \]
The values of $q$ and $q_p$ assume a specific exposure time, $t$, and pixel dimensions, $(dx, dy)$. By
finding the peak count rate per unit time per unit area,
\[ I_{00} = \frac{q_p - b}{t\,dx\,dy}, \quad [\text{counts/sec/m}^2\text{/contrast}] \tag{6.1.2} \]
we can evaluate the normalization constant for arbitrary exposure time and pixel binning.
Thus, the contrast of any pixel in the image with $q$ counts for an image with arbitrary
exposure time and effective pixel dimension, $(dx, dy)$, is
\[ 10^{-C} = \frac{q - b}{t\,dx\,dy\,I_{00}}. \tag{6.1.3} \]
Assuming we do not wish to change our spatial sampling of the image plane, $(dx, dy)$ is fixed
and we can simply define our normalization as
\[ I_0 = I_{00}\,dx\,dy. \tag{6.1.4} \]
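The calibration in Eqs. 6.1.1 through 6.1.4 can be sketched as two small functions. All numerical values below (peak counts, exposure times, and pixel size) are hypothetical placeholders chosen for illustration rather than HCIL calibration data, and the function names are likewise assumptions.

```python
def normalization_i00(q_peak, bias, t_peak, dx, dy):
    """Eq. 6.1.2: peak count rate per unit time per unit area."""
    return (q_peak - bias) / (t_peak * dx * dy)


def contrast(q, bias, t, dx, dy, i00):
    """Eq. 6.1.3: contrast of a pixel with q counts at exposure time t."""
    return (q - bias) / (t * dx * dy * i00)


# Hypothetical calibration exposure of the unsaturated PSF peak
BIAS = 3380.0        # detector bias in counts
Q_PEAK = 50000.0     # peak counts of the PSF main lobe (placeholder)
T_PEAK = 1.0e-3      # exposure time of the calibration image in seconds (placeholder)
DX = DY = 13.0e-6    # pixel dimensions in meters (placeholder)

I00 = normalization_i00(Q_PEAK, BIAS, T_PEAK, DX, DY)

# The peak pixel in the calibration image has unit contrast by construction (Eq. 6.1.1)
PEAK_CONTRAST = contrast(Q_PEAK, BIAS, T_PEAK, DX, DY, I00)

# A pixel 100 counts above the bias in a 100x longer exposure
DARK_PIXEL_CONTRAST = contrast(BIAS + 100.0, BIAS, 0.1, DX, DY, I00)

print(PEAK_CONTRAST, DARK_PIXEL_CONTRAST)
```

Because the pixel dimensions cancel between Eqs. 6.1.2 and 6.1.3, the same result follows from the single constant of Eq. 6.1.4 whenever the spatial sampling is held fixed.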
Knowing how we evaluate the contrast in a single image, we want to understand the
precision with which we can measure the contrast at any point in the image plane. We
quantify this as the contrast value per count on the detector, given by
\[ \frac{10^{-C}}{q - b} = \frac{1}{t I_0}. \quad [\text{contrast/count}] \tag{6.1.5} \]
We define this as the contrast resolution of the image. Looking at Eq. 6.1.5 we see that
the contrast resolution is a function of integration time and the normalization constant. For a
fixed input power from the light source, $I_0$ will not change. Thus, for a fixed pixel area our
contrast resolution will be limited by the detector bias, the number of bits in the analog-to-digital converter (ADC), and
our exposure time. The bias and number of bits are ultimately fixed quantities for a given
detector, so integration time is our only variable in determining the contrast resolution. From
Eq. 6.1.5, we see that by maximizing our exposure time we can evaluate the best possible
contrast resolution,
\[ \frac{10^{-C_{\min}}}{q} = \frac{1}{t_{\max} I_0}. \quad [\text{contrast/count}] \tag{6.1.6} \]
To evaluate our best contrast resolution, we must first calculate the maximum allowable
exposure time. This is not as obvious as it would first appear. Our first requirement is that
we do not saturate the detector. We choose an exposure time that gets the pixel with the
worst contrast, $10^{-C_{\max}}$, as close to the maximum allowable counts, $(q_{\max} - b)$, as possible
without overshooting. However, we would also like the pixel with the highest contrast value in our
estimation area, $10^{-C_{\mathrm{estmax}}}$, to stay within the linear response of the detector. This imposes
a ceiling of $(q_{\mathrm{estmax}} - b)$ counts on this pixel, which is lower than $(q_{\max} - b)$. Thus, our
maximum allowable exposure time will be
\[ t_{\max} = \min\left[ \frac{q_{\max} - b}{I_0\,10^{-C_{\max}}},\ \frac{q_{\mathrm{estmax}} - b}{I_0\,10^{-C_{\mathrm{estmax}}}} \right]. \tag{6.1.7} \]
To find the contrast resolution in a single image, we consider the contrast measured per
count. Having found $t_{\max}$, this resolution is defined as
\[ \frac{10^{-C_{\min}}}{q} = \frac{1}{t_{\max} I_0}. \quad [\text{contrast/count}] \tag{6.1.8} \]
Applying Eq. 6.1.7 to Eq. 6.1.8 we find that the contrast resolution for a single exposure is
\[ \frac{10^{-C_{\min}}}{q} = \max\left[ \frac{10^{-C_{\max}}}{q_{\max} - b},\ \frac{10^{-C_{\mathrm{estmax}}}}{q_{\mathrm{estmax}} - b} \right]. \tag{6.1.9} \]
To decide which contrast value is the limiting factor, we must look to the control history.
In the early stages of wavefront control $10^{-C_{\mathrm{estmax}}}$ will be larger than at later iterations, and
may compete as the limiting factor on exposure time simply because the controller has not
suppressed it yet. In a classical correction scheme, where we allow the field to be suppressed
at each iteration, this value will continue to decrease and will not be the limiting factor on
$t_{\max}$. Likewise, it is at these higher contrast levels that we become more concerned with the
precision of our estimate since it will become a more significant fraction of the measured
intensity. Thus, in this correction scenario we are more concerned with the peak contrast
value, $10^{-C_{\max}}$, and how it limits the exposure time. As an example, the 16-bit camera used in
the Princeton HCIL has 65,536 counts with a bias of approximately 3,380 counts. In a typical
experiment at the Princeton HCIL, the peak contrast level in the aberrated field is limited to
approximately 1 × 10^−3 and $I_{00}$ is approximately 107000 counts/sec/m^2/contrast with 2 mW
of laser power. When we probe the field we can conservatively expect the perturbation to
double this value, particularly in earlier iterations where the probes have greater amplitude.
To stay well away from saturation even in the presence of noise fluctuations, this peak
contrast level should not exceed 60,000 counts. As a result, the maximum allowable exposure
time is ≈ 280 ms, making the best possible contrast resolution ≈ 3 × 10^−8. Fortunately, higher
levels of precision are possible simply by frame adding. Neglecting read-noise and detector
dark current, the maximum allowable counts can simply be multiplied by the number of
exposures. The lowest achievable detection limit is then limited by read noise. This has
been considered in great detail by Savransky [62], and this work even points towards a
requirement that we use a photon counting detector in a true planet finding mission. The
best contrast value measured in symmetric dark holes at the HCIL is 2.3 × 10−7. Even at
these contrast levels the precision of our measurement has become a significant percentage
of the measured value. At the Princeton HCIL we actively purge warm nitrogen over a
thermoelectrically cooled CCD so that we do not require a window on the detector. Since
we have applied a convective heating source to the detector it reaches thermal equilibrium at
a higher temperature. The constant flux problem also means that the thermoelectric coolers
cannot stabilize the temperature as well. The net result is that the residual detector noise
will vary from image to image by approximately 10 counts (see Fig. 4.2). Thus, both the
precision of a single image and the variance in the detector background for a single image
are very near our minimum achievable contrast and we find that we must average at least
two frames to maintain this contrast level.
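The worked example above can be reproduced with a short sketch of Eqs. 6.1.7 and 6.1.9. The count ceiling, bias, peak contrast, and 10 count background variation are the values quoted in this section; I0_PLACEHOLDER is a hypothetical normalization used only to confirm that the two equations agree, and the function names are assumptions.

```python
def max_exposure_time(i0, limits):
    """Eq. 6.1.7: limits is a list of (count ceiling above bias, contrast) pairs."""
    return min(counts / (i0 * contrast) for counts, contrast in limits)


def contrast_resolution(limits):
    """Eq. 6.1.9: contrast per count at the maximum allowable exposure time."""
    return max(contrast / counts for counts, contrast in limits)


BIAS = 3380                # approximate detector bias in counts
Q_CEILING = 60000          # count ceiling chosen to stay away from saturation
PEAK_CONTRAST = 2 * 1e-3   # aberrated peak of ~1e-3, doubled by the probes
LIMITS = [(Q_CEILING - BIAS, PEAK_CONTRAST)]

RESOLUTION = contrast_resolution(LIMITS)

# Background variation of ~10 counts between images, expressed in contrast
NOISE_COUNTS = 10
NOISE_CONTRAST = NOISE_COUNTS * RESOLUTION

# Placeholder normalization (not the calibrated value) to check Eq. 6.1.8
I0_PLACEHOLDER = 1.0e8
T_MAX = max_exposure_time(I0_PLACEHOLDER, LIMITS)

print(f"contrast resolution:  {RESOLUTION:.2e} per count")
print(f"10 counts of noise:   {NOISE_CONTRAST:.2e} in contrast")
```

The resolution evaluates to about 3.5 × 10^−8 per count, so a 10 count variation corresponds to a few times 10^−7 in contrast, the same order as the 2.3 × 10^−7 floor, which is why frame averaging is needed at that level.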
It should be noted that the maximum exposure time is not necessarily recommended.
Perturbations due to model error or read error can push the maximum value beyond its
allowable limit, so a factor of safety should be applied in the exposure time. If the read noise
and additional readout time can be tolerated, it is always better to read out more frequently
and average images as a way of mitigating risk in acquiring the field. Additionally, if we
consider a correction scheme as in Ch. 4 where we do not suppress the field as we increase
our precision in the estimate, the choice made in Eq. 6.1.7 to find $t_{\max}$ is not as clear since
the contrast level in the correction area does not go down as we estimate the field. This
decision will have to be made dynamically since it will be a function of whether the exposure
is intended for estimation or measuring the control effect. In the end, which term dominates
the maximum allowable exposure time when $10^{-C_{\mathrm{estmax}}}$ is still large will be a function of
the quality of the alignment since this will shift energy into the low working angles we are
concerned with. If our model is good enough, it is the maximum allowable exposure time
that will fundamentally bound the achievable contrast in a single iteration.
6.2 Estimation Algorithms and Propagating Error
Inaccuracies in the system model are the largest source of error in the experiment. These
typically are a result of approximations or measurement error of critical physical parameters.
Model error affects both the estimator and the controller, doubling its effect on the total
error. Specifically, these model errors will induce errors in the control effect matrix, Eq. 2.1.14,
and the observation matrix, Eq. 3.3.5. One critical constant used in all of our transforms
is the value of λf/D, the scaling parameter that appears in the exponential of the Fourier
transform, Eq. 1.5.25. An error in this value has one of the most dramatic effects, because
beyond producing the incorrect transform it also affects which pixels we sample for the dark
hole. If the value of λf/D is off by the physical dimension of a single pixel we will sample
the image plane incorrectly in the estimate and our control effect will appear in regions
unpredicted by the model. Since we can generally have a very high level of certainty in D
and λ, this error budget is usually left to uncertainty in the effective focal length of the
system. Note that even sub-pixel shifts can have an effect on the estimate because the DM
commands produce a continuous distribution and energy will unintentionally be shifted from
the edge of one pixel into another during estimation. For this reason, we find that knowing
λf/D to within a millimeter or less is very important to minimize the impact of model error
on the estimation and control. Fortunately, such a dramatic response to this error means
that we have the sensitivity to measure this properly and calibrate it by applying sinusoidal
functions to the DMs. Two model errors that are a bit more tenacious are uncertainty in
the propagation distance and uncertainty in the DM shape.
An error in the propagation distance between a DM and the pupil affects the phase of
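the transfer function used in the estimation and control algorithms. Suppose we evaluate the DM
transfer function in Eq. 1.6.27 with an incorrectly assumed propagation distance, $\tilde{z}$, which
has an error of $\delta z$. For the true distance, $z$, we describe the measurement error as
$\tilde{z} = z + \delta z$. The computed transfer function with the incorrect propagation distance, $\tilde{C}\{\cdot\}$,
for the DM actuation is
\begin{align}
\tilde{C}\{A\phi\} &= e^{-i\frac{\pi \tilde{z}}{\lambda f^2}(x^2+y^2)}\,\mathcal{F}\{A\phi\} \tag{6.2.1}\\
&= e^{-i\frac{\pi (z+\delta z)}{\lambda f^2}(x^2+y^2)}\,\mathcal{F}\{A\phi\} \notag\\
&= e^{-i\frac{\pi\,\delta z}{\lambda f^2}(x^2+y^2)}\,e^{-i\frac{\pi z}{\lambda f^2}(x^2+y^2)}\,\mathcal{F}\{A\phi\} \notag\\
&= e^{-i\frac{\pi\,\delta z}{\lambda f^2}(x^2+y^2)}\,C\{A\phi\}, \tag{6.2.2}
\end{align}
where $C\{A\phi\} = e^{-i\frac{\pi z}{\lambda f^2}(x^2+y^2)}\,\mathcal{F}\{A\phi\}$ is the transfer function evaluated at the true distance.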
Eq. 6.2.2 shows that we introduce an additional phase perturbation to the DM transfer
function that scales quadratically with working angle. Since we rely on this model to estimate
the field via probe shapes, Eq. 4.1.20, this perturbation also appears in the electric field
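estimate. Decomposing the DM command into spatial frequencies $m_0$ in the $u$
direction and $n_0$ in the $v$ direction (in cycles per aperture $D$) with arbitrary phase shift $\theta$, we find that
\begin{align}
\mathcal{F}\{\phi\} &= \mathcal{F}\left\{\cos\left(\frac{2\pi}{D}(m_0 u + n_0 v) + \theta\right)\right\} \tag{6.2.3}\\
&= \frac{1}{2}\mathcal{F}\left\{e^{i\left(\frac{2\pi}{D}(m_0 u + n_0 v)+\theta\right)} + e^{-i\left(\frac{2\pi}{D}(m_0 u + n_0 v)+\theta\right)}\right\} \notag\\
&= \frac{1}{2}e^{i\theta}\,\delta\!\left(\frac{m}{D} - \frac{m_0}{D}\right) + \frac{1}{2}e^{-i\theta}\,\delta\!\left(\frac{n}{D} - \frac{n_0}{D}\right). \tag{6.2.4}
\end{align}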
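A one-dimensional numerical analogue of this decomposition can be sketched with a direct discrete Fourier transform. The sample count N, ripple frequency M0, and phase THETA below are hypothetical values, and the discrete transform is a stand-in for the continuous one used in the text. The check is that the transform of a phase-shifted cosine is concentrated in two bins whose complex values carry the phase shift as $e^{\pm i\theta}$.

```python
import cmath
import math


def dft(samples):
    """Direct discrete Fourier transform (O(N^2), adequate for a small sketch)."""
    n = len(samples)
    return [sum(s * cmath.exp(-2j * math.pi * k * j / n)
                for j, s in enumerate(samples))
            for k in range(n)]


N = 64        # samples across the aperture (hypothetical)
M0 = 5        # ripple frequency in cycles per aperture (hypothetical)
THETA = 0.7   # phase shift of the ripple in radians (hypothetical)

ripple = [math.cos(2 * math.pi * M0 * j / N + THETA) for j in range(N)]
spectrum = dft(ripple)

# Normalized values in the two occupied bins
POS_PEAK = spectrum[M0] / N       # bin at +M0
NEG_PEAK = spectrum[N - M0] / N   # bin at -M0 (wrapped to N - M0)

# Largest magnitude in any other bin
LEAKAGE = max(abs(spectrum[k]) for k in range(N) if k not in (M0, N - M0))

print(POS_PEAK, NEG_PEAK, LEAKAGE)
```

The bin at +M0 holds one half of $e^{i\theta}$ and the mirrored bin holds one half of $e^{-i\theta}$, so a phase error applied at one image plane location maps directly onto the phase shift of a single commanded spatial frequency.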
Thus, there is a one to one correspondence between the value of the induced phase perturba-
tion at a specific location in the image plane with the value of the phase shift for a particular
spatial frequency in the DM plane. In other words, we can correlate the phase perturbation
from δz to a commanded spatial frequency by equating
\[ e^{i\theta(x,y)} = e^{-i\frac{\pi\,\delta z}{\lambda f^2}(x^2+y^2)}. \tag{6.2.5} \]
Looking back to Eq. 6.2.4, the perturbation in the phase shifts at the DM plane will scale
quadratically with spatial frequency, reducing the certainty of higher order correction. While
this will be less significant at small angular separation, the higher uncertainty at large spatial
frequencies will ultimately limit the achievable outer working angle of the dark hole. Since
the error in phase shift will be applied at both the estimation and control steps, the phase
errors induced on each spatial frequency of the DM command will be double the value in
the exponential of Eq. 6.2.2. Fig. 6.1 shows the phase picked up per millimeter of error in
the measurement of z. Since this phase scales quadratically, the phase error grows by a factor of
sixteen between 4λ/D, the IWA of the Ripple3 coronagraph, and 16λ/D, the maximum
controllable angle of the Kilo-DM©. Since the propagation distances of the DMs in the HCIL
make the phase contribution on the order of 2π, this error is not noticeable at very low
contrast levels, making it a much more subtle error than choosing the wrong focal length.

Figure 6.1: An error in our knowledge of the propagation distance from the DM to the pupil
induces, to first order, a phase error that scales quadratically with working angle. The figure
shows the phase error across the image plane per mm of error in the measurement of the
propagation distance. (Plot: "IP Phase per mm of Propagation Error"; both axes in λ0/D from
−10 to 10; color scale: phase in rad, 1 to 8 × 10^−3.)

Physically, we can think of the DM command as being chosen to conjugate the field at a
particular pixel in the image plane. With an error in z, any particular spatial frequency will
arrive at the correct pixel on the detector, but the command won’t conjugate the field as
well as predicted in the model. Since this error scales quadratically with spatial frequency,
it becomes less significant as we try to achieve lower inner working angles. It will however
become an issue when we try to increase our discovery space by pushing the outer working
angle closer to the controllable limit of the DMs.
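
As a quick check on the quadratic scaling discussed above, the following minimal Python sketch evaluates a phase of the form of Eq. 6.2.5 at two working angles. The function name `defocus_phase` and the unit coefficient `coeff` (standing in for πδz/(λf²), with image plane coordinates expressed in λ/D) are illustrative assumptions and not part of the laboratory code; only the ratio between the two angles is meaningful here.

```python
import numpy as np

def defocus_phase(x, y, coeff=1.0):
    """Magnitude of a quadratic phase, coeff * (x^2 + y^2).

    coeff is a placeholder for pi*delta_z/(lambda*f^2); its value is an
    assumption and cancels in the ratio computed below.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return coeff * (x**2 + y**2)

# Working angles in lambda/D quoted in the text.
inner_angle = 4.0
outer_angle = 16.0

# Ratio of the phase error at the outer angle to that at the inner angle.
ratio = defocus_phase(outer_angle, 0.0) / defocus_phase(inner_angle, 0.0)
print(f"phase error ratio between {outer_angle:g} and {inner_angle:g} lambda/D: {ratio:g}")
```

For a purely quadratic phase the ratio is (16/4)² = 16, independent of the value chosen for `coeff`.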
A second source of model error is the DM shape. We must recognize that while the
superposition model applied for the estimation and control algorithms in Ch. 2, Ch. 3, and
Ch. 4 is very useful, the true DM shape is nonlinear and subject to a plate equation. This
is very difficult to model, and we mitigate this by remaining in a low stroke regime where
the assumption of superposition and linearity is valid. There is also an inherent uncertainty
in the model of the influence function shape. If the full width at half maximum of a single
actuator's influence function is incorrect, then the spatial frequencies we apply will not be
what is modeled. If we approximate a sine wave by pushing and pulling adjacent actuators, then
the period of this oscillation is dependent on the full width at half maximum of the two actuators.
Thus, for a fixed pupil we will assume the incorrect spatial frequency and this will cause
energy to be distributed in areas we did not intend. The true influence function tends not
to be a simple Gaussian either, having little wings towards the edges of the actuator. This
produces a roughly additive error at mid to high spatial frequencies, but it is of very low
amplitude. Thus this effect is not seen until we reach very high contrast levels, where the
amplitude of the perturbations from prior iterations becomes the same order as the control
amplitudes being attempted. Finally, our model assumes a particular actuator gain which
is itself a nonlinear response (at high stroke) and subject to a large amount of variation. If
we continue to adhere to the model of superposition and stay in the low stroke regime, we
may treat gain uncertainty as additive errors across the DM plane. We describe the modeled
DM actuation, $\hat{\phi}$, as the sum of the true actuation, $\phi$, and an additive perturbation
present in the model, $\delta\phi$. Since $\delta\phi$ is the result of a gain mismatch, the
perturbations in the model will follow the sign of the DM actuation, allowing us to write the
model of the differential measurement described by Eq. 3.3.1 as
\begin{align}
I^{+} - I^{-} &= 4
\begin{bmatrix} \Re\{i\mathcal{C}\{A\hat{\phi}\}\} & \Im\{i\mathcal{C}\{A\hat{\phi}\}\} \end{bmatrix}
\begin{bmatrix} \Re\{\mathcal{C}\{A\} + \mathcal{C}\{Ag\}\} \\ \Im\{\mathcal{C}\{A\} + \mathcal{C}\{Ag\}\} \end{bmatrix} \tag{6.2.6}\\
&= 4
\begin{bmatrix} \Re\{i\mathcal{C}\{A(\phi + \delta\phi)\}\} & \Im\{i\mathcal{C}\{A(\phi + \delta\phi)\}\} \end{bmatrix}
\begin{bmatrix} \Re\{\mathcal{C}\{A\} + \mathcal{C}\{Ag\}\} \\ \Im\{\mathcal{C}\{A\} + \mathcal{C}\{Ag\}\} \end{bmatrix}. \tag{6.2.7}
\end{align}
Since the model assumes that there is an additional component contributing to the intensity
measurements, we find that the observation matrix is
\[
\hat{H} = H + \delta H, \tag{6.2.8}
\]
indicating that the model artificially changes the mapping between the electric field and the
observation, $I^{+} - I^{-}$. Each element of $\delta H$ will make the model’s observation matrix,
$\hat{H}$, predict the incorrect response of z to the field’s interaction with that particular
probe shape.
As a result, there is a spatially varying energy mismatch in the image plane between the
estimated and true states due to uncertainty in the actuator gain. If the gain variations
are random, then the effect on any particular spatial frequency is different. However, a
systematic error in the gain model can affect some spatial frequencies more than others.
For example, if we command a 16 cycle per aperture oscillation across the Kilo-DM, every
actuator is being pulled in an opposing direction to its adjacent actuators. However, if we
command an 8 cycle per aperture oscillation across the DM, we end up with pairs of actuators
pulling in the same direction. There are some measurements from Boston Micromachines
that show increases in the voltage-to-amplitude gain when multiple actuators are pulling in
concert. Thus, we would underpredict the actuator gains for 8 cycles/aperture compared
to the gain for a 16 cycle/aperture command. Fortunately, the interferometric data shown
in Fig. 6.2 indicates that in the low stroke regime, the mathematical superposition of two
adjacent actuators very closely replicates pulling the two at the same time.

Figure 6.2: Using a white light interferometer at Boston Micromachines, we have been
provided with the following superposition data for the kilo-DMs used in the Princeton HCIL.
For a stroke that far exceeds those used in the control experiment, there is only a 10.6 nm
(∼7%) discrepancy between superposing two actuators and pulling the two together. (Plot:
"Height Change of Adjacent Actuators on DM2"; actuator width (mm) vs. height change (nm);
curves: One Actuator, Two Actuators, Superposition, Two − One; applied voltages 0 V and 56 V.)

Thus, the additive perturbation, δH, will not be a function of spatial frequency (which would
make computing its effect much more complicated). Instead, we expect variability in actuator
gain to be the most common source of error; the effect of gain mismatch in the model will be
relatively random instead of dependent on the spatial frequencies being actuated.
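
The difference between the 16 and 8 cycle per aperture commands described above can be illustrated with a short Python sketch that samples a sinusoid at the actuator centers and reports how many neighboring actuators move in the same direction. We assume 32 actuators across the aperture, which is consistent with the 16λ/D controllable limit quoted earlier but is an assumption of this sketch; the helper names `actuator_signs` and `run_lengths` are hypothetical.

```python
import numpy as np

def actuator_signs(cycles, n_act=32):
    """Sign of a sinusoid with `cycles` periods across `n_act` actuators.

    The sinusoid is sampled at actuator centers (k + 0.5) so that no
    sample falls exactly on a zero crossing. n_act = 32 is an assumption.
    """
    k = np.arange(n_act)
    return np.sign(np.sin(2.0 * np.pi * cycles * (k + 0.5) / n_act)).astype(int)

def run_lengths(signs):
    """Lengths of consecutive runs of identical sign."""
    change = np.flatnonzero(np.diff(signs)) + 1      # indices where sign flips
    edges = np.concatenate(([0], change, [len(signs)]))
    return np.diff(edges)

for cycles in (16, 8):
    runs = run_lengths(actuator_signs(cycles))
    print(f"{cycles} cycles/aperture: run lengths {sorted(set(runs.tolist()))}")
```

At 16 cycles per aperture every run has length one (each actuator opposes its neighbors), while at 8 cycles per aperture the runs have length two (pairs of actuators move together), which is the situation in which a systematic gain change for actuators pulling in concert would matter.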
Figure 6.3: Using a white light interferometer at Boston Micromachines, we have been
provided with the following data for the kilo-DMs used in the Princeton HCIL. The data
indicates that not only is the influence function perturbed from a Gaussian, but the shape is
a function of x and y. Additionally, the actuator gain is a function of whether the actuator
is being released or pulled from the DC offset applied to the mirror. (Panels: (a) DM1 Actuator
Shapes, "DM1 Relative Actuation From 56V offset"; (b) DM2 Actuator Shapes, "DM2 Relative
Actuation From 56V offset". Axes: actuator width (mm) vs. height change (nm); curves:
X direction, Y direction; applied voltages 0 V, 56 V, and 94 V.)
The last systematic gain error that we must worry about is demonstrated in Fig. 6.3.
In the low stroke regime, we have a limited amount of data that would indicate that the
gain will be relatively constant regardless of the behavior of adjacent actuators. However,
we also see that this gain changes based on whether we are releasing or pulling (“poking”
or “pulling”) the actuator. This is most likely due to the internal stresses in the face sheet,
which are intentionally introduced by the manufacturer to force the DM surface to a flatter
nominal shape. Knowing approximately what these gains are, the control model actually
computes a new gain matrix on the fly, changing the gain based on whether it is a push
or a pull at that control iteration. This resulted in a small amount of improvement in the
convergence rate and ultimate achievable contrast in the laboratory.
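
The direction-dependent gain described above can be sketched in a few lines of Python. This is only an illustration of selecting a gain by the sign of the commanded voltage change; the gain values, the function name `height_change`, and the sign convention (raising the voltage pulls the surface down, as suggested by the axes of Fig. 6.3) are assumptions of this sketch and not measured values or the laboratory implementation.

```python
import numpy as np

# Hypothetical gains in nm per volt; real values would come from calibration.
GAIN_PULL_NM_PER_V = 3.0
GAIN_RELEASE_NM_PER_V = 1.5

def height_change(delta_v, gain_pull=GAIN_PULL_NM_PER_V,
                  gain_release=GAIN_RELEASE_NM_PER_V):
    """Surface height change (nm) for a voltage change about the DC offset.

    A positive voltage change is treated as a pull and a negative one as a
    release, each with its own gain (assumed sign convention).
    """
    delta_v = np.asarray(delta_v, dtype=float)
    gain = np.where(delta_v > 0.0, gain_pull, gain_release)
    return -gain * delta_v

# One actuator pulled by 2 V, one released by 2 V, one left at the offset.
print(height_change([2.0, -2.0, 0.0]))
```

Because the gain is chosen element by element, the same function can be re-evaluated at every control iteration for the full actuator array, which is the sense in which the gain matrix is recomputed on the fly.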
6.3 Accuracy of Wavelength Extrapolation
As discussed in §2.5, the extrapolation method used for broadband control assumes that
the amplitude variations across the pupil are wavelength independent, completely neglect-
ing the perturbations from surfaces non-conjugate to the pupil. The degree of accuracy is
completely dependent on the surface flatness of the DMs, particularly at low to mid-spatial
frequencies. Since the phase errors across the mirror act as small focusing elements as the
light propagates towards the pupil, amplitude variations from these surfaces will be highly
chromatic. Since we have confined our wavelength assumption to an estimate extrapolation,
rather than building it into the control law, we will see poorer contrast performance at our
bounding wavelengths. This will tend to drive the contrast at the central wavelength lower than
at the bounding wavelengths (under equal weighting in the controller) and result in higher actuation levels
from the DMs. The effect of this extrapolation can be minimized either by attempting to
add higher order terms or attempting to construct a sensing scheme by which we produce a
polynomial describing the wavelength dependence of the pupil plane.
6.4 DM Controllable Space
In §6.2 the issue of error in the deformable mirror model was addressed. More advanced
learning and adaptive algorithms, the beginning of which is presented in this thesis, are
designed to mitigate the errors propagated through the control code from this. Whether or
not the desired shape is achievable is a separate issue. The inability to solve for an arbitrary
shape will limit the achievable contrast and the area over which this can be achieved. For
example, the monochromatic subspace in the image plane cannot extend beyond nλ/D,
where n is half the number of actuators along the chosen axis. Additionally, pure sinusoids
cannot be actuated, and the end condition of the DM limits the slopes and achievable shapes
of the mirror. The distribution of actuators also comes into play, where a square array of
actuators cannot perfectly generate a shape with circular symmetry.
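
The nλ/D limit quoted above is a sampling limit, and a short Python sketch can make it concrete: a sinusoid with more than n cycles per aperture, sampled at the actuator positions, is indistinguishable from one below the limit. The choice of 32 actuators across the aperture and the example frequency of 20 cycles are assumptions made for illustration only.

```python
import numpy as np

n_act = 32                 # assumed number of actuators across the aperture
k = np.arange(n_act)       # actuator index

def sampled_cosine(cycles):
    """Cosine with `cycles` periods per aperture, sampled at the actuators."""
    return np.cos(2.0 * np.pi * cycles * k / n_act)

max_cycles = n_act // 2    # n in the n*lambda/D limit (half the actuator count)

# A command above the limit aliases onto a lower spatial frequency.
requested = 20
alias = n_act - requested
same_samples = np.allclose(sampled_cosine(requested), sampled_cosine(alias))
print(f"limit: {max_cycles} cycles; {requested} cycles aliases to {alias}: {same_samples}")
```

With these assumed numbers the limit is 16 cycles per aperture, and a 20 cycle command produces exactly the same actuator heights as a 12 cycle command, so it cannot place energy beyond the controllable region.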
6.5 Experiment Stability and Laser Power
Disturbances in the system will eventually degrade a dark hole back to its original contrast
level (or near it). This will ultimately limit our achievable contrast at a fundamental level,
but to truly understand this we must understand the time scale of these disturbances. To
test this, we began by running the correction algorithm to generate symmetric dark holes
in the image plane. The DM commands were then fixed and we proceeded to measure
the contrast over a 500 minute period. Simultaneously we re-imaged a pupil plane with a
beam splitter so that we could normalize by the total power incident on the image. This
removes any variability in the source itself, but leaves behind any variability in the detector
gain, or potentially a rapid fluctuation in the laser power that was not captured by the
pupil image.

Figure 6.4: With fixed DM shapes, the contrast of symmetric dark holes was measured over
a 500 minute period after creating two dark holes. To remove laser power fluctuations from
the measurement, a camera was placed in the pupil plane with a beam splitter to allow
us to normalize the contrast by the variance of the total power incident on the pupil over
time. (a), (b), and (c) show the evolution of the dark holes in time. (d) shows the absolute
variance in time from the initial contrast. (e) shows the running difference from one image
to the next over time. Note that while we see significant variation at ∼1/10th the intensity
of the dark hole over time, the visual difference is barely discernible. (Panels: (a) Initial IP;
(b) IP after 4 hours, t = 240 minutes; (c) IP after 8 hours, t = 480 minutes, each with axes in
λ0/D and a log10(Contrast) color scale from −6.5 to −4; (d) Contrast Deviation vs. Time;
(e) Running Difference vs. Time, each showing contrast variation (× 10^−7) over 0 to 480 min.)

Looking at Fig. 6.4(a), Fig. 6.4(b), and Fig. 6.4(c) we see very little variation
in the structure of the dark hole over the 8 hour period that we measured the contrast
performance. To see the image variation, we look to the deviation in the average contrast
in the dark hole. Fig. 6.4(d) indicates that the contrast variance can become a significant
fraction of our ultimate achievable contrast measured in the initial image, but over much
longer time scales than are required for correction. Fig. 6.4(e) shows the running difference
between measurements over the time history. Thus we see that with a fixed DM setting the
variance over a five minute interval can be of order $10^{-8}$ to $10^{-7}$, which means that we cannot
correct fast enough to “freeze” the disturbances below this. Thus we can reach $10^{-7}$ contrast
levels because we are capable of measuring and controlling on a timescale that is faster than
the five minute variation of the experimental disturbances at a level of $1 \times 10^{-7}$, but to reach
contrast levels below $1 \times 10^{-7}$ we must improve the short term stability of the system. Since
imaging is by far the slowest component, particularly when we reach levels of high contrast,
the simplest solution is to increase the power of the laser so that the integration time, and
hence the time required for estimation and control, is reduced.
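
The two stability measures discussed above (deviation from the initial image and the running difference between successive images) can be computed as in the following minimal Python sketch. The function name `stability_metrics` and the synthetic numbers are hypothetical; the sketch assumes one average dark hole contrast value and one simultaneous pupil power measurement per frame.

```python
import numpy as np

def stability_metrics(raw_contrast, pupil_power):
    """Normalize by source power, then compute the two stability measures.

    raw_contrast : average dark-hole contrast per frame, uncorrected for
                   source power (assumed input format)
    pupil_power  : total power in the simultaneous pupil image per frame
    """
    raw = np.asarray(raw_contrast, dtype=float)
    power = np.asarray(pupil_power, dtype=float)
    normalized = raw * power[0] / power        # remove source variability
    deviation = normalized - normalized[0]     # change from the initial image
    running_diff = np.diff(normalized)         # change from one image to the next
    return normalized, deviation, running_diff

# Synthetic example: a drifting dark hole observed through a fluctuating source.
true_contrast = np.array([1.0e-7, 1.1e-7, 0.9e-7])
pupil_power = np.array([1.00, 1.05, 0.97])
raw_contrast = true_contrast * pupil_power / pupil_power[0]

normalized, deviation, running_diff = stability_metrics(raw_contrast, pupil_power)
print(deviation, running_diff)
```

In this synthetic case the normalization recovers the underlying contrast history, so the remaining deviation and running difference reflect only the drift of the dark hole itself and not the source fluctuation.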
Interestingly, this speaks to the required stability for wavefront control for a real obser-
vatory. To fairly assess the flight readiness of a control system achieving high contrast, the
intensity of the light source must be scaled so that the contrast stability of the experiment
matches what we expect in the observatory. Laboratories such as the HCIL, HCIT, and the
Ames testbed use very large amounts of power to minimize the exposure time and build up
a high level of statistical certainty in their measurement. In this way they are able to push
the absolute limit of achievable contrast, but it does not speak to the controller performance
in a true observatory environment. Relating this back to §6.1, increasing laser power is
equivalent to increasing I00 so that we may decrease tmax and still achieve the same contrast
resolution. This actually points to a very interesting mapping problem where we should
pick the laser power based on the photon and sensor noise in the system, making everything
comparable to our expected observatory performance. At the Princeton HCIL, we will
eventually be limited by the time required to take an exposure. For the Starlight Xpress cameras
that we use, the readout time is approximately 0.6 seconds when reading out the full frame.
Thus, regardless of our computation time, each pair of images taken for estimation requires
that the aberrations be stable for at least 1.2 seconds for the estimate to be as accurate as
possible to reach higher levels of contrast. There is a large body of work investigating the
stability requirements for a mission capable of imaging Earth-like planets [26, 27, 64].
The more aggressive the coronagraph is, and the smaller the desired inner working angle,
the stricter these tolerances become. Parameters such as secondary alignment or low order
bending modes for the primary mirror (and its support structure) can have tolerances in the
range of nanometers to picometers [66].
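
The readout-limited stability time quoted above for a pair of images follows from simple arithmetic, which the following Python sketch generalizes to several probe pairs and a nonzero integration time. The function name `min_stable_time` and its parameters other than the 0.6 second readout are hypothetical, and the sketch assumes images are taken sequentially with no overlap between exposure and readout.

```python
def min_stable_time(n_pairs, readout_s=0.6, exposure_s=0.0):
    """Minimum time (s) the aberrations must hold still for one estimate.

    Each pair contributes two images; each image costs one exposure plus
    one full-frame readout (assumed to be strictly sequential).
    """
    return 2 * n_pairs * (readout_s + exposure_s)

# Readout-limited case from the text: one pair, negligible exposure time.
print(min_stable_time(1))
# Hypothetical case: four pairs with 1 s exposures.
print(min_stable_time(4, exposure_s=1.0))
```

The first call reproduces the 1.2 second floor for a single pair; any additional pairs or integration time only lengthen the interval over which the field must remain stable.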
6.6 Stability of Laser Power
Variance in the incident flux skews the measurement of contrast, which means that at some
level of contrast the variance in the laser power will dramatically affect the normalization
of the field. The primary cause of laser power fluctuation is temperature variation
in the gain medium. These fluctuations affect the mean gain value, resulting in a power
fluctuation at the output of the oscillator. Since we can average random fluctuations over
time, we are only concerned with static drifts in the power level. Static drift in the output
power is mitigated with temperature feedback control on the gain medium of the laser. As
a result, we are not overly concerned with systematic power drift through the course of
control. However, we should keep in mind that while this dominates, temperature is not the
only possible source of power drift. In fact, the laser was picked based on its power stability
spanning the duration of control. We need to ensure that we are averaging exposures with an
adequate signal-to-noise ratio to eliminate any residual fluctuations in the laser power arising
from temperature fluctuations in the gain medium. Once high contrast values are reached
the variance in the normalization becomes significant and active normalization is required.
This motivates a system which measures the core of the light, which is reflected off the
focal plane mask. This allows for accurate measurement of contrast since the normalization
constant can be actively adjusted.
6.7 Final Remarks on Error
Overall, the errors limiting our correction performance come in two categories: model errors
which limit the ultimate achievable contrast and the convergence rate of the system, and
experimental limitations that tend to limit the ultimate achievable contrast. Ideally, we
would like to be limited by the system, not our model. The results from the first published
symmetric dark hole experiment in Pueyo et al. [55] were in part a result of laboratory instabilities,
but the dominant errors were model based, one example being errors in the influence function
shape and voltage-to-actuation gains used to produce the control effect matrices. Fig. 6.5
compares the current laboratory performance with that of the original data set in Pueyo
et al. [55]. Both data sets use the monochromatic Stroke Minimization algorithm, but the
new data set employs an improved model and the Kalman filter estimation scheme. With
these improvements we are now capable of achieving the same contrast levels in nearly one
tenth the time. We have also pushed the ultimate achievable contrast by a full order of
magnitude, achieved it in half as many iterations, and used half the exposures per iteration
to do so.

Figure 6.5: The figure compares the ultimate achievable contrast and convergence rate in
symmetric dark holes from the initial 2009 experiments to current laboratory performance.
Since the limitations were largely model based, we see that both the ultimate achievable
contrast and the convergence rate improve. (Plot: "Comparing Contrast Performance";
contrast on a log scale from 10^−7 to 10^−4 vs. iteration, 0 to 60; curves: AVG, Left, and
Right Contrast for 2009 and for the current experiment.)
Chapter 7
Conclusions and Future Directions
The purpose of the broadband control and estimation work presented in this thesis is to
improve the efficiency and robustness of wavefront correction for high contrast. The estimate
extrapolation techniques applied to the broadband controller enable us to suppress the field
using only a single estimate. This reduces the number of exposures simply by not requiring
estimates for every wavelength. By closing the loop on the state estimate, we exploited the
Kalman filter to reduce our requirement on the number of exposures taken to estimate the
field. The filter also made the estimate more robust to noise, since it is designed specifically
to optimally combine the prior history and update measurements. This was particularly
useful at high contrast levels, where sensor noise can dominate the measurement. Overall,
the combination of the Kalman filter estimator and the Windowed Stroke Minimization
controller allowed us to produce a more stable estimate and to reach the same contrast levels with one
sixth of the measurements that prior versions of estimation and broadband control would
have required. However, none of this work is capable of reducing the effect of model error
addressed in Ch. 6, nor does it directly minimize the number of exposures used to achieve a
particular contrast level. Having developed a closed loop state estimator, we now have the
framework to consider more advanced estimation and control schemes designed to do just
that. The following sections discuss ways in which we might address time varying physical
parameters that will induce model error and attempt to find a global optimization of the
estimation and control problem with regard to exposure time.
7.1 Parameter Adaptive Filtering
In Ch. 6 we discussed many limitations to the accuracy of correction. Among the worst was
DM model error, which cannot be eliminated by the estimation algorithms of Chapters 3 and
4 since the model error is built into the observation matrix. We do, however, have one method
of recourse. We can attempt to estimate the most critical physical parameters during control
and thereby produce an adaptive model for our controller. In truth, nothing would be better
than to measure the parameters as well as possible prior to attempting control so that they
need not be estimated in a potentially nonlinear and computationally expensive fashion. The
correction algorithms presented in this thesis will also undoubtedly suppress the aberrated
field more efficiently if they have the correct parameters from the start. However, we require
knowledge of many parameters to such a high level of precision that the observatory must be
considered as a dynamic system. While we may have a high level of certainty to begin with,
slow variations in the observatory (such as thermal fluctuations or vibrations due to station
keeping) will cause these parameters to drift. As a result we have no choice but to actively
measure their variation [66], which is the purpose of sensors such as the coronagraphic low
order wavefront sensor (CLOWFS) [32], or to estimate them along with the image plane
electric field.
7.2 Dual Controller
This thesis has focused a great deal on optimal estimators and controllers, each of which
is guaranteed to carry some form of global optimality. However, we have not in any way
guaranteed that we have reached our lowest achievable contrast with the smallest possible
number of exposures. The control algorithms in Ch. 2 minimize actuation, not images.
However, they only truly require one image to measure the control effect and take a contrast
measurement, so this is already minimal. The Kalman filter estimator developed in Ch. 4
does leave a free parameter with regard to the number of exposures taken for estimation at
each control iteration. However, there is no minimization through the entirety of control.
There is a balance to be had in the number of images taken
at a particular iteration of the estimator, since this has a consequence on the accuracy of the
estimate due to the averaging nature of the estimator. For example, if we use one set of
exposures per iteration to suppress the field to a desired level, we may have been able to do
the same job with fewer than half the iterations if we had doubled our estimation exposures
at each iteration. The purpose of a dual controller is to minimize our requirements on the
control effect and the estimator at the same time. In other words, the dual controller uses
an optimal control law to decide whether it must perturb the field to gain more knowledge
or if it should suppress the field to achieve its target. By designing such an algorithm we
could potentially save on exposures required for correction, leaving more time available to
take science exposures. This in turn increases the likelihood of detection and makes spectral
characterization more efficient.
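
The trade described above, between exposures per iteration and the number of iterations, can be made concrete with a small Python sketch. The iteration counts below are hypothetical numbers chosen only to illustrate the bookkeeping; they are not laboratory results, and the function name `total_exposures` is an assumption of this sketch.

```python
def total_exposures(iterations, pairs_per_iteration, images_per_pair=2):
    """Total number of estimation images taken over a full correction run."""
    return iterations * pairs_per_iteration * images_per_pair

# Hypothetical scenario: one probe pair per iteration needs 60 iterations,
# while two probe pairs per iteration needs only 25 (fewer than half).
baseline = total_exposures(iterations=60, pairs_per_iteration=1)
doubled = total_exposures(iterations=25, pairs_per_iteration=2)

print(f"one pair per iteration:  {baseline} images")
print(f"two pairs per iteration: {doubled} images")
print(f"images saved by doubling: {baseline - doubled}")
```

Under these assumed numbers, doubling the exposures at each iteration reduces the total image count. A dual controller would in effect make this choice at every iteration rather than fixing it in advance.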
7.3 Including Alternate Sensors
Up to this point there has been an underlying assumption that we are working within the
stability range inherent to a space telescope. However, the concepts here are becoming more
applicable to ground-based coronagraphic efforts [50, 47, 8] as their AO systems improve.
For the purpose of this thesis, the assumption is that the AO system designed for atmo-
spheric correction is the most upstream sensor in the optical path and is operating at the
wavelength farthest from the final image plane camera. For this reason this data will be the
most difficult to incorporate into the update, but has the advantage of being the fastest up-
date with the most information about residual atmospheric turbulence. Given the stability
requirements laid out in Shaklan et al. [66] for a space telescope, we will assume that a low
order wavefront sensor will be present in both systems.

Figure 7.1: Schematic representing relative position of different wavefront sensors in the
optical path of a coronagraphic instrument.

As shown in Fig. 7.1, the high order wavefront sensor is upstream of the final focal plane
stop in the coronagraph. We assume it
is taking measurements to correct variations in the optical system that degrade the quality
of the PSF. Since this measurement is taken in the collimated beam and is operating at a
shorter wavelength [50], it is unclear how well it can be applied to the focal plane estimation
algorithms presented in this thesis. However, the coronagraphic low order wavefront sen-
sor [32] takes a measurement by directly imaging the reflected starlight off the focal plane
mask, meaning that it operates at the same wavelength as the science channel. In a space
observatory this would be able to directly measure and compensate for thermal fluctuations
in large system optics. In a ground mission this would also include slow speed, high order,
residual atmospheric turbulence. This is the final wavefront measurement prior to the
coronagraphic channel, and therefore experiences the most common path with the science path.
This, however, is still assumed to be taken at a different bandpass, but is necessarily at a
closer bandpass than the fast AO loop.
7.3.1 Establishing a Reference
To make a differential measurement, we must first establish a reference field on which
to base changes in the output of the CLOWFS. Since the focal plane estimator would be
considered a high precision periodic update to the estimate we must first calibrate the relative
changes to properly balance the two measurements. Rather than writing a linear Kalman
filter which updates the state as in Eq. 4.1.27, we will write our update as
\[
x_k(-) = F_{k-1}\left(x_{k-1}(+)\right) + \Gamma_{k-1} u_{k-1}. \tag{7.3.1}
\]
We will now include a model based update that maps differential changes in the measure-
ments taken with a CLOWFS and higher order wavefront sensors to their image plane effect
via the time update $F_{k-1}(x_{k-1})$. The accuracy of this model will be highly dependent on
the accuracy of the model transforming the measurement to the final image plane, as well
as the resolution and accuracy of the sensor itself. Our expectation is that this will be less
accurate for extrapolating to the current state than applying probe images, but will allow us
to operate on a much faster timescale. This will in turn allow us to take fewer exposures in-
tended to probe the field, making the estimate faster and more stable to large perturbations
in the field that would otherwise drive the probe model out of the linear regime.
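
A minimal Python sketch of the time update in Eq. 7.3.1 is given below. The state layout (real and imaginary parts of the field at a single pixel), the phase-rotation form chosen for the model-based update F, and all numerical values are assumptions made purely for illustration; the actual mapping from CLOWFS measurements to the image plane is not specified here.

```python
import numpy as np

def time_update(x_post, u, F, Gamma):
    """Prior state x_k(-) = F_{k-1}(x_{k-1}(+)) + Gamma_{k-1} u_{k-1}."""
    return F(x_post) + Gamma @ u

def make_phase_drift(dphi):
    """Hypothetical F: rotate [Re(E), Im(E)] by a sensed phase change dphi."""
    c, s = np.cos(dphi), np.sin(dphi)
    rotation = np.array([[c, -s], [s, c]])
    return lambda x: rotation @ x

x_post = np.array([1.0e-4, 0.0])            # previous posterior estimate
Gamma = np.array([[2.0e-5, 0.0],
                  [0.0, 2.0e-5]])           # hypothetical control effect matrix
u = np.array([-1.0, 0.5])                   # hypothetical control command

x_prior = time_update(x_post, u, make_phase_drift(np.pi / 2), Gamma)
print(x_prior)
```

Passing F as a function rather than a matrix keeps the update general, so a nonlinear sensor-to-image-plane model could be substituted without changing `time_update`.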
7.3.2 Applying Reference to the Time Update
The focal plane techniques described in this thesis are most effective when the estimation
and control steps can be applied within the timescale of the remaining aberrations. For this
reason, these techniques are only applicable on ground-based telescopes if they are applied
after a fast AO loop for atmospheric correction. Fortunately, the Kalman filter formalism
built into the estimation will allow us to account for time evolution of the aberrated field
via a time update on the previous state using the sensor data from these prior control loops.
These sensors are commonly introduced via a dichroic to avoid losing photons in the science
channel, which means that we must address the issue of non-common path and wavelength
at each of these channels in order to use this sensor data.
7.4 Bias Estimation
This thesis has focused on estimation schemes that rely on a linear model to say that we
can take differential measurements with conjugate DM settings to estimate the electric field.
However, there is a large incentive to eliminate the need to take pairwise measurements for
each observation in the estimate. By taking a single image per observation (with background
subtraction), we reduce our dependence on a linearization of the field to produce the estimate,
and we automatically cut the required exposures for estimation in half. Additionally, the
planet itself is a source of bias in the image. Thus, if we include bias estimation we can
use the entire control history to disambiguate planets from residual speckles. Since the
bias is an estimate built up over time with inadequate signal-to-noise, we will still have to
take an exposure at high contrast to extract a high quality spectrum. Fortunately, the bias
estimation allows us to identify candidates before we are required to take long exposures at
high contrast levels. With the disambiguation happening in closed loop, we are not only
using the correction time history to solve the identification problem but this also allows us
to adaptively modify the dark hole area and contrast level. This also relaxes the required
size of the dark hole at the highest contrast levels, relaxing the control tolerances once we
attempt detection.
7.5 Final Remarks
For a coronagraph to be a viable option for a space mission detecting and characterizing
Earth-like planets, the wavefront control must be both effective and efficient. Even in a
hybridized scheme where an external occulter is used for spectral characterization, we must
rely on the coronagraph's ability to reach $10^{-10}$ contrast levels to achieve detection. Even
as a detection instrument the broadband performance of the wavefront control system will
drive the bandwidth available for detection, since it is the easiest way to reduce exposure
time in a photon limited observation. The time required to perform wavefront control will
also directly impact the number of targets visited, and hence the number of planets the
coronagraph is capable of detecting over a finite mission life. The overall efficiency of the
wavefront control system, i.e. how fast it can suppress the field to the required contrast
level, is also a function of the quality of the optical surfaces and the stability of the observa-
tory. Shaklan et al. [66] show that the observatory will require picometer levels of stability
in focus alone to maintain such high levels of contrast. We will never be able to beat the
fundamental limitations of the observatory, so the spirit of this thesis is to push the accuracy
and efficiency of the wavefront control algorithm to a level where the observatory variation
is the only limitation, rather than the controller, at any point in time. This will maximize
our time spent doing science, making the mission more cost effective. To that end, we have
used the HCIL to demonstrate new estimation and broadband control schemes. In these
experiments we have put just as much value on their convergence rate and effective use of
acquired data as we have on the ultimate achievable contrast. We have shown that when the
experiment is model limited, improvements in ultimate achievable contrast automatically map
to faster convergence rates. The Kalman filtering and extrapolation techniques increase our
efficiency simply by reducing the estimator's requirement on new exposures at each iteration.
Eventually, these improvements will make the estimator effective for the stability level seen
at ground-based observatories. This will become the ultimate platform for demonstrating
the effectiveness of these focal plane wavefront control concepts for a space mission, where
the true stability is expected to be better than a ground observatory but will not be well
known.
Bibliography
[1] URL http://exoplanets.org/.
[2] J.R.P. Angel and N.J. Woolf. An imaging nulling interferometer to study extrasolar planets.
The Astrophysical Journal, 475:373, 1997.
[3] S.A. Basinger, L.A. Burns, D.C. Redding, F. Shi, D. Cohen, J.J. Green, C.M. Ohara, and
A.E. Lowman. Wavefront sensing and control software for a segmented space telescope.
In Proceedings of SPIE, volume 4850, pages 362–369, 2003.
[4] N.M. Batalha, W.J. Borucki, S.T. Bryson, L.A. Buchhave, D.A. Caldwell,
J. Christensen-Dalsgaard, D. Ciardi, E.W. Dunham, F. Fressin, T.N. Gautier, et al.
Kepler’s first rocky planet: Kepler-10b. The Astrophysical Journal, 729:27, 2011.
[5] R. Belikov, A. Give’on, B. Kern, E. Cady, M. Carr, S. Shaklan, K. Balasubramanian,
V. White, P. Echternach, M. Dickie, J. Trauger, A. Kuhnert, and N. J. Kasdin. Demon-
stration of high contrast in 10% broadband light with the shaped pupil coronagraph.
Proceedings of SPIE, 6693:pp. 66930Y–1 – 66930Y–11, September 2007.
[6] R. Belikov, E. Pluzhnik, M.S. Connelley, F.C. Witteborn, T.P. Greene, D.H. Lynch,
P.T. Zell, and O. Guyon. Laboratory demonstration of high-contrast imaging at 2 λ/d
on a temperature-stabilized testbed in air. In Proceedings of SPIE, volume 7731, page
77312D, 2010.
[7] R. Belikov, E. Pluzhnik, F.C. Witteborn, T.P. Greene, D.H. Lynch, P.T. Zell, and
O. Guyon. Laboratory demonstration of high-contrast imaging at inner working angles
2 λ/d and better. In Proceedings of SPIE, volume 8151, page 815102, 2011.
[8] J.L. Beuzit, M. Feldt, K. Dohlen, D. Mouillet, P. Puget, F. Wildi, L. Abe, J. Antichi,
A. Baruffolo, P. Baudoz, et al. Sphere: a planet finder instrument for the vlt. In
Proceedings of SPIE, volume 7014, page 41, 2008.
[9] Celia Blain, Rodolphe Conan, Colin Bradley, Olivier Guyon, and Curtis Vogel. Char-
acterisation of the influence function non-additivities for a 1024-actuator mems de-
formable mirror. Proceedings of the 1st AO for ELT conference, 01 2010. URL
http://arxiv.org/abs/1001.5048.
[10] M.R. Bolcar and J.R. Fienup. Sub-aperture piston phase diversity for segmented and
multi-aperture systems. Applied Optics, 48(1):A5–A12, 2009.
[11] P.J. Borde and W.A. Traub. High-contrast imaging from space: Speckle nulling in a
low aberration regime. The Astrophysical Journal, 638:488–498, February 2006.
[12] W.J. Borucki, D.G. Koch, G. Basri, N. Batalha, A. Boss, T.M. Brown, D. Caldwell,
J. Christensen-Dalsgaard, W.D. Cochran, E. DeVore, et al. Characteristics of kepler
planetary candidates based on the first data set. The Astrophysical Journal, 728:117,
2011.
[13] W.J. Borucki, D.G. Koch, G. Basri, N. Batalha, T.M. Brown, S.T. Bryson, D. Caldwell,
J. Christensen-Dalsgaard, W.D. Cochran, E. DeVore, et al. Characteristics of plane-
tary candidates observed by kepler. ii. analysis of the first four months of data. The
Astrophysical Journal, 736:19, 2011.
[14] G. Bryden, W. Traub, L.C. Roberts Jr, R. Bruno, S. Unwin, S. Backovsky, P. Brugarolas,
S. Chakrabarti, P. Chen, L. Hillenbrand, et al. Zodiac ii: debris disk science from a
balloon. In Proceedings of SPIE, volume 8151, page 81511E, 2011.
[15] A. Carlotti, R. Vanderbei, and NJ Kasdin. Optimal pupil apodizations of arbitrary
apertures for high-contrast imaging. Optics Express, 19(27):26796–26809, 2011.
[16] W. Cash. Detection of earth-like planets around nearby stars using a petal-shaped
occulter. Nature, 442(7098):51–53, 2006.
[17] D.J. Des Marais, M.O. Harwit, K.W. Jucks, J.F. Kasting, D.N.C. Lin, J.I. Lunine,
J. Schneider, S. Seager, W.A. Traub, and N.J. Woolf. Remote sensing of planetary
properties and biosignatures on extrasolar terrestrial planets. Astrobiology, 2(2):153–
181, 2002.
[18] J.R. Fienup. Phase retrieval algorithms: a comparison. Applied Optics, 21(15):2758–
2769, 1982.
[19] JR Fienup. Phase-retrieval algorithms for a complicated optical system. Applied Optics,
32(10):1737–1746, 1993.
[20] J.R. Fienup. Phase-retrieval algorithms for a complicated optical system. Applied optics,
32(10):1737–1746, 1993.
[21] A. Gelb, J. Kasper, R. Nash, C. Price, and A. Sutherland. Applied Optimal Estimation.
M.I.T Press, 1974. ISBN 0486682005.
[22] R.W. Gerchberg and W.O. Saxton. A practical algorithm for the determination of the
phase from image and diffraction plane pictures. Optik, 35:237–246, 1972.
[23] A. Give’on, B. Kern, S. Shaklan, D.C. Moody, and L. Pueyo. Broadband wavefront
correction algorithm for high-contrast imaging systems. Proceedings of SPIE, 6691:
66910A–1 – 66910A–11, 2007.
[24] A. Give’on, B.D. Kern, and S. Shaklan. Pair-wise, deformable mirror, image plane-
based diversity electric field estimation for high contrast coronagraphy. In Proceedings
of SPIE, volume 8151, page 815110, 2011.
[25] Joseph W. Goodman. Introduction to Fourier Optics. Roberts & Company, 2005.
[26] J.J. Green and S.B. Shaklan. Optimizing coronagraph designs to minimize their contrast
sensitivity to low-order optical aberrations. Optical Science and Technology, 2003.
[27] J.J. Green, S.B. Shaklan, R.J. Vanderbei, and N.J. Kasdin. The sensitivity of shaped
pupil coronagraphs to optical aberrations. In Proceedings of SPIE Conference on As-
tronomical Telescopes and Instrumentation, 5487, pages 1358–1367, 2004.
[28] T.D. Groff and N.J. Kasdin. Designing an optimal estimator for more efficient wavefront
correction. In Proceedings of SPIE, volume 8151, page 81510X, 2011.
[29] T.D. Groff, A. Carlotti, and N.J. Kasdin. Progress on broadband control and deformable
mirror tolerances in a 2-dm system. In Proceedings of SPIE, volume 8151, page 81510Z,
2011.
[30] O. Guyon. Phase-induced amplitude apodization of telescope pupils for extrasolar ter-
restrial planet imaging. Astron. Astrophys, 404:379, 2003.
[31] O. Guyon, J.R.P. Angel, D. Backman, R. Belikov, D. Gavel, A. Give’on, T. Greene,
J. Kasdin, J. Kasting, M. Levine, et al. Pupil mapping exoplanet coronagraphic
observer (peco). In Proc. of SPIE Vol, volume 7010, pages 70101Y–1, 2008.
[32] O. Guyon, T. Matsuo, and R. Angel. Coronagraphic low-order wave-front sensor: Prin-
ciple and application to a phase-induced amplitude coronagraph. The Astrophysical
Journal, 693:75, 2009.
[33] O. Guyon, E. Pluzhnik, F. Martinache, J. Totems, S. Tanaka, T. Matsuo, C. Blain,
and R. Belikov. High-contrast imaging and wavefront control with a piaa coronagraph:
Laboratory system validation. Publications of the Astronomical Society of the Pacific,
122(887):71–84, 2010. ISSN 0004-6280.
[34] O. Guyon, B. Kern, R. Belikov, S. Shaklan, A. Kuhnert, et al. Phase-induced amplitude
apodization (piaa) coronagraphy: recent results and future prospects. In Proceedings of
SPIE, volume 8151, page 81510H, 2011.
[35] S.B. Howell. Handbook of CCD astronomy, volume 2. Cambridge Univ Pr, 2000. ISBN
0521648343.
[36] N. J. Kasdin and D. A. Paley. Engineering Dynamics A Comprehensive Introduction.
Princeton University Press, 2011.
[37] N. J. Kasdin, R. J. Vanderbei, and R. Belikov. Shaped pupil coronagraphy. Comptes
Rendus Physique, 8:312–322, April 2007. doi: 10.1016/j.crhy.2007.02.009.
[38] N.J. Kasdin, R.J. Vanderbei, D.N. Spergel, and M.G. Littman. Extrasolar planet finding
via optimal apodized-pupil and shaped-pupil coronagraphs. The Astrophysical Journal,
582(2):1147–1161, 2003.
[39] N.J. Kasdin, A. Carlotti, L. Pueyo, T. Groff, and R. Vanderbei. Unified coronagraph
and wavefront control design. In Proceedings of SPIE, volume 8151, page 81510Y, 2011.
[40] Jason Kay. Electric Field Estimation for High-Contrast Imaging. PhD thesis, Princeton
University, 2009.
[41] C. Keller, V. Korkiakoski, N. Doelman, R. Fraanje, and M. Verhaegen. Extremely
fast focal-plane wavefront sensing for extreme adaptive optics. In Proceedings of SPIE,
volume 8447, page In Press, 2012.
[42] M.J. Kuchner and W.A. Traub. A coronagraph with a band-limited mask for finding
terrestrial planets. The Astrophysical Journal, 570(2):900–908, 2002.
[43] M. Levine, S. Shaklan, and J. Kasting. Terrestrial planet finder coronagraph science
and technology definition team (stdt) report, 2006. URL
http://exep.jpl.nasa.gov/TPF/STDT_Report_Final_Ex2FF86A.pdf.
[44] M. Levine, D. Lisman, S. Shaklan, J. Kasting, W. Traub, J. Alexander, R. Angel,
C. Blaurock, M. Brown, R. Brown, et al. Terrestrial planet finder coronagraph (tpf-c)
flight baseline concept. Arxiv preprint arXiv:0911.3200, 2009.
[45] M. Levine, R. Soummer, et al. Overview of technologies for direct optical imaging of
exoplanets. submitted to the Astro2010 technology development white paper call, 2009.
[46] R.G. Lyon, M. Clampin, P. Petrone, U. Mallik, T. Madison, M.R. Bolcar, M.C. Noecker,
S.E. Kendrick, and M. Helmbrecht. Vacuum nuller testbed (vnt) performance, charac-
terization and null control: progress report. In Proceedings of SPIE, volume 8151, page
81510F, 2011.
[47] B. Macintosh, M. Troy, R. Doyon, J. Graham, K. Baker, B. Bauman, C. Marois,
D. Palmer, D. Phillion, L. Poyneer, I. Crossfield, P.J. Dumont, B. M. Levine, M. Shao,
E. Serabyn, C. Shelton, G. Vasisht, J. K. Wallace, J. Lavigne, P. Valee, N. Rowlands,
K. Tam, and D. Hackett. Extreme adaptive optics for the thirty meter telescope. In
Proceedings of SPIE, volume 6272, pages 62720N–1 – 62720N–15, 2006.
[48] B.A. Macintosh, J.R. Graham, D.W. Palmer, R. Doyon, J. Dunn, D.T. Gavel, J. Larkin,
B. Oppenheimer, L. Saddlemyer, A. Sivaramakrishnan, et al. The gemini planet imager:
from science to design to construction. In Proc. SPIE, volume 7015, pages 7015–43, 2008.
[49] F. Malbet, JW Yu, and M. Shao. High-dynamic-range imaging using a deformable
mirror for space coronography. Publications of the Astronomical Society of the Pacific,
107(710):386–398, 1995.
[50] F. Martinache, O. Guyon, V. Garrel, C. Clergeon, T. Groff, P. Stewart, R. Russell, and
C. Blain. The subaru coronagraphic extreme ao project: progress report. In Proceedings
of SPIE, volume 8151, page 81510Q, 2011.
[51] Y. Minowa, Y. Hayano, S. Oya, M. Watanabe, M. Hattori, O. Guyon, S. Egner, Y. Saito,
M. Ito, H. Takami, et al. Performance of subaru adaptive optics system ao188. In
Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, volume
7736, page 122, 2010.
[52] C. Petit, J.M. Conan, C. Kulcsar, H.F. Raynaud, T. Fusco, J. Montri, and D. Rabaud.
Optimal control for multi-conjugate adaptive optics. Comptes Rendus Physique, 6(10):
1059–1069, 2005.
[53] L. Poyneer and J.P. Veran. Predictive wavefront control for adaptive optics with arbi-
trary control loop delays. JOSA A, 25(7):1486–1496, 2008.
[54] L. Pueyo and N. J. Kasdin. Polychromatic Compensation of Propagated Aberrations
for High-Contrast Imaging. ApJ, 666:609–625, September 2007.
[55] L. Pueyo, J. Kay, N.J. Kasdin, T. Groff, M. McElwain, A. Give’on, and R. Belikov.
Optimal dark hole generation via two deformable mirrors with stroke minimization.
Applied Optics, 48(32):6296–6312, 2009.
[56] L. Pueyo, S.B. Shaklan, A. Give’On, M. Troy, N.J. Kasdin, J. Kay, T. Groff, M. McEl-
wain, and R. Soummer. Correction of quasi-static wavefront errors for elt with two
sequential dms. In Adaptative Optics for Extremely Large Telescopes, volume 1, page
5009, 2010.
[57] L. Pueyo, N. Jeremy Kasdin, A. Carlotti, and R. Vanderbei. Design of phase induced
amplitude apodization coronagraphs over square apertures. The Astrophysical Journal
Supplement Series, 195:25, 2011.
[58] L. Pueyo, N.J. Kasdin, and S. Shaklan. Propagation of aberrations through phase-
induced amplitude apodization coronagraph. JOSA A, 28(2):189–202, 2011.
[59] Laurent Pueyo. Broadband contrast for exo-planet imaging: The impact of propagation
effects. PhD thesis, Princeton University, 2008.
[60] D. Redding et al. Wavefront sensing & control for a large segmented space telescope.
In Bulletin of the American Astronomical Society, volume 41, page 342, 2009.
[61] F. Rigaut. Ground-conjugate wide field adaptive optics for the elts. Beyond conventional
adaptive optics, 58:11–16, 2002.
[62] Dmitry Savransky. Exosystem Modeling for Mission Simulation and Survey Analysis.
PhD thesis, Princeton University, 2011.
[63] S. B. Shaklan and J. J. Green. Reflectivity and optical surface height requirements
in a broadband coronagraph. 1. Contrast floor due to controllable spatial frequencies.
Applied Optics, 45:5143–5153, July 2006.
[64] S.B. Shaklan and J.J. Green. Low-order aberration sensitivity of eighth-order corona-
graph masks. The Astrophysical Journal, 628:474, 2005.
[65] S.B. Shaklan, J.J. Green, and D.M. Palacios. The terrestrial planet finder coronagraph
optical surface requirements. Proceedings of SPIE, 6265:pp. 62651I–1 – 62651I–12, 2006.
[66] S.B. Shaklan, L. Marchen, J. Krist, and M. Rud. Stability error budget for an aggressive
coronagraph on a 3.8 m telescope. In Proceedings of SPIE, volume 8151, page 815109,
2011.
[67] D. N. Spergel. A new pupil for detecting extrasolar planets. astro-ph/0101142, 2000.
[68] R.F. Stengel. Optimal Control and Estimation. Dover Publications, 1994. ISBN
0486682005.
[69] J. Trauger, K. Stapelfeldt, W. Traub, C. Henry, J. Krist, D. Mawet, D. Moody, P. Park,
L. Pueyo, E. Serabyn, et al. Access: a nasa mission concept study of an actively
corrected coronagraph for exoplanet system studies. In Proceedings of SPIE, volume
7010, page 701029, 2008.
[70] S. Unwin and W. Traub. Zodiac: A balloon facility for exoplanet debris disk observa-
tions. In Proceedings of the conference In the Spirit of Lyot 2010: Direct Detection of
Exoplanets and Circumstellar Disks. October 25-29, 2010. University of Paris Diderot,
Paris, France. Edited by Anthony Boccaletti., volume 1, page 35, 2010.
[71] R. J. Vanderbei, E. Cady, and N. J. Kasdin. Optimal Occulter Design for Finding
Extrasolar Planets. The Astrophysical Journal, 665(1):794–798, 2007.
[72] R.J. Vanderbei. Fast fourier optimization, sparsity matters. Mathematical Programming
Computation, pages 1–17, 2012.
[73] R.J. Vanderbei and W.A. Traub. Pupil mapping in 2-d for high-contrast imaging.
Astrophysical Journal, 626:1079–1090, 2005.