THE WOLF–RAYET PHENOMENON IN THE
INFRARED: MASSIVE STARS PROBING STELLAR
FORMATION
A Dissertation
Presented to the Faculty of the Graduate School
of Cornell University
in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy
by
John-David Thomas Smith
May 2001
© 2001 John-David Thomas Smith
ALL RIGHTS RESERVED
THE WOLF–RAYET PHENOMENON IN THE INFRARED: MASSIVE STARS
PROBING STELLAR FORMATION
John-David Thomas Smith, Ph.D.
Cornell University 2001
We present 8–13µm spectra at resolution R∼ 600 of 29 northern Galactic Wolf–Rayet stars, covering a broad range of spectral subtypes, including 14 WC, 13 WN,
1 WN/WC, and an additional reclassified WN star. For most of these stars, this is the
first mid-infrared spectrum ever reported. Lines of He i and He ii, accompanied in some stars
by [Ne ii] and [S iv], are strongly present in 22 of the sources observed, while 6
of the sources exhibit the powerful emission of heated circumstellar carbon dust.
Correspondence with optically determined subtypes is found to be incomplete, with
significant deviations for later types seen in both WN and WC. For a single WC
star, WR121, neon abundance is estimated from [Ne ii] emission, and found to be
∼7× the cosmic value, as predicted by long-standing but contested evolved core
nucleosynthesis calculations.
The observed line parameters are used in a population synthesis model to quantify
the contribution of WR stars to the total integrated emission of a single 10⁶ M⊙
stellar population formed in an instantaneous burst, and including a T ≤ 70 K dust
emission component with L_ir/L_blue = 0.1. After ∼5 Myr, the infrared emission
lines which form in the WR winds achieve line luminosities and equivalent
widths in the aggregate starburst spectrum similar to those of the commonly used optical λ4686
“WR Bump”, when the stars are embedded in dust providing A_V ≥ 5 magnitudes
of screen extinction. We address the possibility of direct detection of these infrared
wind lines, which may be possible in star-forming galaxies if a significant population
of recently formed stars is hidden by dust, and the starburst is observed against
a relatively low-level stellar background. A mid-infrared spectrum of Wolf–Rayet
Galaxy NGC 5253 is presented, but hot dust overwhelms these line features. We estimate
that 5 × 10⁵ WR stars would be required for direct detection of mid-infrared
wind lines in this nearby dwarf galaxy, with substantially fewer required for strong
near-infrared lines.
Observations were made with Score, a unique mid-infrared spectrograph built
as a prototype of Sirtf’s short-wavelength high-resolution spectrograph module.
Score achieves spectral sensitivity similar to the Infrared Space Observatory. Im-
portant details of the instrument are presented, along with new techniques developed
for the extraction of Score spectral data.
Biographical Sketch
On an auspicious Monday, September 24, 1973, John-David Thomas Smith was
born to young parents Thomas and Teressa, sharing his birthday with such notables
as Chief Justice John C. Marshall, author F. Scott Fitzgerald, and Jim Henson,
of Kermit the Frog fame. After a short stint traveling with his family in a green
converted school bus, following the orchard harvest from Southeast to Northwest and
living the care-free lifestyle of hippydom, he traded the jaunty caprice of migrant
fruit picking for a practical and rooted trailer-in-a-field existence. Continuing to
climb up the complex Southern social ladder in home town Owensboro, Kentucky,1
he and family, which had since become six, soon moved into a cabin and finally a
modest home in the lush countryside of the Miller Lakes Resort Park, operated by
his grandparents primarily, it seemed to him, for his enjoyment. Long summer days
spent in the lake, building stick forts in the woods, or raising fishing worms to sell to
deep-pocketed RV campers left him generally happy, but unprepared for the harsh
reality of West Louisville Elementary School, with its mandatory square-dancing
classes, and jelly-bean field days.
Acutely aware of the ever-waning attention span of modern Americans, he soon
adopted the foreshortened sobriquet “JD”, which he carries to this day, except when
addressed by his grandparents.
Despite various experiments of differing degrees of success (and punishment) in
the chemistry of flammability which he conducted before the tardy bell in class-
room trash cans throughout middle and early high school, JD continued to advance
through the peerless American educational system at a dizzying pace. Long camp-
ing trips in the South and Southwest reaffirmed his love of the natural world, which,
when coupled with the realization that his job at the Windy Hollow Drag Racing
1 Barbecue capital of the world.
Track offered little growth potential, seeded his first thoughts of becoming a scien-
tist, which he mistakenly thought implied some special lifestyle cachet. His interest
in the physical sciences was furthered by a high-school chemistry teacher who taught
him never to get acid “near your eye or your groin,” and a physics teacher who gave
him a new approach to problem-solving: “Whaddya gonna do, coach? Drop back
ten yards and punt.” The latter was an especially effective technique when applied
to calculations of the “momanumanertia.” His Latin teacher helped spawn a lifelong
love of literature, and reaffirm the wisdom of ipsa scientia potestas est, not to men-
tion semper ubi sub ubi. In the Spring of 1991, he graduated from that esteemed
center of erudition, Apollo High School, and spent the summer studying aquatic
lake life in situ, and jumping on trains with other indiscriminately safety-conscious
friends.
Armed with the brazen optimism of youth, and an admission letter to the Mas-
sachusetts Institute of Technology, JD headed north to Boston that Fall, after hav-
ing been duly warned by his grandfather of loose Yankee morals and politics. After
some adjustment to city life and the mysterious disappearance and reappearance
of r’s from the native dialect, he jump-started his budding Physics career by refin-
ing hydrous-balloon ballistic ranging techniques from his Back Bay roof deck. The
wealth of opportunities for the curious mind at MIT was sufficient to outweigh its
brutal spirit-crushing machinery, in the steely grip of which many strong friendships
were forged. A summer job at an engineering firm cemented his distaste
for the overly practical, and various other research projects at MIT confirmed that
astrophysics was indeed the most interesting sub-branch of the field, as he had first
intuitively suspected while sleeping under the stars as an eight-year-old.
After graduating from MIT in 1995, and yearning for a return to his rural (feral?)
childhood, JD headed to Cornell University, centrally isolated in scenic upstate New
York. There he reconnected with his youthful self by roving through trails and
streambeds once again. After some thought, and a coin-toss (which mandate he
disobeyed), he joined the first-rate infrared group in the Department of Astronomy,
where he would spend many productive days rummaging through old drawers of
wires and components last shelved several decades earlier, until destiny had brought
them together. Almost six years later he emerged like a spotted wood moth from
the chrysalis of graduate education, revitalized and ready to chew with purpose and
vigor through the pulpy timbers of science.
JD heads next for the desert climes of Tucson, Arizona, with fingers crossed in
anticipation of the launch of the spacecraft which will put food on his table for the
next several years. But first he will be married, and revel with his bride in one last
warm and fine Ithaca summer.
For Sara,
my fidus Achates.
Acknowledgements
Truth persuades by teaching, but does not teach by persuading.
—Quintus Septimius Tertullianus
I have been privileged to have been taught many things by many remarkable teach-
ers, not all of whom would consider that their foremost profession. I am pleased
to acknowledge the lasting impact they have had on my graduate career, and the
production of this thesis. First and foremost, my advisor, Jim Houck, always im-
pressed me with his uncanny ability to reduce month-long projects into ten words
or less, which accurately summed up a) what I had done, b) what I hadn’t done
that I thought I did, and c) what I really should have done. Occasionally all of these
analyses could be further compressed into one poignant pair: “What’s happening?”
— a query of incomparable motivational power. I have always tried to emulate his
startling and consistent faculty for simultaneously seeing both the forest and the
trees.
I thank my other advising committee members, Gordon Stacey, Jim Cordes, and
David Chernoff, for their close reading and excellent suggestions. I am particularly
indebted to Gordon for the inexhaustible patience and good humor which he brought
to all our discussions, science and otherwise.
I am very grateful to the Score team and contributors, including John Wilson,
Stephen Rinehart, Mike Colonno, Chuck Henderson, and especially Jeff Van Cleve,
without whom the instrument would likely have been shaken to death by Sirtf
contractors, instead of being used for science (and the good of humanity). Others
who contributed substantially to Score, and to most of the projects in the infrared
group, were George Gull, Bruce Pirger, and Justin Schoenwald.
The Palomar staff was exceptionally responsive and helpful. I thank in particular
telescope operators Karl Dunscombe and Rick Burruss, who always kept the mood
light and humor good, even at the bitter end of ten-day observing runs. Mike
Doyle, John Henning, Dave Tennent, and the rest of the Palomar crew deserve
special commendation for their expertise and patience.
For their helpful discussion and assistance with modifying Starburst99, I
thank Claus Leitherer, and especially Daniel Devost, with whom I learned more
about this code than I probably should have. Pat Morris and Roberta Humphreys
offered very illuminating discussion on the properties of hot stars and spectral di-
agnostics. Vassilios Charmandaris was always available with interesting anecdotes
on the lives of galaxies, and offered encouragement in the darkest hours of thesis
preparation, assuring me on multiple occasions that “it does add up.” I also thank
Matt Bradford for having finished before me, so that I could profit from his triumphs
and learn from his mistakes.
The support of friends and family is of course a vital ingredient in any graduate
career, and I have been fortunate to have had both. I thank my parents, who never
told me I couldn’t, and my brother and sisters, for showing me how much more
there is to life. Since there are far too many to list, I thank all my friends who have
so generously given of themselves, and reminded me that “life is too important to
take seriously.”
Special thanks are due Sara Ann Lederman, who has been my beacon and my
hope for the past eight years, and without whom I would know very little about life,
loyalty, love, and true happiness. May I always be so fortunate.
Table of Contents
1 Introduction
  1.1 Techniques
  1.2 Organization
2 The SIRTF Cornell Echelle Spectrograph, SCORE
  2.1 Introduction
  2.2 Hardware
    2.2.1 Two Paths Diverge
    2.2.2 Electronics
  2.3 Performance
  2.4 Conclusions
3 Spectral Extraction
  3.1 Introduction
  3.2 Issues of Cross Dispersion
  3.3 Optimal Extraction
    3.3.1 Background
    3.3.2 Profile and Variance
    3.3.3 Algorithm
    3.3.4 Order Curvature
    3.3.5 Line Tilt and Curvature
  3.4 Scorex
    3.4.1 Calibration
    3.4.2 Noise Model
    3.4.3 Extraction
    3.4.4 Order Overlap
    3.4.5 Observing Efficiency
    3.4.6 Flux Calibration
    3.4.7 Flux Renormalization
  3.5 Conclusions
4 Wolf-Rayet Stars
  4.1 Introduction
  4.2 Physical Mechanisms
  4.3 Mid-Infrared Spectroscopy: Background
  4.4 Observations
  4.5 Data Analysis
    4.5.1 Reddening
  4.6 Observed Spectra and Results
    4.6.1 The WN Stars
    4.6.2 The WC Stars
    4.6.3 Line Data
  4.7 Terminal Velocities
  4.8 Mass Loss Rate
    4.8.1 Theory
    4.8.2 Observations and Results
  4.9 Neon Abundance
    4.9.1 Background
    4.9.2 Abundance Calculation
    4.9.3 Observations and Inputs
    4.9.4 Results and Discussion
  4.10 Neon/Sulfur Abundance
  4.11 Dust
  4.12 Conclusions
5 The Wolf-Rayet Phenomenon in Star-Formation
  5.1 Motivation
  5.2 Introduction and Background
  5.3 Starburst Model
    5.3.1 STARBURST99
    5.3.2 Infrared Activity
    5.3.3 Defining the FIR flux
    5.3.4 Model
    5.3.5 Results
    5.3.6 Extinction
  5.4 Infrared Detectability
    5.4.1 Continuum Dilution
    5.4.2 Line Strength
    5.4.3 Background Continuum Sources
    5.4.4 Discussion
  5.5 NGC 5253
    5.5.1 Observations
    5.5.2 Spectrum
    5.5.3 Discussion
  5.6 Required Number of WR Stars
  5.7 Conclusions
6 Conclusions and Future Direction
  6.1 Wolf-Rayet Stars
  6.2 Wolf-Rayet Galaxies
A Derivation of the Optimal Extraction Weights
List of Tables
3.1 Digital Unit to Photoelectron Conversion Factor
4.1 Source observations and parameters
4.2 Undetected Sources with Upper Flux Limits
4.3 Object Counts in Survey by Subtype
4.4 Observed Line Data, by Line
4.5 He i and He ii Line Blend Constituents
4.6 Terminal Wind Velocity from [S iv]
4.7 Mass Loss Rate from Extrapolated Continuum
4.8 Atomic Data for Neon and Sulfur Lines
4.9 Example Mass-Loss Rates from Various Methods
4.10 Ne⁺ and S³⁺ Abundances
5.1 Starburst99 Model Inputs
5.2 Maximum WR Flux Contribution at Selected Wavelengths
5.3 WR Line Ratios at Maximum Contribution
List of Figures
2.1 Optical Diagram
2.2 Production Optical Schematic
2.3 NGC 7027 Spectrum and Slit View
2.4 Electronics Schematic
2.5 Sensitivity Comparisons
3.1 Curved Order Schematic
3.2 Unreduced Atmospheric Spectrum
3.3 Straight Array Order Schematic
3.4 Curved Array Order Schematic with Line Tilt
3.5 Laboratory Ammonia and Blackbody Spectra
3.6 Score Wavelength and Line Tilt Calibration
3.7 Score Electronic Noise Variance
4.1 Upper Main Sequence Luminosity Limit
4.2 WR Star Lifetime as a Function of Mass and Metallicity
4.3 The WN9–WN8 spectra
4.4 The WN8–WN7 spectra
4.5 The WN6 spectra
4.6 The WN5 spectra
4.7 The WN5–WN4 spectra
4.8 The WC9 spectra
4.9 The WC8–WC7 spectra
4.10 The WC7–WC6 spectra
4.11 The WC5–WC4 spectra
4.12 The NaSt1 Spectrum
4.13 He ii 9.7µm to He i+He ii 11.3µm Line Ratios
4.14 He ii 9.7µm:He i+He ii 11.3µm:He i+He ii 12.36 Line Ratios
4.15 Wind Model Geometry
5.1 Spectral Energy Distribution of Galaxies
5.2 Optical WR Galaxy diagnostic
5.3 Model Spectra Energy Distribution
5.4 WR and O Star Count Evolution
5.5 WR Contribution to Total Spectral Energy
5.6 Extincted Model Spectral Energy Distribution
5.7 NGC 5253 Score Spectrum
We are all in the gutter, but some of us are looking at the stars.
Oscar Wilde
Chapter 1
Introduction
The search for the true history of star formation in the universe is not unlike any
other historical analysis: plagued by incomplete and inaccurate information, con-
flicting accounts, impassioned but opposing viewpoints — the same ambiguities and
inconsistencies that confront historians seeking fundamental understanding of any
event which has been clouded over by the inexorable passage of time.
Uncovering the accurate history of stellar creation, from the first light of primeval
galaxies to the familiar glow of the local neighborhood, would have immediate and
sweeping impact on the theories and frameworks of cosmology, stellar evolution,
galaxy and cluster formation — bridging the substantial gap in knowledge and
time between the creation of the primordial elements, and the complex and highly
structured processes which shape the present day universe. Yet despite intensive
research effort and rapid recent progress, this history remains poorly understood.
A variety of attempts have been made to quantify the total amount of star
formation that occurred at early epochs (e.g. Madau et al., 1996). Some accounts
indicate that the formation rate went through a sharp peak quite recently, near
redshift z ∼ 1, and then fell dramatically to the current, relatively quiescent levels.
These efforts to formulate a consistent history, however, have been complicated by
a poor understanding of the distribution and evolution of dust in the most luminous
early stellar environments. The same dust whose formation is intimately connected
to the birth of stars is quite effective at concealing it.
Recent analysis of the luminosity density function of high redshift Lyman-break
galaxies (Steidel et al., 1999) has provided strong support for other evidence indicating
that the peak star-forming epoch has not been observed, to z ≳ 4. We simply
have not yet seen the onset of star formation in the universe. The resolution of the
sub-mm background into individual sources at high redshift, and the subsequent es-
timation of the surprisingly large fraction of star-forming activity concealed by dust
among these objects (e.g. Peacock et al., 2000) have begun to resolve this ambiguity,
but further uncertainties remain, including the importance of active galactic nuclei
(as might help explain the recently resolved hard X-Ray background — Barger
et al., 2001), and associated nuclear starbursts to the energy output of the early
star-forming universe.
The most luminous galaxies known emit strongly in the infrared, out-numbering
quasars in the local universe (z ≲ 0.3) by 2:1 at luminosities greater than 10¹² L⊙
(Sanders & Mirabel, 1996). The source of luminosity in these so called infrared
luminous galaxies (ILGs) is uncertain — members of the class display a range of
contribution both from central, active nuclei and extended starbursts. Do the ILGs
have corollaries or progenitors in the distant universe, and will they be related to
the other high z populations already known? What role do normal galaxies play in
the overall history of star formation?
We will soon be able to begin answering these questions, thanks to new space
and airborne observatories including SIRTF, Herschel, and SOFIA, along with a
developing contingent of more capable ground-based instruments deployed on larger,
more efficient telescopes at infrared-favorable sites.
1.1 Techniques
To probe the evolution of star formation within a population of objects, it is vital
to understand the various contributions to their energy output. Increased sensitiv-
ity may uncover relationships between currently known classes of objects, or may
make possible the discovery of entirely new classes, but alone it cannot address the
nature of their energy generation. Without techniques available to approach and
disentangle the dominant underlying power mechanisms, we will be limited to sim-
ply counting objects, albeit in a variety of ways (luminosity functions, co-moving
volume densities, spatial correlation functions, etc.). Though much can be learned
from object counts, without deeper insight into the nature of the luminous content at
every epoch in the star-forming universe, its accurate history cannot be developed.
The measured emission from local luminous galaxies is dominated by several
processes: direct starlight from massive stars, highly non-thermal radiation from
active nuclei, and thermal re-radiation by dust. These emission components com-
bine, sometimes with significant dust extinction, to yield energy distributions which
are often difficult to interpret. A variety of techniques exists to help resolve this
ambiguity and differentiate star formation from other power sources.
Tracing the Hα recombination emission from H ii regions excited by ionizing
radiation of the embedded hot stars is a common method of measuring stellar con-
tent, and has been used with good results (e.g. Kennicutt, 1998); however, it fails
for objects suffering more than moderate extinction. Also, since atomic hydrogen is
so abundant in galaxies, and can be excited or ionized in a variety of astrophysical
contexts, its power to distinguish stellar from other excitation mechanisms can be
limited. Other hydrogen recombination lines (e.g. Hβ or Brγ) can be used in the
same way, but they quickly become quite weak at longer wavelengths, before appre-
ciable relief from heavy extinction is obtained.1 A more fundamental difficulty is
the reliance on properties of the surrounding medium. Since the lines observed have
been reprocessed by recombination, a “leaky” medium which permits the escape of
ionizing photons from the galaxy can confuse or obstruct this diagnostic.
Ultraviolet (UV) continuum measurements directly sample the photospheres of
hot stars, which emit the bulk of their energy at these wavelengths. Hence the total
UV flux scales directly with the star formation rate of active star-forming galaxies.
Emission just longward of the Lyman limit is in principle an extremely powerful
diagnostic of the stellar population, and ground-based rest frame UV observations of
high redshift galaxies have already provided compelling constraints on the evolution
of star and element formation (Madau et al., 1996). The chief disadvantage of
ultraviolet techniques is extreme sensitivity to extinction. The extinction just above
the Lyman limit is A_0.1µm/A_V ∼ 400, an increasingly severe difficulty to overcome
for studying environments of very recent star formation, in which the hottest and
youngest stars can often be the most deeply embedded. For distant galaxies the
Lyman forest of absorption lines can also remove considerable flux between Lyα
and the limit.
Energy source discrimination techniques which are based on far-infrared and ra-
dio continuum measurements, while virtually free from the effects of dust extinction,
are more prone to misinterpretation, since in dusty galaxies with hidden AGN they
sample only reprocessed radiation. Alternatively, the ionized regions surrounding
the hot stars in a stellar population can be directly investigated with mid- and
1 Brα (4.05 µm) may be an important line available from space, since it sits at a relative minimum
in dust extinction.
far-infrared fine structure lines of neon, sulfur, and other elements, which suffer
relatively little extinction (A_V/A_15µm ∼ 100). The hardness of the ionizing flux
(and hence the nature of the source of ionizing photons) can be determined using
well-chosen ratios among these lines, but age dependence, upper-mass cutoff, and
abundance effects may dilute their discriminatory power (Thornley et al., 2000).
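
To make the extinction advantage of the mid-infrared concrete, the short Python sketch below converts magnitudes of screen extinction into a surviving flux fraction, using the standard magnitude relation and the ratio A_V/A_15µm ∼ 100 quoted above. The choice of A_V = 5 magnitudes is purely illustrative, and the function name is hypothetical.

```python
def transmitted_fraction(extinction_mag: float) -> float:
    """Fraction of flux that survives `extinction_mag` magnitudes of screen extinction."""
    return 10.0 ** (-0.4 * extinction_mag)


# Illustrative assumption: a starburst behind A_V = 5 magnitudes of dust.
A_V = 5.0
# Ratio quoted in the text: A_V / A_15um ~ 100.
A_15UM = A_V / 100.0

optical_fraction = transmitted_fraction(A_V)     # 10**(-2)
mid_ir_fraction = transmitted_fraction(A_15UM)   # 10**(-0.02)

print(f"V band: {optical_fraction:.3f} of the flux survives")
print(f"15 um:  {mid_ir_fraction:.3f} of the flux survives")
```

Under these assumptions only about 1% of the visual flux escapes, while the 15 µm flux is nearly unattenuated.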
A technique for identifying star formation which uses unambiguously stellar ra-
diation has led to a class of objects known as Wolf-Rayet Galaxies (Conti, 1991),
so named because of their measured contingent of Wolf-Rayet (WR) stars, a class
of peculiar emission line objects characterized by extreme mass loss rates Ṁ ∼ 10⁻⁴ M⊙ yr⁻¹ = 10¹⁰ Ṁ⊙.
The mass loss is driven in fast (365 ≤ v∞ ≤ 5000 km/s), dense stellar winds, in which broad emission lines are formed. Though detecting the
unusual emission features which signal the presence of WR stars in the integrated
spectra of galaxies is challenging, this direct diagnostic provides incontrovertible
evidence of massive star formation.
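
As a rough check on the scale of the mass-loss rate quoted above, the following Python sketch converts it to SI units and compares it with the solar rate. The solar value of 10⁻¹⁴ M⊙ yr⁻¹ is an assumption, taken to be the one implied by the equality in the text; the constants are rounded and the helper name is hypothetical.

```python
M_SUN_KG = 1.989e30   # solar mass in kg (rounded)
YEAR_S = 3.156e7      # one year in seconds (rounded)


def msun_per_yr_to_kg_per_s(rate_msun_yr: float) -> float:
    """Convert a mass-loss rate from solar masses per year to kg per second."""
    return rate_msun_yr * M_SUN_KG / YEAR_S


WR_MDOT = 1e-4      # typical WR mass-loss rate from the text, in M_sun/yr
SOLAR_MDOT = 1e-14  # assumed solar mass-loss rate, in M_sun/yr

ratio = WR_MDOT / SOLAR_MDOT
wr_kg_s = msun_per_yr_to_kg_per_s(WR_MDOT)

print(f"WR / solar mass-loss ratio: {ratio:.1e}")
print(f"WR mass-loss rate: {wr_kg_s:.2e} kg/s")
```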
Although the WR direct detection technique is powerful, it has so far been
applied using only a handful of strong optical wind lines, which cannot probe environments
in which star formation is embedded in dust. Infrared WR wind emission lines are
intrinsically weaker, but the possibility of using them to extend this diagnostic and
pierce the veil of extinction which may cloak a substantial fraction of star formation
in the local and distant universe is what motivated this thesis.
1.2 Organization
In an effort to extend the available spectral templates of WR stars to mid-infrared
(MIR) wavelengths, we undertook an 8–13µm spectrophotometric survey of fifty
northern Galactic stars with Score, a novel MIR spectrograph built as a prototype
of one of the instruments to fly aboard Sirtf. Score will be described briefly in
Chapter 2, and its sensitivity compared with past and future space-borne instru-
ments. Chapter 3 will document new analysis techniques which were developed for
the extraction of the Score spectral data. The spectra themselves will be pre-
sented in Chapter 4, along with constraints on Wolf-Rayet stellar evolution from
mass-loss rates, terminal wind velocities, and neon and sulfur abundances derived
from the data. In Chapter 5, population synthesis models of very young, instan-
taneously formed single stellar populations will be presented, which draw on the
measured MIR spectra to evaluate the conditions for which infrared direct WR
detection might be favorable. Score observations of nearby blue compact dwarf
galaxy NGC5253 will also be presented. Final conclusions and future possibilities
will be given in Chapter 6.
Chapter 2
The SIRTF Cornell Echelle
Spectrograph, SCORE
2.1 Introduction
The Space Infrared Telescope Facility (Sirtf) is a NASA observatory slated to
launch in July, 2002. It will consist of two cameras and a four-module spectrograph,
the Infrared Spectrograph (IRS; Houck et al., 2000; Roellig et al., 1998). Two of
the four IRS modules operate at low resolution with long slits, and two trade slit
length for increased resolution.
Compared to ground based infrared observations, Sirtf’s expected sensitivity
enhancements are phenomenal. Atmospheric molecular bands of H₂O, N₂O, CO₂,
CH₄, O₃, and others absorb large amounts of infrared radiation. A blackbody of temperature
300 K reaches its peak flux at ∼10 µm, and hence the thermal background
of the telescope and of the atmosphere, which is substantially emissive in the same
molecular bands, is extremely high. Sirtf gains significantly both by
getting out from beneath the atmosphere and away from the thermal background of
a warm telescope in its solar, Earth-trailing orbit, and by passively cooling the tele-
scope assembly; however, these advantages alone do not account for the tremendous
Sirtf sensitivity gains. Other advancements combine to make it several hundred
times more sensitive than the similarly sized ISO space observatory. Efficient, large-
format infrared detector array technology has recently progressed at a rapid pace.
Coupled with spectrographic designs which make full use of the higher pixel count,
these detectors contribute significantly to the improved performance. Though still
plagued by the difficulties of high thermal background and atmospheric opacity,
ground-based mid-infrared instruments can take advantage of the same detectors
and designs to achieve greatly improved performance over earlier efforts.
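The ∼10µm thermal peak quoted above for a 300K blackbody follows from Wien's displacement law. The short sketch below is only an illustrative check; the function name is hypothetical and the constant is the standard Wien displacement constant expressed in micron-kelvin.

```python
# Wien displacement constant in micron * kelvin (standard physical constant).
WIEN_B_UM_K = 2897.77

def wien_peak_um(temperature_k):
    """Wavelength (in microns) at which a blackbody's B_lambda peaks."""
    return WIEN_B_UM_K / temperature_k

# A room-temperature telescope and atmosphere peak inside the N-Band window.
peak_300k = wien_peak_um(300.0)
print(f"300 K blackbody peaks near {peak_300k:.1f} micron")
```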
The Sirtf Cornell Echelle (Score) is a mid-infrared (MIR) spectrograph built
as a prototype of the IRS’s short wavelength, high resolution (“short-high”) module,
and was the first cross-dispersed MIR spectrograph in operation (Van Cleve et al.,
1998; Smith et al., 1998). Score operates at a resolution of R ∼ 600 over the N-Band wavelength range from 8–13.5µm, using versions of the same detectors which
will fly aboard Sirtf, modified to accommodate the notably higher background
fluxes which attend ground-based observation. It has been operated with success
at the f/70 Cassegrain focus of Palomar Observatory’s 5-meter Hale telescope since
November, 1996. Score was designed with the same philosophy as the Sirtf
instrument: no moving parts, and “bolt-and-go” construction, which eliminates the
fine tuning adjustments most instruments require upon assembly, and decreases cost.
2.2 Hardware
The original Score testbench was constructed entirely of aluminum by Ball Aero-
space, permitting room temperature focusing to hold (except for a second order
focus shift upon cooling) at cryogenic temperatures. The Score dewar was man-
ufactured by Precision Cryogenics, Inc., and contains two cryogenic tanks, which
permit cooling the detector arrays and spectrograph to liquid helium (LHe) temper-
atures for proper operation. Additionally, the 3.7 liter liquid nitrogen (LN2) tank
decreases emission from internal optics and increases the LHe hold time (≳24 hours
for 5 liters LHe).
Score uses two 128×128 Si:As BIBIB focal plane arrays, developed at Rockwell
International (now Boeing; Seib et al., 1994). One of the focal-plane arrays (FPAs)
serves as the detector for the spectrograph, and the other serves as the slit-viewer
detector.
2.2.1 Two Paths Diverge
The Sirtf IRS prototype module which forms the core of the Score instrument
was modified with an additional optical path which will not be present on Sirtf—
a slit viewer. Fig. 2.1 shows diagrammatically Score’s two optical paths. The
actual as-built schematic is shown in Fig. 2.2. Incoming light from the telescope is
Figure 2.1: A simple diagram of the Score optical system (not to scale), showing the two major pathways. The light from the telescope passes through an f-converter, yielding an f/12 beam on the slit. Light passing through the slit plane first encounters a 7.3µm cut-on filter. It is then collimated, cross-dispersed, dispersed by the echelle, and imaged onto the spectrograph FPA. Light reflected from the slit plane passes through a filter/blocking element pair and is re-imaged onto the slit-viewer FPA.
focussed on the reflecting slit surface, which is tilted at a 45◦ angle to the optical
axis. The slit itself has a projected size of 240×480µm, which is matched to the
5-m telescope’s 10µm diffraction limit of 1′′ (2.44λ/D).
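As a quick check of the 1′′ figure quoted for the slit matching, the sketch below evaluates 2.44λ/D (the angular diameter of the first dark ring of the Airy pattern) at 10µm. The function name is hypothetical, and the aperture is taken as the nominal 5 m used in the text.

```python
import math

def diffraction_diameter_arcsec(wavelength_m, aperture_m):
    """Angular diameter 2.44*lambda/D of the first Airy dark ring, in arcseconds."""
    radians = 2.44 * wavelength_m / aperture_m
    return math.degrees(radians) * 3600.0

# 10 micron light on a nominal 5 m aperture.
hale_10um = diffraction_diameter_arcsec(10e-6, 5.0)
print(f"{hale_10um:.2f} arcsec")
```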
Light passing through the slit is collimated and encounters a grating pre-disperser
and then the echelle grating (which disperse at right angles to each other), and is
then refocused by a camera mirror onto the main detector. This produces the
echellogram spectral format seen in Fig. 2.3, in which 15 individual orders of the
main echelle grating, sorted by the cross-disperser, fall in adjacent positions.
Light which is reflected by the slit surface enters the slit-viewer module, gets
filtered by a blocking element and a “silicate filter” centered at 11.3µm with 1µm
bandpass, and is re-imaged onto the second detector. This provides a 12′′ diameter
field of view surrounding the slit, which is used for target acquisition, and absolute
flux calibration of point sources. The slit-viewer data are recorded simultaneously
with the spectral data, and provide an accurate register of the source position on
the slit. Fig. 2.3 also shows an example slit-viewer image for planetary nebula
Figure 2.2: The production schematic of the Score optical elements in two orthogonal views, starting at the cold Lyot stop formed by the telescope and f-converter optics. See text for description.
Figure 2.3: The crossed-echelle spectrum (right), and simultaneous 11.3µm slit-viewer image (left) of planetary nebula NGC 7027. A bright line of [S iv] is seen in the center of the spectrum, along with Ar iii, Ar iv, [Ne ii], and a broad PAH feature. The slit-viewer camera shows the precise location of the slit (dark rectangle) on the southwest lobe.
NGC 7027.
2.2.2 Electronics
The Score control electronics which operate the two arrays are similar to those
employed by SpectroCam, a 10µm spectrograph and camera built by Cornell for
the Palomar 5-m (Hayward et al., 1993). A schematic diagram of the operating
electronics is shown in Fig. 2.4. In brief, the two arrays are driven in parallel, and
clocked out exactly as if they were one larger 256×128 array with eight output
channels. Most array input clocks and biases are slaved together for the two arrays,
and duplicate pre-amplifier and 14-bit analog-to-digital converters handle digitizing
the four output channels per detector. A dedicated PC mounted near the instrument
at the back of the telescope controls the arrays, while a Sun Sparcstation collects the
data and serves as the instrument’s user interface. The large and highly variable
background seen by Score at Palomar requires rapid chopping (∼5Hz) on and
off source to cancel the changing sky flux and imposes the additional requirement
that the arrays must be read every 60ms to avoid saturating the detector wells. A
Figure 2.4: A schematic of the Score electronics, indicating the major bias, clocking, and data pathways. The two FPAs are indicated at right in the dewar. Duplicate pre-amplifiers and analog-to-digital converters for the four output channels of each array are shown. In the diagram, C denotes the camera detector substrate bias and S the spectrograph detector substrate bias.
hardware co-adder is therefore employed to digitally stack consecutive frames in an
integration prior to readout.
2.3 Performance
Score’s performance is illustrated in Fig. 2.5. The point source flux density sensi-
tivity (1-σ, 500s), with chopping overhead removed, is shown. The predicted value
agrees well with the observed sensitivity. Sensitivity is degraded at the edges of the
atmospheric N-Band window, and at a strong ∼9.7µm O3 telluric feature.
For reference, the Sirtf IRS sensitivity (J. Van Cleve, priv. communication),
and the ISO sensitivity (R ∼ 10³) are shown. The latter was computed using the
laboratory noise model outlined in the ISO handbook (Leech et al., 2001), and
is based on detector testing which does not include in-orbit S/N degradation by
effects such as glitches and fringing. The predicted ISO sensitivity in two detector
bands falls remarkably close to Score’s measured value (though at somewhat higher
resolution). In practice, however, Score’s realized sensitivity is at least as good and
in some instances appears to be noticeably better than ISO’s SWS spectrograph, as
Figure 2.5: The Score sensitivity for a 500s staring 1-σ detection. The solid line shows the predicted sensitivity for 25% combined telescope and sky emissivity, which is in close agreement with the measured sensitivity. Also shown are the sensitivities predicted for the SIRTF IRS module, and computed for the ISO-SWS, along with the continuum flux from a nearby brown dwarf.
confirmed by comparison of Score and ISO observations of the same target star.1
Also shown for reference is the flux density of nearby brown dwarf Gl229b.
2.4 Conclusions
The phenomenal performance gains achieved by Score for ground-based mid-
infrared spectroscopy presage the fundamental role Sirtf will play in making this
the “decade of the infrared.” It has vindicated the design concepts and no-moving-
parts philosophy Sirtf’s spectrographic instrument is based on, and has proved
extremely scientifically productive, contributing to projects on Be Stars, planetary
nebulae, M dwarfs, and many others. Future mid-infrared instruments utilizing
modern large detector arrays will benefit from the pioneering role Score has played.
1ISO, however, had much larger wavelength coverage than Score, including regions blocked by the atmosphere. The [Ne iii] (15.5 µm) and [Ne v] (24.3 µm) lines, unavailable to ground-based instruments, proved particularly valuable.
Chapter 3
Spectral Extraction
3.1 Introduction
The advent of large format optical and infrared detector arrays has substantially im-
pacted spectrograph design and data analysis techniques. Since individual spectral
orders are usually imaged as approximately linear entities, and are generally more
extended in the dispersion direction than the spatial (slit) direction, taking full ad-
vantage of the available detector area requires careful placement of the spectrum
on the array. Partitioning the image delivered by the telescope using fiber bundles
or differential slicing elements allows additional spatial information to be recorded
— spectra from distinct object locations but with the same wavelength coverage
are replicated in different locations on the array. Alternatively, cross dispersing the
spectrum permits additional spectral orders to be separated, filling the detector ar-
ray, and greatly increasing the wavelength coverage available. These modern optical
configurations have revolutionized astronomical spectrography, but have also neces-
sitated a new class of sophisticated algorithms to efficiently extract the spectra they
produce.
3.2 Issues of Cross Dispersion
In spectrographs which cross-disperse the incoming beam (such as Score — Chap-
ter 2), the spectral orders formed by the main dispersing element (often an echelle
grating operated in high order) will not in general fall parallel to the detector rows or
columns. The large wavelength coverage at relatively high resolution made possible
by cross-dispersed echelle spectrograph designs comes with a cost: the necessity of
Figure 3.1: A schematic curved order image with exaggerated slit rotation, and spectral axes depicted.
sampling the non-linear dispersion regime of the order-sorting element (typically a
grating or a prism), resulting in curved orders with variable spacing between them.
Grisms, or immersed gratings, which combine the opposing non-linear disper-
sions of prisms and gratings in a single optical element, can potentially be used as
cross-dispersers to reduce order curvature and differential spacing, much as an achro-
matic doublet balances to first order the opposing chromatic aberrations of positive
and negative lenses of different glass materials. Despite the increased placement
efficiency and more evenly spaced orders made possible by grisms, the higher order
aberrations remain significant. The effective dispersion direction and slit direction
(the spectral axes) will in general still not be perpendicular, requiring either one
or the other to cross the detector axes (usually the slit direction is chosen to align
approximately with one of the detector axes).
For more typical, curved-order spectra, not only are the spectral axes non-
orthogonal and non-parallel to the detector axes, they also rotate and diverge or
converge along the order, illustrated schematically in Fig. 3.1. Additional, unre-
lated optical effects can serve to introduce wavelength dependent line tilt, as seen in
Fig. 3.2, a Score spectrum of atmospheric emission obtained by summing individ-
ual sky frames, and dividing out the lab-measured optical and electronic efficiency
map to accentuate the data located at the extremes of the slit. Note the pronounced
counter-clockwise rotation of lines in the lower orders.
Long slit lengths, though not usually present in cross-dispersed spectrograph
designs, can complicate matters further by creating slit images which are not just
tilted but also curved. A tilted slit image can be regarded as a small segment of
a curved slit image which samples such a small portion of the curvature that it
appears locally straight.
Curved, unevenly spaced orders and a rotating slit image combine to make stan-
dard, row or column-based extraction undesirable for many digital two-dimensional
Figure 3.2: An unreduced atmospheric spectrum showing various emission and absorption lines of H2O, CO2, and O3. Longer wavelengths are towards the bottom and left. The combined optical and electronic efficiency measured in the lab as the difference of warm blackbodies has been divided out to enhance the signal at the extremes of the slit. Note the pronounced tilting of lines at the long wavelength (bottom) end of the spectrum, and the differential tilt within individual orders.
spectra.
3.3 Optimal Extraction
Early methods for extracting digitally recorded spectra involved identifying the re-
gion within the slit image occupied by the source and sky, and by the sky alone, and
then individually summing over the pixels in these regions in a direction perpen-
dicular to the dispersion axis (the spatial direction). These two sums can be scaled
to each other using the number of pixels involved, and the sky subtraction trivially
performed by differencing the pixel-weighted sums. While this technique is concep-
tually and computationally quite simple, modern spectrographs have increasingly
rendered it obsolete.
Even if the spectral orders are well aligned with the detector axes, the standard
extraction technique suffers. Since an object’s flux is not usually equally distributed
along the slit, but instead is described by a spatial profile — a wavelength depen-
dent function specifying the fraction of light which falls in each pixel along the
slit (see Fig. 3.3) — adding together all of the spatial pixels containing object flux
unnecessarily degrades the signal-to-noise (S/N) of the resultant spectrum, even if
all these pixel values are characterized by the same underlying noise (not often the
case). One alternative possibility is summing over only those object pixels with
the highest S/N , above some preset threshold. Unfortunately, this technique will
compromise spectrophotometric accuracy, since the profile will not be uniformly
sampled. This problem is especially severe for the common case of slit profiles
which change shape with wavelength. An immediate solution is apparent: perform
a weighted total along the slit direction, with weights chosen to maximize the S/N .
The class of algorithms which prescribe these weights statistically are called optimal
extraction algorithms.
3.3.1 Background
The first optimal extraction algorithm was documented by Horne (1986), and later
expanded by others. Assuming a known profile function at a single wavelength, Pλj,
with normalization ∑_j Pλj = 1, where j is the discrete coordinate in the spatial
direction (1 ≤ j ≤ n along a slit image covering n pixels) and λ is the dispersion
coordinate (see Fig. 3.3), we can estimate the expected value of a given pixel at
Figure 3.3: A schematic portion of an ideal straight array order, labelled with the coordinate systems used in the discussion. The discrete coordinates are (λ, j), enumerating pixels in the dispersion direction (rows), and slit direction (columns), respectively. The analogous continuous coordinates are (x, y). To the right, an example profile, Pλj, at a single wavelength shows how a given hypothetical flux distribution along the slit (smooth curve) becomes a normalized profile function.
some wavelength λ in a single order of the spectral image D as
〈Dλj〉 = Pλjfλ (3.1)
where fλ is the true and unknown flux at one wavelength in an order, which we wish
to recover. The profile, Pλj, will in general be affected by the source intensity and
angular size (e.g., point-like vs. extended), atmospheric seeing, slit throughput, and
various other instrumental optical effects.
An estimate of the true flux, fλ, based on the data at a given wavelength (in
one spectral order on the array; order overlap is addressed in § 3.4.4) is denoted
f̃λ. Dropping the redundant λ from all quantities, this estimate can be written as a
linear combination of all the pixels along the slit:
\[
\tilde{f} = \sum_j w_j D_j \tag{3.2}
\]
where wj are an optimal set of weights at this wavelength which we wish to de-
termine. Alternatively, we could consider a weighted sum of the individual flux
estimates at each spatial position — f̃j = Dj/Pj. Each of the f̃j represents the
flux we would estimate using the profile if we had only the jth pixel’s data to con-
sider; if the profile is accurately determined, the independent estimates f̃j should all
have the same mean (though different variances). We can write this alternatively
formulated, but tantamount, estimate with a different set of weights, w′j:
\[
\tilde{f} = \sum_j w'_j \tilde{f}_j = \sum_j w'_j \frac{D_j}{P_j} \tag{3.3}
\]
It will be convenient to retain both of these interchangeable formulations of the
weighted sum. Note that as of yet we have said nothing of the origin or form of the
profile function Pj (but see § 3.3.2). The variance of the equivalent flux estimates in
Eqs. 3.2 & 3.3 is
\[
V(\tilde{f}) = \sum_j w_j^2 V_j = \sum_j w'^2_j \frac{V_j}{P_j^2} \tag{3.4}
\]
where Vj is the variance at pixel j along the slit. The expected value of each data
element (Eq. 3.1) implies an expected value of the flux estimate, using Eqs. 3.2 &
3.3, of
\[
\langle \tilde{f} \rangle = f \sum_j w_j P_j = f \sum_j w'_j \tag{3.5}
\]
For an unbiased estimate, we require the expected value of the flux estimate to equal
the true flux, ⟨f̃⟩ = f, which translates into the pair of constraints:
\[
\sum_j w_j P_j = \sum_j w'_j = 1 \tag{3.6}
\]
Minimizing each of the equivalent variances in Eq. 3.4 subject to these constraints
with the method of Lagrange multipliers (see Appendix A), the optimal weights are
found to be
\[
w_j = w'_j / P_j = \frac{P_j / V_j}{\sum_i P_i^2 / V_i} \tag{3.7}
\]
The weights for the two different formulations are related as we would have predicted
had the final equality in Eq. 3.5 held term by term. Since the variance of each
individual flux estimate f̃j is given by Vj/Pj² (see Eq. 3.4), we can understand
Eq. 3.7 as simply the statement that the minimum variance in the weighted average
of a population (in this case, the one-pixel flux estimates) with identical mean is
achieved with weights inversely proportional to the individual variances (see, e.g.
Bevington, 1992, p. 59).
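The weights of Eq. 3.7 can be checked numerically. The following is a minimal sketch, assuming a hypothetical five-pixel profile and a uniform pixel variance (neither taken from Score data); it confirms the unbiased-estimate condition of Eq. 3.6 and compares the variance of Eq. 3.4 for optimal weights against standard extraction (all weights equal to one).

```python
import numpy as np

def optimal_weights(profile, variance):
    """Eq. 3.7: w_j = (P_j / V_j) / sum_i(P_i^2 / V_i)."""
    profile = np.asarray(profile, dtype=float)
    variance = np.asarray(variance, dtype=float)
    return (profile / variance) / np.sum(profile**2 / variance)

# Hypothetical normalized slit profile and uniform pixel variance.
P = np.array([0.05, 0.20, 0.50, 0.20, 0.05])
V = np.full(5, 4.0)

w = optimal_weights(P, V)
unbiased_sum = np.sum(w * P)        # Eq. 3.6: should equal 1
var_optimal = np.sum(w**2 * V)      # Eq. 3.4 with the optimal weights
var_standard = np.sum(V)            # Eq. 3.4 with all w_j = 1 (standard extraction)
print(unbiased_sum, var_optimal, var_standard)
```

Even with identical noise in every pixel, the weighted sum yields a lower variance than the plain sum, because pixels carrying little of the profile contribute mostly noise.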
3.3.2 Profile and Variance
The profile and the noise model used to predict pixel variance are the two most im-
portant ingredients in optimal extraction. Incorrectly estimating the pixel variances
will degrade the S/N , but will not bias the extraction, since any set of Vj in Eq. 3.7
will satisfy the enforced unbiased estimate condition (Eq. 3.6). An incorrect profile,
however, will disrupt the individual per-pixel flux estimates, so that the assumption
that each of the f̃j are drawn from a population with the same mean is invalidated,
and the calculations which led to the optimal weights in Eq. 3.7 are flawed. This can
introduce a bias into the final calculated flux. Obviously, to maintain an unbiased
flux estimate, the error in the profile utilized must be appreciably smaller than the
individual pixel noise.
The earliest techniques developed to ensure a well-determined spatial profile
function involved summing over each row corresponding to a given pixel (a fixed
value of j in Fig. 3.3), and using the same fractional profile present within this
order sum at all wavelengths in the order (Robertson, 1986). For orders with any
curvature, or with intrinsic variation in the profile, this method fails. To accommo-
date variations in the profile, newer formulations specified fitting a given function,
such as a Gaussian, to the spatial profile data at each wavelength, and then smoothly
interpolating over the family of functions so found. This method forces the choice of
some predefined form for the profile, which may not always hold, due to changes in
seeing, optical distortions, and different intrinsic angular distributions for different
sources. Even if the profile remains relatively unchanged across the order, it may
not have a form simple enough to yield to an easy analytic fit.
The strength of the profile generation method introduced by Horne (1986) is
that it specifies no special form for the underlying profile function — assuming
only that it varies smoothly across the order.1 Thus, the technique is powerful for
point sources and extended sources, in the presence of all aberrations which serve
to introduce continuous profile distortions.
The pixel variances, while immaterial to the overall bias of the extracted flux, are
critical if performance better than standard extraction (the special case of optimal
extraction in which all the wj are equal) is to be obtained. Typically, a model
1This assumption breaks down for extended sources in which the shape or center of the spatial profile shifts with wavelength (for instance, objects with discrete line emission offset from continuum emission). It is still valid for spatially resolved source components with different, even opposite, continuum slopes, as long as the continua vary smoothly, and for objects with line and continuum emission not separately resolved.
variance is formulated as:
Vλj = V0 + |Pλj f̃λ|/Q (3.8)
where V0 is a constant term, including detector readout noise, which is indepen-
dent of the impinging flux. The second term measures the expected variance with
the implicit assumption that the arriving photon flux is characterized by Poisson
statistics, such that for a number of photo-events N, the realized noise in the measurement
is √N. The factor Q is the number of photo-electrons per recorded data
number, and is fixed by the detector and readout electronic characteristics. If the
constant term is zero, and the recorded flux is contributed entirely by the source
object (i.e. the measurement is source noise limited), then Vλj ∝ Pλj and all pixels
in the estimate of Eq. 3.7 are given equal weight. If background or detector noise
is important, Eq. 3.8 must be modified, since the noise affecting the measurement
of f̃λ will no longer be dominated by f̃λ itself. Since other types of noise may appear
which are arbitrary functions of the flux impinging on a pixel (linear or nonlinear),
or even the prior history of the detector, care must be taken to develop an accurate
noise model (see § 3.4.2 for a simple example).
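The two limiting cases of the variance model in Eq. 3.8 can be illustrated with a short sketch. The profile, flux, V0 and Q values below are hypothetical, chosen only to show the behaviour of the weights of Eq. 3.7: in the source-noise-limited case all weights equal one, while in the constant-noise-dominated case the weights become proportional to the profile.

```python
import numpy as np

def model_variance(profile, flux, v0, q):
    """Eq. 3.8: V_j = V0 + |P_j * f| / Q."""
    return v0 + np.abs(np.asarray(profile, dtype=float) * flux) / q

def optimal_weights(profile, variance):
    """Eq. 3.7: w_j = (P_j / V_j) / sum_i(P_i^2 / V_i)."""
    return (profile / variance) / np.sum(profile**2 / variance)

P = np.array([0.1, 0.2, 0.4, 0.2, 0.1])   # hypothetical normalized profile

# Source-noise-limited: V0 = 0, so V is proportional to P and every weight is 1.
V_source = model_variance(P, flux=1000.0, v0=0.0, q=5.0)
w_source = optimal_weights(P, V_source)

# Constant-noise-dominated: V is nearly uniform, so the weights follow the profile.
V_const = model_variance(P, flux=1000.0, v0=1.0e9, q=5.0)
w_const = optimal_weights(P, V_const)
print(w_source, w_const)
```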
3.3.3 Algorithm
The Horne algorithm involves establishing the profile and variance map for all pixels
in an order, and using these to compute the optimal weights (Eq. 3.7) used in
deriving the final flux. Since both the profile and variance themselves depend on
the flux calculated, the process is necessarily iterative. Typically the procedure
consists of generating an initial flux estimate as a starting point (again omitting the
redundant subscript λ):
\[
\tilde{f}^{(0)} = \sum_j D_j \tag{3.9}
\]
This is simply the standard extraction flux estimate. The initial profile function,
P_j^{(0)}, is found by fitting low order polynomials over wavelength to a set of profile
estimates, one for each pixel position along the slit. The initial estimate of the
fraction of flux which pixel j contains is simply
\[
\tilde{P}_j^{(0)} = \frac{D_j}{\tilde{f}^{(0)}} \tag{3.10}
\]
To prevent cosmic ray or other blemishes from affecting the profile, outliers are
iteratively removed from the profile values being fit if
\[
(D_j - \tilde{f}_j \tilde{P}_j)^2 / V_j \geq \sigma^2_{\mathrm{clip}} \tag{3.11}
\]
where σ²clip is a clipping variance. This simply enforces the constraint that the
square deviation from the expected pixel value in units of the pixel variance (the
mean square deviation) cannot exceed a given threshold.
Negative values of the profile function are truncated at zero, and it is normalized
at each wavelength:
\[
P_j^{(0)} = \frac{\max(0, P_j^{(0)})}{\sum_i \max(0, P_i^{(0)})} \tag{3.12}
\]
where the max function returns the maximum value of its two operands. The pixel
variance is then recomputed using not the pixel value Dj, but the expected pixel
value f̃ P_j^{(0)}:
\[
V_j^{(1)} = V(\tilde{f} P_j^{(0)}) \tag{3.13}
\]
Using this updated variance to derive a new set of weights in Eq. 3.7, a new weighted
flux estimate, f̃ (1), is calculated. From the updated flux comes a new profile estimate
at each pixel position:
\[
\tilde{P}_j^{(1)} = \frac{D_j}{\tilde{f}^{(1)}} \tag{3.14}
\]
This updated profile is fit again to determine the smooth profile function, using
the new variance to control outliers. Iteration continues in this way until the flux
converges. The algorithm is not sensitive to the ordering of operations within the
iteration sequence, and typically convergence requires only a few iterations.
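The iteration described above can be sketched compactly for a straight order. This is a simplified illustration under stated assumptions, not the pipeline used for Score: the function name and all parameter values are hypothetical, the noise model is Eq. 3.8, the outlier clipping of Eq. 3.11 is omitted for brevity, and a fixed number of iterations stands in for a convergence test.

```python
import numpy as np

def horne_extract(D, v0, q, poly_deg=2, n_iter=5):
    """Iteratively extract a spectrum from a straight order image.

    D has shape (n_slit, n_wave): rows index j (slit), columns index wavelength.
    Assumes positive summed flux at every wavelength.
    """
    D = np.asarray(D, dtype=float)
    n_slit, n_wave = D.shape
    x = np.linspace(-1.0, 1.0, n_wave)      # scaled wavelength axis for stable fits
    f = D.sum(axis=0)                        # Eq. 3.9: standard extraction
    for _ in range(n_iter):
        P_est = D / f                        # Eqs. 3.10 / 3.14: noisy profile estimates
        P = np.empty_like(D)
        for j in range(n_slit):              # smooth each slit position over wavelength
            coeffs = np.polyfit(x, P_est[j], poly_deg)
            P[j] = np.polyval(coeffs, x)
        P = np.maximum(P, 0.0)               # Eq. 3.12: truncate negative values
        P /= P.sum(axis=0)                   # Eq. 3.12: normalize at each wavelength
        V = v0 + np.abs(f * P) / q           # Eqs. 3.8 / 3.13: variance of expected value
        # Eqs. 3.2 and 3.7 combined: f = sum_j(P D / V) / sum_j(P^2 / V)
        f = np.sum(P * D / V, axis=0) / np.sum(P**2 / V, axis=0)
    return f, P, V

# Synthetic test order: Gaussian slit profile, linearly rising spectrum.
rng = np.random.default_rng(0)
n_slit, n_wave = 7, 200
j = np.arange(n_slit)
true_P = np.exp(-0.5 * ((j - 3) / 1.2) ** 2)
true_P /= true_P.sum()
true_f = np.linspace(2000.0, 4000.0, n_wave)
v0, q = 25.0, 1.0
expected = true_P[:, None] * true_f[None, :]
D = expected + rng.normal(0.0, np.sqrt(v0 + expected / q))

f, P, V = horne_extract(D, v0, q)
```

The per-row polynomial fit is what makes the profile less noisy than the individual pixels: each fitted value draws on data at every wavelength in the order.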
If the assumption that the spatial profile remains smooth is valid, the profile
function fit described should be relatively less noisy than the individual pixels on
which it depends, since it is affected by data at so many wavelengths.
Another strength of the optimal extraction algorithm comes as a byproduct
of the well-determined profile and pixel variance, both of which were required for
determining the correct weights. Cosmic rays or other blemishes can be effectively
removed from the final spectrum by examining data at each wavelength for extreme
deviation from the expected profile. To avoid detection and introduce spurious line
features into the spectrum, these artifacts would need to closely mimic the spatial
profile — an unlikely scenario for otherwise uncorrelated signals. A bad pixel mask
Figure 3.4: A portion of a curved order image, with an example tilted line profile in gray. The order envelope is smooth, but a pixel-binned version is shown. The discrete coordinate λ no longer corresponds to columns, as in Fig. 3.3 (i now serves this purpose), but is instead a parameter which follows the curved order. The different tick angle at each λ corresponds to the changing line tilt across the order. The j coordinate is as before, but a pixel-based resampling tracing the order is shown (shifting by entire pixels).
can be determined using the exact same criterion used to reject pixels from the
profile function fit (Eq. 3.11), but this time, using the profile fits themselves to
exclude the spurious data so revealed. The same rejection threshold need not be
employed. A good fit profile will still allow the recovery of an unbiased, if noisier,
flux at that wavelength, despite the missing data.
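The masking step can be illustrated at a single wavelength. In the sketch below, the function name, the numerical values, and the use of a median of the one-pixel estimates as the prior flux are all assumptions made for illustration; the rejection test is the squared-deviation criterion of Eq. 3.11 evaluated with the fitted profile, and the re-extraction applies the weights of Eq. 3.7 to the surviving pixels only.

```python
import numpy as np

def reject_and_extract(D, P, V, f_prior, sigma_clip=5.0):
    """Mask pixels deviating from f_prior * P, then re-extract from the good pixels."""
    good = (D - f_prior * P) ** 2 / V < sigma_clip ** 2
    flux = np.sum(good * P * D / V) / np.sum(good * P**2 / V)
    return flux, good

# Hypothetical slit data at one wavelength: true flux 1000, noiseless for clarity.
P = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
D = 1000.0 * P
D[1] += 500.0                      # simulated cosmic ray hit on one pixel
V = 10.0 + np.abs(1000.0 * P)      # Eq. 3.8 with V0 = 10, Q = 1

f_prior = np.median(D / P)         # robust starting flux (an assumption of this sketch)
flux, good = reject_and_extract(D, P, V, f_prior)
print(flux, good)
```

Because the remaining weights are renormalized by the sum over good pixels only, the constraint of Eq. 3.6 still holds and the recovered flux stays unbiased, at the cost of a larger variance.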
3.3.4 Order Curvature
The technique described in the previous section works well for relatively straight
orders. If orders experience significant tilt and/or curvature, a given row of the
spectral image often contains only a small portion of the full order range. The
profile is then poorly constrained and the extraction is afflicted by noise.
Fig. 3.4 illustrates a schematic curved order image, including line tilt. Line tilt
will be discussed in § 3.3.5, and for present purposes we can retain the association
of λ with detector columns, as in Fig. 3.3.
One seemingly obvious solution to this problem is to straighten the order prior to
further reduction. The simple unit pixel shift method for tracing the order depicted
in Fig. 3.4 will clearly introduce discontinuities in the profile, but of course one
could also consider resampling the order data before applying the optimal extraction
algorithm. Since the order location can be readily determined, there is no technical
barrier to this method, and in general the resampling can be performed without
introducing a bias to the computed spectrum.
Despite the intuitive appeal of this technique, it cannot work in practice. Any
resampling necessarily correlates the noise in adjacent pixels of the resampled image.
The analysis of § 3.3.1 – § 3.3.3 (e.g., Eq. 3.4) made use of the implicit assumption
that the pixels contained within the slit image were uncorrelated, and therefore
resampling can defeat optimal extraction.
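The correlation introduced by resampling is easy to exhibit. As a minimal sketch, assume a half-pixel shift implemented by linear interpolation (a hypothetical resampler, chosen for simplicity) acting on pixels with uncorrelated, unit-variance noise; propagating the covariance through the resampling matrix shows that adjacent output pixels are no longer independent.

```python
import numpy as np

n = 6
# Each resampled pixel is the average of two neighbouring input pixels.
A = np.zeros((n - 1, n))
for i in range(n - 1):
    A[i, i] = 0.5
    A[i, i + 1] = 0.5

cov_in = np.eye(n)                 # uncorrelated input noise, unit variance
cov_out = A @ cov_in @ A.T         # covariance of the resampled pixels

corr_adjacent = cov_out[0, 1] / np.sqrt(cov_out[0, 0] * cov_out[1, 1])
print(corr_adjacent)
```

The off-diagonal terms of the output covariance are exactly what the variance expression in Eq. 3.4 leaves out.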
An additional complication reducing the effectiveness of this solution is resam-
pling noise (Allington-Smith et al., 1989). Whenever spatial features present in
a spectrogram are under-sampled (for instance, an unresolved stellar image), any
resampling introduces oscillatory noise along fixed rows in the resampled image.
This noise arises from the fundamental lack of high spatial order information for
the profile, and increases with the degree of under-sampling. Imagine for example
that all of the flux of an unresolved point source fell within a single pixel along
the slit. A resampling which shifts the spectral image by one-half pixel at some
wavelength must necessarily distribute the flux equally between two adjacent pixels,
even though the starlight may well have fallen near the edge of the original pixel.
Not enough spatial information was available to accurately reconstruct the profile,
and “ringing” will result.
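The single-pixel thought experiment can be written out directly. The resampler below is a hypothetical linear-interpolation half-pixel shift; it sees only the binned pixel values, so it splits the flux evenly between two pixels no matter where within the original pixel the starlight actually fell.

```python
import numpy as np

def shift_half_pixel(row):
    """Shift a row by +0.5 pixel using linear interpolation (zero-padded on the left)."""
    padded = np.concatenate(([0.0], row))
    return 0.5 * (padded[:-1] + padded[1:])

# An unresolved source whose flux falls entirely within pixel 2.
row = np.array([0.0, 0.0, 100.0, 0.0, 0.0])
shifted = shift_half_pixel(row)
print(shifted)
```

The total flux is conserved, but the sub-pixel position of the source, which was never recorded, cannot be recovered by any choice of interpolant.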
To avoid the ill effects of resampling, Marsh (1989) introduced another layer of
abstraction between the spatial profile fitting and the weight determination of § 3.3.3.
In Marsh’s reformulation, the polynomial profile fits of Horne (1986) are performed
not along rows or columns of the spectral image, but along a family of curves tracing
out the spectral order. These three dimensional curves (two dimensions tracing the
spatial position of the order, one dimension sampling the profile estimates) are
interpolated onto the straight detector grid to form a “virtually resampled” profile
image:
\[
P_{\lambda j} = \sum_{n=1}^{N} Q_{n \lambda j} G_{n \lambda} \tag{3.15}
\]
where Qnλj are the interpolation coefficients which specify the contribution of poly-
nomial n to pixel λj, and Gnλ are the polynomials evaluated along the orders which
sample intrinsic changes in the profile with wavelength (one for each of N positions
along the slit):
\[
G_{n \lambda} = \sum_{k=0}^{K} A_{kn} \lambda^k \tag{3.16}
\]
Usually N is chosen to be larger than the number of slit pixels (i.e. the spacing
between curves is made less than one pixel), in order to accurately recover the unbinned
profile. The polynomials Gnλ do not include any information about, and are not
sensitive to, order curvature. They sample only the smoothly changing profile evaluated
along lines parallel to the order, which has arbitrary form. The Gnλ would
be the exact analog of Horne’s row based profile fits (Eq. 3.12), had a hypotheti-
cal detector in which the placement of pixel rows varied smoothly to fall precisely
beneath the curved order been used. The interpolation coefficients Qnλj are fixed
by the chosen interpolation method, the number of curves, and the position of the
orders, and remain constant during the iterative extraction procedure. The polyno-
mial coefficients of the fitted profile in Eq. 3.15 are chosen by minimizing
\[
\chi^2 = \sum_{\lambda j} \frac{(\tilde{P}_{\lambda j} - P_{\lambda j})^2}{\sigma^2_{\tilde{P}}} \tag{3.17}
\]
where the P̃λj are the profile estimates as in Eq. 3.10, the variance of which can be
evaluated in a straightforward way using the individual pixel variances of Eq. 3.8.
This formulation can be used to fit the same noisy profile estimates (Eq. 3.10)
used in standard optimal extraction, and the extra step of interpolating the fitted
profile curves onto the data grid, rather than interpolating the data onto a more
convenient profile fitting grid, defeats the resampling noise. The procedure requires
solving N(K + 1) simultaneous equations, though K, the degree of the polynomials
used to fit the intrinsic profile variation, can usually be kept small, since most of
the variation due to changing order position is removed.
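Because Eqs. 3.15 and 3.16 are linear in the coefficients Akn, the minimization of Eq. 3.17 can be posed as a weighted linear least-squares problem in N(K + 1) unknowns. The following is a minimal sketch of that idea, not the Scorex implementation: the function names are hypothetical, linear interpolation between neighboring curves is assumed for the Qnλj, and the column index is rescaled to [−1, 1] purely for numerical conditioning.

```python
import numpy as np

def _powers(n_cols, degree):
    """lam**k for k = 0..degree, with lam rescaled to [-1, 1] (an assumption)."""
    lam = np.linspace(-1.0, 1.0, n_cols)
    return lam[None, :] ** np.arange(degree + 1)[:, None]   # shape (K+1, n_cols)

def linear_interp_coeffs(order_center, n_rows, n_curves, spacing):
    """Q[n, lam, j]: linear-interpolation weight of curve n at pixel (lam, j).

    order_center gives the row position of the order in each column; the
    curves are offset from it by multiples of `spacing` (< 1 pixel oversamples).
    """
    n_cols = order_center.size
    offsets = (np.arange(n_curves) - (n_curves - 1) / 2.0) * spacing
    rows = np.arange(n_rows)
    Q = np.zeros((n_curves, n_cols, n_rows))
    for lam in range(n_cols):
        curve_pos = order_center[lam] + offsets
        dist = np.abs(rows[None, :] - curve_pos[:, None]) / spacing
        Q[:, lam, :] = np.clip(1.0 - dist, 0.0, None)   # triangle kernel
    return Q

def evaluate_profile(A, Q):
    """Eqs. 3.15-3.16: P[lam, j] = sum_n Q[n, lam, j] * sum_k A[k, n] * lam**k."""
    G = np.einsum('kn,kl->nl', A, _powers(Q.shape[1], A.shape[0] - 1))
    return np.einsum('nlj,nl->lj', Q, G)

def fit_profile_polys(P_est, var, Q, degree):
    """Minimize Eq. 3.17 for A[k, n] by weighted linear least squares."""
    n_curves, n_cols, n_rows = Q.shape
    # design[(lam, j), (k, n)] = Q[n, lam, j] * lam**k
    design = np.einsum('nlj,kl->ljkn', Q, _powers(n_cols, degree))
    design = design.reshape(n_cols * n_rows, (degree + 1) * n_curves)
    w = 1.0 / np.sqrt(var.reshape(-1))
    A, *_ = np.linalg.lstsq(design * w[:, None], P_est.reshape(-1) * w, rcond=None)
    A = A.reshape(degree + 1, n_curves)
    return A, evaluate_profile(A, Q)
```

Since Q depends only on the order geometry and the interpolation scheme, it can be computed once and reused in each iteration of the extraction; only the least-squares solve is repeated.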
3.3.5 Line Tilt and Curvature
Often, spectral orders are not merely curved or tilted: the slit image itself is not
parallel to either detector axis. As illustrated in the schematic of Fig. 3.4, and emphasized
in the Score atmospheric spectrum in Fig. 3.2, this line tilt can vary from
order to order and within individual orders. In general, lines can also be significantly
curved.2 Resampling the order data to straighten these lines (perhaps at the
same time the order itself is straightened) introduces the same correlation and noise
issues discussed in the previous section.
What is needed is simply an analogous algorithm which considers the line shape,
the pixel variances, and the profile estimates to generate a set of weights at each
wavelength, wλij for all pixels in the image Dij. The coordinate λ is no longer
identified directly with the detector column i, as it has been. As illustrated in
Fig. 3.4, it becomes a discrete parameter which follows the order. In principle, the
spacing of the parameter λ along the order is arbitrary, but the overall resolution of
the instrument is not increased by line tilt. A tilt gradient across the order implies
that one end of the slit offers effectively higher resolution than the other, similar
to the optical effect of a wedge shaped slit, but when taken together they average
out to the original, untilted resolution. A λ spacing significantly smaller than the
column spacing (in spectra with slit image more nearly aligned to columns), though
harmless, is unwarranted. Similar to the virtual resampling procedure outlined
in § 3.3.4, in which the profile was recovered by slightly oversampling the family
of polynomials, we can simultaneously oversample in both the slit and dispersion
directions.
The two dimensional weights wλij form a family of surfaces (one at each pa-
rameterized wavelength) which are typically local, spanning at most a few detector
columns. Similarly, two dimensional profile surfaces (as opposed to the profile functions
of § 3.3.4) can be developed which locally determine the fraction of the light
at a single wavelength which fell into a given pixel. Both of these constructions
require accurate knowledge of the distortion of the slit image, which can be quantified
as the function which maps a straight slit, oriented along one of the detector
axes, to the measured slit shape (curved, tilted, or otherwise). In practice a spatially
homogeneous source with sharp spectral lines (astrophysical or in the lab) is
used to measure slit image distortions, which are taken as unchanging inputs to the
extraction process.
If the slit shape and orientation were fixed along the order, a single such mapping
function would suffice. For the general case of a non-constant slit distortion, a family
of such functions, Tλ(x, y) can be constructed from measurement, and formulated as
Tλij, i.e. defined on the physical pixel grid, and not the oversampled mesh which
overlays it.
2 This type of optical line distortion is intrinsic to the instrument, and should not be confused with the physical line shape changes which can result from spatial variations of a line’s central wavelength, as for, e.g., measurements of a galactic rotation curve.
The practical details of using a measured set of Tλij to develop profile surfaces
Pλij can be approached in a variety of ways, as long as care is taken to ensure each
pixel contributes only its included flux to the various profile surfaces at different
wavelengths which overlap there. The locality of Tλij and Pλij, which typically will
span at most three detector columns (bounded by the slit length), can be used to
simplify this calculation enormously.
The flux estimates analogous to those of Eq. 3.3 are no longer based on the
profile and data along just a single dimension (i.e. the slit direction). The individual
estimates are instead given by
\[ \tilde{f}_{\lambda ij} = \frac{D_{ij}\, T_{\lambda ij}}{P_{\lambda ij}} \tag{3.18} \]
At a given wavelength there are as many independent flux estimates as there are
non-zero entries in the slit curvature map, and a given pixel in the spectral image
will be used in constructing more than one flux estimate. To ensure no flux biasing,
we require
\[ \sum_{\lambda} T_{\lambda ij} = 1 \tag{3.19} \]
for all (i, j), preventing pixels from over-contributing to the final flux.
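A small numerical sketch may help fix the roles of Eqs. 3.18 and 3.19. The function names and array layout (`T[lam, i, j]`, `P[lam, i, j]`) are illustrative assumptions rather than Scorex conventions. The closing check uses a deliberately simple toy case, a flat spectrum with the tilt map taken proportional to the profile surfaces, in which every individual estimate should return the input flux.

```python
import numpy as np

def normalize_tilt_map(T):
    """Rescale T[lam, i, j] so every covered pixel satisfies sum_lam T = 1 (Eq. 3.19).

    Pixels not covered at any wavelength are left at zero.
    """
    T = np.asarray(T, dtype=float)
    total = T.sum(axis=0, keepdims=True)
    return np.divide(T, total, out=np.zeros_like(T), where=total > 0)

def flux_estimates(D, T, P):
    """Eq. 3.18: f[lam, i, j] = D[i, j] * T[lam, i, j] / P[lam, i, j].

    Entries where the profile surface vanishes carry no estimate and are NaN.
    """
    f = np.full(T.shape, np.nan)
    ok = P > 0
    f[ok] = (D[None, :, :] * T)[ok] / P[ok]
    return f

# Toy case (assumption): flat spectrum of flux 7, tilt map proportional to P.
rng = np.random.default_rng(1)
P = rng.uniform(0.1, 1.0, size=(3, 4, 5))   # three overlapping profile surfaces
T = normalize_tilt_map(P)
D = 7.0 * P.sum(axis=0)                      # each pixel sums light from all three
f = flux_estimates(D, T, P)
```

Because each pixel's weight is split across the wavelengths that overlap it, no pixel is counted more than once when the individual estimates are later combined.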
In practice, using the tilt map Tλij to move between the two dimensional family-
of-surfaces formulation, and the one-dimensional representation of that family at a
given position along the slit allows the machinery of profile determination from the
previous section to be employed, so long as the various contributing components at
a single fit position are kept track of.
3.4 Scorex
The unusual properties of Score mid-infrared spectral data (see Chapter 2) mo-
tivated the development of a specifically tailored reduction package. Unlike the
optical CCD spectra for which the majority of cross-dispersed spectral reduction
techniques were designed, these data are characterized by very short slit length
(4–5 pixels), non-negligible order cross-talk (overlap) at shorter wavelengths, and
extreme background-dominated noise characteristics. Sky data are recorded at the
extremities of the slit in optical spectrography, whereas the extremely high back-
ground from the sky dominates Score data, and must be removed dynamically by
rapid chopping on and off the source. The strong sky signal is also the dominant
noise contributor in these data, and is in general far larger in magnitude than either
read noise or source photon noise — the dominant noise components of optical spec-
tra. Score’s relatively short slit and modest detector size, coupled with the larger
slit diffraction pattern at these wavelengths, produce more order cross-talk than in
optical spectrographs. The application of optimal extraction techniques to Score
data therefore required several modifications to account for these major differences
in the character of the recorded spectrum. The result is Scorex — a set of IDL3
routines which implement reduction and extraction of Score data.
3.4.1 Calibration
Calibrating Score data involves measuring the positions of the orders, assigning
wavelengths to positions within each order and aligning the wavelengths between
orders, measuring line tilt, and developing an efficiency map. Calibration need be
performed only after new instrumental parameters are encountered. All calibration
steps are performed using the results of the prior steps. Wavelength stability to
better than one quarter pixel was observed, though alignment differences between
opening and re-mounting the instrument did contribute to overall shifts of the order
positions on the detector by up to one pixel.
Individual calibration data are saved together as scoresets – standard templates
which contain all information necessary to reduce a given observation. To reduce
a given dataset, an individual scoreset is used to load all the relevant calibration
parameters.
Order Positioning
The initial step of calibration involves shifting, stretching, and rotating a conven-
tional ray-trace map to accurately overlay the spectrum. The ray-trace defines the
theoretical expectation of the location on the detector array of each wavelength in all
orders, and matches the realized echellogram reasonably well. To quantify the devi-
ation of order positions from this expectation, a test spectrum of a bright blackbody
source filling the slit is used. Three parameters are allowed to vary within the fit: an
overall shift of the order pattern in the cross-disperser direction, a linear stretch of
the spacing between orders, and a rotation of the entire order pattern on the array.
An initial estimate of each parameter is made automatically by examining averaged
column slices for extrema. The three parameters of the fit are then varied with a
conjugate gradient technique to maximize a merit function which gives weight both
to the total flux underlying the map (proportional to the integrated efficiency of
the underlying spectral region), and to the total number of pixels it contains. This
technique allows accurate and reproducible mapping of the differential curvature of
the various orders, expressed as slight deviations from the order positioning of the
theoretical ray-trace.
3 IDL, The Interactive Data Language, is a registered trademark of Research Systems, Inc. (now Kodak).
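The three-parameter transformation can be illustrated with a short sketch. This is an assumed formulation rather than the Scorex code: the function names are hypothetical, the stretch and rotation are taken about an arbitrary array center, and the merit function below counts only the flux under the map, omitting the pixel-count term described above.

```python
import numpy as np

def transform_order_map(x, y, shift, stretch, angle, center):
    """Shift, stretch, and rotate ray-trace positions (x: dispersion, y: cross-dispersion)."""
    cx, cy = center
    # Stretch the order spacing about the center, then shift across the orders.
    ys = cy + stretch * (y - cy) + shift
    # Rotate the entire pattern about the center.
    c, s = np.cos(angle), np.sin(angle)
    xr = cx + c * (x - cx) - s * (ys - cy)
    yr = cy + s * (x - cx) + c * (ys - cy)
    return xr, yr

def merit(image, x, y, params, center):
    """Total flux in the pixels under the transformed map (simplified merit function)."""
    xr, yr = transform_order_map(x, y, *params, center)
    i = np.rint(yr).astype(int)
    j = np.rint(xr).astype(int)
    inside = (i >= 0) & (i < image.shape[0]) & (j >= 0) & (j < image.shape[1])
    return image[i[inside], j[inside]].sum()

# Toy check: a bright order on row 5, predicted by the ray-trace at row 3.
image = np.zeros((10, 20))
image[5, :] = 1.0
x = np.arange(20, dtype=float)
y = np.full(20, 3.0)
center = (9.5, 4.5)
merit_unshifted = merit(image, x, y, (0.0, 1.0, 0.0), center)
merit_shifted = merit(image, x, y, (2.0, 1.0, 0.0), center)
```

In a full fit an optimizer would vary (shift, stretch, angle) to maximize the merit; here the two evaluations simply show that the correct two-pixel shift captures the order's flux while the unshifted map does not.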
Wavelength Calibration
Wavelength calibration is performed similarly to the order positioning, using the ray-
trace wavelength predictions, transformed along with the physical order positions
in the previous step, as a starting point. The wavelength calibration dataset used is
either an NH3 absorption spectrum (Fig. 3.5a), obtained in the lab from a ∼360K
blackbody source viewed through ammonia vapor, or a sky emission spectrum,
as shown in Fig. 3.2, created from the raw, undifferenced sky frames of observed
sources. Both have many unresolved lines available, though sky emission data are
used when possible, since they offer more complete line coverage, most notably
at the longest wavelengths observed. The comparison reference spectra used were
either FTIR NH3 spectra from the EPA’s Emission Measurement Center, convolved
with the instrument profile to a resolution R ∼ 800, or atmospheric transmission
models created at similar resolution with the atmospheric modeling code ATRAN (Lord,
1992). Each order of the calibration spectra is individually reduced and de-tilted
by a local rotation-based interpolation.4 Each order is then extracted by simply
summing along the slit direction.
Since Score spectral orders substantially overlap each other in wavelength, with
most wavelengths present in two orders, and some wavelengths present in three, the
first step of wavelength calibration involves aligning the individual orders to each
other.
This is achieved by shifting the predicted wavelength on each order with re-
spect to its longer wavelength neighbor by a constant amount, determined by fitted
measurements of line centers of the same spectral line appearing in adjacent orders.
4 Since line tilt measurement occurs after wavelength calibration, the tilt fit from a prior calibration is used initially, to bootstrap up to an accurate measurement.
Figure 3.5: a) An inverted ammonia absorption spectrum obtained in the lab as the difference between a 350K blackbody source with and without NH3 vapors interposed. b) The blackbody spectrum used in the subtraction, and also to create a map of combined optical and electronic throughput efficiency.
After the orders have been aligned, the overall wavelength calibration is undertaken.
A variety of NH3 or atmospheric lines are fit, along with the equivalent
lines from the reference spectra. Line blends and otherwise asymmetrical lines are
avoided. The deviation between the expected and measured line center wavelengths
is fit with a straight line (Fig. 3.6b). The fractional deviation from the prediction
was approximately δλ/λ ∼ 0.005 for all calibrations.
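The straight-line correction can be sketched in a few lines. The function names are hypothetical, and the sign convention (δλ defined as measured minus reference, then subtracted from the predicted wavelengths) is an assumption made for this illustration.

```python
import numpy as np

def fit_wavelength_correction(measured, reference):
    """Fit delta = measured - reference as a straight line in measured wavelength."""
    measured = np.asarray(measured, dtype=float)
    delta = measured - np.asarray(reference, dtype=float)
    slope, intercept = np.polyfit(measured, delta, 1)
    return slope, intercept

def apply_wavelength_correction(wavelength, slope, intercept):
    """Remove the fitted linear deviation from predicted wavelengths."""
    wavelength = np.asarray(wavelength, dtype=float)
    return wavelength - (slope * wavelength + intercept)

# Synthetic line list: predicted centers are stretched and offset from the reference.
reference = np.linspace(8.0, 13.0, 12)      # reference line centers (microns)
measured = 1.002 * reference - 0.01          # line centers on the predicted scale
slope, intercept = fit_wavelength_correction(measured, reference)
corrected = apply_wavelength_correction(measured, slope, intercept)
```

With only two free parameters the fit is well constrained by a handful of clean, unblended lines, which is consistent with excluding blends and asymmetric lines from the line list.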
Line Tilt
The line tilt evident in Fig. 3.2 must be removed to avoid resolution
degradation at long wavelengths. Using the results of the order and wavelength calibration,
individual line tilts are measured by fitting centroid positions along each
individual row in which some portion of the line appears. Using four such measurements,
an angle of tilt from vertical is derived, as shown in Fig. 3.6a. This line tilt is
fit as a function of wavelength. Although not motivated by optical considerations,
which might instead specify fitting a center of deviation with spectral lines falling
tangent to circles centered on the fitted point, the magnitude of the errors in the
measured tilt angle implies that a straightforward wavelength fit is as accurate as a physically
more realistic, but more complex, technique. This method is also motivated by
the incomplete explanation of the line tilt as a purely instrumental optical effect.5
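As an illustration of the measurement, the tilt angle of one line follows from a straight-line fit to its row centroids, and the set of angles is then fit against wavelength. This is a sketch under assumed conventions (hypothetical function names, angles in degrees measured from the column direction, a simple polynomial in wavelength), not the Scorex routine.

```python
import numpy as np

def tilt_angle_from_centroids(rows, centroids):
    """Tilt angle from vertical (degrees) from line centroids measured in each row.

    rows are detector row indices; centroids are the fitted column positions
    of the line center in those rows.
    """
    slope, _ = np.polyfit(rows, centroids, 1)   # columns drifted per row
    return np.degrees(np.arctan(slope))

def fit_tilt_vs_wavelength(wavelengths, angles, degree=1):
    """Polynomial coefficients (highest power first) of tilt angle versus wavelength."""
    return np.polyfit(wavelengths, angles, degree)

# One line seen in four rows, drifting 0.2 columns per row.
rows = np.arange(4)
angle = tilt_angle_from_centroids(rows, 10.0 + 0.2 * rows)

# Angles measured for several lines, fit linearly in wavelength.
wl = np.linspace(8.0, 13.0, 6)
coeffs = fit_tilt_vs_wavelength(wl, 2.0 + 1.5 * wl)
tilt_at_10um = np.polyval(coeffs, 10.0)
```

The `degree` argument allows the linear and quadratic fits to be compared on the same measurements.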
Efficiency Maps
Creating flat fields for use in reduction of infrared spectra is a difficult proposition.
Finding a spectrally flat source at thermal wavelengths is alone challenging, but
H2O and CO2 absorption compound the problem. Often in near infrared observa-
tions, the sky itself is utilized as a blackbody source for flat-fielding, constructed by
median-filtering a stack of exposures, to remove the source contribution. This tech-
nique, however, relies on the atmosphere serving as a relatively stable background,
which is not often the case at longer wavelengths. Because most components within
the instrument radiate so efficiently in the MIR, the removal of scattered light is of
paramount importance. A non-negligible contribution to the Score background results
from the reflection of the main instrument beam off the CdTe entrance window,
producing radiation characterized by a poorly quantified temperature – literally, the
result of the instrument looking back at itself and the interior of the cryogenic dewar,
which is not of a uniform temperature. A technique was therefore developed to
eliminate this and other constant contributions to the scattered background. The
spectra of both hot (∼350K, see Fig. 3.5b) and room temperature blackbody sources
filling the beam were recorded prior to each mounting of the instrument on the telescope.
The stacked difference between five or more pairs of these spectral images
produced a frame free from poorly characterized background and scattered light.
By dividing such a frame by the predicted relative fluxes of the two blackbodies, an
efficiency map, Eλj, is created, which characterizes the optical efficiencies (grating
blaze efficiency, slit throughput, filter responses, detector substrate/filter banding,
etc.) combined with the detector response and electronic readout efficiency. Though
not strictly a flat field, Eλj is very useful for normalizing and combining orders.
5 Though curved or tilted slits are a well-studied consequence of cross-dispersed and other high angular dispersion spectrographs (see, e.g., Schroeder, 1987, p. 261), and tilts of this magnitude are not difficult to obtain, the change in line tilt over this wavelength range is incompatible with a straightforward optical explanation.
Figure 3.6: a) An example line tilt calibration, with linear and quadratic fits overlaid. The angle θtilt quantifies the line tilt with respect to the detector columns, as a function of measured wavelength. b) The wavelength calibration, constructed by fitting the deviation of the measured from the reference wavelengths (δλ), as a function of the measured wavelength of the atmospheric calibration line. Notice the total difference over the full wavelength range and all orders is ∼ .01µm, or just under one pixel at Score’s resolution and sampling.
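The construction of the efficiency map from blackbody pairs can be sketched as follows. The names are hypothetical, and several details are assumptions: the "predicted relative fluxes" are taken to be the difference of two Planck functions evaluated on a per-pixel wavelength map, the stack is a plain mean of the pair differences, and the map is normalized to its peak.

```python
import numpy as np

H = 6.62607015e-34    # Planck constant [J s]
C = 2.99792458e8      # speed of light [m/s]
KB = 1.380649e-23     # Boltzmann constant [J/K]

def planck_lambda(wavelength_um, temperature):
    """Blackbody spectral radiance B_lambda(T) for wavelengths in microns."""
    lam = np.asarray(wavelength_um, dtype=float) * 1e-6
    return 2.0 * H * C**2 / lam**5 / np.expm1(H * C / (lam * KB * temperature))

def efficiency_map(hot_frames, cold_frames, wavelength_um, t_hot, t_cold):
    """Relative efficiency map from stacked hot-minus-cold blackbody frame pairs.

    Any additive background common to both frames of a pair cancels in the
    difference; dividing by the expected blackbody difference leaves the
    combined optical and electronic efficiency, normalized here to its peak.
    """
    diff = np.mean(np.asarray(hot_frames) - np.asarray(cold_frames), axis=0)
    expected = planck_lambda(wavelength_um, t_hot) - planck_lambda(wavelength_um, t_cold)
    eff = diff / expected
    return eff / np.nanmax(eff)

# Synthetic check: known efficiency plus a constant scattered background.
rng = np.random.default_rng(2)
wl = np.linspace(8.0, 13.0, 24).reshape(4, 6)      # per-pixel wavelength map
e_true = rng.uniform(0.2, 1.0, size=wl.shape)
background = 5.0e6
hot = [e_true * planck_lambda(wl, 350.0) + background for _ in range(5)]
cold = [e_true * planck_lambda(wl, 295.0) + background for _ in range(5)]
e_map = efficiency_map(hot, cold, wl, 350.0, 295.0)
```

The differencing step is what removes the poorly characterized scattered component: it never needs to be modeled, only to be the same in both frames of a pair.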
3.4.2 Noise Model
Each individual spectral image which is reduced by Scorex is of the form (array
subscripts omitted)
\[ D = \frac{\sum_{k=1}^{2N_{chop}} s_k a_k}{\sum_{k=1}^{N_{eff}} E_k} \tag{3.20} \]
where ak are the individual chop frames with one frame for each chop+nod position
of an observation, and each sk is the sign of summation required to difference ad-
jacent chop frames, and remove the background gradient to first order (for typical
beam-switched observation in which successive nods share one central chop position,
sk = [+,−,−,+,−,+,+,−, . . .]).
A critical component of the noise model is the factor which converts between
recorded digital units and photoelectrons. This factor, Q, depends on the individual
pixel capacitance, the voltage range and digital output of the analog-digital convert-
ers, and any electronic offsets involved. Table 3.1 lists the contributing elements and
an explicit calculation of this factor for the most typical Score parameters.
The error due to shot noise in photon arrival, or, equivalently, photoelectron
production, can be calculated for any quantity A in digital units, by noting it implies
the detection of QA photoelectrons. This quantity has expected variance QA (a
characteristic of the underlying Poisson distribution), or a noise estimate of √(QA)
photoelectrons, which is √(A/Q) digital units. In practice, to accommodate noise in the
readout electronics, we write the error as a combination of shot noise and electronic
noise:
\[ \sigma_A^2 = \sigma_{A_{ph}}^2 + \sigma_{A_{elec}}^2 = \frac{A - A_e}{Q} + \frac{\sigma_e^2}{Q^2} \tag{3.21} \]
where Ae is any background source offset analogous to dark current, and σ2e is the
electronic noise estimate for a given observation, expressed in electrons.

Table 3.1: Digital Unit to Photoelectron Conversion Factor.
Quantity   Description                            Value
C          BIB Well Capacitance                   1.33 pF = 8302.1 e−/mV
∆V         Input A/D Voltage Range                20 V
NA/D       A/D resolution elements                2^14
U          Bit shift right in hardware coadder    4
G          Pre-amplifier gain (typical)           40
Q          photo-e−/ADU factor                    C ∆V / (U G NA/D) = 63.34
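As a worked check of these quantities, the short sketch below evaluates the conversion factor from the Table 3.1 entries and the per-pixel variance of Eq. 3.21. The function names are hypothetical. The grouping Q = C ∆V / (U G NA/D) is inferred from the table layout; it is the combination of the tabulated values that reproduces the quoted 63.34.

```python
import numpy as np

def conversion_factor(c_e_per_mv=8302.1, dv_volts=20.0, n_adc=2**14, u=4, g=40):
    """Photoelectrons per digital unit, Q = C * dV / (U * G * N_A/D).

    Defaults are the Table 3.1 values; dV is converted from V to mV to
    match the capacitance units of e-/mV.
    """
    return c_e_per_mv * dv_volts * 1.0e3 / (u * g * n_adc)

def variance_digital_units(a, a_offset, sigma_e, q):
    """Eq. 3.21: shot-noise plus electronic-noise variance, in digital units squared.

    a and a_offset are in digital units; sigma_e is in electrons.
    """
    return (np.asarray(a, dtype=float) - a_offset) / q + sigma_e**2 / q**2

q_typical = conversion_factor()
example_variance = variance_digital_units(1050.0, 50.0, sigma_e=100.0, q=50.0)
```

In the example, a signal of 1000 digital units above offset at Q = 50 contributes a shot-noise variance of 20, and 100 electrons of electronic noise add 4, for a total of 24 digital units squared.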
Using Eqs. 3.20 and 3.21, the error estimate for the spectral image D (dropping
indices) is
\[ \sigma_D^2 = \frac{1}{E^2} \left[ \sum_{k=1}^{2N_{chop}} \frac{a_k - e_k}{Q} + \frac{2N_{chop}\, \sigma_e^2}{Q^2} + D^2 \sigma_E^2 \right] \tag{3.22} \]
where E is the total efficiency map (the sum of the Ek over k = 1, ..., Neff) constructed
from blackbody pairs, as described at the end of § 3.4.1, and the quantity ek is the
analog of dark current, an electronic offset which depends on chop frequency and bandwidth, and in
which the electronic noise, σ2e, is formed. The first term in Eq. 3.22, including the
undifferenced total of all chop frames, corrected for background offset, represents
the large contribution of the sky to the noi