THE WOLF–RAYET PHENOMENON IN THE

INFRARED: MASSIVE STARS PROBING STELLAR

FORMATION

A Dissertation

Presented to the Faculty of the Graduate School

of Cornell University

in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

by

John-David Thomas Smith

May 2001

© John-David Thomas Smith 2001

ALL RIGHTS RESERVED

THE WOLF–RAYET PHENOMENON IN THE INFRARED: MASSIVE STARS

PROBING STELLAR FORMATION

John-David Thomas Smith, Ph.D.

Cornell University 2001

We present 8–13µm spectra at resolution R ∼ 600 of 29 northern Galactic Wolf–Rayet

stars, covering a broad range of spectral subtypes, including 14 WC, 13 WN,

1 WN/WC, and an additional reclassified WN star. Most constitute the first ever

reported mid-infrared spectrum. Lines of He i and He ii, accompanied in some stars

by [Ne ii] and [S iv], are strongly present in 22 of the sources observed, while 6

of the sources exhibit the powerful emission of heated circumstellar carbon dust.

Correspondence with optically determined subtypes is found to be incomplete, with

significant deviations for later types seen in both WN and WC. For a single WC

star, WR121, neon abundance is estimated from [Ne ii] emission, and found to be

∼7× the cosmic value, as predicted by long-standing but contested evolved core

nucleosynthesis calculations.

The observed line parameters are used in a population synthesis model to quantify

the contribution of WR stars to the total integrated emission of a single $10^{6}\,M_\odot$

stellar population formed in an instantaneous burst, and including a $T \le 70$ K dust

emission component with $L_{\mathrm{ir}}/L_{\mathrm{blue}} = 0.1$. After ∼5 Myr, the infrared emission

lines which form in the WR winds achieve line luminosities and equivalent widths

in the aggregate starburst spectrum similar to those of the commonly used optical λ4686

“WR Bump”, when the stars are embedded in dust providing $A_V \ge 5$ magnitudes

of screen extinction. We address the possibility of direct detection of these infrared

wind lines, which may be possible in star-forming galaxies if a significant popula-

tion of recently formed stars is hidden by dust, and the starburst is observed against

a relatively low-level stellar background. A mid-infrared spectrum of Wolf–Rayet

Galaxy NGC5253 is presented, but hot dust overwhelms these line features. We estimate

that $5 \times 10^{5}$ WR stars would be required for direct detection of mid-infrared

wind lines in this nearby dwarf galaxy, with substantially fewer required for strong

near-infrared lines.

Observations were made with Score, a unique mid-infrared spectrograph built

as a prototype of Sirtf’s short-wavelength high-resolution spectrograph module.

Score achieves spectral sensitivity similar to the Infrared Space Observatory. Im-

portant details of the instrument are presented, along with new techniques developed

for the extraction of Score spectral data.

Biographical Sketch

On an auspicious Monday, September 24, 1973, John-David Thomas Smith was

born to young parents Thomas and Teressa, sharing his birthday with such notables

as Chief Justice John C. Marshall, author F. Scott Fitzgerald, and Jim Henson,

of Kermit the Frog fame. After a short stint traveling with his family in a green

converted school bus, following the orchard harvest from Southeast to Northwest and

living the care-free lifestyle of hippydom, he traded the jaunty caprice of migrant

fruit picking for a practical and rooted trailer-in-a-field existence. Continuing to

climb up the complex Southern social ladder in home town Owensboro, Kentucky,1

he and family, which had since become six, soon moved into a cabin and finally a

modest home in the lush countryside of the Miller Lakes Resort Park, operated by

his grandparents primarily, it seemed to him, for his enjoyment. Long summer days

spent in the lake, building stick forts in the woods, or raising fishing worms to sell to

deep-pocketed RV campers left him generally happy, but unprepared for the harsh

reality of West Louisville Elementary School, with its mandatory square-dancing

classes, and jelly-bean field days.

Acutely aware of the ever-waning attention span of modern Americans, he soon

adopted the foreshortened sobriquet “JD”, which he carries to this day, except when

addressed by his grandparents.

Despite various experiments of differing degrees of success (and punishment) in

the chemistry of flammability which he conducted before the tardy bell in class-

room trash cans throughout middle and early high school, JD continued to advance

through the peerless American educational system at a dizzying pace. Long camp-

ing trips in the South and Southwest reaffirmed his love of the natural world, which,

when coupled with the realization that his job at the Windy Hollow Drag Racing

1 Barbecue capital of the world.

Track offered little growth potential, seeded his first thoughts of becoming a scien-

tist, which he mistakenly thought implied some special lifestyle cachet. His interest

in the physical sciences was furthered by a high-school chemistry teacher who taught

him never to get acid “near your eye or your groin,” and a physics teacher who gave

him a new approach to problem-solving: “Whaddya gonna do, coach? Drop back

ten yards and punt.” The latter was an especially effective technique when applied

to calculations of the “momanumanertia.” His Latin teacher helped spawn a lifelong

love of literature, and reaffirm the wisdom of ipsa scientia potestas est, not to men-

tion semper ubi sub ubi. In the Spring of 1991, he graduated from that esteemed

center of erudition, Apollo High School, and spent the summer studying aquatic

lake life in situ, and jumping on trains with other indiscriminately safety-conscious

friends.

Armed with the brazen optimism of youth, and an admission letter to the Mas-

sachusetts Institute of Technology, JD headed north to Boston that Fall, after hav-

ing been duly warned by his grandfather of loose Yankee morals and politics. After

some adjustment to city life and the mysterious disappearance and reappearance

of r’s from the native dialect, he jump-started his budding Physics career by refin-

ing hydrous-balloon ballistic ranging techniques from his Back Bay roof deck. The

wealth of opportunities for the curious mind at MIT was sufficient to outweigh its

brutal spirit-crushing machinery, in the steely grip of which many strong friendships

were forged. A summer job at an engineering firm cemented his distaste

for the overly practical, and various other research projects at MIT confirmed that

astrophysics was indeed the most interesting sub-branch of the field, as he had first

intuitively suspected while sleeping under the stars as an eight-year-old.

After graduating from MIT in 1995, and yearning for a return to his rural (feral?)

childhood, JD headed to Cornell University, centrally isolated in scenic upstate New

York. There he reconnected with his youthful self by roving through trails and

streambeds once again. After some thought, and a coin-toss (which mandate he

disobeyed), he joined the first-rate infrared group in the Department of Astronomy,

where he would spend many productive days rummaging through old drawers of

wires and components last shelved several decades earlier, until destiny had brought

them together. Almost six years later he emerged like a spotted wood moth from

the chrysalis of graduate education, revitalized and ready to chew with purpose and

vigor through the pulpy timbers of science.

JD heads next for the desert climes of Tucson, Arizona, with fingers crossed in

anticipation of the launch of the spacecraft which will put food on his table for the

next several years. But first he will be married, and revel with his bride in one last

warm and fine Ithaca summer.

For Sara,

my fidus Achates.

Acknowledgements

Truth persuades by teaching, but does not teach by persuading.

—Quintus Septimius Tertullianus

I have been privileged to have been taught many things by many remarkable teach-

ers, not all of whom would consider that their foremost profession. I am pleased

to acknowledge the lasting impact they have had on my graduate career, and the

production of this thesis. First and foremost, my advisor, Jim Houck, always im-

pressed me with his uncanny ability to reduce month-long projects into ten words

or less, which accurately summed up a) what I had done, b) what I hadn’t done

that I thought I did, and c) what I really should have done. Occasionally all of these

analyses could be further compressed into one poignant pair: “What’s happening?”

— a query of incomparable motivational power. I have always tried to emulate his

startling and consistent faculty for simultaneously seeing both the forest and the

trees.

I thank my other advising committee members, Gordon Stacey, Jim Cordes, and

David Chernoff, for their close reading and excellent suggestions. I am particularly

indebted to Gordon for the inexhaustible patience and good humor which he brought

to all our discussions, science and otherwise.

I am very grateful to the Score team and contributors, including John Wilson,

Stephen Rinehart, Mike Colonno, Chuck Henderson, and especially Jeff Van Cleve,

without whom the instrument would likely have been shaken to death by Sirtf

contractors, instead of being used for science (and the good of humanity). Others

who contributed substantially to Score, and to most of the projects in the infrared

group, were George Gull, Bruce Pirger, and Justin Schoenwald.

The Palomar staff was exceptionally responsive and helpful. I thank in particular

telescope operators Karl Dunscombe and Rick Burruss, who always kept the mood

light and humor good, even at the bitter end of ten-day observing runs. Mike

Doyle, John Henning, Dave Tennent, and the rest of the Palomar crew deserve

special commendation for their expertise and patience.

For their helpful discussion and assistance with modifying Starburst99, I

thank Claus Leitherer, and especially Daniel Devost, with whom I learned more

about this code than I probably should have. Pat Morris and Roberta Humphreys

offered very illuminating discussion on the properties of hot stars and spectral di-

agnostics. Vassilios Charmandaris was always available with interesting anecdotes

on the lives of galaxies, and offered encouragement in the darkest hours of thesis

preparation, assuring me on multiple occasions that “it does add up.” I also thank

Matt Bradford for having finished before me, so that I could profit from his triumphs

and learn from his mistakes.

The support of friends and family is of course a vital ingredient in any graduate

career, and I have been fortunate to have had both. I thank my parents, who never

told me I couldn’t, and my brother and sisters, for showing me how much more

there is to life. Since there are far too many to list, I thank all my friends who have

so generously given of themselves, and reminded me that “life is too important to

take seriously.”

Special thanks are due Sara Ann Lederman, who has been my beacon and my

hope for the past eight years, and without whom I would know very little about life,

loyalty, love, and true happiness. May I always be so fortunate.

Table of Contents

1 Introduction
    1.1 Techniques
    1.2 Organization

2 The SIRTF Cornell Echelle Spectrograph, SCORE
    2.1 Introduction
    2.2 Hardware
        2.2.1 Two Paths Diverge
        2.2.2 Electronics
    2.3 Performance
    2.4 Conclusions

3 Spectral Extraction
    3.1 Introduction
    3.2 Issues of Cross Dispersion
    3.3 Optimal Extraction
        3.3.1 Background
        3.3.2 Profile and Variance
        3.3.3 Algorithm
        3.3.4 Order Curvature
        3.3.5 Line Tilt and Curvature
    3.4 Scorex
        3.4.1 Calibration
        3.4.2 Noise Model
        3.4.3 Extraction
        3.4.4 Order Overlap
        3.4.5 Observing Efficiency
        3.4.6 Flux Calibration
        3.4.7 Flux Renormalization
    3.5 Conclusions

4 Wolf-Rayet Stars
    4.1 Introduction
    4.2 Physical Mechanisms
    4.3 Mid-Infrared Spectroscopy: Background
    4.4 Observations
    4.5 Data Analysis
        4.5.1 Reddening
    4.6 Observed Spectra and Results
        4.6.1 The WN Stars
        4.6.2 The WC Stars
        4.6.3 Line Data
    4.7 Terminal Velocities
    4.8 Mass Loss Rate
        4.8.1 Theory
        4.8.2 Observations and Results
    4.9 Neon Abundance
        4.9.1 Background
        4.9.2 Abundance Calculation
        4.9.3 Observations and Inputs
        4.9.4 Results and Discussion
    4.10 Neon/Sulfur Abundance
    4.11 Dust
    4.12 Conclusions

5 The Wolf-Rayet Phenomenon in Star-Formation
    5.1 Motivation
    5.2 Introduction and Background
    5.3 Starburst Model
        5.3.1 STARBURST99
        5.3.2 Infrared Activity
        5.3.3 Defining the FIR flux
        5.3.4 Model
        5.3.5 Results
        5.3.6 Extinction
    5.4 Infrared Detectability
        5.4.1 Continuum Dilution
        5.4.2 Line Strength
        5.4.3 Background Continuum Sources
        5.4.4 Discussion
    5.5 NGC 5253
        5.5.1 Observations
        5.5.2 Spectrum
        5.5.3 Discussion
    5.6 Required Number of WR Stars
    5.7 Conclusions

6 Conclusions and Future Direction
    6.1 Wolf-Rayet Stars
    6.2 Wolf-Rayet Galaxies

A Derivation of the Optimal Extraction Weights

List of Tables

3.1 Digital Unit to Photoelectron Conversion Factor

4.1 Source observations and parameters
4.2 Undetected Sources with Upper Flux Limits
4.3 Object Counts in Survey by Subtype
4.4 Observed Line Data, by Line
4.5 He i and He ii Line Blend Constituents
4.6 Terminal Wind Velocity from [S iv]
4.7 Mass Loss Rate from Extrapolated Continuum
4.8 Atomic Data for Neon and Sulfur Lines
4.9 Example Mass-Loss Rates from Various Methods
4.10 Ne$^{+}$ and S$^{3+}$ Abundances

5.1 Starburst99 Model Inputs
5.2 Maximum WR Flux Contribution at Selected Wavelengths
5.3 WR Line Ratios at Maximum Contribution

List of Figures

2.1 Optical Diagram
2.2 Production Optical Schematic
2.3 NGC 7027 Spectrum and Slit View
2.4 Electronics Schematic
2.5 Sensitivity Comparisons

3.1 Curved Order Schematic
3.2 Unreduced Atmospheric Spectrum
3.3 Straight Array Order Schematic
3.4 Curved Array Order Schematic with Line Tilt
3.5 Laboratory Ammonia and Blackbody Spectra
3.6 Score Wavelength and Line Tilt Calibration
3.7 Score Electronic Noise Variance

4.1 Upper Main Sequence Luminosity Limit
4.2 WR Star Lifetime as a Function of Mass and Metallicity
4.3 The WN9–WN8 spectra
4.4 The WN8–WN7 spectra
4.5 The WN6 spectra
4.6 The WN5 spectra
4.7 The WN5–WN4 spectra
4.8 The WC9 spectra
4.9 The WC8–WC7 spectra
4.10 The WC7–WC6 spectra
4.11 The WC5–WC4 spectra
4.12 The NaSt1 Spectrum
4.13 He ii 9.7µm to He i+He ii 11.3µm Line Ratios
4.14 He ii 9.7µm:He i+He ii 11.3µm:He i+He ii 12.36 Line Ratios
4.15 Wind Model Geometry

5.1 Spectral Energy Distribution of Galaxies
5.2 Optical WR Galaxy diagnostic
5.3 Model Spectra Energy Distribution
5.4 WR and O Star Count Evolution
5.5 WR Contribution to Total Spectral Energy
5.6 Extincted Model Spectral Energy Distribution
5.7 NGC 5253 Score Spectrum

We are all in the gutter, but some of us are looking at the stars.

Oscar Wilde

Chapter 1

Introduction

The search for the true history of star formation in the universe is not unlike any

other historical analysis: plagued by incomplete and inaccurate information, con-

flicting accounts, impassioned but opposing viewpoints — the same ambiguities and

inconsistencies that confront historians seeking fundamental understanding of any

event which has been clouded over by the inexorable passage of time.

Uncovering the accurate history of stellar creation, from the first light of primeval

galaxies to the familiar glow of the local neighborhood, would have immediate and

sweeping impact on the theories and frameworks of cosmology, stellar evolution,

galaxy and cluster formation — bridging the substantial gap in knowledge and

time between the creation of the primordial elements, and the complex and highly

structured processes which shape the present day universe. Yet despite intensive

research effort and rapid recent progress, this history remains poorly understood.

A variety of attempts have been made to quantify the total amount of star

formation that occurred at early epochs (e.g. Madau et al., 1996). Some accounts

indicate that the formation rate went through a sharp peak quite recently, near

redshift z ∼ 1, and then fell dramatically to the current, relatively quiescent levels.

These efforts to formulate a consistent history, however, have been complicated by

a poor understanding of the distribution and evolution of dust in the most luminous

early stellar environments. The same dust whose formation is intimately connected

to the birth of stars is quite effective at concealing it.

Recent analysis of the luminosity density function of high redshift Lyman-break

galaxies (Steidel et al., 1999) has provided strong support for other evidence indicating

that the peak star-forming epoch has not been observed, to $z \gtrsim 4$. We simply

have not yet seen the onset of star formation in the universe. The resolution of the

sub-mm background into individual sources at high redshift, and the subsequent es-

timation of the surprisingly large fraction of star-forming activity concealed by dust

among these objects (e.g. Peacock et al., 2000) have begun to resolve this ambiguity,

but further uncertainties remain, including the importance of active galactic nuclei

(as might help explain the recently resolved hard X-Ray background — Barger

et al., 2001), and associated nuclear starbursts to the energy output of the early

star-forming universe.

The most luminous galaxies known emit strongly in the infrared, out-numbering

quasars in the local universe ($z \lesssim 0.3$) by 2:1 at luminosities greater than $10^{12}\,L_\odot$

(Sanders & Mirabel, 1996). The source of luminosity in these so-called infrared

luminous galaxies (ILGs) is uncertain — members of the class display a range of

contribution both from central, active nuclei and extended starbursts. Do the ILGs

have corollaries or progenitors in the distant universe, and will they be related to

the other high z populations already known? What role do normal galaxies play in

the overall history of star formation?

We will soon be able to begin answering these questions, thanks to new space

and airborne observatories including SIRTF, Herschel, and SOFIA, along with a

developing contingent of more capable ground-based instruments deployed on larger,

more efficient telescopes at infrared-favorable sites.

1.1 Techniques

To probe the evolution of star formation within a population of objects, it is vital

to understand the various contributions to their energy output. Increased sensitiv-

ity may uncover relationships between currently known classes of objects, or may

make possible the discovery of entirely new classes, but alone it cannot address the

nature of their energy generation. Without techniques available to approach and

disentangle the dominant underlying power mechanisms, we will be limited to sim-

ply counting objects, albeit in a variety of ways (luminosity functions, co-moving

volume densities, spatial correlation functions, etc.). Though much can be learned

from object counts, without deeper insight into the nature of the luminous content at

every epoch in the star-forming universe, its accurate history cannot be developed.

The measured emission from local luminous galaxies is dominated by several

processes: direct starlight from massive stars, highly non-thermal radiation from

active nuclei, and thermal re-radiation by dust. These emission components com-

bine, sometimes with significant dust extinction, to yield energy distributions which

are often difficult to interpret. A variety of techniques exists to help resolve this

ambiguity and differentiate star formation from other power sources.

Tracing the Hα recombination emission from H ii regions excited by ionizing

radiation of the embedded hot stars is a common method of measuring stellar con-

tent, and has been used with good results (e.g. Kennicutt, 1998); however, it fails

for objects suffering more than moderate extinction. Also, since atomic hydrogen is

so abundant in galaxies, and can be excited or ionized in a variety of astrophysical

contexts, its power to distinguish stellar from other excitation mechanisms can be

limited. Other hydrogen recombination lines (e.g. Hβ or Brγ) can be used in the

same way, but they quickly become quite weak at longer wavelengths, before appre-

ciable relief from heavy extinction is obtained.1 A more fundamental difficulty is

the reliance on properties of the surrounding medium. Since the lines observed have

been reprocessed by recombination, a “leaky” medium which permits the escape of

ionizing photons from the galaxy can confuse or obstruct this diagnostic.

Ultraviolet (UV) continuum measurements directly sample the photospheres of

hot stars, which emit the bulk of their energy at these wavelengths. Hence the total

UV flux scales directly with the star formation rate of active star-forming galaxies.

Emission just longward of the Lyman limit is in principle an extremely powerful

diagnostic of the stellar population, and ground-based rest frame UV observations of

high redshift galaxies have already provided compelling constraints on the evolution

of star and element formation (Madau et al., 1996). The chief disadvantage of

ultraviolet techniques is extreme sensitivity to extinction. The extinction just above

the Lyman limit is $A_{0.1\,\mu\mathrm{m}}/A_V \sim 400$, an increasingly severe difficulty to overcome

for studying environments of very recent star formation, in which the hottest and

youngest stars can often be the most deeply embedded. For distant galaxies the

Lyman forest of absorption lines can also remove considerable flux between Lyα

and the limit.

Energy source discrimination techniques which are based on far-infrared and ra-

dio continuum measurements, while virtually free from the effects of dust extinction,

are more prone to misinterpretation, since in dusty galaxies with hidden AGN they

sample only reprocessed radiation. Alternatively, the ionized regions surrounding

the hot stars in a stellar population can be directly investigated with mid- and

1 Brα (4.05µm) may be an important line available from space, since it sits at a relative minimum in dust extinction.

far-infrared fine structure lines of neon, sulfur, and other elements, which suffer

relatively little extinction ($A_V/A_{15\,\mu\mathrm{m}} \sim 100$). The hardness of the ionizing flux

(and hence the nature of the source of ionizing photons) can be determined using

well-chosen ratios among these lines, but age dependence, upper-mass cutoff, and

abundance effects may dilute their discriminatory power (Thornley et al., 2000).
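
To make the extinction argument concrete, the following minimal sketch compares the fraction of light transmitted through a foreground dust screen at V and at 15µm, using the approximate ratio $A_V/A_{15\,\mu\mathrm{m}} \sim 100$ quoted above. The function names and the sample values of $A_V$ are illustrative assumptions and are not part of the analysis in this thesis.

```python
# Transmission through a foreground dust screen: T = 10^(-0.4 * A_lambda),
# where A_lambda is the extinction in magnitudes at the wavelength of interest.

AV_OVER_A15 = 100.0  # approximate ratio A_V / A_15um quoted in the text


def screen_transmission(a_lambda_mag):
    """Fraction of flux surviving a screen with a_lambda_mag magnitudes of extinction."""
    return 10.0 ** (-0.4 * a_lambda_mag)


def compare(a_v):
    """Return (transmission at V, transmission at 15 micron) for visual extinction a_v."""
    return screen_transmission(a_v), screen_transmission(a_v / AV_OVER_A15)


# Illustrative A_V values (assumed, chosen only to show the trend).
for a_v in (1.0, 5.0, 20.0):
    t_v, t_15 = compare(a_v)
    print(f"A_V = {a_v:4.1f} mag: T(V) = {t_v:.2e}, T(15 um) = {t_15:.3f}")
```

Under these assumptions, a screen that removes 99% of the light at V leaves the 15µm flux nearly untouched, which is why fine structure lines remain usable in heavily embedded regions.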

A technique for identifying star formation which uses unambiguously stellar ra-

diation has led to a class of objects known as Wolf-Rayet Galaxies (Conti, 1991),

so named because of their measured contingent of Wolf-Rayet (WR) stars, a class

of peculiar emission line objects characterized by extreme mass loss rates

$\dot{M} \sim 10^{-4}\,M_\odot\,\mathrm{yr}^{-1} = 10^{10}\,\dot{M}_\odot$. The mass loss is driven in fast

($365 \le v_\infty \le 5000$ km/s), dense stellar winds, in which broad emission lines are formed. Though detecting the

unusual emission features which signal the presence of WR stars in the integrated

spectra of galaxies is challenging, this direct diagnostic provides incontrovertible

evidence of massive star formation.
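
As a rough check on the comparison with the Sun, the sketch below divides the quoted WR mass-loss rate by a solar-wind mass-loss rate of about $2 \times 10^{-14}\,M_\odot\,\mathrm{yr}^{-1}$. That solar value is an assumed round number that does not appear in the text, and the helper names are hypothetical.

```python
# Quoted in the text: typical Wolf-Rayet mass-loss rate, in solar masses per year.
WR_MDOT = 1.0e-4
# ASSUMPTION (not from the text): solar-wind mass-loss rate, in solar masses per year.
SOLAR_MDOT = 2.0e-14


def mdot_ratio(mdot=WR_MDOT, solar=SOLAR_MDOT):
    """Ratio of a stellar mass-loss rate to the assumed solar value."""
    return mdot / solar


def years_to_shed(mass_msun, mdot=WR_MDOT):
    """Time in years to shed mass_msun solar masses at a constant rate mdot."""
    return mass_msun / mdot


print(f"WR / solar mass-loss ratio: {mdot_ratio():.1e}")
print(f"Years to shed 10 Msun at the WR rate: {years_to_shed(10.0):.1e}")
```

With the assumed solar rate, the ratio comes out within a factor of a few of $10^{10}$, consistent with the order-of-magnitude equality stated above.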

Although the WR direct detection technique is powerful, it has so far been

applied using only a handful of strong optical wind lines, which cannot probe envi-

ronments in which star formation is embedded in dust. Though intrinsically weaker,

the possibility of extending this diagnostic to use infrared WR wind emission lines to

pierce the veil of extinction which may cloak a substantial fraction of star formation

in the local and distant universe is what motivated this thesis.

1.2 Organization

In an effort to extend the available spectral templates of WR stars to mid-infrared

(MIR) wavelengths, we undertook an 8–13µm spectrophotometric survey of fifty

northern Galactic stars with Score, a novel MIR spectrograph built as a prototype

of one of the instruments to fly aboard Sirtf. Score will be described briefly in

Chapter 2, and its sensitivity compared with past and future space-borne instru-

ments. Chapter 3 will document new analysis techniques which were developed for

the extraction of the Score spectral data. The spectra themselves will be pre-

sented in Chapter 4, along with constraints on Wolf-Rayet stellar evolution from

mass-loss rates, terminal wind velocities, and neon and sulfur abundances derived

from the data. In Chapter 5, population synthesis models of very young, instan-

taneously formed single stellar populations will be presented, which draw on the

measured MIR spectra to evaluate the conditions for which infrared direct WR

detection might be favorable. Score observations of nearby blue compact dwarf

galaxy NGC5253 will also be presented. Final conclusions and future possibilities

will be given in Chapter 6.

Chapter 2

The SIRTF Cornell Echelle

Spectrograph, SCORE

2.1 Introduction

The Space Infrared Telescope Facility (Sirtf) is a NASA observatory slated to

launch in July, 2002. It will consist of two cameras and a four-module spectrograph,

the Infrared Spectrograph (IRS; Houck et al., 2000; Roellig et al., 1998). Two of

the four IRS modules operate at low resolution with long slits, and two trade slit

length for increased resolution.

Compared to ground based infrared observations, Sirtf’s expected sensitivity

enhancements are phenomenal. Atmospheric molecular bands of H2O, N2O, CO2,

CH4, O3, and others absorb large amounts of infrared radiation. A blackbody of temperature

300K reaches its peak flux at ∼10µm, and hence the thermal background

of the telescope, and the atmosphere, which is substantially emissive in the same

molecular bands mentioned, is extremely high. Sirtf gains significantly both by

getting out from beneath the atmosphere and away from the thermal background of

a warm telescope in its solar, Earth-trailing orbit, and by passively cooling the tele-

scope assembly; however, these advantages alone do not account for the tremendous

Sirtf sensitivity gains. Other advancements combine to make it several hundred

times more sensitive than the similarly sized ISO space observatory. Efficient, large-

format infrared detector array technology has recently progressed at a rapid pace.

Coupled with spectrographic designs which make full use of the higher pixel count,

these detectors contribute significantly to the improved performance. Though still

plagued by the difficulties of high thermal background and atmospheric opacity,

ground-based mid-infrared instruments can take advantage of the same detectors

and designs to achieve greatly improved performance over earlier efforts.
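The ∼10µm peak quoted above for a 300K blackbody follows from Wien's displacement law. The short sketch below evaluates it; the function name and the two additional temperatures are illustrative assumptions, not values taken from the instrument design.

```python
# Wien displacement constant for the wavelength form of the Planck
# function, in micron kelvin.
WIEN_B_UM_K = 2897.77


def wien_peak_um(temperature_k):
    """Wavelength (microns) at which B_lambda peaks for a blackbody."""
    return WIEN_B_UM_K / temperature_k


# 300 K is the value used in the text; 270 K and 330 K are illustrative.
for t in (270.0, 300.0, 330.0):
    print(f"T = {t:.0f} K: peak near {wien_peak_um(t):.2f} micron")
```

Any plausible ambient temperature places the emission peak of a warm telescope and sky inside or near the N-band window, which is why the thermal background dominates ground-based work at these wavelengths.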

The Sirtf Cornell Echelle (Score) is a mid-infrared (MIR) spectrograph built

as a prototype of the IRS’s short wavelength, high resolution (“short-high”) module,

and was the first cross-dispersed MIR spectrograph in operation (Van Cleve et al.,

1998; Smith et al., 1998). Score operates at a resolution of R ∼ 600 over the N-Band wavelength range from 8–13.5µm, using versions of the same detectors which

will fly aboard Sirtf, modified to accommodate the notably higher background

fluxes which attend ground-based observation. It has been operated with success

at the f/70 Cassegrain focus of Palomar Observatory’s 5-meter Hale telescope since

November, 1996. Score was designed with the same philosophy as the Sirtf

instrument: no moving parts, and “bolt-and-go” construction, which eliminates the

fine tuning adjustments most instruments require upon assembly, and decreases cost.

2.2 Hardware

The original Score testbench was constructed entirely of aluminum by Ball Aero-

space, permitting room temperature focusing to hold (except for a second order

focus shift upon cooling) at cryogenic temperatures. The Score dewar was man-

ufactured by Precision Cryogenics, Inc., and contains two cryogenic tanks, which

permit cooling the detector arrays and spectrograph to liquid helium (LHe) temper-

atures for proper operation. Additionally, the 3.7 liter liquid nitrogen (LN2) tank

decreases emission from internal optics and increases the LHe hold time (≳24 hours

for 5 liters LHe).

Score uses two 128×128 Si:As BIBIB focal plane arrays, developed at Rockwell International (now Boeing; Seib et al., 1994). One of the focal-plane arrays (FPAs)

serves as the detector for the spectrograph, and the other serves as the slit-viewer

detector.

2.2.1 Two Paths Diverge

The Sirtf IRS prototype module which forms the core of the Score instrument

was modified with an additional optical path which will not be present on Sirtf—

a slit viewer. Fig. 2.1 shows diagrammatically Score’s two optical paths. The

actual as-built schematic is shown in Fig. 2.2. Incoming light from the telescope is

[Figure 2.1 diagram labels: f/12 beam from f-converter, Slit, Cut-on Filter, Collimator, Predisperser Grating, Echelle Grating, Camera, Spectrograph FPA, Filter, Blocking Element, Slit-viewer lens, Slit-Viewer FPA]

Figure 2.1: A simple diagram of the Score optical system (not to scale), showing the two major pathways. The light from the telescope passes through an f-converter, yielding an f/12 beam on the slit. Light passing through the slit plane first encounters a 7.3µm cut-on filter. It is then collimated, cross-dispersed, dispersed by the echelle, and imaged onto the spectrograph FPA. Light reflected from the slit plane passes through a filter/blocking element pair and is re-imaged onto the slit-viewer FPA.

focussed on the reflecting slit surface, which is tilted at a 45◦ angle to the optical

axis. The slit itself has a projected size of 240×480µm, which is matched to the 5-m telescope’s 10µm diffraction limit of 1′′ (2.44λ/D).
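The quoted 1′′ figure can be checked directly from 2.44λ/D, the full angular diameter of the Airy disk out to its first null. The sketch below is only an arithmetic check; the function name is an assumption, and the aperture is taken as exactly 5 m, as in the text.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def airy_diameter_arcsec(wavelength_m, aperture_m):
    """Full diameter of the Airy disk to the first null, 2.44 lambda / D."""
    return 2.44 * wavelength_m / aperture_m * ARCSEC_PER_RAD


# 10 micron light on a 5 m aperture, as in the text.
diameter = airy_diameter_arcsec(10e-6, 5.0)
print(f"Airy disk diameter: {diameter:.3f} arcsec")
```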

Light passing through the slit is collimated and encounters a grating pre-disperser

and then the echelle grating (which disperse at right angles to each other), and is

then refocused by a camera mirror onto the main detector. This produces the

echellogram spectral format seen in Fig. 2.3, in which 15 individual orders of the

main echelle grating, sorted by the cross-disperser, fall in adjacent positions.

Light which is reflected by the slit surface enters the slit-viewer module, gets

filtered by a blocking element and a “silicate filter” centered at 11.3µm with 1µm

bandpass, and is re-imaged onto the second detector. This provides a 12′′ diameter

field of view surrounding the slit, which is used for target acquisition, and absolute

flux calibration of point sources. The slit-viewer data are recorded simultaneously

with the spectral data, and provide an accurate register of the source position on

the slit. Fig. 2.3 also shows an example slit-viewer image for planetary nebula

Figure 2.2: The production schematic of the Score optical elements in two orthogonal views, starting at the cold Lyot stop formed by the telescope and f/converter optics. See text for description.

Figure 2.3: The crossed-echelle spectrum (right), and simultaneous 11.3µm slit-viewer image (left) of planetary nebula NGC 7027. A bright line of [S iv] is seen in the center of the spectrum, along with Ar iii, Ar iv, [Ne ii], and a broad PAH feature. The slit-viewer camera shows the precise location of the slit (dark rectangle) on the southwest lobe.

NGC 7027.

2.2.2 Electronics

The Score control electronics which operate the two arrays are similar to those

employed by SpectroCam, a 10µm spectrograph and camera built by Cornell for

the Palomar 5-m (Hayward et al., 1993). A schematic diagram of the operating

electronics is shown in Fig. 2.4. In brief, the two arrays are driven in parallel, and

clocked out exactly as if they were one larger 256×128 array with eight output channels. Most array input clocks and biases are slaved together for the two arrays,

and duplicate pre-amplifier and 14-bit analog-to-digital converters handle digitizing

the four output channels per detector. A dedicated PC mounted near the instrument

at the back of the telescope controls the arrays, while a Sun Sparcstation collects the

data and serves as the instrument’s user interface. The large and highly variable

background seen by Score at Palomar requires rapid chopping (∼5Hz) on and off source to cancel the changing sky flux and imposes the additional requirement

that the arrays must be read every 60ms to avoid saturating the detector wells. A

[Figure 2.4 diagram labels: Sparcstation, Observatory Ethernet, Telescope computer, Telescope autoguider, PC, Metrabyte PIO24, National AT-DIO-32F, Chop interface box, Chopping secondary, Digital box, Analog box, Clock Level Shift, DC bias, Pre-Amp, Coadder + A/D, Gain, BW, and S/H enable, Dewar (camera and spectrograph arrays). C = camera detector substrate bias; S = spectrograph detector substrate bias.]

Figure 2.4: A schematic of the Score electronics, indicating the major bias, clocking, and data pathways. The two FPAs are indicated at right in the dewar. Duplicate pre-amplifiers and analog-to-digital converters for the four output channels of each array are shown.

hardware co-adder is therefore employed to digitally stack consecutive frames in an

integration prior to readout.

2.3 Performance

Score’s performance is illustrated in Fig. 2.5. The point source flux density sensi-

tivity (1-σ, 500s), with chopping overhead removed, is shown. The predicted value

agrees well with the observed sensitivity. Sensitivity is degraded at the edges of the

atmospheric N-Band window, and at a strong ∼9.7µm O3 telluric feature.

For reference, the Sirtf IRS sensitivity (J. Van Cleve, priv. communication), and the ISO sensitivity (R ∼ 10³) are shown. The latter was computed using the laboratory noise model outlined in the ISO handbook (Leech et al., 2001), and

is based on detector testing which does not include in-orbit S/N degradation by

effects such as glitches and fringing. The predicted ISO sensitivity in two detector

bands falls remarkably close to Score’s measured value (though at somewhat higher

resolution). In practice, however, Score’s realized sensitivity is at least as good and

in some instances appears to be noticeably better than ISO’s SWS spectrograph, as

Figure 2.5: The Score sensitivity for a 500s staring 1–σ detection. The solid line shows the predicted sensitivity for 25% combined telescope and sky emissivity, which is in close agreement with the measured sensitivity. Also shown are the sensitivities predicted for the SIRTF IRS module, and computed for the ISO–SWS, along with the continuum flux from a nearby brown dwarf.

confirmed by comparison of Score and ISO observations of the same target star.1

Also shown for reference is the flux density of nearby brown dwarf Gl229b.

2.4 Conclusions

The phenomenal performance gains achieved by Score for ground-based mid-

infrared spectroscopy presage the fundamental role Sirtf will play in making this

the “decade of the infrared.” It has vindicated the design concepts and no-moving-

parts philosophy Sirtf’s spectrographic instrument is based on, and has proved

extremely scientifically productive, contributing to projects on Be Stars, planetary

nebulae, M dwarfs, and many others. Future mid-infrared instruments utilizing

modern large detector arrays will benefit from the pioneering role Score has played.

1 ISO, however, had much larger wavelength coverage than Score, including regions blocked by the atmosphere. The [Ne iii] (15.5 µm) and [Ne v] (24.3 µm) lines, unavailable to ground-based instruments, proved particularly valuable.

Chapter 3

Spectral Extraction

3.1 Introduction

The advent of large format optical and infrared detector arrays has substantially im-

pacted spectrograph design and data analysis techniques. Since individual spectral

orders are usually imaged as approximately linear entities, and are generally more

extended in the dispersion direction than the spatial (slit) direction, taking full ad-

vantage of the available detector area requires careful placement of the spectrum

on the array. Partitioning the image delivered by the telescope using fiber bundles

or differential slicing elements allows additional spatial information to be recorded

— spectra from distinct object locations but with the same wavelength coverage

are replicated in different locations on the array. Alternatively, cross dispersing the

spectrum permits additional spectral orders to be separated, filling the detector ar-

ray, and greatly increasing the wavelength coverage available. These modern optical

configurations have revolutionized astronomical spectrography, but have also neces-

sitated a new class of sophisticated algorithms to efficiently extract the spectra they

produce.

3.2 Issues of Cross Dispersion

In spectrographs which cross-disperse the incoming beam (such as Score — Chap-

ter 2), the spectral orders formed by the main dispersing element (often an echelle

grating operated in high order) will not in general fall parallel to the detector rows or

columns. The large wavelength coverage at relatively high resolution made possible

by cross-dispersed echelle spectrograph designs comes with a cost: the necessity of

Figure 3.1: A schematic curved order image with exaggerated slit rotation, and spectral axes depicted. (Detector axis labels: cols, rows.)

sampling the non-linear dispersion regime of the order-sorting element (typically a

grating or a prism), resulting in curved orders with variable spacing between them.

Grisms, or immersed gratings, which combine the opposing non-linear disper-

sions of prisms and gratings in a single optical element, can potentially be used as

cross-dispersers to reduce order curvature and differential spacing, much as an achro-

matic doublet balances to first order the opposing chromatic aberrations of positive

and negative lenses of different glass materials. Despite the increased placement

efficiency and more evenly spaced orders made possible by grisms, the higher order

aberrations remain significant. The effective dispersion direction and slit direction

(the spectral axes) will in general still not be perpendicular, requiring either one

or the other to cross the detector axes (usually the slit direction is chosen to align

approximately with one of the detector axes).

For more typical, curved-order spectra, not only are the spectral axes non-

orthogonal and non-parallel to the detector axes, they also rotate and diverge or

converge along the order, illustrated schematically in Fig. 3.1. Additional, unre-

lated optical effects can serve to introduce wavelength dependent line tilt, as seen in

Fig. 3.2, a Score spectrum of atmospheric emission obtained by summing individ-

ual sky frames, and dividing out the lab-measured optical and electronic efficiency

map to accentuate the data located at the extremes of the slit. Note the pronounced

counter-clockwise rotation of lines in the lower orders.

Long slit lengths, though not usually present in cross-dispersed spectrograph

designs, can complicate matters further by creating slit images which are not just

tilted but also curved. A tilted slit image can be regarded as a small segment of

a curved slit image which samples such a small portion of the curvature that it

appears locally straight.

Curved, unevenly spaced orders and a rotating slit image combine to make stan-

dard, row or column-based extraction undesirable for many digital two-dimensional

Figure 3.2: An unreduced atmospheric spectrum showing various emission and absorption lines of H2O, CO2, and O3. Longer wavelengths are towards the bottom and left. The combined optical and electronic efficiency measured in the lab as the difference of warm blackbodies has been divided out to enhance the signal at the extremes of the slit. Note the pronounced tilting of lines at the long wavelength (bottom) end of the spectrum, and the differential tilt within individual orders.

spectra.

3.3 Optimal Extraction

Early methods for extracting digitally recorded spectra involved identifying the re-

gion within the slit image occupied by the source and sky, and by the sky alone, and

then individually summing over the pixels in these regions in a direction perpen-

dicular to the dispersion axis (the spatial direction). These two sums can be scaled

to each other using the number of pixels involved, and the sky subtraction trivially

performed by differencing the pixel-weighted sums. While this technique is concep-

tually and computationally quite simple, modern spectrographs have increasingly

rendered it obsolete.
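The standard extraction just described can be written in a few lines. The following is a minimal sketch for a straight order aligned with the detector rows; the function name, the array layout (slit pixel first, wavelength second), and the toy data are assumptions made for illustration.

```python
import numpy as np


def standard_extract(image, obj_rows, sky_rows):
    """Standard (unweighted) extraction of a straight order.

    image    : 2-D array indexed [slit pixel j, wavelength pixel]
    obj_rows : slit pixels containing source plus sky
    sky_rows : slit pixels containing sky alone
    """
    obj_sum = image[obj_rows, :].sum(axis=0)
    sky_sum = image[sky_rows, :].sum(axis=0)
    # Scale the sky sum to the number of object pixels before differencing.
    return obj_sum - (len(obj_rows) / len(sky_rows)) * sky_sum


# Noise-free toy order: 9 slit pixels, 4 wavelengths, flat sky of 10 per pixel.
true_flux = np.array([100.0, 200.0, 150.0, 50.0])
image = np.full((9, 4), 10.0)
image[3:6, :] += np.outer([0.2, 0.6, 0.2], true_flux)

spectrum = standard_extract(image, obj_rows=[3, 4, 5], sky_rows=[0, 1, 7, 8])
print(spectrum)
```

In the noise-free case the true flux is recovered exactly; the shortcomings discussed next appear only once noise is present, because every object pixel enters the sum with the same weight.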

Even if the spectral orders are well aligned with the detector axes, the standard

extraction technique suffers. Since an object’s flux is not usually equally distributed

along the slit, but instead is described by a spatial profile — a wavelength depen-

dent function specifying the fraction of light which falls in each pixel along the

slit (see Fig. 3.3) — adding together all of the spatial pixels containing object flux

unnecessarily degrades the signal-to-noise (S/N) of the resultant spectrum, even if

all these pixel values are characterized by the same underlying noise (not often the

case). One alternative possibility is summing over only those object pixels with

the highest S/N , above some preset threshold. Unfortunately, this technique will

compromise spectrophotometric accuracy, since the profile will not be uniformly

sampled. This problem is especially severe for the common case of slit profiles

which change shape with wavelength. An immediate solution is apparent: perform

a weighted total along the slit direction, with weights chosen to maximize the S/N .

The class of algorithms which prescribe these weights statistically are called optimal

extraction algorithms.

3.3.1 Background

The first optimal extraction algorithm was documented by Horne (1986), and later

expanded by others. Assuming a known profile function at a single wavelength, Pλj,

with normalization $\sum_j P_{\lambda j} = 1$, where j is the discrete coordinate in the spatial direction (1 ≤ j ≤ n along a slit image covering n pixels) and λ is the dispersion coordinate (see Fig. 3.3), we can estimate the expected value of a given pixel at

Figure 3.3: A schematic portion of an ideal straight array order, labelled with the coordinate systems used in the discussion. The discrete coordinates are (λ, j), enumerating pixels in the dispersion direction (rows), and slit direction (columns), respectively. The analogous continuous coordinates are (x, y). To the right, an example profile, Pλj, at a single wavelength shows how a given hypothetical flux distribution along the slit (smooth curve) becomes a normalized profile function. (In the figure, λ runs from 1 to 20 and j from 1 to 5.)

some wavelength λ in a single order of the spectral image D as

$$\langle D_{\lambda j} \rangle = P_{\lambda j} f_\lambda \tag{3.1}$$

where fλ is the true and unknown flux at one wavelength in an order, which we wish

to recover. The profile, Pλj, will in general be affected by the source intensity and

angular size (e.g., point-like vs. extended), atmospheric seeing, slit throughput, and

various other instrumental optical effects.

An estimate of the true flux, fλ, based on the data at a given wavelength (in

one spectral order on the array; order overlap is addressed in § 3.4.4) is denoted f̃λ. Dropping the redundant λ from all quantities, this estimate can be written as a

linear combination of all the pixels along the slit:

$$\tilde{f} = \sum_j w_j D_j \tag{3.2}$$

where wj are an optimal set of weights at this wavelength which we wish to de-

termine. Alternatively, we could consider a weighted sum of the individual flux

estimates at each spatial position — f̃j = Dj/Pj. Each of the f̃j represents the

flux we would estimate using the profile if we had only the jth pixel’s data to con-

sider; if the profile is accurately determined, the independent estimates f̃j should all

have the same mean (though different variances). We can write this alternatively

formulated, but tantamount estimate with a different set of weights, w′j:

$$\tilde{f} = \sum_j w'_j \tilde{f}_j = \sum_j w'_j \frac{D_j}{P_j} \tag{3.3}$$

It will be convenient to retain both of these interchangeable formulations of the

weighted sum. Note that as of yet we have said nothing of the origin or form of the

profile function Pj (but see § 3.3.2). The variance of the equivalent flux estimates in Eqs. 3.2 & 3.3 is

$$V(\tilde{f}) = \sum_j w_j^2 V_j = \sum_j {w'_j}^2 \frac{V_j}{P_j^2} \tag{3.4}$$

where Vj is the variance at pixel j along the slit. The expected value of each data

element (Eq. 3.1) implies an expected value of the flux estimate, using Eqs. 3.2 &

3.3, of

$$\langle \tilde{f} \rangle = f \sum_j w_j P_j = f \sum_j w'_j \tag{3.5}$$

For an unbiased estimate, we require the expected value of the flux estimate to equal

the true flux, $\langle \tilde{f} \rangle = f$, which translates into the pair of constraints:

$$\sum_j w_j P_j = \sum_j w'_j = 1 \tag{3.6}$$

Minimizing each of the equivalent variances in Eq. 3.4 subject to these constraints

with the method of Lagrange multipliers (see Appendix A), the optimal weights are

found to be

$$w_j = \frac{w'_j}{P_j} = \frac{P_j / V_j}{\sum_i P_i^2 / V_i} \tag{3.7}$$

The weights for the two different formulations are related as we would have predicted

had the final equality in Eq. 3.5 held term by term. Since the variance of each

individual flux estimate f̃j is given by $V_j / P_j^2$ (see Eq. 3.4), we can understand

Eq. 3.7 as simply the statement that the minimum variance in the weighted average

of a population (in this case, the one-pixel flux estimates) with identical mean is

achieved with weights inversely proportional to the individual variances (see, e.g.

Bevington, 1992, p. 59).
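As a numerical illustration of Eqs. 3.4, 3.6, and 3.7, the sketch below computes the optimal weights for a hypothetical five-pixel profile with uniform pixel variance and compares the resulting flux variance with that of a plain sum. The function names and the example numbers are assumptions chosen for illustration.

```python
import numpy as np


def optimal_weights(profile, variance):
    """Eq. 3.7: w_j = (P_j / V_j) / sum_i (P_i^2 / V_i)."""
    profile = np.asarray(profile, dtype=float)
    variance = np.asarray(variance, dtype=float)
    return (profile / variance) / np.sum(profile ** 2 / variance)


def flux_variance(weights, variance):
    """Eq. 3.4: V(f) = sum_j w_j^2 V_j, for uncorrelated pixels."""
    return np.sum(np.asarray(weights) ** 2 * np.asarray(variance))


P = np.array([0.1, 0.2, 0.4, 0.2, 0.1])   # normalized profile (hypothetical)
V = np.full(5, 4.0)                       # uniform pixel variance (hypothetical)

w_opt = optimal_weights(P, V)
w_std = np.ones(5)                        # standard extraction: plain sum

print("unbiased condition, sum w_j P_j:", np.sum(w_opt * P))
print("variance, optimal :", flux_variance(w_opt, V))
print("variance, standard:", flux_variance(w_std, V))
```

Both weight sets satisfy the constraint of Eq. 3.6, but the optimal weights give the smaller variance, equal to the reciprocal of the sum of P²/V over the slit.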

3.3.2 Profile and Variance

The profile and the noise model used to predict pixel variance are the two most im-

portant ingredients in optimal extraction. Incorrectly estimating the pixel variances

will degrade the S/N , but will not bias the extraction, since any set of Vj in Eq. 3.7

will satisfy the enforced unbiased estimate condition (Eq. 3.6). An incorrect profile,

however, will disrupt the individual per-pixel flux estimates, so that the assumption

that each of the f̃j are drawn from a population with the same mean is invalidated,

and the calculations which led to the optimal weights in Eq. 3.7 are flawed. This can

introduce a bias into the final calculated flux. Obviously, to maintain an unbiased

flux estimate, the error in the profile utilized must be appreciably smaller than the

individual pixel noise.
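The different consequences of a wrong variance and a wrong profile can be seen in a noise-free toy case. The sketch below applies the weights of Eq. 3.7 to the expected pixel values of Eq. 3.1; all numbers are hypothetical and the helper name is an assumption.

```python
import numpy as np


def weights(profile, variance):
    """Optimal weights of Eq. 3.7 for a given profile and variance."""
    return (profile / variance) / np.sum(profile ** 2 / variance)


true_P = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
true_f = 1000.0
expected_D = true_P * true_f                 # noise-free expectation, Eq. 3.1

# Case 1: correct profile, arbitrary (wrong) variances. Still unbiased.
wrong_V = np.array([1.0, 50.0, 3.0, 0.2, 9.0])
f_wrong_V = np.sum(weights(true_P, wrong_V) * expected_D)

# Case 2: wrong profile (too sharply peaked), uniform variances. Biased.
wrong_P = np.array([0.05, 0.2, 0.5, 0.2, 0.05])
f_wrong_P = np.sum(weights(wrong_P, np.ones(5)) * expected_D)

print("wrong variance:", f_wrong_V)
print("wrong profile :", f_wrong_P)
```

With the correct profile the true flux is returned whatever variances are supplied, while the mis-shaped profile underestimates the flux even without noise.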

The earliest techniques developed to ensure a well-determined spatial profile

function involved summing over each row corresponding to a given pixel (a fixed

value of j in Fig. 3.3), and using the same fractional profile present within this

order sum at all wavelengths in the order (Robertson, 1986). For orders with any

curvature, or with intrinsic variation in the profile, this method fails. To accommo-

date variations in the profile, newer formulations specified fitting a given function,

such as a Gaussian, to the spatial profile data at each wavelength, and then smoothly

interpolating over the family of functions so found. This method forces the choice of

some predefined form for the profile, which may not always hold, due to changes in

seeing, optical distortions, and different intrinsic angular distributions for different

sources. Even if the profile remains relatively unchanged across the order, it may

not have a form simple enough to yield to an easy analytic fit.

The strength of the profile generation method introduced by Horne (1986) is

that it specifies no special form for the underlying profile function — assuming

only that it varies smoothly across the order.1 Thus, the technique is powerful for

point sources and extended sources, in the presence of all aberrations which serve

to introduce continuous profile distortions.

The pixel variances, while immaterial to the overall bias of the extracted flux, are

critical if performance better than standard extraction (the special case of optimal

extraction in which all the wj are equal) is to be obtained. Typically, a model

1 This assumption breaks down for extended sources in which the shape or center of the spatial profile shifts with wavelength (for instance, objects with discrete line emission offset from continuum emission). It is still valid for spatially resolved source components with different, even opposite, continuum slopes, as long as the continua vary smoothly, and for objects with line and continuum emission not separately resolved.

variance is formulated as:

$$V_{\lambda j} = V_0 + \frac{|P_{\lambda j} \tilde{f}_\lambda|}{Q} \tag{3.8}$$

where V0 is a constant term, including detector readout noise, which is indepen-

dent of the impinging flux. The second term measures the expected variance with

the implicit assumption that the arriving photon flux is characterized by Poisson

statistics, such that for a number of photo-events N, the realized noise in the measurement is √N. The factor Q is the number of photo-electrons per recorded data

number, and is fixed by the detector and readout electronic characteristics. If the

constant term is zero, and the recorded flux is contributed entirely by the source

object (i.e. the measurement is source noise limited), then Vλj ∝ Pλj and all pixels in the estimate of Eq. 3.7 are given equal weight. If background or detector noise is important, Eq. 3.8 must be modified, since the noise affecting the measurement of f̃λ will no longer be dominated by f̃λ itself. Since other types of noise may appear

which are arbitrary functions of the flux impinging on a pixel (linear or nonlinear),

or even the prior history of the detector, care must be taken to develop an accurate

noise model (see § 3.4.2 for a simple example).
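The limiting behaviour of Eq. 3.8 can be verified numerically. The sketch below evaluates the model variance and the corresponding weights of Eq. 3.7 in a source-noise-limited case (V0 = 0) and in a case dominated by the constant term; the values of the flux, Q, and V0 are hypothetical.

```python
import numpy as np


def model_variance(profile, flux, v0, q):
    """Eq. 3.8: V_j = V0 + |P_j f| / Q."""
    return v0 + np.abs(np.asarray(profile) * flux) / q


def optimal_weights(profile, variance):
    """Eq. 3.7."""
    return (profile / variance) / np.sum(profile ** 2 / variance)


P = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
f, Q = 5000.0, 10.0                       # hypothetical flux and gain factor

# Source-noise limited: V proportional to P, so every weight is equal.
w_source = optimal_weights(P, model_variance(P, f, v0=0.0, q=Q))

# Constant term dominant: weights follow the profile shape.
w_const = optimal_weights(P, model_variance(P, f, v0=1.0e4, q=Q))

print("source-limited weights :", w_source)
print("constant-term weights  :", w_const)
```

In the first case the optimal extraction reduces to the standard sum, as stated above; in the second the central pixels are weighted most heavily.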

3.3.3 Algorithm

The Horne algorithm involves establishing the profile and variance map for all pixels

in an order, and using these to compute the optimal weights (Eq. 3.7) used in

deriving the final flux. Since both the profile and variance themselves depend on

the flux calculated, the process is necessarily iterative. Typically the procedure

consists of generating an initial flux estimate as a starting point (again omitting the

redundant subscript λ):

$$\tilde{f}^{(0)} = \sum_j D_j \tag{3.9}$$

This is simply the standard extraction flux estimate. The initial profile function,

$P^{(0)}_j$, is found by fitting low order polynomials over wavelength to a set of profile

estimates, one for each pixel position along the slit. The initial estimate of the

fraction of flux which pixel j contains is simply

$$\tilde{P}^{(0)}_j = \frac{D_j}{\tilde{f}^{(0)}} \tag{3.10}$$

To prevent cosmic ray or other blemishes from affecting the profile, outliers are

iteratively removed from the profile values being fit if

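$$\frac{(D_j - \tilde{f} P_j)^2}{V_j} \geq \sigma^2_{\mathrm{clip}} \tag{3.11}$$

where $\sigma^2_{\mathrm{clip}}$ is a clipping variance and $\tilde{f} P_j$ is the expected pixel value from the current flux and fitted profile. This simply enforces the constraint that the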

square deviation from the expected pixel value in units of the pixel variance (the

mean square deviation) cannot exceed a given threshold.

Negative values of the profile function are truncated at zero, and it is normalized

at each wavelength:

$$P^{(0)}_j = \frac{\max(0, P^{(0)}_j)}{\sum_i \max(0, P^{(0)}_i)} \tag{3.12}$$

where the max function returns the maximum value of its two operands. The pixel

variance is then recomputed using not the pixel value Dj, but the expected pixel

value $\tilde{f} P^{(0)}_j$:

$$V^{(1)}_j = V(\tilde{f} P^{(0)}_j) \tag{3.13}$$

Using this updated variance to derive a new set of weights in Eq. 3.7, a new weighted

flux estimate, $\tilde{f}^{(1)}$, is calculated. From the updated flux comes a new profile estimate

at each pixel position:

$$\tilde{P}^{(1)}_j = \frac{D_j}{\tilde{f}^{(1)}} \tag{3.14}$$

This updated profile is fit again to determine the smooth profile function, using

the new variance to control outliers. Iteration continues in this way until the flux

converges. The algorithm is not sensitive to the ordering of operations within the

iteration sequence, and typically convergence requires only a few iterations.
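The iteration described above can be sketched compactly for a straight, sky-subtracted order. This is an illustrative reading of Eqs. 3.7 to 3.14, not the reduction code used for Score: the function name, the polynomial degree, the fixed number of passes, the use of the noise model of Eq. 3.8, and the synthetic data are all assumptions, and positive source flux at every wavelength is presumed.

```python
import numpy as np
from numpy.polynomial import polynomial as npoly


def horne_extract(D, v0, q, poly_deg=2, sigma_clip=5.0, n_iter=5):
    """Iterative optimal extraction; D[j, l] = slit pixel j, wavelength l."""
    n_slit, n_lam = D.shape
    lam = np.linspace(-1.0, 1.0, n_lam)        # scaled wavelength coordinate
    flux = D.sum(axis=0)                       # Eq. 3.9
    var = v0 + np.abs(D) / q                   # first guess from the data
    for _ in range(n_iter):
        P_est = D / flux                       # Eqs. 3.10 and 3.14
        P_sig = np.sqrt(var) / np.abs(flux)    # approximate error of P_est
        P = np.empty_like(D)
        for j in range(n_slit):
            good = np.ones(n_lam, dtype=bool)
            for _pass in range(3):             # fit, clip outliers, refit
                c = npoly.polyfit(lam[good], P_est[j, good], poly_deg,
                                  w=1.0 / P_sig[j, good])
                P[j] = npoly.polyval(lam, c)
                # Eq. 3.11: keep pixels within the clipping threshold.
                keep = (D[j] - flux * P[j]) ** 2 / var[j] < sigma_clip ** 2
                if keep.sum() <= poly_deg + 1 or np.array_equal(keep, good):
                    break
                good = keep
        P = np.clip(P, 0.0, None)
        P /= P.sum(axis=0)                     # Eq. 3.12
        var = v0 + np.abs(flux * P) / q        # Eqs. 3.8 and 3.13
        # Eq. 3.7 inserted into Eq. 3.2.
        flux = (P * D / var).sum(axis=0) / (P ** 2 / var).sum(axis=0)
    return flux, P


# Synthetic order: Gaussian profile, sloping continuum, Eq. 3.8 noise.
rng = np.random.default_rng(0)
slit = np.arange(7)
true_P = np.exp(-0.5 * (slit - 3.0) ** 2)
true_P /= true_P.sum()
true_f = np.linspace(1500.0, 2500.0, 64)
v0, q = 25.0, 1.0
expect = np.outer(true_P, true_f)
D = expect + rng.normal(size=expect.shape) * np.sqrt(v0 + expect / q)

flux, P = horne_extract(D, v0, q)
print("median fractional error:", np.median(np.abs(flux / true_f - 1.0)))
```

A fixed number of passes replaces a formal convergence test here for brevity; a production version would stop when the flux changes by less than some tolerance.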

If the assumption that the spatial profile remains smooth is valid, the profile

function fit described should be relatively less noisy than the individual pixels on

which it depends, since it is affected by data at so many wavelengths.

Another strength of the optimal extraction algorithm comes as a byproduct

of the well-determined profile and pixel variance, both of which were required for

determining the correct weights. Cosmic rays or other blemishes can be effectively

removed from the final spectrum by examining data at each wavelength for extreme

deviation from the expected profile. To avoid detection and introduce spurious line

features into the spectrum, these artifacts would need to closely mimic the spatial

profile — an unlikely scenario for otherwise uncorrelated signals. A bad pixel mask

Figure 3.4: A portion of a curved order image, with an example tilted line profile in gray. The order envelope is smooth, but a pixel-binned version is shown. The discrete coordinate λ no longer corresponds to columns, as in Fig. 3.3 (i now serves this purpose), but is instead a parameter which follows the curved order. The different tick angle at each λ corresponds to the changing line tilt across the order. The j coordinate is as before, but a pixel-based resampling tracing the order is shown (shifting by entire pixels). (Figure axes: i, j, x, y, and λ; j runs from 1 to 5 at each position along the order.)

can be determined using the exact same criterion used to reject pixels from the

profile function fit (Eq. 3.11), but this time, using the profile fits themselves to

exclude the spurious data so revealed. The same rejection threshold need not be

employed. A good fit profile will still allow the recovery of an unbiased, if noisier,

flux at that wavelength, despite the missing data.
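A minimal sketch of this masking step at a single wavelength follows. It rejects the single worst pixel per pass using the criterion of Eq. 3.11 and recomputes the weighted flux from the surviving pixels; rejecting one pixel at a time, the function name, and the toy numbers are assumptions made for illustration.

```python
import numpy as np


def masked_optimal_flux(D, P, V, sigma_clip=5.0):
    """Optimal flux at one wavelength with iterative bad-pixel rejection."""
    good = np.ones(D.shape, dtype=bool)
    while True:
        w = good * P / V
        # Masked form of Eqs. 3.2 and 3.7: only good pixels contribute.
        flux = np.sum(w * D) / np.sum(w * P)
        dev2 = np.where(good, (D - flux * P) ** 2 / V, 0.0)
        worst = int(np.argmax(dev2))
        if dev2[worst] < sigma_clip ** 2 or good.sum() <= 1:
            return flux, good
        good[worst] = False                # reject the worst pixel and repeat


P = np.array([0.1, 0.2, 0.4, 0.2, 0.1])   # well-determined profile
true_f = 1000.0
V = 25.0 + P * true_f                     # model variance (hypothetical V0, Q = 1)
D = P * true_f
D[1] += 5000.0                            # cosmic ray hit on one pixel

flux, good = masked_optimal_flux(D, P, V)
print("recovered flux:", flux)
print("good-pixel mask:", good)
```

Because the weights are renormalized over the surviving pixels, the estimate stays unbiased after the rejection, at the cost of a somewhat larger variance.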

3.3.4 Order Curvature

The technique described in the previous section works well for relatively straight

orders. If orders experience significant tilt and/or curvature, a given row of the

spectral image often contains only a small portion of the full order range. The

profile is then poorly constrained and the extraction is afflicted by noise.

Fig. 3.4 illustrates a schematic curved order image, including line tilt. Line tilt

will be discussed in § 3.3.5, and for present purposes we can retain the association of λ with detector columns, as in Fig. 3.3.

One seemingly obvious solution to this problem is to straighten the order prior to

further reduction. The simple unit pixel shift method for tracing the order depicted

in Fig. 3.4 will clearly introduce discontinuities in the profile, but of course one

could also consider resampling the order data before applying the optimal extraction

algorithm. Since the order location can be readily determined, there is no technical

barrier to this method, and in general the resampling can be performed without

introducing a bias to the computed spectrum.

Despite the intuitive appeal of this technique, it cannot work in practice. Any

resampling necessarily correlates the noise in adjacent pixels of the resampled image.

The analysis of § 3.3.1 – § 3.3.3 (e.g., Eq. 3.4) made use of the implicit assumption that the pixels contained within the slit image were uncorrelated, and therefore

resampling can defeat optimal extraction.
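The correlation introduced by resampling can be made explicit by propagating a diagonal covariance matrix through the resampling operator. The sketch below assumes linear interpolation and a half-pixel shift, which is only one of many possible resampling schemes; the function name is hypothetical.

```python
import numpy as np


def shift_matrix(n, frac):
    """Linear-interpolation operator shifting n pixels by `frac` of a pixel."""
    A = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    A[idx, idx] = 1.0 - frac
    A[idx, idx + 1] = frac
    return A


A = shift_matrix(6, 0.5)
cov_in = np.eye(6)              # unit-variance, uncorrelated input pixels
cov_out = A @ cov_in @ A.T      # covariance of the resampled pixels

print("variances            :", np.diag(cov_out))
print("adjacent covariances :", np.diag(cov_out, k=1))
```

The off-diagonal terms are no longer zero, so the simple sum over independent pixel variances in Eq. 3.4 no longer describes the variance of the extracted flux.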

An additional complication reducing the effectiveness of this solution is resam-

pling noise (Allington-Smith et al., 1989). Whenever spatial features present in

a spectrogram are under-sampled (for instance, an unresolved stellar image), any

resampling introduces oscillatory noise along fixed rows in the resampled image.

This noise arises from the fundamental lack of high spatial order information for

the profile, and increases with the degree of under-sampling. Imagine for example

that all of the flux of an unresolved point source fell within a single pixel along

the slit. A resampling which shifts the spectral image by one-half pixel at some

wavelength must necessarily distribute the flux equally between two adjacent pixels,

even though the starlight may well have fallen near the edge of the original pixel.

Not enough spatial information was available to accurately reconstruct the profile,

and “ringing” will result.
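The single-pixel example above is easy to reproduce. The sketch below shifts a slit column by half a pixel with linear interpolation (an assumed resampling scheme; the function name is hypothetical) and shows the flux being split evenly between two pixels, regardless of where within the original pixel the starlight actually fell.

```python
import numpy as np


def shift_linear(column, shift):
    """Resample a 1-D slit column shifted by `shift` pixels (linear interp)."""
    x = np.arange(column.size, dtype=float)
    return np.interp(x - shift, x, column, left=0.0, right=0.0)


column = np.array([0.0, 0.0, 100.0, 0.0, 0.0])   # all flux in one pixel
shifted = shift_linear(column, 0.5)
print(shifted)
```

The recorded column carries no information about the sub-pixel position of the source, so any two sources landing in the same pixel produce the same resampled result.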

To avoid the ill effects of resampling, Marsh (1989) introduced another layer of

abstraction between the spatial profile fitting and the weight determination of § 3.3.3. In Marsh’s reformulation, the polynomial profile fits of Horne (1986) are performed

not along rows or columns of the spectral image, but along a family of curves tracing

out the spectral order. These three dimensional curves (two dimensions tracing the

spatial position of the order, one dimension sampling the profile estimates) are

interpolated onto the straight detector grid to form a “virtually resampled” profile

image:

$$P_{\lambda j} = \sum_{n=1}^{N} Q_{n\lambda j} G_{n\lambda} \tag{3.15}$$

where Qnλj are the interpolation coefficients which specify the contribution of poly-

nomial n to pixel λj, and Gnλ are the polynomials evaluated along the orders which

sample intrinsic changes in the profile with wavelength (one for each of N positions

along the slit):

$$G_{n\lambda} = \sum_{k=0}^{K} A_{kn} \lambda^k \tag{3.16}$$

Usually N is chosen to be larger than the number of slit pixels (i.e. the spacing

between curves is made less than one pixel), in order to accurately recover the unbinned profile. The polynomials Gnλ do not include any information about and are not

sensitive to order curvature. They sample only the smoothly changing profile eval-

uated along lines parallel to the order, which has arbitrary form. The Gnλ would

be the exact analog of Horne’s row based profile fits (Eq. 3.12), had a hypotheti-

cal detector in which the placement of pixel rows varied smoothly to fall precisely

beneath the curved order been used. The interpolation coefficients Qnλj are fixed

by the chosen interpolation method, the number of curves, and the position of the

orders, and remain constant during the iterative extraction procedure. The polyno-

mial coefficients of the fitted profile in Eq. 3.15 are chosen by minimizing

$$\chi^2 = \sum_{\lambda j} \frac{\left(\tilde{P}_{\lambda j} - P_{\lambda j}\right)^2}{\sigma^2_{\tilde{P}}} \tag{3.17}$$

where the P̃λj are the profile estimates as in Eq. 3.10, the variance of which can be

evaluated in a straightforward way using the individual pixel variances of Eq. 3.8.

This formulation can be used to fit the same noisy profile estimates (Eq. 3.10)

used in standard optimal extraction, and the extra step of interpolating the fitted

profile curves onto the data grid, rather than interpolating the data onto a more

convenient profile fitting grid, defeats the resampling noise. The procedure requires

solving N(K + 1) simultaneous equations, though K, the degree of the polynomials

used to fit the intrinsic profile variation, can usually be kept small, since most of

the variation due to changing order position is removed.
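
As a minimal sketch of how Eqs. 3.15 and 3.16 combine at a single wavelength, the following Python fragment evaluates the polynomials and spreads them onto the pixel grid. It assumes linear interpolation for the coefficients Q, which is only one possible choice of interpolation method, and all function and argument names are hypothetical.

```python
import math

def eval_poly(coeffs, lam):
    """Eq. 3.16: G_n(lambda) = sum_k A_kn * lambda**k, coeffs = [A_0n, ..., A_Kn]."""
    return sum(a * lam ** k for k, a in enumerate(coeffs))

def linear_weights(curve_pos, n_pix):
    """Interpolation coefficients Q_nj at one wavelength: curve n sits at
    fractional slit position curve_pos[n] and is shared linearly between
    the two nearest pixels. (Linear interpolation is an assumption.)"""
    q = [[0.0] * n_pix for _ in curve_pos]
    for n, pos in enumerate(curve_pos):
        j = int(math.floor(pos))
        frac = pos - j
        if 0 <= j < n_pix:
            q[n][j] += 1.0 - frac
        if 0 <= j + 1 < n_pix:
            q[n][j + 1] += frac
    return q

def profile_row(poly_coeffs, curve_pos, lam, n_pix):
    """Eq. 3.15: P_j = sum_n Q_nj * G_n(lambda) at a single wavelength."""
    q = linear_weights(curve_pos, n_pix)
    g = [eval_poly(c, lam) for c in poly_coeffs]
    return [sum(q[n][j] * g[n] for n in range(len(g))) for j in range(n_pix)]

# Two curves half a pixel apart, each with a constant (K = 0) polynomial.
print(profile_row([[1.0], [2.0]], [0.5, 1.0], 10.0, 3))  # [0.5, 2.5, 0.0]
```

In a real extraction the curve positions would themselves follow the order with wavelength, while the weights stay fixed throughout the iterative fit, as described above.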

3.3.5 Line Tilt and Curvature

Often, spectral orders aren’t just curved or tilted, but the slit image itself is not

parallel to any detector axis. As illustrated in the schematic of Fig. 3.4, and empha-

sized in the Score atmospheric spectrum in Fig. 3.2, this line tilt can vary from

order to order and within individual orders. In general, lines can also be signifi-

cantly curved.2 Resampling the order data to straighten these lines (perhaps at the

same time the order itself is straightened), introduces the same correlation and noise

issues discussed in the previous section.

What is needed is simply an analogous algorithm which considers the line shape,

the pixel variances, and the profile estimates to generate a set of weights at each

wavelength, wλij for all pixels in the image Dij. The coordinate λ is no longer

identified directly with the detector column i, as it has been. As illustrated in

Fig. 3.4, it becomes a discrete parameter which follows the order. In principle, the

spacing of the parameter λ along the order is arbitrary, but the overall resolution of

the instrument is not increased by line tilt. A tilt gradient across the order implies

that one end of the slit offers effectively higher resolution than the other, similar

to the optical effect of a wedge shaped slit, but when taken together they average

out to the original, untilted resolution. A λ spacing significantly smaller than the

column spacing (in spectra with slit image more nearly aligned to columns), though

harmless, is unwarranted. Similar to the virtual resampling procedure outlined

in § 3.3.4, in which the profile was recovered by slightly oversampling the family of polynomials, we can simultaneously oversample in both the slit and dispersion

directions.

The two dimensional weights wλij form a family of surfaces (one at each pa-

rameterized wavelength) which are typically local, spanning at most a few detector

columns. Similarly, two dimensional profile surfaces (as opposed to the profile functions

of § 3.3.4) can be developed which locally determine the fraction of the light at a single wavelength which fell into a given pixel. Both of these constructions

require accurate knowledge of the distortion of the slit image, which can be quan-

tified as the function which maps a straight slit, oriented along one of the detector

axes, to the measured slit shape (curved, tilted, or otherwise). In practice a spatially

homogeneous source with sharp spectral lines (astrophysical or in the lab) is

used to measure slit image distortions, which are taken as unchanging inputs to the

extraction process.

If the slit shape and orientation were fixed along the order, a single such mapping

function would suffice. For the general case of a non-constant slit distortion, a family

of such functions, Tλ(x, y) can be constructed from measurement, and formulated as

Tλij, i.e. defined on the physical pixel grid, and not the oversampled mesh which overlays it.

2 This type of optical line distortion is intrinsic to the instrument, and should not be confused with the physical line shape changes which can result from spatial variations of a line’s central wavelength, as for, e.g., measurements of a galactic rotation curve.

The practical details of using a measured set of Tλij to develop profile surfaces

Pλij can be approached in a variety of ways, as long as care is taken to ensure each

pixel contributes only its included flux to the various profile surfaces at different

wavelengths which overlap there. The locality of Tλij and Pλij, which typically will

span at most three detector columns (bounded by the slit length), can be used to

simplify this calculation enormously.

The flux estimates analogous to those of Eq. 3.3 are no longer based on the

profile and data along just a single dimension (i.e. the slit direction). The individual

estimates are instead given by

$$\tilde{f}_{\lambda ij} = \frac{D_{ij}\, T_{\lambda ij}}{P_{\lambda ij}} \tag{3.18}$$

At a given wavelength there are as many independent flux estimates as there are

non-zero entries in the slit curvature map, and a given pixel in the spectral image

will be used in constructing more than one flux estimate. To ensure no flux biasing,

we require

$$\sum_{\lambda} T_{\lambda ij} = 1 \tag{3.19}$$

for all (i, j), preventing pixels from over-contributing to the final flux.
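
The constraint of Eq. 3.19 and the estimates of Eq. 3.18 can be illustrated with a small Python sketch. Enforcing the constraint by a simple per-pixel renormalization is an assumption made here for illustration, and the nested-list layout `T[lam][i][j]` and the function names are hypothetical.

```python
def normalize_tilt_map(tilt):
    """Rescale T[lam][i][j] so that the sum over lam equals 1 at every
    pixel touched by at least one wavelength (Eq. 3.19)."""
    n_lam, n_i, n_j = len(tilt), len(tilt[0]), len(tilt[0][0])
    out = [[[0.0] * n_j for _ in range(n_i)] for _ in range(n_lam)]
    for i in range(n_i):
        for j in range(n_j):
            total = sum(tilt[l][i][j] for l in range(n_lam))
            if total > 0.0:
                for l in range(n_lam):
                    out[l][i][j] = tilt[l][i][j] / total
    return out

def flux_estimates(data, tilt, profile, lam):
    """Eq. 3.18: one estimate D_ij * T_lij / P_lij for each pixel with
    non-zero tilt-map and profile entries at wavelength index lam."""
    est = {}
    for i, row in enumerate(data):
        for j, d in enumerate(row):
            t, p = tilt[lam][i][j], profile[lam][i][j]
            if t > 0.0 and p > 0.0:
                est[(i, j)] = d * t / p
    return est

# Two wavelengths sharing pixel (0, 0); pixel (0, 1) belongs to the second only.
tilt = normalize_tilt_map([[[2.0, 0.0]], [[2.0, 1.0]]])
profile = [[[0.25, 0.5]], [[0.5, 0.5]]]
print(flux_estimates([[10.0, 4.0]], tilt, profile, 0))  # {(0, 0): 20.0}
```

Pixel (0, 0) contributes half of its counts to each wavelength, so it cannot over-contribute to the final flux.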

In practice, using the tilt map Tλij to move between the two dimensional family-

of-surfaces formulation, and the one-dimensional representation of that family at a

given position along the slit allows the machinery of profile determination from the

previous section to be employed, so long as the various contributing components at

a single fit position are kept track of.

3.4 Scorex

The unusual properties of Score mid-infrared spectral data (see Chapter 2) mo-

tivated the development of a specifically tailored reduction package. Unlike the

optical CCD spectra for which the majority of cross-dispersed spectral reduction

techniques were designed, these data are characterized by very short slit length

(4–5 pixels), non-negligible order cross-talk (overlap) at shorter wavelengths, and

extreme background-dominated noise characteristics. Sky data are recorded at the

extremities of the slit in optical spectrography, whereas the extremely high back-

ground from the sky dominates Score data, and must be removed dynamically by

rapid chopping on and off the source. The strong sky signal is also the dominant

noise contributor in these data, and is in general far larger in magnitude than either

read noise or source photon noise — the dominant noise components of optical spec-

tra. Score’s relatively short slit and modest detector size, coupled with the larger

slit diffraction pattern at these wavelengths, produce more order cross-talk than in

optical spectrographs. The application of optimal extraction techniques to Score

data therefore required several modifications to account for these major differences

in the character of the recorded spectrum. The result is Scorex — a set of IDL3

routines which implement reduction and extraction of Score data.

3.4.1 Calibration

Calibrating Score data involves measuring the positions of the orders, assigning

wavelengths to positions within each order and aligning the wavelengths between

orders, measuring line tilt, and developing an efficiency map. Calibration need be

performed only after new instrumental parameters are encountered. All calibration

steps are performed using the results of the prior steps. Wavelength stability to

better than one quarter pixel was observed, though alignment differences between

opening and re-mounting the instrument did contribute to overall shifts of the order

positions on the detector by up to one pixel.

Individual calibration data are saved together as scoresets – standard templates

which contain all information necessary to reduce a given observation. To reduce

a given dataset, an individual scoreset is used to load all the relevant calibration

parameters.

Order Positioning

The initial step of calibration involves shifting, stretching, and rotating a conven-

tional ray-trace map to accurately overlay the spectrum. The ray-trace defines the

theoretical expectation of the location on the detector array of each wavelength in all

orders, and matches the realized echellogram reasonably well. To quantify the devi-

ation of order positions from this expectation, a test spectrum of a bright blackbody

source filling the slit is used. Three parameters are allowed to vary within the fit: an

overall shift of the order pattern in the cross-disperser direction, a linear stretch of

3 IDL, The Interactive Data Language, is a registered trademark of Research Systems, Inc. (now Kodak).

the spacing between orders, and a rotation of the entire order pattern on the array.

An initial estimate of each parameter is made automatically by examining averaged

column slices for extrema. The three parameters of the fit are then varied with a

conjugate gradient technique to maximize a merit function which gives weight both

to the total flux underlying the map (proportional to the integrated efficiency of

the underlying spectral region), and to the total number of pixels it contains. This

technique allows accurate and reproducible mapping of the differential curvature of

the various orders, expressed as slight deviations from the order positioning of the

theoretical ray-trace.
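
The three-parameter transformation of the ray-trace map can be sketched as follows, in Python rather than the IDL of Scorex. The sketch covers only the geometric transform, not the merit function or the conjugate gradient search; treating the second coordinate as the cross-disperser axis, the order of operations, and the name `transform_raytrace` are all assumptions.

```python
import math

def transform_raytrace(points, shift, stretch, angle_deg, center=(0.0, 0.0)):
    """Apply the three fit parameters to ray-trace positions (x, y):
    a linear stretch of the order spacing along y (taken here as the
    cross-disperser axis), a shift along y, then a rotation about `center`."""
    cx, cy = center
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    out = []
    for x, y in points:
        y2 = cy + stretch * (y - cy) + shift   # stretch, then shift
        dx, dy = x - cx, y2 - cy
        out.append((cx + c * dx - s * dy, cy + s * dx + c * dy))
    return out

# Stretch the order spacing by 2 and shift by one pixel, no rotation.
print(transform_raytrace([(1.0, 0.0), (0.0, 2.0)], 1.0, 2.0, 0.0))
```

A fit would evaluate a merit function on the transformed map and adjust `shift`, `stretch`, and `angle_deg` until it is maximized.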

Wavelength Calibration

Wavelength calibration is performed similarly to the order positioning, using the ray-

trace wavelength predictions, transformed along with the physical order positions

in the previous step, as a starting point. The wavelength calibration dataset used is

either an NH3 absorption spectrum (Fig. 3.5a) obtained in the lab from a ∼360 K blackbody source absorbed through ammonia vapors, or a sky emission spectrum,

as shown in Fig. 3.2, created from the raw, undifferenced sky frames of observed

sources. Both have many unresolved lines available, though sky emission data are

used when possible, since they offer more complete line coverage, most notably

at the longest wavelengths observed. The comparison reference spectra used were

either FTIR NH3 spectra from the EPA’s Emission Measurement Center, convolved

with the instrument profile to a resolution R ∼ 800, or atmospheric transmission models created at similar resolution with the atmospheric modeling code ATRAN (Lord,

1992). Each order of the calibration spectra is individually reduced and de-tilted

by a local rotation-based interpolation.4 Each order is then extracted by simply

summing along the slit direction.

Since Score spectral orders substantially overlap each other in wavelength, with

most wavelengths present in two orders, and some wavelengths present in three, the

first step of wavelength calibration involves aligning the individual orders to each

other.

This is achieved by shifting the predicted wavelength on each order with re-

spect to its longer wavelength neighbor by a constant amount, determined by fitted

measurements of line centers of the same spectral line appearing in adjacent orders.
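
The inter-order alignment amounts to estimating one constant per order. A minimal sketch, assuming the constant is taken as the unweighted mean difference of matched line centers (the text does not say how the measurements are combined) and using a hypothetical function name:

```python
def order_offset(centers_this, centers_longer):
    """Constant wavelength offset to add to this order so that lines shared
    with its longer-wavelength neighbor coincide (unweighted mean)."""
    if len(centers_this) != len(centers_longer) or not centers_this:
        raise ValueError("need matched, non-empty line lists")
    diffs = [b - a for a, b in zip(centers_this, centers_longer)]
    return sum(diffs) / len(diffs)

# Two lines seen in both orders, fitted centers in microns.
offset = order_offset([10.00, 10.50], [10.02, 10.52])
print(round(offset, 4))  # 0.02
```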

4 Since line tilt measurement occurs after wavelength calibration, the tilt fit from a prior calibration is used initially, to bootstrap up to an accurate measurement.

Figure 3.5: a) An inverted ammonia absorption spectrum obtained in the lab as the difference between a 350 K blackbody source with and without NH3 vapors interposed. b) The blackbody spectrum used in the subtraction, and also to create a map of combined optical and electronic throughput efficiency.

After the orders have been aligned, the overall wavelength calibration is under-

taken. A variety of NH3 or atmospheric lines are fit, along with the equivalent

line from the reference spectra. Line blends or otherwise asymmetrical lines were

avoided. The deviation between the expected and measured line center wavelengths

is fit with a straight line (Fig. 3.6b). The fractional deviation from the prediction

was approximately δλ/λ ∼ .005 for all calibrations.
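
The straight-line fit to the wavelength deviations can be sketched with an ordinary least-squares fit. The sketch assumes an unweighted fit and the sign convention δλ = measured minus reference; both are assumptions, as are the function names.

```python
def fit_line(x, y):
    """Ordinary least-squares straight line y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def correct_wavelength(lam, a, b):
    """Remove the fitted deviation (measured minus reference) from a
    measured wavelength."""
    return lam - (a + b * lam)

# Synthetic deviations lying exactly on a line, wavelengths in microns.
measured = [8.0, 10.0, 12.0]
deviation = [0.001 * m - 0.005 for m in measured]
a, b = fit_line(measured, deviation)
print(round(correct_wavelength(10.0, a, b), 6))  # 9.995
```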

Line Tilt

The line tilt evident in Fig. 3.2 must be removed, to avoid suffering resolution

degradation at long wavelengths. Using the results of the order and wavelength calibration,

individual line tilts are measured by fitting centroid positions along each

individual row in which some portion of the line appears. Using four such measure-

ments, an angle of tilt from vertical is derived, as shown in Fig. 3.6a. This line tilt is

fit as a function of wavelength. Although not motivated by optical considerations,

which might instead specify fitting a center of deviation with spectral lines falling

tangent to circles centered on the fitted point, the magnitude of the errors in the

measured tilt angle imply a straightforward wavelength fit is as accurate as a physi-

cally more realistic, but more complex technique. This method is also motivated by

the incomplete explanation of the line tilt as a purely instrumental optical effect.5
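
The tilt measurement for a single line can be sketched as a least-squares slope through the row centroids. This is an illustrative Python fragment with a hypothetical function name; it assumes the centroid of the line has already been fitted in each row, and that the angle is measured from the detector column direction.

```python
import math

def tilt_angle_deg(rows, centroids):
    """Tilt of a spectral line from the detector column direction, from the
    least-squares slope of centroid column versus row index."""
    n = len(rows)
    mr, mc = sum(rows) / n, sum(centroids) / n
    srr = sum((r - mr) ** 2 for r in rows)
    src = sum((r - mr) * (c - mc) for r, c in zip(rows, centroids))
    return math.degrees(math.atan(src / srr))

# Four rows, with the line center drifting 0.1 column per row.
angle = tilt_angle_deg([0, 1, 2, 3], [5.0, 5.1, 5.2, 5.3])
print(round(angle, 2))  # 5.71
```

The angles measured for many lines would then be fit as a function of wavelength, as described above.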

Efficiency Maps

Creating flat fields for use in reduction of infrared spectra is a difficult proposition.

Finding a spectrally flat source at thermal wavelengths is alone challenging, but

H2O and CO2 absorption compound the problem. Often in near infrared observa-

tions, the sky itself is utilized as a blackbody source for flat-fielding, constructed by

median-filtering a stack of exposures, to remove the source contribution. This tech-

nique, however, relies on the atmosphere serving as a relatively stable background,

which is not often the case at longer wavelengths. Because most components within

the instrument radiate so efficiently in the MIR, the removal of scattered light is of

paramount importance. A non-negligible contribution to the Score background re-

sults from the reflection of the main instrument beam off the CdTe entrance window,

5 Though curved or tilted slits are a well-studied consequence of cross-dispersed and other high angular dispersion spectrographs (see, e.g., Schroeder, 1987, p. 261), and tilts of this magnitude are not difficult to obtain, the change in line tilt over this wavelength range is incompatible with a straightforward optical explanation.

Figure 3.6: a) An example line tilt calibration, with linear and quadratic fits overlaid. The angle θtilt quantifies the line tilt with respect to the detector columns, as a function of measured wavelength. b) The wavelength calibration, constructed by fitting the deviation of the measured from the reference wavelengths (δλ), as a function of the measured wavelength of the atmospheric calibration line. Notice the total difference over the full wavelength range and all orders is ∼0.01 µm, or just under one pixel at Score’s resolution and sampling.

producing radiation characterized by a poorly quantified temperature – literally, the

result of the instrument looking back at itself and the interior of the cryogenic de-

war, which is not of a uniform temperature. A technique was therefore developed to

eliminate this and other constant contributions to the scattered background. The

spectra of both hot (∼350 K, see Fig. 3.5b) and room temperature blackbody sources filling the beam were recorded, prior to each mounting of the instrument on the telescope.

The stacked difference between five or more pairs of these spectral images

produced a frame free from poorly characterized background and scattered light.

By dividing such a frame by the predicted relative fluxes of the two blackbodies, an

efficiency map, Eλj is created, which characterizes the optical efficiencies (grating

blaze efficiency, slit throughput, filter responses, detector substrate/filter banding,

etc.) combined with the detector response and electronic readout efficiency. Though

not strictly a flat field, Eλj is very useful for normalizing and combining orders.
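
A per-pixel sketch of the efficiency calculation follows. It reads "predicted relative fluxes" as the difference of two Planck functions, and takes 295 K for the room temperature source; both are assumptions, as are the function names. Any constant background common to the two frames cancels in the difference.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
KB = 1.380649e-23    # Boltzmann constant, J/K

def planck(lam_um, temp_k):
    """Blackbody spectral radiance B_lambda in W m^-2 sr^-1 m^-1."""
    lam = lam_um * 1e-6
    return (2.0 * H * C ** 2 / lam ** 5) / math.expm1(H * C / (lam * KB * temp_k))

def efficiency(hot_counts, cold_counts, lam_um, t_hot=350.0, t_cold=295.0):
    """Relative efficiency at one pixel: measured hot-minus-cold signal
    divided by the predicted blackbody flux difference."""
    predicted = planck(lam_um, t_hot) - planck(lam_um, t_cold)
    return (hot_counts - cold_counts) / predicted

# A constant scattered-light offset of 7 counts drops out of the result.
print(efficiency(107.0, 47.0, 10.0) == efficiency(100.0, 40.0, 10.0))  # True
```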

3.4.2 Noise Model

Each individual spectral image which is reduced by Scorex is of the form (array

subscripts omitted)

$$D = \frac{\sum_{k=1}^{2N_{chop}} s_k a_k}{\sum_{k=1}^{N_{eff}} E_k} \tag{3.20}$$

where ak are the individual chop frames with one frame for each chop+nod position

of an observation, and each sk is the sign of summation required to difference ad-

jacent chop frames, and remove the background gradient to first order (for typical

beam-switched observation in which successive nods share one central chop position,

sk = [+, −, −, +, −, +, +, −, . . .]).

A critical component of the noise model is the factor which converts between

recorded digital units and photoelectrons. This factor, Q, depends on the individual

pixel capacitance, the voltage range and digital output of the analog-digital convert-

ers, and any electronic offsets involved. Table 3.1 lists the contributing elements and

an explicit calculation of this factor for the most typical Score parameters.
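
The conversion factor can be checked numerically from the values of Table 3.1. The grouping Q = C ΔV / (U G N_A/D) used below is inferred because it reproduces the tabulated value of 63.34; the function and argument names are hypothetical.

```python
def electrons_per_adu(cap_e_per_mv, delta_v, n_adc, shift_factor, gain):
    """Q = C * dV / (U * G * N_A/D), with C in electrons per mV and dV in volts."""
    cap_e_per_v = cap_e_per_mv * 1000.0   # electrons per volt
    return cap_e_per_v * delta_v / (shift_factor * gain * n_adc)

# Typical Score parameters from Table 3.1.
q = electrons_per_adu(8302.1, 20.0, 2 ** 14, 4, 40)
print(round(q, 2))  # 63.34
```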

The error due to shot noise in photon arrival, or, equivalently, photoelectron

production, can be calculated for any quantity A in digital units, by noting it implies

the detection of QA photoelectrons. This quantity has expected variance QA (a

characteristic of the underlying Poisson distribution), or a noise estimate of $\sqrt{QA}$ photoelectrons, which is $\sqrt{A/Q}$ digital units. In practice, to accommodate noise in the readout electronics, we write the error as a combination of shot noise and electronic noise:

$$\sigma_A^2 = \sigma_{A_{ph}}^2 + \sigma_{A_{elec}}^2 = \frac{A - A_e}{Q} + \frac{\sigma_e^2}{Q^2} \tag{3.21}$$

where Ae is any background source offset analogous to dark current, and $\sigma_e^2$ is the electronic noise estimate for a given observation, expressed in electrons.

Table 3.1: Digital Unit to Photoelectron Conversion Factor.

Quantity   Description                           Value
C          BIB Well Capacitance                  1.33 pF = 8302.1 e−/mV
ΔV         Input A/D Voltage Range               20 V
N_A/D      A/D resolution elements               2^14
U          Bit shift right in hardware coadder   4
G          Pre-amplifier gain (typical)          40
Q          photo-e−/ADU factor                   C ΔV / (U G N_A/D) = 63.34
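
As a quick numerical illustration of Eq. 3.21, the following sketch evaluates the variance of a quantity in digital units as the sum of a shot-noise term and an electronic term. The function name and the example values are illustrative only.

```python
def variance_adu(a, q, offset=0.0, sigma_e=0.0):
    """Eq. 3.21: variance (in ADU^2) of a quantity `a` in digital units,
    as the shot-noise term (a - offset)/q plus the electronic term
    sigma_e^2/q^2, with sigma_e expressed in electrons."""
    return (a - offset) / q + (sigma_e / q) ** 2

# With Q = 63.34 e-/ADU, 6334 ADU of signal alone gives a variance of 100 ADU^2.
print(round(variance_adu(6334.0, 63.34), 6))  # 100.0
```

Because the variance scales as A/Q rather than A, a background-dominated frame has far smaller noise in digital units than its raw counts would suggest.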

Using Eqs. 3.20 and 3.21, the error estimate for the spectral image D (dropping

indices) is

$$\sigma_D^2 = \frac{1}{E^2}\left[\sum_{k=1}^{2N_{chop}} \frac{a_k - e_k}{Q} + \frac{2N_{chop}\,\sigma_e^2}{Q^2} + D^2\sigma_E^2\right] \tag{3.22}$$

where E is the total efficiency map ($\sum_{k=1}^{N_{eff}} E_k$) constructed from blackbody pairs,

as described at the end of § 3.4.1, and the quantity ek is the analog of dark current: an electronic offset which depends on chop frequency and bandwidth, and in

which the electronic noise, $\sigma_e^2$, is formed. The first term in Eq. 3.22, including the

undifferenced total of all chop frames, corrected for background offset, represents

the large contribution of the sky to the noi