
Page 1: Time Series Analysis of Data With Gaps - Rikenmaxi.riken.jp/FirstYear/proceedings/pdf/Scargle_Jeffrey_D.-bin... · Time Series Analysis of Data With Gaps ... partial list of time

Time Series Analysis of Data With Gaps – Correlation Functions of MAXI Time Series –

Jeffrey D. Scargle,1

1 Space Science Division, NASA Ames Research Center [email protected]

April 12, 2011

Abstract

All standard time series analysis functions can be computed for data of any type and with arbitrary sampling in time, by using the concept underlying the Edelson and Krolik algorithm for correlation functions. This time-domain approach has the advantage that the effect of the sampling is multiplicative and can therefore be readily corrected for, as illustrated here for a selected set of variable MAXI sources.

Key words: time series, gaps, correlation function

1 Introduction

One of the major scientific goals of MAXI, “the Monitor of All-sky X-ray Image,” is to elucidate variability of a number of active objects in the broad energy band from 0.5 to 30 keV. A partial list of time series analysis methods that turn time-sequences of flux measurements into useful scientific quantities includes: correlation functions, Fourier and wavelet spectra (both amplitude and phase), and structure functions. All can be computed in both auto- forms for single time series and cross- forms for two or more time series. All can be computed in a time-resolved way in order to study processes that are not stationary, by analyzing the data in a sliding window. For example, the dynamic power spectrum tracks changes over time of the harmonic content of the time series. Such time-frequency and time-scale distributions are wonderfully explained in [Flandrin 1999]; see also [Galleani, Cohen, Nelson, and Scargle (2001)].

2 Representing MAXI Light Curves: Bayesian Blocks

Many astronomical time series are not evenly spaced, with samples occurring at more or less random times. Others, including those generated by MAXI, are evenly spaced in time but with occasional gaps. Figure 1 shows the distribution of the sizes of the gaps in the data for a typical MAXI source. It is commonly thought that such gaps effectively defeat most time series analysis methods. However, for data with any sample times (evenly spaced, with or without gaps, or randomly spaced) very simple methods effectively implement all of the above-mentioned analysis tools, as well as an optimal piecewise-constant representation of the time series called Bayesian Blocks [Scargle 1998, Scargle, Norris, Jackson and Chiang 2011]. Figure 2 shows the results of applying this algorithm to the 16 MAXI light curves studied here.

[Figure 1 plot: "Distribution of Interval Sizes: GRS 1915+105 (2885 orbits)". Horizontal axis: Interval (number of MAXI orbits), 0 to 30. Vertical axes: Number of Intervals (0 to 2500) and log10 Number (0 to 3.5).]

Figure 1: Number of cases vs. observation interval. Most observations are at successive orbits, but a significant number of intervals of n > 1 correspond to gaps of length n-1 orbits.
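The interval counts plotted in Figure 1 are simple to obtain from a list of observation times. The following is a minimal sketch, assuming the observations are labeled by integer orbit numbers; the function name `interval_histogram` and the toy input are hypothetical and are not the MAXI data format.

```python
from collections import Counter


def interval_histogram(orbit_numbers):
    """Count the intervals between successive observations, in orbits.

    orbit_numbers: sorted integer orbit indices at which the source was
    observed (an assumed input format). An interval of n orbits means
    that n - 1 orbits were missed.
    """
    intervals = [b - a for a, b in zip(orbit_numbers, orbit_numbers[1:])]
    return Counter(intervals)


# Toy example: orbits 0..4 observed, two orbits missed, then 7 and 8,
# one orbit missed, then 10.
orbits = [0, 1, 2, 3, 4, 7, 8, 10]
hist = interval_histogram(orbits)

# Convert intervals n > 1 into gap lengths n - 1.
gaps = {n - 1: count for n, count in hist.items() if n > 1}
```

Here `hist` maps each interval size to the number of times it occurs, and `gaps` maps each gap length to its count, which is the quantity the Figure 1 caption describes.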

3 Correlations: Edelson and Krolik Algorithm

For the other analysis tools the starting point is an ingenious if straightforward algorithm [Edelson and Krolik 1988] for computing correlation functions. One simply averages the products of those pairs of measured values whose time difference, called the lag τ = t_m − t_n, falls within a given τ-bin. For all measured pairs (x_n, y_m) define

$$C_{nm} = \frac{x_n y_m}{\sqrt{(\sigma_x^2 - e_x^2)(\sigma_y^2 - e_y^2)}} \tag{1}$$

where σ_x is the standard deviation of the X-observations, e_x is the X-measurement error, and similarly for Y. The estimate of the correlation function is then

$$R_{xy}(\tau) = \frac{1}{N_\tau} \sum C_{nm} \tag{2}$$

where the sum is over the pairs, N_τ in number, for which t_m − t_n lies in the corresponding τ-bin. The factor 1/N_τ corrects for the sampling: N_τ has a roughly triangular shape, so 1/N_τ plays the role of the 1/(T − τ) factor in the continuous case. Note that N_τ is zero if no data pairs lie in a given bin, and the correlation function is undefined there. The power spectrum can be computed via the Fourier transform of the correlation function, in which case such undefined points can be filled in by interpolation. As in the transition from continuous to discretely sampled data [Priestly 1981], the transition from evenly to unevenly sampled data is thus straightforward, involving slight modification of summations and of the frequencies at which spectral quantities can be evaluated.

Figure 3 shows the autocorrelation functions of 16 MAXI sources obtained in this way. Figure 4 zooms in on the small-lag region, important because it contains information on variability at short time scales and on the observational errors (the zero-lag spike measures the corresponding error variance).
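Equations (1) and (2) translate almost directly into code. The following is a minimal sketch and not the code used to produce the figures. The function name `discrete_correlation`, the nearest-bin assignment of lags, and the use of one representative measurement error per series are assumptions made here for illustration. The series are mean-subtracted before the products are formed, as in the original Edelson and Krolik formulation.

```python
import math


def discrete_correlation(t_x, x, t_y, y, bin_width, max_lag, e_x=0.0, e_y=0.0):
    """Binned correlation estimate in the spirit of Eqs. (1) and (2).

    Returns (lags, corr, counts). corr[k] is None where no pair of samples
    falls in bin k, i.e. where N_tau = 0 and the estimate is undefined.
    """
    # Assumption: work with mean-subtracted values.
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]

    # Denominator of Eq. (1): sample variances minus squared errors.
    sx = sum(v * v for v in dx) / len(dx) - e_x ** 2
    sy = sum(v * v for v in dy) / len(dy) - e_y ** 2
    if sx <= 0.0 or sy <= 0.0:
        raise ValueError("variance must exceed the squared measurement error")
    norm = math.sqrt(sx * sy)

    # Bins are centered on ..., -bin_width, 0, +bin_width, ...
    half = int(round(max_lag / bin_width))
    n_bins = 2 * half + 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins  # N_tau for each bin

    for t_n, x_n in zip(t_x, dx):
        for t_m, y_m in zip(t_y, dy):
            # Lag tau = t_m - t_n, assigned to the nearest bin center.
            k = math.floor((t_m - t_n) / bin_width + 0.5) + half
            if 0 <= k < n_bins:
                sums[k] += x_n * y_m / norm  # C_nm of Eq. (1)
                counts[k] += 1

    lags = [(k - half) * bin_width for k in range(n_bins)]
    corr = [s / c if c > 0 else None for s, c in zip(sums, counts)]  # Eq. (2)
    return lags, corr, counts


# Autocorrelation of a sinusoid (period 20 samples) with a 30-sample gap.
times = [t for t in range(200) if not 50 <= t < 80]
flux = [math.sin(2.0 * math.pi * t / 20.0) for t in times]
lags, acf, n_pairs = discrete_correlation(
    times, flux, times, flux, bin_width=1.0, max_lag=30.0
)
```

In this toy case the gap reduces the number of pairs at nonzero lags, but dividing by the pair count in each bin still recovers the periodic structure: the estimate is close to +1 at a lag of one period and close to -1 at half a period. Bins that receive no pairs are returned as `None` rather than zero, so that a later interpolation step can treat them explicitly.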

I am grateful to Tatehiro Mihara and Mutsumi Sugizaki for assistance with this work, and to the NASA Applied Information Systems Research Program for funding.

References

[Edelson and Krolik 1988] “The Discrete Correlation Function: A New Method for Analyzing Unevenly Sampled Variability Data,” ApJ, 333, 646

[Flandrin 1999] Time-Frequency/Time-Scale Analysis, Vol. 10 of the series Wavelet Analysis and Its Applications (Academic Press: London)

[Galleani, Cohen, Nelson, and Scargle (2001)] “Time-Evolution of the Power Spectrum of the Black Hole X-ray Nova XTE J1550-564,” Proceedings of the IEEE - EURASIP Workshop on Nonlinear Signal and Image Processing

[Priestly 1981] Spectral Analysis and Time Series, Academic Press Limited: London; Section 4.8.3.

[Scargle 1982] “Studies in Astronomical Time Series Analysis. II. Statistical Aspects of Spectral Analysis of Unevenly Spaced Data,” ApJ, 263, 835-853

[Scargle 1989] “Studies in Astronomical Time Series Analysis. III. Fourier Transforms, Autocorrelation Functions, and Cross-Correlation Functions of Unevenly Spaced Data,” ApJ, 343, 874-887

[Scargle 1998] “Studies in Astronomical Time Series Analysis. V. Bayesian Blocks, A New Method to Analyze Structure in Photon Counting Data,” ApJ, 504, 405

[Scargle, Norris, Jackson and Chiang 2011] “Studies in Astronomical Time Series Analysis. VI. Bayesian Blocks, Triggers (and Histograms),” in preparation


Figure 2: Bayesian block representation of the MAXI light curves discussed in this paper. The raw data are shown as points, and blue lines and black bars indicate the blocks.


[Figure 3 plot: 16 panels, each with a lag axis running from −5000 to 5000. Panel titles: Aql X−1, Crab, Cyg X−1, Cyg X−2, Cyg X−3, GRS 1915+105, GX 17+2, GX 339−4, Her X−1, MAXI J1659−152, Mrk 421, NGC 4151, Sco X−1, XTE J1650−500, XTE J1752−223, XTE J1946+274.]

Figure 3: This figure shows the complete autocorrelation functions, computed using the Edelson and Krolik algorithm. The red dot indicates the zero-lag point, which includes both the true variance of the signal and that corresponding to the observational errors.


[Figure 4 plot: 16 panels showing the small-lag region. Panel titles and lag ranges: Aql X−1 (−50 to 50), Crab (−50 to 50), Cyg X−1 (−100 to 100), Cyg X−2 (−20 to 20), Cyg X−3 (−10 to 10), GRS 1915+105 (−10 to 10), GX 17+2 (−20 to 20), GX 339−4 (−5 to 5), Her X−1 (−5 to 5), MAXI J1659−152 (−5 to 5), Mrk 421 (−10 to 10), NGC 4151 (−5 to 5), Sco X−1 (−5 to 5), XTE J1650−500 (−4 to 4), XTE J1752−223 (−50 to 50), XTE J1946+274 (−50 to 50).]

Figure 4: This figure shows the central part of the autocorrelation functions as in Figure 3, to indicate how a smooth fit across the zero-lag point can be used to estimate the observational variance.
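One way to turn the smooth fit across the zero-lag point into a number is sketched below. This is an illustration under stated assumptions and not the procedure used for the figures: the function name `zero_lag_excess` and the choice of an even quadratic, R(τ) ≈ a + b τ², fitted by least squares to the small nonzero lags, are hypothetical choices made here. The excess of the measured zero-lag value over the fitted value at τ = 0 is then taken as the estimate of the error variance, in the units of the autocorrelation function.

```python
def zero_lag_excess(lags, acf, fit_halfwidth):
    """Return (spike, smooth_at_zero) for an autocorrelation function.

    Assumed model: R(tau) ~ a + b * tau**2 for 0 < |tau| <= fit_halfwidth.
    Undefined points (None) are skipped. spike = R(0) - a.
    """
    # Regress R on u = tau**2 using only defined, nonzero small lags.
    pts = [
        (tau * tau, r)
        for tau, r in zip(lags, acf)
        if r is not None and 0 < abs(tau) <= fit_halfwidth
    ]
    if len(pts) < 2:
        raise ValueError("not enough points for the smooth fit")

    n = len(pts)
    mean_u = sum(u for u, _ in pts) / n
    mean_r = sum(r for _, r in pts) / n
    s_uu = sum((u - mean_u) ** 2 for u, _ in pts)
    if s_uu == 0.0:
        raise ValueError("need at least two distinct |tau| values")
    s_ur = sum((u - mean_u) * (r - mean_r) for u, r in pts)

    b = s_ur / s_uu
    a = mean_r - b * mean_u  # smooth autocorrelation extrapolated to tau = 0

    r_zero = next(r for tau, r in zip(lags, acf) if tau == 0)
    return r_zero - a, a


# Synthetic example: a smooth parabola plus a spike of 0.01 at zero lag.
lags = list(range(-5, 6))
acf = [0.04 - 1.0e-5 * tau * tau for tau in lags]
acf[lags.index(0)] += 0.01
spike, smooth_at_zero = zero_lag_excess(lags, acf, fit_halfwidth=5)
```

In this synthetic case the fit recovers the smooth value of 0.04 at zero lag and a spike of 0.01. For real data the fitting range and the functional form would need to be chosen source by source, since the width of the central peak differs widely among the panels of Figure 4.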
