A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice...

23
SAS INSTITUTE INC. FALL 1997 ISSUE 5 A NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics book we noticed that the paired t-test was very rich in opportunities for alternate interpretations. In this article we will review those alternate views and add a new view, to be implemented in Version 4 of JMP. Different Views Here are four ways to view a paired t-test: ¥ Distribution of difference: the paired t-test examines the distribution of the difference of two variables, and tests if the mean of those differences is different from zero. The mean of the differences is the same number as the difference of the means, as in the 2-group t-test, but the variance of the distribution is not the same as if it were two groups. ¥ Restricted regression: We adopted the scatterplot in the Fit Y-by-X (continuous by continuous) as the best place to show the paired t-test graphically. The scatterplot shows the distribution of each variable, A New Angle on Matched Pairs............1 Calculator Corner.................................. 11 Checking the Proportional Hazard Assumption with Kaplan-Meier SurvivalEstimates.................................18 Finding the Area Under a Curve Using JMP and a Trapezoidal Rule.. ...9 JMP Data Discovery Conference.........7 Tips and Techniques.............................21 Just Mooning Around...........................13 In This Issue SAS Institute Acquires Abacus Concepts................................................8 and the correlation between them. A 45 degree line through the origin shows the boundary where the values are equal. The test for the mean difference is a tug of war across that boundary. If you fit a regression line that is constrained to have a slope of 1, the vertical distance between the fitted line and the line through the origin, will be the mean difference, and the test for it is equivalent to the paired t-test ( Figure A left plot). ¥ MANOVA: If you regard both variables as responses in a MANOVA model that has no effects other than an intercept, and then you test a contrast across the two intercepts, this will be equivalent k

Transcript of A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice...

Page 1: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.FALL 1997

ISSUE 5

A NEW ANGLE ON MATCHEDPAIRS

John SallSenior Vice President

SAS Institute Inc.

As we wrote up the paired t-test for theJMP Start Statistics book we noticed thatthe paired t-test was very rich inopportunities for alternateinterpretations. In this article we willreview those alternate views and add anew view, to be implemented inVersion 4 of JMP.

Different Views

Here are four ways to view a pairedt-test:¥ Distribution of difference: the paired

t-test examines the distribution ofthe difference of two variables, andtests if the mean of those differencesis different from zero. The mean ofthe differences is the same numberas the difference of the means, as inthe 2-group t-test, but the variance ofthe distribution is not the same as ifit were two groups.

¥ Restricted regression: We adoptedthe scatterplot in the Fit Y-by-X(continuous by continuous) as thebest place to show the paired t-testgraphically. The scatterplot showsthe distribution of each variable,

A New Angle on Matched Pairs............1

Calculator Corner..................................11

Checking the Proportional HazardAssumption with Kaplan-Meier SurvivalEstimates.................................18

Finding the Area Under a Curve Using JMP and a Trapezoidal Rule.. ...9

JMP Data Discovery Conference.........7

Tips and Techniques.............................21

Just Mooning Around...........................13

In This Issue

SAS Institute Acquires Abacus Concepts................................................8

and the correlation between them.A 45 degree line through the originshows the boundary where thevalues are equal. The test for themean difference is a tug of waracross that boundary. If you fit aregression line that is constrained tohave a slope of 1, the verticaldistance between the fitted line andthe line through the origin, will bethe mean difference, and the test forit is equivalent to the paired t-test(Figure A left plot).

¥ MANOVA: If you regard bothvariables as responses in a MANOVAmodel that has no effects other thanan intercept, and then you test acontrast across the two intercepts,this will be equivalent k

Page 2: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.SPRING 1997ISSUE 5

JMPer Cable -2-

to the paired t-test. The paired t-testis the simplest MANOVA model.

¥ Two-way ANOVA: You stack the data,mapping the values from the twocolumns into one column with twiceas many observations. You now runa two-way model withoutinteraction, where one termidentifies the original column fromwhich the value came, and the otherterm identifies the originalobservation that the value camefrom. You might call them thewithin-subject and between-subjectseffects. The resulting F test on thefirst term is equivalent to the originalpaired t-test. The square root of the Fvalue will match the t-value.

Before and After, Version 3 and Version 4There are at least these four ways to do apaired t-test in version 3 of JMP. Theproblem was that users had a hard timefinding these ways. The preferred waywas in continuous-by-continuous Fit Y-by-X platform, which is the same

platform that does simple regression.With version 4 of JMP, we resolved tofind a more prominent place for thepaired t-test that users could easilyfind, a new platform called ÒMatchedPairs.Ó

In version 3 of JMP, you have to tiltyour head to visualize the distributionof the difference between pairedvalues. With version 4 of JMP, werotate the plot by 45 degrees so that thedifference is now vertical.Figure A shows the before-and-afterpictures. Remember that these are thesame graphs, except for the rotation.Pick out some points and features onthe left, and note that the graph on theright is the same, tilted. The immediatebenefits are:

¥ You will find Matched Pairs on theAnalyze menu, rather than having todig for it inside Fit Y by X.

¥ The new platform scales the axes forX and Y, so that you donÕt have tomanually scale plots to get axes.

Graphical t test for Version 3 (left) and Version 4 (right)

After by Before

Afte

r

Before

Paired Data

Before

After

Mean:(Before+After)/2

Diff

eren

ce:A

fter–

Bef

ore

Page 3: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.FALL 1997

ISSUE 5

-3- JMPer Cable

¥ You donÕt have to turn your head.The distribution of interest is nowvertical. You are testing that itÕsmean is not equal to zero, thehorizontal line across the middle.The mean difference line is the lineabove the middle, with two dotted95% confidence limits shown aboveand below it. The test is significantat 5% if the confidence interval doesnot contain the zero line.

Notice that the variance of thedifference is heavily dependent on thecorrelation between the two variables.You have positively correlated data, itwill tend to be close to the line of fit,

and the variance of the difference,which is the distance away from theline, will be relatively small. If you havenegatively correlated data, it will tendto vary in the other direction away fromthe line of fit, and the variance of thedifference will be relatively large. Onthe new rotated plot, these patterns willnow be on the vertical and horizontaldirections directly, rather than on the 45and Ð45 degree directions.

The New AngleNow we can notice that the new anglehelps unify the different views of theanalysis. A 45-degree negative rotation

y1

y1

y1

y2y2

y2

Illustration of aNegative 45-

DegreeRotation

Figure B

Page 4: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.SPRING 1997ISSUE 5

JMPer Cable -4-

is the same as multiplying the data by acertain ÒorthogonalÓ matrix.The first coordinate of the rotation isthe sum Y1+Y2, which becomes thenew horizontal axis. The secondcoordinate is the difference Y2-Y1,which becomes the new vertical axis. Sothe 45 degree rotation is equivalent tochanging two variables into the sumand difference (see Figure B).Where have we seen this before?If you consider this as a MANOVA, thenthe first coordinate is the SUM responsefunction, and the second coordinate theCONTRAST response function. This is theclassical pattern of a repeated measuresanalysis. The alternate representation ofthat is as a split plot. The between-subjects is the whole plot effect. Thewithin-subjects is the sub plot effect.You can still see a plot of the originaldata by rotating your head 45 degrees.You can see which points are low in Y1and high in Y2 and so on. But the focusis now on sum and difference, and evena relationship between them.There are many possibilities for makingstatements regarding the patterns to bediscovered in the new coordinates. Forthe six following examples (see FigureC), we have:1. No Change. The distribution

vertically is small, and centered atzero. The change from Y1 to Y2 isnot significant. This was the high-positive-correlation pattern of theoriginal scatterplot, and is typical.

2. Shift Up. The Y2 score isconsistently higher than Y1 acrossall subjects.

3. Shift Down. The Y2 score isconsistently lower than Y1 across allsubjects.

4. No average shift, but amplifiedrelationship. Overall the mean is thesame from Y1 to Y2, butindividually the high scores gothigher, and the low scores gotlower.

5. No average shift, but reverserelationship. Overall the mean wasthe same from Y1 to Y2, but the highY1s got low Y2s and vice-versa. Thiswas the high-negative-correlationpattern of the original scatterplotand is unusual.

6. No average shift, but dampedrelationship. Overall the mean wasthe same from Y1 to Y2, but the highscores went down a little, and thelow scores went up a little.

It is interesting that rotation turnscorrelation phenomena into variancephenomena and vice versa.And of course the real value of theanalysis might be to discoverindividuals that donÕt fit the patternsuch as individuals that got betterwhen others got worse.

So this is one more small chapter inthe continuing quest for JMP toprovide illumination and innovationin the discovery and understandingprocess.

Page 5: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.FALL 1997

ISSUE 5

-5- JMPer Cable

Page 6: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.SPRING 1997ISSUE 5

JMPer Cable -6-

Figure C (continued)

Notes: The tilt idea has been done for other situations. You can also tilt for comparing means in groups. Formore information see the following references:

Chambers, Cleveland, Kleiner and Tukey (1983) Graphical Methods for Data Analysis, page 304, DuxburyPress,CA.

Hsu and Peruggia (1994) ÒGraphical Representations of TukeyÕs Multiple Comparison Method,Ó Journal ofComputational and Graphical Statistics,June, v3 p 156.

Congratulations to ASA WinnersThese students were finalists in the 1997 Mu Sigma Rho College Bowl at theAnaheim, California American Statistical Association conference, and eachwon a copy of the professional version of JMP:

A. Oedekoven George CapuanoUniversity of Nebraska University of Nebraska

John Castelloe Tiare StoneUniversity of Iowa Brigham Young University

Page 7: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.FALL 1997

ISSUE 5

-7- JMPer Cable

"without reservation,the clinical trials thatwe did, and theadvances that we madefor the health of people,never would have beenpossible had it not beenfor JMPÓ

JMP DATA DISCOVERYCONFERENCE

Professional Services DivisionSAS Institute Inc.

JMP software users from across thecountry met July 15th-18th at SASInstitute Inc.Õs Training Center in Caryfor the second JMP Data DiscoveryConference. The stimulating, four-dayinteractive training conference gaveattendees the opportunity to cometogether and share ideas about theirexperiences with JMP StatisticalDiscovery Software, and interact one-on-one with John Sall, Senior VicePresident and Co-founder of SASInstitute Inc. and the JMP softwaredevelopment team.

Highlights of the conference included:

¥ a new JMP Statistical DiscoverySoftware training curriculum

¥ Version 4 preview led by John Salland the JMP development team

¥ keynote speakers Drs. Lenore andLeonard Herzenberg, world-renowned geneticists who use JMPsoftware in their research efforts.

During her presentation, Dr. LenoreHerzenberg stated that "withoutreservation, the clinical trials that wedid, and the advances that we madefor the health of people, never wouldhave been possible had it not been forJMP (software)." She also said that"discovery was a very important partof statistics."

Complimenting the technologicalaspects of the conference were aninformal question-and-answer lunchwith John Sall, a SAS campus tour ledby Koka Booth of the Public AffairsDepartment, and an evening dinnerevent which featured a talk about theTriangle by Teresa Tesh, Director ofNorth American Sales Operations.

Mark your calendar for the nextexciting JMP Data DiscoveryConference to be held at SAS Institutein Cary from October 28 to October 31,1997. Bob Stuart, from MotorolaUniversity in the College ofTechnology, will be the InvitedSpeaker. His areas of expertise are SixSigma Quality, Robust Design, Process

Page 8: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.SPRING 1997ISSUE 5

JMPer Cable -8-

Improvement, Optimization and TotalProcess Control. Bob will address theconference on "Integrating StatisticalThinking in the Culture."

For more information about theupcoming conferences or to register,call

919-677-8000 X5005

or send a FAX to

919-677-8225

SAS INSTITUTE ACQUIRES ABACUS CONCEPTSCary, NC - (September26, 1997) SAS Instituteannounced that it hasacquired a market-leadingdesktop statisticalanalysis package for thelife sciences market,StatView¨ software fromAbacus Concepts ofBerkeley, CA.

Abacus Concepts has beena leading developer ofeasy-to-use statisticalsoftware for more than tenyears. It is often used byresearchers who dostatistical analysis as justone part of their jobs suchas scientists, businessanalysts, educators, and soon. First developed forMacintosh computers,

StatView has received

more awards from theMacintosh press than anyother desktop statisticspackage. StatView is nowavailable for bothMacintosh and Windows.

John Sall, cofounder ofSAS Institute and SeniorVice President, said,ÒWeÕve had a long,amicable relationshipwith Abacus. Statviewhas a graphical userinterface that makes theproduct extremely easy touse for researchers whoare not trainedstatisticians. It also hasphenomenal presentationgraphics.Ó

This technologyacquisition complementsSAS Institute's statistical

offerings, SAS¨ soft-ware, for analysts whoneed a tool to use in aclient/serverenvironment andintegrated with anenterprise datawarehouse, and JMP¨

software for desktopdata discovery.

SAS Institute willdistribute StatView withdirect sales from nearly100 offices in 60 countries,through mail order, andinternationally via anetwork of distributors.For more informationabout StatView, pleasevisit the following Website:

<http://www.abacus.com>

Page 9: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.FALL 1997

ISSUE 5

-9- JMPer Cable

FINDING THE AREA UNDER ACURVE USING JMP AND A

TRAPEZOIDAL RULEby Nicole Hill Jones

SAS Institute Inc.Researchers who conduct clinical trialsoften have to measure the concentrationof a new investigational drug in asubjectÕs blood and compare it tobaseline values. To do this, data aremeasured at discrete time points over aspecified interval. The measured datarepresent the bodyÕs reaction asevidenced by an increase or decrease inthe concentration of a particularsubstance. If you give a drug by IVinjection, collect blood samples atvarious times, and measure the plasmaconcentrations of the drug, you mightexpect to see a steady decrease inconcentration as the drug dissolves.Figure A shows data and the plasmaconcentration time curve from D.W.A.Borne (1995).

The following example illustrates howto find the area under the plasmaconcentration time curve. This methoduses a variation of the TrapezoidalRule. The function of the time curve isdivided into segments or periodsbetween each time point, the area ineach segment is calculated and thenthe pieces are added together to obtainthe total area under the curve.

If each time segment is considered atrapezoid, its area is given by thesegment width and the averageconcentration within the segmentwidth. LetÕs define ti as the ith timepoint where time is given in hours,and Ci as the drug concentration at theith time point. So the concentrationbetween the 2nd and 3rd time period,AUC2,3, is computed:

as illustrated in the Figure B.

Plot of Concentration vs. TimeFigure A

Time

C

Page 10: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.SPRING 1997ISSUE 5

JMPer Cable -10-

This works for all segments except thelast, which is the time segment from 10hours (in this example) until the drughas completely dissolved. The areaunder the last segment Cn can becalculated using the elimination rateconstant (kel), which is computed fromthe sample data.

The kel constant is the negative slopeof the relationship of the log ofconcentration and time. This can becomputed in JMP using the standardformula for the slope in a simple linearregression of Y on X:

You do this for the concentrationexample as follows:1) Create a new column in the data

table (call it ln(C)) and use the NaturalLog function in the calculatorTranscendental functions to computethe log of concentration.

2) Create another new column (call itbeta) and use the calculator tocompute the slope parameter of theregression of time on ln(C) as shownbelow.

This calculator formula to compute betauses an assignment function to create atemporary variable for the numerator(XY) and for the denominator (x2) of theslope computation. It then computesand assigns XY/x2 to the result.

Now you can find the total area underthe curve by summing the areas of theindividual segments and the computedfinal segment:

Page 11: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.FALL 1997

ISSUE 5

-11- JMPer Cable

You can reuse thisdata table by savingit as a templateÑjustdelete all the rowsand save the emptytable with itsformulas. To use thetemplate, open itand paste your datainto the time and Ccolumns. You canrename the columns

Figure C Computed Slope Parameter and Total AUC

to fit your data, and the formulas automatically use the new column names andcompute the AUC you are looking for.ReferencesBourne, D.W.A. (1995), ÒA First Course in Pharmacokinetics & BiopharmaceuticsÓ, University of Oklahoma

College of Pharmacy.http://157.142.72.143/gaps/pkbio/Ch04/Ch0403.html

JMP Statistics and Graphics Guide, Version 3.1, (1995), SAS Institute, Inc.

by Michael HechtSAS Institute Inc.

Calculator Corner

{ if (generationj = ÒfatherÓ) and ( family idi = family idj ) heightiheightj

,

otherwise,j=1

i

åSUMMATION SIDE EFFECTSThe formula shown above isdeceptively complex looking. Itspurpose is simply to scan columns ofvalues. When it finds what it is lookingfor it computes a ratio and assigns it tothe column it is building (s/f ratio shownin Figure A).

This table has rows for a fatherfollowed by a row for each of his sons,with a family id and heightmeasurement for each row. Supposeyou want to compute the ratio of eachson to his father (possibly for thepurpose of testing whether height hasincreased over a generation.)

Page 12: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.SPRING 1997ISSUE 5

JMPer Cable -12-

Figure A Diagram of Ratio Computations

family id 10 Rows

4 Cols

generation height s/f ratio

5.906.00

=.

One easy way to do this is to create acolumn (s/f ratio) and use the formulashown at the top of this article. Thisformula makes use of the summationfunction to scan columns of values.Scanning for a value is a logical sideeffect of the summation function whenits argument is a condition rather thanone or more simple subscriptedvariables. The process for finding s/fratio goes like this:

For each row the summation functionbegins at the first row of the table andlooks at the values of generation andfamily id. It continues until it locates therow where the value of generation isÒfather,Ó and the family id matches thecurrent rowÕs family id. It then stops andcomputes the ratio of the current sonÕs

height (heighti) to his fatherÕs height(heightj).

As the formula stands it simplydivides each fatherÕs height into itselfgiving 1, which could be eliminatedby including an if or match conditionalstatement to exclude computing anyvalue for fathers. This would bedesirable if you wanted to comparethe mean ratio to 1 and draw aninference about the change in heightsover the father-son generation.

The next article, Just Mooning Around,uses a more elaborate form of thislogic to compute the ratio of satellitesto their respective parent planets, andplanets to the sun.

Page 13: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.FALL 1997

ISSUE 5

-13- JMPer Cable

JUST MOONING AROUNDby Ann LehmanSAS Institute Inc.

The name of this article, Just MooningAround is the same as an articlepublished by Dr. Isaac Asimov in 1975,in which he conjectures that our moonis not a satellite of the earth, but ratherÒa planet in its own right, movingabout the Sun in careful step with theearth.Ó He supports this conjectureadmirably with a dozen or so pages ofexplanation, formulae, and things tovisualize (if you can). The centralsupport is a computation he called thetug-of-war (TOW) value, which is theratio of the gravitational pull by asatelliteÕs parent, Fp, to thegravitational force between thesatellite and the sun, Fs.

You can picture this ratio (Fp/Fs) as atug of war going on for a satellite withits planet on one side of thegravitational rope and the sun on other.If this ratio is greater than 1 the parentplanet is winning, if less than 1 the sunis winning. You expect all satellites tohave a TOW value greater than 1. If aTOW value is less than 1, the moonwannabe falls away from its parentplanet and into the sun or into its own

orbit around the sun. Dr. Asimovshows that the formula to compute asatelliteÕs TOW value is:

where mp is the mass of the parent, ms

is the mass of the sun, dc is thedistance from the satellite to the sun,and dp is the distance from the parentplanet to its satellite child.

Asimov computed this TOW value forall the planetary satellites for whichdistance and mass data were availableat that timeÑthere were only 22satellites that had both a name andphysical information. (These did notinclude PlutoÕs moon Charon, whichwas not discovered until 1978.) Wewere able to match these computationswithin reasonable limits and alsocompute a TOW for Charon andseveral other newer satellites.

Another factor Asimov used toexamine satellite properties is theRoche limit. The Roche limit was firstdescribed by Edouard Roche in 1848.

Page 14: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.SPRING 1997ISSUE 5

JMPer Cable -14-

It is the closest distance an object cancome to another object without beingpulled apart by tidal forces. If a planetand a moon have identical densities,then the Roche limit is 2.446 times theradius of the planet. The Roche limitfor the Earth is 11,470 miles. If ourMoon ever ventured this close to theearth, it would be pulled apart by tidalforces and the Earth could end uphaving rings. The four gaseous outerplanets do have systems of rings insideof their respective Roche limits. OnJuly 7, 1992, comet Shoemaker-Levy 9broke into 21 pieces when itapproached to within JupiterÕs Rochelimit. On subsequent passes, each ofthe pieces of the comet impactedJupiter.The Roche limit defines the closest amoon can exist to a planet, and theTOW can be used to compute amaximum distance a moon can be fromits parent before it is captured by thesun. For example, MarsÕ Roche limit is5,160 miles above the surface and thedistance defined by its TOW value is15,000 miles. Both Diemos with adistance of 14,569 miles from Mars andPhobos at 5,824 miles fall within thisinterval and can be classified as truesatellites:5,160 <

(both 5,824 and 14,569)< 15,000

Another interesting example isMercury, which has a Roche limit of3,706 and a maximum distance

defined by its TOW value of 2,675,which falls within its Roche limit.This means it is highly improbablefor Mercury to ever have a satellite.

What about Earth and Luna?For starters, the TOW valuecomputed for the Earth and Luna is.455, which implies that Earth islosing its tug of war with the sun tokeep Luna as a satellite. Next, forearth a minimum distance definedby the Roche limit is 9,734 miles, andthe maximum distance defined bythe TOW is 29,000. An earth satelliteshould lie in the small band between9,734 and 29,000 miles from theearthÕs center. But as we all know thedistance of the moon from earth is237,000 miles!

So whatÕs going on here? Accordingto Asimov, if you draw a picture ofthe orbits of the Earth and Moonexactly to scale as they orbit the Sun,you would see that the MoonÕs orbitis everywhere concave toward theSun; it is always falling toward thesun. All the other satellites fall awayfrom the sun through part of theirorbits and are contained by thesuperior pull of their parent. So itseems Luna does not orbit the earthas its satellite, but rather the earthand Luna orbit the sun together as abinary system .

You can find the details of theseobservations and a couple of otherarguments Dr. Asimov uses to

Page 15: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.FALL 1997

ISSUE 5

-15- JMPer Cable

question the cosmic relationshipbetween the earth and its moon in hisarticle.

What about Pluto and Charon?Although Dr. Asimov didnÕt have dataon Pluto and Charon he would havebeen pleased when his neat logicsupported the moonness of Charon;Charon is nearly half the size of Pluto,and its TOW value computes as arespectable 601.66. Further, it is 12,196miles from Pluto, nicely outside PlutoÕsRoche limit of 1,784 miles and insidethe maximum allowable distance of54,619 miles. Nevertheless, astronomerstoday prefer to view Charon and Plutoas a binary system because:¥ They are so close together and so

nearly the same size that they bothorbit the center of mass that liesbetween them, and that center ofmass orbits the sun. Although this istrue for all planets and their moons,the relative sizes of the bodies andthe distances between them placestheir joint center of mass near thecenter of the planet.

¥ Further, they orbit each other withthe same sides permanently facingeach other, which is also like double-planet (binary) systems.

A JMP Look at Moons and PlanetsThis brings us to a JMP view of thesolar system. Exploring the moons dataand plotting the ratio of a child mass toits parent reveals a simple relationshipthat puts both the Earth/Luna and andPluto/Charon pairs into the same

category, which seems to support theastronomers and might give Dr.Asimov food for thought (mosteverything did).ItÕs easy to create a JMP table that hasone row for each moon and eachplanet in the solar system, and usingreadily available information youinclude its mass, giving the top tableshown in Figure A.

Note: In this example mass istransformed by ln(cube root), whichmakes for a more readable plot laterin the discussion.

We want to compare two groups ofmass ratios: the ratios of all planetsand moons to their respective parentbody (call this the Child group) to thegroup containing only the ratios of theplanets to the sun (called the Parentgroup). Note that planetary satellitesall fall only into the Child group andthe sun is only in the Parent group. Butthe planets fall into both categoriesand need to be included in bothgroupsÑas children of the sun andparents to their children.

To compare these two groups of massratios we first stacked the child andparent columns using Tables®Stack.The new stacked column name is body

and the new ID column name is type,as shown in the bottom table inFigure A. There are now a pair ofrows (a child row and a parent row)for each single row in the top table inFigure A. However, the mass rowduplicates the mass of the ÒchildÓ

Page 16: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.SPRING 1997ISSUE 5

JMPer Cable -16-

for each pair. This is fine for all theÒchildÓ rows but is incorrect for theÒparentÓ sun, and also whenever aplanet is in the ÒparentÓ role. So, thecomputation of the correct mass ratiofor each row can be found with a (notfor the faint hearted) calculatorformula, which does the following:¥ For the purpose of the example the

sun is considered its own parentand has mass ratio of one (1).

¥ When body is a planet, compute itsmass ratio to the sun.

¥ When body is a moon, compute themass ratio to its parent planet.

This appears to be a lot to ask of thisdata table, which doesnÕt tell youwhether a body (other than the Sun) isa planet or a moon. However, acalculator formula can figure that outby using the summation operator toscan the values of body and type, andthus find the correct denominator tocompute the ratio column shown in thebottom table of Figure A.

Figure A Table of MassValues (top) andStacked With ComputedMass Ratios (bottom)

The Calculator Cornerarticle in this news-letterexplains how thesummation operatorscans a column to find aspecific value.

The formula used tocompute ratio is shownin Figure C.

49 Rows

4 Cols

ID child parent mass

98 Rows

5 Cols

ID mass type ratiobody

Page 17: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.FALL 1997

ISSUE 5

-17- JMPer Cable

The final step is to plot thechild-parent pairs of ratios.To do this the we usedAnalyze®Fit Y by X with ratio asY and type as X (the one-wayAnova platform). The MatchingVariable option from the Analysispopup menu used the IDcolumn to match the childrento their parents, giving the plotin Figure B.

This plot clearly shows that allthe children look up to theirparents except Charon andLuna.

ReferenceAsimov, Isaac (1975), Of Time, Space, andOther Things, pgs 87-98, Avon Books, New

Figure B Two groups of Mass Ratios

ratio By type

ratio

0.6000

0.7000

0.8000

0.9000

1.0000

child parent

type

Pluto

Sun

Earth

Mars

Jupiter Saturn

UranusNeptune

Pluto

LunaJupiterSaturn

Charon

bodies with satelllites

satellite bodies

Figure C Formula to Compute Mass Ratio

Page 18: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.SPRING 1997ISSUE 5

JMPer Cable -18-

CHECKING THE PROPORTIONALHAZARD ASSUMPTION WITH

KAPLAN-MEIER SURVIVALESTIMATES

by Rolf E. TaffsFood and Drug Administration

The Cox proportional hazards model(Cox, 1972) evaluates treatment,diagnostic, or prognostic factors todetermine the magnitude andsignificance of their effects onpopulation survival or failure time.For each factor (e.g. an exposure levelor a prognostic index) a predictorvariable is used to describe thesubjects according to the factor level towhich each subject or unit belongs.The model can be used in multivariateanalysis to test the hypothesis thatsurvival does not depend on the levelof a treatment or risk factor (Kalbfleishand Prentice, 1980; Christensen, 1987).

Unlike parametric methods, whichcompare regression coefficients underthe constraints of a givenmathematical model, the proportionalhazards model does not require thatthe precise nature of the survivalfunction be known nor that it beconstrained by the assumption of aparticular mathematical form.However, the model assumes that thehazard functions for the levels of agiven factor or treatment areproportional.

This assumption of proportionalitymust be met for the analysis to be

valid. The proportional hazard can bewritten as the ratio of cumulativehazard functions:

LA(t)

LB(t)=eb·d

where, for the factor being analyzed,b is the regression coefficient and dis the difference between subjectsreceiving treatment A or B. Underthe assumption of proportionalhazards, the correspondingproportion between the cumulativeintegrated hazards would be thesame, and plots of the logarithm ofthe cumulative hazardscorresponding to values differing byd should be parallel andapproximately b¥d apart vertically.This allows the assumption to beexamined graphically when a formalstatistical test is unavailable.

The hazard functions needed toperform the graphic comparison canbe calculated from the Kaplan-Meier(product-limit) survival estimatesS(t) obtained from the data accordingto the formula:

L(t) = ±ln(S(t))

The Kaplan-Meier Method in theSurvival platform in JMP generates theproduct-limit survival estimatesneeded to examine the proportionalhazards assumption.The sample data shown in Figure Ashow frequencies (Freq) of time tofailure (FT) for 379 units observed for

Page 19: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.FALL 1997

ISSUE 5

-19- JMPer Cable

a period of seven days. Units that didnot fail during the observation periodare censored. In this example, theassumption of proportionality fortreatments A and B are graphicallyexamined.

Begin by choosing Analyze®Survival. Inthe Kaplan-Meier Method window,select FT as the Time variable, trt(treatment) as the Grouping variable,and the Censor and Freq columns as theCensor and Frequency variables. ClickOK for survival rate calculations and asurvival plot.

Next, select the Save Estimates option inthe $ border popup window to

Figure A Example Survival Data

FT trt censor

4 Cols

15 Rows freq

generate a new data table that lists thecumulative survival probabilities foreach treatment and for the combineddata, as shown in Figure B.Now you overlay the values in eachtreatment group to graphically

compare them. To do this you needfirst to split the table by the columnyou want to plot, log(–log(Survival)).

To split the table, chooseTables®Split Columns and complete theSplit Columns dialog as follows.

Figure B New Data Table With Survival Estimates

log(–log(Surv)) FT trt Survival

11 Cols

23 Rows

• log(–log(Survival)) asthe Split variable,

• trt as Col ID• FT as the Group

variable.Using FT as a Groupvariable causes the splitoperation to look at FTvalues within treatmentlevels and create a rowfor each unique valuefound.

Page 20: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.SPRING 1997ISSUE 5

JMPer Cable -20-

Also, click the Drop All radio button onthe dialog to keep only the variablesneeded for the overlay plot.

When you click OK JMP splits thelog(–log(Survival)) column into threecolumns as defined by the levels ofthe trt variable, and creates the newuntitled table in Figure C. For theoverlay plot, assign X and Y roles asshown.

Now choose Graph®Overlay Plot andclick OK to see the plot in Figure D. Bydefault, the Connect option for theOverlay Plot is in effect and thetreatment groups have the markersidentified in the plot legend.

In Figure D you can visually examinethe relationship between thetreatment levels as defined by thetreatment groups. The group curvesshould be parallel if the proportionalhazards assumption is met. Curves

Figure C Data Table Reshaped to OverlaySurvival Estimates

that intersect or are decidedlynonparallel indicate that theassumption may not be met, in whichcase the proportional hazardsanalysis may be invalid. This exampleshows the curves to be nearly parallel(equidistant vertically at each pointon the FT axis), indicating furtheranalysis using a proportional hazardsmodel is appropriate.

Figure D Comparison of Survival Curves forTwo Groups

-5

-4

-3

-2

-1

0

1

0 1 2 3 4 5 6 7 8

FT

A

B

ReferencesCox, D.R. (1972, Regression Models in Life tables

(with discussion), J.R. Statistical Society B 34:187-220

Kalbfleisch, J.D. and Prentice, R.L. (1980), TheStatistical Analysis of Failure Time Data, NewYork: John Wiley and Sons.

Christensen, E. (1987), Multivariate Survival AnalysisUsing CoxÕs Regression Model, Hepatology7:1346-1358.

Page 21: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.FALL 1997

ISSUE 5

-21- JMPer Cable

AN ALTERNATIVE TO JITTERING

The Jitter option was recently added tothe Outlier Box plot in the

To label points instead of jitteringrequires several steps:

1) First, use Tables®Group/Summary tosummarize the data by the X and Yvariables you are using (height andsex in this example). The values inthe N Rows summary column tellsthe number of rows in the sourcetable with the same sex and height

Distribution platform and to theNominal by Continuous scatterplot inthe Fit Y by X platform. The Jitter optionadds random horizontal jitter so thatyou can see points that overlay on thesame Y value. The horizontaladjustment of points varies from .375 to.625 with a 4*(UniformÐ.5)**5distribution. This is helpful when youare plotting a large number of values.For smaller numbers of points, simplylabeling a point with the number ofvalues it represents is an alternative tojittering.

The following example uses the Heightand sex variables in the BIG CLASSsample data table. Use Analyze®Fit Y byX with height as Y and sex as X to see theleft-hand plot in Figure A.

To see the second plot in Figure A,choose the Jitter option from the Displaypopup menu for that plot. The jitteredpoints indicate that the same heightvalue occurs multiple times (for bothmales and females).

values (Figure B).

Figure B Summary Table With LabelColumn

2) Before doing anything else, useTables®Subset to make a duplicateof the summary table. You do thisto have a copy of the summaryinformation that is not linked to thesource table.

3) Next, duplicate the N Rows column.The easiest way to do that is tohighlight N Rows and copy it to theclipboard. Then create a newcolumn and paste the values into it.We named this new column label(Figure B).

Page 22: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.SPRING 1997ISSUE 5

JMPer Cable -22-

4) Use the modeling type popup menuat the top of the columns to assignroles as follows:¥ height to Y (Y)¥ N Rows to Freq (F)¥ label to Label (L).Now the same values (N Rows andlabel) act as both frequencies andlabels.

5) The last step (almost) is to create acolumn that labels and marks all therows whose value of N Rows is greaterthan 1. To do this create a newcolumn and specify Row State

as its Data Type and Formula as itsData Source. Then use thefollowing calculator formula toassign row state constants:

Ç Èlabel(N Rows > 1), marker(N Rows > 1)

Use the Copy to Row State commandin the popup menu at the top of therow state column to activate therow states, as shown in Figure C.

With this table active selectAnalyze®Fit Y by X to see the right-handtable in Figure A.

Figure A Comparison of default points (left-hand plot) with jittered points (middle plot) andlabeled points (right-hand plot).

heig

ht

50

55

60

65

70

F M

sex

heig

ht

50

55

60

65

70

F M

sex

heig

ht

50

55

60

65

70

F M

sex

234

2

23323

.

Page 23: A NEW ANGLE ON MATCHED In This Issue - JMPA NEW ANGLE ON MATCHED PAIRS John Sall Senior Vice President SAS Institute Inc. As we wrote up the paired t-test for the JMP Start Statistics

SAS INSTITUTE INC.FALL 1997

ISSUE 5

-23- JMPer Cable

EDITORAnn Lehman

CONTRIBUTORSMichael HechtColleen JenkinsNicole Hill JonesAnn LehmanJohn SallRolf E. Taffs

If you have questions or commentsabout JMPer Cable write to

JMPer CableSAS Institute Inc.SAS Campus Drive

Cary, NC 27513

© Copyright 1997 SAS Institute Inc.All rights reserved.JMPer Cable is sent only to JMP userswho are registered with SASInstitute.For more information on JMP, or toorder a copy, contactSAS Institute, JMP Salesphone: 919-677-8000 x 5071FAX: 919-677-8224You can also browse our web site athttp://www.jmpdiscovery.com

SAS, JMPer Cable, and JMP are registeredtrademarks of SAS Institute Inc. Other brandand product names are registered trademarksor trademarks of their respective companies.