Data - depts.washington.edu...Data SourceData 2019_08aug_05 Stata (This file imports the data from...

****************************************************************************************************
****************************************************************************************************
/*
File:   HIVPediatrics_Syntax02_Data merge and analysis.do
Author: Ken Tapia
        University of Washington/Fred Hutch Center For AIDS Research (CFAR) Biometrics Core
Date:   2019_08aug_05
For actual Stata files (rather than .pdf versions), please contact [email protected] .
All data is simulated.
****************************************************************************************************
0) My data folders have this structure:
   Data
      SourceData
         2019_08aug_05
            Stata (This file imports the data from here.)
            SAS
            R
      OutputData (This file writes the data to here.)

This syntax file performs the following steps:
4) Prepare files for incorporation with either of two files:
      (1) Main (1 record/participant),
      (2) Longitudinal (1 record/visit).
   Subsequent steps merge these two files together to create additional variables
   (e.g. 1st pre-HAART viral load), and analysis follows.
5) Append and merge to create the Main and Longitudinal files.
6) Create other analysis variables.
7) Example descriptives, figures, and analyses.
*/
****************************************************************************************************
****************************************************************************************************
clear
cd "C:\Users\ktapia\OneDrive - UW(1)\CFAR\Projects\HIV Pediatrics\Data"
set more off



****************************************************************************************************
****************************************************************************************************
/*4) Prepare files for incorporation with either of two files:
        (1) Main (1 record/participant),
        (2) Longitudinal (1 record/visit).
     Subsequent steps merge these two files together to create additional variables
     (e.g. 1st pre-HAART viral load), and analysis follows.
*/
****************************************************************************************************
****************************************************************************************************
********************************************************************************************
*RANDOMIZATION: PREPARE FOR MERGING WITH MAIN.
********************************************************************************************
use "OutputData\Stata\Randomization0.dta", clear
**********************************************
sort study_id
by study_id : gen indexn= _n
by study_id : gen indexN= _N
tab indexN if ( indexn==1 ), m
drop indexn indexN
**********************************************
codebook randarm_rand
tab randarm_rand, m
**********************************************
rename redcap_event_name redcap_event_name_rand
**********************************************
order study_id
sort study_id
save "OutputData\Stata\Randomization1.dta", replace
use "OutputData\Stata\Randomization1.dta", clear
*Merge with Main file.


********************************************************************************************
*VISITS_ENROLLMENT: PREPARE FOR MERGING WITH MAIN.
********************************************************************************************
use "OutputData\Stata\Visits_Enrollment0.dta", clear
***********************
*Ensure 1 record per person.
* indexN: Tracks the total number of observations per subject (ie the variable after “by”)
* indexn: Tracks from 1 to indexN
* By tabulating indexN once per subject (ie if indexn==1) we can see how many records each ID has.
sort study_id visitdate_en
by study_id : gen indexn= _n
by study_id : gen indexN= _N
tab indexN if ( indexn==1 ), m
list study_id visitdate_en if ( indexN>1 )
drop indexn indexN
**********************************************
rename redcap_event_name redcap_event_name_en
***********************
sort study_id
save "OutputData\Stata\Visits_Enrollment1.dta", replace
use "OutputData\Stata\Visits_Enrollment1.dta", clear
*Merge with Main file.
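***********************
*SKETCH (not part of the original syntax): a shorter way to run the same uniqueness check,
* assuming the built-in commands "duplicates" and "isid" are acceptable in place of the
* indexn/indexN tabulation above.
* duplicates report: Tabulates how many copies of each study_id exist, without stopping the run.
* isid             : Stops with an error unless study_id uniquely identifies the records.
duplicates report study_id
isid study_id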


********************************************************************************************
*VISITS_ENROLLMENT: PREPARE FOR MERGING WITH LONGITUDINAL.
********************************************************************************************
use "OutputData\Stata\Visits_Enrollment0.dta", clear
***********************
*DROP ENROLLMENT-ONLY VARIABLES.
***********************
drop scid_en ageyrs_en sex_en
***********************
*Rename variables so that they match the names in the followup file.
rename *_en *
rename en_indata visits_enroll_indata
***********************
order study_id visitdate
sort study_id visitdate
save "OutputData\Stata\Visits_Enrollment_Longit.dta", replace
use "OutputData\Stata\Visits_Enrollment_Longit.dta", clear
*Append with follow-up visit file, and then merge with Longitudinal file.


********************************************************************************************
*VISITS_FOLLOW-UP: PREPARE FOR MERGING WITH LONGITUDINAL.
********************************************************************************************
use "OutputData\Stata\Visits_Followup0.dta", clear
***********************
*Ensure 1 record per visit date.
* indexN: Tracks the total number of observations per visit date (ie “study_id visitdate”, the variables after “by”)
* indexn: Tracks from 1 to indexN
sort study_id visitdate
by study_id visitdate: gen indexn= _n
by study_id visitdate: gen indexN= _N
tab indexN if ( indexn==1 ), m
list study_id visitdate if ( indexN>1 )
*keep if ( indexn==1 )
drop indexn indexN
***********************
rename *_fu *
gen visits_followup_indata=1
***********************
order study_id visitcode
sort study_id visitcode
save "OutputData\Stata\Visits_Followup1.dta", replace
use "OutputData\Stata\Visits_Followup1.dta", clear
*Merge with Longitudinal file.


********************************************************************************************
*LABS (ENROLLMENT+FOLLOWUP): PREPARE FOR MERGING WITH LONGITUDINAL.
********************************************************************************************
use "OutputData\Stata\Labs0.dta", clear
***********************
*Ensure 1 record per visit date.
* indexN: Tracks the total number of observations per visit date (ie “study_id visitdate_lr”, the variables after “by”)
* indexn: Tracks from 1 to indexN
sort study_id visitdate_lr
by study_id visitdate_lr: gen indexn= _n
by study_id visitdate_lr: gen indexN= _N
tab indexN if ( indexn==1 ), m
list ptid visitdate_lr if ( indexN>1 )
*keep if ( indexn==1 )
drop indexn indexN
*Tabulate the number of lab records each subject has.
sort study_id visitdate_lr
by study_id : gen indexn= _n
by study_id : gen indexN= _N
tab indexN if ( indexn==1 ), m
drop indexn indexN
**********************************************
rename redcap_event_name redcap_event_name_lr
rename visitcode_lr visitcode
gen labs_indata=1
**********************************************
order study_id visitcode
sort study_id visitcode
save "OutputData\Stata\Labs1.dta", replace
use "OutputData\Stata\Labs1.dta", clear
*Merge with Longitudinal file.


****************************************************************************************************
****************************************************************************************************
*5) Append and merge to create the Main and Longitudinal files.
****************************************************************************************************
****************************************************************************************************
********************************************************************************************
*MAIN1 (1 record/person).
********************************************************************************************
use "OutputData\Stata\Visits_Enrollment1.dta", clear
sort study_id
merge 1:1 study_id using "OutputData\Stata\Randomization1.dta"
tab _merge, m
* _merge == 1: Identifies records coming only from the first/master file (ie Visits_Enrollment1)
* _merge == 2: Identifies records coming only from the second/using file (ie Randomization1)
* _merge == 3: Identifies records coming from both files
list study_id redcap_event_name_en visitcode_en visitdate_en randdate_rand if ( _merge==1 )
list study_id redcap_event_name_en visitcode_en visitdate_en randdate_rand if ( _merge==2 )
*Identify fixes that need to be made. Go back to the import statements, and actually make the changes there.
*Then, rerun the syntax01_import files and continue here again.
drop _merge
***********************
sort study_id
save "OutputData\Stata\Main1.dta", replace
use "OutputData\Stata\Main1.dta", clear
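***********************
*SKETCH (not part of the original syntax): once the import fixes are in place, the merge can be
* made to stop with an error if any record is unmatched. This assumes every enrolled participant
* should have exactly one randomization record; leave out assert() if that is not expected.
* assert(match): Stops with an error unless every record has _merge==3.
* nogenerate   : Skips creating _merge, so no "drop _merge" is needed.
use "OutputData\Stata\Visits_Enrollment1.dta", clear
merge 1:1 study_id using "OutputData\Stata\Randomization1.dta", assert(match) nogenerate
sort study_id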


********************************************************************************************
*LONGITUDINAL1 (1 record/visit).
********************************************************************************************
use "OutputData\Stata\Visits_Enrollment_Longit.dta", clear
append using "OutputData\Stata\Visits_Followup1.dta"
sort study_id visitdate
rename visitdate visitdate_visits
***********************
*Tabulate the number of records each subject has per visit code, and then per visit date.
*(The date variable is written out in full as visitdate_visits, rather than relying on
* Stata's variable abbreviation to resolve "visitdate" after the rename.)
sort study_id visitcode
by study_id visitcode: gen indexn= _n
by study_id visitcode: gen indexN= _N
tab indexN if ( indexn==1 ), m
list study_id visitcode visitdate_visits if ( indexN>1 )
drop indexn indexN
sort study_id visitdate_visits
by study_id visitdate_visits: gen indexn= _n
by study_id visitdate_visits: gen indexN= _N
tab indexN if ( indexn==1 ), m
list study_id visitcode visitdate_visits if ( indexN>1 )
drop indexn indexN
***********************
sort study_id visitcode
merge 1:1 study_id visitcode using "OutputData\Stata\Labs1.dta"
tab _merge, m
list ptid visitcode visitdate_visits visitdate_lr if ( _merge==1 )
list ptid visitcode visitdate_visits visitdate_lr if ( _merge==2 )
drop _merge
tab redcap_event_name redcap_event_name_lr, m
***********************
*Merge other files (ex viral load, resistance data).
***********************
***********************
*VISIT DATE AGGREGATE.
***********************
gen visitdate = .
replace visitdate = visitdate_visits if ( visitdate==. & visitdate_visits~=. )
replace visitdate = visitdate_lr if ( visitdate==. & visitdate_lr~=. )
format visitdate %dM_d,_CY
sum visitcode visitdate visitdate_visits visitdate_lr


***********************
gen viralload_log = log10(viralload_lr)
order study_id - viralload_lr viralload_log frmcomplby_lr - lab_results_lr_complete
label variable viralload_log "Viral load (log10 copies/ml)"
***********************
order study_id visitcode redcap_event_name visitdate ///
      crfversion ptid visitdate_visits sympfev - visits_followup_indata ///
      redcap_event_name_lr - labs_indata
***********************
sort study_id visitcode
save "OutputData\Stata\Longitudinal1.dta", replace
use "OutputData\Stata\Longitudinal1.dta", clear


********************************************************************************************
*6) Create other analysis variables.
********************************************************************************************
********************************************************************************************
*VISITS: WIDE, TO MERGE WITH MAIN.
********************************************************************************************
use "OutputData\Stata\Longitudinal1.dta", clear
*Sort in descending order (latest at top, earliest at bottom).
*The minus sign makes gsort sort visitcode in descending order within study_id, so that
* visitdate[1] is the date of the latest visit.
gsort study_id -visitcode
by study_id : gen indexn= _n
by study_id : gen indexN= _N
by study_id : gen visitdate_last = visitdate[1]
format visitdate_last %dM_d,_CY
tab indexN if ( indexn==1 ), m
drop indexn indexN
keep study_id visitdate_last visitdate visitcode cd4cnt_lr viralload_lr viralload_log
reshape wide visitdate cd4cnt_lr viralload_lr viralload_log, i(study_id) j(visitcode)
order study_id visitdate_last
sort study_id
save "OutputData\Stata\visits_wide.dta", replace
use "OutputData\Stata\visits_wide.dta", clear
********************************************************************************************
*VIRAL LOAD: LAST TEST.
********************************************************************************************
use "OutputData\Stata\Longitudinal1.dta", clear
keep if ( viralload_lr ~=. )
sort study_id visitcode
by study_id : gen indexn= _n
by study_id : gen indexN= _N
tab indexN if ( indexn==1 ), m
keep if ( indexn==indexN )
drop indexn indexN
keep study_id visitcode visitdate
rename visitcode visitcode_vload_test_last
rename visitdate visitdate_vload_test_last
sort study_id
save "OutputData\Stata\vload_test_last.dta", replace
use "OutputData\Stata\vload_test_last.dta", clear


********************************************************************************************
*VIRAL LOAD: TIME TO FIRST SUPPRESSION.
********************************************************************************************
use "OutputData\Stata\Longitudinal1.dta", clear
keep if ( viralload_lr <= 50 )
sort study_id visitcode
by study_id : gen indexn= _n
by study_id : gen indexN= _N
tab indexN if ( indexn==1 ), m
keep if ( indexn==1 )
drop indexn indexN
keep study_id visitcode visitdate
rename visitcode visitcode_vload_le50_first
rename visitdate visitdate_vload_le50_first
sort study_id
save "OutputData\Stata\vload_le50_first.dta", replace
use "OutputData\Stata\vload_le50_first.dta", clear


********************************************************************************************
*MAIN2.
********************************************************************************************
use "OutputData\Stata\Main1.dta", clear
sort study_id
merge 1:1 study_id using "OutputData\Stata\visits_wide.dta"
tab _merge, m
drop _merge
sort study_id
merge 1:1 study_id using "OutputData\Stata\vload_test_last.dta"
tab _merge, m
drop _merge
sort study_id
merge 1:1 study_id using "OutputData\Stata\vload_le50_first.dta"
tab _merge, m
drop _merge
***************************************************************************************************
*HAD VLle50: TIME TO FIRST.
***************************************************************************************************
*20 participants. All start with VL>50. 8 attain VL<=50 (1 at m6, 7 at m12).
tab visitcode_vload_le50_first, m
sum study_id viralload_lr0 viralload_lr6 viralload_lr12
sum study_id viralload_lr0 viralload_lr6 viralload_lr12 visitcode_vload_le50_first visitdate_vload_le50_first if ( viralload_lr0~=. & viralload_lr0> 50 )
sum study_id viralload_lr0 viralload_lr6 viralload_lr12 visitcode_vload_le50_first visitdate_vload_le50_first if ( viralload_lr0~=. & viralload_lr0> 50 & (viralload_lr6<=50 | viralload_lr12<=50) )
gen TtestVLle50_set = 0
replace TtestVLle50_set = 1 if ( viralload_lr0~=. & viralload_lr0> 50 )
tab TtestVLle50_set, m
*******************************************************
*Event indicator: 0 = never had VL<=50 during follow-up, 1 = had at least one VL<=50.
gen Tfirst_hadVLle50_fuevent = .
label define l_Tfirst_hadVLle50_fuevent 0 "0: Never VL<=50" 1 "1: Had VL<=50"
label value Tfirst_hadVLle50_fuevent l_Tfirst_hadVLle50_fuevent
replace Tfirst_hadVLle50_fuevent = -1 if ( TtestVLle50_set==1 )
replace Tfirst_hadVLle50_fuevent = 0 if ( visitcode_vload_le50_first==. )
replace Tfirst_hadVLle50_fuevent = 1 if ( visitcode_vload_le50_first~=. )
tab Tfirst_hadVLle50_fuevent, m


*******************************************************
gen Tfirst_hadVLle50_fuvisit = .
replace Tfirst_hadVLle50_fuvisit = visitcode_vload_test_last if ( Tfirst_hadVLle50_fuevent==0 )
replace Tfirst_hadVLle50_fuvisit = visitcode_vload_le50_first if ( Tfirst_hadVLle50_fuevent==1 )
tab Tfirst_hadVLle50_fuvisit Tfirst_hadVLle50_fuevent, m
tab Tfirst_hadVLle50_fuvisit visitcode_vload_le50_first if ( Tfirst_hadVLle50_fuevent==0 ), m
tab Tfirst_hadVLle50_fuvisit visitcode_vload_le50_first if ( Tfirst_hadVLle50_fuevent==1 ), m
*******************************************************
gen Tfirst_hadVLle50_fudate = .
format Tfirst_hadVLle50_fudate %dM_d,_CY
replace Tfirst_hadVLle50_fudate = visitdate_vload_test_last if ( Tfirst_hadVLle50_fuevent==0 )
replace Tfirst_hadVLle50_fudate = visitdate_vload_le50_first if ( Tfirst_hadVLle50_fuevent==1 )
*******************************************************
*hadVLle50: TIME.
*******************************************************
gen Tfirst_hadVLle50_futime_days = ( Tfirst_hadVLle50_fudate - visitdate_en )
gen Tfirst_hadVLle50_futime_months = ( Tfirst_hadVLle50_futime_days )/30.4
gen Tfirst_hadVLle50_futime_years = ( Tfirst_hadVLle50_futime_days )/365.25
sum study_id visitdate_en visitcode_vload_le50_first visitcode_vload_test_last viralload_lr0 viralload_lr6 viralload_lr12
list study_id visitdate_en visitcode_vload_le50_first visitcode_vload_test_last viralload_lr0 viralload_lr6 viralload_lr12 if ( Tfirst_hadVLle50_fuevent==0 )
list study_id visitdate_en visitcode_vload_le50_first visitcode_vload_test_last viralload_lr0 viralload_lr6 viralload_lr12 if ( Tfirst_hadVLle50_fuevent==1 )
*The months go up to 38.6: 2 had >37.
**********************************************
*I'M ONLY KEEPING THE VARIABLES I NEED, BUT COULD KEEP MORE.
**********************************************
order study_id ///
      visitdate_en scid_en ageyrs_en sex_en weight_en height_en en_indata ///
      randdate_rand randarm_rand rand_indata ///
      visitdate_last visitdate0 - viralload_log12 ///
      visitcode_vload_test_last visitdate_vload_test_last ///
      visitcode_vload_le50_first visitdate_vload_le50_first ///
      TtestVLle50_set-Tfirst_hadVLle50_futime_years
keep  study_id ///
      visitdate_en scid_en ageyrs_en sex_en weight_en height_en en_indata ///
      randdate_rand randarm_rand rand_indata ///
      visitdate_last visitdate0 - viralload_log12 ///
      visitcode_vload_test_last visitdate_vload_test_last ///
      visitcode_vload_le50_first visitdate_vload_le50_first ///
      TtestVLle50_set-Tfirst_hadVLle50_futime_years
***********************
sort study_id
save "OutputData\Stata\Main2.dta", replace
use "OutputData\Stata\Main2.dta", clear
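***********************
*SKETCH (not part of the original syntax): one way the time-to-first-VL<=50 variables in Main2
* could be used, assuming a Kaplan-Meier description and a log-rank comparison by arm are wanted.
* stset     : Declares the follow-up time (months since enrollment) and the event indicator.
*             stset excludes records with missing or non-positive follow-up time.
* sts list  : Lists the Kaplan-Meier estimates by arm.
* sts graph : With the failure option, plots the cumulative proportion with a first VL<=50.
* sts test  : Log-rank test comparing the arms.
stset Tfirst_hadVLle50_futime_months if ( TtestVLle50_set==1 ), failure(Tfirst_hadVLle50_fuevent==1)
sts list, by(randarm_rand)
sts graph, by(randarm_rand) failure
sts test randarm_rand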


********************************************************************************************
*LONGITUDINAL2.
********************************************************************************************
use "OutputData\Stata\Main2.dta", clear
sort study_id
merge 1:m study_id using "OutputData\Stata\Longitudinal1.dta"
sort study_id visitdate
tab _merge, m
keep if ( _merge==3 )
drop _merge
***********************
*TIME SINCE BASELINE/RANDOMIZATION.
***********************
gen time_days = ( visitdate - randdate_rand )
gen time_months = ( time_days )/30.4
tabstat time_days , s(n sd min p25 median mean p75 max) c(s) lo by(visitcode)
tabstat time_months, s(n sd min p25 median mean p75 max) c(s) lo by(visitcode)
***********************
*DROP THE VARIABLES FROM MAIN2. (CAN ALWAYS MERGE THEM BACK IN.)
***********************
drop visitdate_en - visitdate12
**********************************************
*I'M ONLY KEEPING THE VARIABLES I NEED, BUT COULD KEEP MORE.
*(Each variable is listed once; the repeated labs_indata and cd4cnt_lr entries are removed.)
**********************************************
order study_id visitcode redcap_event_name visitdate time_days time_months ///
      visits_enroll_indata visits_followup_indata labs_indata ///
      weight height ///
      cd4cnt_lr cd8cnt_lr cd4to8ratio_lr wbc_lr rbc_lr hb_lr viralload_lr viralload_log
keep  study_id visitcode redcap_event_name visitdate time_days time_months ///
      visits_enroll_indata visits_followup_indata labs_indata ///
      weight height ///
      cd4cnt_lr cd8cnt_lr cd4to8ratio_lr wbc_lr rbc_lr hb_lr viralload_lr viralload_log
***********************
sort study_id visitcode
save "OutputData\Stata\Longitudinal2.dta", replace
use "OutputData\Stata\Longitudinal2.dta", clear


********************************************************************************************
********************************************************************************************
*7) Example descriptives, figures, and analyses.
********************************************************************************************
********************************************************************************************
********************************************************************************************
*CD4 count.
********************************************************************************************
use "OutputData\Stata\Main2.dta", clear
********************************************************************************************
*CHOOSE THE SUBJECTS TO ANALYZE (ie Implement inclusion/exclusion criteria, and begin filling in analysis Flow Diagram)
sum visitdate_en cd4cnt_lr0 cd4cnt_lr6 cd4cnt_lr12
sum visitdate_en cd4cnt_lr0 cd4cnt_lr6 cd4cnt_lr12 if ( cd4cnt_lr0~=. & (cd4cnt_lr6~=. | cd4cnt_lr12~=.) )
*Keep the subjects who have a baseline preART CD4 count:
keep if ( cd4cnt_lr0~=. )
*Keep the subjects who have at least one followup CD4 count:
keep if ( cd4cnt_lr6~=. | cd4cnt_lr12~=. )
sum visitdate_en cd4cnt_lr0 cd4cnt_lr6 cd4cnt_lr12
tab randarm_rand, m
********************************************************************************************
gen cd4cnt_00to06 = ( cd4cnt_lr6 - cd4cnt_lr0 )
gen cd4cnt_00to12 = ( cd4cnt_lr12 - cd4cnt_lr0 )
tabstat cd4cnt_lr0 cd4cnt_lr6 cd4cnt_00to06 cd4cnt_lr12 cd4cnt_00to12 , s(n sd min p25 median mean p75 max) c(s) lo by(randarm_rand)
*********************************************
*COMPARE BASELINE AND FOLLOWUP VALUES, BETWEEN ARMS.
*********************************************
ttest cd4cnt_lr0, by(randarm_rand)
ranksum cd4cnt_lr0, by(randarm_rand)
ttest cd4cnt_lr6, by(randarm_rand)
ranksum cd4cnt_lr6, by(randarm_rand)
ttest cd4cnt_lr12, by(randarm_rand)
ranksum cd4cnt_lr12, by(randarm_rand)


*********************************************
*COMPARE BASELINE AND FOLLOWUP VALUES, WITHIN ARMS. (PAIRED TESTS)
*********************************************
ttest cd4cnt_lr0 = cd4cnt_lr6 if ( randarm_rand==0 )
signrank cd4cnt_lr0 = cd4cnt_lr6 if ( randarm_rand==0 )
ttest cd4cnt_lr0 = cd4cnt_lr12 if ( randarm_rand==0 )
signrank cd4cnt_lr0 = cd4cnt_lr12 if ( randarm_rand==0 )
******
ttest cd4cnt_lr0 = cd4cnt_lr6 if ( randarm_rand==1 )
signrank cd4cnt_lr0 = cd4cnt_lr6 if ( randarm_rand==1 )
ttest cd4cnt_lr0 = cd4cnt_lr12 if ( randarm_rand==1 )
signrank cd4cnt_lr0 = cd4cnt_lr12 if ( randarm_rand==1 )
*********************************************
*CALCULATE CHANGE, COMPARE BETWEEN ARMS.
*********************************************
ttest cd4cnt_00to06, by(randarm_rand)
ranksum cd4cnt_00to06, by(randarm_rand)
ttest cd4cnt_00to12, by(randarm_rand)
ranksum cd4cnt_00to12, by(randarm_rand)


********************************************************************************************
sort study_id
merge 1:m study_id using "OutputData\Stata\Longitudinal2.dta"
sort study_id visitdate
tab _merge, m
keep if ( _merge==3 )
drop _merge
********************************************************************************************
*CHOOSE THE RECORDS TO ANALYZE:
*Here we keep the records with CD4 count values, over the first 12 months post-randomization (baseline).
keep if ( cd4cnt_lr ~=. )
keep if ( visitdate ~=. )
keep if ( time_months< 13 )
********************************************************************************************
*DESCRIPTIVES AND BOXPLOTS.
********************************************************************************************
tabstat cd4cnt_lr if ( randarm_rand==0 ), s(n sd min p25 median mean p75 max) c(s) lo by(visitcode)
tabstat cd4cnt_lr if ( randarm_rand==1 ), s(n sd min p25 median mean p75 max) c(s) lo by(visitcode)
graph box cd4cnt_lr, over(visitcode) by(randarm_rand)


********************************************************************************************
*LOWESS AND LINEAR FIT.
********************************************************************************************
sum cd4cnt_lr
twoway ///
   (scatter cd4cnt_lr visitcode if ( randarm_rand==0 ), msymbol(O) mlcolor(navy) mfcolor(none) ) ///
   (lowess cd4cnt_lr visitcode if ( randarm_rand==0 ), lpattern(solid) lwidth(thick) lcolor(navy) ) ///
   (lfit cd4cnt_lr visitcode if ( randarm_rand==0 ), lpattern(dash) lwidth(thick) lcolor(navy) ) ///
   (scatter cd4cnt_lr visitcode if ( randarm_rand==1 ), msymbol(O) mlcolor(maroon) mfcolor(none) ) ///
   (lowess cd4cnt_lr visitcode if ( randarm_rand==1 ), lpattern(solid) lwidth(thick) lcolor(maroon) ) ///
   (lfit cd4cnt_lr visitcode if ( randarm_rand==1 ), lpattern(dash) lwidth(thick) lcolor(maroon) ) ///
   , ///
   ylabel(100(100)900, axis(1) angle(0)) ytick(100(100)900, axis(1)) ///
   yline(25, lwidth(vvthin) lcolor(black) ) ///
   ytitle("CD4 count", axis(1)) ///
   xscale(range(0 12)) xlabel(0(6)12) xmtick(0(6)12) ///
   xline(0 6 12, lwidth(vvthin) lcolor(gray) lpattern(shortdash) ) ///
   xline(0.0, lwidth(medium) lcolor(black) ) ///
   xtitle("Time (months)") ///
   title("Figure 1") ///
   legend(off) ///
   graphregion(color(white)) scale(1.1)
graph export "C:\Users\ktapia\OneDrive - UW(1)\CFAR\Projects\HIV Pediatrics\Documents\Figures\Figure1_cd4cnt_Lowess.pdf", as(pdf) replace


********************************************************************************************
*CONNECT LINES FOR EACH SUBJECT.
********************************************************************************************
tab study_id if ( randarm_rand==0 ), m
tab study_id if ( randarm_rand==1 ), m
twoway ///
(connect cd4cnt_lr visitcode if ( study_id==9900000001 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000003 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000005 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000007 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000009 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000011 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000013 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000015 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000017 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000019 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(lowess cd4cnt_lr visitcode if ( randarm_rand==0 ), lpattern(solid) lwidth(vthick) lcolor(navy) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000002 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000004 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000006 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000008 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000010 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000012 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000014 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000016 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000018 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect cd4cnt_lr visitcode if ( study_id==9900000020 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(lowess cd4cnt_lr visitcode if ( randarm_rand==1 ), lpattern(solid) lwidth(vthick) lcolor(maroon) ) ///
, ///
ylabel(100(100)900, axis(1) angle(0)) ytick(100(100)900, axis(1)) ///
yline(25, lwidth(vvthin) lcolor(black) ) ///
ytitle("CD4 count", axis(1)) ///
xscale(range(0 12)) xlabel(0(6)12) xmtick(0(6)12) ///
xline(0 6 12, lwidth(vvthin) lcolor(gray) lpattern(shortdash) ) ///
xline(0.0, lwidth(medium) lcolor(black) ) ///
xtitle("Time (months)") ///
title("Figure 2") ///
legend(off) ///
graphregion(color(white)) scale(1.1)
graph export "C:\Users\ktapia\OneDrive - UW(1)\CFAR\Projects\HIV Pediatrics\Documents\Figures\Figure2_cd4cnt_Connected.pdf", as(pdf) replace
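
********************************************************************************************
*SKETCH (not part of the original analysis): the per-subject plots above are typed out one
*at a time. Assuming one record per subject per visit, the same list of plots can be built
*in a loop. All demo_ names and the simulated values are made up purely for illustration;
*the colors follow the navy and maroon convention used above.
********************************************************************************************
preserve
clear
set seed 2019
set obs 6
gen long demo_id = _n
gen byte demo_arm = ( mod(demo_id, 2)==0 )
expand 3
bysort demo_id: gen byte demo_visit = 6*( _n - 1 )
gen demo_cd4 = 400 + 20*demo_visit*demo_arm + rnormal(0, 50)
*Build one connected-line plot per subject, colored by arm.
levelsof demo_id, local(demo_ids)
local demo_plots
foreach i of local demo_ids {
    quietly summarize demo_arm if ( demo_id==`i' ), meanonly
    local demo_col = cond( r(mean)==1, "maroon", "navy" )
    local demo_plots `demo_plots' (connected demo_cd4 demo_visit if ( demo_id==`i' ), msymbol(O) mlcolor(`demo_col') mfcolor(none) lwidth(thin) lcolor(`demo_col') sort)
}
twoway `demo_plots', legend(off) xlabel(0(6)12) xtitle("Time (months)") ytitle("CD4 count")
restore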

********************************************************************************************
*VIRAL LOAD: TIME TO SUPPRESSION (VL<=50).
********************************************************************************************
use "OutputData\Stata\Main2.dta", clear
*******************************************************
*USING DISCRETE TIME. Here I'm not doing a formal discrete time analysis; I'm just using the visitcode variable.
*******************************************************
/*
. tab Tfirst_hadVLle50_fuvisit Tfirst_hadVLle50_fuevent, m

Tfirst_had | Tfirst_hadVLle50_fuev
VLle50_fuv |          ent
      isit |  0: Never  1: Always |     Total
-----------+----------------------+----------
         6 |         0          1 |         1
        12 |        12          7 |        19
-----------+----------------------+----------
     Total |        12          8 |        20

. stptime if ( randarm_rand==0 ) , per(100)

         failure _d:  Tfirst_hadVLle50_fuevent
   analysis time _t:  Tfirst_hadVLle50_fuvisit
                 id:  study_id

    Cohort |  person-time   failures        rate   [95% Conf. Interval]
-----------+-----------------------------------------------------------
     total |          120          0           0          .          .

--> No events in Control arm.

. stptime if ( randarm_rand==1 ) , per(100)

         failure _d:  Tfirst_hadVLle50_fuevent
   analysis time _t:  Tfirst_hadVLle50_fuvisit
                 id:  study_id

    Cohort |  person-time   failures        rate   [95% Conf. Interval]
-----------+-----------------------------------------------------------
     total |          114          8    7.017544   3.509457   14.03235

--> 8 events, over a total person-time of 114 person-months, yielding a rate of 7.02 events per 100 person-months.
*/
tab Tfirst_hadVLle50_fuvisit Tfirst_hadVLle50_fuevent, m
stset Tfirst_hadVLle50_fuvisit , id(study_id) failure(Tfirst_hadVLle50_fuevent) exit(failure)
stptime , per(100)
stptime if ( randarm_rand==0 ) , per(100)
stptime if ( randarm_rand==1 ) , per(100)
sts list, at (0 6 12)
sts list, at (0 6 12) fail
sts list, at (0 6 12) fail by(randarm_rand)
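
*******************************************************
*WORKED CHECK (a sketch added for illustration, not part of the original analysis).
*The Intervention-arm rate reported above is failures divided by person-time, times 100.
*The interval below assumes a normal approximation on the log-rate scale, that is, the rate
*multiplied and divided by exp( z divided by sqrt(failures) ). With these inputs it
*reproduces the printed limits. The chk_ scalar names are made up for this check.
*******************************************************
scalar chk_d    = 8
scalar chk_pt   = 114
scalar chk_rate = 100*chk_d/chk_pt
scalar chk_z    = invnormal(0.975)
scalar chk_lo   = chk_rate*exp( -chk_z/sqrt(chk_d) )
scalar chk_hi   = chk_rate*exp(  chk_z/sqrt(chk_d) )
display chk_rate, chk_lo, chk_hi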

sts test randarm_rand
*There are no suppression events in the Control arm, so Cox regression isn't defined (infinite HR):
xi: stcox i.randarm_rand
sts graph, failure ///
ytitle (Probability of Viral suppression) ///
ylabel(0(.2)1.0) title(" ") ///
xtitle("Months in study") ///
xlabel(0 6 12) ///
scheme(s2mono) ///
graphregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
scale(1.1) ///
by(randarm_rand) legend(label(1 "Control") label(2 "Intervention")) ///
ttext(0.95 1 "Log-rank P=0.0004", place(e))
graph export "C:\Users\ktapia\OneDrive - UW(1)\CFAR\Projects\HIV Pediatrics\Documents\Figures\Figure5_KM_VloadLE50.pdf", as(pdf) replace
*******************************************************
*USING CONTINUOUS TIME.
*******************************************************
stset Tfirst_hadVLle50_futime_months , id(study_id) failure(Tfirst_hadVLle50_fuevent) exit(failure)
stptime , per(100)
stptime if ( randarm_rand==0 ) , per(100)
stptime if ( randarm_rand==1 ) , per(100)
sts list, at (0 6 12)
sts list, at (0 6 12) fail
sts list, at (0 6 12) fail by(randarm_rand)
sts test randarm_rand
*There are no suppression events in the Control arm, so Cox regression isn't defined (infinite HR):
xi: stcox i.randarm_rand
sts graph, failure ///
ytitle (Probability of Viral suppression) ///
ylabel(0(.2)1.0) title(" ") ///
xtitle("Months in study") ///
xlabel(0 6 12) ///
scheme(s2mono) ///
graphregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
scale(1.1) ///
by(randarm_rand) legend(label(1 "Control") label(2 "Intervention")) ///
ttext(0.95 1 "Log-rank P=0.0004", place(e))
graph export "C:\Users\ktapia\OneDrive - UW(1)\CFAR\Projects\HIV Pediatrics\Documents\Figures\Figure5b_KM_VloadLE50.pdf", as(pdf) replace
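
*******************************************************
*SKETCH (not part of the original analysis): the log-rank P-value in the ttext() annotation
*above is typed in by hand. Assuming sts test leaves its chi-squared statistic and degrees of
*freedom in r(chi2) and r(df), the P-value can be computed and passed to the graph instead.
*The cancer example dataset that ships with Stata stands in for the study data here, and the
*demo_ names are made up for this sketch.
*******************************************************
preserve
sysuse cancer, clear
stset studytime, failure(died)
sts test drug
scalar demo_p = chi2tail( r(df), r(chi2) )
local demo_ptext : display %6.4f demo_p
sts graph, failure by(drug) ttext(0.95 1 "Log-rank P=`demo_ptext'", place(e))
restore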

********************************************************************************************
*VIRAL LOAD.
********************************************************************************************
use "OutputData\Stata\Main2.dta", clear
********************************************************************************************
*CHOOSE THE SUBJECTS TO ANALYZE (i.e., implement inclusion/exclusion criteria, and begin filling in the analysis Flow Diagram).
sum visitdate_en viralload_log0 viralload_log6 viralload_log12
sum visitdate_en viralload_log0 viralload_log6 viralload_log12 if ( viralload_log0~=. & (viralload_log6~=. | viralload_log12~=.) )
*Keep the subjects who have a baseline (pre-ART) viral load:
keep if ( viralload_log0~=. )
*Keep the subjects who have at least one follow-up viral load:
keep if ( viralload_log6~=. | viralload_log12~=. )
sum visitdate_en viralload_log0 viralload_log6 viralload_log12
tab randarm_rand, m
********************************************************************************************
gen viralload_log_00to06 = ( viralload_log6 - viralload_log0 )
gen viralload_log_00to12 = ( viralload_log12 - viralload_log0 )
tabstat viralload_log0 viralload_log6 viralload_log_00to06 viralload_log12 viralload_log_00to12 , s(n sd min p25 median mean p75 max) c(s) lo by(randarm_rand)
*********************************************
*COMPARE BASELINE AND FOLLOWUP VALUES, BETWEEN ARMS.
*********************************************
ttest viralload_log0, by(randarm_rand)
ranksum viralload_log0, by(randarm_rand)
ttest viralload_log6, by(randarm_rand)
ranksum viralload_log6, by(randarm_rand)
ttest viralload_log12, by(randarm_rand)
ranksum viralload_log12, by(randarm_rand)

*********************************************
*COMPARE BASELINE AND FOLLOWUP VALUES, WITHIN ARMS. (PAIRED TESTS)
*********************************************
ttest viralload_log0 = viralload_log6 if ( randarm_rand==0 )
signrank viralload_log0 = viralload_log6 if ( randarm_rand==0 )
ttest viralload_log0 = viralload_log12 if ( randarm_rand==0 )
signrank viralload_log0 = viralload_log12 if ( randarm_rand==0 )
******
ttest viralload_log0 = viralload_log6 if ( randarm_rand==1 )
signrank viralload_log0 = viralload_log6 if ( randarm_rand==1 )
ttest viralload_log0 = viralload_log12 if ( randarm_rand==1 )
signrank viralload_log0 = viralload_log12 if ( randarm_rand==1 )
*********************************************
*CALCULATE CHANGE, COMPARE BETWEEN ARMS.
*********************************************
ttest viralload_log_00to06, by(randarm_rand)
ranksum viralload_log_00to06, by(randarm_rand)
ttest viralload_log_00to12, by(randarm_rand)
ranksum viralload_log_00to12, by(randarm_rand)
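
*********************************************
*SKETCH (not part of the original analysis): the within-arm paired t test and a one-sample
*t test of the change score against zero are the same test. Only the sign of t flips,
*because the paired test uses baseline minus followup while the change score is followup
*minus baseline. The demo_ variables are simulated purely for illustration.
*********************************************
preserve
clear
set seed 806
set obs 10
gen double demo_log0 = 5 + rnormal(0, 0.5)
gen double demo_log6 = demo_log0 - 3 + rnormal(0, 0.5)
gen double demo_change = ( demo_log6 - demo_log0 )
ttest demo_log0 == demo_log6
scalar demo_t_paired = r(t)
ttest demo_change == 0
scalar demo_t_change = r(t)
display demo_t_paired, demo_t_change
restore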

********************************************************************************************
sort study_id
merge 1:m study_id using "OutputData\Stata\Longitudinal2.dta"
sort study_id visitdate
tab _merge, m
keep if ( _merge==3 )
drop _merge
********************************************************************************************
*CHOOSE THE RECORDS TO ANALYZE:
* Ex: analyze the viral load data from the first 12 months post ART initiation (time_months < 13).
keep if (viralload_log ~=. )
keep if ( visitdate ~=. )
keep if ( time_months< 13 )
********************************************************************************************
*DESCRIPTIVES AND BOXPLOTS.
********************************************************************************************
tabstat viralload_log if ( randarm_rand==0 ), s(n sd min p25 median mean p75 max) c(s) lo by(visitcode)
tabstat viralload_log if ( randarm_rand==1 ), s(n sd min p25 median mean p75 max) c(s) lo by(visitcode)
graph box viralload_log, over(visitcode) by(randarm_rand)

********************************************************************************************
*LOWESS AND LINEAR FIT.
********************************************************************************************
display log10(50)
*1.69897
twoway ///
(scatter viralload_log visitcode if ( randarm_rand==0 ), msymbol(O) mlcolor(navy) mfcolor(none) ) ///
(lowess viralload_log visitcode if ( randarm_rand==0 ), lpattern(solid) lwidth(thick) lcolor(navy) ) ///
(lfit viralload_log visitcode if ( randarm_rand==0 ), lpattern(dash) lwidth(thick) lcolor(navy) ) ///
(scatter viralload_log visitcode if ( randarm_rand==1 ), msymbol(O) mlcolor(maroon) mfcolor(none) ) ///
(lowess viralload_log visitcode if ( randarm_rand==1 ), lpattern(solid) lwidth(thick) lcolor(maroon) ) ///
(lfit viralload_log visitcode if ( randarm_rand==1 ), lpattern(dash) lwidth(thick) lcolor(maroon) ) ///
, ///
ylabel(0(1)7, axis(1) angle(0)) ytick(0(1)7, axis(1)) ///
yline(1.69897, lwidth(vvthin) lcolor(black) ) ///
ytitle("Viral load, log10 copies/ml", axis(1)) ///
xscale(range(0 12)) xlabel(0(6)12) xmtick(0(6)12) ///
xline(0 6 12, lwidth(vvthin) lcolor(gray) lpattern(shortdash) ) ///
xline(0.0, lwidth(medium) lcolor(black) ) ///
xtitle("Time (months)") ///
legend(off) ///
graphregion(color(white)) scale(1.1)
graph export "C:\Users\ktapia\OneDrive - UW(1)\CFAR\Projects\HIV Pediatrics\Documents\Figures\Figure3_virallog10_Lowess.pdf", as(pdf) replace

********************************************************************************************
*CONNECT LINES FOR EACH SUBJECT.
********************************************************************************************
twoway ///
(connect viralload_log visitcode if ( study_id==9900000001 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000003 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000005 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000007 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000009 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000011 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000013 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000015 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000017 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000019 ), msymbol(O) mlcolor(navy) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(navy) sort yaxis(1) ) ///
(lowess viralload_log visitcode if ( randarm_rand==0 ), lpattern(solid) lwidth(vthick) lcolor(navy) ) ///
(connect viralload_log visitcode if ( study_id==9900000002 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000004 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000006 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000008 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000010 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000012 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000014 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000016 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000018 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(connect viralload_log visitcode if ( study_id==9900000020 ), msymbol(O) mlcolor(maroon) mfcolor(none) lpattern(solid) lwidth(thin) lcolor(maroon) sort yaxis(1) ) ///
(lowess viralload_log visitcode if ( randarm_rand==1 ), lpattern(solid) lwidth(vthick) lcolor(maroon) ) ///
, ///
ylabel(0(1)7, axis(1) angle(0)) ytick(0(1)7, axis(1)) ///
yline(1.69897, lwidth(vvthin) lcolor(black) ) ///
ytitle("Viral load, log10 copies/ml", axis(1)) ///
xscale(range(0 12)) xlabel(0(6)12) xmtick(0(6)12) ///
xline(0 6 12, lwidth(vvthin) lcolor(gray) lpattern(shortdash) ) ///
xline(0.0, lwidth(medium) lcolor(black) ) ///
xtitle("Time (months)") ///
legend(off) ///
graphregion(color(white)) scale(1.1)
graph export "C:\Users\ktapia\OneDrive - UW(1)\CFAR\Projects\HIV Pediatrics\Documents\Figures\Figure4_virallog10_Connected.pdf", as(pdf) replace

********************************************************************************************
*LONGITUDINAL ANALYSES.
********************************************************************************************
*I create this to motivate/compare how Stata handles interactions with categorical variables.
gen visitXrandarm = ( visitcode * randarm_rand )
tab visitcode visitXrandarm if ( randarm_rand==0 ), m
tab visitcode visitXrandarm if ( randarm_rand==1 ), m
list randarm_rand i.randarm_rand visitcode i.visitcode visitXrandarm
list randarm_rand 0.randarm_rand 1.randarm_rand visitcode 0.visitcode 6.visitcode 12.visitcode visitXrandarm 0.visitXrandarm 6.visitXrandarm 12.visitXrandarm
************************************************************
*RANDOM EFFECTS MODELS.
************************************************************
mixed viralload_log i.visitcode if ( randarm_rand==0 ) || study_id:
mixed viralload_log i.visitcode if ( randarm_rand==0 & (visitcode==6 | visitcode==12) ) || study_id:
mixed viralload_log i.visitcode if ( randarm_rand==1 ) || study_id:
mixed viralload_log i.visitcode if ( randarm_rand==1 & (visitcode==6 | visitcode==12) ) || study_id:
mixed viralload_log i.randarm_rand i.visitcode i.visitXrandarm || study_id:
ereturn list
display _b[_cons]
display _b[0.visitcode]
display _b[6.visitcode]
display _b[12.visitcode]
display _b[0.visitXrandarm]
display _b[6.visitXrandarm]
display _b[12.visitXrandarm]
test 6.visitXrandarm
test 12.visitXrandarm
test 6.visitXrandarm 12.visitXrandarm
*This is equivalent to the last model run, which had our created interaction terms. Here Stata creates the interaction terms for us.
mixed viralload_log i.randarm_rand i.visitcode i.randarm_rand##visitcode || study_id:
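
************************************************************
*SKETCH (not part of the original analysis): why the hand-made product term and the factor
*variable interaction give the same fit. The product is nonzero only in the Intervention arm
*at months 6 and 12, so its two indicators are exactly the two interaction cells. Ordinary
*regression on simulated data is used here only to keep the sketch short; all demo_ names
*and values are made up for illustration.
************************************************************
preserve
clear
set seed 8052019
set obs 20
gen demo_id = _n
gen byte demo_arm = ( mod(demo_id, 2)==0 )
expand 3
bysort demo_id: gen byte demo_visit = 6*( _n - 1 )
gen double demo_y = 6 - 0.4*demo_arm*demo_visit + rnormal(0, 0.5)
gen byte demo_visitXarm = ( demo_visit * demo_arm )
regress demo_y i.demo_arm i.demo_visit i.demo_visitXarm
scalar demo_ll_manual  = e(ll)
scalar demo_b12_manual = _b[12.demo_visitXarm]
regress demo_y i.demo_arm##i.demo_visit
scalar demo_ll_factor  = e(ll)
scalar demo_b12_factor = _b[1.demo_arm#12.demo_visit]
display demo_ll_manual, demo_ll_factor
display demo_b12_manual, demo_b12_factor
restore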

/*
1) This output is the same between models.

. mixed viralload_log i.randarm_rand i.visitcode i.visitXrandarm || study_id:

Mixed-effects ML regression                     Number of obs     =         60
Group variable: study_id                        Number of groups  =         20

                                                Obs per group:
                                                              min =          3
                                                              avg =        3.0
                                                              max =          3

                                                Wald chi2(5)      =     725.83
Log likelihood = -46.607603                     Prob > chi2       =     0.0000

2) This table is from our created interaction terms.

----------------------------------------------------------------------------------------
         viralload_log |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
          randarm_rand |
          Intervention |  -.0704939   .2353398    -0.30   0.765    -.5317514    .3907637
                       |
             visitcode |
                     6 |  -.1602972   .2333567    -0.69   0.492    -.6176679    .2970735
                    12 |   .0876282   .2333567     0.38   0.707    -.3697425     .544999
                       |
         visitXrandarm |
                     6 |  -2.996851   .3300162    -9.08   0.000    -3.643671   -2.350031
                    12 |  -4.495647   .3300162   -13.62   0.000    -5.142466   -3.848827
                       |
                 _cons |   6.323548   .1664104    38.00   0.000      5.99739    6.649706
----------------------------------------------------------------------------------------

3) This table is from Stata-created interaction terms, and is equivalent.

----------------------------------------------------------------------------------------
         viralload_log |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
          randarm_rand |
          Intervention |  -.0704939   .2353398    -0.30   0.765    -.5317514    .3907637
                       |
             visitcode |
                     6 |  -.1602972   .2333567    -0.69   0.492    -.6176679    .2970735
                    12 |   .0876282   .2333567     0.38   0.707    -.3697425     .544999
                       |
randarm_rand#visitcode |
       Intervention# 6 |  -2.996851   .3300162    -9.08   0.000    -3.643671   -2.350031
       Intervention#12 |  -4.495647   .3300162   -13.62   0.000    -5.142466   -3.848827
                       |
                 _cons |   6.323548   .1664104    38.00   0.000      5.99739    6.649706
----------------------------------------------------------------------------------------

4) This table is the same for both models.

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
study_id: Identity           |
                  var(_cons) |   .0046474    .0363579     1.02e-09    21204.27
-----------------------------+------------------------------------------------
               var(Residual) |   .2722768    .0608817     .1756628     .422028
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 0.02          Prob >= chibar2 = 0.4486
*/
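
************************************************************
*WORKED READING of the fixed-effects table above (a sketch added for illustration; the fit_
*scalar names are made up). With Control at month 0 as the reference cell, each model-implied
*mean is the sum of the coefficients that apply to that cell. At month 12 this gives about
*6.41 log10 copies/ml in the Control arm and about 1.85 in the Intervention arm, the latter
*just above the suppression threshold of log10(50).
************************************************************
scalar fit_cons   = 6.323548
scalar fit_arm    = -.0704939
scalar fit_v12    = .0876282
scalar fit_int12  = -4.495647
scalar fit_ctrl12 = fit_cons + fit_v12
scalar fit_intv12 = fit_cons + fit_arm + fit_v12 + fit_int12
display fit_ctrl12
display fit_intv12
display log10(50)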

************************************************************
*GEE MODELS.
************************************************************
xtset study_id visitcode
xtgee viralload_log i.visitcode if ( randarm_rand==0 ), family(gaussian) link(identity) corr(exch) robust
xtgee viralload_log i.visitcode if ( randarm_rand==0 & (visitcode==6 | visitcode==12) ), family(gaussian) link(identity) corr(exch) robust
xtgee viralload_log i.visitcode if ( randarm_rand==1 ), family(gaussian) link(identity) corr(exch) robust
xtgee viralload_log i.visitcode if ( randarm_rand==1 & (visitcode==6 | visitcode==12) ), family(gaussian) link(identity) corr(exch) robust
*These two specifications are equivalent (each is run without and then with robust standard errors):
xtgee viralload_log i.randarm_rand i.visitcode i.visitXrandarm, family(gaussian) link(identity) corr(exch)
xtgee viralload_log i.randarm_rand i.visitcode i.visitXrandarm, family(gaussian) link(identity) corr(exch) robust
xtgee viralload_log i.randarm_rand i.visitcode i.randarm_rand##visitcode, family(gaussian) link(identity) corr(exch)
xtgee viralload_log i.randarm_rand i.visitcode i.randarm_rand##visitcode, family(gaussian) link(identity) corr(exch) robust

/*
. xtgee viralload_log i.randarm_rand i.visitcode i.randarm_rand##visitcode, family(gaussian) link(identity) corr(exch)

Iteration 1: tolerance = 8.595e-15

GEE population-averaged model                   Number of obs     =         60
Group variable:                   study_id      Number of groups  =         20
Link:                             identity      Obs per group:
Family:                           Gaussian                    min =          3
Correlation:                  exchangeable                    avg =        3.0
                                                              max =          3
                                                Wald chi2(5)      =     725.83
Scale parameter:                  .2769241      Prob > chi2       =     0.0000

----------------------------------------------------------------------------------------
         viralload_log |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
          randarm_rand |
          Intervention |  -.0704939   .2353398    -0.30   0.765    -.5317514    .3907637
                       |
             visitcode |
                     6 |  -.1602972   .2333567    -0.69   0.492     -.617668    .2970736
                    12 |   .0876282   .2333567     0.38   0.707    -.3697426     .544999
                       |
randarm_rand#visitcode |
       Intervention# 6 |  -2.996851   .3300163    -9.08   0.000    -3.643671   -2.350031
       Intervention#12 |  -4.495647   .3300163   -13.62   0.000    -5.142467   -3.848827
                       |
                 _cons |   6.323548   .1664104    38.00   0.000      5.99739    6.649706
----------------------------------------------------------------------------------------

. xtgee viralload_log i.randarm_rand i.visitcode i.randarm_rand##visitcode, family(gaussian) link(identity) corr(exch) robust

Iteration 1: tolerance = 8.595e-15

GEE population-averaged model                   Number of obs     =         60
Group variable:                   study_id      Number of groups  =         20
Link:                             identity      Obs per group:
Family:                           Gaussian                    min =          3
Correlation:                  exchangeable                    avg =        3.0
                                                              max =          3
                                                Wald chi2(5)      =    1371.81
Scale parameter:                  .2769241      Prob > chi2       =     0.0000

                                       (Std. Err. adjusted for clustering on study_id)
----------------------------------------------------------------------------------------
                       |               Robust
         viralload_log |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
          randarm_rand |
          Intervention |  -.0704939   .2521817    -0.28   0.780     -.564761    .4237732
                       |
             visitcode |
                     6 |  -.1602972   .2447874    -0.65   0.513    -.6400717    .3194773
                    12 |   .0876282    .238662     0.37   0.713    -.3801406     .555397
                       |
randarm_rand#visitcode |
       Intervention# 6 |  -2.996851   .3511342    -8.53   0.000    -3.685061   -2.308641
       Intervention#12 |  -4.495647     .31262   -14.38   0.000    -5.108371   -3.882923
                       |
                 _cons |   6.323548   .1847681    34.22   0.000     5.961409    6.685687
----------------------------------------------------------------------------------------
*/
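
************************************************************
*CROSS-CHECK between the two approaches (a sketch added for illustration; the var_ scalar
*names are made up). In this balanced example the GEE scale parameter printed above
*(.2769241) agrees, up to rounding, with the sum of the two variance components from the
*mixed model, and the ratio of the between-subject variance to that total gives the implied
*within-subject correlation (small here, about 0.017).
************************************************************
scalar var_between = .0046474
scalar var_within  = .2722768
scalar var_total   = var_between + var_within
scalar var_icc     = var_between/var_total
display var_total, var_icc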

****************************************************************************************************
****************************************************************************************************
*END.
****************************************************************************************************
****************************************************************************************************