IQSS Presentation to Program in Health Policy

Research Technology Consulting: Simo Goshev, Alex Storer, Steve Worthington, Ista Zahn. [email protected], http://rtc.iq.harvard.edu

Transcript of IQSS Presentation to Program in Health Policy

  • 1. Research Technology Consulting
      Simo Goshev, Alex Storer, Steve Worthington, Ista Zahn
      [email protected]
      http://rtc.iq.harvard.edu

  • 2. Consulting Goals
      Data analysis support and programming services
      Research project planning and guidance selecting appropriate technology for research projects
      Facilitating appropriate organization, storage and sharing of data
      Training on the use of both established software packages and emerging tools

  • 3. Scope
      Free!
      Support the entire social science community
      Consults measured in hours rather than weeks or months
      Currently doing outreach to departments, student groups and centers
      Drop-ins on Fridays at 1pm in the training lab, Appointments, Help Tickets and casual chats in K306

  • 4. Who We Are

  • 5. Simo Goshev
      BA Sofia, Bulgaria, Applied Econometrics
      MS McMaster University, Statistics
      PhD McMaster University, Economics
      Analysis: Econometrics; Applied Microeconometrics; Panel Data; Applied statistics
      Tools: Mainly Stata; Some R

  • 6. Help with econometrics
      What model is most suitable for my data on hospital IT innovation?
      I am looking at HIV in children. Can you help me design an overlapping generations model?
      Why are the confidence intervals of my spline of health care spending so wide/narrow?
      Could the interaction between an exogenous and endogenous variable be exogenous?
      I am looking for a way to compare survival between two cancer management programs. Can you help me?

  • 7. Help with computation/estimation
      I am trying to estimate a model but for some reason the routine fails. Could you have a look at my script?
      I am working with a large dataset and my machine is giving up on me. Do you have any suggestions?
      Which routine is best for?

  • 8. Replication study in health economics
      Graduate Student
      Make sense of a study and Stata code
      [Figure: two plots; y-axis ticks 0.2 to 1, x-axis ticks 65 to 80]

  • 9. Predictors of hospital IT adoption
      Graduate Student, School of Public Health
      Understand what factors facilitate/hinder adoption of IT in US hospitals
      Data: Sample of hospitals clustered within states; Count of ITs adopted by a hospital in 3 consecutive years
      Modeling strategy: Three-level mixed effects model

  • 10. Alex Storer
      BS, BA UC Berkeley, Electrical Engineering & Computer Science, Cognitive Science
      PhD Boston University, Cognitive & Neural Systems
      Analysis: Machine Learning; Signal Processing; Surface Based Techniques; Simulation; Optimization
      Tools: Matlab, R, Python; Emacs, LaTeX, Linux

  • 11. Text Analysis
      Topic Models
      Large corpus
      Prevalence of certain terms
      Sentiment

  • 12. Text Analysis
      Twitter: #obamacare
      Positive/Negative Opinions?

  • 13. Text Analysis
      Distinct Content Groupings
      Congress Speeches

  • 14. Text Analysis
      NY Times Archive
      Term: "Medicare"

  • 15. Text Analysis
      Topic Models; Prevalence of certain terms; Sentiment
      What models are appropriate to perform our analysis?
      What software is appropriate?

  • 16. Text Analysis
      Large corpus
      Where do we obtain this corpus?
      How do we pre-process it so we can analyze it?

  • 17. Federal Procurement Database

  • 18. Federal Procurement Database
      Only first 500 hits, only a few columns
      All of the data, but

  • 19. Federal Procurement Database
      Download atom feeds
      Parse XML tree structure
      Python!
      Search for union of entries
      Output as CSV
      For 20 GB of data, there is no way to download by hand

  • 20. Steve Worthington
      BA / MS Durham, UK, Anthropology & Archeology
      PhD NYU, Biological Anthropology
      Analysis: Linear models (OLS, GLS, PLS, etc.); Resampling (permutation, bootstrap); Ordination (PCA, LDA, CVA, etc.)
      Tools: Mainly R; Some SAS, SPSS

  • 21. Cleaning / reshaping data
      Department of Economics
      Daily Lat/Long data on rainfall in India (1951-2007)
      171 files, 3 types (2 ascii text, 1 binary)
      One file for each year (containing 365 daily matrices)
      Parse messy data into a long-format Stata data frame
      June 21st 2007

  • 22. Cleaning / reshaping data
      No common delimiter (spaces and tabs)
      Use regexp to parse each datum
      Use template to place each datum into correct row/column
      Template

  • 23. Cleaning / reshaping data
      Long format data frame in Stata
      Rainfall for each day and lat/long

  • 24. Rainfall / CEO movie

  • 25. Rainfall / CEO movie

  • 26. Geospatial Analysis in R
      Spatial prediction: interpolation of data points
      Spatial autocorrelation analysis
      Drug resistant TB, Moldova

  • 27. Ista Zahn
      BS University of Oregon, Psychology
      PhD (ABD) University of Rochester, Social Psychology
      Analysis: Regression; Mixed Models; Scale Development
      Tools: R, Stata, SAS, SPSS; Emacs, LaTeX, Linux

  • 28. Workshops
      (schedule at http://rtc.iq.harvard.edu)

  • 29. IQSS Services
      The Institute for Quantitative Social Science at Harvard University

  • 30. Contact
      [email protected]
      http://rtc.iq.harvard.edu/
      CGIS-Knafel, Room K306
      Friday afternoons, K018
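
Illustrative note on slide 19 (Federal Procurement Database): the workflow listed there (parse the Atom feed XML, take the union of entries, output as CSV) might look roughly like the sketch below. This is not the consultants' actual script. The sample feed, its fields, and the helper names `parse_entries`, `merge_by_id` and `to_csv` are all hypothetical, and reading "union of entries" as "merge entries across feeds, de-duplicated on the Atom id" is an assumption. Only the Python standard library is used.

```python
import csv
import io
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # standard Atom namespace

# Hypothetical sample feed; real procurement feeds carry their own fields.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry>
    <id>urn:example:1</id>
    <title>Contract A</title>
    <updated>2012-01-15T00:00:00Z</updated>
  </entry>
  <entry>
    <id>urn:example:2</id>
    <title>Contract B</title>
    <updated>2012-02-20T00:00:00Z</updated>
  </entry>
</feed>
"""


def parse_entries(feed_xml):
    """Return one dict per <entry>, keyed by child tag name (namespace stripped)."""
    root = ET.fromstring(feed_xml)
    rows = []
    for entry in root.iter(ATOM + "entry"):
        row = {}
        for child in entry:
            tag = child.tag.split("}", 1)[-1]  # drop the "{namespace}" prefix
            row[tag] = (child.text or "").strip()
        rows.append(row)
    return rows


def merge_by_id(*feeds):
    """Union of entries across feeds, keeping the first entry seen for each id."""
    merged = {}
    for feed_xml in feeds:
        for row in parse_entries(feed_xml):
            merged.setdefault(row.get("id"), row)
    return list(merged.values())


def to_csv(rows):
    """Render rows as CSV text, using the union of all keys as columns."""
    columns = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Passing the same feed twice shows that duplicate entries collapse.
rows = merge_by_id(SAMPLE_FEED, SAMPLE_FEED)
csv_text = to_csv(rows)
print(csv_text)
```

In a real run the feed text would come from the download step, for example via `urllib.request.urlopen`, looping over feed pages rather than fetching anything by hand.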