NCeSS e-Stat quantitative node Prof. William Browne & Prof. Jon Rasbash University of Bristol.

NCeSS e-Stat quantitative node

Prof. William Browne & Prof. Jon Rasbash

University of Bristol

e-Stat quantitative node

• Linking the ESRC Research Methods and e-Social Science programmes.

• Statistical methodology and software development researchers in Bristol.

• Computer Science (E-Science) researchers at Southampton.

• Other social science researchers at Institute of Education, Manchester and Stirling.

e-Stat quantitative node

• Node started in September 2009 and is funded for 3 years.

• We aim to produce software tools to cater for three types of user: novice practitioners, advanced practitioners and statistical algorithm developers

• Capacity building and methodology development require iterative interaction between all three groups

• We are currently in the planning stage merging the skills of the various contributors.

• We are however building on many strengths as detailed in the following slides.

MLwiN software package

• Statistics package for multilevel modelling.

• Development supported by ESRC for many years.

• Over 7,000 copies sold worldwide (free to UK academics)

• Cited in 1,500+ ISI journal articles

MCMC in MLwiN

• Computationally intensive, simulation-based Markov chain Monte Carlo (MCMC) methods.

• Statistical model estimates are obtained by generating many random draws from the posterior distribution.

• Can benefit from parallelisation in many ways.

MCMC algorithm converges on a distribution. Parameter estimates and intervals are then calculated from the simulation chains.
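
As a rough illustration of the idea above, the following is a minimal sketch of a random-walk Metropolis sampler for a single normal mean. It is not the algorithm MLwiN uses; the function names (log_posterior, metropolis, summarise), the flat prior, the known standard deviation and the tuning values are all assumptions chosen to keep the example short.

```python
import math
import random
import statistics


def log_posterior(mu, data, sigma=1.0):
    """Log posterior (up to a constant) for a normal mean under a flat prior."""
    return -sum((y - mu) ** 2 for y in data) / (2.0 * sigma ** 2)


def metropolis(data, n_iter=5000, burn_in=500, step=0.5, seed=1):
    """Random-walk Metropolis sampler; returns the chain for mu after burn-in."""
    rng = random.Random(seed)
    mu = 0.0  # arbitrary starting value (assumption)
    current = log_posterior(mu, data)
    chain = []
    for i in range(n_iter + burn_in):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Accept the proposal with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        if i >= burn_in:
            chain.append(mu)
    return chain


def summarise(chain):
    """Posterior mean and a central 95% interval computed from the chain."""
    ordered = sorted(chain)
    lower = ordered[int(0.025 * len(ordered))]
    upper = ordered[int(0.975 * len(ordered)) - 1]
    return statistics.fmean(chain), lower, upper
```

Calling summarise(metropolis(data)) on a list of observations gives a point estimate and interval taken directly from the simulated draws. Independent chains started from different seeds could be run on separate processors, which is one way such methods might benefit from parallelisation.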

Sample Size calculations via simulation

• Original work funded in a recently finished ESRC grant.

• A computationally intensive method, as it involves running the same model on many datasets and assessing how often a confidence interval contains 0.

• Ideal candidate for parallelisation.
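
The simulation approach described above can be sketched for the simplest possible case, a single mean. This is only an illustration under stated assumptions: the effect size, standard deviation, normal-approximation interval and the function names (interval_contains_zero, estimated_power, smallest_sample_size) are hypothetical and do not come from the funded software.

```python
import math
import random
import statistics


def interval_contains_zero(n, effect, sd, seed):
    """Simulate one dataset of size n; check whether the 95% CI for the mean contains 0."""
    rng = random.Random(seed)
    sample = [rng.gauss(effect, sd) for _ in range(n)]
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    # Normal-approximation interval (assumption; a t interval is also possible).
    return mean - 1.96 * se <= 0.0 <= mean + 1.96 * se


def estimated_power(n, effect=0.5, sd=1.0, n_sims=500):
    """Proportion of simulated datasets whose interval excludes 0."""
    # Each replicate depends only on its seed, so replicates are independent tasks.
    hits = sum(not interval_contains_zero(n, effect, sd, seed) for seed in range(n_sims))
    return hits / n_sims


def smallest_sample_size(target=0.8, candidates=(10, 20, 40, 80, 160)):
    """First candidate sample size whose estimated power reaches the target, else None."""
    for n in candidates:
        if estimated_power(n) >= target:
            return n
    return None
```

Because every call to interval_contains_zero is independent of the others, the loop over seeds could be handed to a pool of workers (for example with concurrent.futures) without changing the result, which is why the method suits parallelisation.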

myExperiment project (Universities of Southampton and Manchester)

Social Networking tool for sharing scientific artifacts

Component-wise approach

Also interested in interoperability between statistical software packages, so components may be existing packages.

Prototype for Algebraic Processing Component

Software Component for code generation

• Algebraic processing component outputs statistical distributions (posteriors) in MathML.

• MathML files taken as input into Python.

• Python code creates C++ code for running the model for 1 iteration.

• C++ code compiled to a Python function.

• Function can be called within Python.

• C++ code can be viewed and edited by expert users.
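
The first steps of the pipeline above (MathML in, C++ source out) can be sketched as follows. This is not the e-Stat component itself: it assumes a tiny, namespace-free subset of content MathML, and the names OPERATORS, to_cpp and generate_cpp_function are hypothetical. The compile-to-Python step is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical subset of content MathML operators mapped to C++ infix operators.
OPERATORS = {"plus": "+", "minus": "-", "times": "*", "divide": "/"}


def to_cpp(node):
    """Translate a content MathML element into a C++ expression string."""
    if node.tag in ("ci", "cn"):
        return node.text.strip()
    if node.tag == "apply":
        operator, *operands = list(node)
        if operator.tag not in OPERATORS or len(operands) < 2:
            raise ValueError(f"unsupported application: {operator.tag}")
        symbol = OPERATORS[operator.tag]
        return "(" + f" {symbol} ".join(to_cpp(child) for child in operands) + ")"
    raise ValueError(f"unsupported MathML element: {node.tag}")


def generate_cpp_function(name, mathml, arguments):
    """Wrap the translated expression in a C++ function definition."""
    expression = to_cpp(ET.fromstring(mathml.strip()))
    parameters = ", ".join(f"double {arg}" for arg in arguments)
    return f"double {name}({parameters}) {{\n    return {expression};\n}}\n"


# Illustrative input: a linear predictor beta0 * 1.0 + beta1 * x.
MATHML = """
<apply><plus/>
  <apply><times/><ci>beta0</ci><cn>1.0</cn></apply>
  <apply><times/><ci>beta1</ci><ci>x</ci></apply>
</apply>
"""

print(generate_cpp_function("linear_predictor", MATHML, ["beta0", "beta1", "x"]))
```

The generated text is ordinary C++ source, so an expert user could inspect or edit it before it is compiled, in line with the last bullet above.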

Specific areas of application in the social sciences

• Measuring segregation – proposal for complex modelling

• ESDS feasibility study project: Changing circumstances during childhood

• Social Networks in Multilevel structures

• Handling missing data via multiple imputation

• Sample size calculations

Integration into workbooks

• Model Specification and Estimation software components – freestanding Python application(s) operated by own GUI or script.

• These components along with tables, graphs, equations and diagrams can be incorporated in SAGE executable books.

• myExperiment will then act as a searchable repository for locating and sharing executable books.