NCeSS e-Stat quantitative node Prof. William Browne & Prof. Jon Rasbash University of Bristol.

Post on 28-Mar-2015


NCeSS e-Stat quantitative node

Prof. William Browne & Prof. Jon Rasbash

University of Bristol

e-Stat quantitative node

• Linking the ESRC Research Methods and e-Social Science programmes.

• Statistical methodology and software development researchers in Bristol.

• Computer Science (E-Science) researchers at Southampton.

• Other social science researchers at Institute of Education, Manchester and Stirling.

e-Stat quantitative node

• Node started in September 2009 and is funded for 3 years.

• We aim to produce software tools that cater for three types of user: novice practitioners, advanced practitioners and statistical algorithm developers.

• Capacity building and methodology development require iterative interaction between all three groups.

• We are currently in the planning stage, merging the skills of the various contributors.

• We are, however, building on many existing strengths, as detailed in the following slides.

MLwiN software package

• Statistics package for multilevel modelling.

• Development supported by ESRC for many years.

• Over 7,000 copies sold worldwide (free to UK academics)

• Cited in 1,500+ ISI journal articles

MCMC in MLwiN

• Computationally intensive, simulation-based Markov chain Monte Carlo (MCMC) methods.

• Statistical model estimates are obtained by generating many random draws from the posterior distribution.

• Can benefit from parallelisation in many ways.

MCMC algorithm converges on a distribution. Parameter estimates and intervals are then calculated from the simulation chains.
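As a rough illustration of the idea on this slide, the sketch below runs a random-walk Metropolis sampler for the mean of normally distributed data and then summarises the chain. It is not MLwiN's algorithm: the flat prior, the known `sigma`, the step size and the function names `metropolis_mean` and `summarise` are all assumptions made for the example.

```python
import math
import random
import statistics


def metropolis_mean(data, sigma=1.0, n_iter=5000, burn_in=1000, step=0.5, seed=1):
    """Random-walk Metropolis for a normal mean (known sigma, flat prior).

    Illustrative only: a single-parameter model, not a multilevel one.
    """
    rng = random.Random(seed)

    def log_post(mu):
        # Log posterior up to an additive constant.
        return -sum((y - mu) ** 2 for y in data) / (2.0 * sigma ** 2)

    mu = 0.0
    current = log_post(mu)
    chain = []
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        chain.append(mu)
    # Discard the early draws taken before the chain settled.
    return chain[burn_in:]


def summarise(chain, level=0.95):
    """Posterior mean and equal-tailed interval computed from the chain."""
    ordered = sorted(chain)
    n = len(ordered)
    lower = ordered[int(n * (1 - level) / 2)]
    upper = ordered[int(n * (1 + level) / 2) - 1]
    return statistics.mean(chain), (lower, upper)
```

Calling `summarise(metropolis_mean(data))` returns the point estimate and interval. Chains started from different seeds are independent of one another, which is one of the ways such a sampler could be run in parallel.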

Sample Size calculations via simulation

• Original work funded in a recently finished ESRC grant.

• The method is computationally intensive, as it involves fitting the same model to many simulated datasets and assessing how often a confidence interval contains 0.

• Ideal candidate for parallelisation.
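A minimal sketch of the simulation approach, under simplifying assumptions: a one-sample normal mean stands in for the multilevel model, the interval uses a normal approximation, and the names `ci_excludes_zero`, `estimate_power` and `map_fn` are hypothetical.

```python
import math
import random
import statistics


def ci_excludes_zero(job):
    """Simulate one dataset and report whether the 95% interval excludes 0."""
    n, effect, sd, seed = job
    rng = random.Random(seed)
    sample = [rng.gauss(effect, sd) for _ in range(n)]
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    # Normal-approximation interval; a real study would fit the model of interest.
    return not (mean - 1.96 * se <= 0.0 <= mean + 1.96 * se)


def estimate_power(n, effect, sd=1.0, n_sims=1000, seed=0, map_fn=map):
    """Proportion of simulated datasets whose interval excludes 0."""
    # One job per replicate, each with its own seed so replicates are independent.
    jobs = [(n, effect, sd, seed + i) for i in range(n_sims)]
    hits = sum(map_fn(ci_excludes_zero, jobs))
    return hits / n_sims
```

Because each replicate is independent, the built-in `map` could be swapped for the `map` method of a `multiprocessing.Pool` via the `map_fn` argument to spread replicates across cores. Repeating the call over a range of `n` values gives the smallest sample size that reaches a target power.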

myExperiment project (Universities of Southampton and Manchester)

Social Networking tool for sharing scientific artifacts

Component-wise approach

We are also interested in interoperability between statistical software packages, so components may be existing packages.

Prototype for Algebraic Processing Component

Software Component for code generation

• Algebraic processing component outputs statistical distributions (posteriors) in MathML.

• MathML files taken as input into Python.

• Python code creates C++ code for running the model for 1 iteration.

• C++ code compiled to a Python function.

• Function can be called within Python.

• C++ code can be viewed and edited by expert users.
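The MathML-to-C++ step might look something like the sketch below. It is not the project's prototype: it assumes content MathML, handles only a handful of operators, and the names `to_cpp` and `emit_function` are invented for illustration. The compilation of the generated C++ into a Python function is left out.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of content-MathML operators mapped to C++ infix operators.
INFIX = {"plus": "+", "minus": "-", "times": "*", "divide": "/"}


def local(node):
    """Element name without any XML namespace prefix."""
    return node.tag.split("}")[-1]


def to_cpp(node):
    """Translate a content-MathML element into a C++ expression string."""
    tag = local(node)
    if tag == "ci":  # identifier
        return node.text.strip()
    if tag == "cn":  # numeric literal, emitted as a double
        return str(float(node.text.strip()))
    if tag == "apply":
        op, *args = list(node)
        name = local(op)
        terms = [to_cpp(arg) for arg in args]
        if name == "power":
            return f"std::pow({terms[0]}, {terms[1]})"
        if name == "minus" and len(terms) == 1:  # unary minus
            return f"(-{terms[0]})"
        if name in INFIX:
            return "(" + f" {INFIX[name]} ".join(terms) + ")"
        raise ValueError(f"unsupported operator: {name}")
    raise ValueError(f"unsupported element: {tag}")


def emit_function(name, params, mathml):
    """Wrap the translated expression in a C++ function definition."""
    body = to_cpp(ET.fromstring(mathml))
    signature = ", ".join(f"double {p}" for p in params)
    return f"#include <cmath>\n\ndouble {name}({signature}) {{\n    return {body};\n}}\n"
```

For example, `emit_function("f", ["a"], "<apply><plus/><ci>a</ci><cn>2</cn></apply>")` yields C++ source for a function returning `(a + 2.0)`. Because the output is plain text, an expert user could inspect or edit it before compilation.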

Specific areas of application in the social sciences

• Measuring segregation – proposal for complex modelling

• ESDS feasibility study project: Changing circumstances during childhood

• Social Networks in Multilevel structures

• Handling missing data via multiple imputation

• Sample size calculations

Integration into workbooks

• Model Specification and Estimation software components – freestanding Python application(s) operated by own GUI or script.

• These components along with tables, graphs, equations and diagrams can be incorporated in SAGE executable books.

• myExperiment will then act as a searchable repository for locating and sharing executable books.