2012 Project Research Grant - Statistics in the Empirical Sciences


Kansliets noteringar (office notes)
Kod: 2012-43261-98001-14
Dnr:

2012 Project Research Grant - Statistics in the Empirical Sciences

Area of science: The Swedish Research Council
Announced grants: Thematic grants VR April 24 2012

Total amount for which applied (kSEK)

2013    2014    2015    2016    2017
1858    2085    2149    2216

Vetenskapsrådet, Box 1035, SE-101 38 Stockholm, tel. +46 (0)8 546 44 000, [email protected]

APPLICANT
Name (Last name, First name): Mostad, Petter
Date of birth: 641210-5899
Gender: Male
Email address: [email protected]
Academic title: Associate professor
Position: Docent Matematisk Statistik, Chalmers
Phone: +46317725315
Doctoral degree awarded (yyyy-mm-dd): 1991-10-01

WORKING ADDRESS
University/corresponding, Department, Section/Unit, Address, etc.:
Chalmers tekniska högskola
Matematiska vetenskaper
Matematisk statistik
Matematiska Vetenskaper, Chalmers tekniska högskola och Göteborgs Universitet
41296 Göteborg, Sweden

ADMINISTERING ORGANISATION
Administering Organisation: Chalmers tekniska högskola

DESCRIPTIVE DATA
Project title, Swedish (max 200 char):
Bättre metoder för beräkningar och modellering med Bayesianska nätverk, med fokus på forensiska tillämpningar

Project title, English (max 200 char):
Improving Bayesian Network computational and modeling methods, with applications in forensics

Abstract (max 1500 char):
Within the Bayesian paradigm for statistics, posterior probability distributions for variables of interest are computed based on fully specified stochastic models, which may be described in the form of a Bayesian network. For some simple networks, exact inference is possible, but in many cases, numerical methods must be used, or one must resort to MCMC simulation. However, exact bounds for the accuracy of results from MCMC simulations are often not available. In forensic applications of Bayesian networks, this can be a particular problem.

In this project, we will develop inference methods for ILDI (Inference with Low Dimensional Integration) networks, using numerical integration in such a way that precise bounds for the accuracy of results are obtained. ILDI networks contain many of the types of models we see in forensic sciences. In a cooperation between the Swedish National Laboratory for Forensic Science, the National Veterinary Institute in Sweden, and Chalmers University, we will also work with a number of example forensic applications, aiming to study and support the general process of Bayesian network model building and utilization.

Abstract language: English

2012-5994

Keywords: Bayesian statistics, Bayesian networks, Forensic statistics

Research areas: Statistics

Review panel: NT-R, NT-D1, HS-D

Classification codes (SCB) in order of priority: 10106, 10606, 50502

Aspects:

Continuation grant
Application concerns: New grant
Registration Number:
Application is also submitted to
similar to:
identical to:

ANIMAL STUDIES
Animal studies: No animal experiments

OTHER CO-WORKER
Name (Last name, First name): Nordgaard, Anders
University/corresponding, Department, Section/Unit, Address etc.: Statens Kriminaltekniska Laboratorium
Date of birth: 620824-7871
Gender: Male
Academic title: Associate professor
Doctoral degree awarded (yyyy-mm-dd): 1996-01-15

Name (Last name, First name): Andersson, Gunnar
University/corresponding, Department, Section/Unit, Address etc.: Statens Veterinärmedicinska Anstalt, Avdelning för kemi, miljö och fodersäkerhet
Date of birth: 721005-8512
Gender: Male
Academic title: PhD
Doctoral degree awarded (yyyy-mm-dd): 2002-02-01


ENCLOSED APPENDICES
A, B, B, B, C, C, C, N, S

APPLIED FUNDING: THIS APPLICATION
Funding period (planned start and end date): 2013-01-01 -- 2016-12-31

Staff/salaries (kSEK)

                              % of full time
                              in the project    2013    2014    2015    2016    2017
Main applicant
  Petter Mostad                           45     603     624     643     665
Other staff
  New PhD student                        100     818     845     873     902
  Gunnar Andersson                        20      49     205     211     218
  Anders Nordgaard                        10     116     120     125     128
Total, salaries (kSEK)                          1586    1794    1852    1913

Other costs (kSEK)

                                                2013    2014    2015    2016    2017
Conference travel                                 20      20      20      20
Research travel                                  100     100     100     100
Overhead for direct premises costs               141     159     164     170
Overhead for direct IT costs                      11      12      13      13
Total, other costs (kSEK)                        272     291     297     303

Total amount for which applied (kSEK)

2013    2014    2015    2016    2017
1858    2085    2149    2216

ALL FUNDING
Other VR-projects (granted and applied) by the applicant and co-workers, if applic. (kSEK)

Funds received by the applicant from other funding sources, incl ALF-grant (kSEK)

POPULAR SCIENCE DESCRIPTION
Popular science heading and description (max 4500 char)

Better forensic probability calculations

How certain can one be that a man whose shoe matches a shoe print at a crime scene really is the perpetrator? How certain can one be that a feed factory is the source of an infection at a farm to which it has delivered feed, if the same type of bacteria is found at both the factory and the farm? How can one determine precisely how long a dead body has been lying outdoors, after studying the insects found in the body? Questions of this type are called forensic questions, because science is used to answer questions of legal significance. As in the examples above, science often cannot give an exact answer, but one can instead say something about the probability of one thing or another. One way to arrive at such probabilities is to begin by describing the probabilities of obtaining different possible measurement results under different assumptions about what one wants to find out. If one formulates all the knowledge one has about both possible measurement results and different assumptions, and about how they may fit together, in a model, this can be done as a Bayesian network. One can then use this model to compute the probabilities of interest, given the measurements actually made.

However, simple methods for carrying out these calculations exactly do not always exist. Often one can compute the probabilities of interest approximately, but without knowing exactly how accurate the answer is. This can be a problem in several ways, especially if the probabilities must be accounted for in a courtroom. Even when mathematical methods exist for computing the probabilities with known accuracy, this is of little practical help if the methods have to be worked out and implemented anew for every example. In our project we want to develop improved general methods for these probability calculations for a type of model that we call ILDI (Inference with Low Dimensional Integration) networks. This type of network includes many of the models we see in forensic applications, and our methods will be such that the accuracy of the answer can be computed.

Before such calculations can be made, the question must first be formulated as a Bayesian network, and it is not at all obvious how best to do this; there are always many different ways of doing it. If the use of Bayesian networks in the forensic sciences is to have a greater impact, one needs to study how the model formulation process can be made easier, and also how to ensure that the computed results are understood and used correctly. We will work on these questions within a number of forensic example projects, in a cooperation between the Swedish National Laboratory of Forensic Science (Statens Kriminaltekniska Laboratorium), the National Veterinary Institute (Statens Veterinärmedicinska Anstalt), and Chalmers University of Technology (Chalmers Tekniska Högskola).

Our long-term goal is to combine the experience from both parts of the project in new software that can support both the work of formulating models and the computation of the probabilities of interest. It is likely that such software would also be useful for many other applications in research and decision making.


Appendix A

Research programme

Petter Mostad, 641210-5899, Bilaga A Page 1

Improving Bayesian Network computational and modeling methods, with applications in forensics

Purpose and aims

Inferential statistics has a large and untapped potential as a tool in many scientific areas, in

particular areas where the scientific models in use are known to be gross simplifications of

reality, so that observations include substantial variability not directly explained by the model.

In many such cases, classes of stochastic models that might be suitable exist, and inferential

methods have been worked out, yet the use of statistical methods within the field remains

limited. The reasons include both difficulties for the scientists within the field to translate

their ideas into stochastic models and to understand and trust the result, and shortcomings in

the available methods for model formulation and computational inference.

Within the Bayesian paradigm for statistics, one constructs fully specified stochastic models,

and inference is done primarily by conditioning these models on observed data. A Bayesian

Network is a way to specify such models, where the joint distribution of all the variables of

interest is specified using a sequence of conditional distributions, implicitly also specifying

the interdependence structure between the variables. Depending on the type of variables and

type of conditional distributions, one may choose between a number of methods for

computational inference. For general networks containing infinite-valued variables, there is

however no general inference method that does not depend on Markov chain Monte Carlo

(MCMC) simulation.

In this project, we will develop inference methods for Inference with Low Dimensional

Integration (ILDI) networks. An ILDI network is a Bayesian network where the posterior for

a certain subset of variables (the “variables of interest”) conditional on values at another set

of variables (the “data variables”) can be computed numerically through a series of low-

dimensional integrations. No general tool for inference in ILDI networks that does not depend

on MCMC simulation currently exists. Thus this will represent a significant extension of

available computational tools for Bayesian networks.

In the forensic sciences, a scientific argument is used to answer a legal question. The legal

question could be within criminal law, but could also be related to commercial or other

activities, e.g., “is this factory for animal feed legally responsible for the disease outbreak at

this farm?”, or “is this man the nephew and heir of the deceased?”. The scientific argument

may have a wide range of components and involve uncertainty. In the forensic setting, the

transparency and stringency of the argument is paramount. It is then natural to formulate it

using probability theory, often using a Bayesian network. The types of networks commonly

used in forensics today are those where all variables are finite-valued, but these are clearly

inappropriate for many applications involving for example continuous-valued measurements.

Using more general networks may however raise the issue that results based on MCMC

simulation are inherently uncertain, as the degree of convergence is often unproved. This is a

particular problem in the forensic context, where all results could be challenged in court, and

where various competent authorities may base decisions with large impact for individuals and

companies on results from Bayesian networks. Thus there is a need for inference methods

where the accuracy of the results can be proven. The class of ILDI networks will contain

many of the types of networks appearing in forensic applications, and our inference methods

will have accuracy bounds derived from accuracy bounds for the numerical integrations

involved.

As mentioned above, increased applied use of statistics is hindered by more than limitations in

computational tools. The difficulties of model formulation and understanding can be

substantial even when the models are quite standard from a mathematical point of view. There

is a need to study on a meta-level how model building and the use of inference results from

stochastic models can be streamlined within an organization or scientific group. The use of

ILDI networks within forensic sciences offers a good opportunity for such a study, as the use

of Bayesian network models provides a unified framework, while there is a wide diversity in

the types of applications and networks possible. Our project includes a series of example

usages of ILDI networks within two organizations, the Swedish National Laboratory of

Forensic Science (Statens Kriminaltekniska Laboratorium, SKL) and the National Veterinary

Institute in Sweden (Statens Veterinärmedicinska Anstalt, SVA). In addition to helping with

the specific example problems, the aim is to increase understanding of the process of applied

model building, and increase knowledge about how inference results from stochastic models

should be formulated in order to gain acceptance in organizations, scientific groups,

competent authorities, and courts of law.

As a later follow-up project, we plan to prototype a user-friendly computer tool implementing

support for stepwise building of Bayesian networks, and inference for ILDI networks, based

on the knowledge gained in this project.

Survey of the field

During the last few decades, Bayesian statistics has had a large number of successes in a wide

range of applications, due to both theory development and the increased availability of

computer power. Although the models used are of course very diverse, the Bayesian paradigm

means that they specify a complete stochastic model for the joint distribution of all the

variables involved: These variables include those describing questions of interest, those

describing possible data observations, and often many ancillary variables needed to efficiently

describe the relationship between the first two groups of variables. Any joint probability

density can be factorized as a product of conditional probability densities, and these

conditional distributions will indicate conditional independencies and dependencies in the

model. The graph showing such conditional dependencies, together with the conditional

distributions, is called a Bayesian network. Thus any Bayesian model can in principle be

described as a Bayesian network, and such a description may in many cases be helpful for

both model understanding and for inference.
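As a brief illustration of the factorization described above, written in notation that is assumed here rather than taken from the proposal: let $x_1, \dots, x_n$ be the variables, ordered so that parents precede children, and let $\mathrm{pa}(x_i)$ denote the parents of $x_i$ in the graph. The first equality below is the chain rule and holds for any joint density; the second holds when each variable is conditionally independent of its remaining predecessors given its parents, which is exactly what the graph encodes.

```latex
\[
p(x_1, \dots, x_n)
  = \prod_{i=1}^{n} p\bigl(x_i \mid x_1, \dots, x_{i-1}\bigr)
  = \prod_{i=1}^{n} p\bigl(x_i \mid \mathrm{pa}(x_i)\bigr)
\]
```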

The main goal in Bayesian inference is to find the posterior for the variables of interest, i.e.,

their conditional distributions given the observed values for the data variables. Using Bayes'

formula, such posteriors can always be expressed theoretically in terms of integrals. The

methods for computing posteriors fall into three broad classes:

- Analytical methods, where the relevant integrations can be computed analytically. Many commonly used families of probability distributions are exponential families, where the use of conjugate priors leads to analytically computable posteriors.

- Numerical methods, meaning methods for numerical integration.

- Simulation methods, where the goal is to obtain a sample from the posterior, so that approximate conclusions about the posterior can be drawn based on the sample. In particular, methods based on MCMC simulations have proven to be extremely efficient and powerful in generating such samples. The generality of this method, together with the availability of computer power to implement it, is a major reason for the recent success of Bayesian statistics. However, a problem with MCMC methods is that it is difficult to prove the accuracy of the results. MCMC simulations provide a chain of values that are known to converge to a sample from the correct distribution, but useful bounds for the speed of this convergence are often very difficult to establish, leading to the absence of precise bounds for the accuracy of the final result.

In many cases, a combination of the three methods for inference above is used.
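The first of the three classes above, analytical inference through conjugacy, can be illustrated with a minimal sketch. It assumes a binomial observation model with a Beta prior, a standard textbook case that is not taken from the proposal; the function names and the numbers are hypothetical.

```python
from fractions import Fraction


def beta_binomial_posterior(a, b, successes, trials):
    """Update a Beta(a, b) prior with binomial data.

    Conjugacy means the posterior is again a Beta distribution, so no
    numerical integration or simulation is needed.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return a + successes, b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution, as an exact fraction."""
    return Fraction(a, a + b)


# Hypothetical data: uniform Beta(1, 1) prior, 3 positives in 10 tests.
post_a, post_b = beta_binomial_posterior(1, 1, successes=3, trials=10)
posterior_mean = beta_mean(post_a, post_b)
```

Here the posterior is Beta(4, 8) with mean 1/3, obtained in closed form. Models outside such conjugate families are where the numerical and simulation classes become necessary.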

When a model is formulated as a Bayesian network, this provides a structure on which one

may base a general inference algorithm. There are two such algorithms that are in widespread

use:

1. When all the variables in a Bayesian network have a finite number of possible values,

all conditional distributions can be expressed in terms of tables of probabilities, and

there is an algorithm called evidence propagation which can be used to compute

posteriors analytically. These kinds of networks are sometimes called Expert Systems.

A general reference is (Cowell, Dawid, Lauritzen, & Spiegelhalter, 1999). The same

algorithm can be applied to Gaussian variables. Two well-known programs implement

the algorithm: Hugin (www.hugin.com), a commercial program marketed as a

decision support tool, and GeNIe (genie.sis.pitt.edu), a freeware program. These

programs are very valuable in their generality, but in order to handle continuous-

valued variables, they (mostly) need to use discretization, creating inaccuracies and

computationally heavy networks.

2. A version of the MCMC algorithm is Gibbs sampling, where the chain of simulated

values is produced by cycling through each non-fixed variable in the model,

simulating a new value conditionally on the already simulated values for the other

variables. When the model is formulated as a Bayesian network, the graphical

structure of the network helps identify the conditional distributions to simulate from.

This type of Gibbs sampling for Bayesian networks is implemented in a widely-used

program called WinBUGS (Lunn, Thomas, Best, & Spiegelhalter, 2000). Today a

range of tools for different platforms exist (see www.mrc-bsu.cam.ac.uk/bugs). The

programs and method can be usefully applied in a wide range of models, see for

example (Congdon, 2010).
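The Gibbs sampling scheme described in item 2 can be sketched on a toy network. The three-variable binary chain A -> B -> C, its probability tables, and all function names below are illustrative assumptions; this is not how WinBUGS is implemented. Because the toy network is finite, the exact posterior is also available by summation, which makes the simulation error visible.

```python
import random

# Hypothetical conditional probability tables for the chain A -> B -> C.
P_A1 = 0.3                        # P(A = 1)
P_B1_GIVEN_A = {0: 0.2, 1: 0.9}   # P(B = 1 | A)
P_C1_GIVEN_B = {0: 0.1, 1: 0.8}   # P(C = 1 | B)


def bernoulli_pmf(p_one, value):
    """Probability of `value` (0 or 1) when P(value = 1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def exact_posterior_a1():
    """P(A = 1 | C = 1), computed by summing the joint over B."""
    weight = {}
    for a in (0, 1):
        weight[a] = bernoulli_pmf(P_A1, a) * sum(
            bernoulli_pmf(P_B1_GIVEN_A[a], b) * P_C1_GIVEN_B[b] for b in (0, 1)
        )
    return weight[1] / (weight[0] + weight[1])


def gibbs_posterior_a1(n_iter=20000, burn_in=1000, seed=0):
    """Estimate P(A = 1 | C = 1) by cycling through the non-fixed variables."""
    rng = random.Random(seed)
    a, b = 0, 0
    hits = 0
    for i in range(n_iter + burn_in):
        # Draw A given B; C carries no extra information once B is known.
        w1 = P_A1 * bernoulli_pmf(P_B1_GIVEN_A[1], b)
        w0 = (1.0 - P_A1) * bernoulli_pmf(P_B1_GIVEN_A[0], b)
        a = 1 if rng.random() < w1 / (w0 + w1) else 0
        # Draw B given A and the observed C = 1.
        w1 = P_B1_GIVEN_A[a] * P_C1_GIVEN_B[1]
        w0 = (1.0 - P_B1_GIVEN_A[a]) * P_C1_GIVEN_B[0]
        b = 1 if rng.random() < w1 / (w0 + w1) else 0
        if i >= burn_in:
            hits += a
    return hits / n_iter


exact = exact_posterior_a1()
estimate = gibbs_posterior_a1()
```

The estimate approaches the exact value as the chain grows, but the code itself gives no guarantee on how close it is after a fixed number of iterations, which is the accuracy issue raised above for MCMC methods.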

Other methods and programs for inference in Bayesian networks tend to be more specific in

the type of underlying graphs they accept, and in the types of conditional distributions

handled. As an example, Hidden Markov Models are used in a range of applications, for

example genetics, and a large number of specialized methods and programs exist for

inference. Hidden Markov Models can be seen as a special type of Bayesian network, where

the underlying graph has a special chain structure.

In forensics, Bayesian statistics has experienced increasing acceptance during the last few

decades. This is true in particular in some specialized areas, such as forensic genetics. A

major question is here how to weigh the strength of the evidence when comparing DNA

samples from a crime scene, which may contain a mixture of DNA from several persons, to

the DNA profiles of suspects, who in some cases have been found using database searches for

particular DNA profiles. Another important question is pedigree inference, where DNA

profiles of several individuals at certain polymorphic loci are used to infer their relationship.

The most common and simplest case is that of paternity testing. Pedigree inference based on

DNA tests is currently performed by thousands of labs globally, and the program Familias

(www.familias.name), written by Petter Mostad and Thore Egeland, is a world leader [1]. We are

continuing developments of methods and programs in this area, with particular focus on

genetic markers that are linked (i.e., are not independently inherited) and/or in linkage

disequilibrium (i.e., have observed occurrences in the population that are not independent).

Common to these applications in forensic genetics is that the Bayesian framework has gained

wide acceptance as the basis for scientific arguments within the subject. The framework

allows separation between the description of likelihoods for measurements, which is the

domain of forensic experts, and descriptions of prior distributions for variables of interest,

which should be based on information outside of the forensic argument, and which is the

domain of the court. Inspired by the success in forensic genetics, statisticians are working to

increase the use and acceptance of Bayesian statistics also in other forensic fields. A general

introduction to the methodology is given in (Taroni, Aitken, Garbolino, & Biedermann,

2006). The models currently in use are however almost exclusively Bayesian networks where

all variables have a finite number of possible values.
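The separation between the expert's likelihoods and the court's prior, described above, can be made concrete with the odds form of Bayes' formula: posterior odds equal the likelihood ratio times the prior odds. The sketch below assumes two competing hypotheses; the function name and the numbers are hypothetical and serve only as an illustration.

```python
def posterior_probability(prior_probability, likelihood_ratio):
    """Combine a prior probability with an expert's likelihood ratio.

    The likelihood ratio is P(evidence | H1) / P(evidence | H2); the prior
    probability of H1 comes from information outside the forensic argument.
    """
    if not 0.0 < prior_probability < 1.0:
        raise ValueError("prior probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood ratio must be non-negative")
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# The same reported likelihood ratio under two different priors.
even_prior = posterior_probability(0.5, 1000.0)
sceptical_prior = posterior_probability(0.001, 1000.0)
```

With a likelihood ratio of 1000, an even prior gives a posterior probability of about 0.999, while a prior of 0.001 gives only about 0.5. The forensic expert reports the same number in both cases; the difference lies entirely in the part that belongs to the court.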

Project description

Task 1: Development of theory and inference methods for ILDI networks

When finding the posterior distribution for variables of interest in any Bayesian network, the first step is often to remove those ancillary variables from the network that can be removed analytically (because of conjugacy, or because they have a finite state space). Assuming this has been done, let $\theta$ denote the variables of interest, $y$ the variables of observed data, and $z$ the remaining ancillary variables, and let $\pi(y, z \mid \theta)$ denote the conditional probability density for $y$ and $z$ given $\theta$. Then the network is an ILDI network if the integral $\int \pi(y, z \mid \theta)\,dz$ can be expressed as a polynomial in low-dimensional integrals, and if the infinite-valued component of $\theta$ has a low dimension. Here, low dimension means dimension 2 or less. With some additional restrictions on the distributions, it is possible to compute not only the posterior distribution for $\theta$ numerically, but also to compute bounds for the accuracy of the result based on bounds for the accuracy of the numerical integrations. ILDI networks do not include models that contain (large) multivariate normal components, such as generalized linear mixed models (GLMMs), but they do include many types of simpler networks that often occur in applications, for example in forensics.

[1] According to the 2007 Paternity Testing Workshop of the English Speaking Working Group of the International Society for Forensic Genetics, at a test where 69 laboratories participated, 22% used Familias, 20% used DNAview, and the remaining laboratories used less widespread programs.

Theoretical questions that must be worked out include obtaining a less operational description of ILDI networks, and finding exactly what restrictions on the distributions are necessary for obtaining useful accuracy bounds for the numerical integrations, and for the final result. Good computational algorithms need to be found, balancing computational time and complexity against accuracy. As far as possible, we would also like to study inference issues beyond the computation of the posterior for $\theta$. For example, if some variable $x$ in the network represents experimental conditions under the control of the scientists, a description of how the posterior for $\theta$ depends on $x$ is useful for experimental-planning purposes. Also, a description relating the posterior for $\theta$ to a conditional distribution in another part of the network can be useful while building the model, so that the efforts specifying conditional distributions can be focused on the parts of the model where these distributions have the most influence on the result.

Our workplan is:

1. Defining some example ILDI networks, and establishing computational algorithms for

posteriors in these networks that include rigorous error bounding.

2. Generalizing the results from step 1 to results for a larger class of ILDI networks.

3. Describing theoretical boundaries for when the techniques established in steps 1 and 2

can be used, in terms of network structure and conditional distributions.

4. Addressing the other computational issues mentioned above.

Task 1 will run over the entire period of the project. Petter Mostad will be responsible, and

will perform the work in cooperation with a new PhD student. The task will be performed in

close interplay with Task 2.
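To indicate what a posterior with an accompanying error bound can look like in the simplest one-dimensional case, here is a minimal sketch. It is not the method of the proposal, which is yet to be developed. It assumes a binomial likelihood with a uniform prior, so that the unnormalized posterior is unimodal with a known mode, and it brackets each integral between lower and upper Riemann sums. Floating-point rounding is ignored, and all names are hypothetical.

```python
from math import comb


def darboux_bounds(f, a, b, mode, cells):
    """Lower and upper Riemann sums for a non-negative unimodal f on [a, b].

    On each cell the minimum of a unimodal function sits at an endpoint,
    and the maximum sits at the mode if the cell contains it, otherwise at
    an endpoint. The two sums therefore bracket the true integral.
    """
    h = (b - a) / cells
    lower = upper = 0.0
    for i in range(cells):
        left, right = a + i * h, a + (i + 1) * h
        f_left, f_right = f(left), f(right)
        lower += min(f_left, f_right) * h
        if left <= mode <= right:
            upper += f(mode) * h
        else:
            upper += max(f_left, f_right) * h
    return lower, upper


def posterior_cdf_bounds(k, n, c, cells=2000):
    """Bounds on P(theta <= c | k successes in n trials), uniform prior."""
    def kernel(t):
        return t ** k * (1.0 - t) ** (n - k)

    mode = k / n
    num_lo, num_hi = darboux_bounds(kernel, 0.0, c, mode, cells)
    rest_lo, rest_hi = darboux_bounds(kernel, c, 1.0, mode, cells)
    # N / (N + M) increases in N and decreases in M.
    return num_lo / (num_lo + rest_hi), num_hi / (num_hi + rest_lo)


def exact_posterior_cdf(k, n, c):
    """Closed-form check: Beta(k + 1, n - k + 1) CDF as a binomial tail sum."""
    return sum(
        comb(n + 1, j) * c ** j * (1.0 - c) ** (n + 1 - j)
        for j in range(k + 1, n + 2)
    )


lo, hi = posterior_cdf_bounds(k=3, n=10, c=0.5)
exact = exact_posterior_cdf(3, 10, 0.5)
```

In this conjugate example the exact answer is known and lies inside the computed interval. The point of the sketch is that the interval comes from the integration scheme itself, so the same reasoning remains available when no closed form exists, provided the integrand satisfies suitable shape restrictions.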

Task 2: Studying the processes of applied model formulation and use, with focus on

ILDI network models in forensics.

We plan a series of example projects within SKL and SVA in which we look at concrete

applications where observations, often of several different types, provide uncertain knowledge

about variables of interest. Each example project will go through the following steps:

1. An ILDI model is formulated to reflect expert knowledge about how observations

relate to variables of interest. The model building will happen in a stepwise fashion, in

cooperation between scientists within the field, scientists with experience in building

Bayesian network models, and statisticians.

2. Computational solutions for the model are found and made available. This will happen

in close connection with Task 1.

3. Building on the cooperation from step 1, documentation will be produced about the

model and results from the model, with the goal that such results can gain scientific

acceptance, acceptance in the relevant organizations, and acceptance in courts of law,

where this is relevant.

The main objectives will be:

- To ensure we work on computational solutions for networks relevant for the forensic applications.

- To study, and then support, the process of model building, i.e., the process by which a group of scientists reformulate their common ideas and arguments into a network of stochastic variables.

- To support the spread and acceptance of the use of the developed networks and the results computed from them, within the relevant organizations and, in some cases, within the court system.

Below is a list of planned example projects. For some of these, work has already started. The

list is likely to be revised and extended during the project.

1) The level of colony-forming units (CFU) of Salmonella bacteria needs to be closely

monitored and limited in several types of animal feed, in order to limit the chance of

disease outbreaks. Complicating such monitoring is the fact that the CFU are not

always uniformly distributed within the feed. The variable of interest may be the

average number of CFU per volume, or some related variable more directly connected

to measuring regulatory compliance. The data consists of results from repeated

sampling from the feed, using various volumes and various detection methods,

including selective plating and polymerase chain reaction (PCR).

2) Organic waste and compost need to be decontaminated from hazardous

microorganisms before they are allowed to be distributed on farmland as fertilizer.

Regulations specify the minimum levels of decontamination for various classes of

viral and bacterial pathogens (e.g., a 3-5 log-reduction) or a product quality criterion.

In a similar way to the example above, there is a need to develop strategies to monitor

and verify the efficiency of decontamination procedures and end product quality. The

variables of interest may be the relative amounts of the microorganisms before and

after decontamination, while the data are repeated samples as in the above example.

3) Verotoxin-producing E. coli (VTEC) are special strains of the common E. coli bacteria

that, unlike most E. coli, can lead to serious disease in humans. VTEC can spread to

humans from for example cattle farms via food products, and extensive efforts are

under way in Sweden to detect and monitor the levels of VTEC in cattle farms. A first

step is to understand and model the relationship between the level of VTEC at a cattle

farm, which would be the variable of interest, and test results from various types of

samples, taken either from individual animals or the farm environment.

4) When a Salmonella infection is confirmed at a farm, it is important to determine where

the bacteria come from, both to prevent further spread, and as part of negotiations to

determine the financial responsibility for the cleanup. Possible culprits include the

feed, transfer of live animals, visitors to the farm, and wild animals. The variable of

interest is the indicator of the source of the Salmonella on the farm. Data may include

test results attempting to detect Salmonella in the feed and feed system, at neighboring

farms, or in wild animals in the area. Data may also include genotypic and phenotypic

typing of Salmonella strains.

5) Evaluation of the evidential strength of combining different pieces of technical

evidence may be crucial to answer questions of guilt concerning a criminal activity.

For instance, consider a burglary case where several traces are found at the crime

scene and a number of potential sources of the traces have been identified. Here, the

variable of interest would be the scenario that has given rise to these traces and their

matches with potential sources. The data available are the matches themselves, but

there are also several auxiliary variables concerning the transfer and persistence of the

traces as well as their background population at the crime scene and on the identified

sources.

6) When a corpse is found after spending a considerable time outdoors, an important

source of information about time of death is often the development level of various

types of insects found in the corpse. With time of death being the variable of interest,

the data consist of the appearance and life-cycle status of a range of different species

of insects, in addition to meteorological data for the area.

7) The traceability of food products is an important issue for several reasons. One

possible approach is to exploit the fact that stable isotopes of elements occurring

naturally in the food appear at different ratios at different geographical locations.

Specifically, for a given food product, the variable of interest might be the

geographical area in which it was produced, while the available data might be the

concentrations and ratios of various isotopes.
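
Several of the examples above share a simple structure: a discrete variable of interest and data whose probability can be assessed under each of its values. As a minimal sketch of that structure for example 4, assuming hypothetical prior probabilities and likelihoods that are not taken from any real case, the posterior over candidate Salmonella sources follows from Bayes' rule:

```python
def posterior(prior, likelihood):
    """Return P(source | data) for a discrete set of candidate sources.

    prior[s]      : P(source = s) before seeing the data
    likelihood[s] : P(data | source = s)
    """
    unnormalized = {s: prior[s] * likelihood[s] for s in prior}
    total = sum(unnormalized.values())
    if total <= 0:
        raise ValueError("the data have zero probability under every source")
    return {s: w / total for s, w in unnormalized.items()}


# Hypothetical numbers, chosen only for illustration.
prior = {"feed": 0.25, "live animals": 0.25, "visitors": 0.25, "wild animals": 0.25}
likelihood = {"feed": 0.60, "live animals": 0.20, "visitors": 0.10, "wild animals": 0.10}

post = posterior(prior, likelihood)
for source, prob in post.items():
    print(f"{source}: {prob:.3f}")
```

In a full Bayesian network the likelihood terms would themselves be computed from submodels for test sensitivity, sampling, and strain typing; here they are simply assumed.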

This task will run concurrently with Task 1 throughout the entire project period, while individual

example projects should be completed in less time. The responsibility will be shared

between Anders Nordgaard (SKL), Gunnar Andersson (SVA), and Petter Mostad (Chalmers).

These three applicants are currently cooperating in a project where Ronny Hedell, who started

his PhD studies in October 2011, is studying Bayesian models for forensic problems, both at

SVA and SKL. Mostad is Hedell’s advisor, and Nordgaard and Andersson are co-advisors,

while Hedell is employed at SKL. As a part of Hedell’s project, he will also construct

Bayesian models for forensic use of PCR on crime scene samples that are contaminated with

PCR inhibitors, in cooperation with Johannes Hedman at Lund University. Hedell’s work fits

closely with the work planned for this project.

Significance

Within the Bayesian paradigm for statistics, it is natural to distinguish clearly between model

formulation (and model fitting) and computational issues given a specific model. Ideally,

those who formulate and use Bayesian models in applied settings should not need to understand

the details of how computational solutions are obtained. However, although a huge

number of special-purpose methods and programs for Bayesian inference exist, there are

curiously few programs in widespread use for inference in general-purpose Bayesian

networks. The main examples mentioned here are WinBUGS, Hugin, and GeNIe. In this

project, we aim to take a large step towards improving this situation, by deriving and

implementing inferential methods for ILDI networks. We recognize that the processes of

model building and usage can be quite difficult in applied settings, and plan to develop

methods, and ultimately software, that aid in this process. Although our focus is on forensics,

it is likely that our methods will be useful also in other application areas.
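
To illustrate what numerical integration with guaranteed accuracy can look like in the simplest one-dimensional case, the sketch below brackets a marginal likelihood between a lower and an upper Riemann sum. It is an illustration under stated assumptions, not the method proposed for ILDI networks: the model (a uniform prior on a prevalence p and k independent negative tests, giving the likelihood (1 - p)^k), the function names, and the grid size are all hypothetical, and the bracket relies on the integrand being monotone decreasing. Floating-point rounding is ignored.

```python
def bracket_integral(f, a, b, n):
    """Bracket the integral of a monotone decreasing f on [a, b].

    For a decreasing integrand, the right-endpoint Riemann sum is a lower
    bound and the left-endpoint Riemann sum is an upper bound.
    """
    h = (b - a) / n
    values = [f(a + i * h) for i in range(n + 1)]
    upper = h * sum(values[:-1])  # left endpoints overestimate
    lower = h * sum(values[1:])   # right endpoints underestimate
    return lower, upper


def all_negative_likelihood(k):
    """Likelihood of k independent negative tests at prevalence p (hypothetical model)."""
    return lambda p: (1.0 - p) ** k


k = 10
lower, upper = bracket_integral(all_negative_likelihood(k), 0.0, 1.0, 1000)
exact = 1.0 / (k + 1)  # closed form for this toy model, used only as a check
print(f"{lower:.6f} <= marginal likelihood <= {upper:.6f} (exact {exact:.6f})")
```

The width of the bracket is h * (f(a) - f(b)), so it is known before any integration is done and shrinks in proportion to the grid spacing. This is the kind of explicit accuracy statement that MCMC output does not provide directly.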

Preliminary results

As part of Hedell’s PhD project, work has started on model formulation for example projects

1, 3, and 4 mentioned above. In particular, we have had meetings and workshops at SVA

focused on model development for projects 3 and 4.

Mostad has worked with Bayesian statistics since 1992, when he started working with oil

reservoir models at the Norwegian Computing Center in Oslo. Since 1994, he has together

with Thore Egeland (currently at the Norwegian University of Life Sciences in Ås, Norway)

headed the development of Familias, a program that uses Bayesian network ideas for solving

specific problems within forensic genetics (Egeland, Mostad, Mevåg, & Stenersen, 2000).

Mostad has taught a wide range of courses in Bayesian statistics at the PhD and master's levels,

and has developed the R package lestat (cran.r-project.org/web/packages/lestat) which

implements a couple of the ideas planned to be implemented for ILDI networks.

Nordgaard has worked with Bayesian statistics and Bayesian networks as a statistician at

Linköping University (1996-2010) and now at SKL. He also organizes courses for PhD

students in the use of Bayesian networks, most recently in 2010, when Mostad participated as a

teacher. Publications include a recent paper with Hedell on forensic statistics (Nordgaard,

Hedell, & Ansell, 2012).

Andersson has worked since 2007 at SVA, where much of his research has concerned the type

of applications described in the example projects. The publication (Koyuncu, Andersson, &

Häggblom, 2010) is directly related to example project 1, and publications

(Binter, et al., 2011) and (Koyuncu, Andersson, Vos, & Häggblom, 2011) are directly related

to example project 4. Andersson is also participating in the AniBio project (for information

contact the coordinator, [email protected]) and has participated in the Biotracer

project (www.biotracer.org), both related to the tracing of hazardous microorganisms.

International and national collaboration

Nordgaard and Mostad are both members of the FORSTAT Research Group. FORSTAT

organizes annual workshops and courses in forensic statistics, and teachers are drawn from

the FORSTAT Research Group. Members include Colin Aitken, Thore Egeland, Franco

Taroni, Julia Mortera, and about 15 other scientists central in the field of forensic statistics.

Nordgaard is the agreement holder of the coordinated research project “Implementation of

Nuclear Techniques to Improve Food Traceability”, funded by the Joint FAO/IAEA Programme

on Nuclear Techniques in Food and Agriculture (www-naweb.iaea.org/nafa/fep/crp/fep-improve-traceability.html).

In the EU-funded project AniBioThreat, Andersson is task

leader for Task 1.1 “Terms, Definitions and Conceptual Modelling”, with participants from

European institutes including ISS, ANSES, BFR, IFR and RIVM, and Swedish authorities

including the National Police Board, the Board of Agriculture, and the Civil Contingencies

Agency. He also participates in Task 4.2 “Scenario-based modeling in the detection field”

with Gary Barker (IFR Norwich) and Matthias Filter (BFR).

In addition to the formal networks and projects mentioned above, all the applicants participate

in a number of informal international networks of scientists. Mostad has a decades-long

cooperation with Thore Egeland at the Norwegian University of Life Sciences in Ås, Norway

centering on forensic genetics. Daniel Kling is a PhD student in Ås working with forensic

genetics, where Egeland is the supervisor and Mostad the co-supervisor. Mostad also has a

long-running cooperation with the family-genetics group headed by Gunilla Holmlund at the

Swedish National Board of Forensic Medicine in Linköping. The cooperation has included

co-supervision of PhD student Andreas Tillmar, and supervision of several projects on the

master level. In the years 2001-2004, Mostad participated in a project funded by the

Leverhulme foundation, with focus on using Bayesian networks in forensic genetics. Contacts

continue between the members of the group, which included Steffen Lauritzen, Philip Dawid,

Thore Egeland, Julia Mortera, Nuala Sheehan, Robert Cowell, and Vanessa Didelez.

Nordgaard has a history of international cooperation within the field of environmental

statistics, including serving as assistant coordinator of the EC-funded project “Estimation of

human impact in the presence of natural fluctuations” (IMPACT) and co-organizing the

annual meeting of The International Environmetrics Society in 2006. Currently,

Nordgaard is involved (through his employment at SKL) in projects at FOI, Umeå, concerning

safety and security involving CBRNE materials. These projects are funded by MSB

(Myndigheten för Samhällsskydd och Beredskap) with the common objective of fulfilling EU

regulations within this area. Nordgaard’s role is to assist with Bayesian modeling for

source attribution of chemical or biological agents that may have been planted for criminal or

sabotage reasons.

Andersson has contacts with several researchers at SVA and the Swedish University of

Agricultural Sciences (SLU) who are regularly facing problems where the methods developed

in this project could be applied and who are willing to share data and expert knowledge. Dr

Anders Lindström is a forensic entomologist who is frequently involved in examining forensic

insect material and writing forensic reports, and also conducts research on the developmental

biology of insects. He also has an internal network of forensic entomologists. Dr Ann

Lindberg and co-workers at the Swedish Zoonosis Centre are responsible for surveillance and

reporting of zoonotic diseases (diseases in animals which may be transferred to humans) and

also conduct research to evaluate their performance, including studies on alternative sampling

strategies for VTEC. Dr Rickard Knutsson and co-workers are studying detection methods for

hazardous bacteria including Bacillus anthracis and Clostridium botulinum. Associate

professor Björn Vinnerås at SLU is currently studying large scale ammonia treatment of

sewage sludge in collaboration with the Uppsala municipal sewage treatment plant. Since 2007,

Andersson has participated in the development of models for tracing

microbiological contaminants with researchers from RIVM and IFR.

Other grants

Mostad has previously received VR grant 80506801 for the project “Statistical solutions for

forensic DNA testing”. That project is focused on the specific issue of pedigree inference in

forensic genetics. Much of the theory used there is based on Bayesian Network ideas. The

project we apply for now extends in some ways the previously funded project, but is directed

towards much more general goals.

References

Binter, Straver, Häggblom, Bruggeman, Lindqvist, Zentek, et al. (2011). Transmission and

control of Salmonella in the pig feed chain: A conceptual model. International Journal of

Food Microbiology, 145, 7-17.

Congdon. (2010). Applied Bayesian Hierarchical Methods. Chapman & Hall/CRC.

Cowell, Dawid, Lauritzen, & Spiegelhalter. (1999). Probabilistic Networks and Expert

Systems. New York: Springer.

Egeland, Mostad, Mevåg, & Stenersen. (2000). Beyond traditional paternity and identification

cases: Selecting the most probable pedigree. Forensic Science International, 110 (1).

Koyuncu, Andersson, & Häggblom. (2010). Accuracy and Sensitivity of Commercial PCR-Based

Methods for Detection of Salmonella enterica in Feed. Applied and Environmental

Microbiology, 2815-2822.

Koyuncu, Andersson, Vos, & Häggblom. (2011). DNA microarray for tracing Salmonella in

the feed chain. International Journal of Food Microbiology, 145, 18-22.

Lunn, Thomas, Best, & Spiegelhalter. (2000). WinBUGS - a Bayesian modelling framework:

concepts, structure, and extensibility. Statistics and Computing, 10, 325-337.

Nordgaard, Hedell, & Ansell. (2012). Assessment of forensic findings when alternative

explanations have different likelihoods - "Blame-the-brother"-syndrome. Science and Justice.

Taroni, Aitken, Garbolino, & Biedermann. (2006). Bayesian Networks and Probabilistic

Inference in Forensic Science. Wiley.

Appendix B: Curriculum vitae

Appendix B, Petter Mostad, 641210-5899

Curriculum Vitae

Petter Mostad 19641210-5899

University degree: Cand. Scient. in mathematics, 1987, Universitetet i Oslo.

Doctoral degree: PhD in mathematics, 1991, Princeton University, USA. Title: “Bounded K-Theory of the Bruhat-Tits Building for the Special Linear Group over the p-adics with Application to the Assembly Map”. Advisor: Gunnar Carlsson.

Postdoctoral stays: 2005 – 2006: Postdoc at Avdeling for biostatistikk, Universitetet i Oslo.

Docent qualification: Unsalaried docent at Göteborgs universitet 2006, docent at Chalmers Tekniska Högskola 2007.

Current position: Docent at Matematiska Vetenskaper, Chalmers Tekniska Högskola. Permanently employed at Chalmers since 2006-10-01. About 40 % of my time is research time.

Previous positions:

1992 – 2000: Permanent position as researcher, and from 1998 chief researcher, at Norsk Regnesentral, Oslo.

2000 – 2005: Guest researcher at Matematiska Vetenskaper, Chalmers Tekniska Högskola.

2005 – 2006: 50 % postdoc at Avdeling for biostatistikk, Universitetet i Oslo, and 50 % førsteamanuensis (associate professor) at Institutt for Helseledelse og Helseökonomi in Oslo.

Supervision: No one has completed a doctoral degree or a postdoctoral stay with me as main supervisor, but see under Other.

Deductible time: Military service at the Norwegian Defence Research Establishment, about 1 year, 1991-1992.

Other

Management and boards

Employee representative on the board of Norsk Regnesentral in Oslo, 1996-1997.

Member of Norsk Matematikkråd, 1996-1998.

Research grants awarded

In 2008, VR awarded SEK 1 950 000 for the project “Statistiska metoder för rättsgenetiska tester”, agreement ID 80506801.

Completed PhD supervision

As assistant supervisor:

Magnus Åstrand, defended his thesis on 14 February 2008. Thesis title: “Normalization and

Differential Gene Expression Analysis of Microarray Data”. Main supervisor: Professor

Mats Rudemo, Matematiska Vetenskaper, Chalmers Tekniska Högskola.

As assistant supervisor:

Andreas Tillmar, defended his thesis on 7 May 2010. Thesis title: “Populations and Statistics in

Forensic Genetics”. Main supervisor: Professor Bertil Lindblom, Linköpings universitet.

As main supervisor up to the licentiate degree:

Krzysztof Bartoszek, licentiate degree 2 December 2011. Licentiate thesis title: “Multivariate

Aspects of Phylogenetic Comparative Methods”.

Current PhD supervision

As main supervisor:

Ronny Hedell. Project: Statistical modelling in forensic evaluation of findings.

As assistant supervisor:

Daniel Kling. Project: Forensic genetics. Main supervisor: Professor Thore Egeland,

Universitetet for Miljö og Biovitenskap in Ås, Norway.

Appendix B, Anders Nordgaard, 620824-7871

Curriculum Vitae

NORDGAARD, Hans ANDERS

Born 1962-08-24

1. Master of Science in Engineering (civilingenjörsexamen), 1986, Applied Physics and Electrical Engineering, Linköpings universitet

2. Doctor of Technology (teknisk doktorsexamen), 1996, Mathematical Statistics, Linköpings universitet

3. –

4. Docent, 2011

5. Statistician (forensic specialist), Statens Kriminaltekniska Laboratorium [share of research in the position: -]

Adjunct senior lecturer (20%) in Statistics, Institutionen för datavetenskap, Linköpings universitet [share of research in the position: 0%]

6. Senior lecturer (universitetslektor) in Statistics, Linköpings universitet (Matematiska institutionen 1996-2007, Institutionen för datavetenskap 2007-2010)

Lecturer (universitetsadjunkt) in Statistics, Matematiska institutionen, Linköpings universitet, 1993-1996

Lecturer (universitetsadjunkt) in Mathematical Statistics, Matematiska institutionen, Linköpings universitet, 1992-1993

PhD student in Mathematical Statistics, Matematiska institutionen, Linköpings universitet, 1986-1992 [assistant with study grant (utbildningsbidrag) 1986-1990]

7. Director of studies in Statistics, Linköpings universitet, 1996-2002

8. -

Appendix B, Gunnar Andersson, 721005-8512

CURRICULUM VITAE 2012 04 16

Name: Mats Gunnar Andersson

Address: Kantarellv 19, 75645 Uppsala, SWEDEN

Phone: home: +46 18 505131; Work: +46 18 674082

e-mail: [email protected]

Date of birth: Oct, 5 1972

University degree

1997 Master of Science, Uppsala University. Subject: Biology

Doctoral degree

2002 Doctor of Philosophy, Uppsala University, Dept. Evolutionary Biology, Div. Comparative Physiology. Subject: Biology. Title: Differentiation and Pathogenicity within the Saprolegniales.

Postdoctoral experience

2002-2004 Postdoc, Dept. Medical Biochemistry and Microbiology (IMBIM), Uppsala University

2005-2008 Postdoc, Linnaeus Centre for Bioinformatics, Uppsala University

Docent degree

Present employment

2007- National Veterinary Institute, Uppsala, Dept. Chemistry, Environment and Feed Hygiene (KMF). Address: National Veterinary Institute (SVA), SE-751 89 Uppsala, Sweden, +4618 674000

Previous employments

1997-2002 PhD student, Div. Comparative Physiology, Uppsala University. PhD position.

2002 Teacher, High school of Gävle, Dept. Mathematics, Science and Computer Science. Part time, short-term contract, Apr-Jun 2002.

2002-2004 Postdoc, Dept. Medical Biochemistry and Microbiology (IMBIM), Uppsala University. August 2002 – July 2004. Several short-term contracts, 3 months – 1 year.

2004-2005 Teacher, Eriksbergsskolan in Uppsala. Substitute teacher covering sick leave. Sept 2004 – Mar 2005 by the hour. Apr-Dec 2005 full time.

2005-2008 Researcher, Linnaeus Centre for Bioinformatics, Uppsala. Short-term contracts (researcher). Full time Jan 2005 – June 2007. Part time July-Dec 2008.

Interruptions in research

Approximately 3 months of parental leave, the majority in July 2000 and July – August 2010. From Aug 2004 to Dec 2005 I changed focus from laboratory biology to bioinformatics. Studies in mathematics (Uppsala University) and bioinformatics (Linnaeus Centre for Bioinformatics, Uppsala) were made possible by working as a substitute teacher for approximately 1.5 years. At SVA I have primarily been working for two large EU-funded projects (Biotracer and AniBioThreat). My duties in these projects have included reporting, training, statistical support, networking and dissemination activities, leaving only limited time for independent research.

PhD students for which I have been main supervisor

Conferences, seminars

2010. Feb 1-2. Invited speaker at the European Commission expert committee meeting on agricultural contaminants – fusarium toxin forum, Brussels. Title: “Automatic and manual sampling for ochratoxin A in barley, impact of sampling and sample preparation on measurement uncertainty.”

2010. June 28-30. Oral presentation. “Sampling for traceability and conformance testing of Salmonella in animal feedingstuffs.” I3S, St. Malo, France.

2011. Oct 20. “Sampling of feed”. Invited presentation at seminar with Swedish Laboratory Response Network, Kista, Sweden.

2011. Jan 26. “Levels and Distribution of Salmonella in feed, implications for sampling”. Cross-disciplinary seminar on Salmonella, SVA, Uppsala, Sweden.

PhD students for which I am or have been deputy supervisor

2011. Ronny Hedell. National Forensic Laboratory (SKL), Linköping, Sweden (ongoing).

2009. Marcin Kierczak, Linnaeus Centre for Bioinformatics, Uppsala. (2011).

Project students for which I have been associated supervisor

2011. Jin Wen. “Educational Software for Salmonella Biotracing in Feed Chain: User Interface Design”. Master thesis in Human-Computer Interaction, 30 credits, Department of Informatics and Media, Uppsala University.

2010. Anders Sundström. Master thesis in Bioinformatics. “Prediction models for emergence of mycotoxins in grain”.

2008. Camilla Ohlsson. Student project in Biomedicine. “Adjustment of the MPN method for the quantification of Salmonella in animal feed”.

2008. Mikael Andersson. Master thesis in statistics. “Sampling for biological contagions – What strategies may give a good base for decisions?”

Research profile

I started my scientific career as a molecular biologist. During my PhD I studied the interactions between pathogenic fungi and aquatic animals, and during my first postdoc I studied how adenovirus escapes the newly discovered defence mechanism RNA interference. My interest in bioinformatics led me to shift my research focus. After working a year as a teacher and studying mathematics and bioinformatics I was offered a postdoc position at the Linnaeus Centre for Bioinformatics (LCB), where I was involved in various machine learning projects and was also co-supervisor for a PhD student. My main focus was prediction uncertainty in medical decision support systems.

Since June 2007 I have worked at SVA. Until the end of 2010 I mainly worked in the EU-funded project Biotracer. The focus of the project was to design models for improved traceability of unintended microorganisms and their by-products in food and feed. My background, which combines experimental biology with bioinformatics, proved very useful in this project, and I was leader for 2 tasks: “Data generation in the feed chain and validation of sampling plans” and “Advanced sampling for sporeforming bacteria”. The latter task focused on the development of a conceptual model for how sampling plans can be designed in case of a bioterror incident and a standard operating procedure for sampling plan design. In the first year I was also the project leader for two theoretical tasks related to sampling: “Sampling plan for the feed chain” and “Sampling of bioterror agents”.

Since the end of 2011 I have worked in the EU project AniBioThreat (HOME/2009/ISEC/AG191). My main responsibility is to be task leader for Task 1.1 “Terms, Definitions and Conceptual Modelling”, which is a horizontal task with links to several other tasks. In addition I take part in the task “Scenario based modeling in the detection field”. In connection with the AniBioThreat project I was in October appointed assistant supervisor for a PhD student at the Swedish National Laboratory of Forensic Sciences, with the focus on statistical methods for planning and evaluating results from sampling for hazardous microorganisms.

* Knutsson, R., B. van Rotterdam, et al. (2011). "Accidental and deliberate microbiological contamination in the feed and food chains -- How biotraceability may improve the response to bioterrorism." International Journal of Food Microbiology 145(Supplement 1): S123-S128.

Appendix C, Petter Mostad, 641210-5899

Number of citations obtained from Google Scholar

1 Peer-reviewed articles

Bartoszek, K; Pienaar, J; Mostad, P; Andersson, S; Hansen, T: ”A comparative

method for studying multivariate adaptation”. Accepted for publication in Journal of

Theoretical Biology. Number of citations: 0

* Tillmar, A.O.; Egeland, T; Lindblom, B.; Holmlund G; Mostad P: ”Using X-

chromosomal markers in relationship testing: Calculation of likelihood ratios taking

both linkage and linkage disequilibrium into account”. Forensic Science International:

Genetics, November 2011. Number of citations: 8

* Egeland, T; Dawid, AP; Mortera, J; Mostad, P; Tillmar, T: “Response to: DNA

identification by pedigree likelihood ratio accommodating population substructure and

mutations.” Investigative Genetics, 2011. Number of citations: 1

* Tillmar, A.O.; Mostad, P; Egeland, T; Lindblom, B; Holmlund, G; Montelius, K:

"Analysis of linkage and linkage disequilibrium for eight X-STR markers". Forensic

Sci. Int. Gene. 2008. Number of citations: 23

Bonander Nicklas, Ferndahl Cecilia, Mostad Petter, Wilks Martin D B, Chang Celia,

Showe Louise, Gustafsson Lena, Larsson Christer, Bill Roslyn M: "Transcriptome

analysis of a respiratory Saccharomyces cerevisiae strain suggests the expression of its

phenotype is glucose insensitive and predominantly controlled by Hap4, Cat8, and

Mig1". BMC Genomics 2008, 9:365. Number of citations: 9

Åstrand Magnus, Mostad Petter, Rudemo Mats: "Empirical Bayes models for multiple

probe type microarrays at the probe level". BMC Bioinformatics. 2008 Mar 20; 9(1).

Number of citations: 6

* Karlsson, Andreas; Holmlund, Gunilla; Egeland, Thore; Mostad, Petter: “DNA-

testing for immigration cases: The risk of erroneous conclusions”. Forensic Science

International, 2007. Number of citations: 23

Larsson Erik, Lindahl Per, Mostad Petter: HeliCis: a DNA motif discovery tool for

colocalized motif pairs with periodic spacing, BMC Bioinformatics 2007, 8:418.

Number of citations: 8

Åstrand Magnus, Mostad Petter, Rudemo Mats: "Improved covariance matrix

estimators for weighted analysis of microarray data", Journal of Computational

Biology, 2007 Dec; 14(10): 1353-67. Number of citations: 8

L He, Y Sun, J Patrakka, P Mostad, J Norlin, Z Xiao, J Andrae, K Tryggvason,

T Samuelsson, C Betsholtz and M Takemoto: "Glomerulus-specific mRNA

transcripts and proteins identified through kidney expressed sequence tag database

analysis". Kidney International, February 2007. Number of citations: 21

Minoru Takemoto, Liqun He, Jenny Norlin, Jaakko Patrakka, Zhijie Xiao, Tatiana

Petrova, Cecilia Bondjers, Julia Asp, Elisabet Wallgard, Ying Sun, Tore Samuelsson,

Petter Mostad, Samuel Lundin, Naoyuki Miura, Yoshikazu Sado, Kari Alitalo, Susan

E Quaggin, Karl Tryggvason and Christer Betsholtz: "Large-scale identification of

genes implicated in kidney glomerulus development and function". The EMBO

Journal, February 2006; 25(5): 1160-74. Number of citations: 74

Mostad PF, Egeland T, Cowell RG, Bosnes V, Braaten Ø: "The quest for a donor:

Probability based methods offer help" Statistical Research Paper 26, Sir John Cass

Business School, City University London, Nov 2005. Number of citations: 0

Bonander, N.; Hedfalk, K.; Larsson, C.; Mostad, P.; Chang, C.; Gustafsson, L.; Bill,

R.: "Design of Improved Membrane Protein Production Experiments: Quantitation of

the Host Response". Protein Science 2005 14: 1729-1740. Number of citations: 39

Nelander, S.; Larsson, E.; Kristiansson, E.; Månsson, R.; Nerman, O.; Sigvardsson,

M.; Mostad, P.; Lindahl, P.: "Predictive screening for regulators of conserved

functional gene modules (gene batteries) in mammals". BMC Genomics 2005, 6:68.

Number of citations: 29

Nelander, S.; Mostad, P.; Lindahl, P.:"Prediction of cell type-specific gene modules:

identification and initial characterization of a core set of smooth muscle-specific

genes." Genome Res. 2003 Aug;13(8):1838-54. Number of citations: 33

Ståhlberg, Anders; Åman, Pierre; Ridell, Börje; Mostad, Petter; Kubista, Mikael: "A

quantitative real-time PCR method for detection of B-lymphocyte monoclonality by

comparison of kappa and lambda immunoglobulin light chain expression". Clinical

Chemistry, 2003 Jan; 49(1):51-9. Number of citations: 96

* Egeland, T; Mostad, P; Mevåg, B; Stenersen, M: “Beyond traditional paternity and

identification cases: Selecting the most probable pedigree”. Forensic Science

International, 2000. Number of citations: 76

3 Review articles, book chapters

Mostad, P: ”Some Applications of Bayesian Statistics”. Book chapter from “New

Directions in the Mathematical and Computer Sciences”, Editors Ekhaguere, Nwozo,

publications of the ICMCS, Nigeria, 2008. Number of citations: 0

5 Publicly available computer programs developed by the applicant

Familias: www.familias.name and cran.r-project.org/web/packages/Familias

Lestat: http://cran.r-project.org/web/packages/lestat

Appendix C, Anders Nordgaard, 620824-7871

List of publications, Anders Nordgaard

1. Peer-reviewed articles

Nordgaard A., Hedell R., Ansell R. Assessment of forensic findings when alternative

explanations have different likelihoods – “Blame-the-brother”-syndrome. Science and Justice 2012. [doi: 10.1016/j.scijus.2011.12.001]

Nordgaard A., Hedberg K., Widén C., Ansell R. (2012). Comments on ”The database

search problem” with respect to a recent publication in Forensic Science International.

Letter to the editor. Forensic Science International 217: e32-e33.

[doi:10.1016/j.forsciint.2011.11.023]

Nordgaard A., Ansell R., Drotz W., Jaeger L. (2012) Scale of conclusions for the

value of evidence. Law, Probability and Risk 11(1): 1-24. [doi:10.1093/lpr/mgr020]

Nordgaard A. & Höglund T. (2011). Assessment of Approximate Likelihood Ratios

from Continuous Distributions: A Case Study of Digital Camera Identification.

Journal of Forensic Sciences 56(2): 390-402.

Hedman J., Ansell R., Nordgaard A. (2010). A ranking index for quality assessment of

forensic DNA profiles. BMC Research Notes 2010 3:290.

Hedman J., Nordgaard A., Dufva C., Rasmusson B., Ansell R. & Rådström P. (2010).

Synergy between DNA polymerases increases polymerase chain reaction inhibitor

tolerance in forensic DNA analysis. Analytical Biochemistry 405: 192-200.

Hedman J., Nordgaard A., Rasmusson B., Ansell R. & Rådström P. (2009). Improved

forensic DNA analysis through the use of alternative DNA polymerases and statistical

modeling of DNA profiles. Biotechniques 47 (5): 951-958.

Nordgaard A. & Grimvall A. (2006). A resampling technique for estimating the power

of non-parametric trend tests. Environmetrics 17: 257-267.

Nordgaard A. (2005). Quantifying experience in sample size determination for drug

analysis of seized drugs. Law, Probability and Risk 4: 217-225.

Nordgaard A. & Hjorth U. (1993). Statistical extrapolation of nutrient concentrations

in the Baltic Sea. Environmetrics 4: 279-309.

2. Peer-reviewed conference contributions

Digréus P., Andersson A-C., Nilsson J., Dufva C., Nordgaard A., Ansell R. (2011).

Contamination monitoring in the forensic DNA laboratory and a simple graphical

model for unbiased EPG classification. Forensic Science International: Genetics, Supplement Series, 2011;3:e299-e300.

Hedell R., Nordgaard A., Ansell R. (2011). Discrepancies between forensic DNA

databases. Forensic Science International: Genetics, Supplement Series, 2011;3:e135-e136.

Nordgaard A. (2003) Impact of Sampling Frequency on the Power of Nonparametric

Tests for Water Quality Trends. In: The Information Society and Enlargement of the

European Union, part 2 (A. Gnauck, R. Heindrich, eds.). Metropolis-Verlag,

Marburg.

Libiseller C. & Nordgaard A. (2002). Variance Reduction for Trend Analysis of

Hydrochemical Data in Brackish Waters. In: Environmental Communication in the

Information Society, part 1 (W. Pillmann, K. Tochtermann, eds.) ISEP, Vienna.

Nordgaard A. (1992). Resampling Stochastic Processes using a Bootstrap Approach.

In: Bootstrapping and Related Techniques (K.H. Jöckel, G. Rothe, W. Sendler, eds.)

Lecture notes in Econometrics and Mathematical Systems 376. Springer-Verlag,

Berlin.

3. Review articles, book chapters, books

Lundquist P., Nordgaard A. (2011). Statistisk analys av narkotikahalter i material från

polisbeslag analyserade på SKL. SKL Rapport 2011:12. Statens Kriminaltekniska

Laboratorium, Linköping, Sweden.

Nordgaard A., Wistedt I., Drotz W., Elmqvist J., Höglund T., Jaeger L., Torbjörnsson

M., Palmborg J., Sullivan S., Wigilius I. (2010). Uppfattning av värdeord i

sakkunnigutlåtanden – En studie genomförd bland olika aktörer i rättsprocessen i

Sverige. SKL Rapport 2010:01. Statens Kriminaltekniska Laboratorium, Linköping,

Sweden.

4. Popular science articles

Nordgaard A. (2011). Resultatvärdering med Bayesianska nätverk. Kriminalteknik 4,

2011. Statens Kriminaltekniska Laboratorium, Linköping, Sweden.

Nordgaard A. (2011). SKL:s utlåtandeskala. Kriminalteknik 2, 2011. Statens

Kriminaltekniska Laboratorium, Linköping, Sweden.

Bilaga C, Gunnar Andersson, 721005-8512

List of publications, Gunnar Andersson, April 16, 2012.

* Articles of relevance for this application

Peer reviewed articles

Kruczyk M., Zetterberg, H., Hansson, O., Rolstad, S., Minthon, L., Wallin, A., Blennow, K., Komorowski, J., Andersson, M.G. (2012). Monte Carlo Feature Selection and Rule-Based models to predict Alzheimer’s disease in mild cognitive impairment. Journal of Neural Transmission (accepted for publication).

* Reiter E.V., Dutton M.F., Agus A., Nordkvist E., Mwanza M.F., Njobeh P.B., Prawano D., Häggblom P., Razzazi-Fazeli E., Zentek J., Andersson M.G. (2011). Uncertainty from Sampling in Measurements of Aflatoxins in Animal Feedingstuffs: Application of the Eurachem/CITAC guidelines. Analyst, 136(19), 4059-69.

* Andersson M.G., Reiter E.V., Lindqvist P.-A., Razzazi-Fazeli E., Häggblom P. (2011). Comparison of manual and automatic sampling for monitoring ochratoxin A in barley grain. Food Addit Contam Part A Chem Anal Control Expo Risk Assess. 28(8): 1066-75.

* Binter C., Straver J.M., Häggblom P., Bruggeman G., Lindqvist P.-A., Zentek J., Andersson M.G. (2011). "Transmission and control of Salmonella in the pig feed chain: A conceptual model." International Journal of Food Microbiology 145(Supplement 1): S7-S17.

* Koyuncu S., Andersson M.G., Vos P., Häggblom P. (2011). DNA microarray for tracing Salmonella in the feed chain. International Journal of Food Microbiology (Special issue BIOTRACER) 145, 18-22.

* Koyuncu, C., Andersson, M. G., Häggblom, P. (2010). Accuracy and sensitivity of commercial PCR-based methods for detection of Salmonella in feed. Applied and Environmental Microbiology 76(9), 2815-2822.

Cerenius, L., Liu, H., Zhang, Y., Rimphanitchayakit, V., Tassanakajon, A., Andersson, M. G., Söderhäll, K., Söderhäll, I. (2009). High sequence variability among hemocyte-specific Kazal-type proteinase inhibitors in decapod crustaceans. Developmental & Comparative Immunology 34(1), pp. 69-75.

Hvidsten T. R., Lægreid A., Kryshtafovych A., Andersson M.G., Fidelis, K., Komorowski, J. (2009). A comprehensive analysis of the structure-function relationship in proteins based on local structure similarity. PLoS ONE 4(7).

Andersson, G., Xu, N., Akusjärvi, G. (2007). In vitro methods to study RNA interference during an adenovirus infection. Methods in Molecular Medicine 131, 47-61.

Andersson M. G., Haasnoot J. P. C., Xu N., Berenjian S., Berkhout B., Akusjärvi, G. (2005). Suppression of RNA interference by adenovirus VA RNA. Journal of Virology (15), pp. 9556-65.

Roya, F., Andersson, M. G., Bangyeekhun, E., Cerenius, L., Múzquiz, J. L., Söderhäll, K. (2004). Physiological and genetic characterisation of some Aphanomyces strains associated with crayfish mortalities. Vet. Microbiol. 104(1-2): 103-112.

Andersson, M. G. and Cerenius, L. (2002). Pumilio homologue from Saprolegnia parasitica specifically expressed in undifferentiated spore cysts. Eukaryotic Cell 1(1), pp. 105-111.

Andersson, M. G. and Cerenius, L. (2002). Analysis of chitinase expression in the crayfish plague fungus, Aphanomyces astaci. Dis Aquat Org 51(2), pp. 139-147.

Peer reviewed conference proceedings

* Andersson, M.G., Straver, J., Löfström, C., Lindqvist, P-A, and Häggblom, P. (2010). Sampling for traceability and conformance testing of Salmonella in animal feedingstuffs. International symposium of Salmonella and salmonellosis, June 28-30, 2010, St Malo, France. Proceedings, pp. 69-72. Ploufragan, France: ZOOPOLE développement - ISPAIA, 2010.

Review articles and chapters in books

* Knutsson, R., van Rotterdam, B., Fach, P., De Medici, D., Fricker, M., Löfström, C., Ågren, J., Segerman, B., Andersson, M.G., Wielinga P., Fenicia, L., Skiby, J., Schultz A.C., Ehling-Schulz M. (2011). "Accidental and deliberate microbiological contamination in the feed and food chains -- How biotraceability may improve the response to bioterrorism." International Journal of Food Microbiology 145(Supplement 1): S123-S128.

Cerenius, L., Andersson, M.G., and Söderhäll, K. (2009). Aphanomyces astaci and Crustaceans. In Oomycete Genetics and Genomics: Diversity, Interactions and Research Tools (Lamour, K., Kamoun, S., eds). Wiley-Blackwell.

Andersson, M.G., Xu, N. and Akusjärvi, G. (2005). In vitro methods to study RNA interference in adenovirus-infected cells. In Methods in Molecular Biology. Ed. W.S.M. Wold. Humana Press Inc., Totowa, New Jersey. Vol 131, pp. 47-61. ISBN 978-1-58829-901-7.

Popular science articles

* Andersson, M.G., Häggblom, P. (2009). Sampling for contaminants in feed. Feed International, March 2009, pp. 16-19.

Häggblom, P. and Andersson, M.G. (2009). Mögelgifter i foder och livsmedel påverkar folkhälsan. SVA-vet 2: 30-31.

Appendix N, Petter Mostad, 641210-5899

Budget

Amounts in kSEK.

Item                     % of full time    2013    2014    2015    2016
Petter Mostad                        45     603     624     643     665
New PhD student                     100     818     845     873     902
Gunnar Andersson                     20      49     205     211     218
Anders Nordgaard                     10     116     120     125     128
Conference travel                             20      20      20      20
Research travel                              100     100     100     100
Direct premises costs                        141     159     164     170
Direct IT costs                               11      12      13      13
TOTAL                                       1858    2085    2149    2216

The budget is calculated on the basis of Chalmers' full-cost model, and Chalmers' overhead template is applied to all salary costs. Gunnar Andersson applies for funding for the period 2013-10-01 to 2016-12-31, while all other persons apply for the period 2013-01-01 to 2016-12-31.

The project would be very difficult to carry out without a new PhD student. Its interdisciplinary nature also means that staff resources are needed at all three cooperating institutions. Conference travel is an important part of such a project, especially when a new PhD student takes part. Research travel is necessary because the project is a collaboration between Chalmers in Göteborg, SKL in Linköping, and SVA in Uppsala. Premises and IT costs are added according to Chalmers' template.

No funding for this project has been granted by, or applied for from, other funding bodies.
