Exploring Metabolomic data with recursive partitioning Metabolomic Workshop NISS July 14-15, 2005.
University of North Carolina Wilmington
Why study metabolites?
• Metabolomics – the global study of all small molecules produced in the human body
• Biochemical consequences of environment, drugs, and mutations can be observed directly through metabolites
• Understand how drugs work, how they interact, and what side effects they may have
• ~2500 metabolites
Challenges of metabolomic data
• Non-normal distributions
• Outliers
• Informative missing values
• High correlation among metabolites
• n < p problem (n = number of biological samples, p = number of metabolites)
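To make the n < p bullet concrete, here is a small simulation sketch. Everything in it is an assumption for illustration: the data are pure random noise, the sizes (20 samples, 200 variables) are arbitrary, and the unadjusted Pearson chi-square cut-off of 3.84 (1 degree of freedom, 5% level) is a stand-in for whatever test a real analysis would use. It counts how many noise variables still produce a "significant" best split when many variables are searched over few samples.

```python
import random

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def best_stat(values, labels):
    """Largest chi-square over all cut points of a single variable."""
    best = 0.0
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        cut = (lo + hi) / 2.0
        a = sum(1 for v, y in zip(values, labels) if v <= cut and y == 1)
        b = sum(1 for v, y in zip(values, labels) if v <= cut and y == 0)
        c = sum(1 for v, y in zip(values, labels) if v > cut and y == 1)
        d = sum(1 for v, y in zip(values, labels) if v > cut and y == 0)
        best = max(best, chi_square_2x2(a, b, c, d))
    return best

# Simulated data only: no variable carries any real group information.
rng = random.Random(0)
n_samples, n_variables = 20, 200
labels = [0] * 10 + [1] * 10
noise = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)]
         for _ in range(n_variables)]

stats = [best_stat(column, labels) for column in noise]
n_spurious = sum(1 for s in stats if s >= 3.84)
print(f"{n_spurious} of {n_variables} noise variables pass the unadjusted cut-off")
```

Because each variable is tested at many cut points and there are many variables, some noise variables clear the threshold by chance, which is why a split search on n < p data needs some form of multiplicity adjustment.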
Why recursive partitioning?
• Is fairly robust to non-normal data
• Missing values are not an issue
• Correlation among variables is not an issue
• Useful for discovering outliers
• Is efficient at handling large p, small n data sets
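One reason tree-style splitting tolerates skewed data and extreme values is that a cut point depends only on the ordering of the observations, so any strictly increasing transformation yields the same partition. The sketch below illustrates this with made-up intensities (including one extreme value) and a Gini-based split search; the Gini criterion is an assumption chosen for brevity, not the criterion used by any particular software.

```python
import math

def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(values, labels):
    """Return the set of sample indices sent to the left node by the best cut."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    best_score, best_left = None, None
    for k in range(1, len(order)):
        if values[order[k - 1]] == values[order[k]]:
            continue  # no cut point exists between tied values
        left = [labels[i] for i in order[:k]]
        right = [labels[i] for i in order[k:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(order)
        if best_score is None or score < best_score:
            best_score, best_left = score, set(order[:k])
    return best_left

# Hypothetical intensities for one metabolite; the last value is an extreme outlier.
intensity = [1.2, 0.8, 1.5, 30.0, 45.0, 2500.0]
group = [0, 0, 0, 1, 1, 1]

raw_partition = best_split(intensity, group)
log_partition = best_split([math.log(v) for v in intensity], group)
print(raw_partition == log_partition)
```

The raw and log-transformed values produce the same left and right nodes, so the choice of scale does not change the tree.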
How recursive partitioning works
• Recursive partitioning efficiently searches through all of the variables and finds the one with the best split (most significant)
• Once data is split or “partitioned” on this variable, the resulting daughter nodes are more homogeneous
• Now each daughter node is explored to find the best split
• This process is continued until no significant split remains
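The steps above can be sketched in a few lines. This is a minimal illustration, not the FIRM algorithm: it assumes a binary outcome, binary splits on numeric variables, and an unadjusted Pearson chi-square statistic with a cut-off of 3.84 (1 degree of freedom, 5% level) as the stand-in for "significant". The function names and toy data are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def best_split(rows, labels):
    """Search every variable and cut point; return (stat, var, cut) or None."""
    best = None
    for var in range(len(rows[0])):
        distinct = sorted(set(r[var] for r in rows))
        for lo, hi in zip(distinct, distinct[1:]):
            cut = (lo + hi) / 2.0
            a = sum(1 for r, y in zip(rows, labels) if r[var] <= cut and y == 1)
            b = sum(1 for r, y in zip(rows, labels) if r[var] <= cut and y == 0)
            c = sum(1 for r, y in zip(rows, labels) if r[var] > cut and y == 1)
            d = sum(1 for r, y in zip(rows, labels) if r[var] > cut and y == 0)
            stat = chi_square_2x2(a, b, c, d)
            if best is None or stat > best[0]:
                best = (stat, var, cut)
    return best

CRITICAL = 3.84  # assumed stopping threshold, with no multiplicity adjustment

def grow(rows, labels):
    """Split on the best variable, then recurse into each daughter node."""
    split = best_split(rows, labels) if rows else None
    if split is None or split[0] < CRITICAL:
        return {"n": len(labels), "cases": sum(labels)}  # terminal node
    _, var, cut = split
    left = [(r, y) for r, y in zip(rows, labels) if r[var] <= cut]
    right = [(r, y) for r, y in zip(rows, labels) if r[var] > cut]
    return {
        "var": var,
        "cut": cut,
        "left": grow([r for r, _ in left], [y for _, y in left]),
        "right": grow([r for r, _ in right], [y for _, y in right]),
    }

# Toy data: variable 0 separates the groups, variable 1 does not.
rows = [(1.0, 7.0), (2.0, 3.0), (3.0, 9.0), (4.0, 1.0),
        (6.0, 8.0), (7.0, 2.0), (8.0, 6.0), (9.0, 4.0)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
tree = grow(rows, labels)
print(tree)
```

On the toy data the root splits on variable 0, and both daughter nodes are pure, so no further significant split is found and the recursion stops.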
Multiple Trees
• Not all effects are necessarily found in a single tree
• In any node, there may be more than one significant variable
• Creating multiple trees may reveal a number of possible effects
• Gain an understanding of interactions/correlations among metabolites
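One simple way to see why a single tree can hide effects is to list every variable whose best split in a node passes the threshold, not only the winner. The sketch below does this for one node. The metabolite names and values are invented, and the unadjusted chi-square cut-off of 3.84 is an assumed stand-in for a proper significance test; each listed variable could serve as the root of an alternative tree.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def rank_candidate_splits(data, labels, critical=3.84):
    """Return [(name, cut, stat)] for every variable whose best split passes
    the threshold, ordered from strongest to weakest."""
    ranked = []
    for name, values in data.items():
        best_stat, best_cut = 0.0, None
        distinct = sorted(set(values))
        for lo, hi in zip(distinct, distinct[1:]):
            cut = (lo + hi) / 2.0
            a = sum(1 for v, y in zip(values, labels) if v <= cut and y == 1)
            b = sum(1 for v, y in zip(values, labels) if v <= cut and y == 0)
            c = sum(1 for v, y in zip(values, labels) if v > cut and y == 1)
            d = sum(1 for v, y in zip(values, labels) if v > cut and y == 0)
            stat = chi_square_2x2(a, b, c, d)
            if stat > best_stat:
                best_stat, best_cut = stat, cut
        if best_cut is not None and best_stat >= critical:
            ranked.append((name, best_cut, best_stat))
    return sorted(ranked, key=lambda item: item[2], reverse=True)

# Hypothetical node: met_A and met_B both track the groups, met_C does not.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
data = {
    "met_A": [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0],
    "met_B": [10.0, 12.0, 11.0, 20.0, 19.0, 14.0, 21.0, 22.0],
    "met_C": [5.0, 1.0, 4.0, 2.0, 3.0, 6.0, 1.5, 4.5],
}
ranked = rank_candidate_splits(data, labels)
for name, cut, stat in ranked:
    print(f"{name}: cut at {cut}, chi-square {stat:.2f}")
```

A single tree would report only met_A at this node; growing a second tree that starts from met_B would surface the weaker but still notable effect, and comparing the two hints at how the metabolites relate to each other.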
Software
• Helix Tree (Partitionator)
• www.goldenhelix.com
• Uses Formal Inference-based Recursive Modeling (FIRM) developed by Douglas Hawkins
• Anyone can download a free 7-day trial (webinars are available to assist in using the software)
Illustration of Software
• Data
  – 317 metabolites – LC/MS and GC/MS
  – 63 biological samples
  – Want to discover which metabolites differentiate between the diseased group and the “healthy” individuals (within the diseased group there is a subset of individuals currently taking drugs)