CDT PROJECTS 2013-14
John Keane, Software Systems Group
1. Data Analytics / Big Data
2. Parallel & Distributed Systems
3. Decision Support Systems
HAPPY TO DISCUSS
Big Data Analytics (IBM funded)
With Nenadic
CHALLENGE
• Investigate:
– Applications: characteristics and predictability
– Data Analytic / Machine Learning Algorithms – relatively simple so far
– Software: Map-Reduce, Hadoop
– Hardware: various platforms
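As a point of reference for the Map-Reduce model named above, here is a minimal single-process sketch of its three phases (map, shuffle, reduce) applied to a word count. It is an illustration only: the function names are hypothetical, nothing here uses the Hadoop API, and a real deployment would distribute each phase across machines.

```python
from collections import defaultdict
from functools import reduce


def map_phase(records, mapper):
    """Apply the mapper to every record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle_phase(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Fold each key's list of values into a single result."""
    return {key: reduce(reducer, values) for key, values in groups.items()}


def word_count_mapper(line):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def run_word_count(lines):
    pairs = map_phase(lines, word_count_mapper)
    return reduce_phase(shuffle_phase(pairs), lambda a, b: a + b)


# Toy input, invented for illustration.
counts = run_word_count(["big data analytics", "big data on Hadoop"])
```

The mapper and reducer are the only application-specific parts; the grouping step in between is what a framework such as Hadoop would provide.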
Bio-medical data analytics
With Nenadic, Zeng, Stivaros (Consultant, RMCH)
• Adverse drug event detection (EU funded)
– Bayesian/Fuzzy association rules algorithms
CHALLENGE
– Compare/contrast accuracy of prediction
• Clinical Outcome Mining (Christie Hospital)
– Data/text-based clinical records: better diagnosis and prediction
CHALLENGE
– Illness staging; multi-modal data; changes over time
• Decision Support for Radiology (NIHR-funded)
– Decision aid to assist better description of scans
CHALLENGE
– Usability; integration with existing tools; link to literature
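To make the association-rule idea behind adverse drug event detection concrete, the sketch below computes the usual support, confidence and lift of a rule such as {drug_A} -> {rash}. This is the plain (crisp) formulation, not the Bayesian or fuzzy variants the project studies; the records and the item names drug_A, drug_B, rash and nausea are invented toy data.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    s_both = support(transactions, set(antecedent) | set(consequent))
    s_ante = support(transactions, antecedent)
    s_cons = support(transactions, consequent)
    confidence = s_both / s_ante if s_ante else 0.0
    lift = confidence / s_cons if s_cons else 0.0
    return {"support": s_both, "confidence": confidence, "lift": lift}


# Hypothetical patient records: each set holds the drugs taken and events observed.
records = [
    {"drug_A", "rash"},
    {"drug_A", "rash", "nausea"},
    {"drug_A"},
    {"drug_B", "nausea"},
    {"drug_B"},
    {"drug_B", "rash"},
]

metrics = rule_metrics(records, {"drug_A"}, {"rash"})
```

A lift above 1 means the event occurs more often with the drug than its overall rate would suggest. Comparing how well different rule-mining variants predict such events is the challenge listed above.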
Itemset Mining Algorithms {baby nappies}->{beer}
• Colossal itemsets:
- Very high dimensional datasets
- Run-time increases exponentially as average row length increases
• Minimal unique itemsets (MUI); SUDA: Special Uniques Detection Algorithm
- “Risky” records, those likely to be linked, e.g. 16 years old + widow
- Records of most concern have many, small MUIs
- SUDA s/w used by ONS, UK; licensed by Singaporean govt
- Algorithm used by UN/World Bank International Household Survey
CHALLENGES:
• Data structure to represent itemsets during search process
• Search space pruning
• Algorithm: bottom-up; top-down; hybrid
• Parallelism
Eco-service composition (EU funded)
with Mehandjiev, MBS
• Aims to determine conditions for achieving eco-friendly, resilient and optimal service compositions on a distributed cloud infrastructure
• Two service optimisation approaches deployed:
1. Global: analyses end-to-end interaction between services
2. Local: computes a local optimisation by creating dynamic service chains between service provider and consumer
CHALLENGE
• Energy-efficient load balance and scheduling
HPC + Finance (EU funded, UK Government)
• High Frequency Trading
– Flash crashes: dramatic sudden drop in share price; describe/predict
– Working paper: High Frequency Trading and Mini Flash Crashes http://arxiv.org/abs/1211.6667
• HPCFinance
• New models of risk analysis (diverse data integration)
• Role of HPC in Finance and comparison of technologies
• Trade-off between accuracy, speed and cost; comparison of Cloud, GPGPUs and FPGA (Maxeler box)
CHALLENGES:
Data engineering;
Analytics;
Algorithms;
High performance;
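As a rough illustration of what "describing" a flash crash might involve, the sketch below scans a tick series for a fall from the recent peak that exceeds a threshold within a short time window. The function name, the 2-second window and the 1% threshold are placeholders chosen for this example; the working paper cited above should be consulted for the definition it actually uses.

```python
def find_sudden_drops(ticks, window_seconds=2.0, min_drop=0.01):
    """Flag ticks whose price sits at least min_drop below the window peak.

    ticks: list of (timestamp_seconds, price) tuples sorted by time.
    Returns a list of (peak_time, tick_time, fractional_drop) tuples.
    Consecutive ticks within one crash may each be reported.
    """
    events = []
    start = 0
    for end in range(len(ticks)):
        # Advance the window start so the window spans at most window_seconds.
        while ticks[end][0] - ticks[start][0] > window_seconds:
            start += 1
        peak_time, peak_price = max(ticks[start:end + 1], key=lambda t: t[1])
        drop = (peak_price - ticks[end][1]) / peak_price
        if drop >= min_drop:
            events.append((peak_time, ticks[end][0], drop))
    return events


# Toy tick data, invented for illustration.
ticks = [
    (0.0, 100.0),
    (0.5, 99.8),
    (1.0, 99.2),
    (1.5, 98.7),
    (3.0, 98.9),
    (6.0, 99.0),
]

events = find_sudden_drops(ticks)
```

On real high-frequency data the scan would run over millions of ticks per instrument, which is where the data engineering and high-performance challenges listed above come in.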
Preference Elicitation from Pairwise Comparison
with Mikhailov, MBS; Siraj, COMSATS IIT, Pakistan
• Decision making is complex in the presence of uncertainty and insufficient knowledge.
• Aim to estimate preferences using pairwise comparison (PC): PC is used when the decision maker is unable to assign scores directly to the available options; the judgements provided may be inconsistent.
• Work so far has proposed consistency measures and prioritization measures for cases where revision of judgements is not allowed.
• PriEsT tool now has sensitivity analysis -> best solution.
• CHALLENGES
– Evolutionary approach to multi-criteria DSS
– Work on preference elicitation model and tool
– Group decision making
– Bridge PriEsT and R (popular data mining tool) via XMCDA
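To illustrate how priorities can be estimated from pairwise comparisons and how inconsistency can be quantified, the sketch below uses two standard textbook techniques: the geometric mean method for the priority vector and Saaty's consistency ratio. These are stand-ins chosen for illustration, not necessarily the measures proposed in this work or implemented in PriEsT, and the matrices are invented examples.

```python
import math

# Saaty's random consistency index for 3 to 5 options (standard published values).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def geometric_mean_priorities(matrix):
    """Estimate normalised priorities from a reciprocal comparison matrix.

    matrix[i][j] states how strongly option i is preferred to option j.
    """
    n = len(matrix)
    row_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(row_means)
    return [m / total for m in row_means]


def consistency_ratio(matrix, weights):
    """Saaty's consistency ratio; values above about 0.1 are usually read as inconsistent."""
    n = len(matrix)
    # Estimate the principal eigenvalue as the average of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Perfectly consistent judgements: A is 2x B, B is 2x C, A is 4x C.
consistent = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
w_good = geometric_mean_priorities(consistent)
cr_good = consistency_ratio(consistent, w_good)

# Intransitive judgements: A over B, B over C, yet C over A.
cyclic = [[1, 2, 1 / 3], [1 / 2, 1, 2], [3, 1 / 2, 1]]
w_bad = geometric_mean_priorities(cyclic)
cr_bad = consistency_ratio(cyclic, w_bad)
```

For the consistent matrix the priorities come out in the ratio 4:2:1 and the consistency ratio is essentially zero. The cyclic matrix yields a ratio well above 0.1, the situation in which a decision maker would normally be asked to revise judgements.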