2015-04-24

Bigdata@BTH – Challenges and applications

Håkan Grahn, Blekinge Institute of Technology
Parisa Yousefi, Ericsson and Blekinge Institute of Technology

BigData@BTH

•  Research profile financed by the Knowledge Foundation
–  36 MSEK (KKS) + 15 MSEK (BTH) + >40 MSEK (companies)
–  Sep. 2014 to Dec. 2020
–  11 companies
–  4 departments at BTH
•  Focus on machine learning and data mining, and efficient implementation of such algorithms on multicore and cloud systems

Research focus

How shall we design future scalable systems for big data analytics in order to achieve a good balance between performance and resource efficiency as well as business value?

Research themes
Core academic competence in all themes

Theme A: Big data analytics for decision support
- Business intelligence
- Multi-criteria decision-making
- Descriptive/predictive big data analytics

Theme B: Big data analytics for image processing
- Image classification
- Image restoration
- Pattern recognition

Theme C: Core technologies
- Data mining and knowledge discovery
- Discovery science
- Machine learning
- Real-time analytics

Theme D: Foundations and enabling technologies
- Multicore and cloud
- Data communication and networks
- Heterogeneous systems
- Real-time and scheduling
- Storage systems
- Software architecture and implementation

Balanced mix of industry partners

- Wireless Maingate Nordic
- Arkiv Digital AD
- Scorett Footwear
- Noda Intelligent Systems
- Indigo IPEX
- Telenor
- Compuverde
- MMI
- Ericsson
- Sony
- Contribe

Uniqueness and competitive edge

Concrete challenges!!
Unique combination!!

- Large distributed systems
- Health care domain
- Camera devices
- Large-scale image processing and classification
- Telecommunication systems

Industrial challenges

Concrete projects

Results, knowledge, products, …

Industrial challenges drive the research agenda

•  IC1: Real-time and large-scale quality assessment of images
•  IC2: Demand-based hospital staff planning
•  IC3: Customer profiling for personalized strategies & marketing
•  IC4: Fraud and anomaly detection in large-scale data sets
•  IC5: Automation and orchestration of cloud-based test environments
•  IC6: Collection and selection of data for real-time analysis

              IC1  IC2  IC3  IC4  IC5  IC6
P1, Theme A    -    X    X    -    -    -
P2, Theme A    -    -    X    X    -    X
P3, Theme B    X    -    -    X    -    -
P4, Theme C    X    X    X    X    -    X
P5, Theme C    -    -    -    X    X    X
P6, Theme D    -    -    X    X    -    X
P7, Theme D    -    -    -    -    X    X

IC1: Real-time and large-scale quality assessment of images

•  P3 (B): Efficient media analysis and processing
•  P4 (C): Efficient ensemble methods for challenging domains
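
The lookup from a challenge to the subprojects that address it can be sketched in a few lines of Python. This is only an illustration: the names COVERAGE and projects_for are hypothetical, and the cell positions are transcribed from the challenge/subproject matrix as recovered from the flattened slide text, so they are an assumption rather than an authoritative record.

```python
# Challenges addressed by each subproject, transcribed from the matrix.
# Assumption: column positions were reconstructed from flattened slide text.
COVERAGE = {
    "P1": {"IC2", "IC3"},
    "P2": {"IC3", "IC4", "IC6"},
    "P3": {"IC1", "IC4"},
    "P4": {"IC1", "IC2", "IC3", "IC4", "IC6"},
    "P5": {"IC4", "IC5", "IC6"},
    "P6": {"IC3", "IC4", "IC6"},
    "P7": {"IC5", "IC6"},
}


def projects_for(challenge):
    """Return the subprojects that address the given industrial challenge."""
    return sorted(p for p, ics in COVERAGE.items() if challenge in ics)


print(projects_for("IC1"))  # ['P3', 'P4']
```

For IC1 the lookup returns P3 and P4, which agrees with the two subprojects the slide lists for that challenge.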

Subprojects – Addressing the challenges

•  P1 (A): Decision support systems for resource estimation and allocation
•  P2 (A): Decision support systems for anomaly detection and visualization
•  P3 (B): Efficient media analysis and processing
•  P4 (C): Efficient ensemble methods for challenging domains
•  P5 (C): Classification and regression in large data streams
•  P6 (D): Data collection and selection in large distributed environments
•  P7 (D): Resource-efficient automatic orchestration of resources in cloud systems for big data analytics

Possible applications in transport and logistics

•  Distributed data collection, filtering, and storage, e.g., traffic information

•  Planning and scheduling, e.g., resource planning, train schedules, maintenance
–  FLOAT – FLexibel Omplanering Av Tåglägen i drift (flexible rescheduling of train slots in operation)
–  KAJT – Kapacitet i JärnvägsTrafiken (capacity in railway traffic)

•  Anomaly detection, e.g., strange or unusual behavior

•  Revenue management, e.g., revenue leakage, run-away costs