Data warehouse Big data solutions · Microsoft SharePoint 2010 Enterprise: TeamSite, Publishing...

Post on 06-Jul-2020


TMA Solutions

YOUR QUALITY PARTNER FOR SOFTWARE SOLUTIONS

www.tmasolutions.com TMA Solutions

BI-Big Data-Analytics

Skill Set

Big data & data analysis:
Staging (ETL process)
Data warehousing (storage structures: ROLAP, MOLAP, HOLAP)
Data mining: classification/regression, clustering, association, sequence analysis
Reporting: Cognos report, Jasper report, QlikView report
Hadoop platform

Microsoft data analytics tools:
Microsoft Business Intelligence (Microsoft SQL Server Data Tools in 2012): integration tool, analysis tool, reporting tool, SSRS, SSIS
Microsoft Office 2010: Excel Pivot Table, Excel Pivot Chart
Microsoft SharePoint 2010 Enterprise: TeamSite, Publishing Portal

Machine learning algorithms: decision tree, SVM, k-NN, Bayes' theorem, Bayesian network, neural network
Verification methods: ROC curve, t-test
R language

Sample Projects

Traffic Data Analysis

Speed Profile Clustering

Sincerity Sentiment Mining

Genome Alignment & Variant Calling

Imaging Mass Spectrometry

Traffic Data Analysis (1/2)

Real-time data report and analysis

Visualize data with chart and pivot table

Build up traffic data warehouse

Users can analyze the data by filtering, rolling up, and drilling down
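As a rough illustration of filtering, rolling up, and drilling down, the sketch below aggregates a tiny in-memory fact table. The rows, the column names, and the `aggregate` helper are hypothetical and are not taken from the actual traffic data warehouse.

```python
from collections import defaultdict

# Hypothetical fact rows: (country, city, device_type, hour, usage).
COLUMNS = ("country", "city", "device_type", "hour", "usage")
FACT_ROWS = [
    ("US", "Boston", "GPS", 8, 120),
    ("US", "Boston", "Hotspot", 8, 30),
    ("US", "Austin", "GPS", 9, 80),
    ("JP", "Tokyo", "GPS", 8, 200),
    ("JP", "Tokyo", "Hotspot", 9, 50),
]


def aggregate(rows, group_by, where=None):
    """Sum the usage measure per combination of the group_by dimensions.

    `where` is an optional {column: value} filter applied before grouping.
    """
    where = where or {}
    idx = {name: i for i, name in enumerate(COLUMNS)}
    totals = defaultdict(int)
    for row in rows:
        # Filtering: skip rows that do not match every condition.
        if any(row[idx[col]] != val for col, val in where.items()):
            continue
        key = tuple(row[idx[col]] for col in group_by)
        totals[key] += row[idx["usage"]]
    return dict(totals)


# Roll up: collapse city, device, and hour into one total per country.
by_country = aggregate(FACT_ROWS, ["country"])

# Filter and drill down: inside the US, break the total out by city.
us_by_city = aggregate(FACT_ROWS, ["city"], where={"country": "US"})

print(by_country)
print(us_by_city)
```

In a real warehouse the same operations would normally be issued as OLAP or SQL `GROUP BY` queries against the fact and dimension tables rather than computed in application code.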

[Figure: Traffic data warehouse, dimensional data (facts and dimensions). Diagram labels: FactData, Application, Geography, Device type, Report Type, Source Type, Time, Usage, User share, Hotspot/Auto, Hotspot, Device, GPS, LiveSpeed.]

Traffic Data Analysis (2/2)

Speed Profile Clustering

To support smart driving and navigation

Data collected from Navigation app

Billions of records in multidimensional manner

Road info

Vehicle speed with timestamp

Type of road

Methods used: Mean Shift (unsupervised clustering), SVM
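Mean shift, one of the methods named above, can be sketched in one dimension as follows. This is a minimal illustration with a flat kernel on a handful of made-up speed readings; the data, the bandwidth, and the `mean_shift_1d` helper are assumptions and do not reflect the project's actual pipeline, which clusters speed profiles over time rather than single values.

```python
def mean_shift_1d(points, bandwidth, max_iter=100, tol=1e-6):
    """Cluster 1-D points with flat-kernel mean shift.

    Each point is moved repeatedly to the mean of all points within
    `bandwidth` of its current position until it stops moving; points
    that end up at (nearly) the same position share a cluster.
    """
    modes = []
    for p in points:
        x = float(p)
        for _ in range(max_iter):
            window = [q for q in points if abs(q - x) <= bandwidth]
            new_x = sum(window) / len(window)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)

    # Merge converged positions that lie close together into cluster centers.
    centers, labels = [], []
    for m in modes:
        for i, c in enumerate(centers):
            if abs(m - c) <= bandwidth / 2:
                labels.append(i)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return centers, labels


# Hypothetical speed readings (km/h) from two kinds of road.
speeds = [28, 30, 31, 33, 88, 90, 92, 95]
centers, labels = mean_shift_1d(speeds, bandwidth=10)
print(centers, labels)
```

Unlike k-means, mean shift does not need the number of clusters in advance; the bandwidth controls how coarse the grouping is.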

[Figure: Speed profile clustering. Axes: Time, Speed.]

Sincerity Sentiment Mining

Analyzing & ranking sincerity sentiment of reviewers/commentators in an online community

Dealing with large historical data of reviews/comments

R language

Machine learning techniques

Natural Language Processing: POS tagger, …

Neural network

SVM

Bayesian network

Etc.

Sincerity Sentiment Mining: A Typical Framework

Sample reviews:

- There's lots of cool stuff packed into espn's ultimate x!

- There's suspension of disbelief and then there's bad screenwriting..! this film packs a wallop of the latter!....

Genome Alignment & Variant Calling

Features

DNA, exome, and RNA alignment

Variant calling

Beta Result

Genome Alignment & Variant Calling

Next-generation sequencing machine (e.g. Illumina machine)

ACGTGTACAAGGTCCGGTTCTGAAAGTTGACCATGGATAACCGGTTAATTTAAGGAT

…..................AGTCCTTTTACATTGAGTAG

The human genome has about 3 billion bases (letters)

CEQEO System

Hundreds of millions of reads (30X coverage ≈ 90 GB)

Aligned reads and variant calls
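The figures on this slide can be checked with a little arithmetic. The genome size and the 30X coverage come from the slide; the read length of 100 bases and the storage cost of one byte per base are assumptions made only for this estimate.

```python
GENOME_BASES = 3_000_000_000  # about 3 billion bases (from the slide)
COVERAGE = 30                 # 30X coverage (from the slide)
READ_LENGTH = 100             # assumed read length in bases
BYTES_PER_BASE = 1            # assumed: one character per base, uncompressed

# Coverage means each genome position is sequenced this many times on average.
total_bases = GENOME_BASES * COVERAGE

# Number of reads needed to deliver that many bases.
num_reads = total_bases // READ_LENGTH

# Raw sequence size in gigabytes (1 GB = 10**9 bytes).
size_gb = total_bases * BYTES_PER_BASE / 1e9

print(f"{num_reads:,} reads, about {size_gb:.0f} GB of raw sequence")
```

Under these assumptions the estimate lands at 900 million reads and roughly 90 GB, consistent with the slide; real sequencing files also carry quality scores and read identifiers, so actual sizes differ.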

Imaging Mass Spectrometry (1/2)

Features

Analyzing ToF-SIMS data (multidimensional)

Imaging / visualizing the data

Techniques

PCA algorithm

Bayes theorem

R Language
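To show what the PCA step does, here is a minimal sketch that finds the first principal component by power iteration. The project itself lists R; Python is used here only for illustration. The tiny "spectra" matrix and the `first_principal_component` helper are hypothetical, and a production analysis would use a library routine instead.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit direction, per-row scores) of the first principal component.

    Uses power iteration on the sample covariance matrix. Assumes the
    all-ones start vector is not orthogonal to the leading direction.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Project each centered row onto the component to get its score.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Hypothetical data: 4 pixels x 3 mass channels; channel 2 tracks channel 1.
spectra = [
    [1.0, 2.0, 0.5],
    [2.0, 4.0, 0.4],
    [3.0, 6.0, 0.6],
    [4.0, 8.0, 0.5],
]
component, scores = first_principal_component(spectra)
print(component, scores)
```

Plotting the per-pixel scores as an image is one common way to visualize multidimensional spectral data in a single map.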

Imaging Mass Spectrometry (2/2)

Raw multidimensional data

Visualization

THANK YOU!

North America number: +1 802-735-1392
Australia number: +61 414-734-277
Japan number: +81 3-6432-4994
Website: www.tmasolutions.com

Tel: +84 8 3997-8000
Mobile: +84 908-676-212
Fax: +84 8 3990-3303
Email: sales@tma.com.vn