Spark Bi-Clustering - OW2 Big Data Initiative, altic
Twitter #ow2 #sl2014 @Altic_buzz www.ow2.org
smart #OpenSource Software #BusinessIntelligence assembler
Altic tools / approach
• ETL : Talend
• Big Data : Spark, Hortonworks Data Platform (Hadoop), Elasticsearch
• Data Warehouse : InfiniDB
• Reporting : JasperReports, Birt
• OLAP : Mondrian, Palo
• Dashboard : Tableau Software, D3
• BI platform : SpagoBI
Biclustering on Big Data
● Tugdual SARAZIN, PhD
● ALTIC
● LIPN (Paris 13)
● Biclustering
● A biclustering algorithm for Big Data
● Spark
● Based on SOM (Self-Organizing Map)
● Available on GitHub: Spark-Clustering
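To make the idea concrete, here is a minimal single-machine sketch in plain Python. It is not the Spark-Clustering implementation: it assumes a simplification in which one small 1-D self-organizing map is trained on the rows of a matrix and a second one on its columns, so that every row and every column ends up assigned to a map cell. The function names (train_som, best_cell, bicluster), the evenly spaced initialisation, and the toy data are all hypothetical choices made for illustration.

```python
import math


def best_cell(protos, v):
    """Index of the prototype closest to v (squared Euclidean distance)."""
    return min(range(len(protos)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(protos[c], v)))


def train_som(vectors, n_cells, epochs=30):
    """Train a 1-D self-organizing map and return its prototype vectors."""
    # Deterministic initialisation: copy evenly spaced input vectors.
    protos = [list(vectors[(c * len(vectors)) // n_cells])
              for c in range(n_cells)]
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius both shrink over time.
        frac = 1.0 - epoch / epochs
        rate = 0.5 * frac
        radius = max(n_cells / 2.0 * frac, 0.01)
        for v in vectors:
            win = best_cell(protos, v)
            for c, p in enumerate(protos):
                # Gaussian neighbourhood on the map around the winning cell.
                h = math.exp(-((c - win) ** 2) / (2 * radius ** 2))
                for i in range(len(p)):
                    p[i] += rate * h * (v[i] - p[i])
    return protos


def bicluster(matrix, n_row_cells=2, n_col_cells=2):
    """Assign every row and every column of matrix to a map cell."""
    columns = [list(col) for col in zip(*matrix)]
    row_protos = train_som(matrix, n_row_cells)
    col_protos = train_som(columns, n_col_cells)
    rows = [best_cell(row_protos, r) for r in matrix]
    cols = [best_cell(col_protos, c) for c in columns]
    return rows, cols


# Toy matrix with two obvious blocks (rows 0-1 vs 2-3, columns 0-1 vs 2-3).
data = [
    [9.0, 8.0, 1.0, 0.0],
    [8.0, 9.0, 0.0, 1.0],
    [1.0, 0.0, 9.0, 8.0],
    [0.0, 1.0, 8.0, 9.0],
]
row_cells, col_cells = bicluster(data)
print(row_cells, col_cells)  # [0, 0, 1, 1] [0, 0, 1, 1]
```

Each (row cell, column cell) pair then identifies one block of the matrix. A distributed version would have to split the winner search and the prototype updates across partitions, which this sketch does not attempt.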
Integration with SpagoBI
● Spark Bi-Clustering can be an engine for SpagoBI
● Define a data set as input
● Execute the biclustering with appropriate settings
● Store result in a defined format
– Databases
– Big data storage (HDFS)
– SpagoBI Dataset
Integration with Talend
● Spark Biclustering can be a component for Talend Big Data
● Add new features to existing Talend Big Data components
– Biclustering
● Lets you map your data
OW2 Big Data Initiative
Charly Clairmont, ALTIC
Charly CLAIRMONT
@egwada / @[email protected]
http://www.altic.org
Thanks