Analysis Tools for Data Enabled Science

Analysis Tools for Data Enabled Science. SALSA HPC Group, http://salsahpc.indiana.edu, School of Informatics and Computing, Indiana University


Transcript of Analysis Tools for Data Enabled Science


Analysis Tools for Data Enabled Science
SALSA HPC Group
http://salsahpc.indiana.edu
School of Informatics and Computing
Indiana University

Presenter Introduction

Twister: Bingjing Zhang. Funded by Microsoft Foundation Grant, Indiana University's Faculty Research Support Program, and NSF OCI-1032677 Grant.
Twister4Azure: Thilina Gunarathne. Funded by Microsoft Azure Grant.
High-Performance Visualization Algorithms For Data-Intensive Analysis: Seung-Hee Bae and Jong Youl Choi. Funded by NIH Grant 1RC2HG005806-01.

Presenter Introduction

DryadLINQ CTP Evaluation: Hui Li, Yang Ruan, and Yuduo Zhou. Funded by Microsoft Foundation Grant.
Million Sequence Challenge: Saliya Ekanayake, Adam Hughs, Yang Ruan. Funded by NIH Grant 1RC2HG005806-01.
Cyberinfrastructure for Remote Sensing of Ice Sheets: Jerome Mitchell. Funded by NSF Grant OCI-0636361.

Twister Architecture

Applications: Kernels, Genomics, Proteomics, Information Retrieval, Polar Science; Scientific Simulation Data Analysis and Management; Dissimilarity Computation, Clustering, Multidimensional Scaling, Generative Topological Mapping
Programming Model: High Level Language
Runtime: Cross Platform Iterative MapReduce (Collectives, Fault Tolerance, Scheduling)
Storage: Distributed File Systems, Data Parallel File System, Object Store
Infrastructure: Linux HPC Bare-system, Amazon Cloud, Windows Server HPC Bare-system, Virtualization, Azure Cloud, Grid Appliance
Hardware: CPU Nodes, GPU Nodes
Services and Workflow
Security, Provenance, Portal

(a) Map Only: CAP3 Analysis, Smith-Waterman Distances, Parametric sweeps, PolarGrid Matlab data analysis
(b) Classic MapReduce: High Energy Physics (HEP) Histograms, Distributed search, Distributed sorting, Information retrieval
(c) Iterative MapReduce: Expectation maximization clustering (e.g. Kmeans), Linear Algebra, Multidimensional Scaling, Page Rank
(d) Loosely Synchronous: Many MPI scientific applications such as solving differential equations and particle dynamics

[Diagram labels: Input, map, reduce, Iterations, Output, Pij]

Domain of MapReduce and Iterative Extensions: (a), (b), (c)
MPI: (d)

Status of Iterative MapReduce

GTM vs. MDS

Objective Function. GTM: Maximize Log-Likelihood. MDS (SMACOF): Minimize STRESS or SSTRESS.
O(KN) (K