Lightning Talk: Building Big Data Analytics Data Lake with All-Flash Ceph
QCT Marco Huang, QCT Amy Chang
QCT CONFIDENTIAL www.QCT.io


Agenda

• Introduction of QCT
• Why data lake architecture
• Brief on Data Lake with All-Flash Ceph architecture
  – Architecture design
  – Hardware selection
  – Testing result
• Conclusion

A leading cloud datacenter solution provider that delivers Server, Storage, Networking, Rack System and Cloud Solution under a single, proven roof


Paradigm Shift for Enterprise: Need for data analytics increases

Data-Generating Center
Data-Powered Company

A data-powered company needs a flexible data analytics framework. The popular Hadoop framework is the top choice for executing analytical tasks, yet it can't scale out on demand.

Hyper-converged Hadoop Framework vs. Disaggregated SDS Architecture

Prefer Disaggregated Architecture: Target to scale out on demand

Powered by Intel® Xeon® processors

Data Lake with All-Flash Ceph Architecture
Disaggregate data analytics cluster and backend storage to provide higher flexibility

[Architecture diagram. Labels: Data Analytics Cluster 1 with query engines (Hive) and Hadoop HDFS (TMP); Data Analytics Cluster 2 with Presto; S3 REST API; 80G link; load balancer; two RADOS Gateway (RGW) instances; 40G links; Ceph-Monitor nodes (M); Ceph-OSD nodes (ten "x 16" markers).]


Allows multiple data analytics clusters to run concurrently


Compatibility with S3 allows users to connect on-premise cloud to public cloud
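As a rough illustration of why S3 compatibility makes the on-premise and public endpoints interchangeable, the sketch below builds a path-style S3 object URL (scheme://host/bucket/key). The host names, bucket, and key are hypothetical and not taken from the slides; only the endpoint changes between the two calls.

```python
from urllib.parse import quote, urlunsplit


def object_url(endpoint_host, bucket, key, scheme="http"):
    """Build a path-style S3 object URL: scheme://host/bucket/key."""
    # quote() keeps "/" unescaped by default, so nested keys stay readable.
    path = "/" + quote(bucket) + "/" + quote(key)
    return urlunsplit((scheme, endpoint_host, path, "", ""))


# Hypothetical endpoints: an on-premise RGW behind the load balancer,
# and a placeholder for any public S3-compatible service.
ON_PREM = "rgw.example.internal:7480"
PUBLIC = "s3.public-cloud.example"

key = "warehouse/sales/part-0000.orc"
on_prem_url = object_url(ON_PREM, "datalake", key)
public_url = object_url(PUBLIC, "datalake", key, scheme="https")
print(on_prem_url)
print(public_url)
```

A client that speaks the S3 REST API addresses both locations the same way, so moving a workload is mostly a matter of changing the endpoint and credentials.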


Load balancer to equally distribute workloads
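The slides do not say which balancing policy is used. Assuming simple round-robin, equal distribution across the two RGW instances can be sketched as follows; the endpoint names are hypothetical.

```python
from collections import Counter
from itertools import cycle


def make_round_robin(gateways):
    """Return a function that hands out gateways in strict rotation."""
    rotation = cycle(gateways)
    return lambda: next(rotation)


# Hypothetical names for the two RADOS Gateway instances in the diagram.
RGW_ENDPOINTS = ["rgw-1.example.internal:7480", "rgw-2.example.internal:7480"]

pick_gateway = make_round_robin(RGW_ENDPOINTS)

# Route ten incoming S3 requests and count how many each gateway receives.
assignments = Counter(pick_gateway() for _ in range(10))
print(assignments)
```

With an even number of requests, each gateway receives the same share. A production load balancer would also need health checks and connection awareness, which this sketch leaves out.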


Ceph as backend storage to scale out on demand


QuantaGrid D52BQ-2U – Scale Along with Your Business
Intel Purley platform with up to 24 2.5" bays with SATA/SAS/NVMe support

• Top-shelf Xeon® P processor [1]
• Up to 10x PCIe expansion slots
• Up to 26x hot-swap drive bays
• Up to 3TB memory capacity [2]
• As many as 24x SFF + optional extra 2x rear SSD bays (SATA/SAS/NVMe support)
• 12x LFF + optional extra 2x rear SSD bays (SATA/SAS/NVMe support)
• All screw-less, hot-swappable!

1. With limited conditions
2. With specific CPU


QxStor Ceph – Know Your Demand, Easy to Configure
QCT QxStor Big Data Analytics Data Lake with All-Flash Ceph Solution

• Throughput Optimized (for streaming media): QxStor RCT-400, QxStor RCT-200
• Cost/Capacity Optimized (for archiving): QxStor RCC-400
• IOPS Optimized (for mission-critical apps): QxStor RCI-300

Platforms shown: D52BQ-2U, D51PH-1ULH, T21P-4U, T21P-4U, D51BP-1U
Purley available!

• +25% in total storage capacity available [1]
• -33% in sequential write latency testing [2]
• +50% in sequential write throughput [2]
• Up to 560TB per chassis
• Up to 63% cost down
• +100% improvement in IOPS performance [3]
• 1.6M/s highest IOPS [3]
• -50% reduction in latency [3]

1. SKU statistics of RCT-200
2. Test result of RCT-400
3. Test result of RCI-300


All-Flash Ceph is preferred for data analytic workloads
NVMe is preferred from both the business and performance perspective

• High Performance-Optimized Storage
• Suitable for Mission Critical App
• Cost Efficient Compared to HDD

[Chart: NVMe vs. conventional disks on four system metrics (CPU Utilization, Network Traffic, Disk Read Throughput, Disk Read Latency). Factors shown: x9.24, x2.81, x9.77; network traffic incoming x3.9, outgoing x16.1.]

Performance Perspective: NVMe exhibits exceptional results on system metrics compared with conventional disks.

Business Perspective: NVMe is no longer a luxury device for enterprises with IO-intensive workloads.


Test Result – Hive and Presto
Assured performance for data lake architecture when compared to HDFS hyper-converged architecture

Minor changes are observed in total runtime comparison between HDFS hyper-converged and Ceph disaggregated architecture using Hive.

Up to 22.91% faster in total runtime for Ceph disaggregated architecture using Presto; the effect is especially notable for large data sizes.
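The slides do not state how "faster" is computed. A common reading, assumed here, is the relative reduction in total runtime against the hyper-converged baseline. The runtimes below are illustrative values chosen only to reproduce the 22.91% figure; they are not measurements from the talk.

```python
def percent_faster(baseline_s, candidate_s):
    """Relative runtime reduction of candidate vs. baseline, in percent."""
    if baseline_s <= 0:
        raise ValueError("baseline runtime must be positive")
    return (baseline_s - candidate_s) / baseline_s * 100.0


# Illustrative total runtimes in seconds (hypothetical, not measured data).
hdfs_hyperconverged_s = 1000.0
ceph_disaggregated_s = 770.9

speedup = percent_faster(hdfs_hyperconverged_s, ceph_disaggregated_s)
print(f"{speedup:.2f}% faster")
```

Under this definition a positive value means the disaggregated setup finished sooner; a value near zero corresponds to the "minor changes" reported for Hive.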


Disaggregated architecture is suitable for data analytics
Meet the demand for big data frameworks while providing higher flexibility

• Assured Performance Level: Comparable test results to hyper-converged architecture for data analytics
• Cost-Efficient Architecture: Lower storage required for data durability than HDFS or RAID-based systems
• Scale-Out According to Need: Scaling components independently reduces cost & management complexity
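The slides do not say which Ceph data-protection scheme backs the "lower storage required" claim. As a hedged worked example, assume HDFS at its default 3x replication and a Ceph erasure-coded pool with k data chunks and m coding chunks; the k=4, m=2 profile and the 100 TB data size are assumptions for illustration only.

```python
def raw_per_usable_replication(copies):
    """Raw bytes stored per usable byte under n-way replication."""
    return float(copies)


def raw_per_usable_erasure(k, m):
    """Raw bytes stored per usable byte for k data + m coding chunks."""
    return (k + m) / k


usable_tb = 100  # hypothetical amount of analytics data to protect

# HDFS default replication factor is 3.
hdfs_raw_tb = usable_tb * raw_per_usable_replication(3)

# Assumed erasure-code profile (k=4, m=2); the talk does not specify one.
ceph_ec_raw_tb = usable_tb * raw_per_usable_erasure(4, 2)

print(hdfs_raw_tb, ceph_ec_raw_tb)
```

Under these assumptions the erasure-coded pool needs half the raw capacity of 3x replication while still tolerating the loss of two chunks. A replicated Ceph pool with three copies would show no capacity saving over HDFS.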

Looking for an innovative cloud solution? Come to QCT, who else?