Introduction | Pivotal Greenplum Database Docs


Table of Contents

Introduction
Key Points for Review
Characteristics of a Supported Pivotal Hardware Platform
Pivotal Approved Recommended Architecture
Pivotal Cluster Examples
Example Rack Layout
Using gpcheckperf to Validate Disk and Network Performance
Pivotal Greenplum Segment Instances per Server
Pivotal Greenplum on Virtualized Systems
Additional Helpful Tools


Introduction

The EMC Data Computing Appliance provides a ready-made platform that strives to accommodate the majority of customer workloads. One of Pivotal Greenplum's strongest value propositions is its ability to run on practically any modern-day hardware platform. More and more, Pivotal Engineering is seeing cases where customers elect to build a cluster that satisfies a specific requirement or purpose.

Pivotal Platform Engineering publishes this framework as a resource for assisting customers in this effort.

Objectives

This guide can be used for:

A clear understanding of what characterizes a recommended platform for running Pivotal Greenplum Database

A review of the two most common topologies with supporting recommended architecture diagrams

Pivotal recommended reference architecture that includes hardware recommendations, configuration, hard disk guidelines, network layout, installation, data loading, and verification

Extra guidance with real-world Greenplum cluster examples (see Pivotal Cluster Examples)

This document does:

provide recommendations for building a well-performing Pivotal cluster using the hardware guidelines presented

provide general concepts without specific tuning suggestions

This document does not:

promise Pivotal support for the use of third-party hardware

assume that the information herein applies to every site; the guidance is subject to modification depending on a customer's specific local requirements

provide all-inclusive procedures for configuring Pivotal Greenplum. A subset of information is included as it pertains to deploying a Pivotal cluster.

Greenplum Terms to Know

master
A server that provides entry to the Greenplum Database system, accepts client connections and SQL queries, and distributes work to the segment instances.

segment instances
Independent PostgreSQL databases that each store a portion of the data and perform the majority of query processing.

segment host
A server that typically executes multiple Greenplum segment instances.

interconnect
Networking layer of the Greenplum Database architecture that facilitates inter-process communication between segments.

Feedback and Updates

Please send feedback and/or updates to [email protected].


Key Points for Review

What is the Pivotal Engineering Recommended Architecture?

The Pivotal Recommended Architecture comprises generic recommendations for third-party hardware for use with Pivotal software products. Pivotal maintains examples of various implementations internally to aid customers with cluster diagnostics and configuration. Pivotal does not perform hardware replacement, nor is Pivotal a substitute for OEM vendor support for these configurations.

Why Install on an OEM Vendor Platform?

The EMC DCA strives to achieve the best balance between performance and cost while meeting a broad range of customer needs. There are some very valid reasons customers may opt to design their own clusters.

Some possibilities are:

Varying workload profiles that may require more memory or higher processor capacity

Specific functional needs like public/private clouds, increased density, or disaster recovery (DR)

Support for radically different network topologies

Deeper, more direct access for hardware and OS management

Existing relationships with OEM hardware partners

If customers opt out of using the appliance, Pivotal Engineering strongly recommends following the Pivotal architecture guidelines and discussing the implementation with a Pivotal Engineer. Customers achieve much greater reliability when following these recommendations.


Characteristics of a Supported Pivotal Hardware Platform

Commodity Hardware

Pivotal believes that customers should take advantage of the inexpensive yet powerful commodity hardware that includes x86_64 platform commodity servers, storage, and Ethernet switches.

Pivotal recommends:

Chipsets or hardware used across many platforms

NIC chipsets (like some of the Intel series)

RAID controllers (like LSI or StorageWorks)

Reference motherboards/designs

Machines that use reference motherboard implementations are preferred. Although DIMM count is important, if a manufacturer integrates more DIMM slots than the CPU manufacturer specifies, more risk is placed on the platform.

Ethernet-based interconnects (10 Gb) are:

Highly preferred to proprietary interconnects.

Highly preferred to storage fabrics.

Manageability

Pivotal recommends:

Remote, out-of-band management capability with support for SSH connectivity as well as web-based console access and virtual media.

Diagnostic LEDs that convey failure information. Amber lights are a minimum, but an LED that displays the exact failure is more useful.

Tool-free maintenance (the cover can be opened without tools, parts are hot-swappable without tools, etc.).

Labeling: components such as DIMMs are labeled so it is easy to determine which part needs to be replaced.

Command-line, script-based interfaces for configuring the server BIOS and options like RAID cards and NICs.

Redundancy

Pivotal recommends:

Redundant hot-swappable power supplies

Redundant hot-swappable fans

Redundant network connectivity

Hot-swappable drives

Hot-spare drives when immediate replacement of failed hardware is unavailable

Determining the Best Topology

Traditional Topology

This configuration requires the least specialized networking skills and is the simplest possible configuration. In a traditional network topology, every server in the cluster is directly connected to every switch in the cluster. This is typically implemented over 10 Gb Ethernet. This topology limits the cluster size to the number of ports on the selected interconnect switches. The 10 Gb ports on the servers are bonded into an active/active pair and routed directly to a set of switches configured using MLAG (or comparable technology) to provide a redundant, high-speed network fabric.


Figure: Recommended Architecture Example 1 (Typical Topology)

Scalable Topology

Scalable networks implement a network core that allows the cluster to grow beyond the number of ports in the interconnect switches. Care must be taken to ensure that the number of links from the in-rack switches is adequate to service the core.

How to Determine the Maximum Number of Servers

For example, each rack can hold 16 servers and you determine that the core switches each have 48 ports. Of these ports, 4 are used to create the MLAG between the two core switches. Of the remaining 44 ports, networking from a single set of interconnect switches in a rack uses 4 links per core switch, 2 from each interconnect switch to each of the core switches. The maximum number of servers is determined by the following formula:

max-nodes = nodes-per-rack * ((core-switch-port-count - MLAG-port-utilization) / rack-to-rack-link-port-count)

176 = 16 * ((48 - 4) / 4)
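The same arithmetic can be scripted as a quick sizing check. The sketch below simply restates the formula above in shell; the port counts are the ones used in this example and should be replaced with the values for the switches actually selected.

#!/bin/bash
# Illustrative sizing check only: restates the max-nodes formula above.
NODES_PER_RACK=16        # servers per rack
CORE_SWITCH_PORTS=48     # ports on each core switch
MLAG_PORTS=4             # core ports consumed by the MLAG between the core switches
LINKS_PER_RACK=4         # links from each rack's interconnect switches to each core switch

MAX_RACKS=$(( (CORE_SWITCH_PORTS - MLAG_PORTS) / LINKS_PER_RACK ))
MAX_NODES=$(( NODES_PER_RACK * MAX_RACKS ))
echo "maximum racks: $MAX_RACKS"     # 11 for this example
echo "maximum nodes: $MAX_NODES"     # 176 for this example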

Figure: Recommended Architecture Example 2 (Scalable Topology)


Pivotal Approved Recommended Architecture

Minimum Server Guidelines

Table 1 lists minimum requirements for a good cluster. Use gpcheckperf to generate these metrics.

See Appendix C: Using gpcheckperf to Validate Disk and Network Performance for example gpcheckperf output.

Table 1. Baseline Numbers for a Pivotal Cluster

Master Nodes (mdw & smdw)
Users and applications connect to masters to submit queries and return results. Typically, monitoring and managing the cluster and the database is performed through the master nodes.
CPU: 8+ physical cores at greater than 2 GHz clock speed
RAM: >256 GB
Disk read: >600 MB/s
Disk write: >500 MB/s
Network: 2 x 10 Gb NICs, multiple NICs
Form factor: 1U

Segment Nodes (sdw)
Segment nodes store data and execute queries. They are generally not public facing. Multiple segment instances run on one segment node.
CPU: 8+ physical cores at greater than 2 GHz clock speed
RAM: >256 GB
Disk read: >2000 MB/s
Disk write: >2000 MB/s
Network: 2 x 10 Gb NICs, multiple NICs
Form factor: 2U

ETL/Backup Nodes (etl)
Generally identical to segment nodes. These are used as staging areas for loading data or as destinations for backup data.
CPU: 8+ physical cores at greater than 2 GHz clock speed
RAM: 64 GB or more
Disk read: >2000 MB/s
Disk write: >2000 MB/s
Network: 2 x 10 Gb NICs, multiple NICs
Form factor: 2U

Network Guidelines

Table 2. Administration and Interconnect Switches

Administration Network
Administration networks are used to tie together lights-out management interfaces in the cluster and provide a management route into the server and OS.
Ports and speed: 48 x 1 Gb
Switches: A layer-2/layer-3 managed switch per rack with no specific bandwidth or blocking requirements.

Interconnect Network
Ports and speed: 48 x 10 Gb
Switches: Two layer-2/layer-3 managed switches per rack. All ports must have full bandwidth, be able to operate at line rate, and be non-blocking.

Table 3. Racking, Power, and Density

Racking
Generally, a 40U or larger rack that is 1200 mm deep is required. Built-in cable management is preferred. ESM protective doors are also preferred.

Power
The typical input power for a Pivotal Greenplum rack is 4 x 208/220V, 30 amp, single-phase circuits in the US. Internationally, 4 x 230V, 32 amp, single-phase circuits are generally used. This affords a power budget of ~9600 VA of fully redundant power.

Other power configurations are absolutely fine so long as there is enough energy delivered to the rack to accommodate the contents of the rack in a fully redundant manner.

Node Guidelines

OS Levels

At a minimum, the following operating systems (OS) are supported:

Red Hat/CentOS Linux 5*

Red Hat/CentOS Linux 6

Red Hat/CentOS Linux 7**

SUSE Enterprise Linux 10.2 or 10.3

SUSE Enterprise Linux 11

* RHEL/CentOS 5 will be unsupported in the next major release

** Support for RHEL/CentOS 7 is near completion, pending kernel bug fixes

For the latest information on supported OS versions, refer to the Greenplum Database Installation Guide.

Setting OS Parameters for Greenplum Database

Careful consideration must be given when setting OS parameters for Greenplum Database hosts. Refer to the latest version of the Greenplum Database Installation Guide for these settings.
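As an illustration of how these settings are typically applied, the sketch below appends entries to /etc/sysctl.conf on a host and reloads them. The two parameters shown are placeholders only; the authoritative names and values come from the Installation Guide for the Greenplum version being deployed.

# Illustrative mechanism only -- take the actual parameter list and values
# from the current Greenplum Database Installation Guide.
cat >> /etc/sysctl.conf <<'EOF'
# example entries; replace with the values from the Installation Guide
vm.overcommit_memory = 2
net.ipv4.ip_local_port_range = 10000 65535
EOF
sysctl -p    # apply the settings without rebooting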

Greenplum Database Server Guidelines

Greenplum Database integrates three kinds of servers: master servers, segment hosts, and ETL servers. Greenplum Database servers must meet the following criteria.

Master Servers

1U or 2U server. With less of a need for drives, rack space can be saved by going with a 1U form factor. However, a 2U form factor consistent with the segment hosts may increase supportability.

Same processors, RAM, RAID card, and interconnect NICs as the segment hosts.

Six to ten disks (eight is most common) organized into a single RAID 5 group with one hot spare configured.

SAS 15k or SSD disks are preferred, with 10k disks a close second.

SATA drives are acceptable in solutions oriented towards archival space over query performance.

All disks must be the same size and type.

Should be capable of read rates in gpcheckperf of 500 MB/s or higher. (The faster the master scans, the faster it can generate query plans, which improves overall performance.)

Should be capable of write rates in gpcheckperf of 500 MB/s or higher.

Should have sufficient additional network interfaces to connect to the customer network directly in the manner desired by the customer.

Segment Hosts

Typically a 2U server.

The fastest available processors.

256 GB RAM or more.

One or two RAID cards with maximum cache and cache protection (flash or capacitors preferred over battery). RAID cards should be able to support the full read/write capacity of the drives.

2 x 10 Gb NICs.

12 to 24 disks organized into two or four RAID 5 groups. Hot spares should be configured unless there are disks on hand for quick replacement.

SAS 15k disks are preferred, with 10k disks a close second. SATA disks are preferred over nearline SAS if SAS 15k or SAS 10k cannot be used. All disks must be the same size and type.

A minimum read rate in gpcheckperf of 300 MB/s per segment or higher. (2000 MB/s per server is typical.)

A minimum write rate in gpcheckperf of 300 MB/s or higher. (2000 MB/s per server is typical.)

Additional Tips for Segment Host Configuration

The number of segment instances that are run per segment host is configurable, and each segment instance is itself a database running on the server. A baseline recommendation on current hardware, such as the hardware described in Appendix A, is 8 primary segment instances per physical server.

A set of memory parameters that depend upon the amount of RAM selected for each segment instance will be determined when installing the database software. While these are not platform parameters, it is the platform that determines how much memory is available and how the memory parameters should be set in the software. Refer to the online calculator (http://greenplum.org/calc/) to determine these settings.
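The calculator is the authoritative source for these values. As a rough illustration of the kind of arithmetic involved, the sketch below uses a formula for gp_vmem_protect_limit that has appeared in Greenplum documentation; treat the constants here as assumptions and confirm any result against the calculator.

# Rough illustration only; confirm the result with the online calculator.
RAM_GB=256              # physical RAM per segment host
SWAP_GB=32              # configured swap per segment host
ACTING_PRIMARIES=8      # primaries per host during normal operation
awk -v ram=$RAM_GB -v swap=$SWAP_GB -v segs=$ACTING_PRIMARIES 'BEGIN {
    gp_vmem = ((swap + ram) - (7.5 + 0.05 * ram)) / 1.7       # GB usable by all segments
    printf "gp_vmem_protect_limit per segment: ~%d MB\n", (gp_vmem / segs) * 1024
}'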

Refer to Appendix D for further reading on segment instance configuration.

ETL Servers

Typically a 2U server.

The same processors, RAM, and interconnect NICs as the segment servers.

One or two RAID cards with maximum cache and cache protection (flash or capacitors preferred over battery).

12 to 24 disks organized into RAID 5 groups of six to eight disks, with no hot spares configured (unless there are available disks after the RAID groups are constructed).

SATA disks are a good choice for ETL, as performance is typically less of a concern than storage for these systems.

Should be capable of read rates in gpcheckperf of 100 MB/s or higher. (The faster the ETL servers scan, the faster query data can be loaded.)

Should be capable of write rates in gpcheckperf of 500 MB/s or higher. (The faster ETL servers write, the faster data can be staged for loading.)

Additional Tips for Selecting ETL Servers

ETL nodes can be any server that offers enough storage and performance to accomplish the tasks required. Typically, between 4 and 8 ETL servers are required per cluster. The maximum number depends on the desired load performance and the size of the Greenplum Database cluster.

For example, the larger the Greenplum Database cluster, the faster the loads can be. The more ETL servers, the faster data can be served. Having more ETL bandwidth than the cluster can receive is pointless. Having much less ETL bandwidth than the cluster can receive makes for slower loading than the maximum possible.

Hard Disk Configuration Guidelines

A generic server with 24 hot-swappable disks can have several potential disk configurations. Testing by Pivotal Platform and Systems Engineering shows that the best performing storage for Pivotal software is:

four RAID 5 groups of six disks each (used as four filesystems), or

combined into one or two filesystems using a logical volume manager.

The following instructions describe how to build the recommended RAID groups and virtual disks for both master and segment nodes. How these ultimately translate into filesystems is covered in the relevant operating system's installation guide.

LUN Configuration

The RAID controller settings and disk configuration are based on synthetic load testing performed on several RAID configurations. Unfortunately, the settings that resulted in the best read rates did not have the highest write rates, and the settings with the best write rates did not have the highest read rates.

The prescribed settings offer a compromise. In other words, these settings result in write rates lower than the best measured write rate but higher than the write rates associated with the settings for the highest read rate. The same is true for read rates. This is intended to ensure that both input and output are the best they can be while affecting the other the least amount possible.

LUNs for the system should be partitioned and mounted as /data1 for the first LUN, and additional LUNs should follow the same naming convention while incrementally increasing the number (/data1, /data2, /data3 ... /dataN). All filesystems should be formatted as xfs and follow the recommendations set forth in the Pivotal Greenplum Database Installation Guide.
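As an illustration, the commands below create and mount two such filesystems, assuming the virtual disks appear as /dev/sdb and /dev/sdc and that the whole devices are used without partition tables for brevity. The exact mkfs and mount options should be taken from the Pivotal Greenplum Database Installation Guide.

# Illustrative only: two data LUNs formatted as xfs and mounted as /data1 and /data2.
# Device names and mount options below are assumptions; follow the Installation Guide.
mkfs.xfs /dev/sdb
mkfs.xfs /dev/sdc
mkdir -p /data1 /data2
cat >> /etc/fstab <<'EOF'
/dev/sdb  /data1  xfs  rw,noatime,inode64,allocsize=16m  0 0
/dev/sdc  /data2  xfs  rw,noatime,inode64,allocsize=16m  0 0
EOF
mount -a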

Master Server

Master servers (primary and secondary) have eight hot-swappable disks. Configure all eight disks into a single RAID 5 stripe set. Each of the virtual disks that are carved from this disk group should have the following properties:

256k stripe width

No read-ahead

Disk cache disabled

Direct I/O

Virtual disks are configured in the RAID card's option ROM. Each virtual disk defined in the RAID card will appear to be a disk in the operating system with a /dev/sd? device file name.

Segment and ETL Servers

Segment servers have 24 hot-swappable disks. These can be configured in a number of ways, but Pivotal recommends four RAID 5 groups of six disks each (RAID 5, 5+1). Each of the virtual disks that will be carved from these disk groups should have the following properties:

256k stripe width

No read-ahead

Disk cache disabled

Direct I/O

Virtual disks are configured in the RAID card's option ROM. Each virtual disk defined in the RAID card will appear to be a disk in the operating system with a /dev/sd? device file name.

SSD Storage

Flash storage has been gaining in popularity. Pivotal has not had the opportunity to do enough testing with SSD drives to make a recommendation. It is important when considering SSD drives to validate the sustained sequential read and write rates for the drive. Many drives have impressive burst rates but are unable to sustain those rates for long periods of time. Additionally, the choice of RAID card needs to be evaluated to ensure it can handle the bandwidth of the SSD drives.

SAN/JBOD Storage

In some configurations it may be a requirement to use an external storage array due to the database size or server type being used by the customer. With this in mind, it is important to understand that, based on testing by Pivotal Platform and Systems Engineering, SAN and JBOD storage will not perform as well as local, internal server storage.

Some considerations to be taken into account if installing or sizing such a configuration are the following (independent of the vendor of choice):

Know the database size and the estimated growth over time

Know the customer's read/write ratio

Large block I/O is the predominant workload (512 KB)

Disk type and preferred RAID type based on the vendor of choice

Expected disk throughput based on read and write

Response time of the disks/JBOD controller

Preferred option is to have BBU capability on either the RAID card or controller

Redundancy in switch zoning, preferably with a fan in:out ratio of 2:1

At least 8 Gb Fibre Channel (FC) connectivity

Ensure that the server supports the use of FC, FCoE, or external RAID cards

In all instances where an external storage source is being utilized, the vendor of the disk array/JBOD should be consulted to obtain specific recommendations based on a sequential workload. This may also require the customer to obtain additional licenses from the pertinent vendors.

Network Layout Guidelines

All the systems in the Greenplum cluster need to be tied together in some form of dedicated, high-speed data interconnect. This network is used for loading data and for passing data between systems during query processing. It should be as high-speed and low-latency as possible, and it should not be used for any other purpose (i.e., it should not be part of the general LAN).

A rule of thumb for network utilization in a Greenplum cluster is to plan for up to twenty percent of each server's maximum I/O read bandwidth as network traffic. This means a server with a 2000 MB/s read bandwidth (as measured by gpcheckperf) might be expected to transmit 400 MB/s. Greenplum also compresses some data on disk but uncompresses it before transmitting to other systems in the cluster, so a 2000 MB/s read rate with a 4x compression ratio results in an 8000 MB/s effective read rate. Twenty percent of 8000 MB/s is 1600 MB/s, which is more than a single gigabit interface's bandwidth.

To accommodate this traffic, 10 Gb networking is recommended for the interconnect. Current best practice suggests two 10 Gb interfaces for the cluster interconnect. This ensures that there is bandwidth to grow into and reduces cabling in the racks. It is recommended to configure the two 10 Gb interfaces with NIC bonding to create a load-balanced, fault-tolerant interconnect.
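A minimal sketch of such a bond on a RHEL 6-style host is shown below. The interface names (eth2, eth3), the bonding mode, and the address are assumptions for illustration and must match the switch-side MLAG/LACP configuration and the IP plan described later in this document.

# Sketch only: bond two assumed 10 Gb interfaces (eth2, eth3) into bond0 for the interconnect.
cat > /etc/sysconfig/network-scripts/ifcfg-bond0 <<'EOF'
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=none
IPADDR=172.1.1.1
NETMASK=255.255.0.0
BONDING_OPTS="mode=802.3ad miimon=100 xmit_hash_policy=layer3+4"
EOF
for nic in eth2 eth3; do
  cat > /etc/sysconfig/network-scripts/ifcfg-$nic <<EOF
DEVICE=$nic
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
EOF
done
service network restart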

Cisco, Brocade, and Arista switches are good choices, as these brands include the ability to tie switches together in fabrics. Together with NIC bonding on the servers, this approach eliminates single points of failure in the interconnect network. Intel, QLogic, or Emulex network interfaces tend to work best. Layer 3 capability is recommended since it integrates many features that are useful in a Greenplum Database environment.

Note: The vendor hardware referenced above is strictly mentioned as an example. Pivotal Platform and Systems Engineering does not specify which products to use in the network.

FCoE switch support is also required if SAN storage is used, as well as support for Fibre snooping (FIPS).

A Greenplum Database cluster uses three kinds of network connections:

Admin networks

Interconnect networks

External networks

Admin Networks

An Admin network ties together all the management interfaces for the devices in a configuration. It is generally used to provide monitoring and out-of-band console access for each connected device. The Admin network is typically a 1 Gb network that is physically and logically distinct from other networks in the cluster.

Servers are typically configured such that the out-of-band or lights-out management interfaces share the first network interface on each server. In this way, the same physical network provides access to lights-out management and an operating-system-level connection useful for network OS installation, patch distribution, monitoring, and emergency access.

Switch Types

Typically one 24- or 48-port, 1 Gb switch per rack and one additional 48-port switch cluster as a core.

Any 1 Gb switch can be used for the Admin network. Careful planning is required to ensure that a network topology is designed to provide enough connections and the features desired by the site to provide the kinds of access required.

Cables

Use either Cat 5e or Cat 6 cabling for the Admin network. Cable the lights-out or management interface from each cluster device to the Admin network. Place an Admin switch in each rack and cross-connect the switches rather than attempting to run cables from a central switch to all racks.

Note: Pivotal recommends using a different color cable for the Admin network.

Interconnect Networks

The interconnect network ties the servers in the cluster together and forms a high-speed, low-contention data connection between the servers. This should not be implemented on the general data center network, as Greenplum Database interconnect traffic tends to overwhelm networks from time to time. Low latency is needed to ensure proper functioning of the Greenplum Database cluster. Sharing the interconnect with a general network tends to introduce instability into the cluster.

Typically two switches are required per rack, and two more to act as a core. Use two 10 Gb cables per server and eight per rack to connect the rack to the core.

Interconnect networks are often connected to general networks in limited ways to facilitate data loading. In these cases, it is important to shield both the interconnect network and the general network from the Greenplum Database traffic and vice versa. Use a router or an appropriate VLAN configuration to accomplish this.

External Network Connections

The master nodes are connected to the general customer network to allow users and applications to submit queries. Typically, this is done with a small number of 1 Gb connections attached to the master nodes. Any method that affords network connectivity from the users and applications needing access to the master nodes is acceptable.

Installation Guidelines

Each configuration requires a specific rack plan. There are single-rack and multi-rack configurations, determined by the number of servers present in the configuration. A single-rack configuration is one where all the planned equipment fits into one rack. Multi-rack configurations require two or more racks to accommodate all the planned equipment.

Racking Guidelines for a 42U Rack

Consider the following if installing the cluster in a 42U rack.

Prior to racking any hardware, perform a site survey to determine what power option is desired, whether power cables will enter at the top or bottom of the rack, and whether network switches and patch panels will be at the top or bottom of the rack.

Install the KMM tray into rack unit 19.

Install the interconnect switches into rack units 21 and 22, leaving a one-unit gap above the KMM tray.

Rack segment nodes up from the first available rack unit at the bottom of the rack (see multi-rack rules for variations using low rack units).

Install no more than sixteen 2U servers (excludes master but includes segment and ETL nodes).

Install the master node into rack unit 17. Install the stand-by master into rack unit 18.

Admin switches can be racked anywhere in the rack, though the top is typically the best and simplest location.


All computers, switches, arrays, and racks should be labeled on both the front and back.

All computers, switches, arrays, and racks should be labeled as described in the section on labels later in this document.

All installed devices should be connected to two or more power distribution units (PDUs) in the rack where the device is installed.

When installing a multi-rack cluster:

Install the interconnect core switches in the top two rack units if the cables come in from the top, or in the bottom two rack units if the cables come in from the bottom.

Do not install core switches in the master rack.

Cabling

The number of cables required varies according to the options selected. In general, each server and switch installed will use one cable for the Admin network. Run cables according to established cabling standards. Eliminate tight bends or crimps. Clearly label all cables at each end. The label on each end of the cable must trace the path the cable follows between server and switch. This includes:

Switch name and port

Patch panel name and port, if applicable

Server name and port

Switch Configuration Guidelines

Typically, the factory default configuration is sufficient.

IP Addressing Guidelines

IP Addressing Scheme for the Admin Network

An Admin network should be created so that system maintenance and access work can be done on a network that is separate from the cluster traffic between the nodes.

Note: Pivotal's recommended IP addressing for servers on the Admin network uses a standard internal address space and is extensible to include over 1,000 nodes.

All Admin network switches present should be cross-connected, and all NICs attached to these switches participate in the 172.254.0.0/16 network.

Table 4. IP Addresses for Servers and CIMC

Primary Master Node
CIMC: 172.254.1.252/16
Eth0: 172.254.1.250/16

Secondary Master Node
CIMC: 172.254.1.253/16
Eth0: 172.254.1.251/16

Non-master Segment Nodes in rack 1 (master rack)
CIMC: 172.254.1.101/16 through 172.254.1.116/16
Eth0: 172.254.1.1/16 through 172.254.1.16/16

Non-master Segment Nodes in rack 2
CIMC: 172.254.2.101/16 through 172.254.2.116/16
Eth0: 172.254.2.1/16 through 172.254.2.16/16

Non-master Segment Nodes in rack #
CIMC: 172.254.#.101/16 through 172.254.#.116/16
Eth0: 172.254.#.1/16 through 172.254.#.16/16

Note: Where # is the rack number.

The fourth octet is counted from the bottom up. For example, the bottom server in the first rack is 172.254.1.1 and the top, excluding masters, is 172.254.1.16.

The bottom server in the second rack is 172.254.2.1 and the top is 172.254.2.16. This continues for each rack in the cluster regardless of individual server purpose.
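Because the scheme is purely positional, host entries can be generated rather than typed by hand. The loop below is illustrative only; it assumes 16 segment nodes per rack, the sdw naming used elsewhere in this document, and a hypothetical "-cimc" suffix for the lights-out interfaces.

# Illustrative only: print Admin-network /etc/hosts entries following Table 4.
RACKS=2
NODES_PER_RACK=16
n=0
for rack in $(seq 1 "$RACKS"); do
  for node in $(seq 1 "$NODES_PER_RACK"); do
    n=$((n + 1))
    printf "172.254.%d.%d\tsdw%d\n"      "$rack" "$node"         "$n"
    printf "172.254.%d.%d\tsdw%d-cimc\n" "$rack" $((node + 100)) "$n"
  done
done >> /etc/hosts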

IP Addressing for Non-server Devices

The following table lists the correct IP addressing for each non-server device.

Table 5. Non-server IP Addresses

First Interconnect Switch in Rack: 172.254.#.201/16*

Second Interconnect Switch in Rack: 172.254.#.202/16*

* Where # is the rack number

IP Addressing for Interconnects Using 10 Gb NICs

The interconnect is where data is routed at high speed between the nodes.

Table 6. Interconnect IP Addressing for 10 Gb NICs

Primary Master
1st RJ-45 port on PCIe card: 172.1.1.250/16
2nd RJ-45 port on PCIe card: 172.2.1.250/16

Secondary Master
1st RJ-45 port on PCIe card: 172.1.1.251/16
2nd RJ-45 port on PCIe card: 172.2.1.251/16

Non-Master Nodes
1st RJ-45 port on PCIe card: 172.1.#.1/16 through 172.1.#.16/16
2nd RJ-45 port on PCIe card: 172.2.#.1/16 through 172.2.#.16/16

Note: Where # is the rack number:

The fourth octet is counted from the bottom up. For example, the bottom server in the first rack uses 172.1.1.1 and 172.2.1.1.

The top server in the first rack, excluding masters, uses 172.1.1.16 and 172.2.1.16.

Each NIC on the interconnect uses a different subnet and each server has a NIC on each subnet.

IP Addressing for Fault Tolerant Interconnects

The following table lists correct IP addresses for fault tolerant interconnects regardless of bandwidth.

Table 7. Fault Tolerant (Bonded) Interconnects

Primary Master: 172.1.1.250/16

Secondary Master: 172.1.1.251/16

Non-Master Nodes: 172.1.#.1/16 through 172.1.#.16/16

Note: Where # is the rack number:

The fourth octet is counted from the bottom up. For example, the bottom server in the first rack uses 172.1.1.1.

The top server in the first rack, excluding masters, uses 172.1.1.16.


Data Loading Connectivity Guidelines

High-speed data loading requires direct access to the segment nodes, bypassing the masters. There are three ways to connect a Pivotal cluster to external data sources or backup targets:

VLAN Overlay – The first and recommended best practice is to use virtual LANs (VLANs) to open up specific hosts in the customer network and the Greenplum Database cluster to each other.

Direct Connect to Customer Network – Only use if there is a specific customer requirement.

Routing – Only use if there is a specific customer requirement.

VLAN Overlay

VLAN overlay is the most commonly used method to provide access to external data without introducing network problems. The VLAN overlay imposes an additional VLAN on the connections of a subset of the cluster servers.

How the VLAN Overlay Method Works

Using the VLAN overlay method, traffic passes between the cluster servers on the internal VLAN but cannot pass out of the internal switch fabric because the external-facing ports are assigned only to the overlay VLAN. Traffic on the overlay VLAN (traffic to or from IP addresses assigned to the relevant servers' virtual network interfaces) can pass in and out of the cluster.

This VLAN configuration allows multiple clusters to co-exist without requiring any change to their internal IP addresses. This gives customers more control over which elements of the clusters are exposed to the general customer network. The overlay VLAN can be a dedicated VLAN that includes only those servers that need to talk to each other, or the overlay VLAN can be the customer's full network.

Figure: Basic VLAN Overlay Example

This figure shows a cluster with 3 segment hosts, a master, a standby master, and an ETL host. In this case, only the ETL host is part of the overlay. It is not a requirement to have the ETL node use the overlay, though this is common in many configurations to allow data to be staged within a cluster. Any of the servers in this rack, or any rack of any other configuration, may participate in the overlay if desired. The type of configuration will depend upon security requirements and whether functions within the cluster need to reach any outside data sources.

Configuring the Overlay VLAN – An Overview

Configuring the VLAN involves three steps:

1. Virtual interface tags packets with the overlay VLAN

2. Configure the switch in the cluster with the overlay VLAN

3. Configure the ports on the switch connecting to the customer network

Step 1 – Virtual interface tags packets with the overlay VLAN

Each server that is both in the base VLAN and the overlay VLAN has a virtual interface created that tags packets sent from the interface with the overlay VLAN. For example, suppose eth2 is the physical interface on an ETL server that is connected to the first interconnect network. To include this server in an overlay VLAN, the interface eth2.1000 is created using the same physical port but defining a second interface for the port. The physical port does not tag its packets, but any packet sent using the virtual port is tagged with a VLAN.
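The sketch below shows one way this is done on a RHEL-style host, using the eth2 and VLAN ID 1000 values from the example above; the overlay address shown is a placeholder to be supplied by the customer network plan.

# Sketch of Step 1: create eth2.1000, a virtual interface that tags its traffic with VLAN 1000.
cat > /etc/sysconfig/network-scripts/ifcfg-eth2.1000 <<'EOF'
DEVICE=eth2.1000
VLAN=yes
ONBOOT=yes
BOOTPROTO=none
IPADDR=192.0.2.15
NETMASK=255.255.255.0
EOF
ifup eth2.1000
# Equivalent ad hoc commands:
#   ip link add link eth2 name eth2.1000 type vlan id 1000
#   ip addr add 192.0.2.15/24 dev eth2.1000
#   ip link set eth2.1000 up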

Step 2 – Configure the switch in the cluster with the overlay VLAN

The switch in the cluster that connects to the servers and the customer network is configured with the overlay VLAN. All of the ports connected to servers that will participate in the overlay are changed to switchport mode converged and added to both the internal VLAN (199) and the overlay VLAN (1000).

Step 3 – Configure the switch ports connected to the customer network

The ports on the switch connecting to the customer network are configured as either access or trunk mode switch ports (depending on customer preference) and added only to the overlay VLAN.

Direct Connect to the Customer's Network

Each node in the Greenplum Database cluster can simply be cabled directly to the network where the data sources exist, or to a network that can communicate with the source network. This is a brute-force approach that works very well. Depending on what network features are desired (redundancy, high bandwidth, etc.), this method can be very expensive in terms of cabling and switch gear, as well as space for running large numbers of cables.

Figure: Data Loading – Direct Connect to Customer Network

Routing

One way is to use any of the standard networking methods used to link two different networks together. These can be deployed to tie the interconnect network(s) to the data source network(s). Which of these methods is used will depend on the circumstances and the goals for the connection.

A router is installed that advertises the external networks to the servers in the Greenplum cluster. This method could potentially have performance and configuration implications on the customer's network.

Validation Guidelines

Most of the validation effort is performed after the OS is installed and a variety of OS-level tools are available. A checklist that should be separately printed and signed for delivery is included in the relevant OS installation guide; it includes the issues raised in this section.

Examine and verify the following items:

All cables labeled according to the standards in this document


All racks labeled according to the standards in this document

All devices power on

All hot-swappable devices are properly seated

No devices show any warning or fault lights

All network management ports are accessible via the administration LAN

All cables are neatly dressed into the racks and have no sharp bends or crimps

All rack doors and covers are installed and close properly

All servers extend and retract without pinching or stretching cables

Labels

Racks

Each rack in a Recommended Architecture is labeled at the top of the rack and on both the front and back. Racks are named Master Rack or Segment Rack #, where # is a sequential number starting at 1. A rack label would look like this:

Servers

Each server is labeled on both the front and back of the server. The label should be the hostname of the server.

In other words, if a segment node is known as sdw15, the label on that server would be sdw15.

Switches

Switches are labeled according to their purpose. Interconnect switches are i-sw, administration switches are a-sw, and ETL switches are e-sw. Each switch is assigned a number starting at 1. Switches are labeled on the front of the switch only, since the back is generally not visible when racked.

Certification Guidelines

Network Performance Test

gpcheckperf verifies the line rate on both 10 Gb NICs.

Run gpcheckperf on the disks and network connections within the cluster. As each certification will vary due to the number of disks, nodes, and network bandwidth available, the commands to run the tests will differ.

See Using gpcheckperf to Validate Disk and Network Performance for more information on the gpcheckperf command.

Hardware Monitoring and Failure Analysis Guidelines

In order to support monitoring of a running cluster, the following items should be in place and capable of being monitored, with the information gathered available via interfaces such as SNMP or IPMI.


Fans/Temp

Fan status/presence

Fan speed

Chassis temp

CPU temp

IOH temp

Memory

DIMM temp

DIMM status (populated, online)

DIMM single bit errors

DIMM double bit errors

ECC warnings (corrections exceeding threshold)

ECC correctable errors

ECC uncorrectable errors

Memory CRC errors

System Errors

POST errors

PCIe fatal errors

PCIe non-fatal errors

CPU machine check exception

Intrusion detection

Chipset errors

Power

Power supply presence

Power supply failures

Power supply input voltage

Power supply amperage

Motherboard voltage sensors

System power consumption


Pivotal Cluster Examples

The following table lists good choices for cluster hardware based on Intel Sandy Bridge processor-based servers and Cisco switches.

Table 1. Hardware Components

Master Node

Two of these nodes per cluster

1U server (similar to the Dell R630):

2 x E5-2680 v3 processors (2.5 GHz, 12 cores, 120W)

256 GB RAM (8 x 16 GB)

1 x RAID card with 1 GB protected cache

8 x SAS, 10k, 6G disks (typically 8 x 600 GB, 2.5"), organized into a single RAID 5 disk group with a hot spare. Logical devices are defined per the OS needs (boot, root, swap, etc.), with the remaining space in a single, large filesystem for data.

2 x 10 Gb Intel, QLogic, or Emulex based NICs

Lights-out management (IPMI-based BMC)

2 x 650W or higher, high-efficiency power supplies

Segment Node & ETL Node

Up to 16 per rack. No maximum total count.

2U server (similar to the Dell R730xd):

2 x E5-2680 v3 processors (2.5 GHz, 12 cores, 120W)

256 GB RAM (8 x 16 GB)

1 x RAID card with 1 GB protected cache

12 to 24 x SAS, 10k, 6G disks (typically 12 x 600 GB, 3.5" or 24 x 1.8 TB, 2.5"), organized into two to four RAID 5 groups. Used either as two to four data filesystems (with logical devices skimmed off for boot, root, swap, etc.) or as one large device bound with Logical Volume Manager.

2 x 10 Gb Intel, QLogic, or Emulex based NICs

Lights-out management (IPMI-based BMC)

2 x 650W or higher, high-efficiency power supplies

Admin Switch

Cisco Catalyst 2960 Series

A simple, 48-port, 1 Gb switch with features that allow it to be easily combined with other switches to expand the network. The least expensive managed switch with good reliability is appropriate for this role. There will be at least one per rack.

Interconnect

Arista 7050-52

The Arista switch line allows for multi-switch link aggregation groups (called MLAG), easy expansion, and a reliable body of hardware and operating system.


Example Rack Layout

The following figure is an example rack layout with proper switch and server placement.

Figure: 42U Rack Diagram


Using gpcheckperf to Validate Disk and Network Performance

The following examples illustrate how gpcheckperf is used to validate disk and network performance in a cluster.

Checking Disk Performance – gpcheckperf Output

[gpadmin@mdw ~]$ gpcheckperf -f hosts -r d -D -d /data1/primary -d /data2/primary -S 80G

/usr/local/greenplum-db/./bin/gpcheckperf -f hosts -r d -D -d /data1/primary -d /data2/primary -S 80G

--------------------
DISK WRITE TEST
--------------------

--------------------
DISK READ TEST
--------------------

====================
== RESULT
====================

disk write avg time (sec): 71.33
disk write tot bytes: 343597383680
disk write tot bandwidth (MB/s): 4608.23
disk write min bandwidth (MB/s): 1047.17 [sdw2]
disk write max bandwidth (MB/s): 1201.70 [sdw1]
-- per host bandwidth --
disk write bandwidth (MB/s): 1200.82 [sdw4]
disk write bandwidth (MB/s): 1201.70 [sdw1]
disk write bandwidth (MB/s): 1047.17 [sdw2]
disk write bandwidth (MB/s): 1158.53 [sdw3]

disk read avg time (sec): 103.17
disk read tot bytes: 343597383680
disk read tot bandwidth (MB/s): 5053.03
disk read min bandwidth (MB/s): 318.88 [sdw2]
disk read max bandwidth (MB/s): 1611.01 [sdw1]
-- per host bandwidth --
disk read bandwidth (MB/s): 1611.01 [sdw1]
disk read bandwidth (MB/s): 318.88 [sdw2]
disk read bandwidth (MB/s): 1560.38 [sdw3]

Checking Network Performance – gpcheckperf Output


[gpadmin@mdw ~]$ gpcheckperf -f network1 -r N -d /tmp

/usr/local/greenplum-db/./bin/gpcheckperf -f network1 -r N -d /tmp

-------------------
--  NETPERF TEST
-------------------

====================
== RESULT
====================

Netperf bisection bandwidth test
sdw1 -> sdw2 = 1074.010000
sdw3 -> sdw4 = 1076.250000
sdw2 -> sdw1 = 1094.880000
sdw4 -> sdw3 = 1104.080000

Summary:
sum = 4349.22 MB/sec
min = 1074.01 MB/sec
max = 1104.08 MB/sec
avg = 1087.31 MB/sec
median = 1094.88 MB/sec


Pivotal Greenplum Segment Instances per Server

Understanding Greenplum Segments

Greenplum segment instances are essentially individual databases. In a Greenplum cluster there will be a Greenplum master server, which dispatches work to be done to multiple segment instances. Each of these instances resides on a segment host. Data for a table is distributed across all of the segment instances, and when a query that requests data is executed it is dispatched to all of them to execute in parallel. Those instances that actively process the query are referred to as the primary instances. A Greenplum cluster will in addition be running mirror instances, one paired to each primary. The mirrors do not participate in answering queries; they only perform data replication, so that if a primary should fail its mirror can take over processing in its place.

When planning a cluster, it is important to understand that all of these instances are going to accept a query in parallel and act upon it. Therefore there must be enough resources on a server to facilitate all of these processes running and communicating with each other at once.

Segment Resources Rule of Thumb

A general rule of thumb is that for every segment instance (primary or mirror) you will want to provide at least:

1 core

200 MB/s IO read

200 MB/s IO write

8 GB RAM

1 Gb network throughput

A segment host with 8 primary and 8 mirror instances would have:

16 cores

3200 MB/s IO read

3200 MB/s IO write

128 GB RAM

20 Gb network throughput

These numbers have proven to provide a reliable platform for a variety of use cases and give a good baseline for the number of instances to run on a single server. Pivotal recommends a maximum of 8 primary and 8 mirror instances on a server, even if the resources provided are sufficient for more.
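Scripted, the rule of thumb looks like the sketch below; the instance counts are the inputs, and the per-instance constants are the ones listed above.

# Illustrative only: minimum host resources implied by the per-instance rule of thumb.
PRIMARIES=8
MIRRORS=8
INSTANCES=$((PRIMARIES + MIRRORS))
echo "cores:              $INSTANCES"
echo "IO read  (MB/s):    $((INSTANCES * 200))"
echo "IO write (MB/s):    $((INSTANCES * 200))"
echo "RAM (GB):           $((INSTANCES * 8))"
echo "network (Gb), min:  $((INSTANCES * 1))"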

Pivotal has found that allocating a ratio of 1 to 2 physical CPUs per primary segment works well for most use cases; it is not recommended to drop below 1 CPU per primary segment. Ideal architectures will additionally align the NUMA architecture with the number of segments.

Reasons to reduce the number of segment instances per server

A database schema that uses partitioned columnar tables has the potential to generate a large number of files. For example, a table that is partitioned daily for a year will have over 300 files, one for each day. If that table additionally has columnar orientation with 300 columns, it will have well over 90,000 files representing the data in that table on one segment instance. A server that is running 8 primary instances with this table would have to open 720,000 files if a full table scan query were issued against that table. Systems that make use of partitioned columnar tables may benefit from a smaller number of segment instances per server if data is being used in a way that requires many open files.

Systems that span large numbers of nodes create more work for the master to plan queries and coordinate all of the segments. In systems spanning two or more racks, consider reducing the number of segment instances per server.

When queries require large amounts of memory, reducing the number of segments per server increases the amount of memory available to any one segment.

If the amount of concurrent query processing causes resources to run low on the system, reducing the amount of parallelism on the platform itself will allow for more parallelism in query execution.

Reasons to increase the number of segment instances per server


In low-concurrency systems, increasing the segment instance count will allow each query to utilize more resources in parallel if system utilization is low.

Systems with large amounts of free RAM that can be used by the OS for file buffers may benefit from increasing the number of segment instances per server.


Pivotal Greenplum on Virtualized Systems

General understanding of Pivotal Greenplum and virtualization

Greenplum Database is parallel processing software. This means that the Pivotal Greenplum software often does the same processing at the same time across a cluster of nodes. Virtualization is frequently used to centralize systems so that they can share resources, taking advantage of the fact that software often utilizes resources sporadically, which allows those resources to be over-subscribed. Greenplum Database will not function well in an oversubscribed environment because all segments become active at once during query processing. In that type of environment, the system is prone to bottlenecks and unpredictable behavior that can result from being unable to access resources the system believes it has been allocated.

With this in mind, as long as the system meets the requirements set forth in the installation guide, Greenplum is supported on virtual infrastructure.

Choosing the number of segment instances to run per VM

The recommended hardware specifications are quite large and may be hard to achieve in a virtual environment. In these cases, each VM should have no more than 1 primary and 1 mirror segment for every 2 CPUs, 32 GB of RAM, and 300 MB/s of sequential read bandwidth and write bandwidth. Thus a VM with 4 CPUs, 64 GB RAM, and 1 GB/s sequential read and write would be able to host 2 primary segment instances and 2 mirror segment instances.
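The sizing can be expressed as a small script. The sketch below applies the ratios just described, takes the most constrained resource as the limit, and reproduces the 4 CPU / 64 GB / 1 GB/s example.

# Illustrative only: primary/mirror pairs a VM can host per the ratios above.
VCPUS=4
RAM_GB=64
SEQ_MBPS=1000           # sequential read/write bandwidth in MB/s
by_cpu=$((VCPUS / 2))
by_ram=$((RAM_GB / 32))
by_io=$((SEQ_MBPS / 300))
pairs=$(printf "%s\n" "$by_cpu" "$by_ram" "$by_io" | sort -n | head -1)
echo "primary segments: $pairs, mirror segments: $pairs"    # 2 and 2 for this example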

While it is possible to create segment host VMs that host only a single primary segment instance, it is preferred to have at least two primary segment instances per VM. Certain queries that perform tasks such as looking for uniqueness can cause some segment instances to perform more work, and require more resources, than other instances. Grouping multiple segment instances together on one server can mitigate some of these increased resource needs by allowing a segment instance to utilize the resources allocated to the other segment instances.

VM Environment Settings

VMs hosting Greenplum Database should not have any auto-migration features turned on. The segment instances are expected to run in parallel, and if one of them is paused to coalesce memory or state for migration, the system can see it as a failure or outage. It would be better to take the system down, remove it from the active cluster, and then introduce it back into the cluster once it has been moved.

Special care should be given to understanding the topology of primary and mirror segment instances. No set of VMs that contains a primary and its mirror should run on the same host system. If a host containing both the primary and mirror for a segment fails, the Greenplum cluster will be offline until at least one of them is restored to complete the database content.


Additional Helpful Tools

Yum Repository

Configuring a YUM repository on the master servers can make management of the software across the cluster more efficient, particularly in cases where the segment nodes do not have external internet access. More than one repository can make management easier, for example one repository for OS files and another for all other packages. Configure the repositories on both the master and standby master servers.
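A minimal sketch of such a repository definition is shown below; the repository name, path, and the mdw hostname are placeholders, and the repository itself would be built on the master with a tool such as createrepo.

# Illustrative only: point cluster hosts at a package repository served from the master.
cat > /etc/yum.repos.d/gpcluster-os.repo <<'EOF'
[gpcluster-os]
name=Cluster OS packages served from the master
baseurl=http://mdw/repos/os
enabled=1
gpgcheck=0
EOF
yum clean all
# On the master (assuming the packages live under /var/www/html/repos/os):
#   createrepo /var/www/html/repos/os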

Kickstart Images

Kickstart images for the master servers and segment hosts can speed up implementation of new servers and recovery of failed nodes. In most cases where there is a node failure but the disks are good, reimaging is not necessary because the disks in the failed server can be transferred to the new replacement node.

© Copyright Pivotal Software Inc, 2013-2016