Microsoft SQL Server Analysis Services Multidimensional Performance and Operations Guide

201

Transcript of Microsoft SQL Server Analysis Services Multidimensional Performance and Operations Guide

Microsoft SQL Server Analysis Services Multidimensional Performance and Operations GuideThomas Kejser and Denny LeeContributors and Technical Reviewers: Peter Adshead (UBS), T.K. Anand, KaganArca, Andrew Calvett (UBS), Brad Daniels, John Desch, Marius Dumitru, WillfriedFrber (Trivadis), Alberto Ferrari (SQLBI), Marcel Franke (pmOne), Greg Galloway (Artis Consulting), Darren Gosbell (James & Monroe), DaeSeong Han, Siva Harinath, Thomas Ivarsson (Sigma AB), Alejandro Leguizamo (SolidQ), Alexei Khalyako, Edward Melomed, AkshaiMirchandani, Sanjay Nayyar (IM Group), TomislavPiasevoli, Carl Rabeler (SolidQ), Marco Russo (SQLBI), Ashvini Sharma, Didier Simon, John Sirmon, Richard Tkachuk, Andrea Uggetti, Elizabeth Vitt, Mike Vovchik, Christopher Webb (Crossjoin Consulting), SedatYogurtcuoglu, Anne Zorner

Summary: Download this book to learn about Analysis Services Multidimensional performance tuning from an operational and development perspective. This book consolidates the previously published SQL Server 2008 R2 Analysis Services Operations Guide and SQL Server 2008 R2 Analysis Services Performance Guide into a single publication that you can view on portable devices. Category: Guide Applies to: SQL Server 2005, SQL Server 2008, SQL Server 2008 R2, SQL Server 2012 Source: White paper (link to source content, link to source content) E-book publication date: May 2012 200 pages

This page intentionally left blank

Copyright 2012 by Microsoft Corporation All rights reserved. No part of the contents of this book may be reproduced or transmitted in any form or by any means without the written permission of the publisher.

Microsoft and the trademarks listed at http://www.microsoft.com/about/legal/en/us/IntellectualProperty/Trademarks/EN-US.aspx are trademarks of the Microsoft group of companies. All other marks are property of their respective owners. The example companies, organizations, products, domain names, email addresses, logos, people, places, and events depicted herein are fictitious. No association with any real company, organization, product, domain name, email address, logo, person, place, or event is intended or should be inferred. This book expresses the authors views and opinions. The information contained in this book is provided without any express, statutory, or implied warranties. Neither the authors, Microsoft Corporation, nor its resellers, or distributors will be held liable for any damages caused or alleged to be caused either directly or indirectly by this book.

Contents1 2 Introduction..........................................................................................................................................5 Part1:BuildingaHighPerformanceCube...........................................................................................6 2.1 2.2 2.3 2.4 2.5 3 DesignPatternsforScalableCubes...........................................................................................6 TestingAnalysisServicesCubes..............................................................................................32 TuningQueryPerformance.....................................................................................................39 TuningProcessingPerformance..............................................................................................76 SpecialConsiderations............................................................................................................93

Part2:RunningaCubeinProduction...............................................................................................105 3.1 3.2 3.3 3.4 3.5 3.6 3.7 ConfiguringtheServer ..........................................................................................................106 . MonitoringandTuningtheServer........................................................................................133 SecurityandAuditing............................................................................................................143 HighAvailabilityandDisasterRecovery................................................................................147 DiagnosingandOptimizing....................................................................................................150 ServerMaintenance..............................................................................................................189 SpecialConsiderations..........................................................................................................192

4

Conclusion.........................................................................................................................................200

Sendfeedback...........................................................................................................................................200

4

1 IntroductionThisbookconsolidatestwopreviouslypublishedguidesintooneessentialresourceforAnalysisServices developersandoperationspersonnel.AlthoughthetitlesoftheoriginalpublicationsindicateSQLServer 2008R2,mostoftheknowledgethatyougainfromthisbookiseasilytransferredtootherversionsof AnalysisServices,includingmultidimensionalmodelsbuiltusingSQLServer2012. Part1isfromtheSQLServer2008R2AnalysisServicesPerformanceGuide.PublishedinOctober 2011,thisguidewascreatedfordevelopersandcubedesignerswhowanttobuildhighperformance cubesusingbestpracticesandinsightslearnedfromrealworlddevelopmentprojects.InPart1,youll learnproventechniquesforbuildingsolutionsthatarefastertoprocessandquery,minimizingtheneed forfurthertuningdowntheroad. Part2isfromtheSQLServer2008R2AnalysisServicesOperationsGuide.Thisguide,publishedinJune 2011,isintendedfordevelopersandoperationsspecialistswhomanagesolutionsthatarealreadyin production.Part2showsyouhowtoextractperformancegainsfromaproductioncube,including changingserverandsystemproperties,andperformingsystemmaintenancethathelpyouavoid problemsbeforetheystart. Whileeachguidetargetsadifferentpartofasolutionlifecycle,havingbothinasingleportableformat givesyouanintellectualtoolkitthatyoucanaccessonmobiledeviceswhereveryoumaybe.Wehope youfindthisbookhelpfulandeasytouse,butitisonlyoneofseveralformatsavailableforthiscontent. YoucanalsogetprintableversionsofbothguidesbydownloadingthemfromtheMicrosoftwebsite.

5

2 Part 1: Building a High-Performance CubeThissectionprovidesinformationaboutbuildingandtuningAnalysisServicescubesforthebestpossible performance.Itisprimarilyaimedatbusinessintelligence(BI)developerswhoarebuildinganewcube fromscratchoroptimizinganexistingcubeforbetterperformance. Thegoalofthissectionistoprovideyouwiththenecessarybackgroundtounderstanddesigntradeoffs andwithtechniquesanddesignpatternsthatwillhelpyouachievethebestpossibleperformanceof evenlargecubes. Cubeperformancecanbedividedintotwotypesofworkload:queryperformanceandprocessing performance.Becausetheseworkloadsareverydifferent,thissectionisorganizedintofourmain groups. DesignPatternsforScalableCubesNoamountofquerytuningandoptimizationcanbeatthebenefits ofawelldesigneddatamodel.Thissectioncontainsguidancetohelpyougetthedesignrightthefirst time.Ingeneral,goodcubedesignfollowsKimballmodelingtechniques,andifyouavoidsometypical designmistakes,youareinverygoodshape. TestingAnalysisServicesCubesIneveryITproject,preproductiontestingisacrucialpartofthe developmentanddeploymentcycle.Evenwiththemostcarefuldesign,testingwillstillbeabletoshake outerrorsandavoidproductionissues.Designingandrunningatestrunofanenterprisecubeistime wellinvested.Hence,thissectionincludesadescriptionofthetestmethodsavailabletoyou. TuningQueryPerformanceQueryperformancedirectlyimpactsthequalityoftheenduser experience.Assuch,itistheprimarybenchmarkusedtoevaluatethesuccessofanonlineanalytical processing(OLAP)implementation.AnalysisServicesprovidesavarietyofmechanismstoaccelerate queryperformance,includingaggregations,caching,andindexeddataretrieval.Thissectionalso providesguidanceonwritingefficientMultidimensionalExpressions(MDX)calculationscripts. TuningProcessingPerformanceProcessingistheoperationthatrefreshesdatainanAnalysisServices database.Thefastertheprocessingperformance,thesooneruserscanaccessrefresheddata.Analysis Servicesprovidesavarietyofmechanismsthatyoucanusetoinfluenceprocessingperformance, includingparallelizedprocessingdesigns,relationaltuning,andaneconomicalprocessingstrategy(for example,incrementalversusfullrefreshversusproactivecaching). SpecialConsiderationsSomefeaturesofAnalysisServicessuchasdistinctcountmeasuresandmany tomanydimensionsrequiremorecarefulattentiontothecubedesignthanothers.AttheendofPart1, youwillfindasectionthatdescribesthespecialtechniquesyoushouldapplywhenusingthesefeatures.

2.1 Design Patterns for Scalable CubesCubespresentauniquechallengetotheBIdeveloper:theyareadhocdatabasesthatareexpectedto respondtomostqueriesinshorttime.Thefreedomoftheenduserislimitedonlybythedatamodel youimplement.Achievingabalancebetweenuserfreedomandscalabledesignwilldeterminethe 6

successofacube.Eachindustryhasspecificdesignpatternsthatlendthemselveswelltovalueadding reportingandadetailedtreatmentofoptimal,industryspecificdatamodelisoutsidethescopeofthis book.However,therearealotofcommondesignpatternsyoucanapplyacrossallindustriesthis sectiondealswiththesepatternsandhowyoucanleveragethemforincreasedscalabilityinyourcube design.

2.1.1 Building Optimal DimensionsAwelltuneddimensiondesignisoneofthemostcriticalsuccessfactorsofahighperformingAnalysis Servicessolution.Thedimensionsofthecubearethefirststopfordataanalysisandtheirdesignhasa deepimpactontheperformanceofallmeasuresinthecube. Dimensionsarecomposedofattributes,whicharerelatedtoeachotherthroughhierarchies.Efficient useofattributesisakeydesignskilltomaster,andstudyingandimplementingtheattribute relationshipsavailableinthebusinessmodelcanhelpimprovecubeperformance. Inthissection,youwillfindguidanceonbuildingoptimizeddimensionsandproperlyusingboth attributesandhierarchies.

2.1.1.1 Using the KeyColumns, ValueColumn, and NameColumn Properties EffectivelyWhenyouaddanewattributetoadimension,threepropertiesareusedtodefinetheattribute.The KeyColumnspropertyspecifiesoneormoresourcefieldsthatuniquelyidentifyeachinstanceofthe attribute. TheNameColumnpropertyspecifiesthesourcefieldthatwillbedisplayedtoendusers.Ifyoudonot specifyavaluefortheNameColumnproperty,itisautomaticallysettothevalueoftheKeyColumns property. ValueColumnallowsyoutocarryfurtherinformationabouttheattributetypicallyusedfor calculations.Unlikememberproperties,thispropertyofanattributeisstronglytypedproviding increasedperformancewhenitisusedincalculations.Thecontentsofthispropertycanbeaccessed throughtheMemberValueMDXfunction. UsingbothValueColumnandNameColumntocarryinformationeliminatestheneedforextraneous attributes.Thisreducesthetotalnumberofattributesinyourdesign,makingitmoreefficient. Itisabestpracticetoassignanumericsourcefield,ifavailable,totheKeyColumnspropertyratherthan astringproperty.Furthermore,useasinglecolumnkeyinsteadofacomposite,multicolumnkey.Not onlydothesepracticesthisreduceprocessingtime,theyalsoreducethesizeofthedimensionandthe likelihoodofusererrors.Thisisespeciallytrueforattributesthathavealargenumberofmembers,that is,greaterthanonemillionmembers.

7

2.1.1.2 Hiding Attribute HierarchiesFormanydimensions,youwillwanttheusertonavigatehierarchiescreatedforeaseofaccess.For example,acustomerdimensioncouldbenavigatedbydrillingintocountryandcitybeforereachingthe customername,orbydrillingthroughagegroupsorincomelevels.Suchhierarchies,coveredinmore detaillater,makenavigationofthecubeeasierandmakequeriesmoreefficient. Inadditiontouserhierarchies,AnalysisServicesbydefaultcreatesaflathierarchyforeveryattributein adimensiontheseareattributehierarchies.Hidingattributehierarchiesisoftenagoodidea,because alotofhierarchiesinasingledimensionwilltypicallyconfuseusersandmakeclientqueriesless efficient.ConsidersettingAttributeHierarchyVisible=falseformostattributehierarchiesanduseuser hierarchiesinstead. 2.1.1.2.1 Hiding the Surrogate Key Itisoftenagoodideatohidethesurrogatekeyattributeinthedimension.Ifyouexposethesurrogate keytotheclienttoolsasaValueColumn,thosetoolsmayrefertothekeyvaluesinreports.The surrogatekeyinaKimballstarschemadesignholdsnobusinessinformation,andmayevenchangeif youremodeltype2history.Afteryoucreateadependencytothekeyintheclienttools,youcannot changethekeywithoutbreakingreports.Becauseofthis,youdontwantenduserreportsreferringto thesurrogatekeydirectlyandthisiswhywerecommendhidingit. Thebestdesignforasurrogatekeyistohideitfromusersinthedimensiondesignbysettingthe AttributeHierarchyVisible=falseandbynotincludingtheattributeinanyuserhierarchies.This preventsendusertoolsfromreferencingthesurrogatekey,leavingyoufreetochangethekeyvalueif requirementschange.

2.1.1.3 Setting or Disabling Ordering of AttributesInmostcases,youwantanattributetohaveanexplicitordering.Forexample,youwillwantaCity attributetobesortedalphabetically.YoushouldexplicitlysettheOrderByorOrderByAttribute propertyoftheattributetoexplicitlycontrolthisordering.Typically,thisorderingisbyattributename orkey,butitmayalsobeanotherattribute.Ifyouincludeanattributeonlyforthepurposeofordering anotherattribute,makesureyousetAttributeHierarchyEnabled=falseand AttributeHierarchyOptimizedState=NotOptimizedtosaveonprocessingoperations. Therearefewcaseswhereyoudontcareabouttheorderingofanattribute,yetthesurrogatekeyis onesuchcase.Forsuchhiddenattributethatyouusedonlyforimplementationpurposes,youcanset AttributeHierarchyOrdered=falsetosavetimeduringprocessingofthedimension.

2.1.1.4 Setting Default Attribute MembersAnyquerythatdoesnotexplicitlyreferenceahierarchywillusethecurrentmemberofthat hierarchy.ThedefaultbehaviorofAnalysisServicesistoassigntheAllmemberofadimensionasthe defaultmember,whichisnormallythedesiredbehavior.Butforsomeattributes,suchasthecurrent

8

dayinadatedimension,itsometimesmakessensetoexplicitlyassignadefaultmember.Forexample, youmaysetadefaultdateintheAdventureWorkscubelikethis.ALTERCUBE [Adventure Works]UPDATE DIMENSION [Date], DEFAULT_MEMBER='[Date].[Date].&[2000]'

However,defaultmembersmaycauseissuesintheclienttool.Forexample,MicrosoftExcel2010will notprovideavisualindicationthatadefaultmemberiscurrentlyselectedandhenceimplicitlyinfluence thequeryresult.ThismayconfuseuserswhoexpecttheAllleveltobethecurrentmemberwhenno othermembersareimpliedbythequery.Also,ifyousetadefaultmemberinadimensionwithmultiple hierarchies,youwilltypicallygetresultsthatarehardforuserstointerpret. Ingeneral,preferexplicitlydefaultmembersonlyondimensionswithsinglehierarchiesorinhierarchies thatdonothaveanAlllevel.

2.1.1.5 Removing the All LevelMostdimensionsrolluptoacommonAlllevel,whichistheaggregationofalldescendants.Butthere aresomeexceptionswhereisdoesnotmakesensetoqueryattheAlllevel.Forexample,youmayhave acurrencydimensioninthecubeandaskingforthesumofallcurrenciesisameaninglessquestion. ItcanevenbeexpensivetoaskfortheAlllevelofdimensionifthereisnotgoodaggregatetorespondto thequery.Forexample,ifyouhaveacubepartitionedbycurrency,askingfortheAlllevelofcurrency willcauseascanofallpartitions,whichcouldbeexpensiveandleadtoauselessresult. InordertopreventusersfromqueryingmeaninglessAlllevels,youcandisabletheAllmemberina hierarchy.YoudothisbysettingtheIsAggregateable=falseontheattributeatthetopofthehierarchy. NotethatifyoudisabletheAlllevel,youshouldalsosetadefaultmemberasdescribedintheprevious sectionifyoudont,AnalysisServiceswillchooseoneforyou.

2.1.1.6 Identifying Attribute RelationshipsAttributerelationshipsdefinehierarchicaldependenciesbetweenattributes.Inotherwords,ifAhasa relatedattributeB,writtenAB,thereisonememberinBforeverymemberinA,andmanymembers inAforagivenmemberinB.Forexample,givenanattributerelationshipCityState,ifthecurrent cityisSeattle,weknowtheStatemustbeWashington. Often,therearerelationshipsbetweenattributesthatmightormightnotbemanifestedintheoriginal dimensiontablethatcanbeusedbytheAnalysisServicesenginetooptimizeperformance.Bydefault, allattributesarerelatedtothekey,andtheattributerelationshipdiagramrepresentsabushwhere relationshipsallstemfromthekeyattributeandendateachothersattribute.

9

Figure 1 Bushy attribute relationship 11: a ps

Youcano optimizeperfo ormancebyd defininghierarchicalrelatio onshipssupportedbythedata.Inthisc case, amodeln nameidentifie estheproduc ctlineandsubcategory,an ndthesubcat tegoryidentif fiesacategor ry.In otherwor rds,asingles subcategoryis snotfoundin nmorethano onecategory.Ifyouredefi inethe relationsh hipsintheatt tributerelatio onshipeditor, ,therelationshipsareclea arer.

Figure 2 Redefin 22: ned attribu relation ute nships

Attributerelationships shelpperform manceinthre eesignificantways: sbetweenlev velsinthehie erarchydono otneedtogothroughthekeyattribute.This Crossproducts avesCPUtime eduringquer ries. sa Aggregationsb builtonattrib butescanber reusedforqu ueriesonrelat tedattributes s.Thissaves esourcesduringprocessing gandforque eries. re AutoExistcanmoreefficientlyeliminate eattributeco ombinationst thatdonotex xistinthedat ta.

Considert thecrosspro oductbetweenSubcategor ryandCatego oryinthetwo ofigures.Inthefirst,wher reno attributerelationshipshavebeenexplicitlydefin ned,theengin nemustfirstf findwhichpr roductsarein n 10

eachsubcategoryandthendeterminewhichcategorieseachoftheseproductsbelongsto.Forlarge dimensions,thiscantakealongtime.Iftheattributerelationshipisdefined,theAnalysisServices engineknowsbeforehandwhichcategoryeachsubcategorybelongstoviaindexesbuiltatprocesstime. 2.1.1.6.1 Flexible vs. Rigid Relationships Whenanattributerelationshipisdefined,therelationcaneitherbeflexibleorrigid.Aflexibleattribute relationshipisonewherememberscanmovearoundduringdimensionupdates,andarigidattribute relationshipisonewherethememberrelationshipsareguaranteedtobefixed.Forexample,the relationshipbetweenmonthandyearisfixedbecauseaparticularmonthisntgoingtochangeitsyear whenthedimensionisreprocessed.However,therelationshipbetweencustomerandcitymaybe flexibleascustomersmove. Whenachangeisdetectedduringprocessinaflexiblerelationship,allindexesforpartitionsreferencing theaffecteddimension(includingtheindexesforattributethatarenotaffected)mustbeinvalidated. ThisisanexpensiveoperationandmaycauseProcessUpdateoperationstotakeaverylongtime. IndexesinvalidatedbychangesinflexiblerelationshipsmustberebuiltafteraProcessUpdateoperation withaProcessIndexontheaffectedpartitions;thisaddsevenmoretimetocubeprocessing. Flexiblerelationshipsarethedefaultsetting.Carefullyconsidertheadvantagesofrigidrelationshipsand changethedefaultwherethedesignallowsit.

2.1.1.7 Using Hierarchies EffectivelyAnalysisServicesenablesyoutobuildtwotypesofuserhierarchies:naturalandunnaturalhierarchies. Eachtypehasdifferentdesignandperformancecharacteristics. Inanaturalhierarchy,allattributesparticipatingaslevelsinthehierarchyhavedirectorindirect attributerelationshipsfromthebottomofthehierarchytothetopofthehierarchy. Inanunnaturalhierarchy,thehierarchyconsistsofatleasttwoconsecutivelevelsthathavenoattribute relationships.Typicallythesehierarchiesareusedtocreatedrilldownpathsofcommonlyviewed attributesthatdonotfollowanynaturalhierarchy.Forexample,usersmaywanttoviewahierarchyof GenderandEducation.

Figure 33: Natural and unnatural hierarchies

11

Fromaperformanceperspective,naturalhierarchiesbehaveverydifferentlythanunnaturalhierarchies do.Innaturalhierarchies,thehierarchytreeismaterializedondiskinhierarchystores.Inaddition,all attributesparticipatinginnaturalhierarchiesareautomaticallyconsideredtobeaggregationcandidates. Unnaturalhierarchiesarenotmaterializedondisk,andtheattributesparticipatinginunnatural hierarchiesarenotautomaticallyconsideredasaggregationcandidates.Rather,theysimplyprovide userswitheasytousedrilldownpathsforcommonlyviewedattributesthatdonothavenatural relationships.Byassemblingtheseattributesintohierarchies,youcanalsouseavarietyofMDX navigationfunctionstoeasilyperformcalculationslikepercentofparent. Totakeadvantageofnaturalhierarchies,definecascadingattributerelationshipsforallattributesthat participateinthehierarchy.

2.1.1.8 Turning Off the Attribute HierarchyMemberpropertiesprovideadifferentmechanismtoexposedimensioninformation.Foragiven attribute,memberpropertiesareautomaticallycreatedforeverydirectattributerelationship.Forthe primarykeyattribute,thismeansthateveryattributethatisdirectlyrelatedtotheprimarykeyis availableasamemberpropertyoftheprimarykeyattribute. Ifyouonlywanttoaccessanattributeasmemberproperty,afteryouverifythatthecorrectrelationship isinplace,youcandisabletheattributeshierarchybysettingtheAttributeHierarchyEnabledproperty toFalse.Fromaprocessingperspective,disablingtheattributehierarchycanimproveperformanceand decreasecubesizebecausetheattributewillnolongerbeindexedoraggregated.Thiscanbeespecially usefulforhighcardinalityattributesthathaveaonetoonerelationshipwiththeprimarykey.High cardinalityattributessuchasphonenumbersandaddressestypicallydonotrequiresliceanddice analysis.Bydisablingthehierarchiesfortheseattributesandaccessingthemviamemberproperties, youcansaveprocessingtimeandreducecubesize. Decidingwhethertodisabletheattributeshierarchyrequiresthatyouconsiderboththequeryingand processingimpactsofusingmemberproperties.Memberpropertiescannotbeplacedonaqueryaxisin anMDXqueryinthesamemannerasattributehierarchiesanduserhierarchies.Toqueryamember property,youmustquerytheattributethatcontainsthatmemberproperty. Forexample,ifyourequiretheworkphonenumberforacustomer,youmustquerythepropertiesof customerandthenrequestthephonenumberproperty.Asaconvenience,mostfrontendtoolseasily displaymemberpropertiesintheiruserinterfaces. Ingeneral,filteringmeasuresusingmemberpropertiesisslowerthanfilteringusingattribute hierarchies,becausememberpropertiesarenotindexedanddonotparticipateinaggregations.The actualimpacttoqueryperformancedependsonhowyouusetheattribute. Forexample,ifyouruserswanttosliceanddicedatabybothaccountnumberandaccountdescription, fromaqueryingperspectiveyoumaybebetteroffhavingtheattributehierarchiesinplaceand removingthebitmapindexesifprocessingperformanceisanissue. 12

2.1.1.9 Reference DimensionsReferencedimensionsallowyoutobuildadimensionalmodelontopofasnowflakerelationaldesign. Whilethisisapowerfulfeature,youshouldunderstandtheimplicationsofusingit. Bydefault,areferencedimensionisnonmaterialized.Thismeansthatquerieshavetoperformthejoin betweenthereferenceandtheouterdimensiontableatquerytime.Also,filtersdefinedonattributesin theouterdimensiontablearenotdrivenintothemeasuregroupwhenthebitmapstherearescanned. Thismayresultinreadingtoomuchdatafromdisktoansweruserqueries.Leavingadimensionasnon materializedprioritizesmodelingflexibilityoverqueryperformance.Considercarefullywhetheryoucan affordthistradeoff:cubesaretypicallyintendedtobefastadhocstructures,andputtingthe performanceburdenontheenduserisrarelyagoodidea. AnalysisServiceshastheabilitytomaterializethereferencesdimension.Whenyouenablethisoption, memoryanddiskstructuresarecreatedthatmakethedimensionbehavejustlikeadenormalizedstar schema.Thismeansthatyouwillretainalltheperformancebenefitsofaregular,nonreference dimension.However,becarefulwithmaterializedreferencedimensionifyourunaprocessupdateon theintermediatedimension,anychangesintherelationshipsbetweentheouterdimensionandthe referencewillnotbereflectedinthecube.Instead,theoriginalrelationshipbetweentheouter dimensionandthemeasuregroupisretainedwhichismostlikelynotthedesiredresult.Inaway,you canconsiderthereferencetabletobearigidrelationshiptoattributesintheouterattributes.Theonly waytoreflectchangesinthereferencetableistofullyprocessthedimension.

2.1.1.10

Fast-Changing Attributes

Somedatamodelscontainattributesthatchangeveryfast.Dependingonwhichtypeofhistorytracking youneed,youmayfacedifferentchallenges. Type2FastChangingAttributesIfyoutrackeverychangetoafastchangingattribute,thismaycause thedimensioncontainingtheattributetogrowverylarge.Type2attributesaretypicallyaddedtoa dimensionwithaProcessAddcommand.Atsomepoint,runningProcessAddonalargedimensionand runningalltheconsistencycheckswilltakealongtime.Also,havingahugedimensionisunwieldy becauseuserswillhavetroublequeryingitandtheserverwillhavetroublekeepingitinmemory.A goodexampleofsuchamodelingchallengeistheageofacustomerthiswillchangeeveryyearand causethecustomerdimensiontogrowdramatically. Type1FastChangingAttributesEvenifyoudonottrackeverychangetotheattribute,youmaystill runintoissueswithfastchangingattributes.Toreflectachangeinthedatasourcetothecube,you havetorunProcessUpdateonthechangeddimension.Asthecubeanddimensiongrowslarger, runningProcessUpdatebecomesexpensive.Anexampleofsuchamodelingchallengeistotrackthe statusattributeofaserverinahostingenvironment(Running,Shutdown,Overloadedandsoon). Astatusattributelikethismaychangeseveraltimesperdayorevenperhour.Runningfrequent ProcessUpdatesonsuchadimensiontoreflectchangescanbeanexpensiveoperation,anditmaynot befeasiblewiththelockingimplementationofAnalysisServicesinaproductionenvironment. 13

Inthefollowingsections,wewilllookatsomemodelingoptionsyoucanusetoaddresstheseproblems. 2.1.1.10.1 Type 2 Fast-Changing Attributes Ifhistorytrackingisarequirementofafastchangingattribute,thebestoptionisoftentousethefact tabletotrackhistory.Thisisbestillustratedwithanexample.Consideragainthecustomerdimension withtheageattribute.ModelingtheAgeattributedirectlyinthecustomerdimensionproducesadesign likethis.

Figure 44: Age in customer dimension

NoticethateverytimeThomashasabirthday,anewrowisaddedinthedimensiontable.The alternativedesignapproachsplitsthecustomerdimensionintotwodimensionslikethis.

14

Figure 55: Age in its own dimension

Notethattherearesomerestrictionsonthesituationwherethisdesigncanbeapplied.Itworksbest whenthechangingattributetakesonasmall,distinctsetofvalues.Italsoaddscomplexitytothe design;byaddingmoredimensionstothemodel,itcreatesmoreworkfortheETLdeveloperswhenthe facttableisloaded.Also,considerthestorageimpactonthefacttable:Withthealternativedesign,the facttablebecomeswider,andmorebyteshavetobestoredperrow. 2.1.1.10.2 Type 1 Fast-Changing Attributes Yourbusinessrequirementmaybeupdatinganattributeofadimensionathighfrequency,daily,or evenhourly.Forasmallcube,runningProcessUpdatewillhelpyouaddressthisissue.Butasthecube growslarger,theruntimeofProcessUpdatecanbecometoolongforthebatchwindoworthereal timerequirementsofthecube(youcanreadmoreabouttuningprocessupdateintheprocessing section). Consideragaintheserverhostingexample:Youmaywanttotrackthestatus,whichchangesfrequently, ofallservers.Fortheexample,letussaythattheserverdimensionisusedbyafacttabletracking performancecounters.Assumeyouhavemodeledlikethis.

15

Figure 66: Status column in server dimension

TheproblemwiththismodelistheStatuscolumn.IftheFactCounterislargeandstatuschangesalot, ProcessUpdatewilltakeaverylongtimetorun.Tooptimize,considerthisdesigninstead.

Figure 77: Status column in its own dimension

IfyouimplementDimServerastheintermediatereferencetabletoDimServerStatus,AnalysisServices nolongerhastokeeptrackofthemetadataintheFactCounterwhenyourunProcessUpdateon DimServerStatus.Butasdescribedearlier,thismeansthatthejointoDimServerStatuswillhappenat runtime,increasingCPUcostandquerytimes.Italsomeansthatyoucannotindexattributesin DimServerbecausetheintermediatedimensionisnotmaterialized.Youhavetocarefullybalancethe tradeoffbetweenprocessingtimeandqueryspeeds.

16

2.1.1.11

Large Dimensions

InSQLServer2005,SQLServer2008,andSQLServer2008R2,AnalysisServiceshassomebuiltin limitationsthatlimitthesizeofthedimensionsyoucancreate.Firstofall,ittakestimetoupdatea dimensionthisisexpensivebecauseallindexesonfacttableshavetobeconsideredforinvalidation whenanattributechanges.Second,stringvaluesindimensionattributesarestoredonadiskstructure calledthestringstore.Thisstructurehasasizelimitationof4GB.Ifadimensioncontainsattributes wherethetotalsizeofthestringvalues(thisincludestranslations)exceeds4GB,youwillgetanerror duringprocessing.ThenextversionofSQLServerAnalysisServices,codenamedDenali,isexpectedto removethislimitation. Considerforamomentadimensionwithtensorevenhundredsofmillionsofmembers.Sucha dimensioncanbebuiltandaddedtoacube,evenonSQLServer2005,SQLServer2008,andSQLServer 2008R2.Butwhatdoessuchadimensionmeantoanadhocuser?Howwilltheusernavigateit?Which hierarchieswillgroupthemembersofthisdimensionintoreasonablesizesthatcanberenderedona screen?Whileitmaymakesenseforsomereportingpurposestosearchforindividualmembersinsuch adimension,itmaynotbetherightproblemtosolvewithacube. Whenyoubuildcubes,askyourself:isthisacubeproblem?Forexample,thinkofthistypicaltelco modelofcalldetailrecords.

Figure 88: Call detail records (CDRs)

Inthisparticularexample,thereare300millioncustomersinthedatamodel.Thereisnogoodwayto groupthesecustomersandallowadhocaccesstothecubeatreasonablespeeds.Evenifyoumanageto optimizethespaceusedtofitinthe4GBstringstore,howwouldusersbrowseacustomerdimension likethis? Ifyoufindyourselfinasituationwhereadimensionbecomestoolargeandunwieldy,considerbuilding thecubeontopofanaggregate.Forthetelcoexample,imagineatransformationlikethefollowing.

17

Figure 99: Cube built on aggregate

Usinganaggregatedfacttable,thisturnsa300millionrowdimensionprobleminto100,000row dimensionproblem.Youcanconsideraggregatingthefactstosavestoragetooalternatively,youcan addademographicskeydirectlytotheoriginalfacttable,processontopofthisdatasource,andrelyon MOLAPcompressiontoreducedatasizes.

2.1.2 Partitioning a CubePartitionsseparatemeasuregroupdataintophysicalstorageunits.Effectiveuseofpartitionscan enhancequeryperformance,improveprocessingperformance,andfacilitatedatamanagement.This sectionspecificallyaddresseshowyoucanusepartitionstoimprovequeryperformance.Youmustoften makeatradeoffbetweenqueryandprocessingperformanceinyourpartitioningstrategy. Youcanusemultiplepartitionstobreakupyourmeasuregroupintoseparatephysicalcomponents.The advantagesofpartitioningforimprovingqueryperformancearepartitioneliminationandaggregation design. PartitioneliminationPartitionsthatdonotcontaindatainthesubcubearenotqueriedatall,thus avoidingthecostofreadingtheindex(orscanningatableiftheserverisinROLAPmode).Whilereading apartitionindexandfindingnoavailablerowsisacheapoperation,asthenumberofconcurrentusers grows,thesereadsbegintoputastraininthethreadpool.Also,forqueriesthatdonothaveindexesto supportthem,AnalysisServiceswillhavetoscanallpotentiallymatchingpartitionsfordata. AggregationdesignEachpartitioncanhaveitsownorsharedaggregationdesign.Therefore,partitions queriedmoreoftenordifferentlycanhavetheirowndesigns.

18

Figure 1010: Intelligent querying by partitions

Figure10displaystheprofilertraceofqueryrequestingResellerSalesAmountbyBusinessTypefrom AdventureWorks.TheResellerSalesmeasuregroupoftheAdventureWorkscubecontainsfour partitions:oneforeachyear.Becausethequerysliceson2003,thestorageenginecangodirectlytothe 2003ResellerSalespartitionandignoreotherpartitions.

2.1.2.1 Partition SlicingPartitionsareboundtoasourcetable,view,orsourcequery.Whentheformulaenginerequestsa subcube,thestorageenginelooksatthemetadataofpartitionfortherelevantmeasuregroup.Each partitionmaycontainaslicedefinition,ahighleveldescriptionoftheminimumandmaximumattribute DataIDsthatexistinthatdimension.Ifitcanbedeterminedfromtheslicedefinitionthattherequested subcubedataisnotpresentinthepartition,thatpartitionisignored.Iftheslicedefinitionismissingorif theinformationinthesliceindicatesthatrequireddataispresent,thepartitionisaccessedbyfirst lookingattheindexes(ifany)andthenscanningthepartitionsegments. Thesliceofapartitioncanbesetintwoways: AutoslicewhenAnalysisServicesreadsthedataduringprocessing,itkeepstrackofthe minimumandmaximumattributeDataIDreads.Thesevaluesareusedtosettheslicewhenthe indexesarebuiltonthepartition. ManualslicerTherearecaseswhereautoslicewillnotworkthesearedescribedinthenext section.Forthosesituations,youcanmanuallysettheslice.Manualslicesaretheonlyavailable sliceoptionforROLAPpartitionsandproactivecachingpartitions.

2.1.2.1.1 Auto Slice DuringprocessingofMOLAPpartitions,AnalysisServicesinternallyidentifiestherangeofdatathatis containedineachpartitionbyusingtheMinandMaxDataIDsofeachattributetocalculatetherangeof datathatiscontainedinthepartition.Thedatarangeforeachattributeisthencombinedtocreatethe slicedefinitionforthepartition.

19

TheMinandMaxDataIDscanspecifyaeitherasinglememberorarangeofmembers.Forexample, partitioningbyyearresultsinthesameMinandMaxDataIDslicefortheyearattribute,andqueriestoa specificmomentintimeonlyresultinpartitionqueriestothatyearspartition. ItisimportanttorememberthatthepartitionsliceismaintainedasarangeofDataIDsthatyouhaveno explicitcontrolover.DataIDsareassignedduringdimensionprocessingasnewmembersare encountered.BecauseAnalysisServicesjustlooksattheminimumandmaximumvalueoftheDataID, youcanendupreadingpartitionsthatdontcontainrelevantdata. Forexample:ifyouhaveapartition,P2003_4,thatcontainsboth2003and2004data,youarenot guaranteedthattheminimumandmaximumDataIDintheslidecontainvaluesnexttoeachother(even thoughtheyearsareadjacent).Inourexample,letussaytheDataIDfor2003is42andtheDataIDfor 2004is45.BecauseyoucannotcontrolwhichDataIDgetsassignedtowhichmembers,youcouldbeina situationwheretheDataIDfor2005is44.Whenauserrequestsdatafor2005,AnalysisServiceslooksat thesliceforP2003_4,seesthatitcontainsdataintheinterval42to45andthereforeconcludesthatthis partitionhastobescannedtomakesureitdoesnotcontainthevaluesforDataID44(because44is between42and45). Becauseofthisbehavior,autoslicetypicallyworksbestifthedatacontainedinthepartitionmapstoa singleattributevalue.Whenthatisthecase,themaximumandminimumDataIDcontainedintheslice willbeequalandtheslicewillworkefficiently. Notethattheautosliceisnotdefinedandindexesarenotbuiltforpartitionswithfewerrowsthan IndexBuildThreshold(whichhasadefaultvalueof4096). 2.1.2.1.2 Manually Setting Slices NometadataisavailabletoAnalysisServicesaboutthecontentofROLAPandproactivecaching partitions.Becauseofthis,youmustmanuallyidentifythesliceinthepropertiesofthepartition.Itisa bestpracticetomanuallysetslicesinROLAPandproactivecachingpartitions. However,asshownintheprevioussection,therearecaseswhereautoslicewillnotgiveyouthe desiredpartitioneliminationbehavior.Inthesecasesyoucanbenefitfromdefiningthesliceyourselffor MOLAPpartitions.Forexample,ifyoupartitionbyyearwithsomepartitionscontainingarangeofyears, definingthesliceexplicitlyavoidstheproblemofoverlappingDataIDs.Thiscanonlybedonewith knowledgeofthedatawhichiswhereyoucanaddsomeoptimizationasaBIdeveloper. Itisgenerallynotabestpracticetocreatepartitionsbeforeyouarereadytofillthemwithdata.Butfor realtimecubes,itissometimesagoodideatocreatepartitionsinadvancetoavoidlockingissues. Whenyoutakethisapproach,itisalsoagoodideatosetamanualsliceonMOLAPpartitionstomake surethestorageenginedoesnotspendtimescanningemptypartitions.

20

2.1.2.2 Partition SizingFornondistinctcountmeasuregroups,testswithpartitionsizesintherangeof200MBtoupto3GB indicatethatpartitionsizealonedoesnothaveasubstantialimpactonqueryspeeds.Infact,wehave successfullydeployedgoodqueryperformanceonpartitionslargerthan3GB. Thefollowinggraphshowsfourdifferentqueryrunswithdifferentpartitionsizes(theverticalaxisis totalruntimeinhours).Performanceiscomparablebetweenpartitionsizesandisonlyaffectedbythe designofthesecurityfeaturesinthisparticularcustomercube.

Figure 1111: Throughput by partition size (higher is better)

Thepartitioningstrategyshouldbebasedonthesefactors: Increasingprocessingspeedandflexibility Increasingmanageabilityofbringinginnewdata Increasingqueryperformancefrompartitioneliminationasdescribedearlier Supportfordifferentaggregationdesigns

Asyouaddmorepartitions,themetadataoverheadofmanagingthecubegrowsexponentially.This affectsProcessUpdateandProcessAddoperationsondimensions,whichhavetotraversethemetadata dependenciestoupdatethecubewhendimensionschange.Asaruleofthumb,youshouldtherefore seektokeepthenumberofpartitionsinthecubeinthelowthousandswhileatthesametime balancingtherequirementsdiscussedhere. Forlargecubes,preferlargerpartitionsovercreatingtoomanypartitions.Thisalsomeansthatyoucan safelyignoretheAnalysisManagementObjects(AMO)warninginMicrosoftVisualStudiothatpartition sizesshouldnotexceed20millionrows.

2.1.2.3 Partition StrategyFromguidanceonpartitionsizing,wecandevelopsomecommondesignpatternsforpartition strategies.

21

2.1.2.3.1 Partition by Date Mostcubesarebuiltonatleastonecolumncontainingadate.Becausedataoftenarrivesinmonthly, weekly,daily,orevenhourlyslices,itmakessensetopartitionthecubeondate.Partitioningondate allowsyoutoreplaceafulldayincaseyouloadfaultydata.Itallowsyoutoselectivelyarchiveolddata bymovingthepartitiontocheapstorage.Andfinally,itallowsyoutoeasilygetridofdata,byremoving anentirepartition.Typically,adatepartitioningschemelookssomewhatlikethis.

Figure 1212: Partitioning by Date

Notethatinordertomovethepartitiontocheaperstorage,youwillhavetochangethedatalocation andreprocessesthepartition.Thisdesignworksverywellforsmalltomediumsizedcubes.Itis reasonablysimpletoimplementandthenumberofpartitionsiskeptlow.However,itdoessufferfroma fewdrawbacks: 1. Ifthegranularityofthepartitioningissmallenough(forexample,hourly),thenumberof partitionscanquicklybecomeunmanageable. 2. Assumingdataisaddedonlytothelatestpartition,partitionprocessingislimitedtooneTCP/IP connectionreadingfromthedatasource.Ifyouhavealotofdata,thiscanbeascalabilitylimit. Ad1)Ifyouhavealotofdatebasedpartitions,itisoftenagoodideatomergetheolderonesintolarge partitions.YoucandothiseitherbyusingtheAnalysisServicesmergefunctionalityorbydroppingthe oldpartitions,creatinganew,largerpartition,andthenreprocessingit.Reprocessingwilltypicallytake

22

longerthanmerging,butwehavefoundthatcompressionofthepartitioncanoftenincreaseifyou reprocess.Amodified,datepartitioningschememaylooklikethis.

Figure 1313: Modified Date Partitioning

Thisdesignaddressesthemetadataoverheadofhavingtoomanypartitions.Butitisstillbottlenecked bythemaximumspeedoftheProcessAddorProcessFullforthelatestpartition.Ifyourdatasourceis SQLServer,thespeedofasingledatabaseconnectioncanbehundredsofthousandsofrowsevery secondwhichworkswellformostscenarios.Butifthecuberequiresevenfasterprocessingspeeds, considermatrixpartitioning. 2.1.2.3.2 Matrix Partitioning Forlargecubes,itisoftenagoodideatoimplementamatrixpartitioningscheme:partitiononboth dateandsomeotherkey.Thedatepartitioningisusedtoselectivelydeleteormergeoldpartitionsas describedearlier.Theotherkeycanbeusedtoachieveparallelismduringpartitionprocessingandto restrictcertainuserstoasubsetofthepartitions.Forexample,consideraretailerthatoperatesinUS, Europe,andAsia.Youmightdecidetopartitionlikethis.

23

Figure 1414: Example of matrix partitioning

Iftheretailergrows,theymaychoosetosplittheregionpartitionsintosmallerpartitionstoincrease parallelismofloadfurtherandtolimittheworstcasescansthatausercanperform.Forcubesthatare expectedtogrowdramatically,itisagoodideatochooseapartitionkeythatgrowswiththebusiness andgivesyouoptionsforextendingthematrixpartitioningstrategyappropriately.Thefollowingtable containsexamplesofsuchpartitioningkeys. Industry Webretail Storeretail Datahosting 24 Examplepartitionkey Customerkey Storekey HostIDorracklocation Sourceofdataproliferation Addingcustomersandtransactions Addingnewstores Addinganewserver

Telecommunications Computerized manufacturing Investmentbanking Retailbanking Onlinegaming

SwitchID,countrycode,orarea Expandingintonewgeographical code regionsoraddingnewservices ProductionlineIDormachineID Addingproductionlinesor(for machines)sensors Stockexchangeorfinancial Addingnewfinancialinstruments, instrument products,ormarkets Creditcardnumberorcustomer Increasingcustomertransactions key Gamekeyorplayerkey Addingnewgamesorplayers

Ifyouimplementamatrixpartitioningscheme,youshouldpayspecialattentiontouserqueries.Queries touchingseveralpartitionsforeverysubcuberequest,suchasaquerythatasksforahighlevel aggregateofthepartitionbusinesskey,resultinahighthreadusageinthestorageengine.Becauseof this,werecommendthatyoupartitionthebusinesskeysothatsinglequeriestouchnomorethanthe numberofcoresavailableonthetargetserver.Forexample,ifyoupartitionbyStoreKeyandyouhave 1,000stores,queriestouchingtheaggregationofallstoreswillhavetotouch1,000partitions.Insucha design,itisagoodideatogroupthestoresintoanumberofbuckets(thatis,groupthestoresoneach partition,ratherthanhavingindividualpartitionsforeachstore).Forexample,ifyourunona16core server,youcangroupthestoreintobucketsofaround62storesforeachpartition(1,000storesdivided into16buckets). 2.1.2.3.3 Hash Partitioning Sometimesitisnotpossibletocomeupwithagooddistributionofbusinesskeysforpartitioningthe cube.Perhapsyoujustdonthaveagoodkeycandidatethatfitsthedescriptionintheprevioussection, orperhapsthedistributionofthekeyisunknownatdesigntime.Insuchcases,abruteforceapproach canbeused:Partitiononthehashvalueofakeythathasahighenoughcardinalityandwherethereis littleskew.Ifyouexpecteveryquerytotouchmanypartitions,itisimportantthatyoupayspecial attentiontotheCoordinatorQueryBalancingFactorandtheCoordinatorQueryMaxThreadsettings, whicharedescribedinPart2.

2.1.3 Relational Data Source DesignCubesaretypicallybuiltontopofrelationaldatasourcestoserveasdatamarts.Throughthedesign surface,AnalysisServicesallowsyoutocreatepowerfulabstractionsontopoftherelationalsource. Computedcolumnsandnamedqueriesareexamplesofthis.Thisallowsfastprototypingandalso enabledyoutocorrectpoorrelationaldesignwhenyouarenotincontroloftheunderlyingdatasource. ButtheAnalysisServicesdesignsurfaceisnopanaceaawelldesignedrelationaldatasourcecanmake queriesandprocessingofacubefaster.Inthissection,weexploresomeoftheoptionsthatyoushould considerwhendesigningarelationaldatasource.Afulltreatmentofrelationaldatawarehousingisout ofscopeforthisdocument,butwewillprovidereferenceswhereappropriate.

25

2.1.3.1 Use a Star Schema for Best PerformanceItiswidelydebatedwhatthemostefficientadreportmodelingtechniqueis:starschema,snowflake schema,orevenathirdtofifthnormalformordatavaultmodels(inorderoftheincreased normalization).Allareconsideredbywarehousedesignersascandidatesforreporting. NotethattheAnalysisServicesUnifiedDimensionalModel(UDM)isadimensionalmodel,withsome additionalfeatures(referencedimensions)thatsupportsnowflakesandmanytomanydimensions.No matterwhichmodelyouchooseastheenduserreportingmodel,performanceoftherelationalmodel boilsdowntoonesimplefact:joinsareexpensive!ThisisalsopartiallytruefortheAnalysisServices engineitself.Forexample:Ifasnowflakeisimplementedasanonmaterializedreferencedimension, userswillwaitlongerforqueries,becausethejoinisdoneatruntimeinsidetheAnalysisServices engine. Thelargestimpactofsnowflakesoccursduringprocessingofthepartitiondata.Forexample:Ifyou implementafacttableasajoinoftwobigtables(forexample,separatingorderlinesandorderheaders insteadofstoringthemasprejoinedvalues),processingoffactswilltakelonger,becausetherelational enginehastocomputethejoin. ItispossibletobuildanAnalysisServicescubeontopofahighlynormalizedmodel,butbepreparedto paythepriceofjoinswhenaccessingtherelationalmodel.Inmostcases,thatpriceispaidatprocessing time.InMOLAPdatamodels,materializedreferencedimensionshelpyoustoretheresultofthejoined tablesondiskandgiveyouhighspeedqueriesevenonnormalizeddata.However,ifyouarerunning ROLAPpartitions,querieswillpaythepriceofthejoinatquerytime,andyouruserresponsetimesor yourhardwarebudgetwillsufferifyouareunabletoresistnormalization.

2.1.3.2 Consider Moving Calculations to the Relational EngineSometimescalculationscanbemovedtotheRelationalEngineandbeprocessedassimpleaggregates withmuchbetterperformance.Thereisnosinglesolutionhere;butifyoureencounteringperformance issues,considerwhetherthecalculationcanberesolvedinthesourcedatabaseordatasourceview (DSV)andprepopulated,ratherthanevaluatedatquerytime. Forexample,insteadofwritingexpressionslikeSum(Customer.City.Members, cint(Customer.City.Currentmember.properties(Population))),considerdefiningaseparatemeasure groupontheCitytable,withasummeasureonthePopulationcolumn. Asasecondexample,youcancomputetheproductofrevenue*ProductsSoldattheleavesinthecube andaggregatewithcalculations.Butcomputingthisresultinthesourcedatabaseinsteadcanprovide superiorperformance.

2.1.3.3 Use ViewsItisgenerallyagoodideatobuildyourUDMontopofdatabaseviews.Amajoradvantageofviewsis thattheyprovideanabstractionlayerontopofthephysical,relationalmodel.Ifthecubeisbuiltontop ofviews,therelationaldatabasecan,tosomedegree,beremodeledwithoutbreakingthecube. 26

Considerarelationalsourcethathaschosentonormalizetwotablesyouneedtojointoobtainafact tableforexample,adatamodelthatsplitsasalesfactintoorderlinesandorders.Ifyouimplement thefacttableusingquerybinding,yourUDMwillcontainthefollowing.

Figure 1515: Using named queries in UDM

Inthismodel,theUDMnowhasadependencyonthestructureoftheLineItemsandOrderstables alongwiththejoinbetweenthem.IfyouinsteadimplementaSalesviewinthedatabase,youcanmodel likethis.

Figure 1616: Implementing UDM on top of views

27

ThismodelgivestherelationaldatabasethefreedomtooptimizethejoinedresultsofLineItemsand Order(forexamplebystoringitdenormalized),withoutanyimpactonthecube.Itwouldbetransparent forthecubedeveloperiftheDBAoftherelationaldatabaseimplementedthischange.

Figure 1717: Implementing UDM on top of pre-joined tables

Viewsprovideencapsulation,anditisgoodpracticetousethem.Iftherelationaldatamodelersinsist onnormalization,givethemachancetochangetheirmindsanddenormalizewithoutbreakingthecube model. Viewsalsoprovideeasyofdebugging.YoucanissueSQLqueriesdirectlyonviewstocomparethe relationaldatawiththecube.Hence,viewsaregoodwaytoimplementbusinesslogicthatcouldyou couldmimicwithquerybindingintheUDM.WhiletheUDMsyntaxissimilartotheSQLviewsyntax,you cannotissueSQLstatementsagainsttheUDM. 2.1.3.3.1 Query Binding Dimensions QuerybindingfordimensionsdoesnotexistinSQLServer2008AnalysisServices,butyoucan implementitbyusingaview(insteadoftables)foryourunderlyingdimensiondatasource.Thatway, youcanusehints,indexedviews,orotherrelationaldatabasetuningtechniquestooptimizetheSQL statementthataccessesthedimensiontablesthroughyourview.Thisalsoallowsyoutoturna snowflakedesignintherelationalsourceintoaUDMthatisapurestarschema. 2.1.3.3.2 Processing Through Views Dependingontherelationalsource,viewscanoftenprovidemeanstooptimizethebehaviorofthe relationaldatabase.Forexample,inSQLServeryoucanusetheNOLOCKhintintheviewdefinitionto removetheoverheadoflockingrowsastheviewisscanned,balancingthiswiththepossibilityofgetting dirtyreads.ViewscanalsobeusedtopreaggregatelargefacttablesusingaGROUPBYstatement;the relationaldatabasemodelercanevenchoosetomaterializeviewsthatusealotofhardwareresources. 28

2.1.4 Calculation ScriptsThecalculationscriptinthecubeallowsyoutoexpresscomplexfunctionalityofthecube,conferringthe abilitytodirectlymanipulatethemultidimensionalspace.Inafewlinesofcode,youcanelegantlybuild highlyvaluablebusinesslogic.Butconversely,ittakesonlyafewlinesofpoorlywrittencalculationcode tocreateabigperformanceimpactonusers.Ifyouplantodesignacubewithalargecalculationscript, wehighlyrecommendthatyoulearnthebasicsofwritinggoodMDXcodethelanguageusedfor calculations.Thereferencessectioncontainsresourcesthatwillgetyouofftoagoodstart. Thequerytuningsectioninthisbookprovideshighlevelguidanceontuningindividualqueries.Buteven atdesigntime,therearesomebestpracticesyoushouldapplytothecubethatavoidcommon performancemistakes.Thissectionprovidesyouwithsomebasicrules;thesearethebareminimum youshouldapplywhenbuildingthecubescript. References: MDXhasarichcommunityofcontributorsontheweb.Herearesomelinkstogetyoustarted: Pearson,Bill:StairwaytoMDX o http://www.sqlservercentral.com/stairway/72404/ Piasevoli,Tomislav:MDXwithMicrosoftSQLServer2008R2AnalysisServicesCookbook o http://www.packtpub.com/mdxwithmicrosoftsqlserver2008r2analysis services/book Russo,Marco:MDXBlog: o http://sqlblog.com/blogs/marco_russo/archive/tags/MDX/default.aspx Pasumansky,Mosha:Blog o http://sqlblog.com/blogs/mosha/ Piasevoli,Tomislav:Blog o http://tomislav.piasevoli.com Webb,Christopher:Blog o http://cwebbbi.wordpress.com/category/mdx/ Spofford,George,SivakumarHarinath,ChristopherWebb,DylanHaiHuang,andFrancesco Civardi,:MDXSolutions:WithMicrosoftSQLServerAnalysisServices2005andHyperionEssbase, ISBN:9780471748083

2.1.4.1 Use Attributes Instead of SetsWhenyouneedtorefertoafixedsubsetofdimensionmembersinacalculation,useanattribute insteadofaset.Attributesenableyoutotargetaggregationstothesubset.Attributesarealsoevaluated fasterthansetsbytheformulaengine.Usinganattributeforthispurposealsoallowsyoutochangethe setbyupdatingthedimensioninsteadofdeployinganewcalculationscripts. Example:Insteadofthis:

29

CREATESET[CurrentDay]ASTAIL([Date].[Calendar].members,1) CREATESET[PreviousDay]ASHEAD(TAIL(Date].[Calendar].members),2),1)

Dothis(assumingtodayis20110616): CalendarKeyAttribute 20110613 20110614 20110615 20110616 DayTypeAttribute (Flexiblerelationshiptokey) OldDates OldDates PreviousDay CurrentDay

ProcessUpdatethedimensionwhenthedaychanges.Userscannowrefertothecurrentdayby addressingtheDayTypeattributeinsteadoftheset.

2.1.4.2 Use SCOPE Instead of IIF When Addressing Cube SpaceSometimes,youwantacalculationtoonlyapplyforaspecificsubsetofcubespace.SCOPEisabetter choicethanIIFinthiscase.Hereisanexampleofwhatnottodo. CREATEMEMBERCurrentCube.[Measures].[SixMonthRollingAverage]AS IIF([Date].[Calendar].CurrentMember.Level Is[Date].[Calendar].[Month] ,Sum([Date].[Calendar].CurrentMember.Lag(5) :[Date].[Calendar].CurrentMember ,[Measures].[InternetSalesAmount])/6 ,NULL)

Instead,usetheAnalysisServicesSCOPEfunctionforthis. CREATEMEMBERCurrentCube.[Measures].[SixMonthRollingAverage] ASNULL,FORMAT_STRING="Currency",VISIBLE=1; SCOPE([Measures].[SixMonthRollingAverage],[Date].[Calendar].[Month].Members); THIS=Sum([Date].[Calendar].CurrentMember.Lag(5) :[Date].[Calendar].CurrentMember ,[Measures].[InternetSalesAmount])/6;

30

ENDSCOPE;

2.1.4.3 Avoid Mimicking Engine Features with ExpressionsSeveralnativefeaturescanbemimickedwithMDX: Unaryoperators Calculatedcolumnsinthedatasourceview(DSV) Measureexpressions Semiadditivemeasures

YoucanreproduceeachthesefeaturesinMDXscript(infact,sometimesyoumust,becausesomeare onlysupportedintheEnterpriseSKU),butdoingsooftenhurtsperformance. Forexample,usingdistributiveunaryoperators(thatis,thosewhosememberorderdoesnotmatter, suchas+,,and~)isgenerallytwiceasfastastryingtomimictheircapabilitieswithassignments. Therearerareexceptions.Forexample,youmightbeabletoimproveperformanceofnondistributive unaryoperators(thoseinvolving*,/,ornumericvalues)withMDX.Furthermore,youmayknowsome specialcharacteristicofyourdatathatallowsyoutotakeashortcutthatimprovesperformance.Such optimizationsrequireexpertleveltuningandingeneral,youcanrelyontheAnalysisServicesengine featurestodothebestjob. Measureexpressionsalsoprovideauniquechallenge,becausetheydisabletheuseofaggregates(data hastoberolledupfromtheleaflevel).Onewaytoworkaroundthisistouseahiddenmeasurethat containspreaggregatedvaluesintherelationalsource.Youcanthentargetthehiddenmeasuretothe aggregatevalueswithaSCOPEstatementinthecalculationscript.

2.1.4.4 Comparing Objects and ValuesWhendeterminingwhetherthecurrentmemberortupleisaspecificobject,useIS.Forexample,the followingqueryisnotonlynonperformant,butincorrect.Itforcesunnecessarycellevaluationand comparesvaluesinsteadofmembers. [Customer].[CustomerGeography].[Country].&[Australia]=[Customer].[Customer Geography].currentmember

Furthermore,dontperformextrastepswhendeducingwhetherCurrentMemberisaparticular memberbyinvolvingIntersectandCounting.

31

intersect({[Customer].[CustomerGeography].[Country].&[Australia]}, [Customer].[CustomerGeography].currentmember).count>0

UseISinstead. [Customer].[CustomerGeography].[Country].&[Australia]is[Customer].[Customer Geography].currentmember

2.1.4.5 Evaluating Set MembershipDeterminingwhetheramemberortupleisinasetisbestaccomplishedwithIntersect.TheRank functiondoestheadditionaloperationofdeterminingwhereinthesetthatobjectlies.Ifyoudontneed it,dontuseit.Forexample,thefollowingstatementmaydomoreworkthanyouneedittodo. rank([Customer].[CustomerGeography].[Country].&[Australia], )>0

ThisstatementusesIntersecttodeterminewhetherthespecifiedinformationisintheset. intersect({[Customer].[CustomerGeography].[Country].&[Australia]},).count>0

2.2 Testing Analysis Services CubesAsyouprepareforuseracceptanceandpreproductiontestingofacube,youshouldfirstconsiderwhat acubeisandwhatthatmeansforuserqueries.Dependingonyourbackgroundandroleinthe developmentanddeploymentcycle,therearedifferentwaystolookatthis. Asadatabaseadministrator,youcanthinkofacubeasadatabasethatcanacceptanyqueryfrom users,andwheretheresponsetimefromanysuchqueryisexpectedtobereasonableatermthatis oftenvaguelydefined.Inmanycases,youcanoptimizeresponsetimeforspecificqueriesusing aggregates(whichforrelationalDBAsissimilartoindextuning),andtestingshouldgiveyouanearly ideaofgoodaggregationcandidates.Butevenwithaggregates,youmustalsoconsidertheworstcase: Youshouldexpecttoseeleaflevelscanqueries.Suchqueries,whichcanbeeasilyexpressedbytools likeExcel,mayenduptouchingeverypieceofdatathecube.Dependingonyourdesign,thiscanbea 32

significantamountofdata.Youshouldconsiderwhatyouwanttodowithsuchqueries.Thereare multipleoptions:Forexample,youmaychoosetoscaleyourhardwaretohandletheminadecent responsetime,oryoumaysimplychoosetocancelthem.Ineithercase,asyoupreparefortesting, makesuresuchqueriesarepartthetestsuiteandthatyouobservewhathappenstotheAnalysis Servicesinstancewhentheyrun.Youshouldalsounderstandwhatareasonableresponsetimeisfor yourenduserandmakethatpartofthetestsuite. AsaBIdeveloper,youcanlookatthecubeasyourdescriptionofthemultidimensionalspaceinwhich queriescanbeexpressed.Apartofthisspacewillbeinstantiatedwithdatastructuresondisk supportingit:dimensions,measuregroups,andtheirpartitions.However,someofthis multidimensionalspacewillbeservedbycalculationsoradhocmemorystructures,forexample:MDX calculations,manytomanydimensions,andcustomrollups.Wheneverqueriesarenotserveddirectly byinstantiateddata,thereisapotential,querytimecalculationpricetobepaid,thismayshowupas badresponsetime.Asthecubedeveloper,youshouldmakesurethatthetestingcoversthesecases. Hence,astheBIdeveloper,youshouldmakesurethetestqueriesalsostressnoninstantiateddata.This isavaluableexercise,becauseyoucanuseittomeasuretheimpactonusersofcomplexcalculations andthenadjustthedatamodelaccordingly. Yourapproachtotestingwilldependonwhichsituationyoufindyourselfin.Ifyouaredevelopinganew system,youcanworkdirectlytowardsthetestgoalsdrivenbybusinessrequirements.However,ifyou aretryingtoimproveanexistingsystem,itisanadvantagetoacquireatestbaselinefirst.

2.2.1 Testing GoalsBeforeyoudesignatestharness,youshoulddecidewhatyourtestinggoalswillbe.Testingnotonly allowsyoutofindfunctionalbugsinthesystem,italsohelpsquantifythescalabilityandpotential bottlenecksthatmaybehardtodiagnoseandfixinabusyproductionenvironment.Ifyoudonotknow whatyourscalebarrierisorwheretheysystemmightbreak,itbecomeshardtoactwithconfidence whenyoutunethefinalproductionsystem. ConsiderwhatcharacteristicsarereasonabletoexpectfromtheBIsystemsanddocumentthese expectationsaspartofyourtestplan.Thedefinitionofreasonabledependstoalargeextentonyour familiaritywithsimilarsystems,theskillsofyourcubedesigners,andthehardwareyourunon.Itis oftenagoodideatogetasecondopiniononwhattestvaluesarereachableforexamplefroma neutralthirdpartythathasexperiencewithsimilarsystems.Thishelpsyousetexpectationsproperly withbothdevelopersandbusinessusers. BIsystemsvaryinthecharacteristicsorganizationsrequireofthemnoteveryoneneedsscalabilityto thousandsofusers,tensofterabytes,nearzerodowntime,andguaranteedsubsecondresponsetime. Whileallthesegoalscanbeachievedformostcases,itisnotalwayscheaptoacquiretheskillsrequired todesignasystemtosupportthem.Considerwhatyoursystemneedstodoforyourorganization,and avoidoverdesigningintheareaswhereyoudontneedthehighestrequirements.Forexample:youmay decidethatyouneedveryfastresponsetimes,butthatyoualsowantaverylowcostserverthatcan runinasharedstorageenvironment.Forsuchascenario,youmaywanttoreducethedatainthecube 33

toasizethatwillfitinmemory,eliminatingtheneedforthemajorityofI/Ooperationsandproviding fastscantimesevenforpoorlyfilteredqueries. Hereisatableofpotentialtestgoalsyoushouldconsider.Iftheyarerelevantforyourorganizations requirements,youshouldtailorthemtoreflectthoserequirements. TestGoal Scalability Examplegoal Mustsupport10,000 concurrentlyconnectedusers,of which1,000runqueries simultaneously. Simplequeriesreturninga Howfastshouldqueriesreturntotheclient? singleproductgroupforagiven Thismayrequireyoutoclassifyqueriesinto yearshouldreturninlessthan1 differentcomplexities. secondevenatfulluser Notallqueriescanbeansweredquicklyandit concurrency. willoftenbewisetoconsultanexpertcube designertoliaisewithuserstounderstandwhat Queriesthattouchnomore querypatternscanbeexpectedandwhatthe than20%ofthefactrowsshould runinlessthan30seconds. complexityofansweringthesequerieswillbe. Mostotherqueriestouchinga smallpartofthecubeshould Anotherwaytolookatthistestgoalisto returninaround10seconds. measurethethroughputinqueriesanswered Withourworkload,weexpect persecondinamixedworkload. throughputtobearound50 queriesreturnedpersecond. Userqueriesrequestingthe endofmonthcurrencyrate conversionshouldreturninno morethan20seconds.Queries thatdonotrequirecurrency conversionshouldreturninless than5seconds. Whatisthegranularityofeachdimension Thelargestcustomerdimension attribute?Howmuchdatawilleachmeasure willcontain30millionrowsand groupcontain? have10attributesandtwouser hierarchies.Thelargestnonkey Notethatthecubedesignerswilloftenhave attributewillhave1million beenconsideringthisandmayalreadyknowthe members. answer. Thelargestmeasuregroupis sales,with1billionrows.The secondlargestispurchases,with 100millionrows.Allother measuregroupsaretrivialin size. Description Howmanyconcurrentusersshouldbe supportedbythesystem?

Performance/ throughput

DataSizes

34

TargetServer Platforms

Whichservermodeldoyouwanttorunon?Itis oftenagoodideatotestonboththatserver andanevenbiggerserverclass.Thisenables youtoquantifythebenefitsofupgrading.

TargetI/O system

WhichI/Osystemdoyouwanttouse?What characteristicswillthatsystemhave?

Targetnetwork infrastructure

Whichnetworkconnectivitywillbeavailable betweenusersandAnalysisServices,and betweenAnalysisServicesandthedatasources? Notethatyoumayhavetosimulatethese networkconditionsinalab.

Processing Speeds

Howfastshouldrowsbebroughtintothecube andhowoften?

Mustrunon2socket6core Nehalemmachinewith32GBof RAM. Mustbeabletoscaleto4 socketNehalem8coremachine with256GBofRAM. MustrunoncorporateSANand usenomorethan1,000random IOPSat32,000blocksizesat6ms latency. WillrunondedicatedNAND devicesthatsupport80,000IOPS at100slatency. Intheworstcasescenario, userswillconnectovera100ms latencyWANlinkwitha maximumbandwidthof 10Mbit/sec. Therewillbea10Gbit dedicatednetworkavailable betweenthedatasourceandthe cube. Dimensionsshouldbefully processedeverynightwithin30 minutes. Twotimesduringtheday, 100,000,000rowsshouldbe addedtothesalesmeasure group.Thisshouldtakeno longerthan15minutes.

2.2.2 Test ScenariosBasedontheconsiderationsfromtheprevioussectionyoushouldbeabletocreateauserworkload thatrepresentstypicaluserbehaviorandthatenablesyoutomeasurewhetheryouaremeetingyour testinggoals. Typicaluserbehaviorandwellwrittenqueriesareunfortunatelynottheonlyqueriesyouwillreceivein mostsystems.Aspartofthetestphase,youshouldalsotrytoflushoutpotentialproductionissues beforetheyarise.Werecommendthatyoumakesureyourtestworkloadcontainsthefollowingtypes ofqueriesandteststhemthoroughly:

35

Queriesthattouchseveraldimensionsatthesametime EnoughqueriestotesteveryMDXexpressioninthecalculationscript Queriesthatexercisemanytomanydimensions Queriesthatexercisecustomrollups Queriesthatexerciseparent/childdimensions Queriesthatexercisedistinctcountmeasuregroups Queriesthatcrossjoinattributesfromdimensionsthatcontainmorethan100,000members Queriesthattoucheverysinglepartitioninthedatabase Queriesthattouchalargesubsetofpartitionsinthedatabase(forexample,currentyear) Queriesthatreturnalotofdatatotheclient(forexample,morethan100,000rows) Queriesthatusecubesecurityversusqueriesthatdonotuseit Queriesexecutingconcurrentlywithprocessingoperationsifthisispartofyourdesign

Youshouldtestonthefulldatasetfortheproductioncube.Ifyoudont,theresultswillnotbe representativeofreallifeoperations.Thisisespeciallytrueforcubesthatarelargerthanthememory onthemachinetheywilleventuallyrunon. Ofcourse,youshouldstillmakesurethatyouhaveplentyofqueriesinthetestscenariosthatrepresent typicaluserbehaviorsrunningonaworkloadthatonlyshowcasestheslowestperformingpartsofthe cubewillnotrepresentarealproductionenvironment(unlessofcourse,theentirecubeispoorly designed). Asyourunthetests,youwilldiscoverthatcertainqueriesaremoredisruptivethanothers.Onegoalof testingistodiscoverwhatsuchquerieslooklike,sothatyoucaneitherscalethesystemtodealwith themorprovideguidanceforuserssothattheycanavoidexercisingthecubeinthiswayifpossible. Partofyourtestscenariosshouldalsoaimtoobservethecubesbehaviorasuserconcurrencygrows. YoushouldworkwithBIdevelopersandbusinessuserstounderstandwhattheworstcasescenariofor userconcurrencyis.Testingatthatconcurrencywillshakeoutpoorlyscalabledesignsandhelpyou configurethecubeandhardwareforbestperformanceandstability.

2.2.3 Load GenerationItishardtocreatealoadthatactuallylooksliketheexpectedproductionloaditrequiressignificant experienceandcommunicationwithenduserstocomeupwithafullyrepresentativesetofqueries.But afteryouhaveasetofqueriesthatmatchuserbehavior,youcanfeedthemintoatestharness.Analysis Servicesdoesnotshipwithatestharnessoutofthebox,butthereareseveralsolutionsavailablethat helpyougetstarted: ascmdYoucanusethiscommandlinetooltorunasetofqueriesagainstAnalysisServices.Itships withtheAnalysisServicessamplesandismaintainedonCodePlex. VisualStudioYoucanconfigureMicrosoftVisualStudiotogenerateloadagainstAnalysisServices,and youcanalsouseVisualStudiotovisuallyanalyzethatload.

36

ThirdpartytoolsYoucanusetoolssuchasHPLoadRunnertogeneratehighconcurrencyload.Note thatAnalysisServicesalsosupportsanHTTPbasedinterface,whichmeansitmaybepossibletouse webstresstoolstogenerateload. Rollyourown:Wehaveseencustomerswritetheirowntestharnessesusing.NETandtheADOMD.NET interfacetoAnalysisServices.Usingthe.NETthreadinglibraries,itispossibletogeneratealotofuser loadfromasingleloadclient. Nomatterwhichloadtoolyouuse,youshouldmakesureyoucollecttheruntimeofallqueriesandthe PerformanceMonitorcountersforallruns.Thisdataenablesyoutomeasuretheeffectofanychanges youmakeduringyourtestruns.Whenyougenerateuserworkloadtherearealsosomeotherfactorsto consider. Firstofall,youshouldtestbothasequentialrunandaparallelrunofqueries.Thesequentialrungives youthebestpossibleruntimeofthequerywhilenootherusersareonthesystem.Theparallelrun enablesyoutoshakeoutissueswiththecubethataretheresultofmanyusersrunningconcurrently. Second,youshouldmakesurethetestscenarioscontainasufficientnumberofqueriessothatyouwill beabletorunthetestscenarioforsometime.Tostresstheserverproperly,andtoavoidqueryingthe samehotspotvaluesoverandoveragain,queriesshouldtouchavarietyofdatainthecube.Ifallyour queriestouchasmallsetofdimensionvalues,itwillhardlyrepresentarealproductionrun.Onewayto spreadqueriesoverawidersetofcellsinthecubeistousequerytemplates.Eachtemplatecanbeused togenerateasetofqueriesthatareallvariantsofthesamegeneraluserbehavior. Third,yourtestharnessshouldbeabletocreatereproducibletests.Ifyouareusingcodethatgenerates manyqueriesfromasmallsetoftemplates,makesurethatitgeneratesthesamequeriesonevery testrun.Ifnot,youintroduceanelementofrandomnessinthetestthatmakesithardtocompare differentruns. References: Ascmd.exeonMSDNhttp://msdn.microsoft.com/en us/library/ms365187%28v=sql.100%29.aspx AnalysisServicesCommunitysampleshttp://sqlsrvanalysissrvcs.codeplex.com/ o Describeshowtouseascmdforloadgeneration o ContainsVisualStudiosamplecodethatachievesasimilareffect HPLoadRunner https://h10078.www1.hp.com/cda/hpms/display/main/hpms_content.jsp?zn=bto&cp=111 12617^8_4000_100

2.2.4 Clearing CachesTomaketestrunsreproducible,itisimportantthateachrunstartwiththeserverinthesamestateas thepreviousrun.Todothis,youmustclearoutanycachescreatedbyearlierruns.

37

TherearethreecachesinAnalysisServicesthatyoushouldbeawareof: Theformulaenginecache Thestorageenginecache Thefilesystemcache

Clearingformulaengineandstorageenginecaches:ThefirsttwocachescanbeclearedwiththeXMLA ClearCachecommand.Thiscommandcanbeexecutedusingtheascmdcommandlineutility:

Clearingfilesystemcaches:Thefilesystemcacheisabithardertogetridofbecauseitresidesinside Windowsitself. IfyouhavecreatedaseparateWindowsvolumeforthecubedatabase,youcandismountthevolume itselfusingthefollowingcommand: fsutil.exevolumedismount Thisclearsthefilesystemcacheforthisdriveletterormountpoint.Ifthecubedatabaseresidesonlyon thislocation,runningthiscommandresultsinacleanfilesystemcache. Alternatively,youcanusetheutilityRAMMapfromsysinternals.Thisutilitynotonlyallowsyoutoread thefilesystemcachecontent,italsoallowsyoutopurgeit.Ontheemptymenu,clickEmptySystem WorkingSet,andthenclickEmptyStandbyList.Thisclearsthefilesystemcachefortheentiresystem. NotethatwhenRAMMapstartsup,ittemporarilyfreezesthesystemwhileitreadsthememorycontent thiscantakesometimeonalargemachine.Hence,RAMMapshouldbeusedwithcare. ThereiscurrentlyaCodePlexprojectcalledASStoredProceduresfoundat: http://asstoredprocedures.codeplex.com/wikipage?title=FileSystemCache.Thisprojectcontainscode forautilitythatenablesyoutoclearthefilesystemcacheusingastoredprocedurethatyoucanrun directlyonAnalysisServices. NotethatneitherFSUTILnorRAMMapshouldbeusedinproductioncubesbothcausedisruptionto usersconnectedtothecube.AlsonotethatneitherRAMMaporASStoredProceduresissupportedby Microsoft.

2.2.5 Actions During Testing for Operations ManagementWhileyourtestteamiscreatingandrunningthetestharness,youroperationsteamcanalsotakesteps topreparefordeployment.

38

Testyourdatacollectionsetup:Testrunsgiveyouauniquechancetotryoutyourdatacollection proceduresbeforeyougointoproduction.Thedatacollectionyouperformcanalsobeusedtodrive earlyfeedbacktothedevelopmentteam. Understandserverutilization:Whileyoutest,youcangetanearlyinsightintoserverutilization.You willbeabletomeasurethememoryusageofthecubeandthewaythenumberofusersmapstoI/O loadandCPUutilization.Ifthecubeislargerthanmemory,youcanalsomeasuretheeffectof concurrencyandleaflevelscanontheI/Osubsystem.Remembertomeasuretheworstcaseexamples describedearliertounderstandwhattheimpactonthesystemis. Earlythreadtuning:Duringtesting,youcandiscoverthreadingbottlenecks,asdescribedinPart1.This enablesyoutogointoproductionwithpretunedsettingsthatimproveuserexperience,scalability,and hardwareutilizationofthesolution.

2.2.6 Actions After TestingWhentestingiscomplete,youhavereportsthatdescribetheruntimeofeachquery,andyoualsohave agreaterunderstandingoftheserverutilization.Thisisagoodtimetoreviewthedesignofthecube withtheBIdevelopersusingthenumbersyoucollectedduringtesting.Often,someeasywinscanbe harvestedatthispoint. Whenyouputacubeintoproduction,itisimportanttounderstandthelongtermeffectsofusers buildingspreadsheetsandreportsreferencingit.Considerthedependenciesthataregeneratedasthe cubeissuccessfullydeployedinadhocdatastructuresacrosstheorganization.Thedatamodelcreated andexposedbythecubewillbelinkedintospreadsheetsandreports,anditbecomeshardtomakedata modelchangeswithoutdisturbingusers.Fromanoperationalperspective,preproductiontestingis typicallyyourlastchancetorequestcheapdatamodelchangesfromyourBIdevelopersbeforebusiness usersinevitablylockthemselvesintothedatastructuresthatunlocktheirdata.Noticethatthisis differentfromtypical,staticreportinganddevelopmentcycles.Withstaticreports,theBIdevelopers areincontrolofthedependencies;ifyougiveusersExcelorotheradhocaccesstocubes,thatcontrolis lost.Explorativedatapowercomesataprice.

2.3 Tuning Query PerformanceToimprovequeryperformance,youshouldunderstandthecurrentsituation,diagnosethebottleneck, andthenapplyoneofseveraltechniquesincludingoptimizingdimensiondesign,designingandbuilding aggregations,partitioning,andapplyingbestpractices.Theseshouldbethefirststopsforoptimization, beforediggingintoqueriesingeneral. Muchtimecanbeexpendedpursuingdeadendsitisimportanttofirstunderstandthenatureofthe problembeforeapplyingspecifictechniques.Togainthisunderstanding,itisoftenusefultohavea mentalmodelofhowthequeryengineworks.Wewillthereforestartwithabriefintroductiontothe AnalysisServicesqueryprocessor.

39

2.3.1 Query Processor ArchitectureTomakethequeryingexperienceasfastaspossibleforendusers,theAnalysisServicesquerying architectureprovidesseveralcomponentsthatworktogethertoefficientlyretrieveandevaluatedata. Thefollowingfigureidentifiesthethreemajoroperationsthatoccurduringqueryingsession management,MDXqueryexecution,anddataretrievalaswellastheservercomponentsthat participateineachoperation.

ClientApp(MDX)

SessionManagement

XMLA Listener Session Manager SecurityManager

QueryProcessing

QueryProcessor

QueryProcCache

DataRetrieval

Storage Engine

SECache

DimensionData AttributeStore HierarchyStore

MeasureGroupData FactData Aggregations

Figure 1818: Analysis Services query processor architecture

2.3.1.1 Session ManagementClientapplicationscommunicatewithAnalysisServicesusingXMLforAnalysis(XMLA)overTCP/IPor HTTP.AnalysisServicesprovidesanXMLAlistenercomponentthathandlesallXMLAcommunications betweenAnalysisServicesanditsclients.TheAnalysisServicesSessionManagercontrolshowclients connecttoanAnalysisServicesinstance.UsersauthenticatedbytheWindowsoperatingsystemand whohaveaccesstoatleastonedatabasecanconnecttoAnalysisServices.Afterauserconnectsto 40

AnalysisServices,theSecurityManagerdeterminesuserpermissionsbasedonthecombinationof AnalysisServicesrolesthatapplytotheuser.Dependingontheclientapplicationarchitectureandthe securityprivilegesoftheconnection,theclientcreatesasessionwhentheapplicationstarts,andthenit reusesthesessionforalloftheusersrequests.Thesessionprovidesthecontextunderwhichclient queriesareexecutedbythequeryprocessor.Asessionexistsuntilitisclosedbytheclientapplicationor theserver.

2.3.1.2 Query ProcessingThequeryprocessorexecutesMDXqueriesandgeneratesacellsetorrowsetinreturn.Thissection providesanoverviewofhowthequeryprocessorexecutesqueries.Formoreinformationabout optimizingMDX,seeOptimizingMDX. Toretrievethedatarequestedbyaquery,thequeryprocessorbuildsanexecutionplantogeneratethe requestedresultsfromthecubedataandcalculations.Therearetwomajordifferenttypesofquery executionplans:cellbycell(nave)evaluationorblockmode(subspace)computation.Whichoneis chosenbytheenginecanhaveasignificantimpactonperformance.Formoreinformation,seeSubspace Computation. Tocommunicatewiththestorageengine,thequeryprocessorusestheexecutionplantotranslatethe datarequestintooneormoresubcuberequeststhatthestorageenginecanunderstand.Asubcubeisa logicalunitofquerying,caching,anddataretrievalitisasubsetofcubedatadefinedbythecrossjoin ofoneormoremembersfromasinglelevelofeachattributehierarchy.AnMDXquerycanberesolved intomultiplesubcuberequests,dependingtheattributegranularitiesinvolvedandcalculation complexity;forexample,aqueryinvolvingeverymemberoftheCountryattributehierarchy(assuming itsnotaparentchildhierarchy)wouldbesplitintotwosubcuberequests:onefortheAllmemberand anotherforthecountries. Asthequeryprocessorevaluatescells,itusesthequeryprocessorcachetostorecalculationresults.The primarybenefitsofthecachearetooptimizetheevaluationofcalculationsandtosupportthereuseof calculationresultsacrossusers(withthesamesecurityroles).Tooptimizecachereuse,thequery processormanagesthreecachelayersthatdeterminethelevelofcachereusability:global,session,and query. 2.3.1.2.1 Query Processor Cache DuringtheexecutionofanMDXquery,thequeryprocessorstorescalculationresultsinthequery processorcache.Theprimarybenefitsofthecachearetooptimizetheevaluationofcalculationsandto supportreuseofcalculationresultsacrossusers.Tounderstandhowthequeryprocessorusescaching duringqueryexecution,considerthefollowingexample:YouhaveacalculatedmembercalledProfit Margin.WhenanMDXqueryrequestsProfitMarginbySalesTerritory,thequeryprocessorcachesthe nonnullProfitMarginvaluesforeachSalesTerritory.Tomanagethereuseofthecachedresultsacross users,thequeryprocessordistinguishesdifferentcontextsinthecache:

41

QueryContextcontainstheresultofcalculationscreatedbyusingtheWITHkeywordwithina query.Thequerycontextiscreatedondemandandterminateswhenthequeryisover. Therefore,thecacheofthequerycontextisnotsharedacrossqueriesinasession. SessionContextcontainstheresultofcalculationscreatedbyusingtheCREATEstatement withinagivensession.Thecacheofthesessioncontextisreusedfromrequesttorequestinthe samesession,butitisnotsharedacrosssessions. GlobalContextcontainstheresultofcalculationsthataresharedamongusers.Thecacheof theglobalcontextcanbesharedacrosssessionsifthesessionssharethesamesecurityroles.

Thecontextsaretieredintermsoftheirlevelofreuse.Atthetop,thequerycontextiscanbereused onlywithinthequery.Atthebottom,theglobalcontexthasthegreatestpotentialforreuseacross multiplesessionsandusersbecausethesessioncontextwillderivefromtheglobalcontextandthe querycontextwillderiveitselffromthesessioncontext.

Figure 1919: Cache context layers

Duringexecution,everyMDXquerymustreferenceallthreecontextstoidentifyallofthepotential calculationsandsecurityconditionsthatcanimpacttheevaluationofthequery.Forexample,toresolve aquerythatcontainsaquerycalculatedmember,thequeryprocessorcreatesaquerycontextto resolvethequerycalculatedmember,createsasessioncontexttoevaluatesessioncalculations,and createsaglobalcontexttoevaluatetheMDXscriptandretrievethesecuritypermissionsoftheuser whosubmittedthequery.Notethatthesecontextsarecreatedonlyiftheyarentalreadybuilt.After theyarebuilt,theyarereusedwherepossible. Eventhoughaqueryreferencesallthreecontexts,itwilltypicallyusethecacheofasinglecontext.This meansthatonaperquerybasis,thequeryprocessormustselectwhichcachetouse.Thequery processoralwaysattemptstousethebroadlyapplicablecachedependingonwhetherornotitdetects thepresenceofcalculationsatanarrowercontext. Ifthequeryprocessorencounterscalculationscreatedatquerytime,italwaysusesthequerycontext, evenifaqueryalsoreferencescalculationsfromtheglobalcontext(thereisanexceptiontothis 42

querieswithquerycalculatedmembersoftheformAggregate()dosharethesessioncache).If therearenoquerycalculations,buttherearesessioncalculations,thequeryprocessorusesthesession cache.Thequeryprocessorselectsthecachebasedonthepresenceofanycalculationinthescope.This behaviorisespeciallyrelevanttouserswithMDXgeneratingfrontendtools.Ifthefrontendtool createsanysessioncalculationsorquerycalculations,theglobalcacheisnotused,evenifyoudonot specificallyusethesessionorquerycalculations. Thereareothercalculationscenariosthatimpacthowthequeryprocessorcachescalculations.When youcallastoredprocedurefromanMDXcalculation,theenginealwaysusesthequerycache.Thisis becausestoredproceduresarenondeterministic(meaningthatthereisnoguaranteewhatthestored procedurewillreturn).Asaresult,afteranondeterministiccalculationisencounteredduringthequery, nothingiscachedgloballyorinthesessioncache.Instead,theremainingcalculationsarestoredinthe querycache.Inaddition,thefollowingscenariosdeterminehowthequeryprocessorcachescalculation results: TheuseofMDXfunctionsthatarelocaledependent(suchasCaptionor.Properties)prevents theuseoftheglobalcache,becausedifferentsessionsmaybeconnectedwithdifferentlocales andcachedresultsforonelocalemaynotbecorrectforanotherlocale. Theuseofcellsecurity;functionssuchasUserName,StrToSet,StrToMember,andStrToTuple; orLookupCubefunctionsintheMDXscriptorinthedimensionorcellsecuritydefinitiondisable theglobalcache.Thatis,justoneexpressionthatusesanyofthesefunctionsorfeatures disablesglobalcachingfortheentirecube. IfvisualtotalsareenabledforthesessionbysettingthedefaultMDXVisualModepropertyin theAnalysisServicesconnectionstringto1,thequeryprocessorusesthequerycacheforall queriesissuedinthatsession. IfyouenablevisualtotalsforaquerybyusingtheMDXVisualTotalsfunction,thequery processorusesthequerycache. Queriesthatusethesubselectsyntax(SELECTFROMSELECT)orarebasedonasessionsubcube (CREATESUBCUBE)resultinthequeryor,respectively,sessioncachetobeused. Arbitraryshapescanonlyusethequerycacheiftheyareusedinasubselect,intheWHERE clause,orinacalculatedmember.Anarbitraryshapeisanysetthatcannotbeexpressedasa crossjoinofmembersfromthesamelevelofanattributehierarchy.Forexample,{(Food,USA), (Drink,Canada)}isanarbitraryset,asis{customer.geography.USA,customer.geography.[British Columbia]}.Notethatanarbitraryshapeonthequeryaxisdoesnotlimittheuseofanycache.

Basedonthisbehavior,whenyourqueryingworkloadcanbenefitfromreusingdataacrossusers,itisa goodpracticetodefinecalculationsintheglobalscope.Anexampleofthisscenarioisastructured reportingworkloadwhereyouhavefewsecurityroles.Bycontrast,ifyouhaveaworkloadthatrequires individualdatasetsforeachuser,suchasinanHRcubewhereyouhavemanysecurityrolesoryouare 43

usingdynamicsecurity,theopportunitytoreusecalculationresultsacrossusersislessenedor eliminated.Asaresult,theperformancebenefitsassociatedwithreusingthequeryprocessorcacheare notashigh.

2.3.1.3 Data RetrievalWhenyouqueryacube,thequeryprocessorbreaksthequeryintosubcuberequestsforthestorage engine.Foreachsubcuberequest,thestorageenginefirstattemptstoretrievedatafromthestorage enginecache.Ifnodataisavailableinthecache,itattemptstoretrievedatafromanaggregation.Ifno aggregationispresent,itmustretrievethedatafromthefactdatafromameasuregroupspartition data. RetrievingdatafromapartitionrequiresI/Oactivity.ThisI/Ocaneitherbeservedfromthefilesystem cacheorfromdisk.AdditionaldetailsoftheI/OsubsystemofAnalysisServicescanbefoundinPart2.

Figure 2020: High-level overview of the data retrieval process

2.3.1.3.1 Storage Engine Cache Thestorageenginecacheisalsoknownasthedatacacheregistrybecauseitiscomposedofthe dimensionandmeasuregroupcachesthatarethesamestructurally.Whenarequestismadefromthe 44

AnalysisServicesformulaenginetothestorageengine,itsendsarequestintheformofasubcube describingthestructureofthedatarequestandadatacachestructurethatwillcontaintheresultsof therequest.Usingthedatacacheregistryindexes,itattemptstofindacorrespondingsubcube: Ifthereisamatchingsubcube,thecorrespondingdatacacheisreturned. Ifasubcubesupersetisfound,anewdatacacheisgeneratedandtheresultsarefilteredtofit thesubcuberequest. Iflowergraindataexists,thedatacacheregistrycanaggregatethisdataandmakeitavailable aswellandthenewsubcubeanddatacachearealsoregisteredinthecacheregistry. Ifdatadoesnotexist,therequestgoestothestorageengineandtheresultsarecachedinthe cacheregistryforfuturequeries.

AnalysisServicesallocatesmemoryviamemoryholdersthatcontainstatisticalinformationaboutthe amountofmemorybeingused.Memoryholdersareintheformofnonshrinkableandshrinkable memory;eachcombinationofasubcubeanddatacacheformsasingleshrinkablememoryholder. WhenAnalysisServicesisunderheavymemorypressure,cleanerthreadsremoveshrinkablememory. Therefore,ensureyoursystemhasenoughmemory;ifitdoesnot,yourdatacacheregistrywillbe clearedout(resultinginslowerqueryperformance)whenitisplacedundermemorypressure. 2.3.1.3.2 Aggressive Data Scanning Sometimes,intheevaluationofanexpression,moredataisrequestedthanrequiredtodeterminethe result. Ifyoususpectmoredataisbeingretrievedthanisrequired,youcanuseSQLServerProfilertodiagnose howaqueryintosubcubequeryeventsandpartitionscans.Forsubcubescans,checktheverbose subcubeeventandwhethermoremembersthanrequiredareretrievedfromthestorageengine.For smallcubes,thislikelyisntaproblem.Forlargercubeswithmultiplepartitions,itcangreatlyreduce queryperformance.Thefollowingfiguredemonstrateshowasinglequerysubcubeeventresultsin partitionscans.

Figure2121:Aggressivepartitionscanning Therearetwopotentialsolutionstothis.Ifacalculationexpressioncontainsanarbitraryshape(thisis definedinthesectiononthequeryprocessorcache),thequeryprocessormaynotbeabletodetermine thatthedataislimitedtoasinglepartitionandrequestdatafromallpartitions.Trytoeliminatethe arbitraryshape.

45

Othertimes,thequeryprocessorissimplyoverlyaggressiveinaskingfordata.Forsmallcubes,this doesntmatter,butforverylargecubes,itdoes.Ifyouobservethisbehavior,potentialsolutionsinclude thefollowing: ContactMicrosoftCustomerServiceandSupportforfurtheradvice. DisablePrefetch=1(thisisdoneintheconnectionstring):SometimesAnalysisServicesrequests additionaldatafromthesourcetoprepopulatethecache;itmayhelptoturnitoffsothat AnalysisServicesdoesnotrequesttoomuchdata.

2.3.2 Query Processor InternalsThereareseveralchangestoqueryprocessorinternalsinSQLServer2008AnalysisServicesthatare applicabletoday(comparedtoSQLServer2005AnalysisServices).Inthissection,thesechangesare discussedbeforespecificoptimizationtechniquesareintroduced.

2.3.2.1 Subspace ComputationThekeyideabehindsubspacecomputationisbestintroducedbycontrastingitwithacellbycell evaluationofacalculation.(Thisisalsoknownasanavecalculation.)Consideratrivialcalculation RollingSumthatsumsthesalesforthepreviousyearandthecurrentyear,andaquerythatrequeststhe RollingSumfor2005forallProducts. RollingSum=(Year.PrevMember,Sales)+Sales SELECT2005oncolumns,Product.MembersonrowsWHERERollingSum

Acellbycellevaluationofthiscalculationproceedsasrepresentedinthefollowingfigure.

46

Figure 2222: Cell-by-cell evaluation

The10cellsfor[2005,AllProducts]areeachevaluatedinturn.Foreach,thepreviousyearislocated, andthenthesalesvalueisobtainedandthenaddedtothesalesforthecurrentyear.Therearetwo significantperformanceissueswiththisapproach. Firstly,ifthedataissparse(thatis,thinlypopulated),cellsarecalculatedeventhoughtheyareboundto returnanullvalue.Inthepreviousexample,calculatingthecellsforanythingbutProduct3andProduct 6isawasteofeffort.Theimpactofthiscanbeextremeinasparselypopulatedcube,thedifference canbeseveralordersofmagnitudeinthenumbersofcellsevaluated. Secondly,evenifthedataistotallydense,meaningthateverycellhasavalueandthereisno wastedeffortvisitingemptycells,thereismuchrepeatedeffort.Thesamework(forexample,getting thepreviousYearmember,settingupthenewcontextforthepreviousYearcell,checkingforrecursion) isredoneforeachProduct.Itwouldbemuchmoreefficienttomovethisworkoutoftheinnerloopof evaluatingeachcell. Nowconsiderthesameexampleperformedusingsubspacecomputation.Insubspacecomputation,the engineworksitswaydownanexecutiontreedeterminingwhatspacesneedtobefilled.Giventhe query,thefollowingspaceneedstobecomputed,where*meanseverymemberoftheattribute hierarchy. [Product.*,2005,RollingSum]

Giventhecalculation,thismeansthatthefollowingspaceneedstobecomputedfirst. 47 [Product.*,2004,Sales]

Next,thefollowingspacemustbecomputed. [Product.*,2005,Sales]

Finally,the+operatorneedstobeaddedtothosetwospaces. IfSaleswereitselfcoveredbycalculations,thespacesnecessarytocalculateSaleswouldbedetermined andthetreewouldbeexpanded.InthiscaseSalesisabasemeasure,sothestorageenginedataisused tofillthetwospacesattheleaves,andthen,workingupthetree,theoperatorisappliedtofillthe spaceattheroot.Hencetheonerow(Product3,2004,3)andthetworows{(Product3,2005,20), (Product6,2005,5)}areretrieved,andthe+operatorappliedtothemtoyieldsthefollowingresult.

Figure 2323: Execution plan

The+operatoroperatesonspaces,notsimplyscalarvalues.Itisresponsibleforcombiningthetwogiven spacestoproduceaspacethatcontainseachproductthatappearsineitherspacewiththesummed value.Thisisthequeryexecutionplan.Notethatitoperatesonlyondatathatcouldcontributetothe result.Thereisnonotionofthetheoreticalspaceoverwhichthecalculationmustbeperformed. Aqueryexecutionplanisnotoneortheotherbutcancontainbothsubspaceandcellbycellnodes. Somefunctionsarenotsupportedinsubspacemode,causingtheenginetofallbacktocellbycell mode.Butevenwhenevaluatinganexpressionincellbycellmode,theenginecanreturntosubspace mode.

2.3.2.2 Expensive vs. Inexpensive Query PlansItcanbecostlytobuildaqueryplan.Infact,thecostofbuildinganexecutionplancanexceedthecost ofqueryexecution.TheAnalysisServicesenginehasacoarseclassificationschemeexpensiveversus inexpensive.Aplanisdeemedexpensiveifcellbycellmodeisusedorifcubedatamustbereadtobuild theplan.Otherwisetheexecutionplanisdeemedinexpensive. 48

Cubedataisusedinqueryplansinseveralscenarios.Somequeryplansresultinthemappingofone membertoanotherbecauseofMDXfunctionssuchasPrevMemberandParent.Themappingsarebuilt fromcubedataandmaterializedduringtheconstructionofthequeryplans.TheIIf,CASE,andIF functionscangenerateexpensivequeryplansaswell,shoulditbenecessarytoreadcubedatainorder topartitioncubespaceforevaluationofoneofthebranches.Formoreinformation,seeIIfFunctionin SQLServer2008AnalysisServices.

2.3.2.3 Expression SparsityAnexpressionssparsityreferstothenumberofcellswithnonnullvaluescomparedtothetotalnumber ofcellsintheresultoftheevaluationoftheexpression.Iftherearerelativelyfewnonnullvalues,the expressionistermedsparse.Iftherearemany,theexpressionisdense.Asweshallseelater,whether anexpressionissparseordensecaninfluencethequeryplan. Buthowcanyoutellwhetheranexpressionisdenseorsparse?Considerasimplenoncalculated measureisitdenseorsparse?InOLAP,basefactmeasuresareconsideredsparsebytheAnalysis Servicesengine.Thismeansthatthetypicalmeasuredoesnothavevaluesforeveryattributemember. Forexample,acustomerdoesnotpurchasemostproductsonmostdaysfrommoststores.Infactits thequitetheopposite.Atypicalcustomerpurchasesasmallpercentageofallproductsfromasmall numberofstoresonafewdays.Thefollowingtablelistssomeothersimplerulesforpopular expressions. Expression Regularmeasure ConstantValue Scalarexpression;forexample,count, .properties + * / Sum(,) Aggregate(,) IIf(,,) Sparse/dense Sparse Dense(excludingconstantnullvalues, true/falsevalues) Dense Sparseifbothexp1andexp2aresparse; otherwisedense Sparseifeitherexp1orexp2issparse; otherwisedense Sparseifissparse;otherwisedense Inheritedfrom Determinedbysparsityofdefaultbranch (refertoIIffunction)

Formoreinformationaboutsparsityanddensity,seeGrossmargindensevs.sparseblockevaluation modeinMDX(http://sqlblog.com/blogs/mosha/archive/2008/11/01/grossmargindensevssparse blockevaluationmodeinmdx.aspx).

49

2.3.2.4 Default ValuesEveryexpressionhasadefaultvaluethevaluetheexpressionassumesmostofthetime.Thequery processorcalculatesanexpressionsdefaultvalueandreusesacrossmostofitsspace.Mostofthetime thisisnullbecauseoftentimes(butnotalways)theresultofanexpressionwithnullinputvaluesisnull. Theenginecanthencomputethenullresultonce,andthenitneedstocomputeonlyvaluesforthe muchreducednonnullspace. AnotherimportantuseofthedefaultvaluesisintheconditionintheIIffunction.Knowingwhichbranch isevaluatedmoreoftendrivestheexecutionplan.Thedefaultvaluesofsomepopularexpressionsare listedinthefollowingtable. Expression Regularmeasure IsEmpty() Defaultvalue Null True Comment None. Themajorityoftheoreticalspaceis occupiedbynullvalues.Therefore, IsEmptywillreturnTruemostoften. Valuesforbothmeasuresareprincipally null,sothisevaluatestoTruemostofthe time. Thisisdifferentthancomparingvalues theengineassumesthatdifferent membersarecomparedmostofthetime.

= IS

True

False

2.3.2.5 Varying AttributesCellvaluesmostlydependonattributecoordinates.Butsomecalculationsdonotdependonevery attribute.Forexample,thefollowingexpressiondependsonlyontheCustomerattributeinthecustomer dimension. [Customer].[CustomerGeography].properties("PostalCode") Whenthisexpressionisevaluatedoverasubspaceinvolvingotherattributes,anyattributesthe expressiondoesntdependoncanbeeliminated,andthentheexpressioncanberesolvedandprojected backovertheoriginalsubspace.Theattributesanexpressiondependsonaretermeditsvarying attributes.Forexample,considerthefollowingquery.withmembermeasures.Zipas [Customer].[CustomerGeography].currentmember.properties("PostalCode") selectmeasures.zipon0, [Product].[Category].memberson1 from[AdventureWorks]

50

where[Customer].[CustomerGeography].[Customer].&[25818]

Theexpressiondependsonthecustomerattributeandnotthecategoryattribute;therefore,customer isavaryingattributeandcategoryisnot.Inthiscasetheexpressionisevaluatedonlyonceforthe customerandnotasmanytimesasthereareproductcategories.

2.3.3 Optimizing MDXDebuggingcalculationperformanceissuesacrossacubecanbedifficultiftherearemanycalculations. Thefirststepistotrytonarrowdownwheretheproblemexpressionisandthenapplybestpracticesto theMDX.Inordertonarrowdownaproblem,youwillfirstneedabaseline.

2.3.3.1 Baselining Query SpeedsBeforebeginningoptimization,youneedreproduciblecoldcachebaselinemeasurements. Todothis,youshouldbeawareofthefollowingthreeAnalysisServicescaches: Theformulaenginecache Thestorageenginecache Thefilesystemcache

BoththeAnalysisServicesandtheoperatingsystemcachesneedtobeclearedbeforeyoustarttaking measurements. 2.3.3.1.1 Clearing the Analysis Services Caches TheAnalysisServicesformulaengineandstorageenginecachescanbeclearedwiththeXMLA ClearCachecommand.YoucanuseSQL