Post on 25-Oct-2021
Building Performant Dossiers: Tips and Best Practices
Alejandro Olvera
Technology
#analytics19
Safe Harbor Notice
This presentation may include statements that constitute “forward-looking statements” for purposes of the safe harbor provisions under the Private Securities Litigation Reform Act of 1995, including descriptions of technology and product features that are under development and estimates of future business prospects. Forward-looking statements inherently involve risks and uncertainties that could cause actual results of MicroStrategy Incorporated and its subsidiaries (collectively, the “Company”) to differ materially from the forward-looking statements.
Factors that could contribute to such differences include: the Company’s ability to meet product development goals while aligning costs with anticipated revenues; the Company’s ability to develop, market, and deliver on a timely and cost-effective basis new or enhanced offerings that respond to technological change or new customer requirements; the extent and timing of market acceptance of the Company’s new offerings; continued acceptance of the Company’s other products in the marketplace; the timing of significant orders; competitive factors; general economic conditions; and other risks detailed in the Company’s Form 10-Q for the three months ended September 30, 2018 and other periodic reports filed with the Securities and Exchange Commission. By making these forward-looking statements, the Company undertakes no obligation to update these statements for revisions or changes after the date of this presentation.
Copyright © 2019 MicroStrategy Incorporated. All Rights Reserved.
Agenda
• Who’s responsible for performance?
• Learning the Dossier Execution Workflow
• Potential bottlenecks
• Strategies to achieve good performance
[Infographic: overview of the Intelligent Enterprise. Panels: Enterprise Assets (150+ drivers and gateways); Technology Opportunities; Market Disruptors; Regulatory Requirements; Competitive Forces; Intelligence Center (Data, Schema, Drivers and Gateways, Application); Roles (Business Users, Executives, Analysts, Data Scientists, Developers); Devices; Functions; Users; Applications; Tools; Programs (Applications, Mobility, Intelligence, Services, Departmental, Enterprise); Development; Foundation; Architecture Personas (Intelligence Director; Application, Analytics, Mobile, Identity, Services, and Database Architects; Platform and System Administrators); Deployment; Intelligence Platform (Application Services, Client Products, Security, Platform Services); Hierarchy of Needs (Popularity, Economy, Simplicity, Scalability, Stability, Security, Functionality).]
An Intelligence Platform helps make every device, application, and person more intelligent.
It creates an enterprise semantic graph that connects, indexes, abstracts, and federates your organization’s data, telemetry, and usage. Users can rapidly build contextual applications and deploy them anywhere, anytime, and on any standard device, delivering trusted answers to constituents based on who they are, where they are, and what they need.
The Intelligence Center
Aim for a 5-second response time, with a maximum of 10 seconds (end-to-end)
Every role of the Intelligent Enterprise matters
Building performant dossiers
Developing a mindset is the key!
• Systems Administrator
• Platform Administrator
• Analytics Architect
• Applications Architect
• Developers, Analysts, Business Users
• Thinking of performance implications with every dossier design decision
• Intelligence Center working together with users to improve the overall system
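The 5-second target and 10-second ceiling mentioned on this slide can be written down as a simple check. This is only a minimal sketch: the function name and the three labels are illustrative and are not part of any MicroStrategy tool.

```python
TARGET_SECONDS = 5.0   # desired end-to-end response time (from the slide)
MAX_SECONDS = 10.0     # upper limit for end-to-end response time (from the slide)

def classify_response_time(seconds):
    """Label a measured end-to-end response time against the two thresholds."""
    if seconds <= TARGET_SECONDS:
        return "on target"
    if seconds <= MAX_SECONDS:
        return "acceptable"
    return "too slow"

# Hypothetical measurements, in seconds.
for measured in (3.2, 7.5, 12.0):
    print(f"{measured:>5.1f}s -> {classify_response_time(measured)}")
```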
Systems and Platform Administrators
Enable performant (network) access to enterprise datasets
Ensure enough Server resources to leverage In-Memory capabilities, including Caching
Manage environments for enterprise and departmental applications
• Enterprise environment requiring 99% uptime and meeting all company performance Service Level Agreements (SLAs)
• Departmental environment with its own governance, allowing ad-hoc reporting & self-service
✓ Use job prioritization mechanisms already in place to isolate users, including User Fencing and Workload Fencing on clustered environments
Applications Architect
Enterprise Applications standards and guidelines
Performance tuning and troubleshooting
• Criteria for application datasets
• Application caching strategy
Work with business users, analysts, developers, data scientists
Understanding the dossier execution
Datasets Execution
• Reports and View Reports
• In Memory Cubes or Cached Reports
• Live Connection
Dossier Schema
• Joins and Global Lookup Tables
• Calculate Derived Attributes and Groups
Populate Filters
Visualizations
• Get Visualization data
• Visualization level calculations (e.g. Derived Metrics)
• Rendering
Running Datasets
Datasets that require time to query data
• Reports run SQL against the Warehouse (unless already cached).
• View Reports run CSI against a Cube (unless cached – new!)
Datasets that do not consume time at this stage
• In Memory Cubes (already loaded / published)
• Reports that were pre-cached
• Live Connection datasets
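The difference between a dataset that must query the warehouse and one that is already cached can be illustrated with a cache-first lookup. This is a conceptual sketch only: `run_dataset`, `slow_warehouse_query`, and the dictionary cache are hypothetical stand-ins and do not reflect how the Intelligence Server implements caching.

```python
import time

_cache = {}  # hypothetical result cache, keyed by the query text

def slow_warehouse_query(sql):
    """Stand-in for a query against the warehouse (simulated delay)."""
    time.sleep(0.05)
    return [("2019", 100), ("2020", 120)]

def run_dataset(sql):
    """Return (rows, served_from_cache) for a dataset definition."""
    if sql in _cache:
        return _cache[sql], True       # no time spent querying the source
    rows = slow_warehouse_query(sql)   # first execution pays the query cost
    _cache[sql] = rows
    return rows, False

query = "SELECT year, revenue FROM sales"
_, from_cache = run_dataset(query)
print("first run served from cache:", from_cache)
_, from_cache = run_dataset(query)
print("second run served from cache:", from_cache)
```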
Dossier Schema
The layer that organizes the data available to the dossier globally
• Deals with Attribute linking (data blending across datasets)
• Attribute relationships
• Creation of Lookup Tables
• Derived Attributes
• Groups
Attribute Element List Filters
Avoid heavy attributes in Filter styles that are populated with a list of elements
Running visualizations
What goes into executing each visualization?
Getting the data
• In-Memory datasets: Data is subset from the Cube (CSI Engine)
• Live datasets: Data is queried directly from the external source (SQL Engine)
Performing In-Memory calculations
• Calculating Derived Metrics
• With Live datasets, Derived Metrics are pushed to the external data source. Some Derived Attribute calculations too.
JSON Generation + Transfer through Network
In-Memory vs Live Connection
What goes into executing each visualization?
Rendering visualizations
Rendering time depends on number of data points and your machine resources
• Grids have incremental fetching, so only the first 100 rows render
• However, other visualizations may attempt to render every data point
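Incremental fetching can be pictured as handing rows to the renderer one page at a time instead of all at once. The sketch below is a generic illustration with a page size of 100, matching the figure on the slide; `fetch_incrementally` is a made-up helper, not a MicroStrategy function.

```python
from itertools import islice

def fetch_incrementally(rows, page_size=100):
    """Yield successive pages of at most `page_size` rows."""
    iterator = iter(rows)
    while True:
        page = list(islice(iterator, page_size))
        if not page:
            return
        yield page

all_rows = range(1050)            # pretend result set
pages = fetch_incrementally(all_rows)
first_page = next(pages)          # only this page needs to be rendered up front
print(len(first_page), "rows rendered initially out of", len(all_rows))
```

A visualization without this behavior would have to draw all 1,050 points before the user sees anything, which is why limiting data points matters more for charts than for grids.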
Strategies to achieve good performance
Dataset Caching & Dossier Caching (Cache Subscription)
Narrowing down the data for the dossier
Blending your Data the right way
Using In-Memory calculations wisely and removing unused objects
Limiting the number of individual data points rendered in each visualization
Dataset Caching
Dataset Caching for Reports and View Reports (new!)
Dossier Caching
Caching all the dossier data and definition
Caching in Library
Configurable from Workstation (highly personalized!)
Caching Subscriptions
Schedule a recurrent cache update subscription to speed up first response time
Narrowing down the data
Dataset-level filters vs dossier-level filters
Filtering data at the individual visualization level
If using Reports, having a dataset-level Report Filter limits the data volume from the start
Using Prompts
Narrowing down the data
Dataset-level filters
• Report Filters
• Filters in View Report
Dossier-level filters
• Filter Panel
• Advanced Filter
• Filters at the Metric level (or Conditional Metrics)
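The general idea behind preferring dataset-level filters is that less data leaves the source in the first place. The sketch below uses Python's built-in `sqlite3` module as a stand-in warehouse; the `sales` table and its contents are invented for illustration, and the "dossier-level" branch simply models filtering after the full dataset has been fetched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 10), ("South", 20), ("North", 30), ("West", 40)],
)

# Dataset-level filter: the condition is part of the query, so only
# matching rows ever leave the source.
dataset_filtered = conn.execute(
    "SELECT region, amount FROM sales WHERE region = ?", ("North",)
).fetchall()

# Dossier-level filter: every row is fetched first, then filtered afterwards.
all_rows = conn.execute("SELECT region, amount FROM sales").fetchall()
dossier_filtered = [row for row in all_rows if row[0] == "North"]

print(len(dataset_filtered), "rows fetched with a dataset-level filter")
print(len(all_rows), "rows fetched before applying a dossier-level filter")
```

Both approaches show the same rows to the user; the difference is how much data had to be moved and held to get there.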
Narrowing down the data with Prompts
Prompting Reports to limit the data they must fetch from the source
Using Data Blending the right way
Splitting your data into several datasets vs a single consolidated dataset
Multiple joins across datasets and their performance implications
Data Blending and Global Lookup tables. How does this matter?
Controlling join behavior across datasets
Multiple Datasets vs a Single Consolidated Dataset
Consider implications of blending data from multiple datasets
OLAP vs Multi-Table Data Import Cubes
Two types of Cubes:
• OLAP: Governed project schema, typically created by BI department. Data is denormalized.
• MTDI Cubes: Sometimes created by business users. Mini-schema in-memory. Should be certified by Admin so they are governed (that way this data can be federated to others). Data can be normalized.
Using Data Blending the right way
One OLAP Cube (denormalized data) vs many smaller datasets
Using Data Blending the right way
Blending through a Multi-Table Data Import Cube (aka Super Cube)
Data blending between multiple datasets
Data Combination settings
• Dossier-level lookup tables for linked attributes: this allows populating full sets of distinct elements present in all datasets.
Load Chapters on Demand
• Run only the visualizations of the currently viewed Chapter. Switching Chapters may be slower.
OLAP vs MTDI
OLAP Cubes vs Multi-Table Data Import Cubes
Multi-Table Data Import Cubes
What can impact performance? (Hint: joins)
• One-to-many relationships. Set relationships correctly to avoid more expensive joins.
• Number of tables joined in a query, e.g. when retrieving metric data from many tables in the MTDI cube.
• Security filters. The CSI/SQL engine may “auto-join” the table holding the user-requested data with the tables that contain the secured attributes.
• Complex filters that result in multiple passes and joins, especially if based on multiple attributes not present in the same table.
• Many-to-Many relationships are usually unnecessary, so avoid defining them in the MTDI cube.
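Why many-to-many relationships are costly can be seen from how join results grow when the key repeats on both sides. The following is a toy illustration with invented tables and a deliberately naive join; it says nothing about how the CSI or SQL engines actually execute joins.

```python
def inner_join(left, right):
    """Naive inner join of (key, value) pairs on the key."""
    return [(k1, v1, v2) for k1, v1 in left for k2, v2 in right if k1 == k2]

orders = [("c1", 100), ("c1", 150), ("c2", 80)]

# One-to-many: each customer key appears exactly once on the left side.
customers = [("c1", "Ann"), ("c2", "Bob")]
one_to_many = inner_join(customers, orders)

# Many-to-many: the key repeats on both sides, so matching rows multiply.
customer_tags = [("c1", "gold"), ("c1", "newsletter"), ("c2", "gold")]
many_to_many = inner_join(customer_tags, orders)

print(len(one_to_many), "rows from the one-to-many join")
print(len(many_to_many), "rows from the many-to-many join")
```

In the one-to-many case the result has one row per order; in the many-to-many case customer `c1` alone contributes 2 tags x 2 orders, and any metric summed over that result would be double-counted unless the engine does extra work.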
Cube Partitioning
Leverage parallel aggregation on multiple CPUs by enabling partitioning
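Partitioned aggregation follows a split, aggregate, combine pattern: rows are divided by a partition key, each partition is aggregated independently, and the partial results are merged. The sketch below is a conceptual illustration, not the cube engine's implementation. It uses threads only to keep the example short; for CPU-bound work in Python, a process pool would be needed to use several cores.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(rows, key_index, num_partitions):
    """Split rows into buckets by hashing the partition key."""
    buckets = [[] for _ in range(num_partitions)]
    for row in rows:
        buckets[hash(row[key_index]) % num_partitions].append(row)
    return buckets

def partial_sum(rows):
    """Aggregate one partition independently of the others."""
    return sum(amount for _, amount in rows)

# Invented data: (customer_id, amount) pairs.
rows = [(customer_id, customer_id % 7) for customer_id in range(1000)]
buckets = partition(rows, key_index=0, num_partitions=4)

with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(partial_sum, buckets))

total = sum(partials)  # combine the partial aggregates
print("partition sizes:", [len(b) for b in buckets])
print("total:", total)
```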
Data Blending on multiple datasets
Join Settings are controlled via each dataset’s context menu
In-Memory calculations vs pre-calculated Columns
Derived Attributes and Groups are useful for ad-hoc reporting and self-service, but how do they scale?
Make a dossier more scalable and better performing by moving these calculations to an In-Memory dataset, so they don’t have to be calculated at dossier execution time.
Performance considerations of Derived Attributes vs Data Blending (when they get calculated against a Global Lookup)
Using data wrangling instead of Derived Attributes, and why this may perform better.
Considerations with Derived Metrics (mention of “Use Lookup for Attributes” setting)
In-Memory calculations vs pre-calculated Columns
Using Data Wrangling instead of Derived Attributes
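The trade-off between a calculation done at execution time and a pre-calculated column can be sketched as follows. The `age_band` rule and the sample rows are invented; the point is only that the second option pays the calculation cost once, when the dataset is prepared, rather than on every run.

```python
def age_band(age):
    """Example of a calculation a Derived Attribute might perform."""
    if age < 30:
        return "under 30"
    if age < 60:
        return "30-59"
    return "60+"

raw_rows = [
    {"customer": "a", "age": 25},
    {"customer": "b", "age": 41},
    {"customer": "c", "age": 67},
]

# Option 1: derive at execution time, recomputed on every dossier run.
def execute_with_derived_attribute(rows):
    return [age_band(r["age"]) for r in rows]

# Option 2: compute once while preparing the dataset; executions just read the column.
prepared_rows = [{**r, "age_band": age_band(r["age"])} for r in raw_rows]

def execute_with_precalculated_column(rows):
    return [r["age_band"] for r in rows]

print(execute_with_derived_attribute(raw_rows))
print(execute_with_precalculated_column(prepared_rows))
```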
Visualization rendering time
Incremental fetch on Grid, but not in other visualizations
Performance considerations with Maps. Using clustering.
Performance considerations of Bar Charts, Scatter plot, Heatmap and other visualizations. Use filters to limit the volume of data points that the visualization will render.
Using Image (png or jpg) thresholds
In a nutshell
Watch for joins! Especially between unrelated attributes. Prefer inner joins.
Prefer In Memory cubes or cached Reports as datasets. Consider cache subscriptions.
Clean up your dossier. Remove unused objects and datasets.
One big dataset is generally better than multiple smaller ones.
Use Derived Attributes, Groups and Derived Metrics only if necessary. Prefer having these calculations built on the dataset (pushed to DB, or done when publishing the Cube).
Try to avoid complex functions. Watch for “Smart Metrics”.
The more tables that have to be accessed, the longer the dossier takes to run.
Tips for Troubleshooting
Simple “binary search” on Chapters, Pages, Visualizations, Derived Calculations, Filters and Datasets
Capacity Planning Tool (Administrators)
Using Chrome’s Developer Tools to break down the response time (time on Server side vs time on client/browser used for rendering)
Intelligent Server logs that can be used to further break down the response time.
Turning on Performance Statistics on MicroStrategy Web
• Datasets execution time
• Time spent on calculations of Derived Attributes, Groups
• Time spent on each individual visualization, including Derived Metrics calculations and others
• MCE Trace for Data Blending
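The “binary search” tip from this slide amounts to repeatedly keeping half of the dossier's components and checking whether the slowdown is still present. A minimal sketch, assuming exactly one component is responsible: `is_slow` is a hypothetical callback which, in practice, would mean saving a copy of the dossier containing only that subset and timing it by hand.

```python
def find_slow_component(components, is_slow):
    """Bisect `components` to find the single one that makes the dossier slow."""
    candidates = list(components)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Keep whichever half still reproduces the slowdown.
        candidates = half if is_slow(half) else candidates[len(half):]
    return candidates[0]

# Invented chapter names; the lambda simulates "this subset is slow".
chapters = ["Overview", "Sales", "Inventory", "Forecast", "Map"]
culprit = find_slow_component(chapters, lambda subset: "Forecast" in subset)
print("slow chapter:", culprit)
```

The same procedure can then be repeated inside the slow chapter, on its pages, visualizations, derived calculations, filters, and datasets.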
Platform Analytics and other Tips for Troubleshooting
Platform Analytics
Captures telemetry from the MicroStrategy Platform in real-time and makes it available to administrators, developers and analysts to help them optimize performance
MicroStrategy Consulting
Application Performance Tuning Advisory
Best practice guidance to ensure your application performs seamlessly.
MicroStrategy.com/Services
Visit microstrategy.com/request-benefits to explore consulting services custom-built to help you become a more Intelligent Enterprise—and available at no cost to you.
Enterprise Support Program
Because we are vested in your success
Reinvesting in you.
Q + A