The Cloud Analytics Reference Architecture VP

  • 8/13/2019 The Cloud Analytics Reference Architecture VP

    Harnessing Big Data to Solve Complex Problems:

    The Cloud Analytics Reference Architecture

    Table of Contents

    Introduction

    Cloud Analytics Reference Architecture

    Using All the Data

    Better Questions and Answers

    A Deliberate Approach to Unlocking the Promise of Big Data

    A Strong Foundation

    The Data Lake

    The Analytics

    Visualization and Interaction

    Conclusion

    Introduction

    The ability to compete and win in the information

    economy will come from powerful analytics that draw

    insights and value from data, and from high-fidelity

    visualizations that present those insights in impactful,

    intuitive ways.

    Government and business organizations are

    increasingly looking to big data to create new

    opportunities and solve their most complex real-world

    problems. They hope to tap into the many rich new

    sources of information that are emerging, from online

    consumer behavior to social networking to the growing

    use of electronic health records. At the same time,

    organizations are building up immense databases of

    their own, using rapid advances in cloud storage.

    Despite this new wealth of information, the key to

    unlocking its value seems to be missing. Organizations

    are discovering that the size and diversity of big data

    make it difficult to use in a meaningful way. They are

    never able to explore all of the information at once, and

    so are unable to track overall trends, or find the kinds

    of larger, unexpected patterns that can lead to valuable

    knowledge and insight.

    The problem is that organizations are limited by

    computing techniques developed long before big

    data arrived on the scene. With these conventional

    techniques, only narrow slices of data can be

    accessed at any time. Datasets and analytics are

    highly structured, and must be torn down and rebuilt

    with each new line of inquiry. Information that does

    not neatly fit into such rigid structures, such as

    Twitter and video feeds, often cannot be used. While

    organizations are collecting more information than

    ever, the data tends to reside in silos that are difficult

    to integrate. Cloud storage, despite its benefits, has

    not eliminated the data silos; it has simply made

    them fatter.

    In addition, few of the world's IT systems are ready for

    the technology revolution happening as organizations

    seek to transform how they use data. As illustrated

    in Exhibit 1, their infrastructures face three major

    challenges: volume, variety, and velocity.

    To help organizations overcome these hurdles and

    prepare for what's next, Booz Allen has pioneered an

    entirely new approach for the implementation of big

    data in the digital enterprise, a way of using technology,

    machine-based analytics, and human-powered analysis

    to create competitive and mission advantage.

    This innovative approach, known as the Cloud Analytics

    Reference Architecture, removes the conventional

    constraints, enabling organizations to integrate all

    of their available data, along with information from

    multiple outside data sources. This powerful capability

    makes it possible for organizations to find value, guide

    strategy, and solve mission and business problems

    long considered too complex.

    Exhibit 1 | Data Challenges in the Era of Big Data

    • Volume: Not enough storage capacity and analytical capabilities to handle massive volumes of data

    • Variety: Data comes in many different formats, which can be difficult and expensive to integrate

    • Velocity: Inability to process data in real time in order to extract the most value from it

    Source: Booz Allen Hamilton

    Cloud Analytics Reference Architecture

    The Cloud Analytics Reference Architecture has been

    proven in high-stakes environments. It was developed

    through an ongoing collaboration between

    Booz Allen and the US government to leverage big

    data in the search for terrorists and other threats.

    Intelligence analysts are currently using the Cloud

    Analytics Reference Architecture to integrate the

    entire spectrum of intelligence sources, and

    apply sophisticated analytical tools to find hidden

    connections and patterns. Similarly, the US military is

    using the Cloud Analytics Reference Architecture to

    provide information on insurgents and others who are

    planting improvised explosive devices (IEDs) and other

    bombs. The capability of the Cloud Analytics Reference

    Architecture to analyze a vast array of disparate

    data sources is providing military commanders with

    unprecedented situational awareness. Commanders

    have reported that the approach is saving lives.

    In another example, Booz Allen and a large hospital

    chain in the Midwest have demonstrated how the

    Cloud Analytics Reference Architecture can also save

    lives in medicine. By analyzing a large volume of

    electronic health records, researchers have discovered

    unexpected patterns over time in the vital signs of

    former patients whose serious, often hospital-acquired

    infections suddenly became life-threatening. Using

    those insights, the hospital system has begun a

    program to monitor current patients with infections,

    watching whether their vital signs are following the

    same patterns. This procedure is providing doctors with

    an early warning that their patients' conditions may be

    deteriorating.

    Booz Allen is now adapting the Cloud Analytics

    Reference Architecture for the larger government and

    business communities. This groundbreaking approach

    can be applied to a broad range of critical problems,

    such as:

    • Looking across large populations of internal and

    external network users to identify those most likely

    to steal information and commit fraud. The Cloud

    Analytics Reference Architecture can achieve this

    by integrating data from sources as varied as social

    media sites, public records and even users' patterns

    of computer behavior.

    • Uncovering threats to the stability of the US financial

    system, by discovering hidden patterns in the

    combined data of an array of government regulators

    and private financial institutions.

    • Enabling two or more government investigative

    agencies with a shared mission to integrate their

    intelligence and create a common operating

    picture, while precisely adhering to the restrictions,

    authorities and security issues pertaining to each

    organization's data.

    Exhibit 2 | Booz Allen's Cloud Analytics Reference Architecture

    Layers, from top to bottom:

    • Human Insights and Actions: Enabled by customizable interfaces and visualizations of the data

    • Analytics and Services: Your tools for analysis, modeling, testing, and simulations

    • Data Management: The single, secure repository for all of your valuable data

    • Infrastructure: The technology platform for storing and managing your data

    Component labels: Visualization, Reporting, Dashboards, and Query Interface; Services (SOA); Analytics and Discovery; Views and Indexes; Streaming Indexes; Data Lake; Metadata Tagging; Data Sources; Infrastructure/Management

    Source: Booz Allen Hamilton

    The Cloud Analytics Reference Architecture represents

    not an incremental step forward, but rather an entirely

    new approach, one specifically designed to solve

    organizations' real-world problems, and provide them

    new opportunities, by harnessing the power of big data.

    Using All the Data

    The Cloud Analytics Reference Architecture takes

    advantage of the immense storage ability of the cloud,

    but in a completely new way. An organization's repository

    of information is no longer stored in rigid, regimented

    data structures, but rather is consolidated in a vast

    pool, or data lake. Every inquiry can make use of this

    entire pool, along with information from multiple outside

    data sources, and it is all available at once. Users no

    longer need to move from database to database, pulling

    out specific information. And because there are no data

    silos, there is no need to integrate them.

    What results is not chaotic or overwhelming. Rather,

    the rich diversity of information in the data lake

    becomes a powerful force. The data lake is more than

    a means of storage; it is a medium expressly designed

    to foster connections in data. And the Cloud Analytics

    Reference Architecture explores those connections

    to search for valuable correlations and patterns. This

    actually reduces the complexity of big data, making it

    manageable and useful, and creating efficiencies.

    The crucial role of the data lake can be seen when

    the Cloud Analytics Reference Architecture is viewed

    in layers (see Exhibit 2). The data lake is supported

    from below by the cloud storage infrastructure, and

    in turn supports the computer analytics. All of these

    elements support the final phase, the visualization and

    interaction, where human insight and action take place.

    Better Questions and Answers

    With the conventional approach, we do not really ask

    questions of the data; we create hypotheses, and then

    test the data to see whether we are right. In order to

    pose these hypotheses, we have to guess in advance

    what the answers might be, often a difficult proposition.

    We also need to be familiar with the data we are

    considering, including where it is (in what specific

    datasets or databases), what format it is in, and even

    to a large extent what the data itself contains.

    That level of knowledge might be achievable when

    we are working with a limited number of datasets

    or databases, but not with the vast amounts of

    information now becoming available to us. We often

    have to put aside, or assume away, factors that we

    might actually believe are critical. And so we end up

    settling for marginal questions, and marginal answers.

    Because the data lake removes the need for rigid data

    structures, all of these constraints are removed. We no

    longer need to pose hypotheses of defined data, and

    so can ask more big-picture, intuitive questions.

    The Cloud Analytics Reference Architecture also allows

    us to more readily look for unexpected patterns; it

    lets the data talk to us, so to speak. While we can

    look for patterns with the conventional approach, we

    can only do so within our narrowly defined datasets

    and databases, and we have to know in advance

    what patterns we might be looking for. With the Cloud

    Analytics Reference Architecture, we can discover

    unexpected patterns that naturally emerge in the data.

    This capability creates opportunities to predict the

    future by looking at the past. Because the Cloud

    Analytics Reference Architecture can store and analyze

    vast amounts of information, organizations can look

    for patterns in historical data, and see whether such

    patterns are repeating today. This can, for example,

    help regulators determine whether financial institutions

    are repeating the mistakes of the past. And it can help

    medical researchers look for patterns in the historical

    health records of thousands of previous patients, to

    help treat patients today.

    A Deliberate Approach to Unlocking the Promise of Big Data

    Booz Allen's Cloud Analytics Reference Architecture

    provides a holistic approach to people, processes, and

    technology in four tightly integrated layers, as depicted

    in Exhibit 3. By design, these layers work seamlessly

    together to:

    • Allow distributed storage and replication of bytes

    across networks and hardware that are assumed

    to fail at any time

    • Allow for massive, world-scale storage that separates

    metadata from data

    • Support a write-once, sporadic append, read-many

    usage structure

    • Store records of various sizes, from a few bytes up

    to a few terabytes in size

    • Allow compute cycles to be easily moved to the data

    store, instead of moving the data to a processor farm.

    The Cloud Analytics Reference Architecture has an

    inherent flexibility that enables organizations to pursue

    new analytical approaches with few if any changes to

    the underlying infrastructure. For example, the data

    lake is easily expandable. Because it stores information

    so efficiently, it can accommodate both the natural

    growth of an organization's data and the

    addition of data from multiple outside sources. At the

    same time, the Cloud Analytics Reference Architecture

    replaces the current, custom-built analytic and

    visualization tools with ones that can easily be adapted

    for almost any number of inquiries.

    Exhibit 3 | Layers within the Cloud Analytics Reference Architecture

    • Human Insights and Actions: Building on results and outputs from various analytical methods, multiple data visualizations can be created in your new cloud analytics solution. These are used to compose the interactive, real-time dashboard interfaces your decision makers and analysts need to make sense of your data.

    • Analytics and Services: Both traditional and Big Data tools and software can operate on the information stored in your Data Lake, producing advanced specific analysis, modeling, testing, and simulations you need for decision making.

    • Data Management: Your Data Lake is a secure, distributed repository of a wide variety of data sources. Security, metadata, and indexing of Big Data are enabled by distributed key value systems (NoSQL), but the Architecture allows for traditional relational databases as well.

    • Infrastructure: This foundational layer allows for quick, streamlined, low-risk deployment of the cloud implementation. The plug-and-play, vendor-neutral framework is unique to Booz Allen.

    Source: Booz Allen Hamilton

    A Strong Foundation

    With the conventional approach, organizations must

    continually reinvest in infrastructure as analytic needs

    change. Building bridges between silos, for example,

    typically requires reconfiguring and even expanding

    the infrastructure. With the Cloud Analytics Reference

    Architecture, the infrastructure becomes a stable

    platform to support all aspects of cloud computing.

    With the top-to-bottom flexibility of the Cloud Analytics

    Reference Architecture, organizations do not need to

    continually rebuild and reconfigure their infrastructure.

    Their initial investment in infrastructure is both enduring

    and cost-effective.

    The Data Lake

    With the conventional approach, the computer finds

    information by looking in a particular database. With

    the data lake, information is located in an entirely

    different way: by tags, or details that have been

    embedded in it for sorting and identification.

    For example, an investor's portfolio balance (the data)

    is generally stored with identifying information such

    as the name of the investor, the account number, one

    or more dates, the location of the account, the types

    of investments, the country the investor lives in, and

    so on. This metadata is what gets tagged, and is

    located by the computer during inquiries.

    The tags themselves are also a way of gaining

    knowledge from the data. In the example above,

    the tags might allow us to look for, say, connections

    between investors' countries and their types of

    investments. The basic data (the portfolio balance)

    might not even be part of the inquiry. Such connections

    can be made with the conventional approach, but

    only if the custom-built databases and computer

    analytics have already been designed to take them into

    consideration. As illustrated in Exhibit 4, with the data

    lake, all of the data, metadata and identifying tags are

    available for any inquiry or search for patterns. And,

    such inquiries or searches can pivot off of any one of

    those pieces of information. This greatly expands the

    usability of the data available to an organization. It

    actually makes big data even bigger.
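
    The tag-based lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the tag names (investor, country, investment_type), and the helper functions build_tag_index and pivot are hypothetical and are not taken from the Cloud Analytics Reference Architecture itself. The sketch shows an inquiry pivoting off one tag to count another, without ever reading the portfolio balance.

```python
from collections import defaultdict

# Hypothetical records: each pairs a data value (the portfolio balance)
# with metadata tags. Names and values are illustrative only.
records = [
    {"data": 125_000.0, "tags": {"investor": "A. Jones", "country": "US", "investment_type": "bonds"}},
    {"data": 80_500.0, "tags": {"investor": "B. Silva", "country": "BR", "investment_type": "equities"}},
    {"data": 42_000.0, "tags": {"investor": "C. Wong", "country": "US", "investment_type": "equities"}},
    {"data": 310_000.0, "tags": {"investor": "D. Costa", "country": "BR", "investment_type": "equities"}},
]


def build_tag_index(records):
    """Map every (tag name, tag value) pair to the positions of matching records."""
    index = defaultdict(set)
    for position, record in enumerate(records):
        for name, value in record["tags"].items():
            index[(name, value)].add(position)
    return index


def pivot(records, index, pivot_tag, target_tag):
    """Count target_tag values for each pivot_tag value, using tags only."""
    counts = defaultdict(lambda: defaultdict(int))
    for (name, value), positions in index.items():
        if name != pivot_tag:
            continue
        for position in positions:
            target_value = records[position]["tags"].get(target_tag)
            if target_value is not None:
                counts[value][target_value] += 1
    return {key: dict(inner) for key, inner in counts.items()}


index = build_tag_index(records)
by_country = pivot(records, index, "country", "investment_type")
by_type = pivot(records, index, "investment_type", "country")
```

    Because the index covers every tag, the same records can be pivoted by country or by investment type without building a new structure for each question.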

    In addition, the data lake smoothly accepts every type

    of data, including unstructured data: information that

    has not been organized for inclusion in a database. An

    example might be the doctors' and nurses' notes that

    accompany a patient's electronic health records, or

    information from social networking sites.

    Exhibit 4 | Data Management Architectural Model

    Source: Booz Allen Hamilton

    Two other critical emerging data types are batch

    and streaming. Batch data is typically collected on

    an automated basis and then delivered for analysis

    en masse; for example, the utility meter readings

    from homes. Streaming data is information from a

    continuous feed, such as video surveillance.

    Much of the flood of big data is unstructured, batch

    and streaming, and so it is essential that organizations

    have the ability to make full use of all types. With the

    data lake, there is no second-class or third-class data.

    All of it, including structured, unstructured, batch and

    streaming, is equally ingested into the data lake, and

    available for every inquiry.

    This mass of information is not random and chaotic,

    but rather is purposeful. The data lake is like a viscous

    medium that holds the data in place, and at the same

    time fosters connections. Because the data is all in

    one place, it is, in a sense, all connected.

    As an example, cybersecurity experts trying to identify

    internal and external network users most likely to steal

    information and commit fraud might consolidate a

    broad range of disparate information into a data lake.

    In addition to unstructured information about individuals

    from social media sites, the data lake could include

    thousands of public records sources, from bankruptcy

    and criminal histories to trajectories of zip codes.

    These might show out-of-the-ordinary improvements or

    declines in personal finances (possibly indicating that

    an individual has committed fraud or is in dire straits

    and may have motivation to do so). The data lake

    might also include users' computer behavior, enabling

    the analytics to look for anomalies. Is the customer or

    employee staying on the network far longer than usual,

    or visiting new and different parts of the network, or

    engaging in activities uncharacteristic of prior use?

    With conventional methods, each potential data

    source would have to be examined separately, and

    the results would be difficult to integrate. A data lake

    would remove these constraints, making it possible

    for the analytics to look for patterns and connections

    in all of the available data at once, and to compile

    sophisticated risk scores on every internal and external

    user of the system.
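
    One way to picture how behavioral anomalies could feed a risk score is the Python sketch below. The paper does not say how scores are computed, so everything here is an assumption made for illustration: the standard-deviation measure, the signal names, the weights, and the functions session_anomaly_score and risk_score are all hypothetical.

```python
from statistics import mean, pstdev


def session_anomaly_score(history_minutes, current_minutes):
    """Return how many standard deviations the current session is from the user's norm."""
    baseline = mean(history_minutes)
    spread = pstdev(history_minutes)
    if spread == 0:
        # No variation in the history: any difference at all is anomalous.
        return 0.0 if current_minutes == baseline else float("inf")
    return abs(current_minutes - baseline) / spread


def risk_score(signals, weights):
    """Combine named signals into one weighted sum (both dicts are hypothetical inputs)."""
    return sum(weights[name] * value for name, value in signals.items())


# A user who normally stays about 30 minutes suddenly stays 90 minutes.
long_session = session_anomaly_score([30, 32, 28, 30], 90)

# Illustrative weights only; a real system would have to derive and validate them.
overall = risk_score(
    {"long_session": long_session, "new_network_areas": 1.0},
    {"long_session": 0.5, "new_network_areas": 2.0},
)
```

    A production scoring model would likely draw on many more signals, such as the public records and social media sources mentioned above, and would need careful validation before anyone acted on it.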

    The Analytics

    The data lake supports a two-step process to analyze

    the data. In the first step, the pre-analytical tools filter

    and organize information from the data lake. That sets

    the stage for computer analytics, in the next layer

    up, to search for valuable knowledge.

    Extracting the Data

    Pre-analytics use the metadata tags to locate

    the relevant data from the data lake and give it

    an underlying organization. For example, in the

    collaboration between Booz Allen and the Midwest

    hospital system, the electronic health records of

    more than a thousand previous patients with serious

    infections were ingested into a version of a data

    lake. Special pre-analytics pulled out the patients'

    vital signs, and then, using the time-and-date

    stamps embedded in the records, organized them in

    chronological order. That enabled analytics, in the next

    step, to search for patterns in the way the patients'

    vital signs changed over time.
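
    The pre-analytic step in this example, pulling out vital signs and ordering them by their embedded time stamps, might look something like the following Python sketch. The entry format, field names, sample values, and the function extract_vital_series are hypothetical; the actual tools used in the hospital collaboration are not described in this paper.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw entries as they might come out of a data lake:
# vital signs mixed with unstructured notes, in no particular order.
raw_entries = [
    {"patient_id": "p1", "kind": "vital_sign", "name": "heart_rate",
     "value": 96, "timestamp": "2011-03-02T14:00:00"},
    {"patient_id": "p1", "kind": "note",
     "text": "Patient resting.", "timestamp": "2011-03-02T09:30:00"},
    {"patient_id": "p1", "kind": "vital_sign", "name": "heart_rate",
     "value": 82, "timestamp": "2011-03-02T08:00:00"},
    {"patient_id": "p2", "kind": "vital_sign", "name": "heart_rate",
     "value": 75, "timestamp": "2011-03-01T10:00:00"},
]


def extract_vital_series(entries):
    """Keep only vital signs and sort each patient's readings chronologically."""
    series = defaultdict(list)
    for entry in entries:
        if entry["kind"] != "vital_sign":
            continue  # notes and other record types are left for other inquiries
        taken_at = datetime.fromisoformat(entry["timestamp"])
        series[(entry["patient_id"], entry["name"])].append((taken_at, entry["value"]))
    for readings in series.values():
        readings.sort(key=lambda reading: reading[0])
    return dict(series)


vital_series = extract_vital_series(raw_entries)
p1_heart_rates = [value for _, value in vital_series[("p1", "heart_rate")]]
```

    The output is one time-ordered series per patient and vital sign, which is the shape a later pattern-search step would need.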

    Although pre-analytical tools are commonly used in

    the conventional approach, they are typically part of

    the rigid structure that must be torn down and rebuilt

    as inquiries change. Because such work is resource-

    intensive, only a limited number of such tools can be

    built, severely hampering an organization's ability to

    make full use of its data. By contrast, the pre-analytics

    in the Cloud Analytics Reference Architecture are

    designed for use with the data lake, and so are not

    part of a custom-built structure. They are both flexible

    and reusable, giving organizations almost endless

    windows into their data. Moreover, they are designed to

    be interoperable from the moment they come online,

    creating a set of easily shared services for all users of

    the data.

    Finding Connections and Patterns

    Once the data has been prepared by the pre-analytics,

    the search for knowledge and insight can begin. As with

    the other elements of the Cloud Analytics Reference

    Architecture, computer analytics are used in an entirely

    new way (see Exhibit 5). Two key types of analytics are:

    • Ad hoc queries. These are the analytics that ask

    questions of the data. While in the conventional

    approach the analytics are part of the narrow,

    custom-built structure, here they are free to pursue

    any line of inquiry.

    • Machine learning. This is the search for patterns.

    Because all of the data is available at once, and

    because there is no need to hypothesize in advance

    what patterns might exist, these analytics can look

    for patterns that emerge anywhere across the data.

    Giving Computers More Work

    A key feature of the Cloud Analytics Reference

    Architecture is that it allows computers to take over

    much of the work humans are doing now. Conventional

    methods require that people play a large role in

    processing the data, including selecting samples to be

    analyzed, creating data structures, posing hypotheses,

    and sifting through and refining results. That intense

    level of effort may be workable for small amounts

    of data, but no organization has the personnel or

    resources to use such methods to process big data.

    The Cloud Analytics Reference Architecture solves

    this problem by giving a great deal of that work to

    the computers, particularly tasks that are repetitive

    and computationally intensive. This reduces human

    error, and substantially speeds up the work. When we

    use the Reference Architecture to pose more intuitive

    questions, or to find patterns, we are essentially asking

    the computer to take us as close as it can to finding

    the answers we want. It is then up to us, using our

    cognitive skills, to find meaning in those answers.

    Exhibit 5 | Analytics and Services Architectural Model

    Source: Booz Allen Hamilton

    By separating out what the computer can do (the

    analytics) and what only people can do (the actual

    analysis), the Cloud Analytics Reference Architecture

    greatly eases the human workload. It is a division

    of labor that frees subject matter experts to look at

    the larger picture. At the same time, the Reference

    Architecture rapidly highlights areas that analysts

    should not waste their time exploring, enabling them

    to focus their resources in the right direction.

    For example, agencies that investigate consumer

    complaints against financial institutions often do not

    know which individual complaints are indicative of a

    broader pattern of consumer abuse, and so deserve

    the most attention. Investigators rarely have the time

    to sort through the vast array of sources that might

    provide valuable clues, such as blogs and social media

    sites where consumers commonly air their grievances.

    With a data lake that included all such available

    information, the Reference Architecture's analytics

    could quickly identify patterns, such as consumer abuse

    affecting large numbers of people. Investigators could

    then focus their resources on the most serious cases.

    Security of the Data

    In its ability to integrate disparate data sources,

    the Cloud Analytics Reference Architecture makes it

    possible for organizations to easily share information,

    confident that security, privacy and other rules

    governing the data will be strictly maintained. With

    the conventional approach, the primary obstacle to

    information-sharing is not technology, but rather the

    concern that secure information will be compromised.

    Investigative agencies, for example, worry that

    confidential sources will be inadvertently revealed.

    Hospitals and doctors are concerned that patient

    privacy will be violated.

    Those concerns go away with the Cloud Analytics

    Reference Architecture. As information is put into

    the data lake, the relevant restrictions, authorities,

    and security issues are tagged. All or portions of

    documents are tagged as well, indicating the security

    and privacy levels of specific information. Using these

    tags, organizations can establish rules regarding which

    information can be shared, with whom, and under

    what circumstances. If new information agreements

    are instituted, organizations do not need to re-tag the

    data; they simply change the rules regarding the tags

    already in place.
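
    The separation between tags (fixed at ingest) and rules (changeable later) can be illustrated with a short Python sketch. The tag vocabulary, the requester name, the rule format, and the function visible_records are all hypothetical and much simpler than anything a real security model would require.

```python
# Hypothetical records, tagged once as they enter the data lake.
records = [
    {"id": 1, "content": "case summary",
     "security_tags": {"agency:A", "sensitivity:low"}},
    {"id": 2, "content": "confidential source name",
     "security_tags": {"agency:A", "sensitivity:high"}},
]


def visible_records(records, rules, requester):
    """Return the records whose every tag is permitted for this requester."""
    allowed = rules.get(requester, set())
    return [record for record in records if record["security_tags"] <= allowed]


# Original sharing agreement: the partner analyst may see low-sensitivity items only.
rules_v1 = {"agency_b_analyst": {"agency:A", "sensitivity:low"}}

# New agreement: only the rules change; the tags on the records stay as they were.
rules_v2 = {"agency_b_analyst": {"agency:A", "sensitivity:low", "sensitivity:high"}}

before = [record["id"] for record in visible_records(records, rules_v1, "agency_b_analyst")]
after = [record["id"] for record in visible_records(records, rules_v2, "agency_b_analyst")]
```

    Under the first rule set the analyst sees only record 1; under the second, both records, with no change to the stored data. A requester with no rule entry sees nothing.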

    The security of data in the Cloud Analytics Reference

    Architecture has been proven to work in very secure

    environments within the US government, where the

    highest levels of precision in security and privacy

    are required.

    Visualization and Interaction

    Decision makers may be understandably concerned that

    big data will be overwhelming, and lead to information

    overload. Quite the opposite is true. The Cloud Analytics

    Reference Architecture addresses the issue head-on

    by incorporating the visualization (how the knowledge

    is presented to us) into the analytics from the outset.

    That is, the analytics not only conduct the inquiries,

    they help contextualize and focus the results.

    This enables analysts to more easily make sense of the

    information, to frame better, more intuitive inquiries,

    and to gain deeper insights. Building the visualization

    into the analytics has another advantage: it provides

    the ability for quick and effective feedback between the

    two layers, so that the presentation of the findings can

    be continually refined for the decision maker.

    The visualization tools also make it possible for

    different organizations to tailor how they see the same

    data. For example, two agencies of the Department

    of Homeland Security, Immigration and Customs

    Enforcement (ICE) and Customs and Border Protection

    (CBP), may want to visualize certain data in their

    own way. ICE, which has an investigative focus, might

    prefer that the visualization show how individuals are

    connected to one another. CBP, which is interdiction-

    focused, may want the same data displayed

    geographically and temporally, to understand where

    and when most activity occurs. The Cloud Analytics

    Reference Architecture easily accommodates both

    views of the data, and any number of others, as

    illustrated in Exhibit 6.
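
    The idea of two organizations viewing the same data differently can be sketched in Python as two functions over one shared set of records. The event fields, sample values, and the functions link_view and place_time_view are hypothetical; they stand in for whatever visualization tooling an agency would actually use.

```python
from collections import Counter, defaultdict

# Hypothetical shared events; both views below read the same list.
events = [
    {"people": ("P1", "P2"), "location": "Port X", "hour": 2},
    {"people": ("P2", "P3"), "location": "Port X", "hour": 2},
    {"people": ("P1", "P3"), "location": "Port Y", "hour": 14},
]


def link_view(events):
    """Investigative view: who is connected to whom."""
    links = defaultdict(set)
    for event in events:
        first, second = event["people"]
        links[first].add(second)
        links[second].add(first)
    return {person: sorted(others) for person, others in links.items()}


def place_time_view(events):
    """Interdiction view: how much activity occurs at each place and hour."""
    return Counter((event["location"], event["hour"]) for event in events)


connections = link_view(events)
activity = place_time_view(events)
busiest = activity.most_common(1)[0]
```

    Neither view alters or copies the underlying events, so further views can be added without restructuring the data.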

    Another important breakthrough is that analysts,

    or subject matter experts, can explore the data

    without the need for computer experts to serve as

    intermediaries. Because of the high level of computer

    expertise needed to design custom data storage

    structures and analytics, much of the analysis in the

    conventional approach is conducted by computer

    scientists, computer engineers, and mathematicians

    acting as agents for the subject matter experts.

    They are typically the ones who translate the overall

    goals of the business and government analysts into

    the language of the machine. Whenever there is a

    middleman in any field, things tend to get lost in the

    translation, and data analysis is no exception. Here, it

    leads to a disconnect between the people who need

    knowledge and insight (the subject matter experts) and

    the data itself. It also substantially slows the process.

    In the top layers of the Cloud Analytics Reference

    Architecture, the middleman syndrome disappears.

    The ability to ask intuitive questions, and to look for

    patterns, provides the analysts with direct access to

    the data. That gives them the flexibility they need to

    experiment and explore, and allows the system to reach

    maximum velocity. The computer scientists, computer

    engineers and mathematicians still play a key role, but

    now are no longer the ones who drive the inquiries into

    the data.

    For example, investigators who suspect that credit

    card fraud may be occurring are often hampered by

    the need to go through computer experts to query the

    data. Their request may be one of many, and by the

    time they get back the information they need to act,

    the criminals have often made large purchases on

    the credit cards. With the Cloud Analytics Reference

    Architecture, however, investigators could query the

    data themselves, quickly pinpoint the fraud, and take

    action in time to stop the activity. Subject matter

    experts in other fields, such as financial analysts,

    medical researchers and policy experts, can have

    similar direct access to the data.
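
    As a rough sketch of what such a self-service query might look like, the snippet below flags cards whose spending within a short time window exceeds a limit. The rule, the threshold values, the record fields (`card`, `amount`, `minute`), and the function name are all assumptions made for illustration; the source does not describe a specific fraud rule or query interface.

```python
from collections import defaultdict

# Hypothetical transaction records; fields and values are illustrative only.
transactions = [
    {"card": "c1", "amount": 40.0, "minute": 0},
    {"card": "c1", "amount": 950.0, "minute": 5},
    {"card": "c1", "amount": 1200.0, "minute": 9},
    {"card": "c2", "amount": 25.0, "minute": 3},
]


def cards_with_spend_burst(records, window_minutes=10, limit=1000.0):
    """Return cards whose total spend inside any sliding window exceeds limit."""
    by_card = defaultdict(list)
    for r in records:
        by_card[r["card"]].append(r)

    flagged = set()
    for card, rows in by_card.items():
        rows.sort(key=lambda r: r["minute"])
        start = 0      # index of the oldest transaction still in the window
        total = 0.0    # running spend inside the current window
        for row in rows:
            total += row["amount"]
            # Drop transactions that have fallen out of the time window.
            while row["minute"] - rows[start]["minute"] > window_minutes:
                total -= rows[start]["amount"]
                start += 1
            if total > limit:
                flagged.add(card)
                break
    return flagged


suspects = cards_with_spend_burst(transactions)
```

    An investigator could adjust `window_minutes` and `limit` and rerun the query immediately, rather than waiting for an intermediary to do so.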

    Exhibit 6 | Human Insights and Actions Architectural Model

    Source: Booz Allen Hamilton

    With the Cloud Analytics Reference Architecture,

    the flood of information is not overwhelming; it is

    readied for action as never before. This breakthrough

    in visualization could have as profound an effect on

    decision making as bar graphs and pie charts did in the

    1950s and 1960s, when statistics became widely used

    in business. Those visuals presented all the essential

    information at a glance, changing the nature of decision

    making. The Cloud Analytics Reference Architecture will

    do the same, but this time with big data.

    Conclusion

    The opportunities offered by the Cloud Analytics

    Reference Architecture will not emerge on their own;

    conscious effort and deliberate planning are needed.

    Unless organizations make the right infrastructure

    decisions, they cannot hope to build a data lake.

    Unless they make the right data management

    decisions, they will never break free from the rigid

    data and analytic structures that are so limiting. The

    Cloud Analytics Reference Architecture can be seen as

    a road map for that decision making, one that shows

    the importance of a holistic approach rather than a

    piecemeal, haphazard one. Each element is closely tied

    to each of the other elements, and so all must be

    considered together.

    The Cloud Analytics Reference Architecture is no more

    expensive to build than one based on the traditional

    approach, and is considerably more cost-effective in the

    long run. Because the elements of the Cloud Analytics

    Reference Architecture are largely reusable, they can

    scale an organization's big data in an affordable way.

    In this time of doing more with less, the Cloud

    Analytics Reference Architecture enables organizations

    to leverage the substantial investment the US

    government has already made in this area. Many of

    the same data challenges business and government

    organizations currently face are being successfully

    addressed by military and non-military agencies.

    Organizations now have an opportunity to take

    advantage of the advanced technologies and best

    practices that have led to that success.

    It is impossible to harness big data with approaches and

    techniques designed for small data. But by reimagining

    how data can be stored, analyzed and visualized,

    the Cloud Analytics Reference Architecture gives

    organizations a powerful tool to solve their most complex

    problems, and drive mission and business success.

    See our ideas in action at boozallen.com/cloud.

    About Booz Allen Hamilton

    Booz Allen Hamilton has been at the forefront of

    strategy and technology consulting for nearly a century.

    Today, Booz Allen Hamilton is a leading provider of

    management and technology consulting services to

    the US and international governments in defense,

    intelligence, and civil sectors, and to major corporations,

    institutions, and not-for-profit organizations. In the

    commercial sector, the firm focuses on leveraging its

    existing expertise for clients in the financial services,

    healthcare, and energy markets, and for international

    clients in the Middle East. Booz Allen Hamilton offers

    clients deep functional knowledge spanning strategy and

    organization, engineering and operations, technology,

    and analytics, which it combines with specialized

    expertise in clients' mission and domain areas to help

    solve their toughest problems.

    The firm's management consulting heritage is the

    basis for its unique collaborative culture and operating

    model, enabling Booz Allen Hamilton to anticipate

    needs and opportunities, rapidly deploy talent and

    resources, and deliver enduring results. By combining

    a consultant's problem-solving orientation with deep

    technical knowledge and strong execution, Booz

    Allen Hamilton helps clients achieve success in their

    most critical missions, as evidenced by the firm's

    many client relationships that span decades. Booz

    Allen Hamilton helps shape thinking and prepare for

    future developments in areas of national importance,

    including cybersecurity, homeland security, healthcare,

    and information technology.

    Booz Allen Hamilton is headquartered in McLean,

    Virginia, employs approximately 25,000 people, and had

    revenue of $5.86 billion for the 12 months ended March

    31, 2012. Fortune has named Booz Allen Hamilton

    one of its "100 Best Companies to Work For" for eight

    consecutive years. Working Mother has ranked the firm

    among its "100 Best Companies for Working Mothers"

    annually since 1999. More information is available at

    www.boozallen.com. (NYSE: BAH)

    Contacts

    Josh Sullivan, Ph.D.

    Vice President

    301-543-4611

    Jason Escaravage

    Principal

    703-902-5635

    Peter Guerra

    Senior Associate

    301-497-6754

    The most complete, recent list of offices and their addresses and telephone numbers can be found on

    www.boozallen.com

    Principal Offices

    Huntsville, Alabama

    Sierra Vista, Arizona

    Los Angeles, California

    San Diego, California

    San Francisco, California

    Colorado Springs, Colorado

    Denver, Colorado

    District of Columbia

    Orlando, Florida

    Pensacola, Florida

    Sarasota, Florida

    Tampa, Florida

    Atlanta, Georgia

    Honolulu, Hawaii

    O'Fallon, Illinois

    Indianapolis, Indiana

    Leavenworth, Kansas

    Aberdeen, Maryland

    Annapolis Junction, Maryland

    Hanover, Maryland

    Lexington Park, Maryland

    Linthicum, Maryland

    Rockville, Maryland

    Troy, Michigan

    Kansas City, Missouri

    Omaha, Nebraska

    Red Bank, New Jersey

    New York, New York

    Rome, New York

    Dayton, Ohio

    Philadelphia, Pennsylvania

    Charleston, South Carolina

    Houston, Texas

    San Antonio, Texas

    Abu Dhabi, United Arab Emirates

    Alexandria, Virginia

    Arlington, Virginia

    Chantilly, Virginia

    Charlottesville, Virginia

    Falls Church, Virginia

    Herndon, Virginia

    McLean, Virginia

    Norfolk, Virginia

    Stafford, Virginia

    Seattle, Washington