Data Center Health Checkup

  • 7/30/2019 Data Center Health Checkup

    Datacenter Health Check
    Terri-Lynn Thayer
    AVP/CIO, Computing & Information Services
    EDUCAUSE Enterprise Technology Conference, May 2007

    Copyright Terri-Lynn B. Thayer 2007

    This work is the intellectual property of the author. Permission is granted
    for this material to be shared for non-commercial, educational purposes,
    provided that this copyright statement appears on the reproduced materials
    and notice is given that the copying is by permission of the author. To
    disseminate otherwise or to republish requires written permission from the
    author.

    Presentation Overview

    Datacenter blood pressure is rising
    Pre-assessment work
    Reliability goals
    Assess existing space, power, cooling, fire suppression, & security
    Assessment results & options
    Recommendations for a renovation
    Key considerations in site selection

    Datacenter Blood Pressure is Rising

    Growing Demands
      Increased number of servers
      Increase in breadth of customer base: taking in previously distributed computing
    Technology Changes
      Storage growth
      Power/cooling needs: new boxes are smaller, but they are energy hogs and they are hot, hot, hot (10X power for a fully populated rack and 3-4X power to cool it down)
    Business Resumption Concerns
      24 x 7 demands
      Well-publicized disasters: 9/11, Katrina
      Many university datacenters today lack standby power generation
    Research Support
      National trend for R1s to take on more support centrally
    Our Datacenters are Old

    Pre-Assessment Work

    Capacity Planning and Growth Analysis
      Determine a planning horizon
      Identify services likely to be provided from your datacenter during that time period
    Business Continuity and Disaster Recovery Objectives
      How long can your University operate without a functioning datacenter?
      Do you have a cold or hot site?
      Should you consider a multiple datacenter approach?

    More Homework

    Research computing support decision is fundamental
    Review your insurance
      How much do you have?
      Other requirements of your insurer
    Get professional help with the assessment
      Involve your university facilities engineers
      Seek advice from outside professionals who are familiar with modern datacenter design and operation
    What cost/risk profile is your institution comfortable with?
      Reliability goals

    Numerical Ranking | Terminology | Summary Definition
    (1) | Unreliable | Shared building power and cooling; no generator
    (2) | Partially Isolated, Unreliable | Dedicated power system; shared cooling system; unconditioned power; non-redundant air conditioning; no generator
    (3) | Isolated Unreliable | Dedicated power and cooling systems; unconditioned power; non-redundant dedicated air conditioning units; no generator
    (4) | Isolated Conditioned | Dedicated power and cooling systems; conditioned power; non-redundant dedicated A/C units; no generator
    (5) | Isolated Improved | Dedicated power and cooling systems; uninterruptible power system; non-redundant dedicated A/C units; no generator
    (6) | Isolated, Mostly Reliable | Dedicated power and cooling systems; uninterruptible power system; redundant dedicated A/C units; no generator
    (7) | Reliable | Dedicated power and cooling systems; uninterruptible power system; redundant dedicated A/C units; generator
    (8) | Reliable Redundant | Dedicated power and cooling systems; redundant UPS systems; redundant dedicated A/C units; redundant generators
    (9) | Ultra-Reliable | Redundant power train; redundant cooling system; redundant UPS systems; redundant dedicated A/C units; redundant generator systems; redundant fuel system
    (10) | State of the Art | Redundant power train; redundant cooling system; redundant UPS systems; redundant dedicated A/C units; redundant generator systems; redundant fuel system; site hardened for weather and geographic exposures; location minimizes exposure to jurisdictional closure from hazardous spill, terrorism, or similar risks.

    BRUNS-PAK Data Center Reliability Ranking

    BRUNS-PAK 999 NEW DURHAM ROAD EDISON, NJ 08817

    (732) 248-4455 Fax: (732) 248-3644 http://www.bruns-pak.com
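The ranking table above can be read as a ladder in which each level adds a capability to the one below it. The following is a rough sketch of that reading, not BRUNS-PAK's own scoring method: the feature names, the strictly cumulative interpretation, and the rule that a UPS implies conditioned power are all assumptions made for illustration.

```python
from typing import Iterable, Set

# Hypothetical feature names; each rung lists what a site must add to
# reach that rank, assuming the table is strictly cumulative.
LADDER = [
    (2, {"dedicated_power"}),
    (3, {"dedicated_cooling"}),
    (4, {"conditioned_power"}),
    (5, {"ups"}),
    (6, {"redundant_ac"}),
    (7, {"generator"}),
    (8, {"redundant_ups", "redundant_generators"}),
    (9, {"redundant_power_train", "redundant_cooling", "redundant_fuel"}),
    (10, {"hardened_site"}),
]

# Assumption: the stronger capability covers the weaker one.
IMPLIES = {
    "ups": {"conditioned_power"},
    "redundant_ups": {"ups", "conditioned_power"},
    "redundant_generators": {"generator"},
}


def reliability_rank(features: Iterable[str]) -> int:
    """Return a 1-10 rank: the highest rung reached without skipping any."""
    present: Set[str] = set(features)
    for feature in list(present):
        present |= IMPLIES.get(feature, set())
    rank = 1  # shared building power and cooling, no generator
    for level, required in LADDER:
        if not required <= present:
            break
        rank = level
    return rank
```

Under these assumptions, a room with dedicated power and cooling, a UPS, and redundant A/C units but no generator comes out at rank 6, which matches the "Isolated, Mostly Reliable" row.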

    Datacenter Evaluation

    Space
    Electrical System
    Mechanical System
    Fire Protection System
    Security

    Space

    Square footage of conditioned space
    Raised floor
    Access
      Elevators
      Door size
    Machine room layout
      Furnishings, racks, command center
    Is the space expandable?
    Is this a multi-purpose facility?

    Electrical/Power Considerations

    Source and costs
    Patch Panel/Power Control Units/Power Distribution Units
    Standby power
      Uninterruptible Power Supply (UPS)
        Redundant/non-redundant
        Battery type (wet vs. dry), capacity, and monitoring
      Generator
        Type
    Power and cooling
      Are the systems expandable?
      Delicate balance
    Other
      Surge protection, lightning protection, grounding

    Mechanical Systems Evaluation

    Cooling & humidity control
      Chilled water: do you have a dedicated chiller?
      Computer Room Air Conditioner (CRAC) units: number and location
      Capacity and reliability
    Heat detectors
    Airflow distribution
    Water sensors
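When checking CRAC capacity, it can help to convert the electrical load in the room into a cooling requirement. The sketch below uses the standard conversions (1 W is about 3.412 BTU/h, and 1 ton of refrigeration is 12,000 BTU/h) and assumes that essentially all IT power ends up as heat. It ignores lighting, people, and building envelope loads, and the unit capacity and spare count are hypothetical inputs rather than figures from the presentation.

```python
import math

BTU_PER_HOUR_PER_WATT = 3.412      # 1 W of heat is about 3.412 BTU/h
BTU_PER_HOUR_PER_TON = 12_000.0    # 1 ton of refrigeration = 12,000 BTU/h


def cooling_tons(it_load_kw: float) -> float:
    """Cooling needed (tons) to remove the heat from it_load_kw of IT load."""
    btu_per_hour = it_load_kw * 1000.0 * BTU_PER_HOUR_PER_WATT
    return btu_per_hour / BTU_PER_HOUR_PER_TON


def crac_units_needed(it_load_kw: float, unit_capacity_tons: float,
                      spare_units: int = 1) -> int:
    """Units required to carry the load, plus spares for redundancy.

    spare_units=1 corresponds to a simple N+1 arrangement (an assumption).
    """
    base_units = math.ceil(cooling_tons(it_load_kw) / unit_capacity_tons)
    return base_units + spare_units
```

For example, a hypothetical 100 kW room works out to roughly 28.4 tons of cooling, so with 10-ton units it would need three units to carry the load and a fourth for N+1 redundancy.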

    Fire Protection Systems

    Detection
      Smoke alarms
      Heat detectors
      Air sampling
    Abatement
      Halon system (production banned in 1994)
      Full flooding clean agent system
        FM-200, NAF, Inergen, etc.
      Sprinkler system
        Wet
        Pre-action

    Security

    Physical access to the facility: elevators & doors
    Caged areas and visitors
    Multi-purpose facility
    Door access system
    Windows
    Monitoring
      Closed circuit TV

    Ancillary Services and Support

    Ancillary Services
      Tape storage
      Secure storage/staging
      Paper storage
      Test/setup lab
      Printers and print support
      General storage room
      Break room
    Other staffing and services which are provided from your datacenter
      Machine hosting and associated SLAs
      Others
        University departments
        Groups external to the University

    Assessment Results & Options

    Most of us will find that our datacenters are not adequate for the anticipated growth over the next five years
    Majority will identify power and cooling as the most significant issue
      Cooling and the power to cool will be the number one issue
      Space constraints will be the runner-up

    Options

    Renovate
    Build a new datacenter
    Both of the above
    Multi-datacenter campus
    Outsource or hosting

    Recommendations for Renovation

    Implement standby power generation capable of supporting both power and cooling
    Remove ancillary services from machine room and relocate to other spaces
    Trade-off between space and density is a complex issue
      High density racking results in significant heat and power provision issues
      It is generally cheaper to provide more space than to keep a small space with high density equipment adequately powered and cooled

    Improve Air Flow & Circulation

    Provide additional space between racks to promote air circulation
    Open up plenum space by relocating cabling to overhead trays
    Increase height of the raised floor if possible
    Consider new cooling solutions and rack technologies: everything old is new again
      Chilled water is far more efficient than cool air for heat removal
    Reconfigure the layout to implement a double hot aisle/cold aisle configuration
    Distribute high density racks

    Double Hot Aisle/Cold Aisle

    A hot aisle/cold aisle layout is one where cold air is segregated in front of
    equipment cabinets and hot exhaust air is expelled behind equipment
    cabinets. This layout eliminates the direct transfer of hot exhaust air from
    one system into the intake air of another system.
    Double
      A CRAC unit is located between two hot aisles

    Site Selection for a New Facility

    An opportunity to consider cost
    Corporate world has moved their datacenters, in some cases quite remote from the rest of their operation, which allows them to consider
      Power costs
      Real estate cost
      Labor costs

    Look for an Old Supermarket

    Single story
    Single use facility
    Slab
    Few windows
    Lots of open space around the building
    Loading docks and delivery truck access

    Other Considerations

    Voice and data connectivity
    Flooding and other weather-related issues
    If moving to a multi-datacenter approach as part of a business continuity plan, then consideration should be given to putting the two datacenters at a sufficient distance to reduce dependence on the same power grid and to minimize exposure to weather and other regional disasters

    Staffing Implications

    Data Center Managers will need to be more skilled in the area of environmental issues, engineering, and server technologies
    Facilities organizations may need to devote more time and specialization to cooling and power technologies related to the datacenter
    Managing data center renovation or build projects will be resource intensive and may result in downtime for key services

    Go Green

    Reduce energy costs (datacenter build may be more expensive)
    Legislation
    Environmental concerns and institutional plans to reduce carbon emissions
    Vendor products
      Rack and server cooling technologies
      CO2 for cooling, DC power systems
    Design Considerations
      Solar panels and wind energy
      Heat recycling

    Data Center

    A data center is a facility used to house computer systems and associated
    components, such as telecommunications and storage systems. It generally
    includes redundant or backup power supplies, redundant data communications
    connections, environmental controls (e.g., air conditioning, fire
    suppression) and security devices.

    Tier Level | Requirements
    1 | Single non-redundant distribution path serving the IT equipment; non-redundant capacity components; basic site infrastructure guaranteeing 99.671% availability
    2 | Fulfils all Tier 1 requirements; redundant site infrastructure capacity components guaranteeing 99.741% availability
    3 | Fulfils all Tier 1 & Tier 2 requirements; multiple independent distribution paths serving the IT equipment; all IT equipment must be dual-powered and fully compatible with the topology of a site's architecture; concurrently maintainable site infrastructure guaranteeing 99.982% availability
    4 | Fulfils all Tier 1, Tier 2 and Tier 3 requirements; all cooling equipment is independently dual-powered, including chillers and Heating, Ventilating and Air Conditioning (HVAC) systems; fault tolerant site infrastructure with electrical power storage and distribution facilities guaranteeing 99.995% availability
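The availability percentages in the tier table are easier to compare when expressed as downtime per year. A minimal sketch of that arithmetic follows, assuming a 365-day year; the function and constant names are illustrative, and the percentages are the ones quoted in the table.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, assuming a non-leap year

# Availability percentages as quoted in the tier table above.
TIER_AVAILABILITY_PCT = {1: 99.671, 2: 99.741, 3: 99.982, 4: 99.995}


def annual_downtime_hours(availability_pct: float) -> float:
    """Hours per year a site may be down at the given availability."""
    return (100.0 - availability_pct) / 100.0 * HOURS_PER_YEAR


def tier_downtime_table() -> dict:
    """Map each tier level to its implied annual downtime in hours."""
    return {tier: annual_downtime_hours(pct)
            for tier, pct in TIER_AVAILABILITY_PCT.items()}
```

On this arithmetic, Tier 1 allows about 28.8 hours of downtime a year, Tier 2 about 22.7 hours, Tier 3 about 1.6 hours, and Tier 4 about 0.44 hours (roughly 26 minutes).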

    Conclusion

    Our datacenters are under significant stress
    If we take a close look we will find that most of us will experience power and cooling problems in the near future. Cooling and the power to cool will be the most substantial issue we face.
    There are new technologies and best practices which will provide some relief
    Many of us will build new datacenters over the next five years and we should consider remote locations, outsourcing, and green IT solutions
    These projects will require significant financial resources as well as IT and Facilities staff time. We may need to employ new skill sets
    It is highly recommended that you engage professional assistance to evaluate your facility and to assist in renovation and new build designs

    Additional Resources

    www.stonesoup.org
      Past Meetings
        Spring 2006 meeting
          Data Center Futures Workshop Presentations
    http://www.stonesoup.org/