Information Technology Availability Center
Information Technology Availability Center
Daily Infrastructure Report
Tuesday, June 15, 2010
Quick Reference Card
Infrastructure Incidents
Data Network
UH Systems
CVN
Environmentals
IT Security
Network Availability Statistics
Core Routers
Distribution Switches
Access Switches
Secondary Data Center
PeopleSoft Systems
Critical Systems
Cullen Oaks
Bayou Oaks
Infrastructure Tests
Phone Systems
Voicemail Systems
Virtual Private Networking
SiteScope A & P Statistics
DNS Services
E-Mail Services
Web Tech Services
BAC Statistics
Secondary Data Center (Mail)
Exchange - Outlook Web Access
Digital FAX Services
Mail - Web Access
PeopleSoft Change Management
PeopleSoft Finance
PeopleSoft SA/HR
WebCT Vista
Blackboard Vista
Sophos Information
Sophos Pure Message
Tropical Weather Information
No Storms Present
Change Management
CMC Events
CMC Calendar
Infrastructure Incidents
Data Range: 11/16/09, 0600 – 11/17/09, 0600
No Incidents
Minor Service Affecting Incidents
Major Service Affecting Incidents
Data Network
11/16/2009, 0748 - 1017: Monitoring lost contact with multiple network switches at Cullen Oaks G building. A network analyst examined management configuration and was able to restore services. Inquiry # 153725 is closed.
UH Systems
No Incidents
Compressed Video Network
No Incidents
Environmentals
No Incidents
IT Security
No Incidents
Network Availability Statistics
Data Range: 11/16/09, 0000 - 2359
Network Availability
Availability 99.999% to 100%
Availability 80% to 99.998%
Availability 0% to 79.999%
Core Routers (High Impact): 100% availability
No Incidents
Distribution Switches (Medium Impact): 100% availability
No Incidents
Access Switches (Low Impact): 99.84% availability
11/05/2009, 1122 – Present: Monitoring lost contact with a wireless access point at the Cullen College of Engineering building. A telecommunications technician is waiting for an analyst to configure a replacement WAP so that service can be restored. Inquiry # 163151 is pending.
Secondary Data Center (SDC): 100% availability
No Incidents
PeopleSoft Systems (PSFT): 100% availability
No Incidents
Critical Systems (CRSYS): 100% availability
No Incidents
Cullen Oaks (UHCO): 98.44% availability
11/16/2009, 0748 - 1017: Monitoring lost contact with multiple network switches at Cullen Oaks G building. A network analyst examined management configuration and was able to restore services. Inquiry # 153725 is closed.
Bayou Oaks (UHBO): 100% availability
No Incidents
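The three-band availability legend used throughout this report can be sketched as a small lookup. This is an illustrative sketch only: the function name is hypothetical, and because the legend leaves a gap between 99.998% and 99.999%, the code assumes anything below 99.999% falls into the middle band.

```python
def availability_band(percent: float) -> str:
    """Map an availability percentage to one of the report's legend bands.

    Assumption: values between 99.998 and 99.999 (not covered by the
    legend) are treated as belonging to the middle band.
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError(f"availability must be within 0..100, got {percent}")
    if percent >= 99.999:
        return "99.999% to 100%"
    if percent >= 80.0:
        return "80% to 99.998%"
    return "0% to 79.999%"


if __name__ == "__main__":
    # Figures taken from the network availability section of the report.
    reported = {
        "Core Routers": 100.0,
        "Access Switches": 99.84,
        "Cullen Oaks": 98.44,
    }
    for name, pct in reported.items():
        print(f"{name}: {pct}% -> band {availability_band(pct)}")
```

With the figures reported for this period, only the Access Switches and Cullen Oaks groups drop out of the top band.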
Infrastructure Testing Results
Data Range: 11/16/09, 0600 – 11/17/09, 0600
No Incidents
Minor Service Affecting Incidents
Major Service Affecting Incidents
Phone Systems
No Incidents
Voicemail Systems
No Incidents
Virtual Private Networking
No Incidents
Business Availability Center Statistics
Data Range: 11/16/09, 0000 - 2359
BAC Availability
Availability 99.999% to 100%
Availability 80% to 99.998%
Availability 0% to 79.999%
BAC Performance
Minimum Response Time
Average Response Time
Maximum Response Time
Legend
Transaction
A series of steps that a customer performs in an application, whose availability we are monitoring.
Total transactions
Displays the total number of transactions run during the measured period
Failed transactions
Displays the total number of failed transactions for the measured period
Performance Degradation
Displays the percentage of service degradation for the measured period due to sporadic transaction failures
Monitor Locations
BPMUH1 – E-Cullen; BPMUH2 – Cougar Place; BPMUH3 - McElhinney; BPMDT1 – Downtown Campus; BPMVC1 –
Secondary Data Center (Mail)
Availability: 99.996%
Alerts sent: 0 (0 with critical severity)
Total transactions: 24189
Failed transactions: 1
Summary of Events
11/16/2009: BPMUH564 reported a failed Options transaction because an error occurred.
Digital Fax Service
Availability:
Alerts sent: 0 (0 with critical severity)
Total transactions: 144
Failed transactions: 0
Summary of Events
No Incidents
Mail - Web Access
Availability: 98.712%
Alerts sent: 1 (0 with critical severity)
Total transactions: 15921
Failed transactions: 205
Summary of Events
10/06/2009, 0000 - Present: Multiple sites reported failed Address Book transactions caused by "frame not found in browser/dialog" errors. These errors have recurred for several days and have not been addressed by administrators.
11/16/2009, 2009: BPMUH501 reported a failed URL Web Page transaction because the host could not be resolved.
PeopleSoft Change Management
Availability: 99.994%
Alerts sent: 0 (0 with critical severity)
Total transactions: 16122
Failed transactions: 1
Summary of Events
11/16/2009, 1526: BPMUH593 reported a failed Login transaction because an error occurred.
PeopleSoft Finance
Availability: 99.975%
Alerts sent: 2 (0 with critical severity)
Total transactions: 24192
Failed transactions: 6
Summary of Events
11/16/2009, 0920: BPMUHCL reported all failed transactions due to failure to connect to the server.
PeopleSoft Student Administration/Human Resources
Availability: 99.991%
Alerts sent: 0 (0 with critical severity)
Total transactions: 32246
Failed transactions: 3
Summary of Events
11/16/2009, 1500 – 1501: BPMUH534 and BPMUH542 reported failed Class Search Catalog transactions due to "connection reset by peer" errors.
WebCT Vista
Availability: 99.993%
Alerts sent: 0 (0 with critical severity)
Total transactions: 44316
Failed transactions: 3
Summary of Events
11/16/2009, 1238, 1349, and 1438: BPMUHCL and BPMUHD reported failed URL Web Page transactions due to step download timeout errors.
Blackboard Vista
Availability: 99.831%
Alerts sent: 31 (31 with critical severity)
Total transactions: 43691
Failed transactions: 74
Summary of Events
11/16/2009, 0500 – 0554: All sites reported failed UH WebCT Page and Help transactions due to internal server errors and not found errors.
11/16/2009, 1227 and 1802 - 1812: BPMUH551, BPMUH543, BPMUHV, BPMUH553, BPMUH529, and BPMUHCR reported failed Login transactions due to step download timeout errors.
11/16/2009, 1803: BPMUH596 reported a failed URL Web Page transaction due to a step download timeout error.
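The BAC availability figures above are consistent with a simple ratio of successful to total transactions, rounded to three decimal places. The sketch below shows that arithmetic; the function name is hypothetical, and the formula is inferred from the reported numbers rather than taken from the monitoring product's documentation.

```python
def availability_percent(total: int, failed: int) -> float:
    """Return availability as the percentage of transactions that succeeded.

    Assumption: availability = (total - failed) / total * 100, rounded to
    three decimals, which matches the figures printed in this report.
    """
    if total <= 0:
        raise ValueError("total transactions must be positive")
    if not 0 <= failed <= total:
        raise ValueError("failed transactions must be between 0 and total")
    return round(100.0 * (total - failed) / total, 3)


if __name__ == "__main__":
    # (service, total transactions, failed transactions) from the report.
    services = [
        ("Secondary Data Center (Mail)", 24189, 1),
        ("Mail - Web Access", 15921, 205),
        ("PeopleSoft Finance", 24192, 6),
        ("Blackboard Vista", 43691, 74),
    ]
    for name, total, failed in services:
        print(f"{name}: {availability_percent(total, failed)}%")
```

Applied to the totals listed for each service, this reproduces the reported percentages, for example 205 failures out of 15921 transactions for Mail - Web Access.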
SiteScope Availability & Performance Statistics
Data Range: 11/16/09, 0000 - 2359
Terms and Definitions
Uptime %: Percent of successful transactions within 24 hours.
Error %: Percent of transactions exceeding the error threshold within 24 hours. (Counts against Uptime %)
Warning %: Percent of transactions exceeding the warning threshold within 24 hours. (Does not count against Uptime %)
Average: The average amount of time for transactions to complete (24-hour average).
Max / Peak: Largest single instance of a transaction in a 24-hour period.
RTT: Short for “Round Trip Time”: the amount of time for a transaction to be sent, processed, and returned to sender.
Availability: The time that any specific service is operational and “available” to customers.
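As a rough illustration of how these terms relate, the sketch below derives Uptime %, Error %, Warning %, Average, and Peak from a list of round trip times. It is not SiteScope's actual calculation: the function name and sample values are hypothetical, and it assumes a sample is an error when its RTT reaches the error threshold and a warning when it reaches the warning threshold without reaching the error threshold.

```python
from typing import Dict, Iterable


def summarize_rtts(rtts: Iterable[float], warning_s: float, error_s: float) -> Dict[str, float]:
    """Summarize round trip times (in seconds) against two thresholds.

    Assumptions (not confirmed by the report):
      - rtt >= error_s counts as an error and reduces uptime;
      - warning_s <= rtt < error_s counts as a warning and does not.
    """
    samples = list(rtts)
    if not samples:
        raise ValueError("at least one sample is required")
    n = len(samples)
    errors = sum(1 for r in samples if r >= error_s)
    warnings = sum(1 for r in samples if warning_s <= r < error_s)
    return {
        "uptime_pct": 100.0 * (n - errors) / n,
        "error_pct": 100.0 * errors / n,
        "warning_pct": 100.0 * warnings / n,
        "average": sum(samples) / n,
        "peak": max(samples),
    }


if __name__ == "__main__":
    # Hypothetical samples; thresholds mirror a 1 s warning / 2 s error setup.
    print(summarize_rtts([0.2, 0.3, 1.2, 2.5], warning_s=1.0, error_s=2.0))
```

Keeping warnings separate from errors reflects the definitions above: only errors count against Uptime %.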
Domain Name Services
DNS Availability
Availability 99.999% to 100%
Availability 80% to 99.998%
Availability 0% to 79.999%
Warning Threshold <= 1 Second
Error Threshold >= 2 Seconds
Availability
Name Uptime % Error % Warning %
DNS - ACACIA.cc.uh.edu (KUHT) 100 0 0
DNS - CEDAR.cc.uh.edu (CC) 100 0 0
DNS - FIR.cc.uh.edu (PGH) 100 0 0
DNS - PONG.uh.edu (Dallas) 100 0 0
DNS - Post-Office.cougarnet.uh.edu 100 0 0
DNS - WALNUT.cc.uh.edu (CC) 100 0 0
DNS - ERDC1.er.uh.edu (CC) 100 0 0
DNS - ERDC3.er.uh.edu (MCPB) 100 0 0
DNS - ERDC5.er.uh.edu (PGH) 100 0 0
DNS - ERDC8.er.uh.edu (Dallas) 100 0 0
DNS Performance
Round Trip Time 0 - .999 Seconds
Round Trip Time 1 – 1.499 Seconds
Round Trip Time 1.5 – 1.999 Seconds
RTT more than 2 Seconds - See Availability
Performance
Name Measurement Max Avg
DNS - ACACIA.cc.uh.edu (KUHT) round trip time 0.91 sec 0.22 sec
DNS - CEDAR.cc.uh.edu (CC) round trip time 0.91 sec 0.24 sec
DNS - FIR.cc.uh.edu (PGH) round trip time 1.11 sec 0.23 sec
DNS - PONG.uh.edu (Dallas) round trip time 1.06 sec 0.27 sec
DNS - Post-Office.uh.edu round trip time 0.95 sec 0.21 sec
DNS - WALNUT.cc.uh.edu (CC) round trip time 0.91 sec 0.23 sec
DNS - ERDC1.er.uh.edu (CC) round trip time 1.08 sec 0.21 sec
DNS - ERDC3.er.uh.edu (MCPB) round trip time 1.3 sec 0.26 sec
DNS - ERDC5.er.uh.edu (PGH) round trip time 0.91 sec 0.22 sec
DNS - ERDC8.er.uh.edu (Dallas) round trip time 0.92 sec 0.27 sec
Summary of Events
No Incidents
Electronic Mail Services
E-Mail Availability
Availability 99.999% to 100%
Availability 80% to 99.998%
Availability 0% to 79.999%
Warning Threshold <= 900 Seconds
Error Threshold >= 3600 Seconds
Availability
Name Uptime % Error % Warning %
post-office.uh.edu to Exchange 99.61 0.39 0
post-office.uh.edu to mail.uh.edu DISABLED DISABLED DISABLED
Ping - piranha 100 0 0
Ping - capano 100 0 0
Ping - comal 100 0 0
Ping - pecos 100 0 0
Ping - paluxy 100 0 0
E-Mail Performance
Round Trip Time 0 - 300 Seconds
Round Trip Time 301 – 2500 Seconds
Round Trip Time 2501 – 3600 Seconds
RTT more than 3600 Seconds - See Availability
Performance
Name Measurement Peak Average
post-office.uh.edu to Exchange round trip time 275.69 sec 45.34 sec
post-office.uh.edu to mail.uh.edu round trip time n/a n/a
Ping - piranha round trip time 0.01 sec 0.01 sec
Ping - capano round trip time 0.02 sec 0.01 sec
Ping - comal round trip time 0.02 sec 0.01 sec
Ping - pecos round trip time 0.02 sec 0.01 sec
Ping - paluxy round trip time 0.02 sec 0.01 sec
Summary of Events
11/16/09, 3:10 PM: Mail - Post-Office.uh.edu to Exchange send failed: unable to connect to server.
Web Technology Services
Web Services Availability
Availability 99.999% to 100%
Availability 80% to 99.998%
Availability 0% to 79.999%
Warning Threshold <= 0 Seconds
Error Threshold >= 7.5 Seconds
Availability
Name Uptime % Error % Warning %
Ping - Blogs.uh.edu 100 0 0
Ping - CMS.uh.edu 100 0 0
Ping - Morpheus.matrix.uh.edu 100 0 0
Ping - Neo.matrix.uh.edu 100 0 0
Ping - Oracle.matrix.uh.edu DISABLED DISABLED DISABLED
Ping - Search.uh.edu 100 0 0
URL - Cinco Ranch (cincoranch.uh.edu) 100 0 0
URL - Sugar Land (sugarland.uh.edu) 100 0 0
URL - Blogs (blogs.uh.edu/evolvinguh) 100 0 0
URL - Information Technology (uh.edu/infotech) 100 0 0
URL - SDC (sdc-ws.uh.edu) 100 0 0
URL - Search (search.uh.edu) 100 0 0
URL - UH (www.uh.edu) 100 0 0
URL - UHSA (uhsa.uh.edu) 100 0 0
URL - Weather (uh.edu/weather) 100 0 0
URL - CMS (cms.uh.edu) 100 0 0
Apache Counters 100 0 0
Web Services Performance
Round Trip Time 0 - 2 Seconds
Round Trip Time 2 – 3 Seconds
Round Trip Time 4 – 5 Seconds
RTT more than 5 Seconds - See Availability
Performance
Name Measurement Max Avg
Ping - Blogs.uh.edu round trip time 0.01 sec 0.01 sec
Ping - CMS.uh.edu round trip time 0.02 sec 0.01 sec
Ping - Morpheus.matrix.uh.edu round trip time 0.02 sec 0.01 sec
Ping - Neo.matrix.uh.edu round trip time 0.02 sec 0.01 sec
Ping - Oracle.matrix.uh.edu round trip time n/a n/a
Ping - Search.uh.edu round trip time 0.02 sec 0.01 sec
URL - Cinco Ranch (cincoranch.uh.edu) round trip time 0.64 sec 0.04 sec
URL - Sugar Land (sugarland.uh.edu) round trip time 0.7 sec 0.04 sec
URL - Blogs (blogs.uh.edu/evolvinguh) round trip time 3.08 sec 0.62 sec
URL - Information Technology (uh.edu/infotech) round trip time 1.39 sec 0.38 sec
URL - SDC (sdc-ws.uh.edu) round trip time 0.05 sec 0.02 sec
URL - Search (search.uh.edu) round trip time 0.84 sec 0.03 sec
URL - UH (www.uh.edu) round trip time 1.7 sec 0.99 sec
URL - UHSA (uhsa.uh.edu) round trip time 0.05 sec 0.01 sec
URL - Weather (uh.edu/weather) round trip time 0.97 sec 0.06 sec
URL - CMS (cms.uh.edu) round trip time 0.11 sec 0.06 sec
Apache Counters BusyWorkers 29 11.43
Apache Counters IdleWorkers 27 8.43
Summary of Events
No Incidents
Sophos Information
Data Range: Last 30 days
Legend
Virus: Messages flagged as containing a virus, or a virus and spam
Spam: Messages flagged as containing spam
Other: No spam or virus detected
Key
Flagged Virus
Flagged Spam
Unflagged Messages
Sophos (Last 7 Days)
Date/Time Messages Virus Spam Other % Flagged
11/10/2009 0:00 439792 315 46219 393258 10.58
11/11/2009 0:00 476775 25 40831 435919 8.57
11/12/2009 0:00 420035 34 45003 374998 10.72
11/13/2009 0:00 357806 247 42722 314837 12.01
11/14/2009 0:00 171931 21 38114 133796 22.18
11/15/2009 0:00 207758 11 31149 176598 15.00
11/16/2009 0:00 411917 64 41360 370493 10.06
Sophos Top 10 Virus Data (Yesterday)
Rule
W32/MyDoom-O
W32/Mabezat-B
SOPHOS_SAVI_FILE_ENCRYPTED
Troj/Agent-LNC
Mal/EncPk-LP
Mal/DownLnk-B
W32/Sality-AA
Mal/Basine-C
Mal/Zbot-P
Mal/ZipMal-B
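In the 7-day Sophos table, the Messages column equals Virus + Spam + Other, and % Flagged matches (Virus + Spam) / Messages rounded to two decimals. The sketch below shows that arithmetic; the function name is hypothetical and the formula is inferred from the table values rather than from Sophos documentation.

```python
def percent_flagged(virus: int, spam: int, other: int) -> float:
    """Return the share of messages flagged as virus or spam, in percent.

    Assumption: total messages = virus + spam + other, which holds for
    every row of the 7-day table in this report.
    """
    total = virus + spam + other
    if total <= 0:
        raise ValueError("at least one message is required")
    return round(100.0 * (virus + spam) / total, 2)


if __name__ == "__main__":
    # (date, virus, spam, other) rows copied from the 7-day table.
    rows = [
        ("11/10/2009", 315, 46219, 393258),
        ("11/14/2009", 21, 38114, 133796),
        ("11/16/2009", 64, 41360, 370493),
    ]
    for date, virus, spam, other in rows:
        print(f"{date}: {percent_flagged(virus, spam, other):.2f}% flagged")
```

The weekend rows show a higher flagged share mainly because total message volume drops while spam volume stays roughly level.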
Sophos PureMessage (Last 30 Days)
[Chart: Messages (0 to 700000) by Date, 18-Oct-2009 to 15-Nov-2009. Series: Other, Spam, Virus, Total, with linear trend lines for Virus, Spam, and Other.]
Tropical Weather Information
Data Range: June 1st through November 30th
Probability of Formation
Low - <20%
Medium – 20% - 50%
High - > 50%
Tropical Weather Outlook Discussion
Graphical Tropical Weather Outlook
Scheduled CMC Events
Data Range: Next 7 days
10/26/2009, 0700 – 01/08/2010, 1700: Install 400kVA UPS. Commence installation of the Liebert 400kVA Uninterruptible Power System (UPS), which replaces the removed 125kVA UPS. Impact is set to Very High if not done: the Computing Center will not be able to add equipment to the power infrastructure. Risk: to protect systems, they will need to be shut down during the power cutover.
POC: Savage, James D; Change No: 00003516; Status: Approved
11/17/2009, 0500 - 11/18/2009, 0800: VMware Virtual Center upgrade to vSphere 4 vCenter. Upgrade VMware Virtual Center to the new version, vCenter. This work is rescheduled from an earlier item, and applying the upgrade is non-service-affecting. The centralized enterprise virtual server management console won't be available, but systems and virtual hosts can still be managed via direct connection, and VMs won't be affected because the upgrade does not touch the ESX host systems.
POC: Frankfort, David; Change No: 00003548; Status: Approved
11/20/2009, 0400 - 0800: MS Patch install on Turning.cougarnet.uh.edu (prod). Install MS patches on Turning.cougarnet.uh.edu (prod).
POC: Gillit, Jody; Change No: 00003542; Status: Approved
11/20/2009, 0500 - 0800: Microsoft Updates Managed Servers. Microsoft monthly updates and security patches. Install and reboot.
POC: Frankfort, David; Change No: 00003560; Status: Pending
11/20/2009, 0500 - 0800: MKeo Microsoft Monthly Security Patches Part2. Urgency: High; the majority of patches are security related. Keo Microsoft Monthly Security Patches Part2. Category: Division of Labor. Servers: DES1, DES2, ECSSQL2, ECSCL1, ECSCL2, Lucerne, Geneva, QuestionMark1, QuestionMark2, TSSEPO2, Desktop2, Zurich, EcsEpo-test, Alertus, NetDoc. These servers, to be completed as part of the division of labor, do not have test environments and are being split over two weekends due to maintenance window constraints. Impact: Low.
POC: Michael, Keo; Change No: 00003529; Status: Approved
11/22/2009, 0600 - 1200: MS Patches for Data Warehouse (prod). Apply MS patches to the Data Warehouse Windows servers: Wahoo, Redfish, Flounder, Cobia.
POC: Gillit, Jody; Change No: 00003541; Status: Approved
11/22/2009, 0600 - 1200: MS Patch Install on PS Prod Servers. Install MS patch on servers: psuh1, psuh4, psuh5, psuh6, psuh7, psuh11, psweb1, acauddev, psacauddev, psimgprd.
POC: Gillit, Jody; Change No: 00003540; Status: Approved
Change Management Calendar
June, 2010
Change Calendar
Sunday Monday Tuesday Wednesday Thursday

November 01
All Day - A : Open : Install 400kVA UPS
00:01 - A : Open : Exchange: Move mailboxes (no voicemail)
06:00 - 14:00 - A : Completed : MS Patches & RegEdits for Data Warehouse Prod
07:00 - 10:00 - A : Open : Cribbage Maint
07:00 - 14:00 - A : Open : RHEL Remedy Maint
08:00 - 09:00 - A : Completed : Move UC-UCU uplink
09:30 - 12:00 - A : Open : Replace memory in phoenix
22:00 - 23:40 - A : Completed : Adjust time zones for legacy ip telephony phones

November 02
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)
12:00 - 16:00 - A : Completed : Upgrade Raiser's Edge Software on Blackbaud
16:00 - 16:30 - A : Completed : Change Wireless Firewall NAT/PAT Assignments

November 03
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)

November 04
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)
18:00 - 18:30 - A : Completed : Reset TFTP service on CallManagers

November 05
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)

November 08
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)
21:00 - 23:00 - A : Open : Adjust time zones for legacy ip telephony phones

November 09
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)

November 10
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)

November 11
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)
20:00 - 22:00 - A : Open : AnchorPoint Data Restore
21:00 - 23:59 - A : Open : MS Sec Updates UHPCI systems

November 12
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)
04:00 - 04:15 - A : Open : Create Static ARP entry for UH Mail servers
05:00 - 06:00 - A : Open : Reset CRS Engine (ACD services)
12:00 - 15:00 - A : Open : MS Sec Updates UHS DCs
21:00 - 23:59 - A : Open : MS Sec Updates Cougarnet DCs
21:00 - 23:59 - A : Open : MS Sec updates ER DCs

November 15
All Day - A : Open : Install 400kVA UPS
04:00 - A : Open : Exchange: Move mailboxes (no voicemail)
04:00 - 08:00 - A : Open : MS Patch install on SYSDATA
06:00 - 12:00 - A : Open : MS Patches for Data Warehouse Tst/Dev
06:00 - 14:00 - A : Open : Apply RHEL Updates on Hershey Singularity System
06:00 - 14:00 - A : Open : PSUH1 - Extend c:\

November 16
All Day - A : Open : Install 400kVA UPS

November 17
All Day - A : Open : Install 400kVA UPS
05:00 - A : Open : VMware Virtual Center upgrade to vSphere 4 vCenter

November 18
All Day - A : Open : Install 400kVA UPS
08:00 - A : Open : VMware Virtual Center upgrade to vSphere 4 vCenter

November 19
All Day - A : Open : Install 400kVA UPS
Revised 12/01/2008