Information Technology Availability Center


Information Technology Availability Center

Daily Infrastructure Report

Tuesday, June 15, 2010

Quick Reference Card

Infrastructure Incidents

Data Network

UH Systems

CVN

Environmentals

IT Security

Network Availability Statistics

Core Routers

Distribution Switches

Access Switches

Secondary Data Center

PeopleSoft Systems

Critical Systems

Cullen Oaks

Bayou Oaks

Infrastructure Tests

Phone Systems

Voicemail Systems

Virtual Private Networking

SiteScope A & P Statistics

DNS Services

E-Mail Services

Web Tech Services

BAC Statistics

Secondary Data Center (Mail)

Exchange - Outlook Web Access

Digital FAX Services

Mail - Web Access

PeopleSoft Change Management

PeopleSoft Finance

PeopleSoft SA/HR

WebCT Vista

Blackboard Vista

Sophos Information

Sophos Pure Message

Tropical Weather Information

No Storms Present

Change Management

CMC Events

CMC Calendar

Infrastructure Incidents

Data Range: 11/16/09, 0600 – 11/17/09, 0600

No Incidents

Minor Service Affecting Incidents

Major Service Affecting Incidents

University of Houston Weather

IT Availability Center


Data Network

11/16/2009, 0748 - 1017: Monitoring lost contact with multiple network switches at Cullen Oaks G building. A network analyst examined management configuration and was able to restore services. Inquiry # 153725 is closed.

UH Systems: No Incidents

Compressed Video Network: No Incidents

Environmentals: No Incidents

IT Security: No Incidents

Network Availability Statistics

Data Range: 11/16/09, 0000 - 2359

Network Availability

Availability 99.999% to 100%

Availability 80% to 99.998%

Availability 0% to 79.999%


Core Routers (High Impact): 100% availability. No Incidents

Distribution Switches (Medium Impact): 100% availability. No Incidents

Access Switches (Low Impact): 99.84% availability. 11/05/2009, 1122 – Present: Monitoring lost contact with a wireless access point at the Cullen College of Engineering building. A telecommunication technician is waiting for an analyst to configure a new WAP to replace the current one and restore service. Inquiry # 163151 is pending.

Secondary Data Center (SDC): 100% availability. No Incidents

PeopleSoft Systems (PSFT): 100% availability. No Incidents

Critical Systems (CRSYS): 100% availability. No Incidents

Cullen Oaks (UHCO): 98.44% availability. 11/16/2009, 0748 - 1017: Monitoring lost contact with multiple network switches at Cullen Oaks G building. A network analyst examined management configuration and was able to restore services. Inquiry # 153725 is closed.

Bayou Oaks (UHBO): 100% availability. No Incidents

Infrastructure Testing Results

Data Range: 11/16/09, 0600 – 11/17/09, 0600

No Incidents

Minor Service Affecting Incidents

Major Service Affecting Incidents


Phone Systems: No Incidents

Voicemail Systems: No Incidents

Virtual Private Networking: No Incidents

Business Availability Center Statistics

Data Range: 11/16/09, 0000 - 2359

BAC Availability

Availability 99.999% to 100%

Availability 80% to 99.998%

Availability 0% to 79.999%

BAC Performance

Minimum Response Time

Average Response Time

Maximum Response Time

Legend

Transaction

A series of steps that a customer performs in an application, whose availability we are monitoring.

Total transactions

Displays the total number of transactions run during the measured period

Failed transactions

Displays the total number of failed transactions for the measured period

Performance Degradation

Displays the percentage of service degradation for the measured period due to sporadic transaction failures

Monitor Locations

BPMUH1 – E-Cullen; BPMUH2 – Cougar Place; BPMUH3 - McElhinney; BPMDT1 – Downtown Campus; BPMVC1 –

Secondary Data Center (Mail)

Availability: 99.996 %

Alerts sent: 0 (0 with critical severity)

Total transactions: 24189

Failed transactions: 1

Summary of Events

11/16/2009: BPMUH564 reported a failed Options transaction because an error occurred.

Digital Fax Service

Availability:

Alerts sent: 0 (0 with critical severity)

Total transactions: 144

Failed transactions: 0

Summary of Events

No Incidents

Mail - Web Access

Availability: 98.712 %

Alerts sent: 1 (0 with critical severity)

Total transactions: 15921

Failed transactions: 205

Summary of Events

10/06/2009, 0000 - Present: Multiple sites reported failed Address Book transactions. Frame not found in browser/dialog errors occurred. These repeated errors have been occurring for several days and have not been addressed by administrators.

11/16/2009, 2009: BPMUH501 reported a failed URL Web Page transaction because the host could not be resolved.

PeopleSoft Change Management

Availability: 99.994 %

Alerts sent: 0 (0 with critical severity)

Total transactions: 16122

Failed transactions: 1

Summary of Events

11/16/2009, 1526: BPMUH593 reported a failed Login transaction because an error occurred.

PeopleSoft Finance

Availability: 99.975 %

Alerts sent: 2 (0 with critical severity)

Total transactions: 24192

Failed transactions: 6

Summary of Events

11/16/2009, 0920: BPMUHCL reported all failed transactions due to failure to connect to the server.

PeopleSoft Student Administration/Human Resources

Availability: 99.991 %

Alerts sent: 0 (0 with critical severity)

Total transactions: 32246

Failed transactions: 3

Summary of Events

11/16/2009, 1500 – 1501: BPMUH534 and BPMUH542 reported failed Class Search Catalog transactions due to "connection was reset by the peer" errors.

WebCT Vista

Availability: 99.993 %

Alerts sent: 0 (0 with critical severity)

Total transactions: 44316

Failed transactions: 3

Summary of Events

11/16/2009, 1238, 1349, and 1438: BPMUHCL and BPMUHD reported failed URL Web Page transactions due to step download timeout errors.

Blackboard Vista

Availability: 99.831 %

Alerts sent: 31 (31 with critical severity)

Total transactions: 43691

Failed transactions: 74

Summary of Events

11/16/2009, 0500 – 0554: All sites reported failed UH WebCT Page and Help transactions due to internal server errors and not found errors.

11/16/2009, 1227 and 1802 - 1812: BPMUH551, BPMUH543, BPMUHV, BPMUH553, BPMUH529, and BPMUHCR reported failed Login transactions due to step download timeout errors.

11/16/2009, 1803: BPMUH596 reported a failed URL Web Page transaction due to a step download timeout error.
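The availability percentages reported for the BAC services above appear consistent with the share of successful transactions, rounded to three decimal places. The report does not state the formula, so the following is only a sketch under that assumption; the function names `availability_percent` and `availability_band` are hypothetical, and the band boundaries are taken from the BAC Availability legend.

```python
def availability_percent(total: int, failed: int) -> float:
    """Share of successful transactions, as a percentage rounded to 3 decimals.

    Assumption: availability = (total - failed) / total. The report does not
    state this formula; it merely reproduces the published figures.
    """
    if total <= 0:
        raise ValueError("total must be a positive transaction count")
    if not 0 <= failed <= total:
        raise ValueError("failed must be between 0 and total")
    return round(100.0 * (total - failed) / total, 3)


def availability_band(percent: float) -> str:
    """Map a percentage to one of the three ranges listed in the legend."""
    if percent >= 99.999:
        return "Availability 99.999% to 100%"
    if percent >= 80.0:
        return "Availability 80% to 99.998%"
    return "Availability 0% to 79.999%"


if __name__ == "__main__":
    # Figures copied from the report: (service, total, failed).
    services = [
        ("Secondary Data Center (Mail)", 24189, 1),
        ("Mail - Web Access", 15921, 205),
        ("Blackboard Vista", 43691, 74),
    ]
    for name, total, failed in services:
        pct = availability_percent(total, failed)
        print(f"{name}: {pct} % ({availability_band(pct)})")
```

Under this assumption, the Digital Fax Service (144 transactions, 0 failed) would fall in the top band, although the report leaves its availability value blank.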

SiteScope Availability & Performance Statistics

Data Range: 11/16/09, 0000 - 2359

Terms and Definitions

Uptime %: Percent of successful transactions within 24 hours.

Error %: Percent of transactions exceeding the error threshold within 24 hours. (Counts against Uptime %)

Warning %: Percent of transactions exceeding the warning threshold within 24 hours. (Does not count against Uptime %)

Average: The average amount of time for transactions to complete (24 Hour Average)

Max / Peak: Largest single instance of a transaction in a 24 hour period

RTT: Short for “Round Trip Time” – The amount of time for a transaction to be sent, processed, and returned to sender.

Availability: The time that any specific service is operational, and “available” to customers
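As a rough illustration of how the terms above relate, the sketch below derives Uptime %, Error %, Warning %, Average, and Peak from a list of round trip times. It is not SiteScope's actual calculation: the function `summarize`, the `MonitorStats` type, and the sample values are hypothetical, and it assumes a sample counts as an error when it meets or exceeds the error threshold and as a warning when it meets or exceeds the warning threshold without reaching the error threshold.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class MonitorStats:
    uptime_pct: float
    error_pct: float
    warning_pct: float
    average: float
    peak: float


def summarize(rtts: Sequence[float],
              warning_threshold: float,
              error_threshold: float) -> MonitorStats:
    """Summarize one monitor's round trip times (seconds) for a 24-hour window.

    Assumed rules (not confirmed by the report):
      - error:   rtt >= error_threshold (counts against uptime)
      - warning: warning_threshold <= rtt < error_threshold (does not)
    """
    if not rtts:
        raise ValueError("at least one sample is required")
    n = len(rtts)
    errors = sum(1 for r in rtts if r >= error_threshold)
    warnings = sum(1 for r in rtts if warning_threshold <= r < error_threshold)
    error_pct = round(100.0 * errors / n, 2)
    warning_pct = round(100.0 * warnings / n, 2)
    return MonitorStats(
        uptime_pct=round(100.0 - error_pct, 2),  # only errors reduce uptime
        error_pct=error_pct,
        warning_pct=warning_pct,
        average=round(sum(rtts) / n, 2),
        peak=max(rtts),
    )


if __name__ == "__main__":
    # Hypothetical samples, using 1-second and 2-second thresholds.
    samples = [0.2, 0.3, 1.2, 2.5]
    print(summarize(samples, warning_threshold=1.0, error_threshold=2.0))
```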

Domain Name Services

DNS Availability

Availability 99.999% to 100%

Availability 80% to 99.998%

Availability 0% to 79.999%

Warning Threshold <= 1 Second

Error Threshold >= 2 Seconds

Availability

Name                                 Uptime %  Error %  Warning %
DNS - ACACIA.cc.uh.edu (KUHT)        100       0        0
DNS - CEDAR.cc.uh.edu (CC)           100       0        0
DNS - FIR.cc.uh.edu (PGH)            100       0        0
DNS - PONG.uh.edu (Dallas)           100       0        0
DNS - Post-Office.cougarnet.uh.edu   100       0        0
DNS - WALNUT.cc.uh.edu (CC)          100       0        0
DNS - ERDC1.er.uh.edu (CC)           100       0        0
DNS - ERDC3.er.uh.edu (MCPB)         100       0        0
DNS - ERDC5.er.uh.edu (PGH)          100       0        0
DNS - ERDC8.er.uh.edu (Dallas)       100       0        0

DNS Performance

Round Trip Time 0 - .999 Seconds

Round Trip Time 1 – 1.499 Seconds

Round Trip Time 1.5 – 1.999 Seconds

RTT more than 2 Seconds - See Availability

Performance

Name                             Measurement      Max       Avg
DNS - ACACIA.cc.uh.edu (KUHT)    round trip time  0.91 sec  0.22 sec
DNS - CEDAR.cc.uh.edu (CC)       round trip time  0.91 sec  0.24 sec
DNS - FIR.cc.uh.edu (PGH)        round trip time  1.11 sec  0.23 sec
DNS - PONG.uh.edu (Dallas)       round trip time  1.06 sec  0.27 sec
DNS - Post-Office.uh.edu         round trip time  0.95 sec  0.21 sec
DNS - WALNUT.cc.uh.edu (CC)      round trip time  0.91 sec  0.23 sec
DNS - ERDC1.er.uh.edu (CC)       round trip time  1.08 sec  0.21 sec
DNS - ERDC3.er.uh.edu (MCPB)     round trip time  1.3 sec   0.26 sec
DNS - ERDC5.er.uh.edu (PGH)      round trip time  0.91 sec  0.22 sec
DNS - ERDC8.er.uh.edu (Dallas)   round trip time  0.92 sec  0.27 sec

Summary of Events

No Incidents

Electronic Mail Services

E-Mail Availability

Availability 99.999% to 100%

Availability 80% to 99.998%

Availability 0% to 79.999%

Warning Threshold <= 900 Seconds

Error Threshold >= 3600 Seconds

Availability

Name                               Uptime %  Error %   Warning %
post-office.uh.edu to Exchange     99.61     0.39      0
post-office.uh.edu to mail.uh.edu  DISABLED  DISABLED  DISABLED
Ping - piranha                     100       0         0
Ping - capano                      100       0         0
Ping - comal                       100       0         0
Ping - pecos                       100       0         0
Ping - paluxy                      100       0         0

E-Mail Performance

Round Trip Time 0 - 300 Seconds

Round Trip Time 301 – 2500 Seconds

Round Trip Time 2501 – 3600 Seconds

RTT more than 3600 Seconds - See Availability

Performance

Name                               Measurement      Peak        Average
post-office.uh.edu to Exchange     round trip time  275.69 sec  45.34 sec
post-office.uh.edu to mail.uh.edu  round trip time  n/a         n/a
Ping - piranha                     round trip time  0.01 sec    0.01 sec
Ping - capano                      round trip time  0.02 sec    0.01 sec
Ping - comal                       round trip time  0.02 sec    0.01 sec
Ping - pecos                       round trip time  0.02 sec    0.01 sec
Ping - paluxy                      round trip time  0.02 sec    0.01 sec

Summary of Events

11/16/09, 3:10 PM: Mail - Post-Office.uh.edu to Exchange: send failed: unable to connect to server

Web Technology Services

Web Services Availability

Availability 99.999% to 100%

Availability 80% to 99.998%

Availability 0% to 79.999%

Warning Threshold <= 0 Seconds

Error Threshold >= 7.5 Seconds

Availability

Name                                            Uptime %  Error %   Warning %
Ping - Blogs.uh.edu                             100       0         0
Ping - CMS.uh.edu                               100       0         0
Ping - Morpheus.matrix.uh.edu                   100       0         0
Ping - Neo.matrix.uh.edu                        100       0         0
Ping - Oracle.matrix.uh.edu                     DISABLED  DISABLED  DISABLED
Ping - Search.uh.edu                            100       0         0
URL - Cinco Ranch (cincoranch.uh.edu)           100       0         0
URL - Sugar Land (sugarland.uh.edu)             100       0         0
URL - Blogs (blogs.uh.edu/evolvinguh)           100       0         0
URL - Information Technology (uh.edu/infotech)  100       0         0
URL - SDC (sdc-ws.uh.edu)                       100       0         0
URL - Search (search.uh.edu)                    100       0         0
URL - UH (www.uh.edu)                           100       0         0
URL - UHSA (uhsa.uh.edu)                        100       0         0
URL - Weather (uh.edu/weather)                  100       0         0
URL - CMS (cms.uh.edu)                          100       0         0
Apache Counters                                 100       0         0

Web Services Performance

Round Trip Time 0 - 2 Seconds

Round Trip Time 2 – 3 Seconds

Round Trip Time 4 – 5 Seconds

RTT more than 5 Seconds - See Availability

Performance

Name                                            Measurement      Max       Avg
Ping - Blogs.uh.edu                             round trip time  0.01 sec  0.01 sec
Ping - CMS.uh.edu                               round trip time  0.02 sec  0.01 sec
Ping - Morpheus.matrix.uh.edu                   round trip time  0.02 sec  0.01 sec
Ping - Neo.matrix.uh.edu                        round trip time  0.02 sec  0.01 sec
Ping - Oracle.matrix.uh.edu                     round trip time  n/a       n/a
Ping - Search.uh.edu                            round trip time  0.02 sec  0.01 sec
URL - Cinco Ranch (cincoranch.uh.edu)           round trip time  0.64 sec  0.04 sec
URL - Sugar Land (sugarland.uh.edu)             round trip time  0.7 sec   0.04 sec
URL - Blogs (blogs.uh.edu/evolvinguh)           round trip time  3.08 sec  0.62 sec
URL - Information Technology (uh.edu/infotech)  round trip time  1.39 sec  0.38 sec
URL - SDC (sdc-ws.uh.edu)                       round trip time  0.05 sec  0.02 sec
URL - Search (search.uh.edu)                    round trip time  0.84 sec  0.03 sec
URL - UH (www.uh.edu)                           round trip time  1.7 sec   0.99 sec
URL - UHSA (uhsa.uh.edu)                        round trip time  0.05 sec  0.01 sec
URL - Weather (uh.edu/weather)                  round trip time  0.97 sec  0.06 sec
URL - CMS (cms.uh.edu)                          round trip time  0.11 sec  0.06 sec
Apache Counters                                 BusyWorkers      29        11.43
Apache Counters                                 IdleWorkers      27        8.43

Summary of Events

No Incidents

Sophos Information

Data Range: Last 30 days

Legend

Virus: Messages flagged as containing a virus, or a virus and spam

Spam: Messages flagged as containing spam

Other: No spam or virus detected

Key

Flagged Virus

Flagged Spam

Unflagged Messages


Sophos (Last 7 Days)

Date/Time        Messages  Virus  Spam   Other   % Flagged
11/10/2009 0:00  439792    315    46219  393258  10.58
11/11/2009 0:00  476775    25     40831  435919  8.57
11/12/2009 0:00  420035    34     45003  374998  10.72
11/13/2009 0:00  357806    247    42722  314837  12.01
11/14/2009 0:00  171931    21     38114  133796  22.18
11/15/2009 0:00  207758    11     31149  176598  15.00
11/16/2009 0:00  411917    64     41360  370493  10.06

Sophos Top 10 Virus Data (Yesterday)

Rule
W32/MyDoom-O
W32/Mabezat-B
SOPHOS_SAVI_FILE_ENCRYPTED
Troj/Agent-LNC
Mal/EncPk-LP
Mal/DownLnk-B
W32/Sality-AA
Mal/Basine-C
Mal/Zbot-P
Mal/ZipMal-B
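The % Flagged column in the Sophos (Last 7 Days) data above appears to be the virus and spam counts combined, divided by total messages, and the Other column appears to be the remainder. The report does not spell this out, so the following is a sketch under that assumption; the function names `percent_flagged` and `other_count` are hypothetical.

```python
def percent_flagged(messages: int, virus: int, spam: int) -> float:
    """Percentage of messages flagged as virus or spam, rounded to 2 decimals.

    Assumption: % Flagged = (virus + spam) / messages * 100.
    """
    if messages <= 0:
        raise ValueError("messages must be positive")
    return round(100.0 * (virus + spam) / messages, 2)


def other_count(messages: int, virus: int, spam: int) -> int:
    """Messages with no spam or virus detected (assumed to be the remainder)."""
    return messages - virus - spam


if __name__ == "__main__":
    # Rows copied from the report: (date, messages, virus, spam).
    rows = [
        ("11/10/2009", 439792, 315, 46219),
        ("11/14/2009", 171931, 21, 38114),
        ("11/16/2009", 411917, 64, 41360),
    ]
    for date, messages, virus, spam in rows:
        print(date, other_count(messages, virus, spam),
              percent_flagged(messages, virus, spam))
```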

Sophos PureMessage (Last 30 Days)

[Chart: messages per day by date, 18-Oct-2009 through 15-Nov-2009, on a scale of 0 to 700000. Series: Other, Spam, Virus, Total, with linear trend lines for Virus, Spam, and Other.]

Tropical Weather Information

Data Range: June 1st through November 30th

Probability of Formation

Low - <20%

Medium – 20% - 50%

High - > 50%

Tropical Weather Outlook Discussion


Graphical Tropical Weather Outlook

Scheduled CMC Events

Data Range: Next 7 days

10/26/2009, 0700 – 01/08/2010, 1700: Install 400kVA UPS. Commence installation of the Liebert 400kVA Uninterruptible Power System (UPS), which replaces the removed 125kVA UPS. Impact is set to Very High if not done; the Computing Center will not be able to add additional equipment to the power infrastructure. Risk: to protect systems, systems will be required to shut down during the cutover of power.
POC: Savage, James D
Change No: 00003516
Status: Approved

11/17/2009, 0500 - 11/18/2009, 0800: VMware Virtual Center upgrade to vSphere 4 vCenter. Upgrade VMware Virtual Center to the new version, vCenter. This work is rescheduled from an earlier item, and applying the upgrade is non-service-affecting. The centralized enterprise virtual server management console won't be available, but management of systems and virtual hosts will be available via direct connect, and VMs won't be affected because the upgrade does not touch the ESX host systems.
POC: Frankfort, David
Change No: 00003548
Status: Approved

11/20/2009, 0400 - 0800: MS Patch install on Turning.cougarnet.uh.edu (prod). Install MS patches on Turning.cougarnet.uh.edu (prod).
POC: Gillit, Jody
Change No: 00003542
Status: Approved

11/20/2009, 0500 - 0800: Microsoft Updates Managed Servers. Microsoft monthly updates and security patches. Install and reboot.
POC: Frankfort, David
Change No: 00003560
Status: Pending

11/20/2009, 0500 - 0800: MKeo Microsoft Monthly Security Patches Part2. Urgency: High - majority of patches are security related. Keo Microsoft Monthly Security Patches Part2. Category: Division of Labor. Servers: DES1, DES2, ECSSQL2, ECSCL1, ECSCL2, Lucerne, Geneva, QuestionMark1, QuestionMark2, TSSEPO2, Desktop2, Zurich, EcsEpo-test, Alertus, NetDoc. These servers, to be completed as part of the division of labor, do not have test environments and are being split over two weekends due to maintenance window constraints. Impact: Low.
POC: Michael, Keo
Change No: 00003529
Status: Approved

11/22/2009, 0600 - 1200: MS Patches for Data Warehouse (prod). Apply MS patches to the Data Warehouse Windows servers: Wahoo, Redfish, Flounder, Cobia.
POC: Gillit, Jody
Change No: 00003541
Status: Approved

11/22/2009, 0600 - 1200: MS Patch Install on PS Prod Servers. Install MS patch on servers: psuh1, psuh4, psuh5, psuh6, psuh7, psuh11, psweb1, acauddev, psacauddev, psimgprd.
POC: Gillit, Jody
Change No: 00003540
Status: Approved

Change Management Calendar

June, 2010 Change Calendar

Sunday, November 01
All Day - A : Open : Install 400kVA UPS
00:01 - A : Open : Exchange: Move mailboxes (no voicemail)
06:00 - 14:00 - A : Completed : MS Patches & RegEdits for Data Warehouse Prod
07:00 - 10:00 - A : Open : Cribbage Maint
07:00 - 14:00 - A : Open : RHEL Remedy Maint
08:00 - 09:00 - A : Completed : Move UC-UCU uplink
09:30 - 12:00 - A : Open : Replace memory in phoenix
22:00 - 23:40 - A : Completed : Adjust time zones for legacy ip telephony phones

Monday, November 02
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)
12:00 - 16:00 - A : Completed : Upgrade Raiser's Edge Software on Blackbaud
16:00 - 16:30 - A : Completed : Change Wireless Firewall NAT/PAT Assignments

Tuesday, November 03
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)

Wednesday, November 04
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)
18:00 - 18:30 - A : Completed : Reset TFTP service on CallManagers

Thursday, November 05
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)

Sunday, November 08
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)
21:00 - 23:00 - A : Open : Adjust time zones for legacy ip telephony phones

Monday, November 09
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)

Tuesday, November 10
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)

Wednesday, November 11
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)
20:00 - 22:00 - A : Open : AnchorPoint Data Restore
21:00 - 23:59 - A : Open : MS Sec Updates UHPCI systems

Thursday, November 12
All Day - A : Open : Install 400kVA UPS
All Day - A : Open : Exchange: Move mailboxes (no voicemail)
04:00 - 04:15 - A : Open : Create Static ARP entry for UH Mail servers
05:00 - 06:00 - A : Open : Reset CRS Engine (ACD services)
12:00 - 15:00 - A : Open : MS Sec Updates UHS DCs
21:00 - 23:59 - A : Open : MS Sec Updates Cougarnet DCs
21:00 - 23:59 - A : Open : MS Sec updates ER DCs

Sunday, November 15
All Day - A : Open : Install 400kVA UPS
04:00 - A : Open : Exchange: Move mailboxes (no voicemail)
04:00 - 08:00 - A : Open : MS Patch install on SYSDATA
06:00 - 12:00 - A : Open : MS Patches for Data Warehouse Tst/Dev
06:00 - 14:00 - A : Open : Apply RHEL Updates on Hershey Singularity System
06:00 - 14:00 - A : Open : PSUH1 - Extend c:\

Monday, November 16
All Day - A : Open : Install 400kVA UPS

Tuesday, November 17
All Day - A : Open : Install 400kVA UPS
05:00 - A : Open : Vmware Virtual Center upgrade to Vsphere 4 Vcenter

Wednesday, November 18
All Day - A : Open : Install 400kVA UPS
08:00 - A : Open : Vmware Virtual Center upgrade to Vsphere 4 Vcenter

Thursday, November 19
All Day - A : Open : Install 400kVA UPS