Systems And Operations


Page 1: Systems And Operations

Systems and Operations

CCDs Presentation

December 12, 2007

Page 2: Systems And Operations

James Bohnsack, Mike Garcia, Jason Lowe

Mark Bodenstein, Project Leader

Mainframe Systems Programming

Randy Smith, Peggy Roberts

Rick Polcaro, Manager

Facilities & Laser Printing Services

Moe Arif, Mary Cronk, Doug Flanagan, Solomon Welch, Tom Walden, David Shirk, Kent Ross, Glen Hoffman

Mariann Carpenter, Manager

Systems Administration

Scott Sorrentino, Mike Heisler, Jim Yang, John Wobus, Andrew Heath, Javier Streb, Vacant

Laurie Collinsworth, Manager

Systems Engineering

Bob Talda, David Beardsley

Paul Zarnowski, Manager

Storage Services

Michael Hojnowski, Manager

Special Projects

Brian Messenger, Assistant Director

Systems

Jim Howell, Senior Technical Lead

Jennifer Moore, Gail Shaff, George Medlar, Todd Olson, Vacant

Messaging Services

Rick Cochran, Michelle Mogil, Lee Brink, Dan Bartholomew, Joanne Button

Client Services

Don MacLeod, Assistant Director, Systems Services

Jim Conley, John Ryan, Brian Witchey

Juan Salomon, Team Leader

NOC First Shift

Mark Allen, Jay Howell, Lillian Isacks

Bryan Benning, Team Leader

NOC Second Shift

Ruth Burroughs, Carl Moravec, Theresa Norman, Maureen Quillinan

Chuck Thomas, Manager

Production Operations (Evening Shift)

Dan Miller, Rich Fraboni, Greg Marvin

Vacant, Manager

Production Operations Services

James Reed, John Becker, Barbara Van Etten

Ken Frost, Team Leader

NOC Third Shift

Jenny Signor, Technical Lead

Vicky Dean, Assistant Director

Operations

Rick MacDonald, Director

Systems and Operations

Page 3: Systems And Operations

Services:
• Server Farm
• Systems Administration
• Storage Farm
• EZ-Backup
• VMware (In Development)

Page 4: Systems And Operations

Server Farm

Manager: Rick Polcaro

Staff: Peggy Roberts

Randy Smith

Page 5: Systems And Operations

Server Farm

The CIT Server Farm provides a secure environment for housing departmental servers. There are more than 500 servers in the facility today. The service includes 24-hour system monitoring and reboot, network connectivity, and uninterruptible power with generator backup. Servers are mounted in CIT-provided racks. It is the customer’s responsibility to provide hardware and software maintenance.

Page 6: Systems And Operations

Server Farm History

The server farm began in CCC in the mid-1980s with a few Unix systems for data warehousing and research.

Our first departmental user was Dining. CISER soon followed.

Major CIT servers into the 1990s:
– Instruct machines
– EZ-Remote servers
– Postoffice machines
– P2K servers
– Library servers

Server farm moved from CCC to Rhodes Hall in 1997.

Page 7: Systems And Operations

Server Farm Facilities

Raised floor in Rhodes and CCC.

Emergency power and generator backups.

Standard electrical power consists of 120/208 VAC, with a 30-amp three-prong receptacle. Special requests are directly billed.

Physical security: access for authorized personnel only, with security cameras.

Basic support, via the NOC, for 24/7 monitoring and basic troubleshooting, rebooting, and system administrator notification.

Page 8: Systems And Operations

Server Farm Facilities

Costs:
– Space is charged by the “U”, an industry-standard measurement of 1.75 vertical inches in a 19” rack.
– The monthly server charge = (server “U’s” * $7.60)
– There will be a one-time installation charge of $118 per server.

The cost for network connection(s) and NUBB will be charged directly to the user’s University account by CIT NCS.
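The cost formula on this slide can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the rates come from the slide, while the function names and the sample server sizes are hypothetical, and network and NUBB charges are left out because CIT NCS bills them separately.

```python
from decimal import Decimal

RATE_PER_U = Decimal("7.60")   # monthly charge per rack unit (from the slide)
INSTALL_FEE = Decimal("118")   # one-time installation charge per server (from the slide)


def monthly_charge(server_units):
    """Monthly space charge for a list of server heights, each given in U."""
    return sum((Decimal(u) * RATE_PER_U for u in server_units), Decimal("0"))


def installation_charge(server_count):
    """One-time installation charge for a number of servers."""
    return INSTALL_FEE * server_count


# Hypothetical department with three servers: 1U, 2U, and 4U.
servers = [1, 2, 4]
print(monthly_charge(servers))            # 53.20
print(installation_charge(len(servers)))  # 354
```

Decimal is used instead of float so that currency amounts stay exact.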

Page 9: Systems And Operations

Server Farm Growth

Chart: server count (axis 0 to 600) by fiscal quarter, 4QFY00 through 4QFY07, plus 12/3/2007. Series: CIT Servers, Customer Servers, Total. The last data labels read 456 (CIT Servers) and 130 (Customer Servers).

Page 10: Systems And Operations

Server Farm Customers

Distribution (number of servers):

CIT Owned: 456
Lab of O: 17
Library: 29
Comp Sci: 14
CCE: 10
Others: 60 (representing 29 depts)
================================
Total: 586 (12/07)

Page 11: Systems And Operations

Questions

Information / Contact: Server Farm:

http://www.cit.cornell.edu/services/serverfarm [email protected]

???

Page 12: Systems And Operations

Systems Administration Support

Manager: Mariann Carpenter

Staff: Muhammad (Moe) Arif

Mary Cronk

Doug Flanagan

Glen Hoffman

Kent Ross

David Shirk

Thomas Walden

Solomon Welch

Page 13: Systems And Operations

Systems Support

Systems Support provides systems administration and systems programming support for projects that span the 400+ servers and 75 terabytes of storage located in our machine rooms.

The Systems Administration team provides the hardware and operating system support for CIT’s servers and storage assets.

Systems Engineering provides and enhances tools for our production servers and services, such as NetVigil, as well as specific support for the DNS/DHCP service and its ancillary systems.

Page 14: Systems And Operations

Client Systems Services

Systems Administration Services: Quote, OS Install, OS Maintenance, Backup Configuration, Userid Maintenance, Application setup assistance, Retirement

The following operating systems are currently supported:
– SUN Solaris
– IBM AIX (no new installations)
– Windows 2000 (no new installations)
– Windows 2003 (standard and enterprise editions)
– RedHat Linux Enterprise AS 4.0

Installation into the Server Farm facility.

Monitoring with 24/7 Support as required.

Page 15: Systems And Operations

Client Systems Activities

Routine Administration.

Coordination of hardware maintenance contracts.

Lifecycle management of servers, working with our customers.

Automation projects to leverage our staff investment.

Page 16: Systems And Operations

Systems Administration Demographics

We support the OS on more than 400 servers.

Distribution by Operating System:

74% Solaris

1% AIX

25% Wintel, Linux

Page 17: Systems And Operations

Questions ???

Information / Contact: Mariann Carpenter, 255-7707, [email protected]

Page 18: Systems And Operations

Storage Farm Service

CIT’s Storage Farm service provides storage capacity, connectivity, and management services to servers housed in CIT’s Server Farm and managed by CIT’s Systems Administration Support group.

Page 19: Systems And Operations

Storage Farm Staffing

• Manager of Storage Services
  – Paul Zarnowski

• Manager of Systems Administration
  – Mariann Carpenter

• Storage Engineers
  – David Shirk
  – Kent Ross

Page 20: Systems And Operations

Storage Farm Historical Growth

Charts: Storage Network Ports (axis 0 to 500) and Storage (TB) (axis 0 to 100), each plotted for Jul-05, Jul-06, and Jul-07.

Page 21: Systems And Operations

Storage Farm: Efficiencies

• Centralized Management
  – Small set of devices to support & maintain
  – Common management tools
  – Fewer resource baskets to manage

• Centralized Procurement Process
  – Fewer purchases (hardware, maintenance, etc.)
  – Streamlined process

• Higher Storage Utilization

Page 22: Systems And Operations

Storage Farm: Advantages

• Easy to Use:
  – Ease of storage allocation
  – Faster storage allocation
  – Maintenance handled by Storage Farm staff
  – Technology upgrades

• Financial Advantages:
  – No up-front capital expense or depreciation
  – Grow storage as needed (pay as you go)
  – Risk reduction
  – Lower overall costs (considering all components)

Page 23: Systems And Operations

Storage Farm: Components

• Storage Connectivity

• Storage Tiers

• Storage Management

Page 24: Systems And Operations

Storage Farm: Connectivity Options

Fibre Channel Storage Area Network (FC-SAN)
– Single-attached, to one fabric (red or green)
  • Redundancy for Storage Array connectivity only
– Dual-attached, for higher availability (red AND green)
  • Redundancy for both Array and Server connectivity

Diagram labels: HBA1, HBA2; SAN Fabric 1, SAN Fabric 2; Storage Array (Ctlr 1, Ctlr 2).

Page 25: Systems And Operations

Storage Farm Services: Storage Tiers

Tier 1
• Highest performance & availability
• Only tier providing mainframe connectivity
• Examples: High-usage operational databases

Tier 2
• Performance, cost & availability all have some level of importance
• Examples: average-usage databases & applications

Tier 3
• Lowest cost of online storage
• Performance & high availability less important
• Examples: Development & Test systems; moderate-activity file servers

Page 26: Systems And Operations

Storage Farm Services: Management Services

• Storage Network connection setup

• LUN* allocations & expansions

• LUN tier migration

• Performance & Capacity planning

• Storage device firmware upgrades

• Maintenance procedures

• 24x7 Health monitoring of storage subsystem (out-board from server’s FC HBA)

• Troubleshooting of storage hardware & software

• Procurement

• Life-cycle upgrades

*LUN = Logical Unit Number (aka logical disk)

Page 27: Systems And Operations

Recent Storage RFP: Goals (Spring, 2007)

• Improved Storage Management
  – Storage hot spots
  – LUN resizing effort
  – Reporting

• Cheaper Storage Connectivity
  – iSCSI* support

• Lower Storage Costs
  – Maintenance
  – “Right-placing” data
  – Decrease “head room” costs

*iSCSI = SCSI over TCP/IP

Page 28: Systems And Operations

Storage RFP: Process

• Storage RFP Review Committee:
  – Paul Zarnowski, David Shirk, Kent Ross, Don MacLeod, Tony Damiani, Ken Friedman, Mike Hojnowski, Brian Messenger

• 15 Vendor Proposals reviewed
• 8 On-site Vendor Presentations
• 4 Finalists
• Compellent vetted & selected

Page 29: Systems And Operations

Storage Center

• Stable performance
  – All I/O spread & balanced across many disk spindles
  – No more hot spots

• Reduce capital expenditures
  – Thin provisioning
  – Automated Tiered Storage

• Supports any open-systems server without agents
  – Connects via FC, iSCSI or both
  – Scales from 500 GB to over 300 TB

• Cut operating expenses
  – Disk virtualization simplifies administration
  – Eliminate the need for 3rd party software

• Enhanced data availability
  – Unlimited snapshots without full copies
  – Replication flexibility for low-cost DR

• Reduce server costs
  – Reliable boot from SAN

Scalable Enterprise-class SAN solution with block-level intelligence

Page 30: Systems And Operations

Storage Farm Roadmap

• SAN Fabric simplification
  – Elimination of older Brocade switches

• iSCSI connectivity
  – Initially deploy within Server Farm environment

• Dual path FC / iSCSI connectivity
  – Windows initially

• Improved performance
  – Higher spindle count

• Snapshot capabilities

Page 31: Systems And Operations

Potential Future Enhancements
Recovery Options

Distance and Technology:
• Local (Rhodes): Snapshot, using Data Instant Replay
• Campus (CCC): Synchronous, using Remote Instant Replay – Synchronous
• Wide Area (Weill): Asynchronous, using Remote Instant Replay – Asynchronous

Page 32: Systems And Operations

Storage Farm
Potential Future Enhancements

• Tier 4 storage service
  – Very large capacity
  – Low use
  – Low management
  – Long-term commitment
  – Low cost

• Network-Attached Storage (NAS) services
  – File-level storage (e.g., CIFS, NFS)

• iSCSI storage for non-managed servers

Page 33: Systems And Operations

Storage Farm: Rates

Effective: July, 2007

Storage Management Fee (per system): $64 / month

SAN Connection Fee (per FC connection): $52 / month

Tier 1 storage (mainframe): $3.43 / GB-mo

Tier 2 storage: $0.49 / GB-mo

Tier 3 storage: $0.29 / GB-mo
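As a rough illustration of how these rates combine, the sketch below estimates one system's monthly Storage Farm bill. The rates are taken from the slide. The function name, the sample allocation, and the assumption that a dual-attached server pays the SAN connection fee twice (once per FC connection) are illustrative only, not a statement of CIT's actual billing rules.

```python
from decimal import Decimal

MANAGEMENT_FEE = Decimal("64")       # per system, per month (from the slide)
SAN_CONNECTION_FEE = Decimal("52")   # per FC connection, per month (from the slide)
TIER_RATE = {                        # dollars per GB per month (from the slide)
    1: Decimal("3.43"),
    2: Decimal("0.49"),
    3: Decimal("0.29"),
}


def storage_farm_monthly(fc_connections, gb_by_tier):
    """Estimated monthly charge for one system.

    fc_connections: number of FC connections (assumed 2 when dual-attached).
    gb_by_tier: mapping of tier number to allocated GB.
    """
    total = MANAGEMENT_FEE + SAN_CONNECTION_FEE * fc_connections
    for tier, gb in gb_by_tier.items():
        total += TIER_RATE[tier] * gb
    return total


# Hypothetical dual-attached server with 200 GB of Tier 2 and 500 GB of Tier 3.
print(storage_farm_monthly(2, {2: 200, 3: 500}))  # 411.00
```

In this example the fixed fees ($64 + 2 × $52 = $168) are a large share of the bill, so the per-GB rate alone understates the cost of a small allocation.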

Page 34: Systems And Operations

Questions ???

Contact:

Paul Zarnowski

[email protected]

Flash demo of Automated Tiered Storage and Data Progression:

http://www.compellent.com/products/demo/demo_ats.html

Page 35: Systems And Operations

EZ-Backup is . . .

A Centralized Backup solution that offers:
• Network-based
• Automated backups
• Automated management of backup data
• User-driven restore
• Off-site backups (but still on-Campus)

• Same solution used to back up CIT’s server farm

• Available to all Cornell departments

Page 36: Systems And Operations

EZ-Backup Staff

• EZ-Backup Support Team
  – Robert Talda
  – David Beardsley
  – Randy Barron
  – Joanne Button
  – Michelle Mogil
  – Ron Seccia
  – and many others throughout CIT
  – Paul Zarnowski (Service Manager)

Page 37: Systems And Operations

EZ-Backup Supported Platforms

• Operating Systems:
  – MacOS
    • Leopard support imminent
  – Windows
    • Including Vista
  – Unix (most flavors)
    • Linux (RH, SuSE)
    • Sun Solaris
    • AIX
    • Others
  – Netware

• Apps / Databases:
  – Oracle
  – Microsoft SQL Server
  – Microsoft Exchange
  – Microsoft SharePoint
  – Others

Available for most platforms in use at Cornell.

Page 38: Systems And Operations

EZ-Backup Growth (14 years)

Page 39: Systems And Operations

EZ-Backup Recent Changes

• New Client Support
  – Windows Vista
  – Macintosh Leopard (imminent)

• New TSM Client Software
  – New Version 5.4
  – Older versions upgraded to latest patch levels (for older OSes)
  – Recommend upgrade to latest software levels

• Merging of CTC TSM service into EZ-Backup
  – 2nd tape library will give EZ-Backup two-site capability

Page 40: Systems And Operations

EZ-Backup Less-recent Changes

• Dynamic Sub-file backup*
  – Very useful for backing up over slower network connections, such as Dial-up, DSL, RoadRunner
• Journal-Based Backup*
  – Speeds backups for large fileservers w/ low change rate
• Ability to delete individual backup files
• Include/Exclude Preview Capabilities
• Open File Support*
• Encryption Enhancements
• Server Split
• CIT On-Site Solutions support available

*Available on Windows clients only

Page 41: Systems And Operations

EZ-Backup Roadmap

• Upgrade EZ-Backup/TSM Storage Subsystem
  – RFP Goals:
    • Technology refresh
    • Lower storage costs
    • Provide faster restores for servers with high object count
    • Consider use of low-cost disk storage and data de-duplication

• Training Classes

• Off-site Capability (Weill Medical Center)
  – Pending funding decision

• Enhanced Reporting & Management Tools

Page 42: Systems And Operations

Current Rates (per Month; effective July, 2007)

Base Rate: $6.50 per system

Base Rate (>50th system): $4.50 per system

Base Storage included: 6.0 GB

Extra Storage rate < 15GB: $0.60/GB

Extra Storage rate > 15GB: $0.40/GB

Static data rate (>100GB): $0.20/GB

Next Rate Change Due: July, 2008
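The per-system base rate can be sketched as a small calculation. This is a minimal illustration that assumes the $4.50 rate applies only to systems beyond the 50th, which is one reading of the slide. The function name and the sample counts are hypothetical. Storage charges are left out because the slide does not say whether the 15 GB threshold counts total or extra storage.

```python
from decimal import Decimal

BASE_RATE = Decimal("6.50")       # per system, per month (from the slide)
DISCOUNT_RATE = Decimal("4.50")   # per system beyond the 50th (from the slide)
DISCOUNT_AFTER = 50               # assumed: systems 1-50 pay the full base rate


def base_fee(system_count):
    """Monthly base fee for a department's registered systems (storage excluded)."""
    full_price = min(system_count, DISCOUNT_AFTER)
    discounted = max(system_count - DISCOUNT_AFTER, 0)
    return full_price * BASE_RATE + discounted * DISCOUNT_RATE


print(base_fee(10))  # 65.00
print(base_fee(60))  # 370.00
```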

Page 43: Systems And Operations

EZ-Backup Pricing History

Page 44: Systems And Operations

Questions ???

Contact:

Paul Zarnowski

[email protected]

More information is available at:

http://ezbackup.cornell.edu

EZ-Backup

Page 45: Systems And Operations

Virtual OS Hosting Project

Mike Hojnowski, Manager, Special Projects

Page 46: Systems And Operations

Project Deliverables

Provide Windows hosts under VMware supported by S&O Sysadmins. Note that Linux is formally not in scope at this time.

Develop a financial model to support the service for CIT internal customers.

Develop process improvements to minimize the effort involved in creating virtual instances.

Implement the system in a way that facilitates Emergency Preparedness.

Document this service.

Page 47: Systems And Operations

Current Environment

Diagram labels: Canada, Mexico, Jamaica, Panama, Peru, Pirates; Virtual Machine Containers; Rhodes Hall, CCC; Prod, Test, Dev.

Page 48: Systems And Operations

Current Environment

Storage: Compellent via SAN
– Thin Provisioning
– Fast full copy, and Snap copy capabilities
– Automated Tiered Storage

Production Servers (3)
– IBM 3650
– 2-Socket, 4-Core, 2.66 GHz
– 24G RAM
– 4 Gbit NICs
– 2 SAN HBAs

Expected capacity: 50-60 Virtual Machines

Page 49: Systems And Operations

Project Status

Project Plan approved 10/05/07.

Project implementation has been repeatedly delayed, due to the Exchange project.

Staff resources are freeing up, and we’re ramping up our efforts.

Page 50: Systems And Operations

Timeline

Development: 5/1/07 – 12/31/07 (lengthened due to resource constraints, cutting the early adopters phase shorter than originally planned).

Early Adopters: 1/1/08 – 2/28/08

Full Production: 3/1/08

Page 51: Systems And Operations

Issues

SAN failover
– Presently we don’t replicate storage from Rhodes to CCC.
– We won’t have full “building failure” protection on go-live.
– We will address this with either storage technology or new features in a coming release of VMware after go-live.

Page 52: Systems And Operations

Futures

Linux (First half 2008)

Solaris (First half 2008)

Virtual “Co-lo” as a Designated Service? (FY09?)