Open stack china_201109_sjtu_jinyh

Page 1: Open stack china_201109_sjtu_jinyh

© jinyh@sjtu

[email protected]

Yaohui Jin ([email protected])

Network & Information Center

Shanghai Jiao Tong University

Page 2: Open stack china_201109_sjtu_jinyh

About Me and Team

Professor, Deputy Director of NIC.SJTU

Email: [email protected]

My research interests: Data Center Network, Big Data Analysis, Converged Broadband Network

Team:

Engineers: Xuan Luo (Ph.D.), Jianwen Wei (M.Eng.), Qiang Sun (M.Eng.)

Ph.D. students: Jianxiong Tang, Xiaming Chen, Pengfei Zhang, Siwei Qiang

Master's students: Wei Ye, Xin Yang, Xiujie Feng, Xiaosheng Zuo, Zhaohui Zhang

Interns: Hongbo Fan and 10+ other undergraduates

Page 3: Open stack china_201109_sjtu_jinyh

Agenda

Hardware configuration

Performance monitoring and measurement

Potential applications

Page 4: Open stack china_201109_sjtu_jinyh

OpenStack Architecture

courtesy of Dell

Page 5: Open stack china_201109_sjtu_jinyh

Our Testbed: Sept. 2011

Page 6: Open stack china_201109_sjtu_jinyh

Testbed Photo

Page 7: Open stack china_201109_sjtu_jinyh

Server Details

Name: Nova-controller
Vendor: SuperCloud-R6210-S2
Configuration: 2 x E5620 / 48 GB RAM / 2 x 1 TB SATA (RAID 1) / GE
Purpose: Nova-api, Nova-scheduler, Nova-objectstore, RabbitMQ, MySQL, euca2ools, Dashboard, VNC server, Ganglia

Name: Nova-network
Vendor: SuperCloud-R6210-S2
Configuration: 2 x E5620 / 48 GB RAM / 2 x 1 TB SATA (RAID 1) / 2 x 10GE
Purpose: Nova-network

Name: Nova-volume
Vendor: IBM x3650; IBM DS3512 + EXP3512
Configuration: 2 x E5620 / 48 GB RAM / 2 x 146 GB SAS (RAID 1) / 2 x 10GE + 96 TB SATA (RAID 10)
Purpose: Nova-volume

Name: Nova-compute
Vendor: IBM dx360 M3
Configuration: 2 x E5650 / 96 GB RAM / 2 x 146 GB SAS (RAID 1) / 2 x 10GE
Purpose: Nova-compute

Name: Glance
Vendor: Dell R610
Configuration: 2 x E5620 / 8 GB RAM / 2 x 146 GB SAS (RAID 1) / 2 x 10GE + 320 GB SSD
Purpose: Glance-api, Glance-registry, Image Store, puppet server

Name: Proxy node
Vendor: SuperCloud-R6210-S2
Configuration: 2 x E5620 / 48 GB RAM / 2 x 1 TB SATA (RAID 1) / 2 x 10GE
Purpose: Swauth, Proxy server

Name: Storage node
Vendor: SuperCloud-RE436
Configuration: 2 x E5620 / 48 GB RAM / 2 x 146 GB SAS (RAID 1) / 10GE + 34 x 2 TB SATA desktop disks
Purpose: Account server, Container server, Object server

Page 8: Open stack china_201109_sjtu_jinyh

Network Details

Data center network: 10GE switches (BNT & H3C) in 2 domains

Control and management network: GE switch (DCRS)

10GE uplink to the campus network

Fat tree topology; L3: VRRP; L2: LACP+VLAG+MSTP

Security control: SSH, NAT, ACL, VLAN

NIC: Intel X520-DA2; Chelsio T420E-CR

L2-L7 Network tester: IXIA XM2

L2-L3 Network impairment emulator: Apposite Netropy 10G

Page 9: Open stack china_201109_sjtu_jinyh

Nova Network Traffic

courtesy of Vishvananda Ishaya

Page 10: Open stack china_201109_sjtu_jinyh

Swift Details

Raw storage capacity: 400 TB

Storage node configuration: no RAID, JBOD, 3 replicas, 6 zones

Hardware cost: ~ 1000 RMB/TB (Raw, including servers and switches)

Collaboration with StorBridge and SkyCloud Shanghai
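
The figures on this slide imply a few derived numbers that are easy to check. The following Python sketch works through the arithmetic, assuming decimal terabytes and that all 400 TB of raw capacity sits on storage nodes with 34 x 2 TB disks each (the storage node configuration listed under Server Details). The function and key names are illustrative and are not part of Swift.

```python
import math


def swift_capacity_summary(raw_tb=400, replicas=3, cost_rmb_per_raw_tb=1000,
                           disks_per_node=34, disk_tb=2):
    """Derive usable capacity and cost from the raw figures on the slide."""
    usable_tb = raw_tb / replicas                   # every object is stored `replicas` times
    total_cost_rmb = raw_tb * cost_rmb_per_raw_tb   # cost is quoted per raw TB
    raw_tb_per_node = disks_per_node * disk_tb      # JBOD, so no RAID overhead
    return {
        "usable_tb": usable_tb,
        "total_cost_rmb": total_cost_rmb,
        "cost_rmb_per_usable_tb": total_cost_rmb / usable_tb,
        "raw_tb_per_node": raw_tb_per_node,
        "nodes_needed": math.ceil(raw_tb / raw_tb_per_node),
    }


summary = swift_capacity_summary()
for key, value in summary.items():
    print(f"{key}: {value:,.1f}")
```

Under these assumptions the usable capacity is about 133 TB, the hardware cost is about 3000 RMB per usable TB, and six storage nodes cover the raw capacity. That would be consistent with one storage node per zone in the 6-zone layout, although the slides do not state this.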

Page 11: Open stack china_201109_sjtu_jinyh

Nova Cluster Monitoring

Monitored with Ganglia

Page 12: Open stack china_201109_sjtu_jinyh

VM Provisioning Time

VM: Windows 7; image size: 20 GB
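
A rough lower bound on the image-copy part of provisioning follows from the image size and the link speed. The sketch below assumes decimal gigabytes, a link running at full line rate, and no image caching on the compute node. It ignores disk throughput and protocol overhead, so measured provisioning times will be higher. The function name is illustrative.

```python
def min_transfer_seconds(image_gb, link_gbps):
    """Lower bound on the time to copy an image over a link at line rate."""
    return image_gb * 8 / link_gbps  # GB -> Gbit, then divide by Gbit/s


for link_gbps in (1, 10):
    seconds = min_transfer_seconds(20, link_gbps)
    print(f"{link_gbps} Gbit/s link: at least {seconds:.0f} s")
```

For the 20 GB image this gives at least 16 s over 10GE and at least 160 s over GE.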

Page 13: Open stack china_201109_sjtu_jinyh

Nova I/O Throughput

Tested with ATTO

Page 14: Open stack china_201109_sjtu_jinyh

VM Network Throughput

Co-located in a single physical machine (CSM)

Distributed in multiple physical machines (DMM)

Connected by a single switch (CSS)

Connected by multiple switches (CMS)

Page 15: Open stack china_201109_sjtu_jinyh

Swift Testing

Scalability:

• Adding a storage node to an existing zone

• Adding a storage node as a new zone

• No influence on the functions of Swift

Reliability:

• Disk failure/recovery

• Storage node failure/recovery

• Fault duration: 10 min and 1 hour

• No influence on the functions of Swift

Performance testing (ongoing):

• Throughput

• Response time

• Concurrency

Collaboration with Intel Shanghai
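
One way to see why a single disk or storage node failure need not affect Swift is to count surviving replicas. The toy model below is not the Swift ring algorithm. It assumes that the 3 replicas of an object always land in 3 distinct zones out of 6 and that a failure takes out whole zones, then enumerates every placement and every failure to find the worst case. The function name is hypothetical.

```python
from itertools import combinations


def min_surviving_replicas(zones=6, replicas=3, failed_zones=1):
    """Worst-case number of replicas left when `failed_zones` zones are down."""
    zone_ids = range(zones)
    worst = replicas
    for placement in combinations(zone_ids, replicas):       # where the replicas live
        for failed in combinations(zone_ids, failed_zones):  # which zones are down
            surviving = len(set(placement) - set(failed))
            worst = min(worst, surviving)
    return worst


for failed in (1, 2, 3):
    print(f"{failed} zone(s) down: at least {min_surviving_replicas(failed_zones=failed)} replica(s) left")
```

In this model every object keeps at least two replicas when one zone is down and at least one replica when two zones are down. Only the loss of three zones can make an object unavailable.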

Page 16: Open stack china_201109_sjtu_jinyh

Nova Potential Applications

Infrastructure as a service (either private or public)

VM management for DevOps in IT service departments

Big data analysis and tools, e.g. NoSQL and Map/Reduce

Elastic provisioning of web services, particularly for bursty requests from Web 2.0 or mobile applications

Next-generation high-performance computing: virtual cluster provisioning with middleware

Page 17: Open stack china_201109_sjtu_jinyh

Syslog Analysis

Raw mirrored traffic into DPI: ~6 Gbit/s

Syslog into MongoDB: ~4 MB/s (12,000 records/s)

MongoDB grows by ~400 GB/day
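
The ingest figures can be cross-checked with a little arithmetic. The Python sketch below assumes decimal units (1 MB = 10^6 bytes) and a constant ingest rate. The function and key names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def syslog_rates(bytes_per_s=4e6, records_per_s=12_000):
    """Derive per-record size and daily volume from the quoted ingest rate."""
    return {
        "bytes_per_record": bytes_per_s / records_per_s,
        "raw_gb_per_day": bytes_per_s * SECONDS_PER_DAY / 1e9,
        "records_per_10_min": records_per_s * 600,
        "records_per_day": records_per_s * SECONDS_PER_DAY,
        "hours_to_900m_records": 900_000_000 / records_per_s / 3600,
    }


for key, value in syslog_rates().items():
    print(f"{key}: {value:,.1f}")
```

At these rates a record averages about 333 bytes and the raw input is about 346 GB/day. The gap to the ~400 GB/day growth in MongoDB is plausibly storage and index overhead, which is an assumption rather than a measurement. Ten minutes of traffic is 7.2 million records, in line with the ~7M records quoted for the 10-minute aggregation, and 900M records correspond to roughly 21 hours of syslog.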

Page 18: Open stack china_201109_sjtu_jinyh

MongoDB Components

mongod (shard):

• Actual data

• Needs RAM + disk I/O

Config server:

• Stores sharding configuration

• Stores small amounts of data

• Infrequently queried/updated by mongos

mongos:

• Stateless router

• Typically run on app servers

mongod (arbiter):

• Can run as arbiter

• No data

• Just votes to elect the primary

courtesy of 10gen

Page 19: Open stack china_201109_sjtu_jinyh

MongoDB Dataset Provisioning and Preliminary Results

Cluster in OpenStack:

1 config server (2 CPU + 8 GB MEM + 100 GB HDD)

1 mongos (2 CPU + 8 GB MEM + 10 GB HDD)

4 mongod (2 CPU + 24 GB MEM + 2 TB HDD)

No replication

Both volume size and compute nodes can be dynamically changed

No service interruption, no significant performance degradation when data increases

Preliminary performance (to be significantly improved)

Aggregate 10min traffic (~7M records)

• MongoDB Map/Reduce takes less than 4 minutes

Query “time + 5 tuples” in 900M records

• MongoDB returns results in 10 seconds

Target (hopefully not just a dream)

Query a 30-day dataset in less than 1 second
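
The 30-day target also has a storage side. The Python sketch below estimates how long the 4 x 2 TB mongod cluster can retain data at the ~400 GB/day growth rate quoted on the Syslog Analysis slide, and how many 2 TB shards a 30-day window would need. It assumes decimal units, evenly balanced shards, fully usable disks, and no replication, as on this slide. The function names are illustrative.

```python
import math


def retention_days(shards=4, disk_tb_per_shard=2, growth_gb_per_day=400):
    """Days of data that fit on the cluster before the disks are full."""
    return shards * disk_tb_per_shard * 1000 / growth_gb_per_day


def shards_for_retention(days=30, disk_tb_per_shard=2, growth_gb_per_day=400):
    """Number of shards needed to hold `days` of data."""
    return math.ceil(days * growth_gb_per_day / (disk_tb_per_shard * 1000))


print(f"Current cluster holds about {retention_days():.0f} days of data")
print(f"A 30-day window needs about {shards_for_retention()} shards of 2 TB")
```

Under these assumptions the current cluster holds about 20 days of data, and a 30-day window needs about 12 TB, for example six 2 TB shards or larger volumes on the existing four.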

Page 20: Open stack china_201109_sjtu_jinyh

Swift Potential Applications

Similar to Amazon S3, so there are many potential applications, such as services like Dropbox, SlideShare, Netflix, …

Sector-specific applications, such as medicine, education, media, …

Korea Telecom commercial deployment (Cloudscaling)

Our testing: lack of monitoring; no quota restriction; limited automated deployment

Page 21: Open stack china_201109_sjtu_jinyh

Acknowledgement

Network & Information Center; State Key Lab of Optical Communication

Intel; IBM/BNT; H3C; Dell; SkyCloud; StorBridge; IXIA; Apposite; Netronome; Chelsio; Fusion-IO

OpenStack Community
