How To Build A Scalable Storage System with OSS at TLUG Meeting 2008/09/13


TLUG Meeting 2008/09/13

Gosuke Miyashita

My company

paperboy&co.
Web hosting, blog hosting, EC hosting, and so on for individuals
About 1,000 Linux servers
Many single servers ...

My goals for a scalable storage system

A storage system for a web hosting service:

High resource availability
Flexible I/O distribution
Easy to extend
Mountable by multiple hosts
No SPoF
Built with OSS
Without expensive hardware

Now I'm trying out technologies for these purposes

cman CLVM GFS2 GNBD DRBD DM-MP

Technologies

cman

Cluster Manager
A component of Red Hat Cluster Suite
Membership management
Messaging among cluster nodes
Needed for CLVM and GFS2
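
A minimal sketch of a two-node cluster.conf and of starting cman; the cluster name "storage" and the node names are just examples, and a real configuration also needs fencing devices:

cat > /etc/cluster/cluster.conf <<'EOF'
<?xml version="1.0"?>
<cluster name="storage" config_version="1">
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1"/>
    <clusternode name="node2" nodeid="2"/>
  </clusternodes>
  <!-- fencing configuration omitted in this sketch -->
</cluster>
EOF

service cman start    # run on every node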

CLVM

Cluster Logical Volume Manager
A cluster-wide version of LVM2
Automatically shares LVM2 metadata among all cluster nodes
So logical volumes created with CLVM are available to all cluster nodes

CLVM

[Diagram: a logical volume on shared storage, with clvmd running on each cluster node; clvmd distributes the LVM2 metadata among the cluster nodes, so the logical volume is presented to every cluster node]
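
A minimal sketch of putting CLVM to work, assuming the shared device is /dev/gnbd0 and using the VG0/LV0 names that appear later in this talk:

# On every node: switch LVM2 to cluster-wide locking and start clvmd.
# locking_type = 3 in /etc/lvm/lvm.conf selects the clustered (DLM) locking module.
sed -i 's/^\( *locking_type *=\).*/\1 3/' /etc/lvm/lvm.conf
service clvmd start

# On one node: create a clustered volume group and a logical volume on the shared device.
pvcreate /dev/gnbd0
vgcreate -c y VG0 /dev/gnbd0    # -c y marks the volume group as clustered
lvcreate -n LV0 -L 100G VG0     # size is just an example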

GNBD

Global Network Block Device
Provides block-device access over TCP/IP
Similar to iSCSI
Its advantage over iSCSI is built-in fencing

GNBD

[Diagram: a GNBD server exports a block device over a TCP/IP network to multiple GNBD clients]
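
A minimal sketch of exporting and importing a device with GNBD; the device path, export name and server hostname are examples, and the exact name of the imported device depends on the GNBD version:

# On the GNBD server:
gnbd_serv                               # start the GNBD server daemon
gnbd_export -d /dev/sdb1 -e share0      # -d device to export, -e export name

# On each GNBD client (cman should already be running so fencing works):
modprobe gnbd
gnbd_import -i gnbd-server.example.com  # import all exports from that server
# The import shows up as a local block device (e.g. /dev/gnbd/share0).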

GFS2

Global File System 2
One of the cluster-aware file systems
Multiple nodes can access the file system simultaneously
Uses the DLM (Distributed Lock Manager) of cman to maintain file system integrity
OCFS is another cluster-aware file system

GFS2

[Diagram: a GNBD server exports a GFS2 file system; multiple GNBD clients, each running cman, access the GFS2 file system simultaneously]
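
A minimal sketch of creating and mounting GFS2 on the clustered logical volume; the lock table name "storage:web0" and the journal count are examples (the part before the colon must match the cluster name in cluster.conf, and one journal is needed per node that mounts the file system):

mkfs.gfs2 -p lock_dlm -t storage:web0 -j 3 /dev/VG0/LV0   # -p lock_dlm uses cman's DLM

# On each node, with cman and clvmd already running:
mount -t gfs2 /dev/VG0/LV0 /mnt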

DRBD

Distributed Replicated Block Device
RAID1 over a network
Mirrors a whole block device over TCP/IP
Can run Active/Active with cluster file systems

DRBD

[Diagram: two servers, each with a block device, replicating the device between them over the network]
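
A minimal sketch of a drbd.conf resource for an Active/Active pair (DRBD 8 or later); the host names, disks and addresses are examples:

cat >> /etc/drbd.conf <<'EOF'
resource r0 {
  protocol C;                     # synchronous replication
  net { allow-two-primaries; }    # required for Active/Active use with GFS2
  on storage1 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.0.1:7788;
    meta-disk internal;
  }
  on storage2 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.0.2:7788;
    meta-disk internal;
  }
}
EOF

# On both nodes:
drbdadm create-md r0
service drbd start
drbdadm primary r0    # promote both sides to primary for Active/Active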

DM-MP

Device-Mapper Multipath
Bundles multiple I/O paths into one virtual I/O path
Can be configured as active/passive or active/active

DM-MP with SAN storage

[Diagram: a node with HBA1 and HBA2 reaches the storage controllers CNTRLR1 and CNTRLR2 through SAN switch 1 and SAN switch 2; the two paths /dev/sda1 and /dev/sdb1 are seen as one device, /dev/mapper/mpath0, in active/passive or active/active mode]
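
A minimal sketch of a multipath configuration; the path_grouping_policy values are the standard ones for the two modes, everything else is left at its defaults:

cat > /etc/multipath.conf <<'EOF'
defaults {
    path_grouping_policy multibus     # spread I/O over all paths (active/active)
    # path_grouping_policy failover   # or use one path at a time (active/passive)
}
EOF

service multipathd start
multipath -ll    # show the resulting /dev/mapper/mpathN devices and their paths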

A scalable storage system

[Diagram: client nodes running cman and GNBD mount /dev/VG0/LV0 (CLVM) at /mnt as a GFS2 file system. The volume is built on two multipath devices: /dev/mapper/mpath0 (DM-MP) bundles /dev/gnbd0 and /dev/gnbd1, and /dev/mapper/mpath1 (DM-MP) bundles /dev/gnbd2 and /dev/gnbd3. Each multipath device is backed by a pair of GNBD servers that replicate their block devices with DRBD.]
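
A sketch of the client-side bring-up order for this stack, assuming the device names from the diagram and that multipath is configured to pick up the GNBD devices:

service cman start            # cluster membership and DLM
gnbd_import -i gnbd-server1   # -> /dev/gnbd0 (server names are examples)
gnbd_import -i gnbd-server2   # -> /dev/gnbd1
service multipathd start      # gnbd0 + gnbd1 -> /dev/mapper/mpath0
service clvmd start           # makes /dev/VG0/LV0 visible on this node
mount -t gfs2 /dev/VG0/LV0 /mnt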

How to extend

[Diagram: the same stack extended with a third pair of GNBD servers exporting /dev/gnbd4 and /dev/gnbd5, bundled into /dev/mapper/mpath2 and added to VG0 alongside /dev/mapper/mpath0 (/dev/gnbd0, /dev/gnbd1) and /dev/mapper/mpath1 (/dev/gnbd2, /dev/gnbd3); the clients still mount /dev/VG0/LV0 (CLVM) at /mnt]
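
A sketch of growing the storage once the new pair is exported and bundled into /dev/mapper/mpath2; run on one node, the size is an example, and gfs2_grow operates on the mounted file system:

pvcreate /dev/mapper/mpath2
vgextend VG0 /dev/mapper/mpath2
lvextend -L +100G /dev/VG0/LV0
gfs2_grow /mnt    # grow the GFS2 file system to fill the enlarged volume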

I wonder ...

Do this many components cause trouble?
What about overhead and performance?
What about stability?
Is there a better way?
What about distributions other than Red Hat Linux?