gStore: GSI Mass Storage
ITEE-Palaver GSI
26.6.2007
Horst Göringer, Matthias Feyerabend, Sergei Sedykh
storage-group@gsi.de
ITEE-Palaver 26.6.07 gStore - GSI Mass Storage 2
Overview
1. adsmcli: ended after 10 years of operation
2. tsmcli: modern concept
3. gStore (gstore): unified user interface
4. rearrangement of storage
5. gStore projects
6. final remarks
adsmcli: initial system
• Software:
  1. ADSM: ADSTAR Distributed Storage Manager
     • commercial
     • handles ATL and tapes
  2. GSI software:
     – interface to users
     – API to ADSM
adsmcli: initial system
• Hardware 1996:
  – AIX server
  – ATL: IBM 3494
    • 8 tape drives IBM 3590:
      • 14 MB/s, 10 GB/volume
      • max 23 TByte
  – few GB disk (write) cache ADSM
  – 80 GB read cache (1998)
adsmcli: overview
adsmcli: early usage
adsmcli: the best year
adsmcli limitations
Restrictions:
– bottleneck server
– no scalability
  • data capacity (cache)
  • I/O bandwidth
– missing write cache
frozen since 2001
– only read cache upgrade 2003: 1.2 TB
tsmcli: concepts
Concepts:
• separation of control and data flow:
  – data flow: Data Mover (DM)
  – control flow: TSM Server, Entry Server
• many DMs => many parallel data streams
• SAN: Storage Area Network
• Cache Manager: read and write cache
• direct DAQ connection to gStore
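The separation of control flow and data flow can be illustrated with a minimal sketch. It is not the gStore implementation: the class names `EntryServer` and `DataMover`, the round-robin assignment, and the file names are all assumptions made for illustration. The point it shows is that the entry server only answers a small "which data mover?" request, while the bulk data goes straight to the assigned data mover, so several clients can stream in parallel.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class DataMover:
    """Data flow: stores file contents in its disk cache (hypothetical model)."""
    name: str
    cache: dict = field(default_factory=dict)
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def write(self, path: str, payload: bytes) -> int:
        with self._lock:
            self.cache[path] = payload
        return len(payload)


class EntryServer:
    """Control flow only: tells a client which data mover to use."""

    def __init__(self, movers):
        self._movers = cycle(movers)  # round-robin is an assumption
        self._lock = threading.Lock()

    def assign(self, path: str) -> DataMover:
        with self._lock:
            return next(self._movers)


def archive(entry: EntryServer, path: str, payload: bytes) -> str:
    mover = entry.assign(path)   # small control message
    mover.write(path, payload)   # bulk data bypasses the entry server
    return mover.name


movers = [DataMover(f"dm{i}") for i in range(3)]
entry = EntryServer(movers)
# Made-up file names and sizes, for illustration only.
files = {f"/exp/run{i:03d}.dat": bytes(1024) for i in range(9)}

# Three clients write in parallel; each stream ends on its own data mover.
with ThreadPoolExecutor(max_workers=3) as pool:
    used = list(pool.map(lambda item: archive(entry, *item), files.items()))

for mover in movers:
    print(mover.name, len(mover.cache), "files")
```

Adding a data mover to the `movers` list adds one more parallel stream without touching the entry server, which is the scalability argument made on the slide.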
tsmcli concept
[Diagram; labels: client, entry server (DB), TSM server (DB), data mover i (of n) with TSM Storage Agent, read cache and write cache, read cache server, write cache server, cache server (DB), mover server, archive server, SAN, tapes]
tsmcli: storage view
tsmcli: usage
• tsmcli in production since January 2003
  – in parallel to adsmcli
  – initially only for 'large' experiments
• write cache: since February 2005
  – for 'normal' clients:
    • command tsmcli
    • RFIO API
  – for DAQ clients (RFIO, write only)
tsmcli hardware 2007
• server: Windows 2000 cluster
• ATL: Sun StorageTek L700
  – 9 tape drives LTO2:
    – 35 MByte/s, 200 GByte/vol
    – max 140 TByte
• data mover:
  – 10 Windows (gsidm0-9), 4 TB disk cache
  – 5 Linux (slxdm01-5), 13 TB disk cache
tsmcli usage 2006
tsmcli usage 2007
gStore top load
top data transfer in 2006: Dec 31
• overall: 9.6 TB in 24 h – 111 MB/s on average
• slxdm01: 2.9 TB in 24 h – 33.6 MB/s on average
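The averages quoted above follow directly from the daily volumes. The short check below reproduces them, assuming decimal units (1 TB = 10^6 MB); the function name is chosen for illustration only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_mb_s(terabytes: float, seconds: int = SECONDS_PER_DAY) -> float:
    """Average transfer rate in MB/s, assuming 1 TB = 10**6 MB."""
    return terabytes * 1e6 / seconds


overall = average_rate_mb_s(9.6)   # all data movers together
slxdm01 = average_rate_mb_s(2.9)   # single data mover

print(f"overall: {overall:.1f} MB/s")
print(f"slxdm01: {slxdm01:.1f} MB/s")
print(f"share of slxdm01: {2.9 / 9.6:.0%}")
```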
common mass storage interface
coexistence of 2 mass storage systems: interim solution (ca. 4 years)
=> common new interface gstore: replacing adsmcli and tsmcli
• successfully in operation since May 23
• (considerable) enhancement of tsmcli SW:
  • access to 2 independent TSM servers and attached DMs/disk caches
    – further scalability aspect!
storage status mid 2006
3 ATLs:
1. IBM 3494 (3590 tapes):
   – 50 TB experiment data (adsmcli)
   – 15 TB backup data
   – nearly filled
2. Sun StorageTek L700 (LTO2 tapes):
   – 120 TB experiment data (tsmcli)
   – max 140 TB => nearly filled
3. Sun StorageTek L700 (LTO1 tapes):
   – 38 TB backup data
   – max 70 TB
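To put "nearly filled" into numbers, the fill level of the two L700 libraries can be computed from the figures above. The IBM 3494 is left out because no maximum capacity is stated for it at that time; the dictionary layout and function name are illustrative choices.

```python
# Stored volume and maximum capacity in TB, as listed for mid 2006.
libraries = {
    "Sun StorageTek L700 (LTO2)": {"stored_tb": 120, "max_tb": 140},
    "Sun StorageTek L700 (LTO1)": {"stored_tb": 38, "max_tb": 70},
}


def fill_percent(stored_tb: float, max_tb: float) -> float:
    """Fill level of a tape library in percent."""
    return 100.0 * stored_tb / max_tb


for name, lib in libraries.items():
    level = fill_percent(lib["stored_tb"], lib["max_tb"])
    free = lib["max_tb"] - lib["stored_tb"]
    print(f"{name}: {level:.0f}% full, {free} TB free")
```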
requirements
1. substantially more data capacity
   – 4 new tape drives IBM 3592 for 3494 ATL
2. separate experiment and backup data:
   • experiment data -> IBM 3494
   • backup data -> LTO2 ATL
3. safe long term storage
   • upgrade LTO1 ATL -> LTO3
   • deploy in 'remote RZ' (remote computing center)
gstore hardware 2007
1. server: AIX
   – ATL: IBM 3494
     • 4 tape drives IBM 3592:
       • 100 MByte/s, 700 GByte/vol
       • max 1.6 PB
   – data mover:
     • 5 Linux (slxdm01-5), 13 TB disk cache
     • 3 Linux (slxdm06-8), 17 TB disk cache
2. server: Windows 2000...
actions:
• move all existing experiment data to IBM 3592 tapes in 3494 ATL
  – 50 TB from 3590 media: finished (adsmcli data)
    => old 3590 hardware/media replaced
  – 130 TB from LTO2 media: 40 TB done (tsmcli data)
• write all new experiment data to 3494 ATL:
  – since May 23
actions:
• redirect all new backup data to LTO2 media
  – new pair of Linux TSM servers
  – in progress
• move current backup data to LTO2 media
  – mainly user archives
  – from LTO1 and 3590 media
  – still to be done
open projects
• xrootd:
  – gStore access for xrootd clients available in test environments
  – still open:
    • stability of xrdcp
    • POSIX ls functionality
open projects
• Grid SRM (Storage Resource Manager):
  – several types of SRMs installed worldwide
  – common: no general mass storage interface
  – currently under investigation for connection with gStore: Berkeley SRM ('BeStMan')
open projects
• 2nd level DM:
  – no SAN connection
  – filled via LAN from 1st level DM
  – inexpensive extension of gStore read cache:
    • for data needed online for longer time scales (weeks/months)
  – no NFS: use gstore query/retrieve
    • e.g. xrootd: enable full file information
    • for new /d file servers !?
user requests
• gStore enhancements:
  – staging large sets of files: equal distribution on all DMs (1st or 2nd level)
    • stage –distr
    • stage –distr –L2
  – recursive access
    • query/stage/retrieve –r path
  – rename path/file
  – files > 2 GB
  – ...
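The requested equal distribution of a large staging request over all data movers could look like the following sketch. It is only one plausible reading of the request: it balances by file count using round-robin, ignores file sizes and current cache load, and the function name `distribute` and the file names are made up. The data mover names are taken from the hardware slide.

```python
from itertools import cycle


def distribute(files, movers):
    """Assign files to data movers round-robin, so counts differ by at most one."""
    plan = {mover: [] for mover in movers}
    for path, mover in zip(files, cycle(movers)):
        plan[mover].append(path)
    return plan


movers = [f"slxdm{i:02d}" for i in range(1, 6)]       # slxdm01 .. slxdm05
files = [f"/exp/run{i:03d}.dat" for i in range(12)]   # hypothetical file list

plan = distribute(files, movers)
for mover, assigned in plan.items():
    print(mover, len(assigned), "files")
```

Balancing by total bytes instead of file count would be a natural refinement when file sizes vary widely.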
Final Remarks I
• currently ca. 180 TB of exp. data on tape (+50 TB raw data backup)
• 1.6 – 2 PB max tape capacity
• I/O bandwidth
  – > 1 GB/s cache <-> clients
  – < 0.4 GB/s cache <-> tape
    • Hades DAQ end 2008: 200 MB/s
    • => more tape drives needed
• 35 TB disk cache (1st level)
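A rough calculation shows why more tape drives are needed. It assumes that the 0.4 GB/s limit corresponds to the four IBM 3592 drives at their nominal 100 MByte/s each, as listed on the hardware slide, and that the drives run at nominal speed; real rates will be lower.

```python
import math

DRIVES = 4              # IBM 3592 drives in the 3494 ATL
DRIVE_RATE_MB_S = 100   # nominal rate per drive
HADES_DAQ_MB_S = 200    # expected Hades DAQ rate, end of 2008

aggregate_mb_s = DRIVES * DRIVE_RATE_MB_S              # upper bound cache <-> tape
hades_share = HADES_DAQ_MB_S / aggregate_mb_s          # fraction used by Hades alone
drives_for_hades = math.ceil(HADES_DAQ_MB_S / DRIVE_RATE_MB_S)

print(f"aggregate tape bandwidth: {aggregate_mb_s} MB/s")
print(f"Hades DAQ share: {hades_share:.0%}")
print(f"drives busy with Hades data alone: {drives_for_hades}")
```

Under these assumptions a single experiment would occupy half of the tape bandwidth, leaving the rest for all other writes, staging requests and data migration.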
Final Remarks II
• gStore fully scalable in data capacity and I/O bandwidth
  • supports several TSM servers
• gStore fully flexible in hardware (TSM)
• in the past years:
  – managed growth of more than an order of magnitude
  – handled various hardware and platforms
• gStore prepared for further growth (FAIR)
• gStore adaptable for cooperation with external software packages