IT Technical Forum - 12.06.2015
Luca Mascetti & Hugo Gonzalez Labrador, Data & Storage Services (IT-DSS)
Outline
• Description and Architecture
• User Community and Service Numbers
• Technical Aspects & Integration
• Success Stories & Future Use-cases
• Overview and Summary
Description and Architecture
What is CERNBox?
CERNBox provides a cloud synchronisation service
• Available for all CERN users
• Synchronise files (data at CERN)
• Offline data access
• Easy way to share with other users
• All major platforms supported
• Based on ownCloud
• Uses EOS as storage backend
  • EOS is the disk storage for physics data
  • 70 PB of installed usable capacity
  • 50% at Wigner
What is CERNBox?
powered by ownCloud and EOS
CERNBox Architecture
[Diagram: Sync Client and Web Access go through HTTPS load balancers to EOSUSER (ownCloud shares over https/webdav); Direct Data Access to the storage backend via FUSE, xroot, gridftp, http, S3]
Synchronisation from other EOS instances with the Sync Client is also possible
Available Access Methods
• Web Access
• Sync Client
• Mobile App
• WebDAV
• Directly from the storage backend: EOSUSER (xroot, http, s3, …)
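As a sketch of the WebDAV route, here is a minimal Python example that lists a folder through CERNBox's WebDAV interface. The endpoint path (the standard ownCloud layout), the folder name and the credentials are assumptions for illustration, not the confirmed service configuration.

```python
# Minimal sketch of WebDAV access to CERNBox: list a folder with PROPFIND.
# The endpoint path follows the standard ownCloud layout; the account name,
# folder and credentials are hypothetical examples.
import requests

url = "https://cernbox.cern.ch/remote.php/webdav/Documents/"
resp = requests.request(
    "PROPFIND",                    # WebDAV listing method
    url,
    auth=("jdoe", "secret"),       # NICE account credentials (placeholder)
    headers={"Depth": "1"},        # direct children only, not the whole tree
)
resp.raise_for_status()
print(resp.text)                   # multistatus XML, one <d:response> per item
```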
User Community and Service Numbers
CERNBox User Community
User community very active
• Very positive feedback
• Several useful suggestions
• Important contributions
• Users happy to help testing new features
Service numbers
• ~2350 users
• ~3000 shares
• ~20 million files stored
[Pie chart: user community split 60% / 20% / 20% across Physicists, Engineers, and Services & Administration]
CERNBox Service Numbers
[Chart: Deployed Space and Used Space, March 2014 to May 2015, log scale from 10 GB to 10 PB]
Users: ~2350
Files: 20 million
Directories: 1.8 million
Quota: 1 TB/user
Used Space: 55 TB
Deployed Space: 1.3 PB
Migration from NFS to EOS
EOS offers "virtually unlimited" cloud storage for our end-users
The total EOS installation at CERN is around 70 PB usable, with the primary role of storing physics data
0"
500"
1000"
1500"
2000"
2500"
Mar)14"
Apr)14"
May)14"
Jun)14"
Jul)14"
Aug)14"
Sep)14"
Oct)14"
Nov)14"
Dec)14"
Jan)15"
Feb)15"
Mar)15"
Apr)15"
May)15"
Users%
0"
5"
10"
15"
20"
25"
30"
35"
4:00"
6:00"
8:00"
10:00"
12:00"
14:00"
16:00"
18:00"
20:00"
22:00"
0:00"
2:00"
4:00"
Hz#
Daily#User#Access#Pa0ern#
0"
10"
20"
30"
40"
50"
1"
Hz#
CERNBox#Weekly#User#Access#Pa7ern#
Current System Usage
12
Sun SunMon Tue Wed Thu Fri Sat
Dinner
Lunch Break
Late night work?
0"Hz"
400"Hz"
800"Hz"
1200"Hz"
7:42:40&
7:42:45&
7:42:50&
7:42:55&
7:43:00&
7:43:05&
7:43:10&
Peak Requests at 1.1kHz
and the system can sustain much more
Technical Aspects and Integration
Technical Aspects
• Innovative integration of user environments & huge data repositories
• Integration of CERNBox with:
  • CERN SSO (in testing)
  • E-Groups (in testing)
  • ROOT Viewer (in testing)
• Architecture: ownCloud vs. CERNBox
• Cool features: Trash & Versions
• Testing CERNBox: SmashBox
Integration with CERN SSO
Nested e-groups
Embedded ROOT Viewer
The viewer is based on the ROOT data analysis framework developed at CERN by PH-SFT.
Integration done by the CERNBox team.
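To give an idea of what the viewer displays, here is a minimal PyROOT sketch that opens a ROOT file and inspects its contents. The file path and histogram name are hypothetical examples, not objects from the talk.

```python
# Minimal PyROOT sketch: open a ROOT file and inspect the objects the
# embedded viewer would render. File path and histogram name are hypothetical.
import ROOT

f = ROOT.TFile.Open("root://eosuser.cern.ch//eos/user/j/jdoe/analysis.root")
for key in f.GetListOfKeys():
    print(key.GetName(), key.GetClassName())   # objects the viewer would show

h = f.Get("h_pt")                               # fetch a histogram by name
if h:
    print("entries:", h.GetEntries())
f.Close()
```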
DEMO
Vanilla ownCloud
[Diagram: the Web Application Server keeps NAMESPACE and ACLs in a Database, with the DATA on separate Storage; metadata ops go to the database, data ops to the storage; a background scan and population is needed to keep the namespace and ACLs consistent]
ownCloud Namespace
CERNBox
[Diagram: the Web Application Server uses the CERNBox metadata plugin to talk directly to the primary object store (EOS), which holds NAMESPACE, ACLs and DATA together; metadata ops and data ops both go to the storage, ACLs stay consistent, and files keep their real owner (no apache/www-data)]
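As an illustration of metadata operations going straight to the storage, here is a minimal sketch using the XRootD Python bindings against the EOS namespace. The host and path are illustrative, not the confirmed production endpoints.

```python
# Sketch of a metadata operation hitting the EOS namespace directly,
# with no intermediate database, as in the CERNBox metadata plugin.
# Host and path are illustrative, not the confirmed production endpoints.
from XRootD import client

fs = client.FileSystem("root://eosuser.cern.ch")
status, listing = fs.dirlist("/eos/user/j/jdoe")
if status.ok:
    for entry in listing:
        print(entry.name)          # namespace entries, owned by the real user
else:
    print("dirlist failed:", status.message)
```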
Cool features
Versions feature
Trash bin feature
SmashBox
• Extensive test framework
  • Developed by the CERNBox team
  • Validates integration + operational state
  • Avoids regressions
• Successful outside CERN
  • External contributions
  • Other sites use it (e.g. SWITCH)
  • Part of the QA cycle of ownCloud
https://github.com/cernbox
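For flavour, here is a toy sketch of the idea behind such sync tests. This is not the actual SmashBox API, just an illustration: after two sync clients have run against the same account, their local trees must converge to identical content.

```python
# Toy illustration of the idea behind SmashBox-style sync tests (this is
# NOT the SmashBox API): after two sync clients have run against the same
# account, their local trees must converge to identical content.
import hashlib
import os

def tree_digest(root):
    """Combine every file name and its bytes under root into one digest."""
    h = hashlib.sha1()
    for dirpath, _, files in sorted(os.walk(root)):
        for name in sorted(files):
            h.update(name.encode())
            with open(os.path.join(dirpath, name), "rb") as f:
                h.update(f.read())
    return h.hexdigest()

# hypothetical local sync folders of the two clients
assert tree_digest("/tmp/client_a") == tree_digest("/tmp/client_b")
```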
Success Stories
E-Science
Thanks to Mauro Arcorace, members of UNITAR/UNOSAT and the CIMA Foundation for the material provided
Problem:
- How to get data at CERN easily?
- How to easily use CERN resources? (e.g. for non-physicists)
CERNBox is an easy way to integrate our storage resources for non-expert end-users who may use different OSs
Using EOS as the backend allows access to the data from batch nodes or from other locations via https or xroot
…and it’s very simple to share results with collaborators…
Run 2 Event: Photo Sharing
[Diagram: photographers at ALICE, ATLAS, CMS, the CCC, LHCb and DG-CO synchronising into CERNBox]
CERNBox was used to synchronise photos between the photographers and the communication team
After the selection, the photos were uploaded to CDS
Videos: Pre-Release to the Press Office
Big video files uploaded to EOS with xrdcp
CERNBox used by the Press Office to share videos immediately with the media for download
After encoding, videos were published on CDS and archived
CERNBox was tweaked to support large files (~30 GB) and to redirect downloads directly to our storage nodes on EOS (with replication 3) to sustain peak requests
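A minimal sketch of the upload step, driving xrdcp from Python; the source file, EOS host and destination path are illustrative examples.

```python
# Sketch of the upload step: push a large video file into EOS with xrdcp.
# Source file, EOS host and destination path are illustrative examples.
import subprocess

src = "/data/videos/run2-first-event.mov"
dst = "root://eosuser.cern.ch//eos/user/p/press/videos/run2-first-event.mov"
subprocess.run(["xrdcp", src, dst], check=True)   # raises on transfer failure
```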
Future Use-cases: lxplus and lxbatch Integration
WORK IN PROGRESS
[Diagram: input synchronisation into the batch farm, JOB processing over a "Kerberised" FUSE mount, lazy output synchronisation back, with "Choose What to Sync" on the client]
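A hypothetical sketch of what a job in this workflow could look like: it reads its input from and writes its output to the kerberised FUSE mount of EOSUSER, and the sync client picks the output up afterwards. The mount point and file names are assumptions.

```python
# Hypothetical sketch of the lxbatch workflow: the job reads its input from
# and writes its output to the kerberised FUSE mount of EOSUSER; the sync
# client later picks the output up (lazy output synchronisation).
# Mount point and file names are assumptions.
from pathlib import Path

workdir = Path("/eos/user/j/jdoe/jobs/run_001")
lines = (workdir / "input.txt").read_text().splitlines()    # synchronised in
result = "\n".join(line.upper() for line in lines)          # stand-in processing
(workdir / "output.txt").write_text(result)                 # synchronised out
```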
DEMO
Overview and Summary
Features Overview
• Direct access to EOSUSER (and not only…)
  • not only Sync Client & Web
  • xroot, fuse, http/WebDAV
• e-group and SSO integration
• Access to Physics Data
  • synchronise experiment data
• ROOT file viewer
• Shared kerberised fuse access from lxplus & lxbatch
Summary
• New service
  • Fast growing
  • Very good feedback
• Full integration with petabyte storage
  • Integration with existing workflows
• Bring data closer to our users
  • New way to interact with your data
• We believe CERNBox is an innovative platform for scientific computing
Not yet a user? Try it out!
Login with your NICE account: https://cernbox.cern.ch
Download the Client or the App: https://cern.ch/cernbox-resources