A Multidisciplinary Computer Centre … is it possible? John Gordon CCLRC eSC CHEP March 2003.
A Multidisciplinary Computer Centre
… is it possible?
John Gordon
CCLRC eSC
CHEP March 2003
John Gordon
eScience Centre
The Problem?
A UK colleague, quoted a few years ago when Linux for physics was just becoming common:
“We have four Linux systems: one for users to login, one for CERN Linux, one for DESY Linux, one for Fermilab Linux. And I think we will need one for BaBar Linux soon”
• Things have changed, but by how much?
• Many of the talks in this session describe implementing a solution for one experiment, but the staff requirements of such a solution scale with the number of experiments supported, and the fragmentation of resources is inefficient.
• Can we run a single centre for everyone?
LHC Hierarchical Model
[Figure: the LHC hierarchical computing model. Labels recoverable from the diagram:]
• Nodes, from top to bottom: Online System; Offline Processor Farm (~20 TIPS); CERN Computer Centre; Regional Centres (UK, US, Italy, Germany); Tier2 Centres (~1 TIPS each: London, NorthGrid, ScotGrid); Institutes (~0.25 TIPS), each with a physics data cache; physicist workstations
• Tier labels: Tier 0, Tier 1, Tier 2, Tier 4
• Link labels: ~PBytes/sec; ~100 MBytes/sec; ~622 Mbits/sec; ~622 Mbits/sec or Air Freight (deprecated); ~1 MBytes/sec
• There is a “bunch crossing” every 25 nsecs.
• There are 100 “triggers” per second.
• Each triggered event is ~1 MByte in size.
• Physicists work on analysis “channels”.
• Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
• 1 TIPS is approximately 25,000 SpecInt95 equivalents.
lcg.web.cern.ch/lcg
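The rates quoted on this slide can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the slide's own approximate numbers (25 ns bunch spacing, 100 triggers per second, ~1 MByte per event, 1 TIPS ≈ 25,000 SpecInt95), and the constant and function names are invented for this example.

```python
# Back-of-envelope check of the numbers on the LHC model slide.
# All names are illustrative; the values are the slide's own approximations.

BUNCH_SPACING_NS = 25        # one bunch crossing every 25 ns
TRIGGER_RATE_HZ = 100        # triggered events per second
EVENT_SIZE_MBYTES = 1.0      # ~1 MByte per triggered event
SPECINT95_PER_TIPS = 25_000  # 1 TIPS is roughly 25,000 SpecInt95


def crossings_per_second(spacing_ns: float) -> float:
    """Number of bunch crossings per second for a given spacing in ns."""
    return 1e9 / spacing_ns


def triggered_data_rate_mbytes(trigger_rate_hz: float, event_size_mbytes: float) -> float:
    """Data rate in MBytes/sec after the trigger has selected events."""
    return trigger_rate_hz * event_size_mbytes


def tips_to_specint95(tips: float) -> float:
    """Convert a capacity in TIPS to SpecInt95 equivalents."""
    return tips * SPECINT95_PER_TIPS


def trigger_fraction(trigger_rate_hz: float, spacing_ns: float) -> float:
    """Fraction of bunch crossings that end up as triggered events."""
    return trigger_rate_hz / crossings_per_second(spacing_ns)
```

With the slide's figures this gives 40 million crossings per second and a triggered data rate of 100 MBytes/sec, which is consistent with the ~100 MBytes/sec link label in the diagram. The ~20 TIPS offline farm corresponds to about 500,000 SpecInt95 equivalents.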
Sitting in the Centre
A site like ours sits between many experiments and grids
[Figure labels: Future LHC Experiments; Running US Experiments; LCG]
The multi-experiment centre
• So what does a big centre look like these days?
• A big Linux cluster and lots of disk?
• Many types of hardware
• All flavours of Unix (still VMS!!)
• All uses from desktop to supercomputer
• Different disks (SCSI, IDE, RAID, SAN)
• Different tapes
• Different user communities
The multi-experiment centre
• Unlikely to be able to run a centre for all disciplines if we cannot even run one for all HEP experiments
• This talk focuses on the problems of supporting many different HEP experiments
Not a problem
• Lots of hardware problems, but the same ones as big and small centres
• Lots of anecdotes about hardware problems but sharing between experiments hasn’t been an issue recently.
– Apart from Suns for BaBar
– and we backed away from AMD once because an experiment wouldn’t accept them.
The problems
• Software levels
• ‘experts’
• Local rules
• Security
• Firewalls
• The accelerator centres
Software Levels
• Experiment A must upgrade the OS (or compiler, etc), Experiment B cannot.
• Linux brings more hardware dependencies
– Experiment A needs one kernel, Fibre Channel driver only available in another
• Now we have middleware too!!
– Experiments can disagree over middleware and OS.
– And the middleware might not match the OS
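The conflict described on this slide amounts to an empty intersection: the site can only install an OS level that every experiment, and the middleware, can run on. The sketch below illustrates that check under stated assumptions. The party names and the labels `os-old` and `os-new` are hypothetical placeholders, not real releases.

```python
from typing import Dict, Iterable, Set


def common_platforms(supported: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the OS levels that every listed party is able to run on.

    `supported` maps a party (experiment, middleware, driver vendor, ...)
    to the OS levels it supports. An empty result means the site cannot
    satisfy everyone with a single installation.
    """
    if not supported:
        return set()
    return set.intersection(*(set(levels) for levels in supported.values()))


# Hypothetical situation from the slide: A must upgrade, B cannot.
stuck = {
    "ExperimentA": {"os-new"},
    "ExperimentB": {"os-old"},
    "middleware": {"os-old", "os-new"},
}

# Same site once Experiment A also supports the older level.
resolved = dict(stuck, ExperimentA={"os-old", "os-new"})
```

Here `common_platforms(stuck)` is empty, while `common_platforms(resolved)` contains only `os-old`. Each additional party can only shrink the set, which is why the problem grows with the number of experiments supported.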
‘experts’
• A 200GB disk costs $100 in Best Buy
• Therefore 100TB should cost $50K
• If you pay more, you are profligate and are wasting HEP funds!!!
• … and you should probably be able to negotiate a further discount for bulk purchase!
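The ‘expert’ arithmetic is easy to reproduce: 100TB at 200GB per disk is 500 disks, and 500 disks at $100 each is $50K. The sketch below does that calculation. The second function is an assumption of this sketch, not something from the talk: it shows how a redundancy overhead and a per-disk infrastructure cost (servers, controllers, power and so on) change the total. Both are placeholder parameters with no real prices behind them.

```python
import math


def disks_needed(capacity_tb: float, disk_gb: float) -> int:
    """Number of disks needed for a raw capacity, using 1 TB = 1000 GB."""
    return math.ceil(capacity_tb * 1000 / disk_gb)


def naive_cost(capacity_tb: float, disk_gb: float, disk_price: float) -> float:
    """The 'expert' estimate: bare disks only, nothing else."""
    return disks_needed(capacity_tb, disk_gb) * disk_price


def cost_with_overheads(
    usable_tb: float,
    disk_gb: float,
    disk_price: float,
    redundancy_fraction: float,
    infrastructure_per_disk: float,
) -> float:
    """Estimate including two hypothetical overheads.

    redundancy_fraction: share of raw capacity lost to mirroring or parity
        (0.5 would mean full mirroring). Placeholder parameter.
    infrastructure_per_disk: cost of everything around each disk.
        Placeholder parameter, not a real price.
    """
    if not 0 <= redundancy_fraction < 1:
        raise ValueError("redundancy_fraction must be in [0, 1)")
    raw_tb = usable_tb / (1 - redundancy_fraction)
    return disks_needed(raw_tb, disk_gb) * (disk_price + infrastructure_per_disk)
```

`naive_cost(100, 200, 100)` reproduces the slide's $50K. With purely illustrative overheads, for instance full mirroring and an extra $50 of infrastructure per disk, `cost_with_overheads(100, 200, 100, 0.5, 50)` comes to $150K for the same usable capacity.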
Local Rules
• A responsible site probably has a policy for who can use its resources, with forms, acceptable use conditions and other safeguards.
• Most countries have legal obligations to trace users in case of law-breaking.
• Do we really want them to throw these away for the grid?
• Even if we want to, only a purely HEP lab can overrule the rules themselves
– Even they usually have masters (DoE……)
Security - Why Do We Care?
• Illegal use of resources (stolen software, child pornography ..)
• Base for high bandwidth attack on other targets (commercial, government ..)
• Unauthorised access to local data (data protection, financial info …)
• Health and safety: eg beam-line control
• Destruction of local data, disruption of local service
• Gain passwords, keys to attack peer sites
Security
• Most security issues are common to all sites
• Issues especially relevant here are:
– Accelerator Centres (see earlier)
– Distributed computing crosses security boundaries
  • Authentication models, trust
– Remote users less attached to your integrity
  • Shared usernames – how can you trace?
– Software often under active development
  • Smaller user community and many fewer developers than (eg) Apache
Why Do We Need a Firewall?
You do not need a firewall if:
• Either: you have perfect (bug free) operating systems and you have infallible system administrators AND users
• Or: you don’t care if you have security incidents (unauthorised access to resources)
How Do Hackers Break In?
• Coding errors in server software:
– Buffer overflows: give more than expected (poor bounds checking)
– Provide unexpected control info (eg append unexpected commands)
• Trojans and viruses – backdoors
• Inadequate access control. Eg:
– NFS export root filesystem R/W to world
– https server allows googlebot access to control menus …file … delete …really delete … !!!
• Scanning rate: hundreds per minute
Common Firewall Policies
• Don’t bother! Very unlikely…disasters!
• Simple exclusion of some protocols. Eg prevent SNMP off site.
• Only allow some protocols
– eg only allow kerberised or encrypted protocols.
• Protected host ranges
– eg keep some hosts/networks safe
• Protect large ranges of ports
– eg privileged port range.
• Access control by host/port
• Different sites – probably different policy!
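Several of the policies listed on this slide can be combined into one decision function. The sketch below is a minimal illustration in Python, not how any particular firewall product is configured. The `Policy` class, the rule order and the addresses (taken from the reserved documentation ranges) are all assumptions made for the example. It covers excluding SNMP across the site boundary, protected host ranges, the privileged port range (below 1024) and explicit host/port exceptions.

```python
import ipaddress
from dataclasses import dataclass
from typing import FrozenSet, Tuple

SNMP_PORTS = {161, 162}            # standard SNMP and SNMP-trap ports
PRIVILEGED_PORTS = range(1, 1024)  # the Unix privileged port range


@dataclass(frozen=True)
class Policy:
    """Hypothetical site policy; field names are invented for this sketch."""
    site_network: ipaddress.IPv4Network
    protected_networks: Tuple[ipaddress.IPv4Network, ...]
    allowed_inbound: FrozenSet[Tuple[ipaddress.IPv4Address, int]]


def allow_inbound(policy: Policy, src: str, dst: str, port: int) -> bool:
    """Decide whether a connection from src to dst:port may pass."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    if src_ip in policy.site_network:
        return True   # on-site traffic does not cross the boundary
    if port in SNMP_PORTS:
        return False  # simple exclusion of one protocol
    if (dst_ip, port) in policy.allowed_inbound:
        return True   # access control by host/port: explicit exception
    if any(dst_ip in net for net in policy.protected_networks):
        return False  # protected host range
    if port in PRIVILEGED_PORTS:
        return False  # protect a large range of ports
    return True


example_policy = Policy(
    site_network=ipaddress.ip_network("192.0.2.0/24"),
    protected_networks=(ipaddress.ip_network("192.0.2.128/25"),),
    allowed_inbound=frozenset({(ipaddress.ip_address("192.0.2.10"), 22)}),
)
```

The order of the checks is itself a policy choice. A different site would arrange them differently, which is the slide's final point.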
The accelerator centres
You will run
• Our Linux
• Our software
• Our middleware
• Our applications
• Our security model
• Don’t bother us with your local restrictions or firewalls
Oh, and by the way, you’ll give us root access to your machines to install it and sort out any problems
The Answers
• …… so far
• I hope I can learn more this week
Software levels
• Will never get hardware vendors to remove dependence on OS
• Lobby middleware developers to be OS independent
– and to keep up reasonably quickly with latest releases
• Experiment developers should code to support multiple versions of everything
– Don’t rush to use new features
‘experts’
• Ignore
• Politely tell them to ‘go away’
• Explain the realities of 24x365 use
• Ask them to demonstrate their solutions
– And be prepared to accept if they are correct
• Evaluate the most likely of their suggestions
Local Rules (BaBar/RAL example)
• RAL is a TierA centre for BaBar
• BaBar users have already signed up to conditions for SLAC, BaBar, & Objectivity
• They get an X.509 certificate
• Sign EDG acceptable use conditions
• Users are made aware of RAL-specific issues – network traffic might be monitored
• RAL is happy that they know who the users are and can trace them.
• They are allowed to run as grid users
Local Rules
• Use other sites as examples
• Common acceptable use policies
– The more sites involved in writing them, the more likely they are to be ‘acceptable’
• Get ACs to act as legal entity for a VO
– Need to trust the integrity of the VO
– Local admins feel better if they can sue someone
  • Don’t tell them they have no chance of suing CERN
Security
• Educate users through their sysadmins. Make them aware of the risks and responsibilities
• PKI and the Grid offer ‘roles’ and ‘groups’, so someone can act as production simulation manager but still be identifiable.
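The point about roles can be sketched as follows: several people may map to the same production account, but each mapping is logged against the individual's certificate subject, so the shared account stays traceable. This is a simplified illustration only. The class, the account table and the names in it are hypothetical and do not reflect the interface of any real grid middleware.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class GridIdentity:
    """What the site learns from a user's credentials (hypothetical model)."""
    subject: str  # certificate subject, unique to one person
    group: str    # group or VO the user belongs to
    role: str     # role the user is acting in for this job


# Hypothetical site table: (group, role) -> shared local account.
ROLE_ACCOUNTS: Dict[Tuple[str, str], str] = {
    ("examplevo", "production"): "exprod",
}


def map_to_account(
    identity: GridIdentity,
    role_accounts: Dict[Tuple[str, str], str],
    audit_log: List[Tuple[str, str]],
) -> str:
    """Return the local account for this role and record who used it."""
    account = role_accounts.get((identity.group, identity.role))
    if account is None:
        raise PermissionError(f"no account for {identity.group}/{identity.role}")
    # The shared account is recorded against the individual's subject,
    # so a job can be traced back to a person.
    audit_log.append((identity.subject, account))
    return account
```

Two production managers with different certificates would both run as `exprod`, while the audit log still holds one entry per person. That addresses the earlier “shared usernames – how can you trace?” concern.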
Firewalls
• One can often persuade local network admin to make an exception once.
– But not many times
• Establish trust of your network admin
– Convince them that you take security seriously.
– Less likely to achieve this if your machines are regularly broken into.
• Experiment and middleware developers need to address firewall issues in their design
– Security Group of LCG might help here.
The accelerator centres
• They are not used to being questioned.
– Put them face to face to resolve clashes
• HEPiX is a good forum for this. Successes so far…..
– AFS, profiles
– Large Cluster Workshop
– Surveys on firewalls support….
• But the grid has been a step back
– Different centres, different grids.
The accelerator centres
• This problem works against the experiments’ interests.
• Experiments should take more control over their software environments, take their own compilers and libraries with them.
• Lobby for standard distributions
– and use them
Summary
• It is possible to take the first steps towards a truly multidisciplinary computer centre
– Starting with HEP
• Labs and experiments need to talk and adopt new/common practices
– Need a culture of collaboration in many dimensions
– Lab-lab, experiment-experiment, and experiment-labs
• Don’t forget that your experiment/software/middleware is not the only one and some poor ****** is having to cope with them all.