CE+WN+siteBDII Installation and configuration

www.epikh.eu The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) CE+WN+siteBDII Installation and configuration Bouchra RAHIM([email protected]) Africa 6 2010 - Joint EUMEDGRID-Support/EPIKH School for Grid Site Administrators Rabat, 01.06.2011


Page 1: CE+WN+siteBDII  Installation and configuration

www.epikh.eu

The EPIKH Project(Exchange Programme to advance e-Infrastructure Know-How)

CE+WN+siteBDII Installation and configuration

Bouchra RAHIM([email protected])

Africa 6 2010 - Joint EUMEDGRID-Support/EPIKH School for Grid Site Administrators

Rabat, 01.06.2011

Page 2: CE+WN+siteBDII  Installation and configuration

Outline

• Computing Element overview
• Worker Node overview
• CE CREAM overview
• gLite stack overview
• gLite CE siteBDII
• gLite CE cream and WN

Page 3: CE+WN+siteBDII  Installation and configuration


gLite stack overview

Page 4: CE+WN+siteBDII  Installation and configuration


gLite overview

worker node

Page 5: CE+WN+siteBDII  Installation and configuration

gLite overview

• User Interface (UI): the point of access for users to gLite grid services
• WMS: the component that optimizes resource usage
• CE: the machine that manages the worker nodes
• WN: the machines that actually execute applications
• SE: the machines where files are stored
• LFC: used to "find" files on the grid
• BDII: the service responsible for publishing all information about your site
• Logging and Bookkeeping: as its name says, it logs job events and alerts the user when a job is finished

Page 6: CE+WN+siteBDII  Installation and configuration

Computing Element Overview

• The Computing Element provides some of the main services of a site.
• Main functionalities:
– job management (job submission, job control)
– job status updates for the WMS
– communication with the site BDII, which publishes all information regarding the Computing Element
• It can run several kinds of batch system:
– Torque + MAUI
– LSF
– SGE
– Condor

Page 7: CE+WN+siteBDII  Installation and configuration

Torque + MAUI

• Torque server service:
– pbs_server provides basic batch services such as receiving/creating a batch job.
• Torque client service:
– pbs_mom places jobs into execution. It is also responsible for returning the job's output to the user.
• MAUI system service:
– the job scheduler contains the site's policy for deciding which job is going to be executed and when.

Page 8: CE+WN+siteBDII  Installation and configuration

Site BDII*

• By default it used to be installed on the CE, but it is now better to install it on a dedicated server, physical or virtual.
• It collects information from all the site GRISes** (for example SE, RB, LFC, etc.)
• The service is named bdii
• Log file: /opt/bdii/var/bdii.log
• *BDII = Berkeley Database Information Index
• **GRIS = Grid Resource Information Service

Page 9: CE+WN+siteBDII  Installation and configuration

Worker Node Element Overview

• Worker Nodes are the machines that actually execute your job.
• Users can only access their services through a Computing Element.
• Their characteristics are collected by the Computing Element, which publishes all the information through the BDII services

Page 10: CE+WN+siteBDII  Installation and configuration

CE Cream overview

• CREAM: Computing Resource Execution And Management
• It accepts job submission requests coming from a WMS, as well as other job management requests.
• It exposes a web service interface

Page 11: CE+WN+siteBDII  Installation and configuration

Requirements

• Three or more machines:
– One will be used for the CE installation;
– One will be used for the site BDII installation;
– The others will be used for the WN installation;
• Architecture: 64 bit
• Operating System: Scientific Linux 5
• Two machines (CE and BDII) with a public IP address and direct and reverse address resolution in the DNS
• The CE machine must be equipped with an X509 certificate

Page 12: CE+WN+siteBDII  Installation and configuration

BDII Installation

Page 13: CE+WN+siteBDII  Installation and configuration

Preparing the Linux machine

• Network Time Protocol settings
# yum install ntp
• Copy the ntp.conf file and the ntp directory from ftp://repo.magrid.ma/pub/CE_WN_BDII/ to /etc/ (WinSCP)
• Synchronize the date
# /etc/init.d/ntpd stop
# ntpdate ntp.marwan.ma
• Start the ntpd service and configure it to start on boot
# /etc/init.d/ntpd start
# chkconfig ntpd on

Page 14: CE+WN+siteBDII  Installation and configuration

Preparing the Linux machine

• Disable SELinux: make sure /etc/selinux/config contains the line:
SELINUX=disabled
• Stop iptables
# /etc/init.d/iptables stop
# chkconfig iptables off
• Check that you have a valid hostname
# hostname -f
# cat /etc/hosts
• Reboot
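
The two checks above (SELinux setting and hostname) can be scripted before rebooting. The following is only a minimal sketch and is not part of the original slides: the function names selinux_disabled and is_fqdn are hypothetical helpers, and the demonstration runs against a temporary file instead of the real /etc/selinux/config.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given SELinux config file
# contains an uncommented line "SELINUX=disabled".
selinux_disabled() {
    grep -Eq '^[[:space:]]*SELINUX=disabled[[:space:]]*$' "$1"
}

# Hypothetical helper: succeed if the name looks fully qualified,
# i.e. it has at least two non-empty labels separated by dots.
is_fqdn() {
    local re='^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$'
    [[ "$1" =~ $re ]]
}

# Demonstration on a temporary file (on a real node, pass /etc/selinux/config).
demo_cfg=$(mktemp)
printf 'SELINUX=disabled\nSELINUXTYPE=targeted\n' > "$demo_cfg"
selinux_disabled "$demo_cfg" && echo "SELinux is disabled in $demo_cfg"
is_fqdn "pc22.magrid.ma" && echo "pc22.magrid.ma looks fully qualified"
```

On a real machine the same helpers could be called as selinux_disabled /etc/selinux/config and is_fqdn "$(hostname -f)".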

Page 15: CE+WN+siteBDII  Installation and configuration

Repository set up-BDII

• Add the repositories specific to the middleware being installed to the system repositories
# cd /etc/yum.repos.d/
# mv dag.repo dag.repo.stop
# export MREPO=http://repo.magrid.ma/yumrepo/glite32
# REPOS="dag lcg-CA glite-BDII_site"
# for name in $REPOS; do wget $MREPO/$name.repo -O /etc/yum.repos.d/$name.repo; done
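
Before touching /etc/yum.repos.d/ it can help to preview what the download loop will do. The sketch below is an illustration only: it reuses the MREPO and REPOS values from the slide, while the function name print_repo_commands is a hypothetical helper that prints the wget commands without running them.

```shell
#!/bin/bash
# Base URL and repository list as given on the slide.
MREPO=http://repo.magrid.ma/yumrepo/glite32
REPOS="dag lcg-CA glite-BDII_site"

# Hypothetical helper: print one wget command per repository.
# Nothing is downloaded here; this is a dry run.
print_repo_commands() {
    local name
    for name in $REPOS; do
        echo "wget $MREPO/$name.repo -O /etc/yum.repos.d/$name.repo"
    done
}

print_repo_commands
```

The same function works for the CE and WN machines by changing only the REPOS value.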

Page 16: CE+WN+siteBDII  Installation and configuration

package installation-BDII

• Use yum to install the needed packages
# yum install lcg-CA ca-policy-egi-core ca-policy-lcg
# yum install glite-BDII_site

Page 17: CE+WN+siteBDII  Installation and configuration

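Yaim Configuration

• All the sample configuration files are located in the /opt/glite/yaim/examples/siteinfo directory
• It is better to work on a copy of the original files rather than editing them in place (the yaim commands in these slides read /opt/glite/yaim/etc/siteinfo/site-info.def)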

Page 18: CE+WN+siteBDII  Installation and configuration

Yaim Configuration

• You can find some template files in: ftp://repo.magrid.ma/pub/CE_WN_BDII/
• Edit the site-info.def file and change the following variables:
– SITE_NAME=MA-ZZ-School (name of the site)
– CE_HOST=pcXX.magrid.ma (XX: the machine that will be the CE)
– SITE_BDII_HOST=pcYY.magrid.ma (the current machine)
• Edit the services/glite-bdii_site file and change the following variables:
– SITE_NAME=MA-ZZ-School
– SITE_DESC="MA-ZZ-School"
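
The variable edits listed above can also be applied non-interactively. This is a sketch under assumptions, not the procedure from the slides: set_var is a hypothetical helper, it relies on GNU sed (-i), and it is demonstrated on a scratch file with placeholder starting values rather than on the real site-info.def.

```shell
#!/bin/bash
# Hypothetical helper: set KEY=VALUE in a YAIM-style file,
# replacing an existing assignment or appending a new one.
# Values containing "|" or "&" would need extra escaping.
set_var() {
    local file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        echo "${key}=${value}" >> "$file"
    fi
}

# Demonstration on a scratch copy; the initial values are placeholders.
demo_def=$(mktemp)
printf 'SITE_NAME=changeme\nCE_HOST=changeme\n' > "$demo_def"
set_var "$demo_def" SITE_NAME MA-ZZ-School
set_var "$demo_def" CE_HOST pcXX.magrid.ma
set_var "$demo_def" SITE_BDII_HOST pcYY.magrid.ma
cat "$demo_def"
```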

Page 19: CE+WN+siteBDII  Installation and configuration

Yaim Configuration-BDII

• Run the configuration command:
# /opt/glite/yaim/bin/yaim -c -s /opt/glite/yaim/etc/siteinfo/site-info.def -n glite-BDII_site
• If everything is OK, run a basic test:
– ldapsearch -x -h pcYY.magrid.ma -p 2170 -b "mds-vo-name=local,o=grid"
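
A quick way to judge the result of the ldapsearch test is to count the entries it returns. The sketch below is an assumption-laden illustration: count_entries is a hypothetical helper, and the sample LDIF is hand-written for the demonstration, not captured from a real site BDII.

```shell
#!/bin/bash
# Hypothetical helper: count the entries in LDIF text read from stdin.
# Every LDIF entry starts with a "dn:" line.
count_entries() {
    grep -c '^dn:'
}

# Illustrative input only (hand-written, not real BDII output).
sample_ldif='dn: mds-vo-name=local,o=grid
objectClass: MDS

dn: mds-vo-name=resource,mds-vo-name=local,o=grid
objectClass: MDS'

echo "entries found: $(printf '%s\n' "$sample_ldif" | count_entries)"
```

On the BDII machine the helper would be fed by the slide's own command, for example ldapsearch -x -h pcYY.magrid.ma -p 2170 -b "mds-vo-name=local,o=grid" | count_entries; a count of zero suggests the BDII is not publishing.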

Page 20: CE+WN+siteBDII  Installation and configuration

CE Cream Installation (on Torque/PBS)

Page 21: CE+WN+siteBDII  Installation and configuration

Preparing the Linux machine

• Network Time Protocol settings
# yum install ntp
• Copy the ntp.conf file and the ntp directory from ftp://repo.magrid.ma/pub/CE_WN_BDII/ to /etc/ (WinSCP)
• Synchronize the date with an NTP server
# /etc/init.d/ntpd stop
# ntpdate ntp.marwan.ma
• Start the ntpd service and configure it to start on boot
# /etc/init.d/ntpd start
# chkconfig ntpd on

Page 22: CE+WN+siteBDII  Installation and configuration

Preparing the Linux machine

• Disable SELinux: make sure /etc/selinux/config contains the line:
SELINUX=disabled
• Stop iptables
# /etc/init.d/iptables stop
# chkconfig iptables off
• Check that you have a valid hostname
# hostname -f
# cat /etc/hosts
• Reboot

Page 23: CE+WN+siteBDII  Installation and configuration

Repository set up-CE

• Add the repositories specific to the middleware being installed to the system repositories
# cd /etc/yum.repos.d/
# mv dag.repo dag.repo.stop
# export MREPO=http://repo.magrid.ma/yumrepo/glite32
# REPOS="dag lcg-CA glite-CREAM glite-TORQUE_server glite-TORQUE_utils"
# for name in $REPOS; do wget $MREPO/$name.repo -O /etc/yum.repos.d/$name.repo; done

Page 24: CE+WN+siteBDII  Installation and configuration

package installation-CE

• Use yum to install the needed packages
# yum clean all
# yum install lcg-CA ca-policy-egi-core ca-policy-lcg
• Because of a dependency problem within the Tomcat distribution in SL5, install xml-commons-apis first:
# yum install xml-commons-apis
# yum install glite-CREAM
# yum install glite-TORQUE_server glite-TORQUE_utils

Page 25: CE+WN+siteBDII  Installation and configuration

Before configuration-HostCertificates

• Some preliminary steps before configuration:
- copy the host certificate and key to the default path:
# cd
# mv /root/pcXXcert.pem /etc/grid-security/hostcert.pem
# mv /root/pcXXkey.pem /etc/grid-security/hostkey.pem
# chmod 400 /etc/grid-security/hostkey.pem
# chmod 600 /etc/grid-security/hostcert.pem
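
After moving the certificate files it is worth confirming that the permissions really are the ones set above. This is a minimal sketch, assuming GNU stat (stat -c); has_mode is a hypothetical helper, and scratch files stand in for the real files under /etc/grid-security.

```shell
#!/bin/bash
# Hypothetical helper: succeed if FILE has exactly the octal mode MODE.
has_mode() {
    [ "$(stat -c '%a' "$1")" = "$2" ]
}

# Demonstration on scratch files standing in for
# /etc/grid-security/hostkey.pem and /etc/grid-security/hostcert.pem.
demo_dir=$(mktemp -d)
touch "$demo_dir/hostkey.pem" "$demo_dir/hostcert.pem"
chmod 400 "$demo_dir/hostkey.pem"
chmod 600 "$demo_dir/hostcert.pem"

has_mode "$demo_dir/hostkey.pem" 400 && echo "hostkey.pem mode OK"
has_mode "$demo_dir/hostcert.pem" 600 && echo "hostcert.pem mode OK"
```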

Page 26: CE+WN+siteBDII  Installation and configuration

YAIM configuration-CE

• The main file to edit is site-info.def, where you specify some general settings and the parameters of other components (CE Cream)
• Other files to be edited are: wn-list.conf, users.conf, groups.conf, services/glite-creamce
• Replace the example values with the correct ones.
# vi services/glite-creamce
CEMON_HOST=pcXX.$MY_DOMAIN
CREAM_DB_USER=eumed
CREAM_DB_PASSWORD=grid2011
BLPARSER_HOST=pcXX.$MY_DOMAIN

Page 27: CE+WN+siteBDII  Installation and configuration

YAIM configuration-CE

• Declare the worker nodes in wn-list.conf
# vi wn-list.conf
pcAA.magrid.ma
pcBB.magrid.ma
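
A worker node listed twice in wn-list.conf is an easy mistake to make when the list grows. The sketch below shows one way to spot it; duplicate_hosts is a hypothetical helper and the demonstration uses a scratch file built from the placeholder names on the slide.

```shell
#!/bin/bash
# Hypothetical helper: print host names that appear more than once
# in a wn-list.conf style file (one host per line, blank lines ignored).
duplicate_hosts() {
    grep -v '^[[:space:]]*$' "$1" | sort | uniq -d
}

# Demonstration on a scratch file with one deliberate duplicate.
demo_list=$(mktemp)
printf 'pcAA.magrid.ma\npcBB.magrid.ma\npcAA.magrid.ma\n' > "$demo_list"
echo "duplicates: $(duplicate_hosts "$demo_list")"
```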

Page 28: CE+WN+siteBDII  Installation and configuration

YAIM configuration-CE

CE_HOST=pcYY.magrid.ma
CE_CPU_MODEL=XEON          # cat /proc/cpuinfo
CE_CPU_VENDOR=Intel
CE_CPU_SPEED=2230
CE_OS=ScientificSL
CE_OS_RELEASE=5.5          # cat /etc/redhat-release
CE_OS_VERSION="Boron"
CE_OS_ARCH=x86_64
CE_MINPHYSMEM=512          # cat /proc/meminfo on WN
CE_MINVIRTMEM=512
CE_PHYSCPU=1               # total cpu in site
CE_LOGCPU=4
CE_SMPSIZE=4
CE_OUTBOUNDIP=TRUE
CE_INBOUNDIP=FALSE
CE_OTHERDESCR="Cores=4,Benchmark=6.5-HEP-SPEC06"

http://gkswiki.fzk.de/index.php5/Configuration_of_the_CREAM_CE

Page 29: CE+WN+siteBDII  Installation and configuration

YAIM configuration-CE

• How to set CE_SI00, CE_SF00, CE_CAPABILITY, CE_OTHERDESCR?
• Try to search for your value at these links:
• http://www.italiangrid.org/grid_operations/site_manager/HEP-SPEC06
• https://hepix.caspur.it/benchmarks/doku.php?id=bench:results_sl5_x86_64_gcc_412
• https://hepix.caspur.it/processors/dokuwiki/doku.php?id=benchmarks:results
• For example, if you have an Intel XEON 5520 2.23 GHz with no Hyper-Threading, you will find in the table at the previous link a value of 95 and a conversion factor of 1 HS06 = 40, so:
• CE_SI00 = 3800
• CE_SF00 = 3800
• CE_CAPABILITY="CPUScalingReferenceSI00=3800"
• CE_OTHERDESCR="Cores=4,Benchmark=23.75-HEP-SPEC06"
• Where 95 × 40 = 3800 and (3800/40)/4 = 23.75
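
The arithmetic of this worked example can be reproduced in a few lines of shell. This is only a sketch that reuses the slide's own numbers (a table value of 95, a conversion factor of 40 and 4 cores); the variable names are illustrative and other hardware needs its own values.

```shell
#!/bin/bash
# Values taken from the slide's worked example.
hs06_total=95      # HEP-SPEC06 value read from the benchmark table
factor=40          # conversion factor used on the slide
cores=4            # number of cores

# Integer product for the SI00 value, awk for the fractional per-core value.
ce_si00=$(( hs06_total * factor ))
per_core=$(awk -v h="$hs06_total" -v c="$cores" 'BEGIN { printf "%.2f", h / c }')

echo "CE_SI00=$ce_si00"
echo "CE_SF00=$ce_si00"
echo "CE_CAPABILITY=\"CPUScalingReferenceSI00=$ce_si00\""
echo "CE_OTHERDESCR=\"Cores=$cores,Benchmark=$per_core-HEP-SPEC06\""
```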

Page 30: CE+WN+siteBDII  Installation and configuration

YAIM configuration-CE

BATCH_SERVER=$CE_HOST
JOB_MANAGER=lcgpbs
CE_BATCH_SYS=pbs
BATCH_LOG_DIR=/var/spool/pbs
APEL_DB_PASSWORD=grid2011
DGAS_ACCT_DIR=/var/spool/pbs/server_priv/accounting
VOS="eumed"
QUEUES="eumed"
EUMED_GROUP_ENABLE="eumed"

Page 31: CE+WN+siteBDII  Installation and configuration

YAIM configuration-CE

• After editing, you can launch the commands:
# /opt/glite/yaim/bin/yaim -c -s /opt/glite/yaim/etc/siteinfo/site-info.def -n creamCE -n TORQUE_server -n TORQUE_utils
# /opt/glite/yaim/bin/yaim -r -s /opt/glite/yaim/etc/siteinfo/site-info.def -n creamCE -f config_cream_blparser

http://igrelease.forge.cnaf.infn.it/doku.php?id=doc:guides:devel:install-cream32

Page 32: CE+WN+siteBDII  Installation and configuration

Check the CE

• http://grid.pd.infn.it/cream/field.php?n=Main.CheckYourCREAMCEConfiguration
• Download the script:
wget http://grid.pd.infn.it/cream/CheckCreamConf/current/CheckCreamConf.pl
chmod +x CheckCreamConf.pl
• Run it:
./CheckCreamConf.pl
• Check the output:
CheckCreamConf.log

Page 33: CE+WN+siteBDII  Installation and configuration

WN Cream Installation (on Torque/PBS)

Page 34: CE+WN+siteBDII  Installation and configuration

Preparing the Linux machine

• Network Time Protocol settings
# yum install ntp
• Copy the ntp.conf file and the ntp directory from ftp://repo.magrid.ma/pub/CE_WN_BDII/ to /etc/ (WinSCP)
• Synchronize the date
# /etc/init.d/ntpd stop
# ntpdate ntp.marwan.ma
• Start the ntpd service and configure it to start on boot
# /etc/init.d/ntpd start
# chkconfig ntpd on

Page 35: CE+WN+siteBDII  Installation and configuration

Preparing the Linux machine

• Disable SELinux: make sure /etc/selinux/config contains the line:
SELINUX=disabled
• Stop iptables
# /etc/init.d/iptables stop
# chkconfig iptables off
• Check that you have a valid hostname
# hostname -f
# cat /etc/hosts
• Reboot

Page 36: CE+WN+siteBDII  Installation and configuration

Repository set up-WN

• Add the repositories specific to the middleware being installed to the system repositories
# cd /etc/yum.repos.d/
# mv dag.repo dag.repo.stop
# export MREPO=http://repo.magrid.ma/yumrepo/glite32
# REPOS="dag lcg-CA glite-WN glite-TORQUE_client"
# for name in $REPOS; do wget $MREPO/$name.repo -O /etc/yum.repos.d/$name.repo; done

Page 37: CE+WN+siteBDII  Installation and configuration

package installation-WN

• Use yum to install the needed packages
# yum clean all
# yum install -y lcg-CA ca-policy-egi-core ca-policy-lcg
# yum groupinstall glite-WN
# yum install glite-TORQUE_client

Page 38: CE+WN+siteBDII  Installation and configuration

WN - YAIM Configuration

• You can use the same configuration files edited on the CE:
- this can be done on all the worker nodes of a site;
- so you don't need to re-edit anything!
• Copy the configuration files from the CE machine using the scp command:
# mkdir /opt/glite/yaim/etc/siteinfo/
# mkdir /opt/glite/yaim/etc/siteinfo/services
- copy site-info.def, users.conf, groups.conf and wn-list.conf from the CE, e.g. root@pcYY:/opt/glite/yaim/etc/siteinfo/site-info.def
- copy the glite-wn file from examples/services
• Ready to configure now
# /opt/glite/yaim/bin/yaim -c -s /opt/glite/yaim/etc/siteinfo/site-info.def -n glite-WN -n TORQUE_client

Page 39: CE+WN+siteBDII  Installation and configuration

WN - YAIM Configuration

• Ready to configure now
# /opt/glite/yaim/bin/yaim -c -s /opt/glite/yaim/etc/siteinfo/site-info.def -n glite-WN -n TORQUE_client
• A basic test:
• Check the status of pbs_mom
• pbsnodes -a
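
The output of pbsnodes -a is verbose, so a short filter that prints one line per node can make the basic test easier to read. This is a sketch under assumptions: node_states is a hypothetical helper, and the sample text is hand-written in the usual pbsnodes layout rather than captured from a real CE.

```shell
#!/bin/bash
# Hypothetical helper: read "pbsnodes -a" style text on stdin and print
# "<node> <state>" for every node. Node names start in column 0;
# attributes such as "state = free" are indented below them.
node_states() {
    awk '/^[^[:space:]]/ { node = $1 }
         $1 == "state" && $2 == "=" { print node, $3 }'
}

# Illustrative input only (hand-written, not real output).
sample='pcAA.magrid.ma
     state = free
     np = 4

pcBB.magrid.ma
     state = down
     np = 4'

printf '%s\n' "$sample" | node_states
```

On the CE the helper would be used as pbsnodes -a | node_states; a node reported as down usually means pbs_mom is not running on it or cannot reach the server.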


Page 41: CE+WN+siteBDII  Installation and configuration

Testing installation

Page 42: CE+WN+siteBDII  Installation and configuration

Tests on CE

• SSH to the CE to test whether the CE can see the WNs and whether all the main services are up and running
# pbsnodes
# /etc/init.d/gLite status

Page 43: CE+WN+siteBDII  Installation and configuration

Tests on CE

• SSH to the CE and then become a grid pool user (here eumed001):
# su - eumed001
• Create a file and add the following:
$ vi test.sh
#!/bin/sh
sleep 20    # useful to see the job status
hostname
• Set the right permissions to make it executable:
$ chmod 700 test.sh

Page 44: CE+WN+siteBDII  Installation and configuration

Tests on CE

• Launch the job locally on the CE
$ qsub -q eumed test.sh
• Then check the list of jobs in execution on the CE
$ qstat -a

ce.localdomain:
                                                              Req'd  Req'd   Elap
Job ID           Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
---------------- -------- -------- ---------- ------ --- --- ------ ----- - ----
0.pc22.magrid.ma eumed001 short    test.sh      5839  --  --     -- 00:15 R   --

• In case you want to abort a job execution:
$ qdel 3    # 3 is the job ID
• In case you want more info:
$ qstat -f 3
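
When only the state of a job matters, the long qstat -f listing can be reduced to a single letter. The following is a sketch, not part of the slides: job_state is a hypothetical helper and the sample text is a hand-written excerpt in the qstat -f layout.

```shell
#!/bin/bash
# Hypothetical helper: extract the job state letter (for example Q or R)
# from "qstat -f <jobid>" style text read on stdin.
job_state() {
    awk '$1 == "job_state" && $2 == "=" { print $3; exit }'
}

# Illustrative input only (hand-written excerpt, not real output).
sample='Job Id: 3.pc22.magrid.ma
    Job_Name = test.sh
    job_state = R
    queue = eumed'

echo "state: $(printf '%s\n' "$sample" | job_state)"
```

On the CE this would be called as qstat -f 3 | job_state, which is convenient inside a loop that waits for a test job to finish.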

Page 45: CE+WN+siteBDII  Installation and configuration

Tests on CE

• If the "qstat -a" command returns no output, no jobs are being executed on the CE; this means your previous job has terminated, so you can now list its output.
$ ls
test.sh.e3 test.sh.o3
$ cat test.sh.e3    # error file
$
$ cat test.sh.o3    # output file
wn.localdomain

Page 46: CE+WN+siteBDII  Installation and configuration

JDL example

$ vim hostname-cream.jdl

Type = "Job";
JobType = "Normal";
Executable = "/bin/hostname";
StdOutput = "hostname.out";
StdError = "hostname.err";
OutputSandbox = {"hostname.err","hostname.out"};
Arguments = "-f";
OutputSandboxBaseDestUri = "gsiftp://localhost/tmp";

Page 47: CE+WN+siteBDII  Installation and configuration

Working test

• SSH to the UI to test whether the CE can receive and execute a simple job
$ ssh [email protected]    # password: gridXX
• Set up the certificate:
[root@ui01 ~]# mkdir /home/grid01/.globus
[root@ui01 ~]# cp /root/user_cert/usercert.pem /home/grid01/.globus/usercert.pem
[root@ui01 ~]# cp /root/user_cert/userkey.pem /home/grid01/.globus/userkey.pem
[root@ui01 ~]# chown grid01 /home/grid01/.globus/usercert.pem
[root@ui01 ~]# chown grid01 /home/grid01/.globus/userkey.pem
[root@ui01 ~]# chmod 400 /home/grid01/.globus/userkey.pem
[root@ui01 ~]# su - grid01
• Create a proxy and submit the job:
[grid01@ui01 ~]$ voms-proxy-init --voms eumed
Enter GRID pass phrase: [grid2011]
[grid01@ui01 ~]$ glite-ce-job-submit -a -r pc22.magrid.ma:8443/cream-pbs-eumed -o ID hostname-cream.jdl    # -a: delegate the proxy automatically
[grid01@ui01 ~]$ glite-ce-job-status -i ID

Page 48: CE+WN+siteBDII  Installation and configuration

Troubleshooting

• Which logs should you open if something goes wrong?
– /var/log/messages, for general errors
– /opt/glite/var/log (especially glite-ce-cream.log)
– /var/spool/pbs/server_priv/accounting/<date>, if even local submission on the batch system doesn't work.
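
These logs can be long, so a small filter that shows only the most recent error lines is a reasonable first step. The sketch below is illustrative: recent_errors is a hypothetical helper, the keywords it matches are an assumption about what is worth looking for, and the demonstration log lines are made up.

```shell
#!/bin/bash
# Hypothetical helper: print the last N lines of a log file that
# mention an error (case-insensitive match on "error" or "fatal").
recent_errors() {
    local file=$1 count=${2:-20}
    grep -iE 'error|fatal' "$file" | tail -n "$count"
}

# Demonstration on a scratch file; the log lines are made up.
demo_log=$(mktemp)
printf '%s\n' 'INFO service started' 'ERROR cannot contact batch server' \
              'INFO job accepted' 'FATAL database unreachable' > "$demo_log"
recent_errors "$demo_log" 5
```

On the CE the helper could be pointed at /var/log/messages or at /opt/glite/var/log/glite-ce-cream.log.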

Page 49: CE+WN+siteBDII  Installation and configuration

References

• INFNGRID generic installation guide:

– http://igrelease.forge.cnaf.infn.it/doku.php?id=doc:guides:install-3_2

• YAIM configuration variables

– https://twiki.cern.ch/twiki/bin/view/LCG/Site-info_configuration_variables

• CE Cream installation guide:

– GLITE Cream CE 3.2 SL5 Installation Guide [INFNGRID Release Wiki]

• YAIM system administrator guide:

– https://twiki.cern.ch/twiki/bin/view/LCG/YaimGuide400

• EUMEDGRID wiki:

– http://wiki.eumedgrid.eu/bin/view

• EuMedGRID sites installation and setup tips

– http://wiki.eumedgrid.eu/twiki/bin/view/InfrastructureStatus/EumedSiteInstallation

• How To Check And Test Your CREAMCE

– http://grid.pd.infn.it/cream/field.php?n=Main.HowToCheckAndTestYourCREAMCE

Page 50: CE+WN+siteBDII  Installation and configuration


Thank you for your kind attention !

Any questions ?