Platform OCS 5 - Queen's University (wiki.phy.queensu.ca/hughes/images/0/03/OCS5-Day2-module1.pdf)
Day 2, module 1
Platform OCS 5: Installer node setup and configuration
Module Objectives
Upon completion of this module, you will be able to:
•Install and configure an Installer Node
•Verify that the Installer Node is properly configured
Pre-installation
Installing Red Hat HPC
•Software check
•Hardware check
•Network check
•Network information
Installing OCS
•Software check
•Hardware check
•Network check
•Network information
We’ll spend the next few minutes covering the pre-requisites to installing a Platform OCS 5 cluster.
Building an OCS 5 or RH HPC Installer Node
Platform OCS 5 can be installed in two different ways:
1. From a RHEL 5.1 machine via Red Hat Network
2. From an OCS 5 CD/DVD + Red Hat media
The preferred installation mechanism for Red Hat is via Red Hat Network (option 1)
All other Platform OCS partners use option 2
Option 1: Installing via Red Hat Network or Satellite Server
Pre-installation check: RH HPC
Minimal hardware requirements for the installer node:
•512 MB of physical memory (RAM)
•40 GB disk
•DVD/CD drive
•One Ethernet interface (two interfaces for a Beowulf cluster)
Software configuration required:
•Red Hat Enterprise Linux 5.1
•Red Hat installed on the machine
•At least one statically configured Ethernet interface
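The memory and interface minimums above can be checked from a shell on the candidate machine before starting. The following is a minimal sketch, assuming a Linux /proc/meminfo and ethN interface names as used on RHEL 5. The helper names (mem_total_mb, eth_count, report) are hypothetical and not part of RH HPC or OCS. Note that MemTotal excludes memory reserved by the kernel, so a machine with exactly 512 MB installed may read slightly under 512.

```shell
#!/bin/sh
# Illustrative pre-installation check. The helper names are hypothetical
# and are not part of Red Hat HPC or Platform OCS.

# Print total memory in MB from a meminfo-style file (default /proc/meminfo).
mem_total_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Count ethN interfaces listed in a sysfs-style directory.
eth_count() {
    ls "${1:-/sys/class/net}" 2>/dev/null | grep -c '^eth'
}

# Print OK or FAIL depending on whether value $2 reaches minimum $3.
report() {
    if [ "${2:-0}" -ge "$3" ]; then
        echo "OK:   $1 = $2 (minimum $3)"
    else
        echo "FAIL: $1 = $2 (minimum $3)"
    fi
}

report "memory in MB" "$(mem_total_mb)" 512
report "Ethernet interfaces" "$(eth_count)" 1
```

Both helpers accept an alternative path as their first argument, which makes it possible to run them against a saved copy of another machine's /proc/meminfo.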
Pre-installation network check – RH HPC
All compute nodes should be configured for PXE boot (change the BIOS settings)
The Ethernet switches need to be configured properly:
•Spanning tree should be disabled
•PortFast should be enabled (if supported)
•The switch should be tested prior to the install: everything connected properly, no bad ports
PXE refers to the Preboot Execution Environment supported on most NICs. Kusu and Platform OCS rely on this functionality to ease node management.
Pre-installation network information – RH HPC
The following network-related information will be required during the install process:
•The installer node public interface can be configured via DHCP
•Installer node host information:
    •Hostname (Fully Qualified Domain Name)
    •Static IP address
    •Subnet mask
    •Gateway address (to public internet)
    •DNS server IP address(es) (optional)
•The installer node private interface must be static
•Private network information:
    •Private IP address
    •Subnet mask
Don’t guess at any of these settings: a mistake means a re-install!
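Because a mistyped address or mask means a re-install, it can help to sanity-check the values before entering them. This is a minimal sketch in POSIX sh: is_ipv4 and mask_to_prefix are hypothetical helpers, and the sample address and mask at the end are placeholders rather than values from this course.

```shell
#!/bin/sh
# Hypothetical helpers for checking values before they are typed into the installer.

# Succeed if $1 is a dotted-quad IPv4 address with every octet in 0-255.
is_ipv4() (
    case $1 in
        '' | *[!0-9.]* | .* | *. | *..*) exit 1 ;;
    esac
    IFS=.
    set -- $1
    [ $# -eq 4 ] || exit 1
    for octet in "$@"; do
        [ ${#octet} -le 3 ] && [ "$octet" -le 255 ] || exit 1
    done
)

# Print the prefix length of netmask $1; fail if the mask is not contiguous.
# Octets must be written without leading zeros (255, not 0255).
mask_to_prefix() (
    is_ipv4 "$1" || exit 1
    IFS=.
    set -- $1
    prefix=0
    partial=0
    for octet in "$@"; do
        case $octet in
            255) bits=8 ;; 254) bits=7 ;; 252) bits=6 ;; 248) bits=5 ;;
            240) bits=4 ;; 224) bits=3 ;; 192) bits=2 ;; 128) bits=1 ;;
            0) bits=0 ;;
            *) exit 1 ;;
        esac
        # Once an octet is below 255, every later octet must be 0.
        if [ "$partial" -eq 1 ] && [ "$bits" -ne 0 ]; then
            exit 1
        fi
        if [ "$bits" -lt 8 ]; then
            partial=1
        fi
        prefix=$((prefix + bits))
    done
    echo "$prefix"
)

# Placeholder values for illustration only.
addr=172.20.0.1
mask=255.255.0.0
is_ipv4 "$addr" && echo "address $addr looks valid"
prefix=$(mask_to_prefix "$mask") && echo "mask $mask is a /$prefix"
```

The functions run in subshells (parentheses instead of braces) so that changing IFS and the positional parameters does not leak into the calling script.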
Installing RHEL 5.1
•Standard RHEL 5.1 install
•Software is tested on RHEL 5.1, not RHEL 5.0
Installing RHEL 5.1
•Minimum network config:
•1 statically defined Ethernet interface
•Other interfaces can be static or DHCP
•1 network interface must have access to the corporate network, or internet for RHN or Satellite Server.
•Manual hostname is optional for the DHCP network interface
•Manual hostname is required for the static private interface.
•Choose a private hostname and domain that will not conflict with any names/domains in your environment.
•Network configuration mistakes are the most common problem during OCS installation
Installing RHEL 5.1
•Choose:
•Web Server
•Software Development
•RH HPC requires a web server
•Software Development, while not required, is useful in an HPC cluster
Installing RHEL 5.1
Disk Partitioning:
•A disk of at least 35 GB is required
•OCS stores OS repositories, which are typically 4 GB each.
•The more disk space the better
•OCS installs OS repositories in /depot by default
•/depot should have a minimum of 15 GB allocated.
•The default partitioning works with OCS
•LVM partitioning recommended for the OCS Installer Node.
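After RHEL is installed, the space available for /depot can be confirmed from a shell. This is a minimal sketch that reads the 15 GB figure above as free space on the filesystem that will hold /depot. avail_gb is a hypothetical helper, and the fallback to / assumes /depot will live on the root filesystem when it does not exist yet.

```shell
#!/bin/sh
# Illustrative check only; avail_gb is a hypothetical helper, not an OCS tool.

# Print the free space, in whole GB, on the filesystem that holds path $1.
avail_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# /depot may not exist before ocs-setup runs, so fall back to the root filesystem.
depot=/depot
[ -d "$depot" ] || depot=/

free=$(avail_gb "$depot")
if [ "${free:-0}" -ge 15 ]; then
    echo "OK: $free GB free for /depot (15 GB minimum)"
else
    echo "WARNING: only ${free:-0} GB free for /depot (15 GB minimum)"
fi
```

The -P option asks df for the POSIX output format, which keeps each filesystem on a single line so that the fourth field is always the available space.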
RHEL 5.1 Firstboot
•Connect the Server to RHN
•RHN setup can be skipped if needed
•RH HPC will be configured to update from Satellite Server in our Demo
RHEL 5.1 Firstboot
•Additional users can be created if needed
•Network Login is not supported in this release of RH HPC
Firstboot – Network Check
•Minimum Network config:
•1 statically defined Ethernet interface
•Other interfaces can be static or DHCP
•1 network interface must have access to the corporate network, or internet for RHN or Satellite Server.
•Network configuration mistakes are the most common problem during OCS installation
RHEL 5.1 Adding GPG keys and RHN Register
Keys for this demo are dummy keys; in the real product release, Red Hat keys will be used.
Platform dummy Key:
wget -q http://intel7.lsf.platform.com/pub/PLATFORM-GPG-PUB-KEY
rpm --import PLATFORM-GPG-PUB-KEY
Fedora EPEL Key:
wget -q http://download.fedora.redhat.com/pub/epel/RPM-GPG-KEY-EPEL
rpm --import RPM-GPG-KEY-EPEL
Add Server to RHEL Satellite Server:
rpm -Uvh http://intel7.lsf.platform.com/pub/rhn-org-trusted-ssl-cert-1.0-1.noarch.rpm
rhnreg_ks --username admin --password <yourpass> --serverUrl https://intel7.lsf.platform.com/XMLRPC --sslCACert /usr/share/rhn/RHN-ORG-TRUSTED-SSL-CERT
Installing RH HPC on RHEL 5.1
•Once the machine is properly registered, install ‘OCS’:
•# yum install ocs
•50 or more packages should be downloaded and installed.
•The install takes approximately 5-10 minutes
Configuring RH HPC
Source the OCS environment
# source /etc/profile.d/kusuenv.sh
Run the OCS setup script:
# /opt/kusu/sbin/ocs-setup
If the machine is configured correctly, the static network interface will be used for provisioning.
The static network interface will run a DHCP server – ensure this DHCP server will not conflict with other servers.
If everything is okay…continue with the script.
Enter a private DNS domain for the cluster.
The private domain will be used for all nodes provisioned in the cluster
It is the cluster private domain - *NOT AN EXISTING DOMAIN* in your organization or the Internet
Configuring RH HPC
•Continue with the ‘ocs-setup’ script:
•Choose a location for the OS repositories and Images:
•The default is /depot
•The installation script will ask for the RHEL 5.1 media
•The media is used to build the OS repository with all RHEL packages
• Copying the OS media takes approximately 10-15 minutes
•Using ISO media is the fastest way of installing
•CD/DVD media works as well
Red Hat HPC Lab 4
For the instructor and students: now that you have seen an install, install the software on a cluster:
•First install RHEL 5.1
•Configure the server to connect to the Satellite Server
•Install RH HPC
•Configure RH HPC
Option 2: Installing with OCS Media and RHEL 5.1
Pre-installation check: OCS
Minimal hardware requirements for the installer node:
•512 MB of physical memory (RAM)
•35 GB disk
•DVD drive
•One Ethernet interface (two interfaces for a Beowulf cluster)
Software required:
•Platform OCS 5 (or Kusu) installation DVD
    •The Platform OCS media contains the kits you need
•One OS kit must be added during installation. Make sure that you have OS installation media (or ISOs if using VMware) on hand:
    •Fedora Core 6 for x86 or x86_64
    •CentOS 5.x for x86 or x86_64
    •Red Hat Enterprise Linux 5.x for x86 or x86_64
Pre-installation: network check
All compute nodes should be configured for PXE boot (change the BIOS settings)
The Ethernet switches need to be configured properly:
•Spanning tree should be disabled
•PortFast should be enabled (if supported)
•The switch should be tested prior to the install: everything connected properly, no bad ports
PXE refers to the Preboot Execution Environment supported on most NICs. Kusu and Platform OCS rely on this functionality to ease node management.
Pre-installation network information - OCS
The following network-related information will be required during the install process:
•Installer node host information:
    •Hostname (Fully Qualified Domain Name)
    •Static IP address
    •Subnet mask
    •Gateway address (to public internet)
    •DNS server IP address(es)
•Private network information:
    •Private IP address
    •Subnet mask
    •Private network name (can be anything)
Don’t guess at any of these settings: a mistake means a re-install!
After preliminary data gathering is done, and the installation host is physically connected to the required networks, you are ready to begin.
OCS 5 by default uses a Class ‘B’ address for the cluster network – this can be changed to meet your needs.
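To judge whether the default is the right size, recall that the classful Class B mask is 255.255.0.0 (a /16). The number of usable host addresses for a given prefix length is 2^(32 - prefix) - 2, which can be worked out with shell arithmetic. This is a minimal sketch; usable_hosts is a hypothetical helper, not an OCS command.

```shell
#!/bin/sh
# usable_hosts is a hypothetical helper, shown only to illustrate the arithmetic.

# Print the number of usable host addresses for prefix length $1 (valid for 0-30).
usable_hosts() {
    echo $(( (1 << (32 - $1)) - 2 ))
}

usable_hosts 16   # Class B default mask 255.255.0.0 -> prints 65534
usable_hosts 24   # a smaller 255.255.255.0 network  -> prints 254
```

The two addresses subtracted are the network address and the broadcast address, which cannot be assigned to nodes.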
You do not need compute hosts available to install the installer; however, it is handy to have at least one compute host to validate the installation.
A step-by-step process for installing the installer node follows:
Installing the OCS 5 Installer host
One more Important Pre-requisite..
These are actually DVDs and CDs – not donuts! – since the installation procedure can take some time, you may want some donuts as well!
Building an OCS 5 Installer Node
1. Boot from CD
2. Choose Language
3. Configure Networks, DNS, & Gateway
4. Root Password
5. Partitioning & LVM
6. Adding Kits
7. Install Summary
8. Installing Packages
9. Installer Node Boot
Installation complete. Installer node is ready to use.
Installation: installer – OCS 5
Insert the OCS 5 media and power up the frontend. At the splash screen, press Enter.
Installation: installer – OCS 5
The installer will guide you through the process of installing the installer node. It is designed to gather information “up front” so that (with the exception of OS kit installation) you can “walk away” after the installation has started.
Installation: installer – OCS 5
Use Tab to navigate from field to field, the space bar to toggle choices, and Enter to select a choice.
Installation: installer – OCS 5
Configure eth0 (our private cluster network interface) and eth1 (the public network interface)
Installation: installer – OCS 5
Our private network should be set to the network type “provision” – the network name can be any descriptive name.
Installation: installer – OCS 5
Similarly the public interface should be set to “public”. These network names will appear in the Kusu database and will be used when generating system files such as dhcpd.conf
Installation: installer – OCS 5
Indicate the gateway IP and the IP address of one or more DNS servers.
Installation: installer – OCS 5
The private name will be used by OCS 5 to identify the private cluster domain.
Installation: installer – OCS 5
Set the timezone and provide an NTP server.
Installation: installer – OCS 5
This refers to the Red Hat Network (RHN) installation number. If you have this number, enter it here; otherwise select skip. The installation number is used to guide the installer to select components appropriate for a user’s subscription. It is also used for support incidents to verify support status.
Installation: installer – OCS 5
Enter the root password for the installer that will be replicated to the compute hosts (via cfm)
Installation: installer – OCS 5
The next step is to set up our partition table. If you are unsure how to size the partitions, it is recommended that you accept the defaults.
Installation: installer – OCS 5
The default partition setup places four volumes on the volume group “VolGroup00” – a small /boot partition is created at the beginning of the disk as well as a SWAP partition.
Installation: installer – OCS 5
Before proceeding with the installation, we’re asked to confirm the values that we’ve entered.
Installation: installer – OCS 5
Next we are presented with the list of default kits included with OCS 5 to install. We normally need to add an operating system kit. In our example we will add RHEL 5 as an OS kit, as we plan to install this on our compute nodes.
Installation: installer – OCS 5
The OS DVD (or CDs) will be copied to /depot/kits/rhel where the constituent RPMs will be used for package based installs or to create images for image based installs
Installation: installer – OCS 5
If installing from CD there may be multiple disks required.
Installation: installer – OCS 5
The initial repository, called “Repo for rhel 5 x86_x64” in our example (located in /depot/repos/1000), will be created. The repository will contain symbolic links to the packages from the kits named in the repository definition.
Installation: installer – OCS 5
Next, Anaconda will begin installing RHEL on our installer node.
Installation: installer – OCS 5
Packages will be installed on the installer node.
Once completed, the installer node will reboot
Installation: installer – OCS 5
The installer scripts in /etc/rc.kusu.d initialize Kusu, loading the table schema for the database from /opt/kusu/sql. “mysqlrunner” is called to run generated SQL statements against the kusudb MySQL database tables, reflecting the installation.
Installation: installer – OCS 5
Images for “imaged” and diskless nodes will be created from the repository definitions and placed in /depot/images. Creating these images (each approximately 4 GB in size) will take several minutes. You can start up a separate session (Alt-F2) to monitor progress.
Install Complete:
Verifying your Installer Node
Verifying the Installer Node – OCS 5 & RH HPC
Check for hardware issues:
•Check the kernel logs for any hardware driver errors:
dmesg
•Check the system logs for any startup issues:
less /var/log/messages
Log files for the entire Kusu installation process may be found under /root, including “install.log” and “kusu.log”.
•kusu.log logs the activity of the Python-based scripts that make up the installer
•The kickstart file generated by the Anaconda installer is also preserved in /root
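Rather than paging through each log by hand, a first pass can count the lines that look suspicious. This is a minimal sketch: count_matches is a hypothetical helper, and the search pattern is only a guess at useful keywords, not a list of messages that Kusu is known to emit.

```shell
#!/bin/sh
# count_matches is a hypothetical helper; the pattern is only a starting point.

# Print how many lines of file $2 match the case-insensitive pattern $1.
count_matches() {
    grep -Eic "$1" "$2" 2>/dev/null || true
}

pattern='error|fail|traceback'
for log in /root/install.log /root/kusu.log /var/log/messages; do
    if [ -r "$log" ]; then
        echo "$log: $(count_matches "$pattern" "$log") suspicious line(s)"
    else
        echo "$log: not readable (run as root?)"
    fi
done
```

A non-zero count is only a prompt to open the file; many matching lines are harmless, so the log itself remains the authority.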
Install node Test – OCS 5 & RH HPC
Network:
•Check that both eth0 and eth1 interfaces are up:
ifconfig
•Verify the routing table is correct (route or netstat -r):
    •Traffic for the private network is routed over eth0
    •Traffic for the public network is routed over eth1
    •The default route will go through the gateway server specified during installation
    •Multicast packets will be routed over eth0 (using the 224.0.0.0 network)
•External hosts can be reached with the ping command
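The routing checks can also be made mechanically by picking the interface column out of the routing table. This is a minimal sketch, assuming the eight-column layout printed by `route -n` (Destination, Gateway, Genmask, Flags, Metric, Ref, Use, Iface). route_iface is a hypothetical helper, and the expected interfaces are the ones named on this slide.

```shell
#!/bin/sh
# route_iface is a hypothetical helper that reads `route -n` output on stdin.

# Print the interface of the first route whose destination equals $1.
route_iface() {
    awk -v dest="$1" '$1 == dest { print $8; exit }'
}

if command -v route >/dev/null 2>&1; then
    table=$(route -n)
    default_if=$(printf '%s\n' "$table" | route_iface 0.0.0.0)
    mcast_if=$(printf '%s\n' "$table" | route_iface 224.0.0.0)
    echo "default route:   ${default_if:-none} (expected eth1)"
    echo "multicast route: ${mcast_if:-none} (expected eth0)"
else
    echo "route command not found"
fi
```

The -n flag keeps addresses numeric, so the default route shows up as destination 0.0.0.0 rather than the word "default".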
Installer node – OCS 5 & RH HPC
High performance interconnect:
•Driver for the interconnect hardware
•Use vendor diagnostic tools
RH HPC is not automatically configured to manage InfiniBand.
Additional kits for InfiniBand exist from vendors:
•QLogic
•Cisco
Installer test – OCS 5 & RH HPC
Make sure all required services are running
Service           How to check
Web Server        service httpd status
DHCP              service dhcpd status
DNS               service named status
Xinetd            service xinetd status
MySQL database    service mysqld status
NFS               service nfs status
AutoFS            service autofs status
Plone/Zope        service zinstance status
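The checks in the table can be run in one pass with a small loop. This is a minimal sketch, assuming each init script's status action exits with 0 when the service is running, which is the usual convention but is not guaranteed for every script. check_services is a hypothetical helper; the service names are the ones listed above.

```shell
#!/bin/sh
# Service names come from the table above; check_services is a hypothetical helper.
SERVICES="httpd dhcpd named xinetd mysqld nfs autofs zinstance"

# Run "<cmd> <name> status" for each service and print one line per service.
# Returns non-zero if any check fails. $1 overrides the command (default: service).
check_services() {
    cmd=${1:-service}
    failed=0
    for name in $SERVICES; do
        if "$cmd" "$name" status >/dev/null 2>&1; then
            echo "running: $name"
        else
            echo "PROBLEM: $name"
            failed=1
        fi
    done
    return "$failed"
}

check_services || echo "One or more services need attention."
```

For any service reported as a problem, rerun its `service <name> status` command by hand to see the full message.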
Check the Platform OCS 5 infrastructure (ensure OCS commands can query the database):
repoman -l
Check the kits:
kitops -l | more
X Window System:
•Start: startx
•Start a browser to check the cluster homepage and verify installed kits:
    •http://localhost (admin/admin)
    •http://localhost/nagios (admin/admin)
    •http://localhost/cacti (admin/admin)
Installer test – OCS 5 & RH HPC
By default, http://localhost/Plone should show the start page for the cluster. Plone is built on the Zope application server. Zope is in the zinstance directory under /opt/Plone-3.0.3.
Post-installation: verify web-server - Nagios
The Nagios management component will be installed on the installer node by default, accessible at http://localhost/nagios. The default username/password is admin / admin.
Post-installation: verify web-server - Cacti
Similarly, Cacti will be installed at http://master/cacti. The default username/password is admin / admin.
Platform OCS 5/ RH HPC Lab 5
Notes to students and instructor: verify that RH HPC or OCS is working properly:
•Check the web pages: Nagios, Cacti
•Check the repositories: repoman
•Check the installed kits: kitops
•Run ngedit and examine the node groups
•Run genconfig and generate some cluster configuration files
Post installation
Congratulations!
You are ready to install the compute nodes!
Thank You!