WestGrid Handbook for University of Manitoba Researchers

WestGrid Handbook for Researchers at the University of Manitoba January 2010



Table of Contents

1 Overview
    1.1 This Guide
    1.2 WestGrid
2 Information for Grant Applicants
3 Information for Prospective Users
    3.1 Who should consider using WestGrid?
    3.2 Am I eligible for an account?
    3.3 How do I get an account?
    3.4 Is there a charge for using WestGrid?
    3.5 What hardware facilities does WestGrid have?
    3.6 What software is available on WestGrid systems?
    3.7 How much computing power is available to me?
    3.8 What is the WestGrid computing environment like?
    3.9 Is parallel programming required?
    3.10 What experience do I need to use WestGrid?
4 Quick Start Guide for New Users
    4.1 Getting Started
    4.2 Choosing Which System to Use
    4.3 Setting up Your Computer
        4.3.1 Terminal client supporting SSH
        4.3.2 File transfer client supporting scp and sftp
        4.3.3 X Window display server for graphics
    4.4 Connecting and Logging In
    4.5 Working Interactively
        4.5.1 The UNIX environment
        4.5.2 File systems
        4.5.3 Transferring files
        4.5.4 Editing files
        4.5.5 Running interactive programs
        4.5.6 Restrictions on interactive jobs
    4.6 Software
        4.6.1 Locating installed software
        4.6.2 Installing your own software
        4.6.3 Requesting software installation
        4.6.4 Software licensing
    4.7 Programming
    4.8 Running Batch Jobs
        4.8.1 The batch environment
        4.8.2 Batch job scripts
        4.8.3 Commands for submitting, monitoring and deleting jobs
    4.9 Post-Processing
        4.9.1 Managing files
        4.9.2 Visualization
    4.10 Usage Guidelines
        4.10.1 Job limits
        4.10.2 How much is reasonable?
        4.10.3 Job priorities and the fairshare policy
        4.10.4 Resource Allocation Committee
        4.10.5 Accounting
5 More information
    5.1 Getting Help
    5.2 Local WestGrid Contacts

 

1 Overview

1.1 This Guide

The purpose of this guide is to give researchers at the University of Manitoba quick and easy access to information about WestGrid. This information, and much more, is available at the WestGrid website, www.westgrid.ca. The guide is divided into three sections:

• Information for Grant Applicants: This section is intended for researchers who are considering applying for grants for HPC equipment. It outlines the policies set forth by CFI and Compute Canada.

• Information for Prospective Users: This section answers a set of common questions to help prospective users decide whether they should use WestGrid.

• Quick Start Guide for New Users: This section is intended to help new WestGrid users get started. It contains the basic information needed to start running jobs on the WestGrid systems.

1.2 WestGrid

WestGrid is a consortium of Western Canadian universities and other partners that provides high performance computing resources for Canadian research projects. It encompasses 14 partner institutions in British Columbia, Alberta, Saskatchewan, and Manitoba.

2 Information for Grant Applicants

Since CFI created the National Platforms Fund (NPF) competition, new rules have affected researchers' ability to apply for High Performance Computing (HPC) equipment, including clusters, shared-memory multiprocessors, etc. CFI's goal was to avoid unnecessary replication and inefficient use of resources, and to ensure that allocated HPC resources are managed effectively.

Unlike some other research equipment, HPC equipment is generally easy to share. Remote access is possible, and effective, automated tools for sharing resources (i.e. scheduling systems) are readily available and deployed. Also, given the relatively short lifetime of HPC equipment, idle time is exceptionally wasteful. Since sharing HPC resources is simple in most cases, it makes sense to have multiple researchers share a few large HPC resources (like those provided by WestGrid) rather than support many smaller, dedicated ones.

Finally, managing HPC resources effectively is difficult, and this has a direct impact on research effectiveness. Given the level of support available through NSERC and other granting agencies for technical staff, it makes more sense to provide support "centrally" than to attempt to do so for many small, research-lab-sized HPC systems. Using graduate students to perform system support, as is often done for smaller systems, is not only ineffective but also prevents them from making timely progress on their own research.

As a result, in 2008, CFI introduced certain restrictions on applications for HPC equipment. Quoting, with some minor additions enclosed by "[]", from the Compute Canada website (www.computecanada.org):

"For projects requesting computing equipment in excess of $500,000, the CFI will require a letter from the applicant's institution detailing the reason(s) why the current high performance computing resources offered by Compute Canada [(i.e. the member consortia, including WestGrid)] are unable to meet the needs of the project. The CFI expects this letter to be submitted only after there have been substantial discussions between the applicant and Compute Canada. The CFI will then make a final decision regarding the need for the proposed computing equipment. The CFI may seek expert advice to assist with difficult or ambiguous cases."

Compute Canada also provides a somewhat more detailed document describing the process used in 2008 to deal with non-consortia requests for HPC equipment valued over $500,000. It may be found at:

https://computecanada.org/__groups__/local.researchers/CFI-HPC_proposals_en.pdf

In general, the facilities provided by Compute Canada via the various HPC consortia in Canada are capable of supporting the vast majority of HPC researchers' needs. As such, the success rate of non-consortia requests for HPC equipment has been low. There are, however, a number of limitations to what WestGrid and the other consortia can provide, due primarily to the shared nature of the facilities. These might provide the basis for a successful argument for CFI support for non-consortia HPC infrastructure.

Currently, the consortia use batch-style job submission systems. Since most HPC problems run for days or weeks before producing results, this does not matter in practice, and batch scheduling makes much more effective use of the hardware. Only very limited HPC facilities are available for interactive use. If a researcher can argue that they need direct, interactive control over their running HPC applications, this might be seen as a reason for a dedicated system. While direct access to facilities is certainly convenient, it is seldom required. Further, interactive "steering" of long-running HPC applications has personnel implications. A strong, very special argument would be necessary.

Being shared facilities, the systems provided by the consortia must be highly reliable. This provides another possible reason for requesting a dedicated HPC facility. If a researcher is creating custom software that would make an HPC system unstable for other users, then a WestGrid-style shared facility would be inappropriate. Making this argument would normally require that the custom software include modifications to the operating system kernel. Otherwise, an argument for possible instability would be very weak. Further, such software development would commonly not require a large HPC system and would therefore be unlikely to cost over $500,000.

Shared HPC facilities must also be generally useful to a wide range of HPC users. This means that special-purpose architectures (e.g. those based on graphical processing units) are unlikely to be available. A researcher with a strong and clear need for such a special HPC architecture might request a dedicated facility. Given the unique nature of such systems, however, programming them is typically very challenging. Any such application would therefore need to show that specialized technical support staff would be available so that the equipment could be used effectively.

In rare circumstances, an argument for a dedicated facility might also be made on the basis of software availability. This would require that it be impossible, for example due to strict licensing restrictions, to run the software on a shared facility. Almost all software in current use, however, provides licensing options for shared facilities.

Finally, any request for dedicated HPC infrastructure, regardless of reason, will need to make a strong argument related to effectiveness of use. It will need to be clear to CFI that human resources will be available to support the equipment and ensure that it is readily available for use.

3 Information for Prospective Users

This section is adapted from www.westgrid.ca/support/prospective_users and answers a set of common questions to help prospective users judge whether to use WestGrid. More information can be obtained by contacting WestGrid support (see Section 5.1, Getting Help, for contact information) or by exploring the WestGrid website.

3.1 Who should consider using WestGrid?

The WestGrid project is intended to facilitate research that depends on computing resources beyond the local means of the individual researcher, and to relieve the researcher of the burden of maintaining his or her own machine room. You should consider using WestGrid if your local computing environment presents fundamental barriers to the advancement of your projects, due to such factors as limited numbers of machines, limited memory, or inadequate disk space. In some cases, the motivation is access to parallel processing, either to allow faster turnaround of individual jobs or to provide more aggregate memory so that larger jobs can be completed.

3.2 Am I eligible for an account?

WestGrid facilities are designated for Canadian researchers or those collaborating on Canadian research projects. In general, any academic researcher from a Canadian research institution with significant high performance computing requirements to support his or her research may apply for an account on WestGrid. Students and research assistants require sponsorship from a faculty supervisor.

3.3 How do I get an account?

Conditions of use and other details about accounts, including an online application form, are found at:

http://www.westgrid.ca/support/accounts

3.4 Is there a charge for using WestGrid?

There is currently no charge for routine use of the WestGrid facilities. Charges may apply for backup tapes or specialized software.

3.5 What hardware facilities does WestGrid have?

A variety of commodity and high-performance clusters, large shared-memory computers, and specialized storage, visualization and collaboration facilities are available. For detailed and up-to-date information about the various systems and advice on which one to use, see:

http://www.westgrid.ca/resources_services

A single account application gives access to all but a few of the more specialized resources, for which a separate request may be required.

3.6 What software is available on WestGrid systems?

Up-to-date lists of installed system software, compilers, mathematical and other libraries, and application software are available on the WestGrid software page:

http://www.westgrid.ca/support/software

3.7 How much computing power is available to me?

The answer is not straightforward, as there are many variables involved, including which WestGrid system is being used, whether you have a small number of large runs or a large number of small runs, and whether your project has been assigned special priority by the Resource Allocation Committee. An active user without unusually large memory or processor requirements can expect to obtain 10 to 30 processors on a fairly regular basis.

3.8 What is the WestGrid computing environment like?

All the WestGrid computers use a UNIX variant or Linux operating system. Work such as job preparation, compilation, testing and debugging may be done interactively, but the majority of the WestGrid computing resources are available only for production batch-oriented computing. A job script to run your program is written in a UNIX shell scripting language and submitted to the batch job handling system, which assigns it to a machine to run.

3.9 Is parallel programming required?

No. Although some of the WestGrid computers are reserved for parallel computing, there are legitimate reasons to run serial jobs, and you are welcome to run serial code on those systems where it is permitted. WestGrid support staff can assist you in selecting the most appropriate systems on which to run your jobs and with parallelizing your code.

3.10 What experience do I need to use WestGrid?

Because WestGrid is a production high-performance computing environment, researchers have a certain responsibility to use the systems effectively. You are expected to learn the basics of UNIX file handling, how to transfer files, how to submit and monitor batch jobs, how to monitor your disk storage, etc. Many WestGrid users come from a Microsoft Windows background, so you are not expected to be a UNIX expert. WestGrid support analysts are happy to help you get started and to assist you in learning to use the systems more effectively. You should be aware of the memory requirements of your job and be able to estimate such things as how long a job will take and how much disk space it will require. If you are developing code yourself, you are expected to optimize it through appropriate choice of algorithm and compiler flags and, in many cases, by using optimized numerical libraries. If you are using a discipline-specific package, you are expected to know how to prepare the input files, choose the appropriate options to apply the software to your particular problem, etc. The WestGrid environment is not particularly good for learning how to use software.

4 Quick Start Guide for New Users

This section is adapted from www.westgrid.ca/support/quickstart/new_users and is intended to help new WestGrid users find basic information needed to start using WestGrid. It consists primarily of links to other pages on the WestGrid web site. If you do not have a WestGrid account yet, please read the previous section for prospective users.

4.1 Getting Started

After an application for a WestGrid account has been approved, an email is sent to the new user to direct him or her to some of the key pages on the WestGrid web site. This guide gives a more extensive list. We recommend that you go through all of the topics in this section, exploring the links to more detailed information on the subjects that are most relevant to you. It is useful to try things as you go along and to ask the support staff (see Section 5.1, Getting Help, for contact information) if you encounter difficulties.

4.2 Choosing Which System to Use

After a WestGrid account is approved, the account holder typically receives a flurry of emails advising that the account is ready to use on particular WestGrid systems. One of these emails may include comments from the analyst who screened your account application about the special requirements section of the application form. Sometimes these comments include advice on which WestGrid system to use.

There are a number of factors to consider when choosing a system. The most important are typically whether or not your job runs in parallel and the amount of memory required per process. If the job can be run in parallel, an important criterion in deciding where to run it is whether it can make use of multiple cluster nodes (such as when using MPI) or has to be run on a shared-memory machine (such as when OpenMP is used). Some general guidelines are:

• Small-memory serial jobs or undemanding parallel jobs are typically run on the Glacier and Robson clusters.

• OpenMP-based parallel jobs and large-memory (> 4 GB) serial jobs may be run on the shared-memory architectures (Cortex and Nexus systems, with Cortex being the preferred starting point).

• For MPI-based parallel programs requiring a high-performance interconnect, try the Matrix cluster. You may also want to compare it with the systems available through Cortex.

• Jobs that require access to graphics/visualization hardware and software are typically run on Hydra.

• A commercial license for the Gaussian chemistry software is available only on the Lattice cluster.
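The guidelines above can be restated as a small lookup. The shell function below is only a sketch: the name suggest_system and the job-type keywords are invented for illustration, and the mapping simply repeats the list above rather than reflecting any official WestGrid tool.

```shell
# Hypothetical helper: map a job-type keyword to the system(s) suggested
# in the guidelines above. The keywords are invented for this sketch.
suggest_system() {
    case "$1" in
        serial|small-parallel) echo "Glacier or Robson" ;;
        openmp|large-memory)   echo "Cortex (preferred) or Nexus" ;;
        mpi)                   echo "Matrix" ;;
        visualization)         echo "Hydra" ;;
        gaussian)              echo "Lattice" ;;
        *)                     echo "unknown job type: ask WestGrid support" ;;
    esac
}

suggest_system openmp
```

Treat the result as a starting point only; the system descriptions on the WestGrid web site and advice from support staff take precedence.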

In other cases, availability of certain software libraries may dictate the system to use, but it may be possible to work around such issues by installing additional software or substituting one library for another. Up-to-date lists of installed system software, compilers, mathematical and other libraries, and application software are available on the WestGrid software page:

http://www.westgrid.ca/support/software

For detailed and up-to-date information about the various systems and advice on which one to use, see:

http://www.westgrid.ca/resources_services

4.3 Setting up Your Computer

To connect to and work with WestGrid systems you may have to install one or more software packages on your own computer. Although web browser-based tools may become available for accessing WestGrid in the future, especially as grid services are developed, most users will continue to log in and work directly on remote systems for some time to come.

4.3.1 Terminal client supporting SSH

The most important piece of software you will need is a terminal (client) program that supports the secure shell (SSH) protocol for network communications to remote servers. Linux and Mac OS X users can typically use the built-in terminal programs, whereas Microsoft Windows users often install an additional SSH client, such as PuTTY. PuTTY can be obtained from:

http://www.chiark.greenend.org.uk/~sgtatham/putty/

There is an extensive list of SSH clients at:

http://en.wikipedia.org/wiki/Comparison_of_SSH_clients

4.3.2 File transfer client supporting scp and sftp

You will also need software that supports secure transfer of files between your computer and the WestGrid machines. The command line programs scp and sftp can be used from within terminal programs on Linux or Mac OS X computers. On Microsoft Windows platforms, similar programs, pscp and psftp, come with PuTTY.
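As an illustration, the script below builds scp commands that copy a file to a WestGrid system and fetch a result back. It is a sketch only: the user name jsmith and the file names are placeholders, and Matrix is used simply because it is the example system elsewhere in this guide. The script prints the commands rather than running them, so it can be tried without a network connection.

```shell
WG_USER="jsmith"               # placeholder: your WestGrid user name
WG_HOST="matrix.westgrid.ca"   # any WestGrid system you have access to

# Upload input.dat to your home directory on the remote system.
upload="scp input.dat ${WG_USER}@${WG_HOST}:"
# Download results.out from the remote home directory to the current directory.
download="scp ${WG_USER}@${WG_HOST}:results.out ."

echo "$upload"
echo "$download"
```

Typing the printed commands at your own prompt performs the transfers. On Windows, pscp should accept the same source and destination forms in place of scp.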


4.3.3 X Window display server for graphics

To use graphical programs on WestGrid computers and show the results on your monitor, you will need to run an X Window display server (X server) program on your local computer. You start this program and leave it running in the background while using your SSH terminal program. When graphics commands are relayed by your SSH client from the remote WestGrid computer to the X server, it displays the appropriate graphics on your screen. Your keyboard and mouse commands are relayed in the other direction, passing from your SSH client to the graphics program running on the remote system. This process is called X11 tunnelling or forwarding. For it to work, look for an option in the settings or preferences of your SSH client program to turn on X11 tunnelling.

Commercial X Window display servers are available, but most users can get by with free programs. Linux users will find X Window support already installed with most distributions. Modern versions of Mac OS X ship with a program called X11, which is not installed by default but is on the system disks. One option for Microsoft Windows users is to install Xming. If installing Xming, you should also install the optional font package.

4.4 Connecting and Logging In

To connect successfully to WestGrid systems, your computer's IP address must be correctly registered in the Domain Name System (DNS). To test whether your IP address is suitable, visit:

http://westgrid.ca/iptest

To connect to a WestGrid system, start your SSH client and specify the host name of the chosen system and your user name, either in the connection dialogue box or on the SSH command line, depending on the type of SSH program you are using. Each WestGrid machine to which you can connect has an Internet address of the form machine_name.westgrid.ca. So, for example, to connect to Matrix from a command-line SSH program, you could type the following, replacing username with your WestGrid user name:

ssh username@matrix.westgrid.ca

If your user name on your local system is the same as on WestGrid, you may omit it and simply type:

ssh matrix.westgrid.ca

To start a session with X11 forwarding turned on, one can typically use

ssh -X matrix.westgrid.ca

although from Mac OS X systems, you may have to use

ssh -Y matrix.westgrid.ca
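The naming pattern described above (machine_name.westgrid.ca) lends itself to a small wrapper. The function below is a hypothetical convenience, not part of WestGrid or of SSH: the name wg_ssh_command and the user name jsmith are invented for this sketch, and the function only prints the command rather than opening a connection.

```shell
# Hypothetical helper: print the ssh command for a given WestGrid machine.
# Usage: wg_ssh_command MACHINE [USER] [x11]
wg_ssh_command() {
    machine="$1"
    user="$2"
    x11="$3"
    cmd="ssh"
    # Request X11 forwarding only when asked for.
    if [ "$x11" = "x11" ]; then
        cmd="$cmd -X"
    fi
    # Prefix the user name only when one is given.
    if [ -n "$user" ]; then
        echo "$cmd ${user}@${machine}.westgrid.ca"
    else
        echo "$cmd ${machine}.westgrid.ca"
    fi
}

wg_ssh_command matrix jsmith
wg_ssh_command matrix "" x11
```

The second call passes an empty user name, which corresponds to the case where your local and WestGrid user names are the same.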

If you have successfully connected to one of the WestGrid login servers, you will be prompted for a user name and password. The user name is not your full name or your email address, but the 2- to 8-character name that was entered in the "Requested Username" box when you applied for a WestGrid account. The password is the one you specified on that form. The same password is used for all WestGrid systems. For security, it is stored in an encrypted form. Consequently, if you have forgotten your password, WestGrid administrators will not be able to tell you what it is. Also for security reasons, new passwords are not sent via email. Instead, you choose your own new password and enter it on a web form that is validated using a temporary password given to you by telephone. To request a new password, write to WestGrid support (see Section 5.1, Getting Help, for contact information) and you will be given instructions on whom to telephone.

4.5 Working Interactively

The hardware at most of the WestGrid sites is set up with one or more servers to which users have direct login access, with the main computational clusters being accessed indirectly, by submitting batch job scripts. The batch jobs run non-interactively when the scheduling system is able to find a time slot with the computational resources needed for the job. However, interactive sessions are typically needed to prepare the batch scripts and input files, compile and debug programs, manage data and post-process results. Some guidelines for working interactively are given in this section.

4.5.1 The  UNIX  environment  

Each  of  the  WestGrid  computers  runs  some  version  of  the  UNIX  (or  Linux)  operating  system.  The  program  that  responds  to  your  typed  commands  and  allows  you  to  run  other  programs  is  called  the  UNIX  shell.  Examples  of  a  UNIX  shell  are  bash  and  tcsh.  It  is  useful  to  have  some  knowledge  of  the  shell  and  a  variety  of  other  command-­‐line  programs  that  you  can  use  to  manipulate  files.  If  you  are  new  to  UNIX  systems,  we  recommend  that  you  work  through  one  of  the  many  online  tutorials  that  are  available,  such  as  the  UNIX  Tutorial  for  Beginners  provided  by  the  University  of  Surrey:  

http://www.ee.surrey.ac.uk/Teaching/Unix/index.html  

Among other fundamental topics, the tutorial covers creating, renaming and deleting files and directories, listing your files and finding out how much disk space you are using.
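As a brief illustration of the kinds of commands such a tutorial introduces, the following sketch creates, renames, lists and deletes a file inside a temporary directory. The names project, a.txt and input.txt are made up for the example.

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
cd "$workdir"

mkdir project                        # create a directory
echo "test data" > project/a.txt     # create a small file
mv project/a.txt project/input.txt   # rename the file
ls -l project                        # list files with sizes and dates
du -sk project                       # disk space used by the directory, in kilobytes
rm project/input.txt                 # delete the file
rmdir project                        # delete the (now empty) directory
```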

The  UNIX  man  command  (man  for  "manual")  can  be  used  to  get  information  about  other  commands.  For  example,  a  reference  page  about  the  ls  command,  for  listing  file  names  and  properties,  can  be  displayed  by  typing:  

man ls

The default environment varies from one WestGrid system to another and also depends on which UNIX shell you selected on your WestGrid account application form. The working environment is partially determined by the commands in one or more startup files that are automatically executed every time you log in. For bash shell users, these files may include .bashrc and .bash_profile. For tcsh users, .login and .cshrc are executed. You can customize your environment by editing these files to change such things as the appearance of the shell prompt (the characters that appear at the start of the line when the shell is waiting for you to type a command) and the command path (a list of directories in which the shell will search for commands). Use caution when modifying these files, as inappropriate changes may prevent you from being able to work on the system.
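For bash users, the kind of customization described above might look like the following sketch of lines added to .bashrc. It assumes that you keep personal programs in a directory named $HOME/bin, which is only a common convention, and the prompt format shown is one possibility among many. Keep a copy of the original file before editing it.

```shell
# Example lines for ~/.bashrc (bash users only).
# Assumption: you keep personal programs in $HOME/bin.

# Add $HOME/bin to the command path, but only if it is not already there.
case ":$PATH:" in
  *":$HOME/bin:"*) ;;
  *) PATH="$HOME/bin:$PATH" ;;
esac
export PATH

# Show user name, host name and current directory in the shell prompt.
PS1='\u@\h:\w\$ '
```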

Please  note  that  binary  executable  files  from  Microsoft  Windows  PCs  will  not  run  on  the  WestGrid  systems.  In  order  to  work  with  such  programs,  you  must  obtain  the  source  code  and  recompile  for  use  on  UNIX  or  Linux.  Not  all  programs  will  have  such  source  code  available.  

4.5.2 File  systems  

As on other computer systems, in a UNIX environment there is a file system that provides a hierarchy of directories (called folders on some other systems) for storing files. When you log in, you are working in part of the file system called your home directory. You may create files and subdirectories in your home directory, although on some WestGrid systems there is a quota limiting the amount of space you can use. How you organize your files is up to you, but it might be helpful to create a separate subdirectory for each job that you submit and to have a separate directory for program source code.

When  naming  files  and  directories,  you  will  find  it  easier  to  navigate  the  file  hierarchy  and  to  reference  files  in  UNIX  commands  if  you  do  not  use  spaces  in  file  names.  Also,  keep  in  mind  that  UNIX  is  case  sensitive  in  most  situations,  so,  for  example,  Nobel_Prize.exe  and  nobel_prize.exe  refer  to  different  files.  Another  difference  between  UNIX  and  Microsoft  Windows  environments  is  that  a  file  suffix,  if  present,  is  of  no  particular  significance  to  the  basic  UNIX  file  manipulation  commands.  So,  for  example,  there  is  no  requirement  for  executable  programs  to  have  an  ".exe"  suffix.  

Besides  your  home  directory,  on  most  of  the  WestGrid  systems  there  are  additional  places  (/tmp,  /scratch  and  /global/scratch  among  others)  where  you  can  store  files  and  from  which  you  can  run  programs.  Some  file  systems  have  more  space  than  others.  Sometimes  there  are  performance  reasons  for  choosing  one  location  vs.  another.  There  may  also  be  different  usage  policies  (how  long  you  can  keep  files  and  how  big  they  can  be)  for  the  various  file  systems.    

4.5.3 Transferring  files  

When just starting out on WestGrid systems, you will likely have source code or data to be transferred from your own computer or one at your own institution. Similar to the requirement for a terminal program supporting SSH (Secure Shell), WestGrid requires that you use file transfer software that supports SCP (Secure Copy) or SFTP (SSH File Transfer Protocol). Most SSH packages come with additional programs to support these secure file transfer methods.

Once  you  have  files  on  a  WestGrid  system,  you  may  move  them  between  directories  using  the  UNIX  mv  command,  or  to  other  WestGrid  sites  using  scp  or  sftp.  We  also  provide  a  utility  called  gcp  (grid  copy)  that  efficiently  transfers  files  between  WestGrid  systems.  

For  long  term  storage  of  large  files,  consider  using  the  WestGrid  storage  facility.  

One thing to be aware of when transferring files is that there are different conventions for the characters that terminate each line in a text file on UNIX/Linux, Microsoft Windows and Macintosh computers. File transfer software typically has a transfer mode in which line-ending conversion is done automatically. For example, in Microsoft Windows-based programs, files that have a .txt suffix would be treated as text files for which conversion would likely be done, but C or Fortran source code files having names ending in .c or .f, respectively, might not be recognized as text. You may have to configure your file transfer software to correctly handle files that you commonly use.
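If a text file arrives with Windows-style line endings, it can also be checked and converted after the transfer. The following is a minimal sketch using the standard grep and tr commands. The file names are made up, and the sample file is created here only so that the example is self-contained.

```shell
# Work in a scratch directory with a small demonstration file that has
# Windows-style (CR LF) line endings.
cd "$(mktemp -d)"
printf 'line one\r\nline two\r\n' > dos_example.txt

# Count the lines that end with a carriage return (CR) character.
cr=$(printf '\r')
crlf_before=$(grep -c "$cr\$" dos_example.txt)

# Write a converted copy with every carriage return removed.
tr -d '\r' < dos_example.txt > unix_example.txt
crlf_after=$(grep -c "$cr\$" unix_example.txt)

echo "Lines ending in CR before: $crlf_before, after: $crlf_after"
```

The conversion writes to a new file rather than overwriting the original, so the original can be kept until the result has been checked.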

4.5.4 Editing  files  

One choice for creating and editing files, to prepare batch scripts or input for your programs, for example, is to transfer files to your own computer and use a local editor with which you are familiar. However, a better choice for most users is to edit the files directly on the WestGrid system on which they will be used. There are several editors available for you to use, as shown on the software page. Two editors commonly used on UNIX systems are emacs and vi. However, if you are coming from a Microsoft Windows background and have set up your computer with X Windows software, as described above, you may prefer the nedit editor. This is a graphical editor, with keyboard shortcuts similar to those found on PCs. See the next section for comments about running nedit and other interactive programs.

There  are  also  a  number  of  UNIX  commands  available  for  looking  at  the  contents  of  files.  For  example,  to  page  through  an  output  file,  test.pbs.o31416,  the  more  command  can  be  used:  

more test.pbs.o31416

4.5.5 Running  interactive  programs  

To  run  a  program  on  a  UNIX  system,  type  the  name  of  the  corresponding  executable  file  on  the  command  line  at  the  shell  prompt.  The  UNIX  shell  searches  for  the  command  only  in  the  directories  in  a  list  stored  in  a  variable,  PATH.  You  can  see  this  list  by  typing:  

echo $PATH

If  you  get  a  "command  not  found"  error,  check  for  a  spelling  mistake  or  a  letter  typed  in  the  wrong  case,  or  confirm  that  the  directory  containing  the  executable  file  is  in  your  command  path.  On  some  WestGrid  systems,  the  current  working  directory  is  not  part  of  the  default  command  path.  In  such  a  case,  you  can  either  change  the  PATH  or  type  "./"  in  front  of  the  command,  as  in:  

./my_command
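The two alternatives can be tried safely with a small test script. In this sketch, my_command is a made-up script created in a temporary directory. The change to PATH lasts only for the current session unless it is also added to a startup file.

```shell
# Create a tiny executable script in a scratch directory.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from my_command"\n' > "$demo_dir/my_command"
chmod +x "$demo_dir/my_command"

# Run it by giving an explicit path; this works whatever PATH contains.
cd "$demo_dir"
./my_command

# Alternatively, append the directory to PATH for the current session,
# after which the bare name is found.
PATH="$PATH:$demo_dir"
export PATH
my_command
```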

Many programs (including UNIX commands) take additional arguments, such as numerical parameters or file names, which are listed on the command line after the name of the executable program. Arguments that select options are often preceded by a dash. For example, to list the last 40 lines of the file geometry.in, you could use the UNIX tail command:

tail -n 40 geometry.in

4.5.6 Restrictions  on  interactive  jobs  

Since the servers to which you log in are shared by many users, interactive work on those machines should be limited to activities such as editing files, compiling programs or running small, short tests of your program. The memory and number of processors vary among the login servers, so the exact policy on the length and size allowed for test runs varies from machine to machine. On some systems there are special queues with short time limits that are intended for batch jobs for testing and debugging. It is also possible to submit a placeholder batch job to reserve one or more dedicated processors, which may then be used for interactive work without interfering with other users' jobs.

4.6 Software  

4.6.1 Locating  installed  software  

Installed software on WestGrid systems includes the UNIX or Linux operating system and a number of standard utilities that often come with such systems. A number of major commercial and free software packages are also available, as well as compilers and a variety of numerical, graphics and file-manipulation libraries for researchers compiling their own codes. Refer to the main WestGrid software page for details on which packages have been installed on each of the main computational systems:

http://www.westgrid.ca/support/software  

The installation directories have not been standardized, so please refer to the table at the top of the software page for a list of the directories where software is typically installed on each system.

4.6.2 Installing  your  own  software  

You  are  welcome  to  install  software  under  your  home  directory  (if  the  software  license  allows  the  software  to  be  used  on  remote  machines  that  are  not  under  your  direct  control  and  which  may  not  be  at  your  home  institution).  If  you  need  to  share  a  software  package  with  other  members  of  your  group,  a  corresponding  UNIX  group  can  be  created  to  control  access  to  the  software.  Write  to  [email protected]  for  details  on  how  to  do  this.  

See  the  programming  section  below  for  information  on  compiling  your  code.  

4.6.3 Requesting  software  installation  

If a package was installed for testing or at the request of a limited number of researchers, it may not be listed on the software page. So, if there is a package that you need, there is a chance that it has already been installed but not announced. In any case, please write to [email protected] to ask whether a given software package is available or can be installed.

4.6.4 Software  licensing  

Although WestGrid has purchased some commercial software, such as the Gaussian chemistry code, other packages, such as ABAQUS and MATLAB, are run on WestGrid systems using licenses provided by WestGrid institutions rather than by WestGrid itself. There are often limitations on such licenses, in terms of where the software may be run and how many simultaneous copies may be used.

4.7 Programming

A general introduction to programming on WestGrid systems is available at:

http://www.westgrid.ca/support/programming  

That  webpage  includes  links  to  such  things  as  parallel  programming  tutorials  and  to  a  series  of  pages  giving  examples  of  using  the  main  compilers  on  all  the  WestGrid  systems.  Lists  of  all  the  compilers  and  the  numerical  (and  other)  libraries  are  available  at:  

http://www.westgrid.ca/support/software  

If you have used non-standard language features in your code, you may need to make some changes in order to get it to run on WestGrid systems. Trying your code with more than one compiler is recommended, as this helps identify non-portable sections of your code that should be improved. Contact support staff (see section 5.1 for contact information) if you would like help with porting, debugging or optimizing your code.

Sometimes researchers have chosen to use WestGrid because they want to increase the size of the problem being studied. Running the code on larger data sets can sometimes uncover performance issues or memory access problems. If the code was previously run only on a 32-bit system, moving to a 64-bit environment may require changes if inappropriate assumptions were made regarding the size of some data types, for example.

Another  issue  that  arises  when  tackling  larger  problems  is  the  length  of  time  required  for  the  calculation.  Some  WestGrid  systems  have  job  time  limits  as  short  as  one  day.  It  is  recommended  that  you  design  your  program  to  include  a  checkpoint  and  restart  capability.  That  is,  you  should  periodically  write  out  enough  data  so  that  your  program  can  be  restarted,  if  necessary,  by  reading  in  that  data.  That  way  you  can  avoid  losing  the  entire  calculation  if  the  program  doesn't  finish  before  the  job  time  limit  is  reached.  
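The checkpoint and restart pattern can be sketched with a short shell script. In a real application the same logic would be written in the program's own language (C or Fortran, for example) and the saved data would be the state of the calculation. Here the file name checkpoint.dat, the step count and the running total are all made up for illustration.

```shell
# Work in a scratch directory; checkpoint.dat is a hypothetical file name.
cd "$(mktemp -d)"
checkpoint=checkpoint.dat
last_step=10

# For demonstration only: pretend an earlier run was cut off after step 4,
# leaving "step total" in the checkpoint file (1+2+3+4 = 10).
echo "4 10" > "$checkpoint"

# Restart: read the saved state if a checkpoint exists, otherwise start fresh.
if [ -f "$checkpoint" ]; then
    read -r step total < "$checkpoint"
else
    step=0
    total=0
fi

while [ "$step" -lt "$last_step" ]; do
    step=$((step + 1))
    total=$((total + step))          # stand-in for the real calculation
    # Write to a temporary file and then rename it, so that an interruption
    # never leaves a half-written checkpoint.
    echo "$step $total" > "$checkpoint.tmp"
    mv "$checkpoint.tmp" "$checkpoint"
done

echo "Finished at step $step with total $total"
```

Writing a checkpoint after every step is only sensible when steps are long. A real program would normally checkpoint at an interval chosen so that the time spent writing files is small compared with the time spent computing.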

4.8 Running  Batch  Jobs  

4.8.1 The  batch  environment  

As  mentioned  above,  the  main  WestGrid  computational  clusters  are  accessed  by  submitting  batch  job  scripts  from  a  login  server.  It  is  usually  not  necessary  (and  in  some  cases  not  allowed)  to  log  on  to  the  compute  nodes  directly.  The  system  software  that  handles  your  batch  job  consists  of  two  pieces:  a  resource  manager  (TORQUE)  and  a  scheduler  (Moab).  Documentation  for  these  packages  is  available  through  Cluster  Resources.  However,  typical  users  will  not  need  to  study  those  details.  

4.8.2 Batch  job  scripts  

Batch job scripts are UNIX shell scripts (basically text files of commands for the UNIX shell to interpret, similar to what you could execute by typing directly at a keyboard) that include special comment lines containing TORQUE directives. TORQUE evolved from software called PBS (Portable Batch System). As a consequence of that history, TORQUE directive lines begin with #PBS, some environment variables contain "PBS" (such as $PBS_O_WORKDIR in the script below) and the script files themselves typically have a .pbs suffix (although that is not required).

There are small but significant differences in the batch job scripts, particularly for parallel jobs, among the various WestGrid systems. Examples for each system, for both serial and parallel jobs, are given on the WestGrid website. So, if you begin working on one WestGrid system and switch to another, refer to the documentation before submitting jobs on the second system.

Here  is  an  example  job  script,  diffuse.pbs,  for  a  serial  job  on  the  Glacier  cluster,  to  run  a  program  named  diffuse.  

#!/bin/bash
#PBS -S /bin/bash

# Script for running serial program, diffuse,
# on glacier

cd $PBS_O_WORKDIR
echo "Current working directory is `pwd`"
echo "Starting run at: `date`"
./diffuse
echo "Job finished with exit code $? at: `date`"

4.8.3 Commands  for  submitting,  monitoring  and  deleting  jobs  

To  submit  the  script,  diffuse.pbs,  to  the  batch  job  handling  system,  use  the  qsub  command:  

qsub diffuse.pbs

If  a  job  is  expected  to  take  longer  than  the  default  time  limit  (typically  three  hours)  or  uses  more  than  the  default  memory,  additional  arguments  may  be  added  to  the  qsub  command  line.  If  diffuse  is  a  parallel  program,  you  also  have  to  specify  the  number  of  nodes  on  which  it  is  to  run.  For  example:  

qsub -l walltime=72:00:00,mem=1500mb,nodes=4 diffuse.pbs

Please  see  the  Running  Jobs  pages  or  QuickStart  guides  for  the  individual  WestGrid  systems  for  more  information  about  the  walltime,  memory  and  node  limits  for  specific  machines.  
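Resource requests can also be placed in the job script itself as #PBS -l directive lines, so that they do not have to be retyped for each submission. The following is a sketch for a serial job. The walltime and memory values are simply those used in the qsub example, not recommendations, and the fallback to the current directory is an addition that lets the script be tried by hand outside the batch system.

```shell
#!/bin/bash
#PBS -S /bin/bash
#PBS -l walltime=72:00:00
#PBS -l mem=1500mb

# TORQUE sets PBS_O_WORKDIR to the directory from which qsub was run.
# Fall back to the current directory so the script can also be tested by hand.
cd "${PBS_O_WORKDIR:-$PWD}"
echo "Starting run at: $(date)"

# Run the serial program and remember its exit code.
./diffuse
status=$?
echo "Job finished with exit code $status at: $(date)"
```

Options given on the qsub command line normally take precedence over directives in the script, so a single run can still be given different limits without editing the file.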

When  qsub  processes  the  job,  it  assigns  it  a  job  ID  and  places  the  job  in  a  queue  to  await  execution.  To  check  on  the  status  of  all  the  jobs  on  the  system,  type:  

showq

To  limit  the  listing  to  show  just  the  jobs  associated  with  your  user  name,  type:  

showq -u username

To delete a job, use the qdel command with the job ID that qsub assigned:

qdel jobid
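Since qsub prints the ID of the new job when it is submitted, one convenient approach is to capture that ID in a shell variable for later use. This is a sketch only. It assumes the diffuse.pbs script from the example above, and it simply reports a message if it is run on a machine where qsub is not installed.

```shell
# qsub prints the ID of the new job on standard output; keep it in a variable.
if command -v qsub >/dev/null 2>&1; then
    jobid=$(qsub diffuse.pbs)
    echo "Submitted job $jobid"
    # Later, if the job is no longer wanted:
    # qdel "$jobid"
else
    jobid=""
    echo "qsub not found: run this on a WestGrid login server"
fi
```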

On some WestGrid systems it is difficult to directly monitor some aspects of a job's progress, so it is a good idea to make sure that your program periodically writes output to a file. You can then check the contents of that file to see how the program is doing. In other cases, such as when you need to confirm how much memory your job is using, you may have to write to [email protected] to request that an administrator check on the job for you.

4.9 Post-Processing

After having completed some calculations on the WestGrid machines, most researchers will need to post-process some output files.

4.9.1 Managing  files  

In some cases, after a preliminary examination of the output, there may be a way to reduce the volume of data by extracting key numbers and then discarding some of the output. The UNIX grep utility may be helpful in simple cases. A more elaborate process using shell scripts or other programs may be needed. Once the data have been consolidated, files should be backed up, either by transferring them to your own computer or by using the WestGrid storage facility, as mentioned in the section on transferring files, above. If you have a large number of small files, you should consider combining and compressing them with the tar and gzip programs.
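As a minimal sketch of these steps, the following commands extract one key line from each of several output files with grep and then bundle the files into a single compressed archive with tar and gzip. The directory name results, the file names and the "Final energy" label are invented for the example.

```shell
# Set up a scratch directory with a few small, made-up output files.
work=$(mktemp -d)
cd "$work"
mkdir results
for i in 1 2 3; do
    printf 'iteration log ...\nFinal energy: -%s.5\n' "$i" > "results/run$i.out"
done

# Extract only the key lines from all output files into one summary file.
grep "Final energy" results/*.out > summary.txt

# Combine the many small files into a single compressed archive.
tar -cf results.tar results
gzip results.tar                 # produces results.tar.gz

# List the archive contents to confirm them before deleting any originals.
gzip -dc results.tar.gz | tar -tf -
```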

4.9.2 Visualization  

For most types of calculation, graphical display of the output can be useful for identifying bugs in programs, for interpreting the data and for summarizing the results for others. WestGrid has hardware and software at one site specially geared toward remote visualization; however, it is possible to use visualization tools on any of the WestGrid systems. Graphical data analysis needs tend to be quite specific, so you are encouraged to discuss your particular project with WestGrid support analysts. In some cases it may be feasible to produce graphs or images in batch mode. In other cases, where more interactivity is required, we may recommend using the WestGrid visualization server or transferring the data back to your own computer for visualization there.

4.10 Usage  Guidelines  

4.10.1 Job  limits  

WestGrid comprises a wide range of hardware types, from single-node, large shared-memory machines to clusters consisting of many dual-processor, small-memory nodes. The maximum time limit allowed, the maximum number of processors that may be requested, the maximum number of jobs that can run simultaneously, etc. have been set by system administrators based on the characteristics of the machines and the role they play in the WestGrid environment. Generally speaking, jobs that request more resources (processors or memory) are subject to stricter limits than jobs that request fewer.

4.10.2 How  much  is  reasonable?  

In general, you may submit as many jobs as you like, as the batch scheduling system will restrict the number that run at any given time. However, so as not to burden the scheduling system unnecessarily or alarm other users, in most cases you should stage job submission so that you don't have many weeks of work waiting to run. It would be reasonable to submit some tens of jobs, for example, if they last a few days each, or hundreds of jobs if they are only a few hours long. You should plan to monitor your runs regularly.

WestGrid  users  are  also  expected  to  take  some  responsibility  for  ensuring  that  their  jobs  are  running  efficiently,  through  the  use  of  appropriate  algorithms  and  compiler  optimization  options  and  linking  to  optimized  libraries  when  possible.  Programs  should  be  tested  on  small  problems  before  committing  to  longer  runs  using  more  resources.  In  general,  parallel  programs  run  more  efficiently  on  smaller  numbers  of  processors.  So,  study  how  the  performance  of  your  code  depends  on  the  number  of  processors  used  and  balance  the  need  for  quick  turnaround  of  your  jobs  with  overall  efficiency  (that  is,  use  small  numbers  of  processors  unless  you  have  a  good  reason  not  to).  
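One simple way to study this dependence is to time the same problem on different numbers of processors and compute the speedup, T(1)/T(N), and the parallel efficiency, T(1)/(N x T(N)). The following sketch does the arithmetic with awk. The timings are invented purely for illustration and should be replaced with measurements from your own code.

```shell
# Hypothetical wall-clock times in seconds for the same problem:
# column 1 = number of processors, column 2 = elapsed time.
timings="1 1000
2 520
4 280
8 170"

# speedup = T(1) / T(N); efficiency = speedup / N, shown as a percentage.
report=$(echo "$timings" | awk '
    NR == 1 { t1 = $2 }
    { printf "%2d processors: speedup %.2f, efficiency %.0f%%\n", $1, t1 / $2, 100 * t1 / ($2 * $1) }')
echo "$report"
```

When the efficiency drops well below 100% as processors are added, the extra processors are largely being wasted and a smaller request is usually the better choice.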

4.10.3 Job priorities and the fairshare policy

Users are usually concerned that their jobs may not be progressing in the queue relative to those of other users. There are a number of factors that affect the priority of the jobs waiting to run. The basic mechanism for determining the priority is called fairshare, in which target usage amounts are assigned to each project. When considering which jobs to run, the scheduling software takes into account the past history (typically over a time span of a couple of weeks, with more recent usage weighted more heavily) and compares the amount of processing completed to the target. Priorities of the jobs are raised or lowered so as to try to meet the fairshare targets.

4.10.4 Resource  Allocation  Committee  

In  spite  of  the  name,  everyone's  fair  share  is  not  the  same.  There  is  a  mechanism  for  requesting  enhanced  priority  if  a  project's  needs  for  computational  or  storage  resources  extend  beyond  the  average.  Periodically,  applications  are  solicited  for  awards  from  the  WestGrid  Resource  Allocation  Committee  and  the  National  Resource  Allocation  Committee  for  this  privilege.  More  information  about  the  resource  allocation  committees  is  available  at:  

http://www.westgrid.ca/support/accounts/rac  

4.10.5 Accounting

Project usage statistics are available for viewing by project members by logging on to the WestGrid portal:

http://portal.westgrid.ca  

5 More  information  

5.1 Getting Help

WestGrid has a team of technical analysts available to assist researchers with using the WestGrid resources. The analysts provide a wide range of services to researchers, for example:

• Assistance with getting started with HPC

• Training courses and seminars

• Assistance  with  code  development,  debugging,  optimization,  porting  and  parallelization  

• Assistance  with  code  performance  analysis  

• Assistance  with  scientific  visualization  

• Data  management  advice  

There  is  a  single  email  address,  [email protected],  which  is  read  by  all  analysts  at  all  the  WestGrid  sites.  Use  this  address  for  questions/assistance  related  to  any  of  the  WestGrid  facilities.  University  of  Manitoba  researchers  and  graduate  students  can  also  contact  their  local  HPC  analyst  directly  for  consultation  and  help:  

Jonatan Aronsson
Office: E2-586 EITC Building, Fort Garry Campus
Email: [email protected]
Phone: (204) 474-6912

There  are  a  number  of  other  resources  that  you  can  also  use  to  get  help  with  HPC  and/or  WestGrid:  

• WestGrid Website: The WestGrid web site (http://www.westgrid.ca) contains detailed information specific to the WestGrid systems, including how to run jobs and compile codes, usage policies, etc.

• Training Seminars: During the fall and winter, WestGrid offers a series of seminars through video conferencing and, in some cases, by web streaming. Past topics have included an overview of WestGrid facilities, introduction to UNIX, serial and parallel (OpenMP and MPI) programming, submitting jobs and data visualization. See the WestGrid training page (http://www.westgrid.ca/support/training) for the schedule and list of topics in the next seminar series.

• Online Training: There are numerous online tutorials on topics such as basic UNIX commands, shell scripting and parallel programming. Some of these are referenced on the corresponding WestGrid web pages, or you can write to the support list mentioned above for recommendations on material covering specific topics.

5.2 Local WestGrid Contacts

For support-related inquiries, please see section 5.1.

Dr. Byron Southern
WestGrid Principal Investigator
[email protected]

Dr. Peter Graham
Representative to the WestGrid Senior Planning and User Needs Committees
[email protected]

Mr. David Wyatt
WestGrid Technical Site Lead and System Administrator
[email protected]

Mr. Jonatan Aronsson
HPC Applications Analyst and Collaboration/Visualization Coordinator
[email protected]