Website in a Box or the Next Generation Hosting Platform


Copyright 2016. All Rights Reserved. Not for disclosure without written permission.

Website in a Box
The Next Generation Hosting Platform

Slava Vladyshevsky, Alex Kostsin


Table of Contents

PLATFORM OVERVIEW
    INFRASTRUCTURE OVERVIEW
    NETWORK SETUP OVERVIEW
PLATFORM USER ROLES
PLATFORM COMPONENTS
    PLATFORM SERVICES
        Stats Collector
        Stats Database
        Image Registry
        Image Builder
        Deployment Service
        Container Provisioning Service
        Reporting Service
        Persistent Volumes
        Volume Sync-Share Service
        Persistent Database Storage
            Database Driver
            Percona XtraDB Cluster Limitations
        Secure Storage
        Identity Management Service
        Load-Balancer Service
        SCM Service
        Workflow Engine
        SonarQube Service
        Sonar Database
        Sonar Scanner
    PLATFORM INTERFACES
        API Endpoints
        Command Line Interfaces
            Platform CLI
            Docker CLI
        Web Portals
            Stats Visualization Portal
            GitLab Portal
            Sonar Portal
            Platform Orchestration Portal
    OTHER COMPONENTS
        Docker Engine
        Docker Containers
PLATFORM CAPACITY MODEL
PLATFORM SECURITY
    USER NAMESPACE REMAP
    DOCKER BENCH FOR SECURITY
    WEB APPLICATION SECURITY
PLATFORM CHANGE MANAGEMENT
DRUPAL HOSTING
    DRUPAL SITE COMPONENTS
    DRUPAL CONTAINER COMPONENTS
    DRUPAL CONTAINER PERFORMANCE
        Sizing Considerations
        Apache vs. NGINX
        Performance Test
        Process Size Conundrum


    DRUPAL PROJECT CREATION
    DRUPAL WEBSITE DEPLOYMENT
        Web Project Deployment
        Web Container Deployment
        Website Deployment Workflow
    EDITORIAL WORKFLOW
    CONTENT PUBLISHING
ACTIVE DIRECTORY STRUCTURE
GITLAB REPOSITORY STRUCTURE
MANAGEMENT TASKS AND WORKFLOWS
PLATFORM STARTUP
BASE OS IMAGE
    THE OS IMAGE INSIDE CONTAINER
    ONE VS. MULTIPLE APPLICATIONS
    PROCESS SUPERVISOR
    QUICK SUMMARY
STORAGE SCALABILITY IN DOCKER
    LOOP-LVM
    DIRECT-LVM
    BTRFS
    OVERLAYFS
    ZFS
CONCLUSION

Figure Register

Figure 1 - Infrastructure Diagram
Figure 2 - Foundation Infrastructure Diagram
Figure 3 - High-level Network Diagram
Figure 4 - Platform Components
Figure 5 - cAdvisor Web UI: CPU usage
Figure 6 - InfluxDB Web Console
Figure 7 - Image Builder UI
Figure 8 - Sonar Project Dashboard
Figure 9 - Sonar Issue Report
Figure 10 - Stats Visualization and Analysis Portal
Figure 11 - GitLab Portal
Figure 12 - Sonar Portal
Figure 13 - Platform Orchestration Portal
Figure 14 - Platform Capacity Model
Figure 15 - Drupal CMS: Configuration Portal
Figure 16 - Drupal Site Components
Figure 17 - Web Container Components
Figure 18 - Stress Test Results
Figure 19 - Drupal Project Creation Process
Figure 20 - Drupal Project Deployment Process
Figure 21 - Website Deployment Workflow
Figure 22 - Editorial Workflow
Figure 23 - Content Publishing Process
Figure 24 - Example: MS Active Directory Structure


Platform Overview

This document provides an in-depth overview of the Proof of Concept project, hereinafter POC, for container-based LAMP web hosting. The POC project has been performed to verify technical feasibility and architectural assumptions, as well as to demonstrate our expertise in this domain to the prospective customer. It is assumed that this project, or parts of it, will be adopted and productized.

No clear requirements have been provided. Therefore, the overall design and architectural decisions have mostly been governed by the following assumptions:

• The platform must provide fully managed website placeholders that will be populated with customer-provided code and assets;
• The platform must provide a LAMP (Linux, Apache, MySQL, PHP) run-time environment;
• The platform architecture must be similar to the existing Windows hosting platform;
• The platform must guarantee high availability for production workloads;
• The platform must prevent the noisy-neighbor effect, i.e. websites sharing the same infrastructure must not impact each other's performance;
• The platform must support different website sizes and resource allocation profiles;
• The platform must guarantee resources and be able to report on their usage.

From the early project stages it has been assumed that the hosting platform will utilize the Linux container technology popularized by Docker and often referred to as Docker Containers. Docker is a good fit for such a hosting platform, since Docker Containers:

• Allow for much higher workload density than VMs;
• Provide sufficient workload isolation and containment;
• Enable granular resource management and reporting;
• Are considered the future of PaaS.

Soon it became apparent that much more than Docker alone is required to support the platform requirements, and that additional services and components are essential for providing reliable hosting services.

Over time, the set of Docker containers and the bunch of scripts to manage them evolved into a real platform with well-defined services, components and interfaces between them. Operational procedures and workflows have been automated and exposed via different interfaces to enable future integration and instrumentation.

The platform architecture, design approach and processes rely heavily on the Twelve-Factor App principles. For more details see https://12factor.net/.

Infrastructure Overview

Originally the platform was built on top of a Kubernetes cluster for simplified container scheduling and orchestration. Due to the lack of expertise in the Support Organization and little acceptance within the account team, this approach has been discontinued, and the Platform Infrastructure setup followed and adopted as much as possible the existing Windows hosting platform architecture.


Figure 1 - Infrastructure Diagram

The POC farm infrastructure mimics the existing web-farm setup for Windows hosting:

• All inbound network traffic passes the CDN/WAF;
• The network is split into two security zones: DMZ and TRUST;
• The front-end services and service components are hosted in the DMZ subnet;
• The back-end and secured components are located in the TRUST subnet;
• Coming from the CDN/WAF, network traffic passes firewalls and load-balancers;
• Production HTTP/S VIPs pass traffic to an HA pair of web instances;
• Other HTTP/S VIPs, e.g. Staging, pass traffic to a single endpoint;
• The TRUST subnet contains the DB servers: a cluster for production workloads and a single instance for staging use;
• All platform services and components run in corresponding containers, with the exception of the DB instances, which run directly on the host OS.

There is an additional shared farm, the so-called Utility or Foundation farm, one per DC, where various utility services shared across multiple farms and websites are hosted. For a production deployment it may be beneficial from a security standpoint to place some foundation services into the TRUST subnet.


 

Figure 2 - Foundation Infrastructure Diagram

It is envisioned that the existing foundation farm will need to be extended with at least two additional systems to provide the required foundation services. This assumes that the rest of the existing foundation services, such as Active Directory, DNS, SMTP, NTP, etc., will be shared with the new platform.

Network Setup Overview

The diagram below shows a logical view of the hosting network structure. It's worth mentioning that besides the TRUST and DMZ VLANs, Docker adds one more layer of indirection by creating at least one network bridge per Docker host to pass traffic between containers and the external world.

A number of solutions have emerged over the past couple of years bringing SDN and network virtualization capabilities to the container ecosystem. During this POC project we won't be exploring these network abstraction solutions and will use the standard network stack provided by Docker.
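For illustration, the per-host bridge created by the Docker Engine can be inspected with the stock Docker CLI; the subnet shown is Docker's usual default and will differ per environment:

# list networks and print the container subnet behind the default bridge
$ docker network ls
$ docker network inspect bridge --format '{{(index .IPAM.Config 0).Subnet}}'
172.17.0.0/16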


 

Figure 3 - High-level Network Diagram

Platform User Roles

The user role definition is tightly bound to its scope. The following scopes are defined:

• Platform Scope – platform-wide scope, including all hosted organizations and applications;
• Organization Scope – includes organization-owned objects and applications;
• Application Scope – includes objects and components pertinent to a given application.

Specific user roles and their mapping will be dictated by the particular use-case and the processes accepted within the hosting organization.

For the sake of simplicity, we'll assume the following major roles defined in the scope of the proposed hosting platform:

• Authorized User – a user that has passed authentication and has been assigned corresponding permissions:
  o Administrator – a management user performing administrative tasks;
  o Developer – an individual writing and testing code;
  o Content Manager – an editor, an individual authoring and managing the website content;
• Anonymous User – a website visitor coming from the public Internet.

The Identity Management (IdM) Service performs the mapping between a user identity and its associated roles. This is implemented using LDAP grouping mechanisms.


Things to keep in mind:
• The user role depends on the scope, e.g. one user may be a Developer in one organization and act as a Content Manager in another organization. While this is possible, such cross-organization role assignments are generally discouraged;
• One may differentiate Platform User Roles from Application User Roles for the applications deployed on the platform. However, both user types are authenticated and authorized using the same IdM Platform Service, so in practice there is no real difference. For example, Drupal user roles are a subset of the platform user roles;
• Both Applications and the Platform currently use the IdM Service; however, this is not a mandatory requirement. Additional or alternative authentication mechanisms may be used too. For example, many Platform services have a local user database and local administrative accounts in order to be able to act autonomously in case of an IdM Service failure or other issues;
• A website visitor is not required to pass authentication and is granted the Anonymous User role by default.

Platform Components

Below is the high-level diagram of the Platform components. Connectors depict major¹ communication channels and interactions between services, and generally may be read as "using" statements. The dotted-line connectors show alternative paths.

   

Figure 4 - Platform Components

¹ Major refers to the fact that some dependencies are not shown to avoid diagram clutter. E.g. pretty much all platform components depend on Persistent Volumes, and this is not depicted here.


Components are marked with different colors to differentiate their types:
• Red components are administrative or management portals;
• Yellow components are Platform Services, generally speaking, containers;
• Blue components are development portals;
• Grey components are general-purpose platform building blocks;
• Green components are hosted website instances the user interacts with.

The following platform Actors are defined:

• Admin – platform administrator;
• Dev – website developer;
• Website User – both content manager and public Internet user.

 

Platform Services

Below is a short overview of the Platform Services. For every service, a description of its role and dependencies is provided, as well as configuration and usage examples.

The service startup instructions in this chapter are provided for demonstration purposes only. Normally services are expected to boot in an automated manner, for example using Docker Compose scripts. By using Compose we can ensure repeatable and consistent configuration, as well as reliable service startup and recovery. See the Platform Startup chapter for additional details; a minimal Compose sketch follows.
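As a minimal sketch of such a Compose definition (the service shown and the file location are illustrative placeholders, not the actual platform scripts; ${REGISTRY} and ${VOL_DATA} are interpolated by Compose from the environment):

$ cat <<"EOT" > /opt/deploy/docker-compose.yml
version: '2'
services:
  influxdb:
    image: ${REGISTRY}/influxdb
    restart: always
    mem_limit: 1g
    ports:
      - "8083:8083"
      - "8086:8086"
    volumes:
      - ${VOL_DATA}/influxdb:/influxdb
EOT
$ docker-compose -f /opt/deploy/docker-compose.yml up -d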

Stats Collector

The Platform Stats Collector is a stateless service, implemented as a container running on every Docker host, that collects the resource usage stats exposed by the Docker Engine using the Google cAdvisor application: https://github.com/google/cadvisor.

To quote the project page: "The cAdvisor (Container Advisor) provides container users an understanding of the resource usage and performance characteristics of their running containers. It is a running daemon that collects, aggregates, processes, and exports information about running containers. Specifically, for each container it keeps resource isolation parameters, historical resource usage, and histograms of complete historical resource usage and network statistics. This data may be exported either by container or machine-wide. The cAdvisor has native support for Docker containers and should support just about any other container type out of the box."

The current setup assumes that the Stats Collector uses the Stats DB service for storing the metrics collected from the Docker Engine. Therefore the Stats Collector depends on the Stats DB service and the Docker Engine APIs, and must be deployed and booted accordingly.

Alternatively, it is possible to use https://github.com/kubernetes/heapster for stats aggregation and resource monitoring in more complex deployments, or to query the Docker API directly if more control or flexibility is required.

Although cAdvisor instances may be accessed directly and provide a Web UI for metric visualization, the more practical approach is to export the collected stats to an external database that may be used for arbitrary data aggregation, reporting and analysis tasks. cAdvisor provides multiple storage drivers out of the box. The current implementation uses the InfluxDB time-series database for storing the collected measurements.

Below is an example of the chart produced by cAdvisor at runtime. It has quite limited practical use, if any at all, and is provided for reference purposes only.


Figure 5 - cAdvisor Web UI: CPU usage

Below is an example command for running the cAdvisor container:

$ docker run --name=cadvisor --hostname=`hostname` --detach=true --restart=always \
   --cpu-shares 100 --memory 500m --memory-swap 1G --userns=host --publish=8080:8080 \
   --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw --volume=/sys:/sys:ro \
   --volume=/var/lib/docker/:/var/lib/docker:ro \
   google/cadvisor:v0.24.0 \
     -storage_driver=influxdb -storage_driver_db=cadvisor -storage_driver_host=${INFLUXDB_HOST}:8086 \
     -storage_driver_user=${INFLUXDB_RW_USER} -storage_driver_password="${INFLUXDB_RW_PASS}"

cAdvisor is still an evolving project and, unfortunately, has its own shortcomings; for example, it only accepts configuration values via command-line options. Neither configuration files nor ENV variables are currently supported. One of the issues directly following from this: the DB credentials are passed as command-line parameters in clear text and can be seen in the process list.

There are several things to keep in mind:

• Unless the default database scheme and credentials are used, they must be provided as storage driver parameters as well. The database scheme must be created prior to storing collected metrics;


• cAdvisor does not store collected metrics for more than 120 seconds by default. Therefore, if the database connection is interrupted, the resource metrics are lost. Depending on your specific environment setup and requirements, it may be a good idea to review and adjust the default buffering and flushing settings;
• A more or less obvious observation: the more containers are running on the host, the more resources cAdvisor will consume, and the more traffic will flow between the cAdvisor instance and the storage backend. Consequently:
  o It's a good idea to limit cAdvisor resource usage to avoid impacting production workloads. On the other hand, pulling the belt too tight may have adverse effects on the metrics collection itself. The constraints provided in the example above are for demonstration purposes only and must be adjusted for the specific setup and environment;
  o For busy hosts with high container density it's recommended to adjust the cAdvisor buffering, caching and flushing parameters for the best performance. For example, cAdvisor collects metrics over a 1-minute time frame and flushes them in a single transaction. In certain scenarios, increasing this time frame may improve performance without impacting monitoring granularity (see the sketch after this list);
• cAdvisor requires elevated permissions (--userns=host), since it accesses some objects in the Docker host namespace;
• The cAdvisor project does not enforce security by default, which leaves us with three possible options for running this service. All these options have been explored during the POC project and represent different trade-offs between security and complexity:
  o Insecure: using default credentials for the storage driver. No additional options required;
  o Kind-of-secure: providing storage driver credentials as command-line parameters, so they will show up in the process list;
  o Secure: creating a custom build and image for cAdvisor that handles and passes credentials securely.
• It's unlikely that the cAdvisor Web UI itself is going to be used for production deployment monitoring, therefore it's recommended to avoid publishing the cAdvisor Web UI ports;
• cAdvisor, being a part of the Kubernetes project, is evolving quickly, and new versions appear quite often. Although the common practice is to use the "latest" image version, it's recommended to standardize on and run a specific cAdvisor version across all deployments for consistent and predictable behavior and results.
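A hedged sketch of the buffering and retention tuning mentioned above, appended to the full run command shown earlier (elided here); the values are illustrative, not recommendations:

#  -storage_duration: how long samples are kept in memory (default 2m)
#  -storage_driver_buffer_duration: how long writes are buffered before one flush (default 1m)
#  -housekeeping_interval: per-container stats sampling interval
$ docker run --name=cadvisor ... google/cadvisor:v0.24.0 \
     -storage_duration=5m \
     -storage_driver_buffer_duration=2m \
     -housekeeping_interval=10s \
     -storage_driver=influxdb ...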

 

Stats Database

All metrics gathered by the Stats Collector service are passed to and persisted by the Stats Database service. This service is implemented as a Docker container located on the utility host in the foundation farm and running the InfluxDB time-series database: https://github.com/influxdata/influxdb.

Depending on specific requirements, different storage back-ends may be used in place of InfluxDB. The choice has been made in favor of InfluxDB for the following reasons:

• Simple and self-contained database without external dependencies;
• Purpose-made database for time-series metric storage and querying;
• Supported by and integrated into many modern deployment stacks and platforms;
• Provides several storage engines geared towards real-time data processing;
• REST API driven, for management, data ingestion and data processing alike;
• Supports the SQL-like InfluxQL language for querying the database;
• Provides flexible controls and data retention policies;
• Scalable and supports clustering.


The Stats Database service indirectly depends on the Image Registry service, since its image is pulled from the registry by the Docker Engine during the service container startup. Other than that, assuming a standalone (non-clustered) deployment, the Stats Database service is self-sufficient and is used by other services and components such as:

• The Stats Visualization portal – queries the Stats Database for visualized resource metrics;
• The Reporting service – queries the Stats Database for compiling various usage reports;
• The Stats Collector – periodically stores measurements in the Stats Database.

InfluxDB also provides a web console for basic management and querying operations.

   

Figure 6 - InfluxDB Web Console

Here is an example of running the InfluxDB container:

$ docker run --name=influxdb --detach=true --restart=always \
   --cpu-shares 512 --memory 1G --memory-swap 1G \
   --volume=${VOL_DATA}/influxdb:/influxdb --publish 8083:8083 --publish 8086:8086 \
   --expose 8090 --expose 8099 \
   --env ADMIN_USER="root" --env PRE_CREATE_DB=cadvisor \
   ${REGISTRY}/influxdb

In some cases there may be a need for separate user accounts with varying access levels. A user with write permissions may be used for storing stats in the DB, and a read-only user may be used for reporting and monitoring activities. Let's create users with read and write permissions:

$ cat <<"EOT" | docker exec -i influxdb /usr/bin/influx -username=root -password=root -path -
CREATE DATABASE cadvisor
CREATE USER writer WITH PASSWORD '<writer password>'
CREATE USER reader WITH PASSWORD '<reader password>'
GRANT WRITE ON cadvisor TO writer
GRANT READ ON cadvisor TO reader
EOT


Now we will list the available databases using the InfluxDB client:

$ echo "show databases" | docker exec -i influxdb /usr/bin/influx -username=root -password=root -path -
Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
Connected to http://localhost:8086 version 0.10.3
InfluxDB shell 0.10.3
> name: databases
---------------
name
cadvisor
_internal
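The same data can also be queried over the InfluxDB HTTP API. A hedged example follows; the measurement name memory_usage is assumed from the cAdvisor InfluxDB storage driver:

# query the last hour of memory usage as the read-only user
$ curl -G 'http://localhost:8086/query' -u reader:'<reader password>' \
   --data-urlencode "db=cadvisor" \
   --data-urlencode "q=SELECT mean(value) FROM memory_usage WHERE time > now() - 1h GROUP BY time(5m)"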

Things to keep in mind:
• For the sake of simplicity InfluxDB is deployed as a standalone instance and is therefore not resilient to service failures, resulting in data loss until the service is recovered. It's recommended to deploy an InfluxDB cluster for production deployments;
• The database size on disk will depend on the retention policies and the amount of metrics collected over time. The policies and retention rules will need to be adjusted for production use on a case-by-case basis (see the example after this list);
• The service (container) memory consumption will depend on the configured storage engine, the amount of metrics collected and the configuration settings. Those settings will need to be adjusted for production use, keeping resource constraints in mind;
• InfluxDB provides multiple interfaces for monitoring and data querying, including a database client application, client libraries for the most popular languages, as well as a REST API endpoint;
• This project uses a custom-built image for InfluxDB to automate and simplify basic setup and management tasks. It may behave differently compared to the default image provided by the vendor.

 

Image Registry

All container images used by the POC project are stored in the local image repository provided by the Image Registry service. This service is implemented as a Docker container located on the utility host in the foundation farm and running the Docker Distribution application: https://github.com/docker/distribution.

Whenever a new container image is built, it is stored in the Image Registry. Whenever a new container is created, its image is pulled from this repository.

More details and examples can be found in the Docker Distribution project documentation at https://github.com/docker/distribution/blob/master/docs/deploying.md.

Being one of the base services, the Image Registry is self-contained and does not depend on other Platform services. At the same time, the Image Registry is not used directly by Platform services. Usually it is used indirectly, when the Docker Engine cannot find a required image in the local image storage on a particular host. In that case the image is queried, validated and pulled from the Image Registry.

Here is an example of setting up the Image Registry service. First of all, we'll set up certificates. The SSL keys need to be generated only once, but have to be deployed on every Docker host:


# executed only once: generating self-signed registry certificate, CN=registry.poc
$ mkdir -p ~/certs
$ openssl req -newkey rsa:4096 -nodes -sha256 -x509 -days 365 \
   -subj "/C=DE/ST=HE/L=Frankfurt/O=VZ/OU=MH/CN=registry.poc/[email protected]" \
   -keyout ~/certs/registry.key -out ~/certs/registry.crt

# executed on each Docker host:
# - deploying certificates to the Docker certificate store
$ mkdir -p /etc/docker/certs.d/registry.poc\:5000
$ cp certs/registry.crt /etc/docker/certs.d/registry.poc\:5000/ca.crt

# - restarting docker to activate certificates
$ systemctl restart docker.service

Next, we'll set up the host volumes and configuration for the Image Registry service container:

$ mkdir -p /var/data/registry/{certs,config,data}
$ [ -d ~/certs ] && cp ~/certs/* /var/data/registry/certs
$ cat <<EOT > /var/data/registry/config/config.yml
version: 0.1
log:
  level: info
  formatter: text
  fields:
    service: registry
    environment: production
storage:
  cache:
    layerinfo: inmemory
  filesystem:
    rootdirectory: /var/lib/registry
http:
  addr: :5000
  tls:
    certificate: /certs/registry.crt
    key: /certs/registry.key
  debug:
    addr: :5001
EOT

Finally, we'll start the registry service and validate that it can be accessed over HTTPS:

# starting the Docker container with the registry service
$ docker run --name registry --hostname registry.poc --detach=true --restart=always \
   --env REGISTRY_HTTP_TLS_CERTIFICATE=/certs/registry.crt \
   --env REGISTRY_HTTP_TLS_KEY=/certs/registry.key \
   --volume /var/data/registry/certs:/certs:ro \
   --volume /var/data/registry/data:/var/lib/registry:rw \
   --volume /var/data/registry/config:/etc/docker/registry:ro \
   --publish 5000:5000 \
   registry:2.5

# verifying the registry is working; registry.poc should resolve to the IP owned by the registry service
$ docker tag busybox registry.poc:5000/poc/busybox:v1
$ docker push registry.poc:5000/poc/busybox:v1
$ curl --cacert ~/certs/registry.crt -X GET https://registry.poc:5000/v2/poc/busybox/tags/list
{"name":"poc/busybox","tags":["v1"]}


Things to keep in mind:
• Most container images are stored in the locally hosted Image Registry; however, some images are pulled from outside repositories to avoid circular dependencies during service startup:
  o The Docker Distribution container image is provided by Docker and pulled from the external registry https://hub.docker.com/r/distribution/registry/
  o The Google cAdvisor container image is provided by Google and pulled from the external registry https://hub.docker.com/r/google/cadvisor/
  o The GitLab container image is provided by the GitLab community and pulled from the external registry https://hub.docker.com/r/gitlab/gitlab-ce/
• For the sake of simplicity the Image Registry service is deployed as a standalone instance and is therefore not resilient to service failures. An HA deployment is recommended for production use;
• The current implementation does not use any authentication or authorization mechanisms, thus allowing any user to access container images. Although this service is only used inside the internal secure perimeter, it's recommended to implement RBAC policies, or at least a strong authentication mechanism, for production deployments (see the sketch after this list);
• Due to security considerations all traffic is encrypted and service access is only possible using the HTTPS protocol as a transport. Depending on security requirements, there may be a need to create and sign the service SSL keys using a trusted CA. The current implementation uses a self-signed CA and keys. For this to work, those self-signed keys must be added to the Docker certificate store on every Docker host that communicates with the Image Registry service;
• Obviously, there is a trade-off with known pros and cons when implementing a local registry compared to an externally hosted container registry. For this project it's been decided to use a local registry; however, nothing prevents using an external Image Registry service, assuming that the service integration has been performed and that service availability, security and access issues have been addressed.
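As a minimal sketch of adding at least basic authentication, following the Docker Distribution documentation (the user name and file locations are placeholders):

# create an htpasswd file with a single user (the registry image ships the htpasswd tool)
$ docker run --rm --entrypoint htpasswd registry:2.5 -Bbn admin '<admin password>' > /var/data/registry/certs/htpasswd

# start the registry with the auth options added on top of the TLS settings shown above
$ docker run --name registry ... \
   --env REGISTRY_AUTH=htpasswd \
   --env REGISTRY_AUTH_HTPASSWD_REALM="Registry Realm" \
   --env REGISTRY_AUTH_HTPASSWD_PATH=/certs/htpasswd \
   ...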

 

Image Builder

This service is implemented as a Platform management tool. Currently, new image builds have to be triggered manually after the Docker files have been modified; however, nothing speaks against automating this step and triggering an image build upon a certain event, for example a container image code or configuration change.

Figure 7 - Image Builder UI

 


There are no services depending on the Image Builder. The Image Builder itself directly depends on the SCM service, and indirectly on the Image Registry, where freshly built images are pushed to. Obviously, some secrets such as keys and credentials must be used during the container image build stage. There is a nice write-up providing a good summary of the available solutions and options; see http://elasticcompute.io/2016/01/22/build-time-secrets-with-docker-containers/.

Currently, container images can be built in two modes:

• Build: the container image is built from scratch and properly tagged;
• Release: after performing the image build, the image undergoes tests and, if successful, is pushed to the image repository, thus becoming available for deployment. In essence this boils down to a build-test-push sequence, sketched below.
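A hedged sketch of what the Release mode amounts to; the image name, tag and test hook are placeholders, not the actual Builder internals:

# build and tag the image from its Docker file
$ docker build --tag registry.poc:5000/poc/nginx-php-fpm:1.2.3 /opt/build/nginx-php-fpm
# run image tests (hypothetical hook; see the note below about missing tests)
$ ./run-image-tests.sh registry.poc:5000/poc/nginx-php-fpm:1.2.3
# on success, publish the image for deployment
$ docker push registry.poc:5000/poc/nginx-php-fpm:1.2.3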

Things to keep in mind:
• Although the container build workflow does include a step for executing tests, currently there are no actual tests provided. Special care should be taken, and container images must be tested manually prior to deploying and using them;
• Sometimes, when memory becomes scarce (e.g. multiple SonarQube analyses running), the image rebuild process may fail with error messages indicating a lack of memory. This points to memory leaks in Docker and will hopefully be fixed in upcoming releases. It should not occur in environments with sufficient memory allocation;
• The Docker files for the images have been written with image caching in mind, therefore frequent image rebuilds should not create significant load. At the same time, image caching may become a source of hard-to-track issues, so administrators may need to pay special attention to the local image store and cached images on the systems where builds are performed.

 

Deployment Service

By using the Deployment service we can ensure that all projects follow naming, security, configuration and deployment standards and conventions. They can be easily identified, managed and recreated in a standard and repeatable way. See the Drupal Website Deployment chapter for additional details and examples.

All project deployment tasks are handled by this service, namely:

• Checking requested parameters against naming standards;
• Choosing the target location based on user inputs or defaults;
• Validating that the target location is ready for deployment;
• Cloning the requested project version from the code repository;
• Cloning required add-on projects from the code repository;
• Deploying code to the target location;
• Running configuration instructions and setup procedures.

The Deployment service is completely decoupled from containers or other infrastructure semantics. From a high-level perspective, the relationship between the related components can be described as follows:

• The Container Provisioning Service deploys well-defined, pre-configured containers;
• Containers encapsulate applications and are immutable or read-only. All volatile and mutable objects such as content, log files, temporary files, etc. are persisted on volumes or using other persistence mechanisms such as Database Storage;
• The Deployment Service populates host volumes with application objects such as code, configuration, content, etc. Those host volumes are mapped to container volumes and thus become available to the execution runtime inside the corresponding containers.


The Deployment service is used by Deployment workflows via the corresponding Platform CLI calls. The service itself has several dependencies:

• Secure Storage – used to query various credentials and sensitive information;
• SCM Service – used to clone requested projects and their dependencies;
• Persistent Volumes – used as deployment targets to store project-related objects;
• Persistent Database Storage – may be used indirectly by project setup scripts, for example for creating the database schema for the project or populating required database objects.

Things to keep in mind:

• The Deployment service does not make orchestration decisions and therefore must be provided with the target location specification by the upstream caller. This is done on purpose, to keep orchestration logic and mechanisms separate from deployment semantics;
• The Deployment service is part of the Platform CLI component and as such uses platform configuration, settings and naming standards;
• Since provisioning tasks may involve multiple hosts or be invoked remotely, password-less (key-based) SSH access must be configured between the master and slave nodes;
• The Deployment service does just that – it deploys projects to target locations according to well-defined rules and naming standards. It neither cares nor makes assumptions about the applications, custom code or content used by applications deployed inside containers, as long as projects follow the defined project structure.

 

Container Provisioning Service

All container provisioning and de-provisioning operations are handled by this service, which translates requested actions into the corresponding Docker commands and API calls. It is still possible to create arbitrary containers using the Docker client or APIs; however, for the sake of consistency this approach is discouraged.

This can best be explained by the following example. Let's provision a new web container using the Docker CLI:

 $ docker run --name d7-demo --hostname wbs1 --detach=true --restart=on-failure:5 \
   --security-opt no-new-privileges --cpu-shares 16 --memory 64m --memory-swap 1G \
   --publish 10.169.64.232:8080:80 --publish 10.169.64.232:8443:443 \
   --volume /var/web/stg/root/d7-demo:/var/www --volume /var/web/stg/data/d7-demo:/var/data \
   --volume /var/web/stg/logs/d7-demo:/var/log --volume /var/web/stg/temp/d7-demo:/var/tmp \
   --volume /var/web/stg/cert/d7-demo:/etc/ssl/web \
   --tmpfs /run:rw,nosuid,exec,nodev,mode=755 \
   --tmpfs /tmp:rw,nosuid,noexec,nodev,mode=755 \
   --env-file /opt/deploy/container.env \
   --label container.env=stg --label container.size=small \
   --label container.site=d7-demo --label container.type=web \
   registry.poc:5000/poc/nginx-php-fpm

You may have noticed that there are a number of additional options and parameters required by the platform itself, its services and its naming standards. Although the Container Provisioning Service makes exactly this same call to the Docker engine, there is a lot more happening, hidden under the hood.


Now, let's provision the same web container using the Container Provisioning Service. In addition to creating the Docker container, it performs the following essential steps:

• Checking the container name against naming standards;
• Checking that no container with this name is already present;
• Validating the IP address:
  o Checking whether the provided IP belongs to the address pool and whether this IP is not already taken by another container;
  o If no IP address is provided, automatically selecting the next free IP from the pool;
• Checking whether the container host volumes are present, and creating them otherwise;
• Adding container labels specifying the web site, its environment, size and container type;
• Adding resource constraints and security-related options;
• Using the given image, or the default one if no container image is specified, for creating the new container.

 

 $ /opt/deploy/web container create --farm poc --env stg --site d7-demo --image nginx-php-fpm
 web container create: using next free IP: 10.169.64.232
 web container create: checking 10.169.64.232 is setup
     inet 10.169.64.232/26 brd 10.169.64.255 scope global secondary enp0s17:
 web container create: folder /var/web/stg/root/d7-demo not found, creating
 web container create: folder /var/web/stg/data/d7-demo not found, creating
 web container create: folder /var/web/stg/logs/d7-demo not found, creating
 web container create: folder /var/web/stg/cert/d7-demo not found, creating
 web container create: folder /var/web/stg/temp/d7-demo not found, creating
 web container create: exporting container ENV variables from /opt/deploy/container.env
 web container create: creating container d7-demo
 web container create: |-- image-tag: registry.poc:5000/poc/nginx-php-fpm
 web container create: |-- resources: small (--cpu-shares 16 --memory 64m --memory-swap 1G)
 web container create: |-- published: 10.169.64.232:8080:80
 web container create: |-- published: 10.169.64.232:8443:443
 web container create: |-- volume: /var/web/stg/cert/d7-demo:/etc/apache2/ssl
 web container create: |-- volume: /var/web/stg/logs/d7-demo:/var/log
 web container create: |-- volume: /var/web/stg/root/d7-demo:/var/www
 web container create: |-- volume: /var/web/stg/data/d7-demo:/var/data
 web container create: |-- volume: /var/web/stg/temp/d7-demo:/var/tmp
 web container create: |-- volume: tmpfs:/run
 web container create: |-- volume: tmpfs:/tmp
 web container create: |-- label: container.env=stg
 web container create: |-- label: container.size=small
 web container create: |-- label: container.site=d7-demo
 web container create: \__ label: container.type=web
 web container create: started site container cb68618b84b4d3276a77ebd4a0635c5387a8319f1ffaac3759c74820fa32b258

By using the Container Provisioning service we can ensure that all containers follow naming, security, configuration and resource allocation standards. They can be easily identified, managed and recreated in a standard and repeatable way.

 $ /opt/deploy/web container list --farm poc --env stg --format table
 web container list:
 CONTAINER ID    NAMES      STATUS           ENV   SIZE    PORTS
 cb68618b84b4    d7-demo    Up 16 minutes    stg   small   10.1.1.2:8080->80/tcp, 10.1.1.2:8443->443/tcp
 c953adf92e09    d7         Up 3 weeks       stg   small   10.1.1.2:8080->80/tcp, 10.1.1.2:8443->443/tcp


The Container Provisioning service is used by Deployment workflows via the corresponding Platform CLI calls. The service itself has no specific dependencies and uses the Docker CLI for performing container management operations.

Things to keep in mind:

• The Container Provisioning service does not make orchestration decisions and therefore must be provided with the target location specification by the upstream caller. This is done on purpose, to keep orchestration logic and mechanisms separate from deployment semantics;
• The Container Provisioning service is part of the Platform CLI component and as such uses platform configuration, settings and naming standards;
• Since provisioning tasks may involve multiple hosts or be invoked remotely, password-less (key-based) SSH access must be configured between the master and slave nodes;
• The Container Provisioning service does just that – it provisions properly configured containers. It neither considers nor makes assumptions about the applications, custom code or content used by applications deployed inside containers;
• The Container Provisioning service is the only component that has to be adjusted if a different mechanism or API is to be used for provisioning containers, for example CoreOS rkt or LXD;
• When using orchestration engines such as Kubernetes, the Container Provisioning service can implement a wrapper around the provisioning functionality they provide.

 

Reporting Service

The Reporting service is implemented as a Docker container that runs queries against the Stats Database and compiles reports on aggregated resource usage according to the specified conditions and parameters. There are no services depending on the Reporting service. The Reporting service itself depends on the Stats Database for fetching report data.

Persistent Volumes

One of the platform design paradigms is to keep containers immutable, or read-only; all volatile and modified data should be stored outside of the container, on so-called container volumes. Since we want this data to survive between container runs, these volumes must be persistent. There is another benefit to keeping application data and content outside of the container: it allows achieving the best application performance. Since there is no COW (copy-on-write) indirection layer in between, all I/O operations are handled efficiently by the Linux kernel. The resulting host-side layout is sketched below.
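For reference, this is what the host-volume convention used throughout this chapter looks like for a single site; the listing simply reflects the directories created by the container provisioning run shown earlier:

 $ ls -d /var/web/stg/*/d7-demo
 /var/web/stg/cert/d7-demo  /var/web/stg/data/d7-demo  /var/web/stg/logs/d7-demo
 /var/web/stg/root/d7-demo  /var/web/stg/temp/d7-demo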

Things to keep in mind:

• The current platform design makes no assumptions about the underlying technology and orchestration layer. For the sake of simplicity, container host volumes are used as the persistent volume implementation;
• There are other options to be explored for mapping container volumes to corresponding SAN volumes, NAS volumes or iSCSI targets. This would allow containers to take their volumes along with them if restarted on a different Docker host, thus making containers "mobile" and allowing container migrations across the available hosts. These options were not explored during this project; however, using them may be essential when running containers on platforms like Kubernetes.

 


Volume Sync-Share Service

Horizontal scaling and high availability requirements demand that an application span multiple application instances, or containers for that matter. Although session state is kept outside of containers, the static content still has to be shared between multiple application instances.

Generally speaking, there are two possible ways of resolving this issue: share the file-system, or synchronize the file-systems. Each solution has its own strong and weak sides. Both options have been explored and considered viable. The choice is really dictated by the specific infrastructure, performance and support requirements. The following comparison should help in selecting the most appropriate option for a specific deployment scenario:

• Implementation approach:
  o Shared: centralized storage holding a single file-system, with many nodes performing access;
  o Synchronized: share-nothing architecture; many nodes with multi-master replication between file-systems.
• Storage space requirements:
  o Shared: Volume-Size;
  o Synchronized: Volume-Size x N (# of nodes).
• Storage throughput:
  o Shared: all nodes share the server network link and are capped by its throughput; one node may saturate the link and degrade performance for the others. Limited by single-volume IOPs, which quickly degrade with the number of nodes;
  o Synchronized: throughput and IOPs scale linearly with the number of nodes.
• File-system locking:
  o Shared: file-system locks are maintained to allow concurrent access by multiple nodes to a single object. This can lead to stalled I/O operations and, as a result, to unresponsive applications;
  o Synchronized: no file-system locks required.
• Change propagation:
  o Shared: instant;
  o Synchronized: a little latency.
• Implementation complexity:
  o Shared: low;
  o Synchronized: moderate.
• Support complexity:
  o Shared: moderate;
  o Synchronized: low.
• Known limitations:
  o Shared: SendFile kernel support and mmap must be disabled on shared volumes (see the sketch after this comparison). Orphaned file-system locks may need to be identified and cleaned up manually. A storage volume restart may have unpredictable effects on clients; they may need to re-mount the storage. File-system caching may produce inconsistent results across clients;
  o Synchronized: large file-system changes may take some time to propagate to all clients. In rare cases a file may be modified in several locations, producing a conflict that has to be resolved either automatically or manually.
• Specific application:
  o Shared: NFS 4.x server and clients;
  o Synchronized: SyncThing + inotify.
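As a concrete illustration of the SendFile/mmap limitation above: if the shared option is chosen and web containers serve content from an NFS-backed volume, the web server must be told not to use sendfile() or mmap() on those paths. A minimal sketch for an Apache-based image follows; the config path and the a2enconf helper assume a Debian-style Apache layout, and for nginx the equivalent directive would be "sendfile off":

 $ cat <<EOT >>/etc/apache2/conf-available/nfs.conf
 # sendfile() and mmap() are unsafe on NFS-backed document roots
 EnableSendfile Off
 EnableMMAP Off
 EOT
 $ a2enconf nfs && service apache2 reload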


Given the overview above, one may still wonder which route to choose and whether there is a simple rule of thumb for selecting the most appropriate option. Here we go:

• Implement NFS:
  o If you have a storage array capable of serving files using the NFS 4.x protocol;
  o If your applications don't require high storage throughput and concurrency;
  o If you can tolerate the noisy-neighbors effect at times;
  o If the storage volume size (and/or its cost) is significant;
  o If you already have the expertise in house;
  o If other parts of your solution use NFS;
• Implement SyncThing:
  o If you don't have a fault-tolerant NFS server and can't afford one for whatever reason;
  o If your applications require the highest storage throughput and need to scale as they grow;
  o If you absolutely can't tolerate the noisy-neighbors effect or NFS server downtime;
  o If you can tolerate the little latency required to propagate changes;
  o If the storage volume size is small enough to keep a redundant copy on every client.

Below is an example of how to start the volume sync service:

 $ docker run --name datasync --hostname `hostname` --detach=true --restart=always \
   --cpu-shares 100 --memory 100m \
   --publish 22000:22000 --publish 21027:21027/udp --publish 8384:8384 \
   --volume /var/deploy/prd/data/:/var/sync --volume /var/data/datasync:/etc/syncthing \
   --tmpfs /run:rw,nosuid,nodev,mode=755 --tmpfs /tmp:rw,nosuid,nodev,mode=755 \
   registry.poc:5000/poc/syncthing

This service has to be started on all Docker host nodes that have data volumes which must be kept in sync. After starting, these services have to be introduced to each other, i.e. perform a handshake, and mutual changes have to be allowed between them. This is a one-time configuration; a sketch of the introduction step is shown below.

All file-system changes are tracked via an inotify subscription, and updated files are exchanged between nodes using an efficient block exchange protocol similar to BitTorrent. Thus, the change propagation speed grows with the number of nodes participating in the exchange.
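The introduction amounts to exchanging device IDs and approving them on both sides, either through the Web UI on port 8384 or via SyncThing's REST API. The sketch below only shows reading a node's device ID; the API key comes from that instance's configuration, and the host name is illustrative:

 $ curl -s -H "X-API-Key: ${API_KEY}" http://wbp1:8384/rest/system/status | grep myID

The returned myID is then added as a remote device on the peer nodes and the shared folder is offered to it, after which synchronization starts automatically.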

Things to keep in mind:

• SyncThing is a relatively young, actively developing application. There may be side effects that have not been studied yet;
• The SyncThing configuration can be generated from a template and saved to the configuration file. It can also be adjusted using the APIs and the Web UI. Access to the API and Web UI must be appropriately secured;
• The SyncThing protocol ensures quick delta updates and high performance. During the tests, a sync speed of ~100+ MB/s has been measured;
• Although SyncThing can perform dynamic service and network discovery, a static configuration has been used for this project.

 


Persistent Database Storage

Similar to file-system volumes, persistent database storage is used by applications for persisting structured data. The current design assumes an RDBMS type of database, or a MySQL database flavor to be more specific, which is a very common DB choice for lightweight DB-driven web applications.

Due to the high-availability requirement, the production DB instance is hosted on the MySQL cluster, whereas other environments such as staging or development may use instances deployed on a standalone MySQL host.

The MySQL database has multiple flavors and distributions, each having its own strong and weak sides:

• MariaDB – Enterprise Cluster subscription:
  o Feature-wise inferior to the other options, although it's quickly catching up;
  o Uses Galera multi-master replication;
  o Provides the MariaDB database proxy MaxScale;
  o Enterprise support and consulting are available and reasonably priced;
• Oracle MySQL – vendor-supported edition:
  o Good performance and features;
  o The CGE clustering option is similar to Always-On MSSQL and provides carrier-grade data availability and protection;
  o Enterprise support is available but is very expensive;
• Percona XtraDB – free, with vendor support available on demand:
  o Outstanding features and performance;
  o Performance Schema extensions and a number of analysis and tuning tools;
  o XtraDB Cluster uses Galera multi-master replication;
  o Enterprise support and consulting are available and reasonably priced.

For this project, Percona XtraDB has been selected for its outstanding capabilities and relatively low support cost. However, there are a number of things to keep in mind when implementing this approach in production:

• Unlike the Oracle and MariaDB implementations, Percona XtraDB does not provide a packaged solution for cluster load-balancing; however, there are a number of community papers and vendor articles on using HAproxy or hardware load-balancers to implement this function (see the sketch after this list);
• With either MySQL server option, enforcing DB quotas and even calculating DB sizes may be a complicated task. There are some workarounds and solutions but, generally speaking, this is an InnoDB storage engine limitation. Still, it has to be considered in a hosting environment where a single DB server (or cluster) is shared by multiple DB instances belonging to different clients or projects;
• With either MySQL server option, DB server sizing is a very complex exercise requiring thorough knowledge of MySQL server innards and a significant amount of stress and capacity testing to find a rough formula that ties infrastructure measurements to application measurements such as TPS (transactions per second) or QPS (queries per second). There is no universal formula, and such sizing must be performed for every application type;
• The best way to perform reliable application sizing is to execute a number of load and stress tests to measure various performance KPIs, baseline the DB server capacity in terms of TPS/QPS, and map those estimates to specific hardware profiles.
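A minimal sketch of such an HAproxy front-end for the cluster follows; the bind address, the node addresses and the dedicated password-less check user are illustrative assumptions, not part of the platform configuration:

 $ cat <<EOT >>/etc/haproxy/haproxy.cfg
 listen mysql-cluster
     bind 10.169.64.200:3306
     mode tcp
     balance leastconn
     option mysql-check user haproxy_check
     server node1 10.169.64.201:3306 check
     server node2 10.169.64.202:3306 check
     server node3 10.169.64.203:3306 check backup
 EOT

 # the check user must exist on the cluster (no password and no privileges needed)
 $ mysql -e "CREATE USER 'haproxy_check'@'%';"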

 


Database Driver

Choosing the database distribution is very important. However, there is another, no less important and often overlooked subject: using the proper database API driver for your runtime and applications. For the sake of simplicity we will mostly concentrate on PHP specifics.

PHP offers three different APIs to connect to MySQL:

• ext/mysql – the old MySQL extension that is used by default. It has been around since PHP 2.0 and will be discontinued soon. It lacks support for most modern MySQL features;
• PDO_MySQL – implements the PHP Data Objects interface to standardize access from PHP applications to MySQL 3.x-5.x databases. Provides good feature coverage;
• ext/mysqli – the improved MySQL extension, which allows access to the functions provided by MySQL 4.1 and above. Most notably, it supports:
  o Non-blocking, asynchronous queries;
  o Server-side prepared statements;
  o Stored procedures;
  o Multiple statements;
  o All MySQL 5.1+ functionality.

Additionally, it's recommended to employ the MySQL Native Driver instead of the MySQL Client Libraries used by default. This driver provides a number of advantages:

• The MySQL Native Driver uses the PHP memory management system. Its memory usage can be tracked with the memory_get_usage() call. This is not possible with libmysqlclient, because it uses the C function malloc() instead;
• The MySQL Native Driver also provides some special features not available when the MySQL database extensions use the MySQL Client Library:
  o Improved persistent connections;
  o The special function mysqli_fetch_all();
  o Performance and caching related statistics calls: mysqli_get_cache_stats(), mysqli_get_client_stats(), mysqli_get_connection_stats(). The performance statistics facility can prove very useful in identifying performance bottlenecks;
  o SSL support: the MySQL Native Driver has supported SSL since PHP version 5.3.3;
  o Compressed protocol support: as of PHP 5.3.2 the MySQL Native Driver supports the compressed client-server protocol.

For additional details and features of the various PHP APIs and MySQL drivers, see http://dev.mysql.com/doc/apis-php/en/apis-php-introduction.html. A quick way to verify the driver inside an image is sketched below.
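A minimal sketch for deploying and verifying the native driver on a yum-based image; the package and file names assume a RHEL/CentOS-family distribution:

 # install mysqli backed by the native driver
 $ yum install php-mysqlnd

 # confirm mysqlnd is compiled in and backs mysqli
 $ php -i | grep -i mysqlnd
 $ php -r 'var_dump(function_exists("mysqli_get_client_stats"));'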

Things to keep in mind:

• All container images have been tested using the MySQL Native Driver and the mysqli API extension;
• MySQL Native Driver packages are provided for most Linux distributions and can be deployed using standard package management mechanisms;
• The MySQL Native Driver must be explicitly enabled and configured in the PHP configuration pertinent to your runtime;
• Obviously, no custom drivers and configurations can safeguard against bad code and wrong coding patterns. For example, a high number of opened and not properly closed persistent database connections can quickly exhaust the MySQL server connection pool and thus impact other applications attempting to connect to the same database server.

 


Percona XtraDB Cluster Limitations

Below is a list of specific limitations that must be considered and taken care of when designing a solution based on the Percona XtraDB Cluster product:

• Currently, replication works only with the InnoDB storage engine;
• Unsupported queries: LOCK/UNLOCK TABLES and the LOCK functions (GET_LOCK(), RELEASE_LOCK(), etc.) are not supported;
• The maximum allowed transaction size is defined by the wsrep_max_ws_rows and wsrep_max_ws_size variables; for more documentation see https://www.percona.com/doc/percona-xtradb-cluster/5.6/wsrep-system-index.html#wsrep_max_ws_rows (a quick way to inspect these limits is shown after this list);
• The minimal recommended cluster size is 3 nodes. The 3rd node can be an arbitrator; however, a full node can be beneficial with regard to availability and the performance of the individual node during a rebuild;
• XA transactions are not supported, due to the possible rollback on commit;
• When running Percona XtraDB Cluster in cluster mode, avoid ALTER TABLE ... IMPORT/EXPORT workloads. They can lead to node inconsistency if not executed in sync on all nodes;
• The write throughput of the whole cluster is limited by the weakest node: if one node becomes slow, the whole cluster is slow. If you have requirements for stable high performance, it should be backed by the corresponding hardware;
• Due to cluster-level optimistic concurrency control, a transaction issuing COMMIT may still be aborted at that stage. There can be two transactions writing to the same rows and committing on separate Percona XtraDB Cluster nodes, and only one of them can commit successfully; the failing one will be aborted.
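The transaction-size limits and the current cluster state can be inspected on any node with standard MySQL statements, for example:

 $ mysql -e "SHOW GLOBAL VARIABLES LIKE 'wsrep_max_ws_%'"
 $ mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'"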

Other database flavors, MariaDB and Oracle CGE in particular, have their own limitations too. We won't be doing a thorough analysis of the different DB options here; it is beyond the scope of this paper and a subject for a separate research project.

Secure Storage

There is a need to store and pass credentials in a secure and simple manner between the various platform components and hosted applications. Multiple approaches are possible here, ranging from using temporary files and environment variables to implementing encrypted volumes visible to certain containers.

There is a nice write-up about handling runtime secrets with Docker containers that approaches the subject from a more general perspective and explores various options and mechanisms; see http://elasticcompute.io/2016/01/21/runtime-secrets-with-docker-containers/. Build-time secrets such as credentials, SSH keys and certificates are equally important, so you may want to check the http://elasticcompute.io/2016/01/22/build-time-secrets-with-docker-containers/ article as well.

After performing a deep analysis, which is outside the scope of this document, and comparing various options, it's been decided to use a dedicated service to encapsulate the protected storage functionality.

The Secure Storage service is implemented as a Docker container located on the utility host in the foundation farm and running HashiCorp Vault (https://www.vaultproject.io). All data in the vault is encrypted and not accessible to an external user without the access key. When the vault is locked, no access is possible until it has been unlocked by an admin using several unlock keys.


Here is an example of how to set up and run the vault container. Since all communication with the vault is only possible over an encrypted transport, first of all we'll create an SSL certificate:

 $ mkdir -p ${VOL_DATA}/vault/ssl
 $ openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
   -subj "/C=DE/ST=HE/L=Frankfurt/O=VZ/OU=MH/CN=vault.poc.local/[email protected]" \
   -keyout ${VOL_DATA}/vault/ssl/vault.key -out ${VOL_DATA}/vault/ssl/vault.crt

Next, we'll create the configuration:

 $ cat <<EOT >/var/data/vault/config.hcl
 backend "file" {
   path = "/vault/data"
 }

 listener "tcp" {
   address = "0.0.0.0:8200"
   tls_disable = 0
   tls_key_file = "/vault/ssl/vault.key"
   tls_cert_file = "/vault/ssl/vault.crt"
 }
 EOT

Finally, we'll start the container:

 $ docker run --name vault --detach=true --cap-add IPC_LOCK --publish ${VAULT_HOST}:8200:8200 \
   --env VAULT_ADDR=https://127.0.0.1:8200 --env VAULT_SKIP_VERIFY=1 \
   --volume /var/data/vault/config.hcl:/vault/config.hcl \
   --volume /var/data/vault/data:/vault/data \
   --volume /var/data/vault/ssl:/vault/ssl \
   registry.poc:5000/poc/vault server -config /vault/config.hcl

WARNING: right after start, the Vault storage is sealed and must be unsealed prior to first use. The following command must be executed 3 times, and 3 out of the 5 vault keys must be provided. These vault keys are generated when a new Vault is initialized; they must be stored elsewhere, in a secure location.

 $ docker exec -it vault vault unseal

Now that the vault container is running and unsealed, we can query, read and store credentials and other variables as key-value tuples:

 $ /opt/deploy/vault dir /secret/poc/dev/d7-vzbase/
 SITE_DB_USER
 SITE_DB_PASS

 $ /opt/deploy/vault get /secret/poc/dev/d7-vzbase/SITE_DB_PASS
 a$zrHS+cv}9DH.QR
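Storing a new value works along the same lines. A minimal sketch using the stock vault CLI inside the container, assuming the generic "secret" backend shown above, a valid token in the container environment, and an illustrative value:

 $ docker exec -it vault vault write secret/poc/dev/d7-vzbase/SITE_DB_USER value=d7user   # value is illustrative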


Things to keep in mind:

• The configuration provided above is for demonstration purposes only. Vault provides a lot more capabilities, such as multiple storage back-ends, authentication and authorization mechanisms, access policies, etc. It's recommended to tailor the configuration for production use according to your security requirements;
• Although the Secure Storage service can be started in a completely automated manner, it will remain sealed until the required number of unseal keys is provided. This "unseal" operation can also be performed via the API, given that the code calling the API has access to unseal keys stored externally in a secure manner;
• The vault application attempts the mlock syscall to avoid swapping allocated memory pages to disk. To run this application in an unprivileged container, the IPC_LOCK capability must be granted; otherwise you may add the disable_mlock=true statement to the configuration in order to disable this functionality;
• Due to security considerations, all traffic is encrypted and service access is only possible using the HTTPS protocol as a transport. Depending on security requirements, there may be a need to create and sign the service SSL keys using a trusted CA. The current implementation uses a self-signed CA and keys and allows untrusted certificates; for production use, consider using keys signed by a trusted CA;
• This project is using the "file" storage backend, which does not support clustering. Consider using a different storage backend to support clustering and deploy the service in a highly available manner.

 

Identity Management Service

The Identity Management, or IdM, service utilizes MS Active Directory for user authentication and authorization, as well as for storing service accounts. All access to this service is performed using the LDAPS protocol for additional transport security.

You can use openldap or any other LDAP client to query Active Directory objects. It's recommended to deploy the adtool application, which uses the openldap client libraries and provides a very convenient CLI for managing MS Active Directory objects.

First of all, let's build and configure the adtool application:

 # we need to install the compiler and openldap libraries
 $ yum install gcc openldap-devel

 # downloading and extracting the adtool sources
 $ curl -LOsS http://gp2x.org/adtool/adtool-1.3.3.tar.gz
 $ tar xvzf adtool-1.3.3.tar.gz

 # configuring the package, building and deploying binaries
 $ cd adtool-1.3.3
 $ ./configure --prefix=/opt/deploy --datarootdir=/tmp
 $ make
 $ make install

 # since we use the openldap client libraries, we need to hint openldap where to look for stuff
 $ cat <<EOT >>/etc/openldap/ldap.conf
 URI ldaps://poc.local
 BASE DC=POC,DC=LOCAL
 TLS_REQCERT allow
 EOT


Now that adtool is deployed, you can already access the Active Directory services; however, some commands will only work when using the LDAPS protocol, with LDAP SSL properly set up on the AD server side.

Below we'll provide the steps for setting up the LDAP SSL transport. Multiple approaches are possible, depending on your organization's PKI management practices. We'll assume here that different systems are used for certificate generation, signing and the LDAP services. For demonstration purposes we will be using a self-signed CA and certificates. For production systems you'll need to create a Certificate Signing Request (CSR) with the required extended attributes and have it signed by a Certificate Authority (CA) of your choice.

First of all, let's create and sign the SSL certificates:

 # 1> Set a password that will be used for certificate container encryption
 $ echo "<Your Very Secure Password>" > passwd

 # 2> Create a root CA:
 $ openssl req -x509 -newkey rsa:4096 -keyout myCA.key -out myCA.pem -days 3650 \
   -subj "/C=DE/L=Frankfurt/O=Verizon/OU=MH/CN=AMB38997ADS100.POC.LOCAL/[email protected]" \
   -passout file:passwd

 # 3> Strip the password from the RSA key:
 $ openssl rsa -in myCA.key -out myCA_nopass.key -passin file:passwd

 # 4> Export the CA certificate bundle along with the private key in PFX format:
 $ openssl pkcs12 -export -in myCA.pem -inkey myCA.key \
   -passin pass:$(<passwd) -out CA.pfx -passout file:passwd

 # 5> Create the CSR configuration:
 $ cat <<EOT >myCSR.cnf
 basicConstraints=CA:FALSE
 keyUsage = nonRepudiation,digitalSignature,keyEncipherment,dataEncipherment
 extendedKeyUsage=serverAuth,clientAuth
 subjectAltName=DNS:AMB38997ADS100.POC.LOCAL,DNS:POC.LOCAL,IP:10.169.69.11
 EOT

 # 6> Create a Certificate Signing Request:
 $ openssl req -out myCSR.csr -new -newkey rsa:4096 -nodes -keyout myCSR.key \
   -subj "/C=DE/L=Frankfurt/O=Verizon/OU=MH/CN=AMB38997ADS100.POC.LOCAL/[email protected]"

 # 7> Sign the request with your CA, using the custom config for the extended attributes required by AD:
 $ openssl x509 -CA myCA.pem -CAkey myCA_nopass.key -CAcreateserial -req -in myCSR.csr -days 3650 \
   -extfile myCSR.cnf -out myCSR.pem

 # 8> Export the signed server certificate bundle along with the private key in PFX format:
 $ openssl pkcs12 -export -in myCSR.pem -inkey myCSR.key -out ldaps.pfx -passout file:passwd

 # 9> Do cleanup - we don't need this password any more:
 $ rm -f passwd

After copying the PFX files, the last steps have to be performed on your AD servers:

 # 1> Import CA.pfx into Windows' Trusted Root Certification Authorities container

 # 2> Import ldaps.pfx into the 'NTDS\Personal' container for 'Active Directory Domain Services'

 # 3> Use the ldp tool to validate LDAP connectivity.


Now that LDAPS is set up, it's time to test the adtool functionality:

 $ adtool useradd testuser 'CN=Users,OU=HOSTING,DC=POC,DC=LOCAL'

 # create-and-rename may be required because of the "Name Length Limits from the Schema":
 # for backward compatibility the limit is 20 characters for the login name.
 $ adtool userrename testuser 'Firstname Lastname'

 # set password and unlock
 $ adtool setpass "Firstname Lastname" '<password>'
 $ adtool userunlock "Firstname Lastname"

 # login name
 $ adtool attributereplace "Firstname Lastname" userPrincipalName testuser

 # logon name (pre-Windows 2000)
 $ adtool attributereplace "Firstname Lastname" sAMAccountName testuser

 # email
 $ adtool attributereplace "Firstname Lastname" mail [email protected]

 # first name and last name
 $ adtool attributereplace "Firstname Lastname" givenName "Firstname"
 $ adtool attributereplace "Firstname Lastname" sn "Lastname"

 # display name
 $ adtool attributereplace "Firstname Lastname" displayName "Firstname Lastname"

 # checking the user has been created
 $ adtool list 'CN=Users,OU=HOSTING,DC=POC,DC=LOCAL'
 CN=Firstname Lastname,CN=Users,OU=HOSTING,DC=POC,DC=LOCAL

The architecture, deployment and setup considerations for the Active Directory services are out of the scope of this document. See the Active Directory Structure chapter for additional details about setting up the platform-related LDAP objects.

Things to keep in mind:

• The 20-character logon name limitation should be considered when using very old tools and AD infrastructure. During the project, both national alphabet symbols and long logon names have been tested successfully, with no issues identified;
• LDAPS is required to avoid sending credentials over un-encrypted communication channels. On the Windows side LDAPS is not required, since Windows favors Kerberos. Using Kerberos wasn't explored during this project, only the LDAP SSL secure transport;
• The adtool may use a configuration file or, alternatively, all options may be passed via command-line parameters. The latter approach is used in the platform management tools;
• The LDAP bind credentials used for accessing and managing AD objects are provided by the Secure Storage service;
• The vzpoc/adtool git repository contains an updated and improved version of adtool with extended commands and functions. In particular, it can manage OU-type containers. You may use the build instructions provided above for configuring and building this project;
• The adtool is a part of the management framework and is also provided as a standalone binary depending on the openldap client libraries only. Building a static, self-contained binary does not seem feasible, since openldap itself depends on the Cyrus SASL implementation, which is rather hard to build statically;
• The host system running adtool must have the openldap client libraries deployed.


Load-Balancer Service

A clustered pair of Citrix NetScaler hardware appliances performs the load-balancing function. This is the only service which is not accessed via APIs and programmatically. While technically possible, such access will most probably be prohibited due to the shared nature of those devices. Moving forward, it's planned to use HAproxy for setting up load-balancing services and managing them programmatically.

The load-balancers perform several functions:

• Load-balancing. Several load-balancing algorithms are supported, including the handling of sticky sessions. It is expected that session stickiness will be avoided and connections will be distributed using round-robin or least-connect algorithms;
• SSL Bridging/Termination. Depending on security and application requirements, SSL traffic can either be decrypted (SSL termination) or not, and forwarded to the elected load-balancer pool member. In some cases, e.g. for traffic inspection or session persistence, the traffic is decrypted and re-encrypted again using either the original or a different SSL key and forwarded to the elected load-balanced pool member (SSL bridging);
• Network Address Translation. Since VIPs are set up using public Internet addresses resolved by DNS, and load-balanced pool members use private network addresses, the load-balancer is also responsible for performing NAT when forwarding packets to and from the private subnet.

Since programmatic access to the load-balancers is not available, adding and removing new VIPs may become a real stumbling point. In order to avoid provisioning delays, a pool of pre-configured VIP slots is created during the hosting farm setup. In this case, every new website just claims a preconfigured VIP slot. When a website is removed or migrated to a different hosting farm, this VIP slot is freed up again and becomes available for new websites.

For this project, the hosting farm got 10 VIP slots for production and 10 VIP slots for staging websites:

 wbp1  10.169.64.211-220   213.177.35.146-155   # 1st production web server
 wbp2  10.169.64.221-230   213.177.35.146-155   # 2nd production web server
 wbs1  10.169.64.231-240   213.177.35.156-165   # 1st staging web server

As you can see from the above table, the staging VIPs map public IP addresses to the private IP address space 1-to-1. The production VIPs balance the load between the two web servers wbp1 and wbp2, thus providing a highly available setup.

One of the important concepts for the load-balancer service is the so-called health-check script or page. It has several applications:

• Ensuring that the web server is up and running and the whole application stack is available and can accept inbound connections (a manual probe is sketched after this list). For simple websites, this page may be just a static HTML file. For more complex deployments, the page may be dynamic and include code initiating a basic DB transaction, thus validating multiple layers: the web server engine, the application engine, and the DB connection pool health and readiness;
• Controlling whether the webserver instance is included in the load-balanced pool or not. This is useful for maintenance work, code deployment, and other operations requiring short-term webserver downtime or re-configuration.
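What the appliance's monitor effectively does can be reproduced by hand; the health-check path below is illustrative, since the actual location is farm-specific:

 $ curl -sk -o /dev/null -w '%{http_code}\n' https://10.169.64.232:8443/healthcheck.html   # path is illustrative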

 


Learning from past lessons, the health-check page is deployed outside of the website's <Document Root>, thus making it independent of the website deployment, its structure and page updates. This page is kept in a dedicated code repository and is deployed separately by the deployment service, along with the website code.

There is a dedicated tool provided for controlling the load-balancer health-check routine, making it possible to validate, add or remove a website instance from the load-balanced pool:

 $ /opt/deploy/web vip status --farm poc --env stg --site d7-demo @wbs1
 web vip status: d7-demo VIP state: disabled

 $ /opt/deploy/web vip up --farm poc --env stg --site d7-demo @wbs1
 web vip up: d7-demo VIP state: enabled

 $ /opt/deploy/web vip test --farm poc --env stg --site d7-demo @wbs1
 web vip test: VIP monitor: success

 $ /opt/deploy/web vip down --farm poc --env stg --site d7-demo @wbs1
 web vip down: d7-demo VIP state: disabled

 $ /opt/deploy/web vip test --farm poc --env stg --site d7-demo @wbs1
 web vip test: VIP monitor: failed

Things to keep in mind:

• Load-balancer VIP slots have to be set up when the hosting farm is created; therefore farm capacity will be capped by the number of pre-configured slots;
• Assuming the current network structure and design assumptions, one hosting farm can host up to ~80 websites (2 x prod + 1 x stag VIP), provided that a single subnet is used. If multiple subnets are used, this limit can be raised up to ~125 websites;
• It is assumed that session state is shared between application instances and web applications can properly handle their sessions, so that sticky sessions are not used. This means that any webserver instance can be removed from the load-balancer pool and its sessions will be picked up by the other pool members automatically;
• The current prototype assumes that SSL is enabled for all web sites;
• When a new website instance (container) is deployed, its health-check page is disabled, so the VIP is turned down. It must be explicitly enabled to allow inbound traffic.

SCM Service

All project artifacts are stored in versioned code repositories. This service is implemented as a Docker container located on the utility host in the foundation farm and running the GitLab application.

The SCM service depends on:

• IdM Service – controls access to the code repositories;
• Persistent Volumes – store the configuration and the Git repositories' content.

Other services in turn depend on the SCM service, among them:

• Workflow Engine – stores its configuration in the corresponding repo;
• Deployment Service – fetches projects from the project repos;
• Image Builder – fetches the image build file and its dependencies from the image repo.

Additionally, the SCM service provides both User and Admin portals, accessed by Developers and Platform Administrators correspondingly. Besides the external web projects, all platform settings, scripts and configurations are also version controlled and stored in corresponding Git repositories,


following the configuration-as-code paradigm. Here is an example of setting up the GitLab container. First of all, we'll set up the certificates:

 $ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -subj \
   "/C=DE/ST=HE/L=Frankfurt/O=VZ/OU=H/CN=gitlab.poc.local/[email protected]" \
   -keyout ${VOL_DATA}/gitlab/config/ssl/gitlab.poc.local.key \
   -out ${VOL_DATA}/gitlab/config/ssl/gitlab.poc.local.crt

Next, assuming the volume folders are in place, we'll boot up the GitLab container:

 $ docker run --name gitlab --detach=true --restart always --hostname gitlab.poc \
   --publish ${UTIL_HOST}:8443:443 --publish ${UTIL_HOST}:8080:80 --publish ${UTIL_HOST}:2222:22 \
   --volume ${VOL_DATA}/gitlab/config:/etc/gitlab \
   --volume ${VOL_DATA}/gitlab/logs:/var/log/gitlab \
   --volume ${VOL_DATA}/gitlab/data:/var/opt/gitlab \
   gitlab/gitlab-ce:8.6.8-ce.0

Now the GitLab application is listening on TCP ports 8443 for HTTPS connections and 2222 for SSH access to the git repositories.

In order to provide sufficient project and user isolation, all projects are created as private, so they are visible only to users who have been explicitly granted access. The users are set up with the LDAP authentication provider, using the IdM Service for authentication (a configuration sketch is shown below).

The Enterprise Edition (EE) of GitLab provides more advanced features for authorization and access management using LDAP groups and roles out of the box. However, the basic mechanism described above does satisfy the project isolation requirements, and the POC project was implemented using the free Community Edition (CE) of GitLab. See more details about the GitLab setup in the GitLab Repository Structure chapter.
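For reference, wiring the omnibus GitLab image to the LDAP directory is done in gitlab.rb using the standard omnibus settings; the bind DN and password placeholder below are illustrative (in this platform the bind credentials would come from the Secure Storage service):

 $ cat <<'EOT' >>${VOL_DATA}/gitlab/config/gitlab.rb
 gitlab_rails['ldap_enabled'] = true
 gitlab_rails['ldap_servers'] = YAML.load <<-EOS
   main:
     label: 'POC AD'
     host: 'poc.local'
     port: 636
     method: 'ssl'
     uid: 'sAMAccountName'
     bind_dn: 'CN=robot,CN=Users,OU=HOSTING,DC=POC,DC=LOCAL'   # illustrative bind account
     password: '<from Secure Storage>'
     base: 'DC=POC,DC=LOCAL'
 EOS
 EOT
 $ docker exec gitlab gitlab-ctl reconfigure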

Things to keep in mind:

• GitLab is set up for hybrid authentication, using both the local user database and the LDAP user directory. There are two special accounts in GitLab:
  o root – the administrative account used for GitLab management and configuration;
  o robot – the API bot account used by automation workflows and platform services. This account is mainly used for API access, and its API key is stored in the secure store;
• The user logon name is used for GitLab authentication. It's assumed that the IdM service ensures the uniqueness of this logon name across the enterprise;
• Although plain HTTP is supported by GitLab for accessing the Web UI and using WebDAV, it is discouraged due to security considerations, and attempts to access GitLab over HTTP will result in a redirect to the HTTPS endpoint;
• Obviously, there is a trade-off with known pros and cons when implementing a local code repository versus a hosted code repository. For this project it's been decided to use a local GitLab-based repository; however, it's also possible to use an external or hosted repository such as GitHub or BitBucket, assuming that the service integration has been performed and the service availability, security and access issues have been addressed.

 


Workflow Engine

The Workflow Engine, or Orchestration Engine, is implemented on the basis of the Jenkins automation framework. Jenkins itself runs in a container on the utility host in the foundation farm. Jenkins configurations and workflows, implemented as Groovy code, are stored in the corresponding code repository provided by the SCM service. Access to the Workflow Engine is controlled by the RBAC mechanisms provided by the IdM Service.

Jenkins provides multiple possibilities for setting up authorization mechanisms and RBAC rules. The current implementation uses the matrix-based access model, which allows mapping specific LDAP groups (roles) to certain Jenkins access permissions. Obviously, every specific project may call for additional roles and mappings. The POC project has been set up with the following roles and permission matrix; the Anonymous role is granted no permissions, Jenkins Administrators hold every permission listed, and Jenkins Job Managers hold only the permissions marked with an asterisk:

• Overall: Read*, Run Scripts*, Administer, Configure Update Center, Upload Plugins;
• Credentials: Create, Delete, Manage Domains, Update, View;
• Agent: Build, Configure, Connect, Create, Delete, Disconnect;
• Job: Build*, Cancel*, Read*, Workspace*, Configure, Create, Delete, Discover, Move;
• Run: Delete, Replay, Update;
• View: Configure*, Read*, Create, Delete;
• SCM: Tag.


Below is an example of setting up the Jenkins container:

# pulling the latest stable Jenkins build https://hub.docker.com/_/jenkins/
$ docker pull jenkins:alpine

# get the admin password from the container log or from
# /var/jenkins_home/secrets/initialAdminPassword
88345b8ecf904c0ba9eca63fb2cf8d47

$ docker run --name=jenkins --hostname jenkins --detach=true --restart=always \
  --cpu-shares 512 --memory 2G \
  --volume=${VOL_DATA}/jenkins:/var/jenkins_home \
  --publish jenkins.poc.local:8080:8080 \
  --publish jenkins.poc.local:50000:50000 \
  jenkins:alpine

Now that Jenkins is up and running, it can be managed via the Web UI as well as the API or CLI, e.g.

# login with the client
$ java -jar ./war/WEB-INF/jenkins-cli.jar -s http://localhost:8080 login \
  --username admin --password '<your admin password>'

# get the list of installed plugins
$ java -jar ./war/WEB-INF/jenkins-cli.jar -s http://localhost:8080 list-plugins \
  | cut -c 1-27,92- | awk '{printf "%s:%s\n", $1,$2}' | sort > plugins.txt

One of the benefits of Jenkins is modularity. You can find plug-ins for pretty much anything and tailor your Jenkins setup to your particular needs and processes. Below is a list of Jenkins plugins and their dependencies proposed for this POC project:

# The most important and essential plugins are annotated and provided with comments.
# Note: not all plugins are used and most are required to satisfy dependencies.

ace-editor:1.1                  # JS UI library
active-directory:1.47           # AD authentication and authorization
ansicolor:0.4.2                 # ANSI colors in the console output improve readability
ant:1.3
antisamy-markup-formatter:1.5
branch-api:1.10
build-name-setter:1.6.5         # Sets the name of a build to something other than #1, #2, #3, ...
build-pipeline-plugin:1.5.4
build-timeout:1.17.1            # Interrupt task after certain execution time threshold reached
cloudbees-folder:5.12           # Folders can help structuring and organizing jobs
conditional-buildstep:1.3.5
credentials-binding:1.8         # Makes credentials visible as a build parameter
credentials:2.1.4               # Store credentials in Jenkins secure containers
durable-task:1.12
email-ext:2.47
external-monitor-job:1.6
git-client:1.19.7               # Git client for Jenkins
git-server:1.7
git:2.5.3                       # Support for Git SCM
gitlab-plugin:1.3.0             # Support for GitLab SCM
greenballs:1.15                 # Makes successful jobs "green" not "blue"
groovy-postbuild:2.3.1
handlebars:1.1.1                # JS UI library
icon-shim:2.0.3
javadoc:1.4
jquery-detached:1.2.1           # JS UI library


jquery:1.11.2-0
junit:1.18
ldap:1.12
mailer:1.17
mapdb-api:1.0.9.0
matrix-auth:1.4                 # Matrix-based authorization strategies (global and per-project)
matrix-project:1.7.1
maven-plugin:2.13
momentjs:1.1.1                  # JS UI library
pam-auth:1.3
parameterized-trigger:2.32
pipeline-build-step:2.2
pipeline-input-step:2.1
pipeline-rest-api:1.7
pipeline-stage-step:2.1
pipeline-stage-view:1.7
plain-credentials:1.2
publish-over-ssh:1.14
rebuild:1.25
role-strategy:2.3.2             # Enables authorization using a role-based strategy
run-condition:1.0
scm-api:1.2
scm-sync-configuration:0.0.10   # Stores Jenkins configuration in SCM (Git) repo
script-security:1.22            # Controls which users may execute which scripts
scriptler:2.9
ssh-credentials:1.12            # Stores SSH credentials in Jenkins secure containers
ssh:2.4                         # Remote job execution using SSH
structs:1.3
subversion:2.6
timestamper:1.8.4               # Adds timestamps to console output
token-macro:1.12.1
uno-choice:1.4                  # Interactive input parameter HTML controls
windows-slaves:1.2
workflow-aggregator:2.2
workflow-api:2.1
workflow-basic-steps:2.1
workflow-cps-global-lib:2.2
workflow-cps:2.12
workflow-durable-task-step:2.4
workflow-job:2.5
workflow-multibranch:2.8
workflow-scm-step:2.2
workflow-step-api:2.3
workflow-support:2.2            # Workflow or Pipeline-as-code support
ws-cleanup:0.30                 # Clean up workspace upon task completion
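Plugins can be installed from the Web UI (Manage Jenkins > Manage Plugins) or in bulk from the command line. A minimal sketch, assuming the plugins.txt in name:version format produced by the list-plugins command above and an already authenticated jenkins-cli session; install-plugin and safe-restart are standard jenkins-cli commands:

# install the plugins captured in plugins.txt from the update center
# (versions are not pinned here; install-plugin takes plugin short names)
$ cut -d: -f1 plugins.txt | while read -r name; do
    java -jar ./war/WEB-INF/jenkins-cli.jar -s http://localhost:8080 install-plugin "${name}"
  done

# restart Jenkins to activate the newly installed plugins
$ java -jar ./war/WEB-INF/jenkins-cli.jar -s http://localhost:8080 safe-restart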

For more details on workflows please refer to the Management Tasks and Workflows chapter.

Things to keep in mind:

• The Jenkins project is actively developed and new versions appear very often. Unless you require a particular fix implemented in those versions, it's recommended to stick to a certain LTS version rather than always using the latest-greatest one. Although the Jenkins Core is known to be mature and stable, a Core version update may introduce incompatibilities with the deployed plugins, so they will need to be checked for compatibility as well;

• The list of plugins proposed above is by no means definitive and should be seen as an example only. The versions indicated above were current at the time of container deployment, and half of the plugins are already proposing upgrades;

• By default the Active Directory plugin uses the plain LDAP protocol. In order to secure LDAP communication you can use one of the supported mechanisms:

o The Active Directory plugin performs a TLS upgrade – it connects to domain controllers through insecure LDAP, then from within the LDAP protocol it "upgrades" the connection to use TLS, achieving the same degree of confidentiality and server authentication as LDAPS does;
o If you insist on using LDAPS, and not TLS upgrade, you can set the system property hudson.plugins.active_directory.ActiveDirectorySecurityRealm.forceLdaps=true as a startup parameter to force Jenkins to start the connection with LDAPS, even though this buys you nothing over LDAP+TLS upgrade. You will also need to check inside config.xml to ensure that the secured port is either defined (636 or 3269) or not defined at all;

• Jenkins recognizes all the Active Directory groups the user belongs to, so you can use those to make authorization decisions. For example, you can choose matrix-based security as the authorization strategy and perhaps allow "Jenkins Admins" to administer Jenkins;

• Jenkins is using the JVM and Java is hungry for memory, therefore for a production setup the Jenkins container memory constraints should be carefully chosen considering JVM memory allocation and garbage collection settings;

• You can set up additional Jenkins slaves if greater concurrency is required;
• Since the Jenkins container does not store any state, it's safe to stop and restart this container, assuming there are no tasks in flight and no management activities being performed at the moment. The recommended approach is to use the "Prepare for Shutdown" task in the Jenkins management menu, as illustrated below.
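"Prepare for Shutdown" is also exposed on the command line as the quiet-down command, so a graceful container restart can be scripted. A minimal sketch, assuming an authenticated jenkins-cli session as set up earlier:

# put Jenkins into "Prepare for Shutdown" mode: no new builds are started
$ java -jar ./war/WEB-INF/jenkins-cli.jar -s http://localhost:8080 quiet-down

# wait for in-flight builds to drain, then restart the container
$ docker restart jenkins

# to abort the shutdown instead, bring Jenkins back to normal operation
$ java -jar ./war/WEB-INF/jenkins-cli.jar -s http://localhost:8080 cancel-quiet-down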

 

SonarQube Service

The SonarQube Analysis Engine, or Static Code Analyzer Service, is implemented as a set of containers located on the utility host in the foundation farm and running the SonarQube application:

• SonarQube container – provides the Web Portal and Code Analysis engine;
• Sonar Scanner container – wraps the Sonar Scanner application and its runtime;
• Sonar Database container – optional container providing the Sonar Database instance.

The SonarQube container also exposes the user interface and APIs, which may be used both for administration and for code review, depending on the user's role. The SonarQube service depends on the Sonar Database, Persistent Volumes and IdM Services. The Sonar Database is used as persistent storage for analysis reports and artifacts, while the IdM Service provides authentication and authorization.

The SonarQube application and its functionality can be extended using various plugins provided either by the vendor or by the broad community. For this project the following plugins have been selected and explored:

• LDAP – delegates authentication to LDAP;
• Generic Coverage – imports coverage reports defined in a given format;
• Git – Git SCM provider;
• CSS – enables analysis of CSS and Less files;
• JavaScript – enables scanning of JavaScript source files;
• PHP – enables analysis of PHP projects;
• Web Languages – enables scanning of HTML and JSP/JSF files;
• XML – enables analysis and reporting on XML files.

 Things  to  keep  in  mind:  

• In the current setup the Sonar Database is provided by the Persistent Database Storage service and a dedicated Sonar Database container is not used;

• SonarQube is using the JVM and Java is hungry for memory, therefore for a production setup the container memory constraints should be carefully chosen considering JVM memory allocation and garbage collection settings. Obviously, those settings must be correlated with the container resource constraints for optimal performance;

• An additional sizing exercise must be performed to identify the number of scanner jobs that may be submitted concurrently;

• After successful authentication, all user groups provided by the IdM Service are mapped automatically to the groups known to SonarQube. There are two SonarQube groups set up by default: sonar-users and sonar-administrators. It's sufficient to create such groups on the Active Directory side and add users to them accordingly. After authentication these users will be properly mapped and grouped by the SonarQube application too;

• By   default   sonar-­‐users   may   submit   new   analysis   jobs   and   review   results.   The   sonar-­‐administrators   can   manage   the   application   and   its   settings.   However,   by   default,   sonar-­‐administrators  cannot  submit  new  analysis  jobs;  

• The SonarQube project is actively developed and new versions appear quite often. Unless you require a particular fix implemented in those versions, it's recommended to stick to a certain LTS version for the sake of consistency and compatibility with the chosen plugins.

 

Sonar Database

The Sonar Database service can be implemented as a Docker container or as an external database identified by its DSN (Data Source Name). The DB schema required by SonarQube to operate is created automatically upon application startup and first DB access, provided the Sonar DB user has sufficient access permissions. Otherwise the DB schema must be pre-created manually.

As already mentioned, in the current setup the Sonar Database is provided by the Persistent Database Storage service, therefore a dedicated Sonar Database container is not used.
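For illustration, pointing SonarQube at an external database comes down to setting its standard sonar.jdbc.* properties. A sketch only: the host, schema and user names below are placeholders, and the password would be fetched from the Vault rather than kept in a plain-text file.

# sketch of an external DSN in sonar.properties; host, schema and user
# names are illustrative
sonar.jdbc.url=jdbc:mysql://db.poc.local:3306/sonar?useUnicode=true&characterEncoding=utf8
sonar.jdbc.username=sonar
sonar.jdbc.password=<fetched from Vault>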

Sonar Scanner

The Sonar Scanner service is implemented as a Docker container located on the utility host in the foundation farm. Its main purpose is submitting projects to the SonarQube Analysis Engine. The usual workflow looks like the following:

• The Sonar Scanner is executed:
o This may be a standalone task, when a user initiates a code analysis project;
o The code analysis task may be invoked as a part of a bigger workflow;

• The Sonar Scanner submits the project code to the Sonar Analysis Engine and reports the scanner task ID and analysis progress back to the caller;

• The Sonar Analysis Engine validates the code against defined quality rules and standards. All identified deviations, code defects and issues are stored in the Sonar Database;

• When the analysis is completed, the user can access the SonarQube Portal to review the analysis summary and code quality metrics. From there the user can drill down to every single issue and understand the scope of the problems identified with the code fragment;

• After fixing the code quality issues and identified defects the analysis may be repeated. In this case the Sonar Analysis Engine also calculates various trends and incremental code quality metrics so that users can track code quality improvement over time.

The Sonar Scanner service depends on the SonarQube service. There is no service directly depending on the Sonar Scanner, though indirectly it may be invoked by workflow tasks or triggered by the Platform CLI.

For example, let's manually trigger code analysis for the website project. The --login parameter is given a Sonar token instead of a user-password pair, which is the preferred and more secure approach.


$ /opt/deploy/sonar --login 4f758a739160aeec49d5c7ed628a4f... \
    --path /var/web/dev/root/d7-demo/demo/docroot \
    --group demo_agency --name d7_demo --version 7.50

INFO: Scanner configuration file: /opt/sonar-scanner/conf/sonar-scanner.properties
INFO: Project root configuration file: NONE
INFO: SonarQube Scanner 2.6.1
INFO: Java 1.8.0_92-internal Oracle Corporation (64-bit)
INFO: Linux 3.10.0-327.28.3.el7.x86_64 amd64
INFO: User cache: /workspace/cache
INFO: Load global repositories
INFO: Load global repositories (done) | time=180ms
INFO: User cache: /workspace/cache
INFO: Load plugins index
INFO: Load plugins index (done) | time=4ms
INFO: SonarQube server 5.6.1
INFO: Default locale: "en_US", source code encoding: "UTF-8" (analysis is platform dependent)
INFO: Process project properties
INFO: Load project repositories
INFO: Load project repositories (done) | time=48ms
INFO: Load quality profiles
INFO: Load quality profiles (done) | time=75ms
INFO: Load active rules
INFO: Load active rules (done) | time=587ms
INFO: Publish mode
INFO: -------------  Scan d7-demo
INFO: Language is forced to php
INFO: Load server rules
INFO: Load server rules (done) | time=115ms
INFO: Base dir: /workspace
INFO: Working dir: /workspace/.sonar
INFO: Source paths: docroot
INFO: Source encoding: UTF-8, default locale: en_US
INFO: Index files
INFO: 1296 files indexed
INFO: Quality profile for php: Sonar way
INFO: Sensor NoSonar and Commented out LOC Sensor
INFO: Sensor NoSonar and Commented out LOC Sensor (done) | time=1084ms
INFO: Sensor Lines Sensor
INFO: Sensor Lines Sensor (done) | time=60ms
INFO: Sensor PHPSensor
INFO: 1296 source files to be analyzed
INFO: 50/1296 files analyzed, current file: /workspace/docroot/includes/lock.inc
INFO: 150/1296 files analyzed, current file: /workspace/docroot/modules/openid/openid.pages.inc
INFO: 203/1296 files analyzed, current file: /workspace/docroot/modules/simpletest/tests/upgrade/drupal-6.forum.database.php
INFO: 346/1296 files analyzed, current file: /workspace/docroot/profiles/vzbase/modules/ctools/includes/wizard.theme.inc
INFO: 635/1296 files analyzed, current file: /workspace/docroot/profiles/vzbase/modules/features/features.admin.inc
INFO: 847/1296 files analyzed, current file: /workspace/docroot/profiles/vzbase/modules/views/handlers/views_handler_field.inc
INFO: 1088/1296 files analyzed, current file: /workspace/docroot/profiles/vzbase/modules/views/plugins/views_plugin_exposed_form.inc
INFO: 1296/1296 source files have been analyzed
INFO: Sensor PHPSensor (done) | time=86335ms
…
INFO: Analysis report generated in 1676ms, dir size=22 MB
INFO: Analysis reports compressed in 2316ms, zip size=8 MB
INFO: Analysis report uploaded in 700ms
INFO: ANALYSIS SUCCESSFUL, you can browse http://10.169.64.241:9000/dashboard/index/vzpoc
INFO: Note that you will be able to access the updated dashboard once the server has processed the submitted analysis report
INFO: More about the report processing at http://10.169.64.241:9000/api/ce/task?id=AVfCEentql6U738tQ3qu
INFO: ------------------------------------------------------------------------
INFO: EXECUTION SUCCESS
INFO: ------------------------------------------------------------------------
INFO: Total time: 1:50.097s
INFO: Final Memory: 53M/908M
INFO: ------------------------------------------------------------------------

Now, in the Sonar User Portal we can see the analysis summary along with the code quality metrics:

Figure 8 - Sonar Project Dashboard

SonarQube does provide a number of scan-rule packages; however, you may want to include additional rules provided by vendors (in our case PHP and Drupal) or implement your own rules specific to your organization's quality guidelines and policies.

It's important to understand that no automated code analyzer can replace good coding culture and pair code reviews. However, such code analysis tools can be seen as a great helper for identifying many classes of coding errors.

We can check the first bug from the list to get an idea about the kind of issues caught by the SonarQube code analyzer. As you can see from the screenshot below, the alert was caused by a for-loop missing its code block. The code quality topic warrants an involved discussion, and scan results will depend a lot on scanner rules and applied quality policies. There will definitely be some false positives found.


Figure 9 - Sonar Issue Report

The Sonar Scanner parameters can be adjusted via command line options. Alternatively, you can use a sonar-project.properties configuration file in the root directory of the project, e.g.

# must be unique in a given SonarQube instance
sonar.projectKey=my:project

# this is the name and version displayed in the SonarQube UI. Was mandatory prior to SonarQube 6.1.
sonar.projectName=My project
sonar.projectVersion=1.0

# Path is relative to the sonar-project.properties file.
# Since SonarQube 4.2, this property is optional if sonar.modules is set.
# If not set, SonarQube starts looking for source code from the directory containing
# the sonar-project.properties file.
sonar.sources=.

# Encoding of the source code. Default is default system encoding
#sonar.sourceEncoding=UTF-8

For more details on scanner parameters see the following links:
http://docs.sonarqube.org/display/SCAN/Analyzing+with+SonarQube+Scanner
http://docs.sonarqube.org/display/SONAR/Analysis+Parameters

Things to keep in mind:

• The Sonar Scanner is using the JVM and Java is hungry for memory, therefore for a production setup the container memory constraints should be carefully chosen considering JVM memory allocation and garbage collection settings. Obviously, those settings must be correlated with the container resource constraints for optimal performance;

• An additional sizing exercise must be performed to identify the number of scanner jobs that may be submitted concurrently;

• The scanner behavior can be refined and controlled by providing additional scanner settings in the corresponding configuration files. This can be done via either a project- or module-wide settings file and thus may be controlled by the developer.

 

Platform Interfaces

The platform provides multiple interfaces of various types: Web UI, CLI and API. These different interfaces usually provide access to the same functionality, and the appropriate interface should be chosen depending on the use-case.

API Endpoints

Generally speaking, these API endpoints are for internal use only and are not exposed by the platform for external consumption, with some possible exceptions2.

Below is the list of component APIs:

• MS ADC – the LDAPS protocol is used to utilize the standard APIs implemented by MS ADC
• Grafana – http://docs.grafana.org/reference/http_api/
• InfluxDB – https://docs.influxdata.com/influxdb/v1.0//tools/api/
• cAdvisor – https://github.com/google/cadvisor/blob/master/docs/api.md
• SyncThing – https://docs.syncthing.net/dev/rest.html
• Docker Engine – https://docs.docker.com/engine/reference/api/docker_remote_api/
• Docker Registry – https://docs.docker.com/registry/spec/api/
• Jenkins – https://wiki.jenkins-ci.org/display/JENKINS/Remote+access+API
• GitLab – https://docs.gitlab.com/ce/api/
• SonarQube – http://docs.sonarqube.org/display/DEV/Web+API
• Drupal – https://www.drupal.org/drupalorg/docs/api
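As a quick illustration of these internal endpoints, the following requests probe the Docker Registry (v2 API) and the InfluxDB HTTP API; the endpoints themselves are standard for those components, and the addresses are the POC endpoints seen elsewhere in this document.

# list repositories known to the private Docker Registry
$ curl http://registry.poc:5000/v2/_catalog

# health-check the InfluxDB HTTP API (returns 204 No Content when healthy)
$ curl -i http://10.169.64.241:8086/ping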

 

Command Line Interfaces

The set of CLI tools provides the next abstraction level above the programmatic interfaces or APIs and is used for particular administrative and troubleshooting tasks. These CLI tools provide input validators, auto-completion for parameters and other convenient shortcuts and helpers.

For day-to-day administrative and support activities, however, it is expected that the Web UI and corresponding portals will be used predominantly.

Platform CLI

The Platform CLI or management CLI toolkit is designed with a master-slave or hub-spoke architecture in mind. The platform management tools are deployed on a management (master or hub) node and management tasks are submitted to the slave or spoke nodes. That said, a distributed or multi-master model is also possible. The CLI can be executed on any node in the farm. If a management task targets the current node, the task is performed locally; otherwise, the target node is contacted and the task is delegated and executed remotely.

2 If the platform has to be integrated with other enterprise tools and services, then specific APIs may be exposed to external applications.


Thus, the management toolkit can be used in both scenarios: with a dedicated management node and without one.

The dotted lines on the platform diagram show that an administrative user may use the CLI tools. This is considered low-level access and, generally speaking, it's discouraged in favor of management workflows. Those workflows still rely upon the Platform CLI; however, they implement strict validation and task sequences. Thus, management workflows guarantee task serialization and consistency, which may not always be ensured when the CLI tools are executed manually. For the current project the Platform CLI has been deployed on the utility host, at the util.poc.local:/opt/deploy location. The following CLI tools belong to the Platform CLI suite.

One potential improvement would be packaging the Platform CLI tools into a separate container, becoming completely independent from the runtime provided by a particular host. Obviously, this would also remove the dependency on the openldap client libraries on the host where adtool is executed.

LDAP/AD CLI

This tool is used to manage hosting related objects located in LDAP/AD containers. The tool is a wrapper for the aforementioned adtool; besides translating AD management operations into LDAP calls, it ensures the AD structure and validates user inputs against naming standards and rules. The credentials for the LDAP management account are stored in the Vault credential storage.

$ ./vault list /secret/poc/ldap/
binddn
bindpw
$ ./vault get /secret/poc/ldap/binddn
CN=adtool,CN=Service Accounts,OU=HOSTING,DC=POC,DC=LOCAL
$ ./ldap
LDAP/AD CLI tool for managing hosting related objects
Usage: ./ldap <object> <action> [<args 1> ... <arg N>]
Objects and actions:
    group <list|add|del|users|adduser|deluser> <args...>
        list (dir) [--pattern <pattern>] [--format <short|long>]
        add (create) --group <group name>
        del (remove) --group <group name>
        users (members) --group <group name> [--format <short|long>]
        adduser (useradd) --group <group name> --org <org name> --name <user name>
        deluser (userdel) --group <group name> --org <org name> --name <user name>
    org <list|add|del> <args...>
        list (dir) [--pattern <pattern>] [--format <short|long>]
        add (create) --org <org name>
        del (remove) --org <org name>
    user <list|add|del> <args...>
        groups (memberof) --name <user name> [--format <short|long>]
        list (dir) --org <org name> [--pattern <pattern>] [--format <short|long>]
        add (create) --org <org name> --first <first name> --last <last name>
            --login <user login> --mail <email> --password <password>
        del (remove) --org <org name> --name <user name>

GitLab CLI

This tool is used for managing the GitLab application from the command line. It talks directly to the GitLab APIs and can be used for automating administrative and maintenance activities.

The GitLab management token is stored in the Vault and provided by the Secure Storage service at runtime, so that administrative GitLab credentials are not exposed anywhere and can be safely changed, secured or stored in a different credentials storage.


This management token belongs to the "API bot" user called robot, thus all administrative tasks are performed on its behalf, and the administrative user root is mainly for Web UI administrative use.

Both the root and robot administrative users are created in the local GitLab user database. All other users are set up for LDAP authentication using the IdM Service.

# the API token used to manage GitLab
$ /opt/deploy/vault list /secret/poc/gitlab
token

$ /opt/deploy/gitlab
GitLab CLI tool for managing hosting related objects
Usage: ./gitlab <object> <action> [<args 1> ... <arg N>]
Objects and actions:
    group <list|add|del|users|adduser|deluser> <args...>
        list (dir) [--pattern <pattern>] [--format <list|table>]
        add (create) --group <group name> [--description <description>]
        del (remove) --group <group name>
        users (members) --group <group name> [--format <list|table>]
        adduser (useradd) --group <group name> --login <user login>
            [--access <guest|reporter|developer|master|owner>]
        deluser (userdel) --group <group name> --login <user login>
    project <list|add|del> <args...>
        list (dir) [--group <group name>] [--pattern <pattern>]
            [--format <list|table>]
        add (create) --project <project name> --group <project group>
            [--description <project description>] [--clone <project path>]
        del (remove) --project <project name> --group <project group>
    user <list|add|del> <args...>
        list (dir) [--pattern <pattern>] [--format <list|table>]
        add (create) --login <user login> --name <First Last> --mail <email>
            --org <LDAP org>
        del (remove) --login <user login>

 

Web CLI

This tool is used for managing web projects, their web containers and load-balancer VIPs. It is one of the key platform management tools. It "knows" how to deploy new web projects from the source code repositories to the target web hosting farm and environment and, like the other CLI tools, is used by platform management workflows.

Strictly speaking, the Web CLI is the only platform tool which depends on SSH connectivity, delegating tasks and re-spawning itself on remote nodes. The tool does not require any credentials; remote host access is performed using password-less, key-based SSH connectivity.

It's worth mentioning that SSH is set up to use connection multiplexing, giving an additional speed-up to cross-host connectivity. One possible improvement would be using an SSH agent and storing the key password in the Vault. This would prevent an unauthorized user from hopping between hosts in the farm while still allowing authorized platform services cross-host SSH access; a minimal multiplexing configuration is sketched below.
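The connection multiplexing mentioned above is plain OpenSSH client configuration. A minimal sketch, assuming the POC host naming; the host pattern and timing values are illustrative:

# ~/.ssh/config – reuse one TCP connection for repeated SSH sessions
Host *.poc.local
    ControlMaster auto
    ControlPath ~/.ssh/cm-%r@%h-%p
    ControlPersist 10m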


$ /opt/deploy/web
CLI tool for managing web projects and containers
Usage: ./web <object> <action> [<args 1> ... <arg N>]
Objects and actions:
    container <deploy|health|ipmap|list|stats|remove> <args...>
        deploy (create) --site <site name> --farm <farm> --env <env> [--host <target>]
            [--ip <ipaddr>] [--size <small|medium|large>] [--image <image name>]
        health --site <site name> --farm <farm> --env <env> [--host <target>]
        ipmap [--site <site name>] --farm <farm> --env <env> [--host <target>]
        list [--site <site name>] --farm <farm> --env <env> [--host <target>] [--format <list|table>]
        stats [--site <site name>] --farm <farm> --env <env> [--host <target>]
        remove (delete) --site <site name> --farm <farm> --env <env> [--host <target>]
    project <build|deploy|list|remove> <args...>
        list (dir) [--site <site name>] --farm <farm> --env <env> [--host <target>]
            [--pattern <pattern>] [--format <list|table>]
        build --project <project name> --group <project group> [--description <project description>]
            [--profile <profile path>]
        deploy --site <site name> --farm <farm> --env <env> [--host <target>] --org <org name>
            --project <project name> --group <project group>
        remove --site <site name> --farm <farm> --env <env> [--host <target>]
            --path <archive path | /dev/null>
    vip <down|status|test|up>
        down --site <site name> --farm <farm> --env <env> [--host <target>]
        status --site <site name> --farm <farm> --env <env> [--host <target>]
        test --site <site name> --farm <farm> --env <env> [--host <target>]
        up --site <site name> --farm <farm> --env <env> [--host <target>]

 

Vault CLI

HashiCorp Vault (https://www.vaultproject.io/) provides its own CLI tool; however, it's more practical to access the secure storage over the API. Moreover, additional input validation has been implemented to make Vault usage safer and more convenient.

$ /opt/deploy/vault
Vault CLI tool for managing objects in secure storage
Usage: ./vault <action> <key path> [<value>]

<key path> - the key path in the secure storage, has the following format
             /<key>/.../<path> and can be up to 128 chars long. Currently
             all keys are put under the /secret prefix.
<value>    - the value string can be up to 1024 chars long and may contain
             any ASCII characters but double-quotes.

Supported actions:
    list (dir) <key path>
    get (read) <key path>
    del (drop) <key path>
    put (set) <key path> <value>

The Vault is modular and supports multiple storage and authentication back-ends. For this project the built-in key-based mechanism is used for authentication, requiring at least 3 out of 5 unlock keys in order to unseal the Vault storage. It is possible to use the LDAP authentication backend too, but this may lead to circular dependencies if the credentials required for accessing LDAP are themselves stored in the Vault, which is locked and must be unsealed first.
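For reference, the 3-out-of-5 unseal scheme corresponds to the standard init/unseal flow of the stock Vault CLI. A sketch for the Vault version of that era; the address and key values are placeholders:

$ export VAULT_ADDR=http://10.169.64.241:8200

# initialize with 5 key shares, any 3 of which can unseal the storage
$ vault init -key-shares=5 -key-threshold=3

# after a restart the Vault comes up sealed; enter any 3 of the 5 unseal keys
$ vault unseal <unseal-key-1>
$ vault unseal <unseal-key-2>
$ vault unseal <unseal-key-3>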


Scanner CLI

This is a wrapper for the sonar-scanner Java application used to submit code analysis tasks to the SonarQube server. The Java application is executed in a container to enforce resource constraints. User inputs are additionally validated against naming standards.

$ /opt/deploy/vault list /secret/poc/sonar
token

$ /opt/deploy/scanner
Sonar-Scanner: submits given project to the SonarQube code analyzer
Usage: ./scanner --option1 <value 1> ... --optionN <value N> [-- -Dsonar.key=val ...]
Supported options:
    --path <project path>
        File-system path where the project has been deployed.
    --group <group name>
        The project group name
    --name <project name>
        The project name
    --version <project version>
        The project version
    [--sources <source subdir>]
        Optional source folder location relative to project path if sources stored in subdir.
    [--host <Server URL>]
        Optional SonarQube server URL

For additional analysis parameters, see
http://docs.sonarqube.org/display/SONAR/Analysis+Parameters

Although it's possible to use both a username-password pair and a secure token for SonarQube server authentication, the secure token is preferred. For submitting jobs this tool uses the authentication token from the Vault.

Sonar CLI

The SonarQube application can be managed either via the Web UI or via the REST API. The latter approach is employed for automating general management tasks, such as managing authorization groups, project permission templates and the projects themselves.

In order to provide reliable isolation in a multi-user and multi-project environment, the following strategy has been proposed and implemented:

• Each LDAP organization gets its own LDAP group(s) assigned. This LDAP group is named after the Organization name with a Sonar Users suffix added;

• A corresponding Sonar authorization group with the same name as the LDAP group is created. This enables mapping between LDAP and Sonar groups during the authorization phase;

• A permission template is created in Sonar for binding permissions to the projects whose keys match a given pattern;

• Eventually, the Sonar authorization group(s) with specific permissions are added to the template, thus implementing the authorization matrix, which defines which users or user groups are allowed to perform certain operations on given projects.

This can be better explained with an example. Let's assume our organization is called "Alpha Agency" and we want organization users in the corresponding LDAP groups to browse Sonar projects and access project code. The following configuration has to be performed to implement an authorization strategy allowing organization and project isolation.


LDAP: add organization
    org name: "Alpha Agency"

LDAP: add group
    group name: "Alpha Agency Sonar Users"

SONAR: add group
    group name: "Alpha Agency Sonar Users"

SONAR: add permission template
    template name: "Alpha Agency"
    project key: "alpha_agency:*"

SONAR: add permission
    template name: "Alpha Agency"
    group: "Alpha Agency Sonar Users"
    permission: "Browse"

SONAR: add permission
    template name: "Alpha Agency"
    group: "Alpha Agency Sonar Users"
    permission: "See Source Code"
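Under the hood the Sonar-side steps translate into SonarQube web-service calls. A sketch using curl against the 5.x web API; endpoint and parameter names should be verified against your server version, the server address is the POC endpoint, and note that the API expects the project key pattern as a regular expression rather than a glob:

# fetch the management token from the Vault
TOKEN=$(/opt/deploy/vault get /secret/poc/sonar/token)
SONAR=http://10.169.64.241:9000

# create the authorization group
curl -u "${TOKEN}:" -X POST "${SONAR}/api/user_groups/create" \
  --data-urlencode "name=Alpha Agency Sonar Users"

# create the permission template bound to the project key pattern
curl -u "${TOKEN}:" -X POST "${SONAR}/api/permissions/create_template" \
  --data-urlencode "name=Alpha Agency" \
  --data-urlencode "projectKeyPattern=alpha_agency:.*"

# grant Browse ("user") and See Source Code ("codeviewer") to the group
for perm in user codeviewer; do
  curl -u "${TOKEN}:" -X POST "${SONAR}/api/permissions/add_group_to_template" \
    --data-urlencode "templateName=Alpha Agency" \
    --data-urlencode "groupName=Alpha Agency Sonar Users" \
    --data-urlencode "permission=${perm}"
done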

This effectively means that all users in the Alpha Agency Sonar Users group will be granted the Browse and See Source Code permissions for all Sonar projects with project keys matching the "alpha_agency:*" pattern.

Such an authorization structure allows you to:

• Separate LDAP organizations and their projects from each other;
• Grant specific organization users access to Sonar projects with given access level granularity;
• Add new projects to Sonar, so that they will inherit the same permission set due to the permission template mechanisms in place.

For more details about Sonar authorization and permission templates, see the following links:
http://docs.sonarqube.org/display/SONARQUBE56/Authorization
http://docs.sonarqube.org/display/SONAR/Authorization

Report CLI

The Report CLI runs predefined InfluxQL scripts and queries against the Stats Database and generates reports in different formats.
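For illustration, a report of this kind boils down to queries like the one below, run with the stock influx client; the measurement and tag names follow cAdvisor's InfluxDB conventions but are assumptions here, as is the database name.

# average per-container memory usage over the last 24 hours (illustrative)
$ influx -host 10.169.64.241 -database cadvisor -execute \
  "SELECT MEAN(value) FROM memory_usage WHERE time > now() - 24h GROUP BY container_name"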

Test CLI

This is a set of tools validating platform services and platform health. The testing tools are idempotent, meaning that they can be executed repeatedly; in case of failure the test will continue from the failed step and proceed until completion, when all objects created during the test will be automatically cleaned up.

The setup validation tool checks whether:

• Platform Services are deployed;
• Platform Services are running and healthy;
• Platform Service endpoints are published and accessible;
• Credentials for accessing Platform Services have been set up and stored in the Vault.


 For  example:  

testSetup: ### checking service "cadvisor"
testSetup: -- found container "cadvisor"
testSetup: -- container "cadvisor" is using image "google/cadvisor:v0.23.8"
testSetup: -- container "cadvisor" is up and running
testSetup: -- 10.169.64.241:18080: connection successful
testSetup: ### checking service "grafana"
testSetup: -- found container "grafana"
testSetup: -- container "grafana" is using image "registry.poc:5000/poc/grafana:2.6.0"
testSetup: -- container "grafana" is up and running
testSetup: -- 10.169.64.241:3000: connection successful
testSetup: ### checking service "influxdb"
testSetup: -- found container "influxdb"
testSetup: -- container "influxdb" is using image "registry.poc:5000/poc/influxdb"
testSetup: -- container "influxdb" is up and running
testSetup: -- 10.169.64.241:8083: connection successful
testSetup: -- 10.169.64.241:8086: connection successful
testSetup: ### checking service "sonar"
testSetup: -- found container "sonar"
testSetup: -- container "sonar" is using image "registry.poc:5000/poc/sonar:5.6.1"
testSetup: -- container "sonar" is up and running
testSetup: -- 10.169.64.241:9000: connection successful
testSetup: ### checking service "vault"
testSetup: -- found container "vault"
testSetup: -- container "vault" is using image "registry.poc:5000/poc/vault"
testSetup: -- container "vault" is up and running
testSetup: -- 10.169.64.241:8200: connection successful
testSetup: ### checking service "jenkins"
testSetup: -- found container "jenkins"
testSetup: -- container "jenkins" is using image "jenkins:alpine"
testSetup: -- container "jenkins" is up and running
testSetup: -- 10.169.64.241:50000: connection successful
testSetup: -- 10.169.64.241:8888: connection successful
testSetup: ### checking service "gitlab"
testSetup: -- found container "gitlab"
testSetup: -- container "gitlab" is using image "gitlab/gitlab-ce:8.6.8-ce.0"
testSetup: -- container "gitlab" is up and running
testSetup: -- 10.169.64.241:8080: connection successful
testSetup: -- 10.169.64.241:8443: connection successful
testSetup: -- 10.169.64.241:2222: connection successful
testSetup: ### checking service "registry"
testSetup: -- found container "registry"
testSetup: -- container "registry" is using image "registry:2.5"
testSetup: -- container "registry" is up and running
testSetup: -- 10.169.64.241:5000: connection successful
testSetup: ### looking up "LDAP Bind Username" in Vault
testSetup: vault:/secret/poc/ldap/binddn found
testSetup: ### looking up "LDAP Bind Password" in Vault
testSetup: vault:/secret/poc/ldap/bindpw found
testSetup: ### looking up "MySQL DBA Username" in Vault
testSetup: vault:/secret/poc/mysql/DBA_USER found
testSetup: ### looking up "MySQL DBA Password" in Vault
testSetup: vault:/secret/poc/mysql/DBA_PASS found
testSetup: ### looking up "GitLab Auth Token" in Vault
testSetup: vault:/secret/poc/gitlab/token found
testSetup: ### looking up "SonarQube Auth Token" in Vault
testSetup: vault:/secret/poc/sonar/token found

The test-workflow validation tool checks the orchestration and management workflows by executing a number of tasks and exercising the complete website life-cycle, including, but not limited to, the following steps:


• Setup:
- List LDAP orgs (ldap org list)
- Add LDAP org (ldap org add)
- Add LDAP user (ldap user add)
- Test user has been added (ldap user list)
- List LDAP groups (ldap group list)
- Add LDAP group (ldap group add)
- List LDAP group users (ldap group users)
- Add LDAP user to group (ldap group adduser)
- Test user is member of the LDAP group (ldap user groups)
- List GitLab groups (gitlab group list)
- Add GitLab group (gitlab group add)
- List GitLab users (gitlab user list)
- Add GitLab user (gitlab user add)
- Add GitLab user to group (gitlab group adduser)
- Test user is member of the GitLab group (gitlab group users)
- Build web project from base (web project build)
- Create GitLab project (gitlab project add)
- List GitLab projects (gitlab project list)
- Deploy web project (web project deploy)
- Test web project deployed (web project list)
- Deploy web container (web container deploy)
- Test web container deployed (web container list)
- Enable VIP (web vip up)
- Test VIP status (web vip status)

• Test:
- List IP map (web container ipmap)
- Test VIP monitor (web vip test)
- Test web container health (web container health)
- Get web container stats (web container stats)

• Cleanup:
- Disable VIP (web vip down)
- Remove web container (web container remove)
- Remove web project (web project remove)
- Remove GitLab user from group (gitlab group deluser)
- Remove GitLab project (gitlab project del)
- Remove GitLab group (gitlab group del)
- Remove GitLab user (gitlab user del)
- Remove LDAP user from group (ldap group deluser)
- Remove LDAP user (ldap user del)
- Remove LDAP group (ldap group del)
- Remove LDAP org (ldap org del)
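For illustration, a few of the setup steps above can also be run manually with the Platform CLI. The option syntax follows the usage text shown earlier in this chapter, while the org, user, farm and site names are placeholders:

$ /opt/deploy/ldap org add --org "Demo Agency"
$ /opt/deploy/ldap user add --org "Demo Agency" --first Jane --last Doe \
    --login jdoe --mail jdoe@poc.local --password '<password>'
$ /opt/deploy/gitlab group add --group demo_agency --description "Demo Agency projects"
$ /opt/deploy/gitlab group adduser --group demo_agency --login jdoe --access developer
$ /opt/deploy/web project deploy --site demo --farm web --env dev \
    --org "Demo Agency" --project demo --group demo_agency
$ /opt/deploy/web vip up --site demo --farm web --env dev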

 


Docker CLI

The same disclaimer as for the Platform CLI applies to the Docker CLI too. While the Docker CLI may certainly be used for all management tasks, generally speaking, for the sake of consistency its use is discouraged in favor of the platform management tools and workflows.

For more details on the Docker CLI and particular options, visit the following link:
https://docs.docker.com/engine/reference/commandline/cli/

Web Portals

The Web Portals are used for administrative, system management and development tasks. Often these portals are provided by existing service components, so the component separation on the diagram is rather logical.

Stats Visualization Portal

The Stats Visualization Portal is implemented using Grafana, an open source metric analytics and visualization suite. Grafana is preconfigured to use the Stats Database as its data source. Below is an example view of the resource monitoring dashboard.

Figure 10 - Stats Visualization and Analysis Portal

After this basic configuration, the Stats Visualization Portal can be accessed to view and analyze container resource usage. The Stats Visualization Portal may be set up to authenticate against Active Directory.

Depending on the assigned role, the user can only view measurements, or can also update and modify the monitoring dashboards and the database queries used for fetching monitoring data.

There is a handy script created to help set up the DSN and dashboards for the first time.


# running Grafana in container
$ docker run --name=grafana --hostname grafana.poc --detach=true --restart=always \
  --cpu-shares 50 --memory 50m \
  --publish=3000:3000 \
  registry.poc:5000/poc/grafana:2.6.0

# first time setup (make sure that setup script has proper credentials for the InfluxDB access)
# - creating Data Source Name (DSN) pointing to InfluxDB
# - setting up custom dashboards for visualizing container metrics

$ ./setup.sh
Usage: setup.sh <db_name> [<dashboards file mask>]
Example: setup.sh cadvisor './dashboards'

$ ./setup.sh cadvisor dashboards
Grafana: data source cadvisor created
Grafana: adding dashboard ./dashboards/ContainerStats.json
Grafana: dashboard ./dashboards/ContainerStats.json created

 

GitLab Portal

The GitLab Portal is provided by the GitLab application stack.

Figure 11 - GitLab Portal

GitLab is using the IdM service for authentication and authorization, so in order to access this portal one has to provide valid user credentials.

Depending on the assigned role, the user can access projects, browse code repositories or perform other tasks. It is the same portal presented to developers and administrative users, and the level of access depends on the user role granted in the LDAP user directory.


Sonar Portal

The Sonar Portal is provided by the SonarQube application stack. SonarQube is using the IdM service for authentication and authorization, so in order to access this portal one has to provide valid user credentials.

Depending on the assigned role, the user can access projects, browse source code or perform management tasks. It is the same portal presented to developers and administrative users, and the level of access depends on the RBAC setup and the user role granted in the LDAP user directory.

 Figure  12  -­‐  Sonar  Portal  

The authorization strategy is set up in such a way that a user is only able to access and perform operations on projects owned by the same LDAP Organization the user belongs to.

Platform Orchestration Portal

The Platform Orchestration Portal is implemented using the Jenkins automation framework along with various Jenkins plugins. The Portal and the Workflow Engine are provided by the same container; the separation here is rather logical.

In essence, Jenkins provides the orchestration engine, role-based authentication using LDAP integration, as well as the Web UI (and API) to trigger pre-configured tasks and workflows. Those workflows in turn use either the Platform CLI or component APIs to perform the required actions. For additional details about Jenkins tasks see the Management Tasks and Workflows chapter.

The Platform Orchestration Portal is set up to authenticate against Active Directory, so in order to access this portal one has to provide valid user credentials. Depending on the assigned role, the user can view reports and execute or modify management workflows.


Below is a screenshot showing the most common management tasks and workflows.

 Figure  13  -­‐  Platform  Orchestration  Portal  

 

Other Components

The following components are, strictly speaking, not Platform Services, but rather general-purpose platform building blocks encapsulating specific functionality.

Docker Engine

The vendor-provided definition: "The Docker Engine is a lightweight container runtime and robust tooling that builds and runs your container. Docker allows you to package up application code and dependencies together in an isolated container that shares the OS kernel on the host system. The in-host daemon communicates with the Docker Client to build, ship and run containers."

The vendor site should be consulted for more details. See the following links:
https://www.docker.com/products/docker-engine
https://docs.docker.com/engine/

It is worth mentioning that there is no hard dependency on the Docker Engine itself; it may be replaced with CoreOS rkt or LXD alternatives. Obviously, some platform components may need to be adjusted, but due to service encapsulation this effort is expected to be minimal. A deep dive into the Docker Engine ecosystem and its alternatives is out of scope of this paper; the only subject we will discuss in more detail is Storage Scalability in Docker.

Docker Containers

As with the Docker Engine, there is no direct dependency on Docker Containers specifically. Other alternatives may be used too, depending on specific platform requirements. From a platform perspective containers are seen as a mere application virtualization construct ensuring component "containment", i.e. packaging, security and isolation. The vendor site should be consulted for more details.


Platform Capacity Model

The proposed capacity model tries to optimize resource usage and to simplify calculations and predictive analytics. To this end we make several assumptions and impose several constraints:

• Each CPU core has a capacity of 1024 compute units³;
• The minimal RAM unit is 1MB;
• The optimal CPU:RAM ratio⁴ for web applications is 1:4, i.e. 1 CPU unit gets 4 RAM units;
• Several resource allocation profiles are defined: small, medium and large;
• Host resources can be fully utilized by combining different application sizes;
• The next profile size requires 4 times more resources than the given one;
• The host OS resource usage is steady, predictable and well known;
• Resources are not over-committed⁵ and an application is always guaranteed to get its allocated resources, even under full load;
• If an application demands more CPU resources than expected, it will get them as long as the resource requirements of other applications are met. In other words, the application will be provided at least the guaranteed resource amount, or even more;
• If multiple applications claim more than the guaranteed CPU resource amount simultaneously, weighted resource distribution is performed;
• Unlike CPU, memory resources are always limited by a specific amount that may be seen as a soft quota. If an application demands more memory than allotted, such memory may still be provided using the OS page swapping mechanism. The virtual memory amount per application is also limited and may be seen as a hard quota. If an application steps over this hard quota, it will be terminated by the kernel OOM handler.
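Under these assumptions, and taking the per-profile numbers implied by the host examples below (a small application gets 16 CPU shares and 64MB RAM, each next tier 4x more), the three profiles might translate into Docker resource flags roughly as follows. This is a sketch only; the image name is an example:

# illustrative resource profiles: 16 shares / 64MB (small), 64 / 256MB (medium), 256 / 1GB (large)
$ docker run --detach --cpu-shares 16  --memory 64m  registry.poc:5000/poc/web:latest   # small
$ docker run --detach --cpu-shares 64  --memory 256m registry.poc:5000/poc/web:latest   # medium
$ docker run --detach --cpu-shares 256 --memory 1g   registry.poc:5000/poc/web:latest   # large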

 

 Figure  14  –  Platform  Capacity  Model  

³ Actually, this is not an assumption but rather a factual number defined by the Linux kernel.
⁴ This ratio is based on practical experience and must be seen as a recommendation only, not a hard number.
⁵ The current model assumes that OS resource usage is very small and thus sharing the application resource pool with the OS won't produce any adverse effects. In case OS load becomes an issue, an additional constant increment may be planned for resources, in order to account for OS demands.


The picture above summarizes the assumptions and provides some resource allocation guidance for various hardware profiles. Thus, a VM with 4 CPUs and 16GB of RAM may host up to 256 small applications (websites), up to 64 medium applications or up to 16 large applications. Obviously, different combinations of various website sizes are possible too. In this case, the free CPU and RAM capacity may be calculated using the provided formula (a sketch of such a calculation appears after the list below).

Given a normal distribution of various application sizes, and assuming that applications (websites) can be moved or migrated between host systems, it is possible to perform optimal resource allocation without resource waste and complex predictive calculations.

There are two specific use-cases to be considered as part of a broader capacity management discipline:

• Application re-tiering: the application demands more resources than its current tier guarantees and its tier must be upgraded. Depending on the current host system capacity, there are several options:
  o There is free capacity to fit the next resource allocation tier. In this case only the resource allocations for the application container must be adjusted;
  o Defragmentation is needed, i.e. some other applications may be migrated to a different host system to free up the resources required on the given host system for increasing the application tier;
  o If no migration is possible – see the next point – an additional host system must be provisioned.
• Host Capacity Extension: there is no free capacity on the available host system(s) for provisioning the next application. In this case, additional host system(s) must be provisioned.
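As for the free-capacity formula mentioned above, here is a minimal sketch of the arithmetic, assuming the per-profile allocations used earlier (small = 16 CPU shares / 64MB, medium = 64 / 256MB, large = 256 / 1GB); the population numbers are made up:

# hypothetical population of a 4-CPU / 16GB host
$ N_SMALL=100; N_MEDIUM=20; N_LARGE=4
$ echo "free CPU units: $((4*1024 - N_SMALL*16 - N_MEDIUM*64 - N_LARGE*256))"
free CPU units: 192
$ echo "free RAM (MB):  $((16*1024 - N_SMALL*64 - N_MEDIUM*256 - N_LARGE*1024))"
free RAM (MB):  768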

 Things  to  keep  in  mind:  

• The storage (volume size, IOPS) considerations have been deliberately skipped in the above capacity model, since they are not specific to web containers. Nonetheless, this subject is very important; it is driven more by application sizing agreements and requirements than by the platform itself;
• The capacity model described above considers the most generic use case, when hosted applications can only scale up by choosing one of the pre-defined resource profiles, not scale out;
• Due to the previous point, it is further assumed that the resource allocation decision is performed either manually or semi-manually, or that the orchestration service is aware of the different container sizes and deploys them accordingly;
• In case an application can scale out, it is recommended to standardize on a specific container size or resource allocation profile for optimal resource allocation and to perform linear application scale-out, i.e. deploy more containers hosting application instances, rather than having various container sizes and resource profiles;
• Generally speaking, the scale-out scenario is recommended and preferred over the scale-up approach, since the latter is limited by a single host's resource capacity. Unfortunately, most applications were not designed for scaling out, leaving the scale-up scenario as the sole option;
• Obviously, the scale-out scenario is preferred for orchestration engines like Kubernetes, since it allows for automatic resource ramp-up and ramp-down as application load changes. Such auto-scaling mechanisms have been explored; however, they are out of scope of this paper.


Platform Security

Platform security is considered an integral part of the platform design. Therefore, wherever possible, the design strives to accommodate specific security controls and practices. This is the list of specific security mechanisms that have been explored and implemented:

• Centralized Identity and Access Management (IdM);
• Role Based Access Control (RBAC);
• Read-only container file-system;
• Key-based content validation;
• Linux seccomp profiles;
• Linux SELinux policies;
• Linux AppArmor profiles;
• Linux user namespace remap;
• Container and image audits.

Unfortunately, not all security mechanisms play well together. For example, user namespace remap cannot be used together with read-only container file-systems. There are some limitations when using SELinux policies, and seccomp profiles are a rather young feature that will mature over time. Different Linux distributions prefer using either AppArmor or seccomp profiles.

We won't review all security mechanisms in detail, since in-depth security research is out of scope of this paper. To get up to speed on the existing security mechanisms, however, it is recommended to check the following resources and articles:

• The Understanding Docker Security and Best Practices blog post;
• Project Nautilus: https://blog.docker.com/2016/05/docker-security-scanning/;
• Clair (https://github.com/coreos/clair), an open-source project for static analysis of vulnerabilities in containers;
• Docker Engine Security Improvements: https://blog.docker.com/2016/02/docker-engine-1-10-security/;
• A comprehensive overview of various container security tools and frameworks: https://www.alfresco.com/blogs/devops/2015/12/03/docker-security-tools-audit-and-vulnerability-assessment/.

   

User Namespace Remap

Essentially, a user namespace is a special Linux kernel mechanism allowing containers to have their own root user, completely separate from the host root account. For example, the root user in a container would be able to manage root-owned files in the container, act as any user in the container, and manage its own network interfaces and some of its mount-points (restrictions apply), while at the same time being "mapped" or "translated" to, say, user "container" with UID 1000 on the host system. User namespaces were introduced as early as Linux 3.5 and are considered stable starting with Linux 4.3.

The user namespace remap is a relatively new, yet promising feature, and below we'll provide some configuration details and examples. Since user namespace mapping is disabled by default, first of all we need to enable this kernel feature:


# adding kernel parameter and rebooting
$ grubby --args="user_namespace.enable=1" --update-kernel=/boot/vmlinuz-3.10.0-327.28.3.el7.x86_64
$ reboot

# verify user namespace enabled
$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-327.28.3.el7.x86_64 root=/dev/mapper/rhel_rheltest2-root ro crashkernel=auto rd.lvm.lv=rhel_rheltest2/root rd.lvm.lv=rhel_rheltest2/swap rhgb quiet LANG=en_US.UTF-8 user_namespace.enable=1

Next, we need to set up some user and group IDs as well as their mapping rules:

# creating container-root user
$ groupadd --gid 100000 dc-root
$ useradd --system --uid 100000 --gid 100000 --home-dir / --no-create-home \
   --shell /bin/false --comment "docker root user" dc-root

$ cat <<EOT >/etc/subuid
dc-root:100000:65535
EOT
$ cat <<EOT >/etc/subgid
dc-root:100000:65535
EOT

Finally, we need to enable user namespace remap for containers, either by editing the systemd unit override:

$ cat <<EOT >/etc/systemd/system/docker.service.d/override.conf
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon --storage-driver=overlay --userns-remap="dc-root"
EOT

Or, as the more preferred approach, by changing the Docker configuration file:

$ cat <<EOT >/etc/docker/daemon.json
{
   "debug": true,
   "selinux-enabled": false,
   "storage-driver": "overlay",
   "userns-remap": "dc-root",
   "live-restore": true
}
EOT

Now we should have the following mapping in place:

User:           root      none      web       redis
-----------------------------------------------------
Host UID:       100000    101001    101002    101003
Container UID:  0         1001      1002      1003


Thus, the user web will have UID 1002 inside the container, while from the host perspective it will have UID 101002. Now, as the last touch, we can hide this complexity to an extent by creating users with the corresponding IDs both on the host system and inside the container image:

# On the Host
$ groupadd --gid 101001 none  && useradd --system --uid 101001 --gid 101001 --home-dir / --no-create-home --shell /bin/false --comment "Nobody" none
$ groupadd --gid 101002 web   && useradd --system --uid 101002 --gid 101002 --home-dir / --no-create-home --shell /bin/false --comment "Web User" web
$ groupadd --gid 101003 redis && useradd --system --uid 101003 --gid 101003 --home-dir /var/lib/redis --no-create-home --shell /bin/false --comment "Redis User" redis

# In the Container
$ groupadd --gid 1001 none  && useradd --system --uid 1001 --gid 1001 --home-dir / --no-create-home --shell /bin/false --comment "Nobody" none
$ groupadd --gid 1002 web   && useradd --system --uid 1002 --gid 1002 --home-dir / --no-create-home --shell /bin/false --comment "Web User" web
$ groupadd --gid 1003 redis && useradd --system --uid 1003 --gid 1003 --home-dir /var/lib/redis --no-create-home --shell /bin/false --comment "Redis User" redis

For additional details and examples, please see the following links:
http://rhelblog.redhat.com/2015/07/07/whats-next-for-containers-user-namespaces/#more-1004
http://goyalankit.com/blog/2016/06/25/user-namespace-in-red-hat-enterprise-linux-7-dot-2/
https://blog.yadutaf.fr/2016/04/14/docker-for-your-users-introducing-user-namespace/
https://docs.docker.com/engine/reference/commandline/dockerd/#/daemon-user-namespace-options

Although user namespace mapping is a great feature, it does have (at the time of writing) several known limitations:

• Sharing PID or NET namespaces with the host (--pid=host or --network=host);
• A --read-only container file-system (this is a Linux kernel restriction against remounting, with modified flags, a currently mounted file-system when inside a user namespace);
• External volume or graph drivers which are incapable of using daemon user mappings;
• Using the --privileged mode flag on docker run (unless also specifying --userns=host);
• In general, user namespaces are an advanced feature and require coordination with other capabilities. For example, if volumes are mounted from the host, file ownership has to be pre-arranged if the user or administrator wishes the containers to have the expected access to the volume contents;
• Finally, while the root user inside a user-namespaced container process has many of the expected admin privileges that go along with being the super-user, the Linux kernel has restrictions based on its internal knowledge that this is a user-namespaced process. The most notable restriction that we are aware of at this time is the inability to use mknod. Permission will be denied for device creation, even as container root inside a user namespace.
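A quick way to confirm the remap is active is to check how a container root process appears on the host. The sketch below is illustrative only: the image name follows this project's registry convention and the PID in the output is arbitrary:

# with userns-remap active, container root should show up on the host under the remapped UID range
$ docker run --detach --name userns-test registry.poc:5000/poc/debian-slim sleep 300
$ ps -o uid,pid,cmd -C sleep
   UID   PID CMD
100000  4242 sleep 300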

 Things  to  keep  in  mind:  

• As it stands now, one needs to choose between using a read-only container file-system and using user namespaces. Hopefully, this restriction will be lifted soon and both mechanisms will be usable simultaneously;
• At the time of writing, Red Hat does not support user namespaces yet and considers them an experimental feature.

 


Docker Bench for Security

Container security has been recognized by the community as one of the biggest hurdles, and at the same time pain points, for companies adopting containers. As a result, a lot more attention has been paid by vendors and developers to this subject. A number of security improvements have been proposed, and most of them are summarized in the CIS Docker Benchmark. The latest version: https://benchmarks.cisecurity.org/tools2/docker/CIS_Docker_1.12.0_Benchmark_v1.0.0.pdf

There is a handy tool hosted in the Docker project repo: https://github.com/docker/docker-bench-security. Docker Bench for Security is a script that checks for dozens of common best practices around deploying Docker containers in production. The tests are all automated and are inspired by the CIS Docker Benchmark. The tool itself is regularly updated for every Docker release.

It is recommended to perform regular platform audits using this benchmarking tool, perform risk analysis, and plan and implement remediation actions as described in the CIS Docker Benchmark paper. This exercise may be added to the list of security management tasks regularly performed on the container platform.
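A typical invocation, following the project's README (running the script directly on the Docker host):

# fetch and run the benchmark against the local Docker daemon
$ git clone https://github.com/docker/docker-bench-security.git
$ cd docker-bench-security
$ sudo sh docker-bench-security.sh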

Things to keep in mind:

• The CIS Security Benchmark is not a mandatory prescription. It is a guide and a set of checking rules helping to improve overall deployment security. Therefore, it is not mandatory to implement every single control without performing risk analysis;
• The same security risk may be addressed by multiple controls, and it is enough to implement one of them, not all.

 

Web Application Security

Web application security has always been a tricky subject and there is no ultimate recipe for making applications secure. For example, OWASP (https://www.owasp.org) lists the following controls:

• Verify for Security Early and Often
• Parameterize Queries
• Encode Data
• Validate All Inputs
• Implement Identity and Authentication Controls
• Implement Appropriate Access Controls
• Protect Data
• Implement Logging and Intrusion Detection
• Leverage Security Frameworks and Libraries
• Error and Exception Handling

Unfortunately, those controls are mostly concerned with the application code and its data management, which are largely outside the control of a hosting platform that provides a deployment placeholder and reliable runtime. Using code analyzers, secured platform components and security best practices for configuration can definitely improve overall application security; generally speaking, however, it won't guarantee sufficient protection. This calls for another security layer – the Web Application Firewall, also called WAF.


The OWASP definition: A web application firewall (WAF) is an application firewall for HTTP applications. It applies a set of rules to an HTTP/S conversation. Generally, these rules cover common attacks such as cross-site scripting (XSS) and SQL injection. While proxies generally protect clients, WAFs protect servers. A WAF is deployed to protect a specific web application or set of web applications.

Another benefit of using a WAF is that it can protect against both known and unknown security attacks, as well as shield against DoS and DDoS type attacks that the application itself and the infrastructure may not be able to cope with. Often a WAF is combined with a content distribution (CDN) solution; therefore, we'll refer to a CDN/WAF combination, since most considerations apply to both of them.

During the POC project, the VDMS CDN/WAF solution has been tested and validated. Generally speaking, adding a WAF or CDN must be completely transparent to the web application. However, there are a couple of points that require special attention:

• WAF and CDN act as a reverse proxy and forward application requests on behalf of the user, so additional steps are required to identify the "Real User IP". Usually it is supplied in custom HTTP headers (see the configuration sketch after this list);
• If a CDN or CDN/WAF combination is caching application content or even dynamic pages, those caches must be considered when managing the application lifecycle, for example flushing and re-populating caches or parts of them;
• Sometimes, for optimal performance and best integration, the application must be aware of the external CDN/WAF layer and be adjusted accordingly. There is a term coined by a major CDN vendor – the application must be "akamized";
• Since most WAF solutions use rule-based scoring systems, sometimes the WAF must be taught and adjusted for specific application behavior, in order to avoid reporting false positives and blocking valid application requests and activities;
• The WAF/CDN solution can also take care of SSL termination or bridging, depending on specific security requirements. In the latter case, when traffic is re-encrypted again, a self-signed CA may be used;
• Although most CDN/WAF solutions are managed via the corresponding vendor portals, using APIs for CDN/WAF service management is becoming more and more ubiquitous. These API calls may be integrated into the platform deployment processes, thus simplifying WAF/CDN management and making it more transparent for a platform user;
• Last, but not least, using a CDN/WAF solution will have a significant positive impact on application capacity, so it must be considered by the application and platform capacity management processes.
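As an illustration of the "Real User IP" point above, here is a minimal NGINX sketch using the standard real-ip module. The trusted egress range and the header name are assumptions; both depend entirely on the CDN/WAF vendor:

$ cat <<EOT >/etc/nginx/conf.d/real-ip.conf
# trust the CDN/WAF egress range (example range) and restore the client IP
set_real_ip_from 203.0.113.0/24;
real_ip_header X-Forwarded-For;
real_ip_recursive on;
EOT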

 Things  to  keep  in  mind:  

• The VDMS solution worked smoothly with Drupal-based sites, with a couple of small exceptions when AJAX calls were scored as suspicious and eventually blocked by the WAF. This broke some website functionality and required adjusting some scoring rules;
• The VDMS solution provides a set of APIs; however, at the time of writing, SSL certificate management via these APIs is not yet possible;
• Using a CDN/WAF solution may also simplify SSL certificate setup and management procedures. It is expected that most, if not all, hosted applications will use SSL or TLS for transport security;
• The performance tests did not use a CDN or WAF and were all executed in the same subnet. Obviously, using a CDN can offload most static file requests or even some dynamic page requests.


Platform Change Management

Often, when change and release management is discussed, it relates to Customer applications or other assets deployed on the platform. The platform itself, however, has its own lifecycle and its own requirements for change and release management. Although specific development processes and practices may vary, it is recommended to implement at least the following environments for platform change and release management:

• Development Environment – this is where all developments, code updates and other platform changes are performed and tested initially;
• Staging Environment – also known as the Integration or QA environment. This is where code releases and platform updates are deployed, tested and validated prior to pushing them to production. Optimally, this environment mirrors the production environment from a setup and infrastructure perspective;
• Production Environment – this is where Customer applications are hosted.

Additionally, there is a need for a fully automated Continuous Delivery workflow that performs platform deployment given the following parameters (an illustrative trigger is sketched below):

• Platform Release Identifier or Version Tag;
• Target Specification or Location Identifier.
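Since the Platform Orchestration Portal is Jenkins-based, such a workflow could be triggered remotely via the standard Jenkins remote-build API. The job name, host name, credentials and parameter names below are examples only:

# trigger the Continuous Delivery workflow with a release tag and a target
$ curl -X POST --user deployer:API_TOKEN \
   "https://jenkins.poc/job/platform-deploy/buildWithParameters" \
   --data-urlencode RELEASE_TAG=1.4.2 \
   --data-urlencode TARGET=staging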

Last but not least, there is a need for a Test Suite that is executed against the deployed platform to ensure that the platform is up and healthy and its services work as expected.

Things to keep in mind:

• Sometimes, especially during major application version upgrades, the data structure format or configuration is not compatible with the new version and thus requires migration or a data format (schema) upgrade. Such upgrade or ETL procedures must be a part of the platform upgrade roll-out;
• For quick deployment roll-back it is recommended to use a copy-upgrade approach for the application data stored on persistent volumes, rather than updating data in place;
• Although most dependencies between platform components are explicit and well known, there are also some implicit or indirect dependencies. Therefore, when changing one platform component, it is not enough to validate only the dependent components. Optimally, a complete end-to-end test must be performed to ensure that all components work as expected;
• Due to platform modularity and loose coupling, pretty much any component can be replaced with an external service. This simplifies platform support; however, it makes platform change management even more complicated, since external change management schedules and dependencies must be accounted for. Thus, one may choose to use GitHub or BitBucket as the SCM service instead of a locally hosted GitLab instance. While this addresses a number of questions related to component support, changes to the GitHub APIs may at the same time result in an unexpected need to upgrade dependent platform components or processes;
• The container ecosystem is still evolving quickly. This results in new features being added to address existing shortcomings, changing application architecture, or breaking API changes;
• Depending on the platform architecture, an upgrade operation may require maintenance time windows, partial capacity or function downgrade, or a complete platform shutdown.


Drupal Hosting

The platform does not put limits on the applications that can be deployed. The only real constraint can be stated as: whatever may be packaged in a container can be hosted on this platform. Drupal CMS has been chosen to demonstrate the hosting capabilities and a DevOps-style approach to application lifecycle management, as well as code deployment and content publishing workflows.

 Figure  15  -­‐  Drupal  CMS:  Configuration  Portal  

The end goal is to deliver a Drupal CMS as a Service hosting model, where the Customer is only responsible for the so-called creative part – design and content – whereas the Service Provider is responsible for the hosting platform, from the Drupal instance down to the infrastructure and application lifecycle management.

Drupal CMS itself is a PHP application that uses the so-called LAMP (Linux, Apache, MySQL, PHP) or LEMP (Linux, NGINX, MySQL, PHP) application stack as its runtime.


Drupal  Site  Components    

 Figure  16  -­‐  Drupal  Site  Components  

The Drupal application is built out of multiple components:
• Drupal Core – the standard release of Drupal. The Drupal core installation can be seen as a bare-bones setup that can be extended per specific project needs and requirements;
• Core Modules – the set of modules included in the standard Drupal release to provide basic CMS features: user account registration and maintenance, menu management, RSS feeds, taxonomy, page layout customization and system administration;
• Contrib Modules – the set of modules provided by the Drupal community and 3rd parties. They offer additional or alternate features such as image galleries, custom content types and content listings, WYSIWYG editors, private messaging, third-party integration tools, integration with enterprise applications, and more. As of November 2016 the Drupal website lists more than 35,800 free modules;
• Libraries – parts of code shared by multiple modules or even sites, which are not included in the package or distribution for licensing, maintenance or other reasons;
• Themes – define the Drupal site look and feel. They use standardized formats that may be generated by 3rd-party theme design engines. Some templates use hard-coded PHP; Drupal themes utilize a template engine to further separate HTML/CSS from PHP;
• DB Objects – Drupal CMS is heavily dependent on its database. The DB Objects are used to store units of content, configuration and customization;
• File-system Objects – besides its PHP code and often session-state objects, Drupal CMS stores media assets, caches, compressed files and temporary files as file-system objects.


These components may be grouped as:
• Drupal Base (or Distribution) Components – a common-denominator set of components included in a specific Drupal distribution;
• Site Specific (or Custom) Components – each website may require further customization and a unique set of features and functions which are not provided by the distribution.

One of the most important objectives is to minimize the number of custom components, which means that the Drupal distribution has to provide the majority of the features required for a modern enterprise website. This way Drupal sites will use a common, secure and tested code-base and differ from each other only in themes, customization and content. This in turn allows simplifying and standardizing website sizing, development, maintenance and management processes, and getting closer to the end goal – industrialized website management and delivering Drupal CMS as a Service.

Drupal Container Components

One of the design paradigms is to keep containers immutable (read-only); all mutable application objects must be kept outside of the container itself, on container volumes or in the database.

 

Figure  17  -­‐  Web  Container  Components  


This design paradigm has been applied to the web containers. Obviously, for a number of websites using similar LAMP or LEMP runtime components, it is possible to define a common denominator of runtime components and package them in a container image. All website-specific objects – configuration, content, assets, transient and temporary files, session state – are stored outside of the container, either on file-system volumes or in the database. Let's take a closer look at the components included in the web container image:

• Base OS Image – although pretty much any Linux-based OS can be used here, the number of viable options is rather limited, each having its own up- and downsides. See the Base OS Image section for more considerations on this subject. For this particular project either Alpine Linux or a custom-built and hardened "Debian 8.1 Slim" distribution has been used;
• Process Supervisor – another, almost religious, subject: one or many processes per container. Again, please refer to the Base OS Image section for more details. For this project the S6 process supervisor has been used to provide init-process functionality and manage subordinate process life-cycles in a container-aware manner;
• Web Server – the Apache web server has the longer track record; however, NGINX is much more lightweight and efficient. The images have been built with both httpd and nginx binaries, and the results speak for themselves;
• PHP-FPM – in order to make HTTP request processing more efficient, the PHP engine has been separated from the HTTP pipeline. PHP requests are served by an application server component accessed via the FastCGI Process Manager (FPM) reverse proxy protocol;
• Redis – the fast in-memory key-value store implemented by Redis can speed up web applications (page rendering times) by orders of magnitude. It can cache DB query results, content blocks, session objects, etc.;
• Cron – applications packaged in a container may, and often do, require the OS-provided task scheduler. In our particular case both PHP and Drupal rely upon cron for certain cleanup, validation and other maintenance activities;
• Syslog – applications packaged in a container often rely upon system logging mechanisms and send log messages via the /dev/log socket. In our particular case the cron daemon sends log messages via syslog facilities;
• Smtp-Proxy – applications packaged in a container often need to send emails using the SMTP protocol. There are several ways to make this possible:
  o Add an MTA package to every container. This is the most straightforward solution, but it goes against the stackable applications paradigm. Also, sending emails is not a primary function of a web container;
  o Add a simple SMTP proxy or gateway that passes SMTP commands from within the container to the proper mail relay. This approach has been implemented, and all messages are transparently passed to the SMTP relay;
• Container Volumes – as shown above, several container volumes have been defined, each serving a specific purpose (an illustrative container launch follows this list):
  o ROOT – the web application root folder; the file-system location the web application is deployed to;
  o DATA – web site content, media assets and persistent file-system objects. These files are guaranteed to be preserved between web application restarts;
  o CERT – SSL/TLS certificates (server.crt and server.key) required to serve HTTPS requests;
  o LOGS – web container log files are stored here, including, but not limited to, web server, PHP, Redis, syslog and cron log files;
  o TEMP – transient and temporary file-system objects are stored here. These files are not guaranteed to be preserved between web application restarts.
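Putting the read-only paradigm and the five volumes together, a launch might look roughly as follows. The host paths, mount points and image name are examples only:

# immutable web container with all mutable state on dedicated volumes
$ docker run --detach --name site1 --read-only \
   --volume /srv/site1/root:/var/www/html \
   --volume /srv/site1/data:/var/www/data \
   --volume /srv/site1/cert:/etc/ssl/site \
   --volume /srv/site1/logs:/var/log/site \
   --volume /srv/site1/temp:/tmp \
   registry.poc:5000/poc/nginx-php-fpm:latest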


 Things  to  keep  in  mind:  

• The Apache server, even with the most modern and efficient Event MPM, can't cope with the number of requests that NGINX handles with ease. Eventually, one is presented with a trade-off: either use the more "standard" Apache server, providing a huge number of modules and extensions, or choose the relatively new NGINX, though in the latter case support skills may be rather scarce;
• The number of threads processing web requests, for both the web server and php-fpm, is dynamic and can be adjusted for various container sizes and resource allocation models;
• Both options – keeping a dedicated Redis instance inside the website container and making Redis a platform service – can be considered viable choices, each having its own up- and downsides. The dedicated Redis instance model allows reaping quick benefits with a relatively simple deployment. The platform service approach allows much better economies of scale and resilience in case of failure, at the price of a somewhat more involved deployment and management overhead;
• The web container initialization script checks whether valid SSL certificates have been supplied. If the SSL certificates are missing, a new set of self-signed SSL certificates is generated on the fly, meaning that the web container is able to serve HTTPS requests whether site certificates have been provided or not (a sketch of such a fallback follows below).
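A minimal sketch of what such an init-time fallback might look like; the certificate paths follow the CERT volume example above and the subject field is a placeholder:

# generate a self-signed pair only when no valid certificates were supplied
if [ ! -s /etc/ssl/site/server.crt ] || [ ! -s /etc/ssl/site/server.key ]; then
   openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
      -subj "/CN=$(hostname -f)" \
      -keyout /etc/ssl/site/server.key \
      -out /etc/ssl/site/server.crt
fi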

 

Drupal Container Performance

Using containers and enforcing resource constraints requires even more thorough consideration of the application architecture, its scalability and configuration. The following chapters provide an overview and useful background for web application sizing and performance.

Sizing Considerations

Historically, PHP applications have been running on the LAMP stack, which stands for Linux, Apache, MySQL and PHP. Usually, the PHP runtime is compiled as a shared object and loaded as an Apache module. This allows Apache to both serve static files from the file-system and process so-called dynamic pages, i.e. scripts created in PHP. This solution is very elegant and simple to set up and manage; however, it comes at a price.

A single Apache process may grow quite large, significantly increasing the web server memory footprint. Truth be told, the biggest part of the process image size is contributed by VSS pages shared between processes, so the math is not really linear here. More details on finding a process's memory footprint are provided in the Process Size Conundrum chapter.

A much bigger issue is that the PHP engine is, generally speaking, not thread-safe, due to 3rd-party libraries and extensions. This leaves no other option but using the Prefork MPM. In other words, one HTTP connection will be served by one Unix process, and the number of concurrent connections your server may potentially serve is capped by the amount of RAM allotted to the web server. The rough calculation looks like the following:

ConcurrentConnections = (ServerMem – OsMem) / HttpdMem

For example, for a web server with 4GB RAM (of which the OS uses 2GB) and an Apache process size of ~64MB, the number of concurrent HTTP connections will be (4G – 2G)/64M = 32.


We should remember that, if no CDN or other caching technology is used, the same web server will serve both HTML pages and static resources such as images, scripts, CSS files, etc. Assuming that a modern web browser sends 3-4 concurrent HTTP requests when accessing a web page, in reality ~8 browser sessions can completely saturate the web server connection pool.

In order to increase the number of concurrent users (or connections), you can either grow the web server RAM or decrease the httpd process size. The latter option is not going to help a lot: cutting here and there may help a bit, but it won't be a game changer. This leaves the only real options: growing the web server RAM or adding more web servers to spread the load.

This is where FastCGI comes to help. The basic idea is to let Apache do what it does well – serving static content – while requests for dynamic content are proxied using the FastCGI protocol to a separate server running the PHP engine, a PHP application server. This approach kills multiple birds with one stone, since the PHP engine is no longer a part of Apache:

• Apache can finally use the multi-threading instead of the multi-process model, thus significantly reducing resource usage;
• Apache can use other, more efficient MPMs, for example the Event MPM;
• The application server can be deployed and scaled independently from the web server;
• The web server life-cycle management and maintenance are simplified.

For PHP specifically, moving forward we'll be using PHP-FPM. A quote from the vendor site: PHP-FPM (FastCGI Process Manager) is an alternative PHP FastCGI implementation with some additional features useful for sites of any size, especially busier sites. These features include:

• Adaptive process spawning
• Basic statistics, ala Apache's mod_status
• Advanced process management with graceful stop and start
• Start workers with different uid/gid/chroot/environment/php.ini (replaces safe_mode)
• Stdout and stderr logging
• Emergency restart in case of accidental opcode cache destruction
• Accelerated upload support
• Support for a "slowlog" – logging long-running requests
• Enhancements to FastCGI, such as fastcgi_finish_request() – a special function to finish a request and flush all data while continuing to do something time-consuming, e.g. video converting, stats processing, etc.

With PHP-FPM deployed on the same server, the calculation becomes a bit more complex:

ServerMem – OsMem = Nweb * HttpdMem + Nphp * PHPMem
Nweb > Nphp

This equation has two variables, Nweb and Nphp (the number of web and PHP server instances, correspondingly), and therefore has multiple solutions. Generally speaking, we can assume that the number of web server instances must be higher than the number of PHP server instances. The more precise constraint will depend on the static-to-dynamic request ratio processed by the web server, which will in turn depend on the particular application architecture and deployment.


Another caveat: as soon as we move away from the Prefork MPM and use the Worker or Event MPM, the number of connections depends on the number of threads rather than the number of web server processes. This leads us to a more accurate formula:

ServerMem – OsMem = Nweb * HttpdMem + Nweb * Tweb * ThreadMem + Nphp * PHPMem
Nweb = Nphp * StaticDynamicRatio

Now, let's see some practical examples of using this formula. Let's assume the following inputs are provided:

• The web server is equipped with 4GB RAM;
• The OS and other essential services use ~2GB RAM;
• The httpd process size (HttpdMem) is 4MB;
• The php process size (PHPMem) is 16MB;
• The httpd thread size (ThreadMem) is 256KB, used for the stack and other runtime structures;
• The httpd is configured with the Event MPM:
  o The MPM is set up to use 4 httpd processes (Nweb);
  o 256 threads per process (Tweb).

Let's put those inputs into our formula:

4GB – 2GB = 4 * 4MB + 4 * 256 * 256KB + Nphp * 16MB

This reduces to 2048MB = 16MB + 256MB + Nphp * 16MB and provides us with the value for the number of PHP processes: Nphp = 111.

This number is going to be the limit for concurrent PHP requests that our web server will be able to handle. Considering queuing mechanisms and the fact that the number of available threads on the web server side is much higher (1024), the real-life constraint is going to be a bit higher, somewhere around 128-160+ concurrent requests, depending on PHP script complexity and script processing time. If parts of the page and SQL query results come from the cache, this number can be higher yet. Assuming that we have lots of "free" web server threads to serve static resources, our web server should easily be able to handle 150+ web browser sessions and up to 1000 concurrent HTTP hits. At this point we'll possibly start seeing other bottlenecks and constraints, such as network and disk subsystem throughput limits.

This number looks impressive for such a moderately sized web server, but it can be improved even further. The Apache web server is very capable; however, it was designed when the Internet was young, and Apache is not always able to cope with the demands of the cloud age. In other words, Apache failed the C10K test, see https://en.wikipedia.org/wiki/C10k_problem.
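As a quick sanity check, the same arithmetic expressed in shell (all sizes in MB, with the 256KB thread size written as a quarter of a MB):

$ N_WEB=4; T_WEB=256
$ echo $(( (4096 - 2048 - N_WEB*4 - N_WEB*T_WEB/4) / 16 ))
111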

Apache vs. NGINX

It's time to welcome NGINX! A quote from the vendor site: NGINX is one of a handful of servers written to address the C10K problem. Unlike traditional servers, NGINX doesn't rely on threads to handle requests. Instead it uses a much more scalable event-driven (asynchronous) architecture. This architecture uses small, but more importantly, predictable amounts of memory under load. Even if you don't expect to handle thousands of simultaneous requests, you can still benefit from NGINX's high-performance and small memory footprint. NGINX scales in all directions: from the smallest VPS all the way up to large clusters of servers.


NGINX supports the FastCGI protocol and integrates well with PHP-FPM. This makes NGINX an ideal candidate for replacing Apache in our web container. NGINX is very lean, so PHP gets even more memory. Some performance tuning guides may be found in the following articles:
http://drupalwxt.github.io/performance/apache-fpm/
https://www.howtoforge.com/configuring-your-lemp-system-linux-nginx-mysql-php-fpm-for-maximum-performance

Performance Test

Let's run some real tests using apache-bench. We'll test the performance of the httpd-php-fpm and nginx-php-fpm containers running the same Drupal 7.50-based web application. Containers are limited to 64MB of memory and 1GB of swap.

                              Apache    Apache    NGINX     NGINX     NGINX     NGINX
                              50 TCP    60 TCP    50 TCP    100 TCP   100 UDS   150 UDS
Concurrency Level             50        60        50        100       100       150
Test duration (sec)           95.88     102.46    70.21     71.99     70.08     72.44
Complete requests             100K      100K      100K      100K      100K      100K
Failed requests               0         0         0         0         0         0
Write errors                  0         649       0         0         0         0
Total Transferred (MB)        796.13    791.31    792.98    793.65    792.98    795.15
HTML Transferred (MB)         757.31    752.63    757.31    757.31    757.31    757.31
Requests/sec (mean)           1043.01   975.96    1424.26   1389.14   1427.03   1380.42
Time per request (ms, mean)   47.94     61.48     35.11     71.99     70.08     108.66
Time per request (ms, max)    274       1231      122       177       169       298
Transfer rate (MB/s)          8.30      7.72      11.29     11.02     11.32     10.98
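For reference, a plausible apache-bench invocation for one of the rows; the site URL is an example:

# 100,000 requests at a concurrency of 50 against the containerized site
$ ab -n 100000 -c 50 http://site1.poc/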

Figure 18 - Stress Test Results
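For reference, each column corresponds to a run along the following lines. A sketch only: the image tag is the one used later in this paper, while the test URL and container name are placeholders; the request count, concurrency and resource limits match the table above:

# fresh container per run, to avoid caching effects
$ docker run -d --name web-test --memory 64m --memory-swap 1g \
    registry.poc:5000/poc/nginx-php-fpm

# 100K requests at concurrency 50 against a dynamic Drupal page
$ ab -n 100000 -c 50 http://10.169.64.232:8080/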

What can we conclude from these test results?

• Apache 50 TCP: the httpd-php-fpm container coped with 50 concurrent dynamic page requests quite well. It served 1000+ requests per second, with <50ms response time on average. Quite impressive for a little web server with 64MB of RAM, right? The web server provides rock-solid service and is fully capable of handling this load over an extended time;

• Apache 60 TCP: let's increase the load and run the same test against a freshly created container (to avoid caching effects) with 60 concurrent connections. This is where we step close to the limit. Toward the end of the test the server is overwhelmed by the load: it starts throwing errors, and the response time jumps 20 times higher, up to 1200ms. What's happening inside the container? As the number of connections grows, PHP-FPM cannot cope with the load and keeps adding more and more threads, and memory usage grows with them. At some point, when even the 1GB of virtual memory is used up, the OOM handler kills some processes. That explains the write errors reported by apache-bench: listeners disappearing in the middle of an HTTP request. Intensive swap usage and process creation lead to significantly increased latency and response times. It makes no sense to increase the load further; we have already reached the bottleneck;

• NGINX 50 TCP: the nginx-php-fpm container coped with 50 concurrent dynamic page requests very well. It served 1400+ requests per second, with <36ms response time on average. An impressive +40% increase in throughput. Looking at the overall container load and behavior, it's clear that this container can handle a lot more. Let's increase the load;

• NGINX 100 TCP: the container is still stable, handling the load with grace and delivering very similar figures. The number of requests per second dropped slightly, just below 1400. The response time doubled, but stayed below 72ms. Let's see if we can squeeze a little more performance out of it. NGINX and PHP-FPM use the TCP protocol for inter-process communication. Since both processes live inside the same container, we can use UDS (Unix Domain Sockets) instead and avoid the overhead of the TCP stack (see the configuration sketch after this list). We should not expect a miracle here, but it is still worth a try;

• NGINX 100 UDS: the same test as before, just using UDS instead of TCP. As expected, the container keeps up with the load without even breaking a sweat. The performance figures improved just a tad: the number of requests per second is again above 1400, and the request processing time is ~70ms. Again, as expected, switching from TCP to UDS gives a very small, barely noticeable speed bump. Let's increase the load even more;

• NGINX 150 UDS: an incredible result. A single little container with just 64MB of RAM can serve 150 concurrent PHP page requests! Obviously, the response time has increased, up to 300ms, but the container handles the load, delivering consistent and reliable results. This concludes the test. We won't explore the capacity of this container in greater detail, since that goes well beyond the scope of this paper. The main objective was to show that NGINX can be, and is, a great replacement for the Apache web server in a web hosting scenario.
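Switching the FastCGI transport from TCP to UDS touches one line on each side. A minimal sketch; the socket path and the web user name are assumptions:

; PHP-FPM pool: listen on a socket instead of a TCP port
listen = /var/run/php-fpm.sock
listen.owner = web            ; socket must be accessible to the nginx worker user (name assumed)
listen.group = web

# nginx: point fastcgi_pass at the same socket
fastcgi_pass unix:/var/run/php-fpm.sock;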

Things to keep in mind:

• To those of you reading with a skeptical face: yes, this test barely scratches the surface and by no means can be seen as a thorough exploration of the performance of web containers, or even of the particular web servers. The whole point was to show that under equal conditions NGINX delivers better throughput than Apache;

• You have surely noticed that we are much more concerned with RAM constraints than with CPU limits. The reason is simple. If the CPU resource is scarce and the application gets fewer CPU cycles, or is scheduled onto the CPU a bit later, this will most probably not break the application. Yes, an operation will take longer and latency will certainly increase, but the application still does its job. Memory is a different story: once an application has used up both physical and virtual memory, the OOM handler kicks in and the application is killed. End of story.

• Ideally, we should consider all resources, including I/O capacity and storage throughput, and the number of CPUs and threads; however, just to keep things simple, we are concentrating on the most important variables;

• Are these test results valid at all? We do have a Redis cache built into the web containers. So, arguably, the test is doing nothing more than bashing the cache... While this is a fair concern, it is not an issue in this particular case, and here is why. Drupal depends heavily on its DB. Without Redis this performance test would amount to serious DB server bashing, and that was surely not the point of this test. By using Redis we simply excluded the DB server from the equation, so the test really stresses the HTTP pipeline;

• Still, something does not add up... The container size is 64MB. Is it a typo? Anyway, even with 640MB, using your formula we could serve just 640MB / 16MB (PHP process size) = 40 concurrent PHP requests, not 50 or even 100! Where is the trick? The answer is: there is no trick, and to explain the memory calculations better I had to add the next chapter.

 

Process Size Conundrum

This is a bonus chapter. Originally, I didn't plan to include it in this paper, since the subject is not directly in scope. However, understanding some of the concepts described below can help a great deal when sizing web applications and their containers.

Let's start with a simple question: "how do we find the amount of memory used by a given application?"

Actually, it's one of those great questions that lead to interesting discoveries: the deeper you drill looking for an answer, the more you learn, and the more you realize that the simplest questions have no simple answers. The question seems quite straightforward, and so does the answer: just run the top or ps command and it will give you the number.

Well, let's see. Here is the process list inside our web container:

# ps fauxwww
USER       PID %CPU %MEM    VSZ   RSS TTY   STAT START   TIME COMMAND
root     18840  0.0  0.0  20196  1684 ?     Ss   09:18   0:00 /bin/bash
root     19240  0.0  0.0  17488  1148 ?     R+   11:46   0:00  \_ ps fauxwww
root         1  0.0  0.0    184     0 ?     Ss   Oct05   0:00 s6-svscan -t0 /var/run/s6/services
root        29  0.0  0.0    184     0 ?     S    Oct05   0:00 s6-supervise s6-fdholderd
root       228  0.0  0.0    184     0 ?     S    Oct05   0:00 s6-supervise syslog
root       239  0.0  0.0 182856  1452 ?     Ssl  Oct05   0:00  \_ rsyslogd -f /etc/rsyslog.conf -n
root       229  0.0  0.0    184     0 ?     S    Oct05   0:00 s6-supervise redis
redis      238  0.0  0.0  33344  5624 ?     Ssl  Oct05  26:00  \_ redis-server 127.0.0.1:6379
root       231  0.0  0.0    184     0 ?     S    Oct05   0:00 s6-supervise php5-fpm
root       236  0.0  0.1 254592 12452 ?     Ss   Oct05   1:54  \_ php-fpm: master process
web        261  0.0  0.0 254592  1696 ?     S    Oct05   0:00      \_ php-fpm: pool www
web        262  0.0  0.0 254592  1700 ?     S    Oct05   0:00      \_ php-fpm: pool www
web        263  0.0  0.0 254592  1700 ?     S    Oct05   0:00      \_ php-fpm: pool www
web        264  0.0  0.0 254592  1696 ?     S    Oct05   0:00      \_ php-fpm: pool www
web        265  0.0  0.0 254592  1700 ?     S    Oct05   0:00      \_ php-fpm: pool www
web        266  0.0  0.0 254592  2116 ?     S    Oct05   0:00      \_ php-fpm: pool www
web        267  0.0  0.0 254592  1696 ?     S    Oct05   0:00      \_ php-fpm: pool www
web        268  0.0  0.0 254592  1700 ?     S    Oct05   0:00      \_ php-fpm: pool www
root       232  0.0  0.0    184     0 ?     S    Oct05   0:00 s6-supervise nginx
root       237  0.0  0.0  36180  2728 ?     Ss   Oct05   0:00  \_ nginx: master process
web        259  0.0  0.0  37188  2616 ?     S    Oct05   2:39      \_ nginx: worker process
web        260  0.0  0.0  37156  2116 ?     S    Oct05   2:44      \_ nginx: worker process
root       233  0.0  0.0    184     0 ?     S    Oct05   0:00 s6-supervise cron
root       235  0.0  0.0  25896  1028 ?     Ss   Oct05   0:03  \_ cron -f

So, what is the process memory usage? There are two values, RSS and VSZ.

According to the man page:

• RSS – resident set size, the non-swapped physical memory that a task has used (in kiloBytes);
• VSZ – virtual memory size of the process in KiB (1024-byte units).

So, the value in the RSS column must represent the process size in kilobytes, and often one of the following formulas is used:

$ ps -ylC php5-fpm --sort:rss | awk '!/RSS/ { s+=$8 } END { printf "Total memory used by processes: %dM\n", s/1024 }'
Total memory used by processes: 28M

$ ps aux | grep 'php-fpm' | grep -v grep | awk '{s+=$6} END {printf "Total memory used by processes: %dM\n", s/1024}'
Total memory used by processes: 28M

Is it the right answer to our question? Unfortunately, no, it's not. The Linux virtual memory system is not quite so simple. There are many reasons why RSS is not an accurate memory usage estimate; the most important are:

• When a process forks, both the parent and the child show up with the same RSS. However, Linux employs a copy-on-write mechanism, so both processes are really using the same memory segments. Only when one of the processes modifies the memory is it actually duplicated. So in the calculation above we have counted the same memory pages multiple times;

• The RSS value doesn't include shared memory. Shared memory is owned by many processes and not by any single one, which is why it isn't included in the per-process RSS field.

So, we may conclude that while RSS is indeed the total memory actually held in RAM for a process, it is really misleading when it comes to estimating the process size.

What about VSZ? VSZ is the total accessible address space of a process. It also includes memory that may not be resident in RAM, like mallocs that have been allocated but not yet written to. As we can see, VSZ is of very little use for determining the real memory usage of a process. It's time to drill deeper and use more advanced tools:

$ pmap -d 237
237:   nginx: master process /usr/sbin/nginx
Address           Kbytes Mode   Offset           Device    Mapping
0000000000400000     980 r-x--  0000000000000000 0fd:00002 nginx
00000000006f5000     112 rw---  00000000000f5000 0fd:00002 nginx
0000000000711000     124 rw---  0000000000000000 000:00000   [ anon ]
000000000122f000     508 rw---  0000000000000000 000:00000   [ anon ]
00007f136ab8a000      44 r-x--  0000000000000000 0fd:00002 libnss_files-2.19.so
00007f136ab95000    2044 -----  000000000000b000 0fd:00002 libnss_files-2.19.so
00007f136ad94000       4 r----  000000000000a000 0fd:00002 libnss_files-2.19.so
00007f136ad95000       4 rw---  000000000000b000 0fd:00002 libnss_files-2.19.so
…
00007f136cf01000       4 rw---  0000000000000000 000:00000   [ anon ]
00007ffd8d02d000     132 rw---  0000000000000000 000:00000   [ stack ]
00007ffd8d0f3000       8 r-x--  0000000000000000 000:00000   [ anon ]
ffffffffff600000       4 r-x--  0000000000000000 000:00000   [ anon ]
mapped: 36180K    writeable/private: 1300K    shared: 4K

If you go through the output, you will find that the lines with the largest Kbytes numbers are usually the code segments of the included shared libraries, the ones with the .so suffix. What is great about them is that they can be shared between processes. If you factor out all of the parts that are shared between processes, you end up with the "writeable/private" total shown at the bottom of the output. This is what can be considered the incremental cost of the process in terms of memory consumption, with the shared libraries factored out.

Let's check the size of the nginx processes again:

# for pid in `pgrep nginx`; do pmap -d $pid | grep 'private'; done
mapped: 36180K    writeable/private: 1300K    shared: 4K
mapped: 0K        writeable/private: 0K       shared: 0K
mapped: 0K        writeable/private: 0K       shared: 0K

We can see that 36180KB is the total amount of addressable space for the nginx process, also known as mapped memory or VSZ. Most of this mapped memory is taken up by shared libraries, which are mapped into many processes but do not really "belong" to any of them. The only part of memory pertinent to the nginx processes is just 1300K. In other words, the 3 nginx Unix processes are using 1300K of memory.

Let's check the php-fpm processes:

# for pid in `pgrep php`; do pmap -d $pid | grep 'private'; done
mapped: 254592K    writeable/private: 6440K    shared: 66052K
mapped: 0K         writeable/private: 0K       shared: 0K
mapped: 0K         writeable/private: 0K       shared: 0K
mapped: 0K         writeable/private: 0K       shared: 0K
mapped: 0K         writeable/private: 0K       shared: 0K
mapped: 0K         writeable/private: 0K       shared: 0K
mapped: 0K         writeable/private: 0K       shared: 0K
mapped: 0K         writeable/private: 0K       shared: 0K
mapped: 0K         writeable/private: 0K       shared: 0K

The same goes for the PHP processes: their memory usage can be estimated as 6440K. It finally looks like we have an answer to the simple question asked at the beginning of this chapter.

Not so quick... Those mapped shared libraries that we discounted still take up a significant amount of memory. Although the numbers we've found are quite accurate estimates of the "incremental" per-process memory usage, it would be good to have the shared memory somehow distributed and accounted for across the processes using those shared libraries too. That's exactly what the Proportional Set Size metric provides.

The Proportional Set Size (PSS) is a more meaningful representation of the amount of memory used by libraries and applications in a virtual memory system. Because large portions of physical memory are typically shared among multiple applications, the standard measure of memory usage known as Resident Set Size (RSS) will significantly overestimate memory usage. PSS instead measures each application's "fair share" of each shared area to give a realistic measure.

For example, if three processes all use a shared library that has 30 pages, that library will contribute only 10 pages to the PSS reported for each of the three processes. PSS is a very useful number because when the PSS values for all processes in the system are summed together, the result is a good

representation of the total memory usage in the system. When a process is killed, the shared libraries that contributed to its PSS are proportionally redistributed across the PSS totals of the remaining processes still using that library.

# smem -tk
  PID User     Command                          Swap      USS      PSS      RSS
   29 root     s6-supervise s6-fdholderd        8.0K     4.0K     9.0K    40.0K
  228 root     s6-supervise syslog             16.0K     4.0K     9.0K    40.0K
  229 root     s6-supervise redis              16.0K     4.0K     9.0K    40.0K
  231 root     s6-supervise php5-fpm            8.0K     4.0K     9.0K    40.0K
  232 root     s6-supervise nginx               8.0K     4.0K     9.0K    40.0K
  233 root     s6-supervise cron                8.0K     4.0K     9.0K    40.0K
    1 root     s6-svscan -t0 /var/run/s6/s     16.0K    36.0K    36.0K    40.0K
  235 root     cron -f                        184.0K   112.0K   205.0K     1.0M
  239 root     rsyslogd -f /etc/rsyslog.co    380.0K   776.0K   830.0K     1.7M
  237 root     nginx: master process /usr/    956.0K   288.0K   851.0K     2.7M
18840 root     /bin/bash                      228.0K     1.1M     1.1M     1.8M
19464 root     /usr/bin/python /usr/bin/sm         0     6.3M     6.4M     7.1M
  236 root     php-fpm: master process (/e      4.8M     9.1M     9.7M    12.2M
-------------------------------------------------------------------------------
   13 1                                         6.5M    17.7M    19.1M    26.7M

Truth be told, even PSS does not give the ultimate answer, since it is a "proportional", point-in-time value: it varies depending on which processes are running on the system and whether they map the same shared libraries or not.

It has taken quite a bit of digging to answer the simple question asked at the beginning of this chapter, and the answer is pretty much "it depends..." ;-)

The whole picture is still far from complete. If you want a solid understanding of all the bits and pieces contributing to the Linux kernel memory allocation mechanisms, often referred to as the Linux VMM, take a look at the following great five-part article:
https://techtalk.intersec.com/2013/07/memory-part-1-memory-types/

Things to keep in mind:

• The considerations in this chapter provide estimates only, not hard figures. These estimates must therefore be periodically verified and adjusted as needed. Ultimately, capacity management should take these estimates into account for infrastructure, application and container resource sizing;

• To make things even more complicated, containers create additional layers of indirection, and understanding virtual memory in the context of containerization is a subject for a separate book. The homework task for the curious reader is to figure out how the "docker stats" memory usage relates to the memory figures reported for the processes inside the container. Hint: see the source, and the pointer sketched after this list;

• Page swapping is definitely bad, but if your options are killing the application or letting it run very slowly, the choice is obvious;

• One should not forget that the swap slices allotted to multiple containers are taken from the host system's swap memory pool. For example, 100 small containers with a 1GB swap limit each may, under serious load, all claim their share of swap at some point. So size your host system swap partitions accordingly, taking per-container swap allocations into account; otherwise, prepare for OOM messages in the logs.
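As a starting point for that homework: the numbers reported by "docker stats" come from the memory cgroup, whose raw counters can be inspected directly. A minimal sketch, assuming cgroup v1 paths with Docker's default cgroupfs driver; the container name is a placeholder:

$ docker stats --no-stream d7-demo

# resolve the full container ID (name is a placeholder)
$ CID=$(docker inspect -f '{{.Id}}' d7-demo)

# raw cgroup counters that feed "docker stats"
$ cat /sys/fs/cgroup/memory/docker/$CID/memory.usage_in_bytes
$ cat /sys/fs/cgroup/memory/docker/$CID/memory.stat     # cache, rss and swap breakdown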

 

Drupal Project Creation

The Drupal CMS is modular and very flexible when it comes to customization and extending core functionality. A developer may add the required modules, themes and features to the Drupal core and thus create a new custom Drupal distribution that may be used for company-wide or brand-specific websites. Drupal Profiles provide a programmatic mechanism for defining distribution components, dependencies and the rules for building the distribution out of a definition (make) file.

From the vendor documentation:

Distributions provide site features and functions for a specific type of site as a single download containing Drupal core, contributed modules, themes, and pre-defined configuration. They make it possible to quickly set up a complex, use-specific site in fewer steps than if installing and configuring elements individually.

There are several building blocks used to assemble a new Drupal website project:

• VzBase profile – a carefully selected set of Drupal modules, features, configurations and themes that provides a jumpstart and a solid base for enterprise CMS-based websites. Two example installation profiles have been created under the vzpoc repository: vzbase-7.43 and vzbase-7.50;

• Drupal Base Project – a boilerplate structure that simplifies starting a new site by having the most common directory structures and files already included and set up.

The diagram below schematically depicts the most important build stages.

 

Figure 19 - Drupal Project Creation Process

The whole Drupal Project Creation process can be described as follows:

1. The Drupal Base project is cloned from git to the new Drupal Project location; let's call it SITE_ROOT. This way we ensure that all site projects have a well-defined, standardized structure;
2. The VzBase Profile is cloned from git to a temporary location;
3. The VzBase Profile is built, and the build result, i.e. the custom Drupal distribution, is stored under the SITE_ROOT/docroot path;
4. A new git repository is initialized under SITE_ROOT, including both the Drupal Base project structure and the freshly built VzBase Drupal distribution;
5. The project repository for the new Drupal project is checked into git, i.e. committed and pushed to the origin.
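Done by hand, the same stages would look roughly like this. A sketch only: the repository URLs and the make file name are assumptions modeled on the log below, and the real tooling wraps these steps with error handling and naming conventions:

# 1-2: clone the boilerplate project and the installation profile
$ git clone ssh://git@gitlab.poc.local:2222/vzpoc/drupal-base.git d7base
$ git clone ssh://git@gitlab.poc.local:2222/vzpoc/vzbase-7.50.git

# 3: build the custom distribution into the project docroot
$ drush make vzbase-7.50/build-vzbase.make d7base/docroot

# 4-5: put the result under version control and push it to the origin
$ cd d7base && git init && git add -A && git commit -m 'Initial revision'
$ git remote add origin ssh://git@gitlab.poc.local:2222/test_agency/d7base.git
$ git push -u origin master

The steps outlined above can be executed using either the Orchestration Portal or the Platform CLI. For example, the following command will build a new Drupal project named d7base in the test_agency project group, using the vzpoc/vzbase-7.50 installation profile: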

$ /opt/deploy/web project build --project d7base --group test_agency --profile vzpoc/vzbase-7.50

web project build: found Gitlab project group: test_agency
web project build: creating new GitLab project:
web project build: |-- project group: test_agency
web project build: |-- project name: d7base
web project build: |-- project desc: d7base web project
web project build: \-- response JSON:
{
  "id": 133,
  "description": "d7base web project",
  "default_branch": null,
  "tag_list": [],
  "public": false,
  "archived": false,
  "visibility_level": 0,
  "ssh_url_to_repo": "ssh://git@gitlab.poc.local:2222/test_agency/d7base.git",
  "http_url_to_repo": "https://gitlab.poc.local:8443/test_agency/d7base.git",
  "web_url": "https://gitlab.poc.local:8443/test_agency/d7base",
  "name": "d7base",
  "name_with_namespace": "test_agency / d7base",
  "path": "d7base",
  "path_with_namespace": "test_agency/d7base",
  "issues_enabled": true,
  "merge_requests_enabled": true,
  "wiki_enabled": true,
  "builds_enabled": true,
  "snippets_enabled": false,
  "created_at": "2016-11-10T13:21:55.498Z",
  "last_activity_at": "2016-11-10T13:21:57.196Z",
  "shared_runners_enabled": true,
  "creator_id": 7,
  "namespace": {
    "id": 106,
    "name": "test_agency",
    "path": "test_agency",
    "owner_id": null,
    "created_at": "2016-10-21T13:10:16.072Z",
    "updated_at": "2016-10-21T13:10:16.072Z",
    "description": "test_agency project group",
    "avatar": {
      "url": null
    },
    "share_with_group_lock": false,
    "visibility_level": 0

  },
  "avatar_url": null,
  "star_count": 0,
  "forks_count": 0,
  "open_issues_count": 0,
  "runners_token": "mXyxGNzRXmj-nZNFRC8d",
  "public_builds": true
}
gitlab: project created successfully
web project build: assembling new project d7base@test_agency...
web project build: |-- cloning drupal-base + vzbase-7.50 -> workspace
web project build: |-- building vzbase-7.50 drupal site profile
web project build: \-- commiting and pushing workspace -> git
Cloning into 'd7base'...
remote: Counting objects: 200, done.
remote: Compressing objects: 100% (112/112), done.
Receiving objects: 100% (200/200), 164.68 KiB | 0 bytes/s, done.
Resolving deltas: 100% (93/93), done.
Checking connectivity... done.
Cloning into 'vzbase-7.50'...
remote: Counting objects: 54, done.
remote: Compressing objects: 100% (54/54), done.
Receiving objects: 100% (54/54), 26.08 KiB | 0 bytes/s, done.
Resolving deltas: 100% (13/13), done.
Checking connectivity... done.
Wiping target directory ../d7base/docroot ...
Building distribution <vzbase>
Beginning to build /var/www/vzbase-7.50/build-vzbase.make.    [ok]
drupal-7.50 downloaded.                                       [ok]
vzbase copied from ..                                         [ok]
Found makefile: drupal-org.make                               [ok]
Project ldap contains 12 modules: ldap_test, ldap_authorization_og, ldap_authorization_drupal_role, ldap_authorization, ldap_feeds, ldap_help, ldap_servers, ldap_authentication, ldap_query, ldap_sso, ldap_views, ldap_user.
ldap-7.x-2.0-beta11 downloaded.                               [ok]
…
Applying local patches ...
/var/www/d7base/docroot/profiles/vzbase /var/www/vzbase-7.50
patching file modules/workbench_email/workbench_email.module
Hunk #1 succeeded at 802 (offset 6 lines).
/var/www/vzbase-7.50
Initialized empty Git repository in /var/www/d7base/.git/
[master (root-commit) f556005] Initial revision
 4859 files changed, 880897 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 docroot/.editorconfig
 …
 create mode 100644 tests/readme.md
Counting objects: 5315, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (5138/5138), done.
Writing objects: 100% (5315/5315), 22.70 MiB | 10.24 MiB/s, done.
Total 5315 (delta 513), reused 0 (delta 0)
To ssh://git@gitlab.poc.local:2222/test_agency/d7base.git
 * [new branch]      master -> master
Branch master set up to track remote branch master from origin.
web project build: successfully built project

After performing the above steps, the test_agency/d7base website project has been built from scratch using the vzpoc/vzbase-7.50 installation profile.

Obviously, there is no need to build the installation profile for every new Drupal project. A much more efficient approach is to clone an existing base distribution, so the following command will produce an equivalent result:

$ /opt/deploy/gitlab project add --project d7base2 --group test_agency --clone vzpoc/d7-vzbase-7.50

gitlab: looking up seed project: vzpoc/d7-vzbase-7.50...
gitlab: project created successfully
gitlab: cloning project vzpoc/d7-vzbase-7.50...
Cloning into 'd7base2'...
remote: Counting objects: 5315, done.
remote: Compressing objects: 100% (4625/4625), done.
Receiving objects: 100% (5315/5315), 22.70 MiB | 29.96 MiB/s, done.
Resolving deltas: 100% (513/513), done.
Checking connectivity... done.
Initialized empty Git repository in /var/www/d7base2/.git/
[master (root-commit) 553ebd3] Initial revision
 4859 files changed, 880897 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 docroot/.editorconfig
 …
 create mode 100644 tests/readme.md
Counting objects: 5315, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (5138/5138), done.
Writing objects: 100% (5315/5315), 22.70 MiB | 11.12 MiB/s, done.
Total 5315 (delta 513), reused 0 (delta 0)
To ssh://git@gitlab.poc.local:2222/test_agency/d7base2.git
 * [new branch]      master -> master
Branch master set up to track remote branch master from origin.
gitlab: project cloned successfully

Things to keep in mind:

• An installation profile can only be used when installing a new Drupal instance. This means that you cannot run an installation profile on an existing Drupal site to add extra functionality;
• You can also select only one installation profile per site. This means that if an existing installation profile has to be extended with new modules or features, a new extended profile must be created;
• The proposed approach is to have a well-defined set of installation profiles for different website types. All website projects must be built from one of the standard installation profiles.

 

Drupal Website Deployment

Website deployment consists of two major steps: Web Project Deployment and Web Container Deployment. Simply speaking, the Project is the website code, content and everything else that belongs to the website assets. The Web Container is the application engine, or runtime, which executes the code and serves requests to end users. The management and lifecycle of these two parts are supported by the corresponding Platform Services.

Although both steps are completely independent, for the sake of consistency it is recommended to deploy the Web Project first and then deploy the Web Container for that project.

Web Project Deployment

Let's look at Project Deployment first. The http://Drupal.org website and community resources provide a number of step-by-step guides for installing Drupal projects, ranging from manual instructions to fully unattended setup procedures.

The proposed project deployment procedure is completely automated and takes into account platform requirements, naming conventions and the project structure. Deployment can be executed using either the Orchestration Portal or the Platform CLI.

   

   

Figure 20 - Drupal Project Deployment Process

The following steps are usually executed to deploy a Drupal project:

1. The site-monitor project is cloned from git to the new Drupal Project location corresponding to the ROOT container volume; let's call it SITE_ROOT;
2. The website project is cloned from git to the new Drupal Project location, or SITE_ROOT. The SITE_ROOT/docroot symlink is created, pointing to the docroot in the web project folder;
3. The site installation scripts are executed. The following tasks are performed at this stage:
   a. Create the website database;
   b. Deploy the file-system objects required for site operation;
   c. Deploy certificates;
   d. Deploy and adjust configuration files;
   e. Adjust file-system object owners and permissions;
4. Populate the database. There are two possible approaches:
   a. A new "empty" database is created and initialized by Drupal;
   b. An existing site DB snapshot can be transformed and imported;
5. Populate media and assets. These objects are stored in the location corresponding to the DATA container volume. As with the database, two approaches are possible:
   a. Generate assets and site media using programmatic routines;
   b. Sync all or some real media and assets from an existing site.

This is best demonstrated by example. First, a little background information: every project belongs to a project group, which is in turn owned by a corresponding organization. Or, the other way around: the organization Demo Agency has the project group called demo_agency, which in turn contains the project demo.

$ /opt/deploy/ldap org list
Alpha Agency
Beta Agency
Demo Agency

$ /opt/deploy/gitlab group list
alpha_agency
beta_agency
demo_agency
images
vzpoc

$ /opt/deploy/gitlab project list --group demo_agency --format table
118   Private   demo   2016-10-06T08:07:27.041Z   https://gitlab.poc.local:8443/demo_agency/demo   Drupal 7.50 demo project

Let's deploy the demo_agency/demo project to the d7-demo website in the staging environment:

$ /opt/deploy/web project deploy --site d7-demo --farm poc --env stg --org 'Demo Agency' --project demo --group demo_agency

web project deploy: looking up Drupal credentials in secure storage...
web project deploy: get secure variable vault:/secret/poc/stg/d7-demo/DRUPAL_ADMIN_NAME
web project deploy: get secure variable vault:/secret/poc/stg/d7-demo/DRUPAL_ADMIN_PASS
web project deploy: Drupal credentials missing, generating new set...
web project deploy: saving Drupal credentials in secure storage...
web project deploy: put secure variable vault:/secret/poc/stg/d7-demo/DRUPAL_ADMIN_NAME
web project deploy: put secure variable vault:/secret/poc/stg/d7-demo/DRUPAL_ADMIN_PASS
@wbs1 web project deploy: looking up Drupal credentials in secure storage...
@wbs1 web project deploy: get secure variable vault:/secret/poc/stg/d7-demo/DRUPAL_ADMIN_NAME
@wbs1 web project deploy: get secure variable vault:/secret/poc/stg/d7-demo/DRUPAL_ADMIN_PASS
@wbs1 web project deploy: found Drupal credentials in secure storage
@wbs1 web project deploy: using Drupal credentials from secure storage
@wbs1 web project deploy: folder /var/web/stg/root/d7-demo not found, creating
@wbs1 web project deploy: folder /var/web/stg/data/d7-demo not found, creating
@wbs1 web project deploy: folder /var/web/stg/logs/d7-demo not found, creating
@wbs1 web project deploy: folder /var/web/stg/cert/d7-demo not found, creating
@wbs1 web project deploy: folder /var/web/stg/temp/d7-demo not found, creating
@wbs1 web project deploy: creating private and public file-storage...
@wbs1 web project deploy: folder /var/web/stg/data/d7-demo/public not found, creating
@wbs1 web project deploy: folder /var/web/stg/data/d7-demo/private not found, creating
@wbs1 web project deploy: website DB host: 10.169.69.12
@wbs1 web project deploy: website DB name: STG_D7_DEMO
@wbs1 web project deploy: looking up DBA credentials in secure storage
@wbs1 web project deploy: get secure variable vault:/secret/poc/mysql/DBA_USER
@wbs1 web project deploy: get secure variable vault:/secret/poc/mysql/DBA_PASS
@wbs1 web project deploy: looking up site DB credentials in secure storage
@wbs1 web project deploy: get secure variable vault:/secret/poc/stg/d7-demo/SITE_DB_USER
@wbs1 web project deploy: get secure variable vault:/secret/poc/stg/d7-demo/SITE_DB_PASS
@wbs1 web project deploy: DB credentials missing, generating new set...
@wbs1 web project deploy: put secure variable vault:/secret/poc/stg/d7-demo/SITE_DB_USER
@wbs1 web project deploy: put secure variable vault:/secret/poc/stg/d7-demo/SITE_DB_PASS
@wbs1 web project deploy: creating DB user if not present
@wbs1 web project deploy: building website ...
Cloning into 'monitor'...
Cloning into 'demo'...

@wbs1 web project deploy: site-install: creating website settings: /var/www/demo/docroot/sites/default/settings.php
@wbs1 web project deploy: site-install: allow site authentication only for users from: Demo Agency
@wbs1 web project deploy: site-install: configuring drupal settings...
@wbs1 web project deploy: site-install: installing new site: d7-demo
You are about to CREATE the 'STG_D7_DEMO' database. Do you want to continue? (y/n): y
Starting Drupal installation. This takes a while. Consider using the    [ok]
--notify global option.
Installation complete.  User name: admin  User password: *****************  [ok]
VZbase defaults configured.                                             [status]
@wbs1 web project deploy: site-install: rebuilding node access permissions
The content access permissions have been rebuilt.                       [status]
@wbs1 web project deploy: site-install: linking /var/www/demo/docroot/sites/default/files -> /var/data/public
@wbs1 web project deploy: site-install: site docroot clean-up
Changing ownership of all contents of /var/www/demo/docroot:  user => none    group => web
Changing permissions of all directories inside /var/www/demo/docroot to rwxr-x---...
Changing permissions of all files inside /var/www/demo/docroot to rw-r-----...
Changing permissions of files directories in /var/www/demo/docroot/sites to rwxrwx---...
Changing permissions of all files inside all files directories in /var/www/demo/docroot/sites to rw-rw----...
Changing permissions of all directories inside all files directories in /var/www/demo/docroot/sites to rwxrwx---...
Done setting proper permissions on files and directories
@wbs1 web project deploy: project deployed successfully

The following major steps were performed in the example above:

• Since no credentials were provided, a set of credentials for the Drupal admin account was generated;
• The Drupal admin credentials were stored in the secure storage;
• The folder structure for the project was created. Later on, those folders will be mapped to container volumes and thus made available to the applications inside the container;
• A project-specific database name and a set of user credentials were generated according to the naming standards;
• The corresponding database objects were created and access permissions granted;
• The database user credentials were stored in the secure storage;
• The demo_agency/demo project was deployed from the GitLab code repository;
• The vzpoc/monitor add-on project was deployed from the GitLab code repository;
• Project-specific setup steps were executed:
   o Created the Drupal site settings;
   o Set up LDAP authentication and authorization;
   o Installed the new Drupal site;
   o Rebuilt the Drupal node access permissions;
   o Set file-system ownership following security best practices.

Finally, let's list the deployed projects to validate that the deployment was successful:

$ /opt/deploy/web project list --farm poc --env stg
d7-demo

   

Web Container Deployment

Now let's look at the Web Container deployment part. It's worth repeating that website project deployment and website container deployment are two separate steps that are usually executed sequentially, yet are completely independent of each other. More details about the Web Container deployment process can be found in the Container Provisioning Service chapter.

For the sake of simplicity and consistency, the Website ID and the Web Container name are equal. The following Platform CLI command can be used to deploy the Web Container for the d7-demo website:

$ /opt/deploy/web container create --farm poc --env stg --site d7-demo --image nginx-php-fpm

web container create: using next free IP: 10.169.64.232
web container create: checking 10.169.64.232 is setup
    inet 10.169.64.232/26 brd 10.169.64.255 scope global secondary enp0s17:
web container create: folder /var/web/stg/root/d7-demo not found, creating
web container create: folder /var/web/stg/data/d7-demo not found, creating
web container create: folder /var/web/stg/logs/d7-demo not found, creating
web container create: folder /var/web/stg/cert/d7-demo not found, creating
web container create: folder /var/web/stg/temp/d7-demo not found, creating
web container create: exporting container ENV variables from /opt/deploy/container.env
web container create: creating container d7-demo
web container create: |-- image-tag: registry.poc:5000/poc/nginx-php-fpm
web container create: |-- resources: small (--cpu-shares 16 --memory 64m --memory-swap 1G)
web container create: |-- published: 10.169.64.232:8080:80
web container create: |-- published: 10.169.64.232:8443:443
web container create: |-- volume: /var/web/stg/cert/d7-demo:/etc/apache2/ssl
web container create: |-- volume: /var/web/stg/logs/d7-demo:/var/log
web container create: |-- volume: /var/web/stg/root/d7-demo:/var/www
web container create: |-- volume: /var/web/stg/data/d7-demo:/var/data
web container create: |-- volume: /var/web/stg/temp/d7-demo:/var/tmp
web container create: |-- volume: tmpfs:/run
web container create: |-- volume: tmpfs:/tmp
web container create: |-- label: container.env=stg
web container create: |-- label: container.size=small
web container create: |-- label: container.site=d7-demo
web container create: \__ label: container.type=web
web container create: started site container cb68618b84b4d3276a77ebd4a0635c5387a8319f1ffaac3759c74820fa32b258
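Under the hood, this boils down to a docker run invocation along the following lines. This is a sketch reconstructed from the log above, not the platform's actual code; the flags and values simply mirror the logged options:

# "small" resource class: --cpu-shares 16 --memory 64m --memory-swap 1g
# volumes map the per-site folders created above into the container
$ docker run -d --name d7-demo \
    --cpu-shares 16 --memory 64m --memory-swap 1g \
    -p 10.169.64.232:8080:80 -p 10.169.64.232:8443:443 \
    -v /var/web/stg/root/d7-demo:/var/www \
    -v /var/web/stg/data/d7-demo:/var/data \
    -v /var/web/stg/logs/d7-demo:/var/log \
    -v /var/web/stg/cert/d7-demo:/etc/apache2/ssl \
    -v /var/web/stg/temp/d7-demo:/var/tmp \
    --tmpfs /run --tmpfs /tmp \
    --label container.env=stg --label container.site=d7-demo \
    --label container.size=small --label container.type=web \
    registry.poc:5000/poc/nginx-php-fpm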

 

Website Deployment Workflow

The workflow shown on the screenshot below performs both tasks, i.e. deploying the chosen Web Project for the specified organization into the selected environment, and creating the Web Container that serves this website to internet users.

Things to keep in mind:

• Since the Web Projects and Web Containers belonging to the same website are independent, they may be managed separately and have their own lifecycles. The Web Container may be re-deployed for an upgrade or patching without impact on the Web Project. The Web Project may be re-deployed and still be served by the existing Web Container. The only touch-point, or interface, between Web Projects and Web Containers is a set of file-system volumes;

• Nothing prevents having multiple instances of the same website, which makes scale-out scenarios possible. Each website instance has its own web container; whether it has its own (dedicated) or shared project code and content deployed on the container volumes depends on the storage model chosen.

 

Figure 21 - Website Deployment Workflow

 

Editorial Workflow

Now that the Drupal CMS instance is up and running, it's time to start adding website content. Drupal is a mature CMS by itself, and by using various modules its content management capabilities can be extended even further. The VzBase distribution includes all the required modules, features and configuration for implementing the editorial workflow depicted on the diagram below.

The workflow defines the following states for a unit of content:

• DRAFT – the content is still being worked on or requires amendments;
• NEEDS REVIEW – the content work is complete and it requires review;
• PUBLISHED – the content passed review and validation and has been approved for publishing. Although the content is published, it is not yet visible to public users;
• PENDING DEPLOY – the content has been scheduled for deployment;
• DEPLOYED – the content has been deployed to the internet-facing environment and is visible to public users.

The editorial workflow also assumes the following user roles:

• Contributor, or Author – responsible for authoring and populating the website content;
• Editor, or Reviewer – responsible for proof-reading, checking and otherwise validating site content against the defined rules and quality standards;
• Publisher, or Content Manager – responsible for publishing the content, i.e. making it accessible to public internet users.

Since the VzBase Drupal distribution uses the Platform IdM service, the Drupal roles outlined above are mapped to Active Directory groups. This mapping is completely transparent, so the user's role in the editorial workflow is defined by the user's LDAP group membership.

Figure 22 - Editorial Workflow

Obviously, at any moment in time each unit of content is in one of the states defined above. The workflow actors, i.e. Drupal users having the corresponding workflow roles, can change these states, thus executing workflow transitions for the given unit of content.

When content transitions between workflow states, an email notification is sent to the workflow user who has a pending action. For example, when a content author pushes an article to the NEEDS REVIEW state, the content editor in charge gets the following email notification:

Subject: [Publishing-Workflow][Review] content review requested

Dear editor,

Please review the following content [node:url:absolute].

Sincerely, your friendly workflow engine.

Things to keep in mind:

• An important point worth mentioning: although all Drupal users authenticate against Active Directory, only users belonging to the organization owning the website can pass authentication;

• The proposed roles and transition states are not set in stone; although they support a real-life editorial process, they are provided for example purposes only. The workflow can be changed to match the editorial workflow defined in your organization;

• Notification emails can either be sent automatically to all users with a given role, which is quite practical in smaller organizations, or a specific user with the given role may be selected. For example, a Content Author may have been assigned a certain Editor for articles on a given subject. In this case, when content is submitted for review, the Author can choose the assigned Editor, and only this specific Editor will be notified about the pending review.

 

Content Publishing

Anyone familiar with Drupal may wonder why an additional content publishing, or deployment, step is needed. Normally, once you have published content in Drupal, it is visible to the site users. That's true; this content management model works fine for simple deployments. However, in more complex scenarios with multiple hosting environments this mechanism falls short, and here is why...

Historically, the Drupal CMS does not differentiate between the environment where content is created and the environment where content is displayed. This means that the content author (oftentimes editor and site administrator in the same person) logs in to the Drupal site administration area and makes content changes. These content changes are stored in the database or in the file-system and are first staged, i.e. not yet visible to the site visitors. Whenever the editor decides that the content is ready, she pushes the publish button and voilà, the content is live.

In real life and in more complex scenarios it does not work that way. What if...

• There are multiple content authors and editors, often belonging to virtual and geographically distributed teams.
• There are multiple versions of the same content. What is the source of truth?
• There are multiple environments and multiple site instances. Where should the content be updated?
• Content updates are only allowed from the intranet or another dedicated and appropriately secured environment.
• The internet-facing website doesn't allow login and site management for security reasons.
• The internet-facing website is hardened for further security, and CMS-related modules and features are disabled or even stripped down.
• The content deployment must be performed during specific, well-defined change windows.
• The content deployment must be done to several target sites.

Although some of the points listed above may be addressed by using certain Drupal modules and extensions, eventually we arrive at the need to separate the content authoring and content publishing mechanisms. This way, all content modification activities can be conducted in an isolated and secure "authoring" environment. Whenever the desired result is achieved, the content is pushed (or deployed) to the internet-facing instance or instances.


Figure 23 - Content Publishing Process

From a technical perspective this boils down to copying the content and some site configuration from a Drupal website instance located in one environment to another instance in a different environment. Multiple approaches are possible, ranging from a backup-restore procedure requiring website downtime to incremental transactional updates that do not impact website uptime or even caches (at least the unaffected parts of them).

The proposed approach is to employ drush sync (Unix rsync in the background) for copying modified file-system objects and to update content entities using the XML-RPC protocol implemented by the Deploy module, as depicted in the Content Publishing Process diagram (Figure 23). A sketch of such a publishing job is shown after the list below.

There is a comprehensive DrupalCon presentation discussing in more detail the issues and possible solutions for modern publishing systems: https://www.youtube.com/watch?v=EJDGfye3OuQ.

Things to keep in mind:

• Multiple content deployment models are possible: on-demand updates performed as soon as the content is approved, content batches accumulated over some period of time and pushed all together, or scheduled content deployment jobs executed during well-defined content publishing windows;

• Since no content modification or site administration is performed through the internet-facing instance, it can be stripped of the major part of Drupal modules and code down to the bare minimum and can be locked down, thus making the attack surface orders of magnitude smaller or even completely preventing some classes of website attacks;


• The Drupal features must be used for distributing configuration changes; however, this is not exactly content publishing but rather a configuration management task, which may require different implementation approaches depending on the Drupal version;

• Updating website content at runtime may require additional steps on the CDN (content caching layer). The parts of the cache covering the changed content tree must be flushed and re-populated. This can be further automated using APIs or other content flush mechanisms, depending on the CDN in use;

• Drupal caching mechanisms must also be aligned with the content update strategy, to avoid caching stale content or dropping caches for unchanged content parts. The latter effectively reduces or eliminates the benefit of caching and can put additional stress on the infrastructure if content updates are performed frequently.
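The following is a minimal sketch of such a publishing job, assuming drush site aliases @authoring and @live have been defined for the two environments; the exact Deploy module invocation depends on its version and configuration and is therefore only indicated in a comment:

#!/usr/bin/env bash
set -euo pipefail

# 1. Copy modified file-system objects (drush uses rsync in the background).
#    %files expands to the site's public files directory.
drush rsync @authoring:%files @live:%files --yes

# 2. Push approved content entities from the authoring instance to the live
#    endpoint via the Deploy module (XML-RPC). The exact invocation depends
#    on the Deploy module version and configuration; a drush integration or
#    a scheduled deployment plan may be used at this step.

# 3. Drop affected caches on the target instance.
drush @live cache-clear all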

Active Directory Structure
The Active Directory structure follows these simple design principles:

• Platform-scoped AD objects are separate from Hosting-scoped AD objects;
• Hosting-scoped AD objects may refer to Platform-scoped AD objects, not the other way around;
• All AD objects belonging to a hosted organization are encapsulated in a dedicated OU container;
• The Platform may have its own Users, Groups and Service Accounts;
• Each hosted organization may also have its own Users, Groups and Service Accounts.

 

Figure 24 - Example: MS Active Directory Structure

Let's describe the structure presented above in more detail. For the sake of simplicity, all AD objects are stored under the root-level HOSTING org-unit. It includes the following platform-scoped objects in corresponding containers:

• Groups – platform-scoped LDAP groups;
• Service Accounts – platform-scoped service accounts;
• Users – platform-scoped users.

 


The hosting-scoped AD objects are further separated and stored under the EXT org-unit. The org-units for each hosted organization are provisioned here. For example, AD objects for the Test Agency organization are located under the /HOSTING/EXT/Test Agency path in the LDAP tree.

Like the platform-scoped AD objects, the hosting-scoped AD objects for a particular organization are stored under its org-unit. Following the above example, the Test Agency org-unit has its objects stored in corresponding containers:

• Groups – hosting-scoped LDAP groups belonging to the specific hosting org-unit;
• Service Accounts – hosting-scoped service accounts belonging to the specific org-unit;
• Users – hosting-scoped users belonging to the specific hosting org-unit.

For example, all users belonging to the Test Agency will be provisioned under the /HOSTING/EXT/Test Agency/Users path in the LDAP tree. These users can belong both to groups from the Platform scope (/HOSTING/Groups) and to groups from the hosted org-unit scope (/HOSTING/EXT/Test Agency/Groups).

Eventually, the LDAP tree schematically looks like the following:

/[HOSTING]                        - Root OU for hosting-related AD objects
    /Groups                       - Platform Groups
    /Service Accounts             - Platform Service Accounts
    /Users                        - Platform Users
    /[EXT]                        - OU for External hosted organizations
        /[Organization 1]         - OU for each organization
            /Groups               - Organization Groups
            /Service Accounts     - Organization Service Accounts
            /Users                - Organization Users
        /[Organization 2]
            /Groups
            /Service Accounts
            /Users
        ...
        /[Organization N]
            /Groups
            /Service Accounts
            /Users
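As a rough illustration, the per-organization part of this tree could be provisioned with the adtool utility (one of the platform projects, see the GitLab Repository Structure section). The sketch below assumes adtool's oucreate, usercreate, groupcreate and groupadduser actions and an example base DN; adapt both to your directory:

#!/usr/bin/env bash
set -euo pipefail

BASE="ou=EXT,ou=HOSTING,dc=poc,dc=local"    # example base DN

# create the organization OU and its standard containers
adtool oucreate "Test Agency" "${BASE}"
for ou in Groups "Service Accounts" Users; do
    adtool oucreate "${ou}" "ou=Test Agency,${BASE}"
done

# create an organization group and a user, then link them
adtool groupcreate "Test Agency Sonar Users" "ou=Groups,ou=Test Agency,${BASE}"
adtool usercreate jdoe "ou=Users,ou=Test Agency,${BASE}"
adtool groupadduser "Test Agency Sonar Users" jdoe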

Practically speaking, such an LDAP structure allows a user belonging to a particular organization, say Test Agency, to be a member of the GitLab Users platform group and thus gain access to the GitLab code repositories. At the same time, this user may also belong to a local group Test Agency Sonar Users and thus gain access to the Test Agency's Sonar projects.

Things to keep in mind:

• Obviously, creating one AD structure to fit all possible (especially unknown future) requirements is impossible. The presented AD structure and design principles are meant to provide a foundation flexible enough to be extended and adapted to new and changed project requirements;

• Platform Components rely heavily on the given AD structure, i.e. authorization groups, service accounts and platform users are looked up in specific containers. Should the AD structure need adjustments, all Platform Components depending on AD must be reviewed and their configuration adjusted accordingly;

• Due to AD requirements, user login names must be unique within the enterprise. Although you may have two users with the same "First Last" name in different containers, the login name attribute must still be unique;


• Due to AD requirements, group names must always be unique, even if they are located in different LDAP containers and org-units;

• The Platform does not put any limitations on AD object names; generally, the MS AD documentation must be checked for AD-specific limitations and naming requirements;

• Managed Service Accounts were not explored during the POC. All Service Accounts are plain accounts;

• Changing Service Account passwords is a simple and straightforward operation, since passwords are stored and managed by the Secure Storage service on the platform side. This makes the regular password update operation very easy to automate and implement programmatically (see the sketch after this list);

• The only exception is the Drupal Service Account, whose credentials are stored in the AD Feature configuration. This feature code may be extended to fetch credentials from the Secure Store at run-time.
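A minimal sketch of such a programmatic password rotation, assuming adtool's setpass action and a Vault secret path of our own choosing (the account name and the path are examples):

#!/usr/bin/env bash
set -euo pipefail

ACCOUNT="svc_gitlab"                 # example service account login name
NEWPASS=$(openssl rand -base64 18)   # generate a random password

# update the account password in Active Directory
adtool setpass "${ACCOUNT}" "${NEWPASS}"

# store the new password in the platform Secure Storage (Vault)
vault write "secret/hosting/service-accounts/${ACCOUNT}" password="${NEWPASS}"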

GitLab Repository Structure
In GitLab Enterprise Edition (EE), the GitLab Source Code Management access and authorization matrix can simply be mapped to the LDAP structure. The Community Edition (CE), however, supports only a subset of this functionality, which is nonetheless sufficient for implementing user and project isolation and role-based access.

The GitLab platform supports the concepts of a project and a project namespace, sometimes called a project group. These project groups and projects may have different scopes and visibility. Users may be assigned certain project roles and permissions for project groups and projects.

The following rules have been established:

• Each LDAP organization has one and only one project group assigned;
• The project group name is derived from the LDAP org-name by lowercasing its letters and replacing spaces with the underscore ('_') character. For example, the Test Agency LDAP org gets assigned the test_agency GitLab project group (see the one-liner after this list);
• Each GitLab project group is provisioned with the private visibility level, meaning that only users who have been explicitly granted the access permission can access this project group;
• Unless specified differently, by default GitLab users authenticated via LDAP are given the developer project role.
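The naming rule stated above can be expressed as a simple shell one-liner:

# derive the GitLab project group name from an LDAP org-name
org="Test Agency"
group=$(echo "${org}" | tr '[:upper:]' '[:lower:]' | tr ' ' '_')
echo "${group}"    # prints: test_agency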

Besides the dedicated project group for every LDAP organization, several project groups have been reserved for the platform itself. The group holding the platform code is called vzpoc. It contains the following projects:

• adtool - a Unix command-line utility for Active Directory administration;
• deploy - contains the Platform CLI: a number of tools, along with their configurations, for managing web containers, deploying code and performing other platform management tasks;
• d7-vzbase-7.50 - Drupal 7.50 built from the VzBase profile, with AD integration and the Publishing Workflow features;
• drupal-base - a boilerplate directory structure for starting a new Drupal project;
• foundation - contains composer scripts for bootstrapping platform foundation services;
• jenkins - contains Jenkins workflows and the configuration store;
• monitor - an add-on project containing health-check monitors for LB VIPs;
• vzbase-7.50 - contains the VzBase Installation Profile for Drupal 7.50.

 


Another platform project group, Images, has been created to store all container image projects. Eventually, the GitLab project structure looks like the following:

/images/
    /image_1
    /image_2
    …
    /image_N
/vzpoc/
    /adtool
    /deploy
    /drupal-base
    /foundation
    /jenkins
    /monitor
    /d7-vzbase-7.50
    /vzbase-7.50
    /…
/<ldap_org_name_1>/
    /project_1
    /project_2
    …
    /project_N
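Given this structure and the GitLab service configuration shown later in the Platform Startup section (external_url https://gitlab.poc.local, SSH on port 2222), an organization project is addressed by its group path; the project name below is illustrative:

# HTTPS clone, using the project group path derived from the LDAP org-name
git clone https://gitlab.poc.local/test_agency/project_1.git

# SSH clone via the non-standard SSH port published by the GitLab container
git clone ssh://git@gitlab.poc.local:2222/test_agency/project_1.git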

Things to keep in mind:
• Actually, the GitLab project group name can be made equal to the LDAP org name. This will require, however, maintaining a separate GitLab project path property, which must be HTTP URL friendly and which is used behind the scenes whenever a specific project with the given name is addressed;
• The platform project names and their group names are not fixed and can be adjusted as needed. Since the platform relies heavily on source control, the platform configuration must be updated to account for any changed names;
• The name validators implemented in the Platform CLI tools usually check against a predefined string length and the subset of allowed characters quoted by the vendor. These validators have to be adjusted per specific project requirements.

Management Tasks and Workflows
All platform management and lifecycle tasks have corresponding jobs in Jenkins, thus allowing RBAC-based platform management. These platform management and lifecycle tasks can be divided into two groups: low-level tasks and workflows. Generally speaking, a workflow is an ordered list of lifecycle tasks. There are workflows for creating or adding a new hosting object and, eventually, for removing or deleting this object. Workflows for updating or modifying hosting objects are not provided, for practicality reasons.

Besides performing additional parameter validation and invoking sub-tasks in a predefined order, the workflows provide several more benefits:

• Workflows follow the pipeline-as-code paradigm and are created using Groovy code;
• The Groovy code for workflows is stored under source control;
• The workflow parameters are pre-populated using dynamic lists and responsive controls;
• Workflows can be paused, stopped and continued at any point in time. They can also be re-played at a later point in time.


The following high-level workflows have been defined:
• Organization Setup - creates all required hosting objects and structures for a newly added organization. It includes the following sub-tasks:
    o Create LDAP Org
    o Create GitLab Group
    o Create "Sonar Users" LDAP Group
    o Create SonarQube Authorization Group
    o Create SonarQube Permission Template
    o Add LDAP Group with "Browse" Permission to Template
    o Add LDAP Group with "See Source Code" Permission to Template
• Organization Remove - removes a hosting organization and all related objects. It includes the following sub-tasks:
    o Remove GitLab Group
    o Remove SonarQube Projects
    o Remove SonarQube Permission Template
    o Remove SonarQube Authorization Group
    o Remove "Sonar Users" LDAP Group
    o Remove LDAP Org
• Project Setup - creates a new GitLab project owned by the given organization
• Project Remove - removes a GitLab project owned by the given organization
• User Setup - creates a new hosting user for the given organization. It includes the following sub-tasks:
    o Create LDAP Account
    o Add Account to "Sonar Users" LDAP Group
    o Add Account to "GitLab Users" LDAP Group
    o Create GitLab Account
    o Grant Account Access to Organization Projects
• User Remove - removes a hosting user for the given organization. It includes the following sub-tasks:
    o Remove GitLab Account
    o Remove LDAP Account
• Website Setup - deploys a website and all related hosting objects for the given Organization and its project. It includes the following sub-tasks:
    o Deploy Website Project
    o Deploy Website Container
    o Enable Website VIP
    o Test Website Monitor
• Website Remove - removes a website and all related hosting objects. It includes the following sub-tasks:
    o Remove Website Container
    o Remove Website Project
• Code Analyzer - submits the specified project owned by the given Organization for analysis. The analysis results may be accessed via the Sonar Portal
• Container Image Builder - performs a container image build from its specification. The image is appropriately tagged, tested and submitted to the Image Registry.

The sub-tasks used by the platform management workflows are stored in the Admin Tools folder. Unlike workflows, the low-level tasks do not pre-populate values and do not use dynamic parameters, thus giving more control and flexibility to the platform administrator.


These low-level tasks may be executed in arbitrary order. The user triggering task execution must ensure the prerequisites and provide parameters following the predefined scheme and naming standards.

Below is a list of low-level tasks:

$ java -jar ./war/WEB-INF/jenkins-cli.jar -s http://localhost:8080 list-jobs Admin_Tools
gitlab_group_add
gitlab_group_adduser
gitlab_group_del
gitlab_group_deluser
gitlab_group_list
gitlab_group_users
gitlab_project_add
gitlab_project_del
gitlab_project_list
gitlab_user_add
gitlab_user_del
gitlab_user_list
ldap_group_add
ldap_group_adduser
ldap_group_del
ldap_group_deluser
ldap_group_list
ldap_group_users
ldap_org_add
ldap_org_del
ldap_org_list
ldap_user_add
ldap_user_del
ldap_user_groups
ldap_user_list
sonar_group_add
sonar_group_del
sonar_project_del
sonar_template_add
sonar_template_addgroup
sonar_template_del
sonar_template_delgroup
web_container_deploy
web_container_health
web_container_ipmap
web_container_list
web_container_remove
web_container_stats
web_project_deploy
web_project_remove
web_vip_down
web_vip_status
web_vip_test
web_vip_up
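For illustration, several of these low-level tasks could also be chained from the shell via the Jenkins CLI, emulating part of the Project Setup and Website Setup flows. The parameter names below are assumptions and must match the actual job definitions:

#!/usr/bin/env bash
set -euo pipefail

JENKINS="java -jar ./war/WEB-INF/jenkins-cli.jar -s http://localhost:8080"

# create a GitLab project for the organization and wait for completion (-s)
${JENKINS} build gitlab_project_add -s \
    -p ORG_NAME="Test Agency" -p PROJECT_NAME="project_1"

# deploy the corresponding web project and container
${JENKINS} build web_project_deploy -s -p PROJECT_NAME="project_1"
${JENKINS} build web_container_deploy -s -p PROJECT_NAME="project_1"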

The Test folder contains the "Test Platform" and "Test Setup" tasks. See the Test CLI chapter for examples and additional details.

Things to keep in mind:

• Management workflows use scriptler scripts for generating and populating dynamic parameter lists. Those scripts are shared among multiple Jenkins tasks;
• Management workflows and lifecycle tasks propagate return codes from the Platform tools;
• Management workflows don't alert on warnings and don't attempt recovery. When an issue occurs in one sub-task, the whole workflow is marked as failed. And the other way around: if the workflow execution succeeded, it means that every workflow sub-task has completed successfully.


Platform Startup
There are several ways to ensure repeatable and consistent service startup, among them systemd definitions, custom shell scripts and docker-compose scripts. Without performing a thorough comparison, which goes beyond the scope of this paper, we'll say that the latter option has been chosen to manage platform service startup, although it does have its own shortcomings that will be outlined later.

Below is an example of a compose definition file allowing all required services to be booted up:

version: '2'
services:

##############################################################################
### Registry: TLS enabled local docker image repository
### ==========================================================================
## Depends on:
## 1. Host folders: ${VOL_DATA}/registry/{certs,config,data}
## 2. Service config file: ${VOL_DATA}/registry/config/config.xml
## 3. TLS certificates: ${VOL_DATA}/registry/certs
#
## Ensuring dependencies:
## 1. Creating host folders
# $ mkdir -p ${VOL_DATA}/registry/{certs,config,data}
#
## 2. Creating service config
# $ cat <<EOT > ${VOL_DATA}/registry/config/config.xml
# version: 0.1
# log:
#   level: info
#   fields:
#     service: registry
#     environment: production
# storage:
#   delete:
#     enabled: true
#   cache:
#     blobdescriptor: inmemory
#   filesystem:
#     rootdirectory: /var/lib/registry
#   maintenance:
#     uploadpurging:
#       enabled: true
#       age: 168h
#       interval: 24h
#       dryrun: false
#   redirect:
#     disable: false
# http:
#   addr: :5000
#   debug:
#     addr: :5001
#   tls:
#     certificate: /certs/registry.crt
#     key: /certs/registry.key
#   headers:
#     X-Content-Type-Options: [nosniff]
# EOT
#
## 3. Creating TLS certificates
# $ mkdir -p ~/certs && cd ~/certs
# $ openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
#   -subj "/C=DE/ST=HE/L=Frankfurt/O=Verizon/OU=Managed Hosting/CN=registry.poc/[email protected]" \
#   -keyout registry.key -out registry.crt
#
## 4. Deploying certificate to container volume
# $ sudo cp ~/certs/registry.* ${VOL_DATA}/registry/certs
#
## 5. Deploying certificates to the Docker certificate store (on each docker host)
# $ sudo mkdir -p /etc/docker/certs.d/registry.poc:5000
# $ sudo cp ~/certs/registry.crt /etc/docker/certs.d/registry.poc\:5000/ca.crt
#
## 6. Restarting docker
# $ systemctl restart docker.service
#
## 7. Validating registry service
# $ docker tag busybox registry.poc:5000/poc/busybox:v1
# $ docker push registry.poc:5000/poc/busybox:v1
# $ curl --cacert certs/registry.crt -X GET https://registry.poc:5000/v2/poc/busybox/tags/list
#
## Startup command example:
# $ docker run --name registry --hostname registry --detach=true --restart=always \
#     --env REGISTRY_HTTP_TLS_CERTIFICATE=/certs/registry.crt --env REGISTRY_HTTP_TLS_KEY=/certs/registry.key \
#     --volume ${VOL_DATA}/registry/certs:/certs:ro --volume ${VOL_DATA}/registry/data:/var/lib/registry:rw \
#     --volume ${VOL_DATA}/registry/config/config.xml:/etc/docker/registry/config.xml:ro \
#     --publish ${UTIL_HOST}:5000:5000 \
#     registry:2.5

  registry:
    container_name: registry
    image: registry:2.5
    hostname: registry
    dns:
      - ${DNS_SERVER1}
      - ${DNS_SERVER2}
    ports:
      - ${UTIL_HOST}:5000:5000
    volumes:
      - ${VOL_DATA}/registry/certs:/certs:ro
      - ${VOL_DATA}/registry/config/config.xml:/etc/docker/registry/config.xml:ro
      - ${VOL_DATA}/registry/data:/var/lib/registry
    environment:
      - REGISTRY_HTTP_TLS_CERTIFICATE=/certs/registry.crt
      - REGISTRY_HTTP_TLS_KEY=/certs/registry.key
    cpu_shares: 512
    mem_limit: 2G
    security_opt:
      - no-new-privileges
    restart: on-failure:5
    read_only: false

##############################################################################
### GitLab: git repository server
### ==========================================================================
## Depends on:
## 1. Host folders: ${VOL_DATA}/gitlab/{config/ssl,logs,data}
## 2. TLS certificates: ${VOL_DATA}/gitlab/config/ssl
#
## Ensuring dependencies:
## 1. Creating host folders
# $ mkdir -p ${VOL_DATA}/gitlab/{config/ssl,logs,data}
#
## 2. Creating TLS certificates
# $ openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
#   -subj "/C=DE/ST=HE/L=Frankfurt/O=Verizon/OU=Managed Hosting/CN=gitlab.poc.local/[email protected]" \
#   -keyout ${VOL_DATA}/gitlab/config/ssl/gitlab.poc.local.key -out ${VOL_DATA}/gitlab/config/ssl/gitlab.poc.local.crt
#
## Startup command example:
# $ docker run --name gitlab --detach=true --restart always --hostname gitlab.poc \
#     --publish ${UTIL_HOST}:8443:443 --publish ${UTIL_HOST}:8080:80 --publish ${UTIL_HOST}:2222:22 \
#     --volume ${VOL_DATA}/gitlab/config:/etc/gitlab \
#     --volume ${VOL_DATA}/gitlab/logs:/var/log/gitlab \
#     --volume ${VOL_DATA}/gitlab/data:/var/opt/gitlab \
#     gitlab/gitlab-ce:8.6.8-ce.0

  gitlab:
    container_name: gitlab
    image: gitlab/gitlab-ce:8.6.8-ce.0
    hostname: gitlab
    dns:
      - ${DNS_SERVER1}
      - ${DNS_SERVER2}
    ports:
      - ${UTIL_HOST}:2222:22
      - ${UTIL_HOST}:8080:80
      - ${UTIL_HOST}:8443:8443
    volumes:
      - ${VOL_DATA}/gitlab/config:/etc/gitlab
      - ${VOL_DATA}/gitlab/data:/var/opt/gitlab
      - ${VOL_LOGS}/gitlab:/var/log/gitlab
    environment:
      GITLAB_OMNIBUS_CONFIG: |
        external_url 'https://gitlab.poc.local'
        gitlab_rails['gitlab_shell_ssh_port'] = 2222
    cpu_shares: 512
    mem_limit: 2G
    security_opt:
      - no-new-privileges
    restart: on-failure:5

##############################################################################
### Jenkins: CI/automation server
### ==========================================================================
## Depends on:
## 1. Host folders: ${VOL_DATA}/jenkins
#
## Ensuring dependencies:
## 1. Creating host folders
# $ mkdir -p ${VOL_DATA}/jenkins
#
## Startup command example:
# $ docker run --name=jenkins --hostname jenkins --detach=true --restart=always \
#     --cpu-shares 512 --memory 2G \
#     --volume=${VOL_DATA}/jenkins:/var/jenkins_home \
#     --publish 10.169.64.245:8080:8080 --publish 10.169.64.245:50000:50000 \
#     jenkins:2.7.3-alpine

  jenkins:
    container_name: jenkins
    image: jenkins:2.7.3-alpine
    hostname: jenkins
    dns:
      - ${DNS_SERVER1}
      - ${DNS_SERVER2}
    ports:
      - ${UTIL_HOST}:8888:8080
      - ${UTIL_HOST}:50000:50000
    volumes:
      - ${VOL_DATA}/jenkins:/var/jenkins_home
    tmpfs:
      - /run
      - /tmp:exec
    cpu_shares: 512
    mem_limit: 2G
    security_opt:
      - no-new-privileges
    restart: on-failure:5
    read_only: false

##############################################################################
### Vault: secure credentials storage
### ==========================================================================
## Depends on:
## 1. Host folders: ${VOL_DATA}/vault/{config,data,ssl}
## 2. Service config file: ${VOL_DATA}/vault/config.hcl
## 3. TLS certificates: ${VOL_DATA}/vault/ssl/vault.{crt,key}
## 4. Services: registry
#
## Ensuring dependencies:
## 1. Creating host folders
# $ mkdir -p ${VOL_DATA}/vault/{config,data,ssl}
#
## 2. Creating service config
# $ cat <<EOT >${VOL_DATA}/vault/config.hcl
# backend "file" {
#   path="/vault/data"
# }
# listener "tcp" {
#   address = "0.0.0.0:8200"
#   tls_disable = 0
#   tls_key_file = "/vault/ssl/vault.key"
#   tls_cert_file = "/vault/ssl/vault.crt"
# }
# EOT
#
## 3. Creating TLS certificates
# $ openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
#   -subj "/C=DE/ST=HE/L=Frankfurt/O=Verizon/OU=Managed Hosting/CN=vault.poc.local/[email protected]" \
#   -keyout ${VOL_DATA}/vault/ssl/vault.key -out ${VOL_DATA}/vault/ssl/vault.crt
#
## Startup command example:
# $ docker run --name vault --detach=true --cap-add IPC_LOCK \
#     --publish ${UTIL_HOST}:8200:8200 --env VAULT_ADDR=https://127.0.0.1:8200 --env VAULT_SKIP_VERIFY=1 \
#     --volume /var/data/vault/config.hcl:/vault/config.hcl \
#     --volume /var/data/vault/data:/vault/data \
#     --volume /var/data/vault/ssl:/vault/ssl \
#     registry.poc:5000/poc/vault server -config /vault/config.hcl
#
## WARNING: after start the vault storage is sealed and must be unsealed
## prior to the first use. The following command must be executed 3 times
## and 3 out of 5 vault keys must be provided.
#
# $ docker exec -it vault vault unseal

  vault:
    container_name: vault
    image: ${REGISTRY}/vault
    depends_on:
      - registry
    command: server -config /vault/config.hcl
    hostname: vault
    dns:
      - ${DNS_SERVER1}
      - ${DNS_SERVER2}
    ports:
      - ${UTIL_HOST}:8200:8200
    volumes:
      - ${VOL_DATA}/vault/config.hcl:/vault/config.hcl:ro
      - ${VOL_DATA}/vault/ssl:/vault/ssl:ro
      - ${VOL_DATA}/vault/data:/vault/data:rw
    environment:
      - VAULT_ADDR=https://127.0.0.1:8200
      - VAULT_SKIP_VERIFY=1
    cpu_shares: 10
    mem_limit: 25M
    security_opt:
      - no-new-privileges
    cap_add:
      - IPC_LOCK
    restart: on-failure:5
    read_only: false

##############################################################################
### influxDB: time-series database
### ==========================================================================
## Depends on:
## 1. Host folders: ${VOL_DATA}/influxdb
## 2. Services: registry
#
## Ensuring dependencies:
## 1. Creating host folders
# $ mkdir -p ${VOL_DATA}/influxdb
#
## Startup command example:
# $ docker run --name=influxdb --detach=true --restart=always \
#     --cpu-shares 512 --memory 1G --memory-swap 1G \
#     --volume=${VOL_DATA}/influxdb:/influxdb --env ADMIN_USER="root" \
#     --publish 8083:8083 --publish 8086:8086 --expose 8090 --expose 8099 \
#     ${REGISTRY}/influxdb

  influxdb:
    container_name: influxdb
    image: ${REGISTRY}/influxdb
    depends_on:
      - registry
    hostname: influxdb
    dns:
      - ${DNS_SERVER1}
      - ${DNS_SERVER2}
    ports:
      - ${UTIL_HOST}:8083:8083
      - ${UTIL_HOST}:8086:8086
    expose:
      - 8090
      - 8099
    volumes:
      - ${VOL_DATA}/influxdb:/var/lib/influxdb
    environment:
      - PRE_CREATE_DB=cadvisor
      - ADMIN_USER=root
    cpu_shares: 512
    mem_limit: 1G
    memswap_limit: 1G
    security_opt:
      - no-new-privileges
    restart: on-failure:5
    read_only: false

##############################################################################
### cAdvisor: container advisor
### ==========================================================================
## Depends on:
## 1. Storage driver defaults (credentials/db name)
## 2. Services: influxdb
#
## Startup command example:
# $ docker run --name=cadvisor --hostname=`hostname` --detach=true --restart=always \
#     --cpu-shares 100 --memory 500m --memory-swap 1G \
#     --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw \
#     --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro \
#     --publish=8080:8080 \
#     google/cadvisor:v0.23.8 -storage_driver=influxdb -storage_driver_db=cadvisor \
#       -storage_driver_host=${INFLUXDB_HOST}:8086

  cadvisor:
    container_name: cadvisor
    image: google/cadvisor:v0.23.8
    command: -storage_driver=influxdb -storage_driver_host=${INFLUXDB_HOST}:8086
    depends_on:
      - influxdb
    hostname: util
    dns:
      - ${DNS_SERVER1}
      - ${DNS_SERVER2}
    ports:
      - ${UTIL_HOST}:18080:8080
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /var/log:/var/log:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    cpu_shares: 100
    mem_limit: 500m
    memswap_limit: 1G
    security_opt:
      - no-new-privileges
    restart: on-failure:5
    read_only: false

##############################################################################
### Grafana: metric analytics and visualization suite
### ==========================================================================
## Depends on:
## 1. Host folders: ${VOL_DATA}/grafana, ${VOL_LOGS}/grafana
## 2. Services: registry, influxdb
#
## Ensuring dependencies:
## 1. Creating host folders
# $ mkdir -p ${VOL_DATA}/grafana ${VOL_LOGS}/grafana
#
## Startup command example:
# $ docker run --name=grafana --hostname grafana.poc --detach=true --restart=always \
#     --cpu-shares 50 --memory 50m --publish=3000:3000 \
#     ${REGISTRY}/grafana:2.6.0
#
## First boot setup for Grafana: creating DSN and adding dashboards
# $ ./setup.sh cadvisor dashboards

  grafana:
    container_name: grafana
    image: ${REGISTRY}/grafana:2.6.0
    depends_on:
      - registry
      - influxdb
    hostname: grafana
    dns:
      - ${DNS_SERVER1}
      - ${DNS_SERVER2}
    ports:
      - ${UTIL_HOST}:3000:3000
    volumes:
      - ${VOL_DATA}/grafana:/var/lib/grafana
      - ${VOL_LOGS}/grafana:/var/log/grafana
    cpu_shares: 50
    mem_limit: 50m
    memswap_limit: 500m
    security_opt:
      - no-new-privileges
    restart: on-failure:5
    read_only: false

##############################################################################
### SonarQube
### ==========================================================================
## Depends on:
## 1. Host folders: ${VOL_DATA}/sonar/{conf,data,temp}, ${VOL_LOGS}/sonar
## 2. Services: registry
## 3. Persistent DB storage: MySql DB instance
#
## Ensuring dependencies:
## 1. Creating host folders
# $ mkdir -p ${VOL_DATA}/sonar/{conf,data,temp} ${VOL_LOGS}/sonar

  sonar:
    container_name: sonar
    image: ${REGISTRY}/sonar:5.6.1
    depends_on:
      - registry
    ports:
      - ${UTIL_HOST}:9000:9000
    environment:
      - SONARQUBE_JDBC_URL=jdbc:mysql://${MYSQLDB_HOST}:3306/sonar?useUnicode=true&characterEncoding=utf8&rewriteBatchedStatements=true
    volumes:
      - ${VOL_DATA}/sonar/conf:/opt/sonarqube/conf
      - ${VOL_DATA}/sonar/data:/opt/sonarqube/data
      - ${VOL_DATA}/sonar/temp:/opt/sonarqube/temp
      - ${VOL_LOGS}/sonar:/opt/sonarqube/logs
    tmpfs:
      - /run
      - /tmp
    cpu_shares: 512
    mem_limit: 2G
    security_opt:
      - no-new-privileges
    restart: on-failure:5
    read_only: false

networks:
  default:
    driver: bridge

We can use the following script to start the required foundation services depending on the host role:

#!/usr/bin/env bash

# enabling strict mode http://redsymbol.net/articles/unofficial-bash-strict-mode/
set -euo pipefail
IFS=$'\n\t'

declare -x HOSTNAME=${HOSTNAME:-$(hostname -s)}

[[ -f .env ]] || { echo "environment settings (.env) missing, exiting..."; exit 127; }

case ${HOSTNAME} in
    util*) yml=util.yml ;;
    wb*)   yml=web.yml  ;;
    *)     echo "unknown system role, exiting..."; exit 127 ;;
esac

docker-compose --file ${yml} config --quiet \
&& docker-compose --file ${yml} up -d \
|| { echo "errors in ${yml}, exiting..."; exit 1; }


Things to keep in mind:
• Although docker-compose starts services in the proper order, defined via depends_on statements, it does not check whether service startup has completed and the service is able to perform its function. For example, services depending on the image registry may fail to start if they attempt to fetch their images before the registry service is fully up and running. The Compose developers claim that checking service startup completion and health is outside Compose's remit and must be performed using external tools. Indeed, several solutions have been proposed; they are far from elegant, but they do the trick. For example, see https://github.com/vishnubob/wait-for-it/ (a minimal readiness probe is sketched after this list);
• The compose definition is static and does not allow conditions or parameters. The only way to inject variables into the definition at the moment is via environment variables. The .env file located in the current folder (by default; another file may be specified via command-line options) is sourced by docker-compose, and variables defined there may be used in the compose definition (an example .env is sketched below);
• Not all Docker runtime options are implemented by the compose specification yet. Hopefully it is only a matter of time until the compose specification catches up and provides the missing definitions.
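For reference, a minimal .env covering the variables used in the compose definition above; all values here are example assumptions and must be adapted to the actual environment:

VOL_DATA=/var/data
VOL_LOGS=/var/log/containers
UTIL_HOST=10.169.64.245
DNS_SERVER1=10.169.64.10
DNS_SERVER2=10.169.64.11
REGISTRY=registry.poc:5000/poc
INFLUXDB_HOST=10.169.64.245
MYSQLDB_HOST=10.169.64.246

And a crude readiness probe for the registry service, in the spirit of wait-for-it, that could be run before starting dependent services:

# block until the registry answers on its v2 API endpoint
until curl --silent --fail --cacert ${VOL_DATA}/registry/certs/registry.crt \
        https://registry.poc:5000/v2/ >/dev/null; do
    echo "waiting for registry..."; sleep 2
done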

 

Base OS Image
Often underappreciated or even overlooked, the Base OS selection question is one of the most important topics that must be addressed early, at the design stage, and here is why.

The OS Image Inside Container
In theory you may use any Linux-based OS and "it will run" inside a container. There is a huge difference between "just run" and "optimally run", though. Not to mention that the OS image packaged inside a container is not really "running" inside it; rather, it provides the runtime and dependencies for the applications packaged inside the container, which are still executed on top of the host OS kernel.

Basically, inside the container you need only your application and its dependencies. Indeed, just take a look at the busybox or cadvisor container images. They don't package any OS or libraries along with their statically linked binaries. Unfortunately, not all applications can be compiled statically, and in some cases it does not even make sense to attempt it. The objective is to optimally package applications in containers, not to re-develop and re-build them from scratch.

This is where modern OS package managers come to help. In theory, if you deploy an RPM or DEB package it shall bring all its dependencies along. While this holds true, very soon you will realize that those dependencies have their own dependencies, which have dependencies too, and so on. In reality, a single package may have a long tail of dependencies you'd never think of. At the end of the day you end up installing hundreds of binaries, libraries and files in order to satisfy the dependencies of your single required package. Still, this will get you to a better place than just using the vanilla OS image of your choice as your base image.

Actually, you can approach this problem from both directions: either choose a standard, vendor-provided base OS image and remove unnecessary packages and files, or start from a blank image and add only the required packages and files. Both approaches require expert knowledge of Linux OS innards, though.


What if you don't care about removing unneeded stuff and making your container images very lean? After all, storage is quite cheap these days. Well, consider this math:
• A standard Linux OS distribution put into a container will take ~600-700MB in size;
• Add a J2EE application or some other runtime and you'll quickly hit the 1GB threshold;
• It's not uncommon to see container images 1-2+ GB in size;
• For every one of your 20-30 (or more?) apps you'll need another image.

Real-life experience shows that those numbers multiply rather quickly, and you won't notice how your container images come to occupy hundreds of gigabytes. Even with the effective, layered file-system employed by Docker, image sprawl will sooner or later take its tax.

Now, let's take pretty much any Linux OS image with the minimal package set installed. What do we get inside: a kernel, tons of drivers, a couple of shells, several scripting languages, documentation and man pages, application supervisors and startup scripts, etc. Do you really need this stuff inside each and every container? Do you need to store, multiply and carry along hundreds or thousands of copies of this unneeded stuff, absolutely irrelevant to your application and its functions? Sure, a layered or COW container file-system can help to an extent, but don't expect it to solve the issue for you.

One vs. Multiple Applications
Another, almost religious subject: is a container supposed to run a single binary or multiple processes? There is a part of the community which has taken the notion of micro-services and decomposition literally and attempts to package every single binary into a dedicated container. This results in overly complex orchestration mechanisms, integration and performance issues. Are there any benefits? Not really. Another part of the community attempts to package everything inside a single container, basically thinking of the container as a "lightweight VM" - a flawed notion spread by multiple analysts. Heck, they even run SSH inside the container to log in and update packages inside. Obviously, this completely defeats the purpose of containers.

The right approach is a midway: you can have multiple binaries or applications packaged inside a single container as long as they contribute to the same business function and further decomposition does not bring any tangible benefits. A good example is the web container. Does it make sense to have one container running just Apache and another one running PHP-FPM? Technically it's doable, but what benefits will it give us, not counting the added complexity? There are none. Hence, Apache and PHP-FPM made it into the same container, and since their inter-process communication does not traverse the container border, we won't experience the increased latency that would surely bite us sooner or later otherwise.

Process Supervisor
Now, one might ask: if there are multiple processes inside a single container, how will they start and who will supervise them? This naturally leads us to another important aspect - process supervision inside the container.

Can we use the existing mechanisms provided by a modern Linux OS? Not really; they were designed for a different purpose and are often real overkill for our needs. So there is a need for a lightweight process supervisor that will take care of process startup and, even more importantly, the process lifecycle too.

Now it will get a bit spooky… Someone or something has to reap zombies.

We won't go into the depths of the Unix process lifecycle; we'll just mention that there is a special process in a Unix OS, often called the init or PID 1 process. One of its tasks is to take care of orphaned processes


(those having no parent) and, if needed, adopt them and manage their lifecycle. If no one takes care of a process after it dies, it becomes a zombie. Zombies can't be killed, obviously, since they are already dead. Over time, if not paid attention to, they can turn system management into a horror movie… Ever seen a system with thousands of zombie processes?

In light of this, it's a very good idea to have a process supervisor even if your container runs a single process - unless this single process is able to properly handle all signals sent to the container and does not fork child processes.

Quick Summary
Let's summarize what we learned above:
• Statically linked, self-contained applications don't require any OS at all. They are simple, lean, effective and come close to the dreams and promises of a micro-service architecture;
• Using a whole Linux distribution as the base image for your containers is BAD! It may work "in vitro", but it is a rather bad idea to use it "in vivo";
• Using package managers is good, since they take care of dependencies; however, beware the long tail of dependencies;
• It's not bad to have multiple processes and applications inside one container, as long as they fulfill or support the same business function;
• Processes in a container do require a supervisor to pass signals and manage children and orphans! If you don't supervise them, beware the side effects…

Also, take a look at the following pages and articles:
https://phusion.github.io/baseimage-docker/#intro
https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/
https://www.ctl.io/developers/blog/post/optimizing-docker-images/
http://jonathan.bergknoff.com/journal/building-good-docker-images
https://www.dajobe.org/blog/2015/04/18/making-debian-docker-images-smaller/

So, what are the viable options? The community has reacted to most of the issues outlined above and explored various options, ranging from stripped-down OS images made out of existing popular Linux distributions to new, purpose-built distributions created to support the specific needs of containerized applications:

• Alpine Linux https://github.com/gliderlabs/docker-alpine
• Phusion https://github.com/phusion/baseimage-docker
• Debian Slim (internal image repository)

This POC project has explored various options, and all images have been built using either the Alpine Linux base image or a custom Debian Slim image. Debian Slim is basically Debian 8.1 stripped down to <50MB, yet still a fully functional Debian Linux distribution. Besides removing unneeded packages and files without breaking any functionality, several configurations and optimizations have been put in place so that this image runs well in a container environment.
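The exact recipe for the internal Debian Slim image is not covered here; the following is merely a sketch of the general technique, building a minimal Debian 8 root file-system with debootstrap and pruning documentation (paths, mirror and the resulting tag are examples):

# build a minimal Debian 8 (jessie) root file-system
debootstrap --variant=minbase jessie ./rootfs http://deb.debian.org/debian

# prune documentation and man pages to shrink the image further
rm -rf ./rootfs/usr/share/doc ./rootfs/usr/share/man

# import the tree as a base image
tar -C ./rootfs -c . | docker import - debian-slim:8.1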

Talking of process supervisors, there are multiple choices available too:
• S6 Supervisor https://github.com/just-containers/s6-overlay
• Supervisord http://supervisord.org/
• Dumb-Init https://github.com/Yelp/dumb-init
• My_init https://github.com/phusion/baseimage-docker/blob/master/image/bin/my_init

 


The best experience has been had with the S6 supervisor, and it has been used for most custom-built images during the POC project.

Things to keep in mind:

• Unfortunately,  there  is  no  single  best  answer,  when  it  comes  to  choosing  the  base  image  for  your   containers   images.   It   will   depend   a   lot   on   specific   project   requirements,   available  experience  and  organizational  practices  and  policies.  Nonetheless,  it  should  be  clear  by  now  that   base   image   selection   topic   worth   a   separate   discussion   and   must   be   given   careful  consideration;  

• The   same   goes   for   container   supervisor.   There   are  number   of   good   solutions   and  one  of  them  has  to  chosen  and  used  as  a  standard  solution  across  all  container  images;  

• A base image is a living, dynamic construct. It must be adjusted, rebuilt and modified as requirements, security needs and other factors change. Therefore, the image build pipeline should be set up as a Continuous Delivery workflow, and all images should be rebuilt continuously and automatically;

• Creating your own base image is not a one-time task. It has to be maintained and managed over time: the image must be rebuilt with updated and patched packages, and, obviously, all dependent images have to be rebuilt as well.

 

Storage Scalability in Docker
Historically, the Docker project relied upon the AUFS file-system. Unfortunately, AUFS never made it into the upstream Linux kernels; as a result, it is unsupported by the Red Hat OS family and disabled by default on Ubuntu Linux. This leaves us with the following options (a quick way to inspect and switch the active driver is sketched right after the list):

• Device-Mapper loopback (or loop-lvm) – the default configuration for RHEL;
• Device-Mapper (or direct-lvm) – the configuration recommended by Red Hat;
• BTRFS – Docker's upstream default;
• OverlayFS – considered experimental by Red Hat;
• ZFS – just gaining popularity and traction in the container community.
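
As a quick orientation, the hedged sketch below shows how to inspect the active storage driver and how to switch it; overlay is chosen purely for illustration. Note that images and containers created under one driver are not visible after switching to another:

    # show the active storage driver and its backing details
    docker info | grep -A3 'Storage Driver'

    # switch the driver via a daemon flag (Docker 1.x style)...
    docker daemon --storage-driver=overlay

    # ...or, on newer daemons, via /etc/docker/daemon.json:
    # { "storage-driver": "overlay" }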

More details and an in-depth comparison can be found in the following articles:

• Red Hat Developers Blog: Comprehensive Overview of Storage Scalability in Docker – http://developerblog.redhat.com/2014/09/30/overview-storage-scalability-docker/

• The Deep Dive into Docker Storage Drivers – http://jpetazzo.github.io/assets/2015-07-01-deep-dive-into-docker-storage-drivers.html

• Docker:  Understanding  images,  containers  and  storage  drivers  https://docs.docker.com/engine/userguide/storagedriver/imagesandcontainers/    

Below is a quick summary of those storage options. The good news: there are multiple options to choose from. The not-so-good news: as is often the case, there is no single best choice, and each option has its strong and weak sides. Nonetheless, some general guidance can be given:

• If you run PaaS or another high-density environment – OverlayFS is your best option;
• If you put big writable files on the CoW file-system – something you should not be doing, really – Device Mapper, BTRFS or ZFS would be the right choice;
• If memory is scarce or limited, ZFS should be avoided;
• If you want a mature file-system backend with little or no maintenance – Device Mapper, or Direct-LVM to be precise, is the way to go.


This leaves us with two storage backend candidates for the hosting platform: OverlayFS and Direct-LVM. At the time of writing, Red Hat still considers OverlayFS an experimental technology and recommends using Direct-LVM for production workloads.

Things to keep in mind:

• OverlayFS requires a so-called backing file-system. Both EXT4 and XFS can be used for this, however, XFS is the clear winner in terms of performance (a preparation sketch follows this list);

• OverlayFS uses a shared page cache, which makes this file-system a clear winner in terms of memory usage and performance;

• Red Hat is actively developing and contributing to the OverlayFS project (as it does to BTRFS and Direct-LVM);

• The POC project explored both the Direct-LVM and OverlayFS options, and OverlayFS is the clear winner from both a performance and an operations perspective;

• OverlayFS2 has not been tested; it requires a 4.x kernel, which RHEL does not provide;
• Red Hat performance engineers are calling OverlayFS a winner in terms of performance. There must be some reasons, such as SELinux support on OverlayFS, that keep Red Hat from making this file-system the preferred choice; after multiple enquiries, Red Hat did not provide any answers.
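
To make the backing file-system point concrete, below is a minimal, hedged sketch of preparing an XFS volume for the overlay driver; the device name /dev/xvdc is hypothetical. Overlay needs the backing XFS to be created with d_type support (ftype=1):

    # format the backing volume with d_type support, required by overlay
    mkfs.xfs -n ftype=1 /dev/xvdc
    mount /dev/xvdc /var/lib/docker

    # start the daemon with the overlay driver on top of it
    docker daemon --storage-driver=overlay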

   

Loop LVM
This is the default configuration in the Red Hat OS family. A quick summary:

• By default, Docker puts data and metadata on a loop device backed by a sparse file;
• This is great from a usability point of view (zero configuration needed);
• But terrible from a performance point of view: each time a container writes to a new block, a block has to be allocated from the pool, and when it's written to, a block has to be allocated from the sparse file. Sparse file performance isn't great anyway.

This option is best suited for getting-started trials and sandbox labs for becoming familiar with Docker and containers. For anything more demanding in terms of performance, use the other options.

Direct-LVM
The same idea, just using real devices for data and metadata:

• Each container gets its own block device with a real file-system on it;
• So you can also adjust (with --storage-opt):
o file-system type;
o file-system size;
o discard (if you use an SSD);
• Caveat: when you start 1000 containers, the files will be loaded from disk 1000 times!

Although Red Hat claims this is the best option for running your containers today, in reality it is not. You get good, but not the best, performance, and often you cannot remove a stopped container because its devices are still mounted. Whether or not that is just a matter of a longer time-out when stopping containers, prepare for additional hassle. A minimal setup sketch follows.
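
For reference, a hedged sketch of a direct-lvm setup along the lines of the Red Hat/Docker documentation of the time; the spare block device /dev/xvdb and the pool sizing are assumptions:

    # carve an LVM thin pool out of a spare block device
    pvcreate /dev/xvdb
    vgcreate docker /dev/xvdb
    lvcreate --wipesignatures y -n thinpool docker -l 95%VG
    lvcreate --wipesignatures y -n thinpoolmeta docker -l 1%VG
    lvconvert -y --zero n -c 512K \
      --thinpool docker/thinpool --poolmetadata docker/thinpoolmeta

    # point the daemon at the thin pool
    docker daemon --storage-driver=devicemapper \
      --storage-opt dm.thinpooldev=/dev/mapper/docker-thinpool \
      --storage-opt dm.use_deferred_removal=true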


BTRFS
Considered by many to be the most natural fit for Docker. It meets the basic requirement of supporting CoW (copy-on-write), it performs moderately well, and it is being actively developed. BTRFS does not currently support SELinux, nor does it allow page cache sharing. To summarize:

• BTRFS works by dividing its storage into chunks;
• A chunk can contain data or metadata;
• You can run out of chunks (and get "No space left on device"), even though df shows space available, because the chunks are not full. So this file-system is not exactly maintenance-free (see the sketch after this list);
• There is not much to tune.
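
A short illustration of that maintenance chore, assuming /var/lib/docker lives on the BTRFS volume: when chunks run out, a rebalance reclaims the half-empty ones:

    # compare chunk allocation with actual usage
    btrfs filesystem df /var/lib/docker

    # rewrite data chunks that are less than 50% full, freeing the rest
    btrfs balance start -dusage=50 /var/lib/docker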

Once the BTRFS shortcomings are addressed and the file-system matures a bit, it may become an appealing option. For now, it is not really ready for prime time.

OverlayFS
OverlayFS, or simply overlay, is a modern union file-system that also meets the basic Docker requirements and may be considered a successor to AUFS. In short, OverlayFS combines a lower (parent) file-system, an upper (child) file-system and a workdir (on the same file-system as the child). The lower file-system is the base image; when you create a new Docker container, a new upper file-system is created containing the deltas. A manual mount sketch follows the summary below.

A quick summary:

• Identical files are hard-linked between images. This avoids doing composed overlays;
• It is very fast;
• It allows for page cache sharing;
• Not much to tune at this point, however, OverlayFS2 provides more options;
• Performance should be slightly better than AUFS:
o no stat() explosion;
o good memory use;
o copy-up is still slow!
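
To illustrate the lower/upper/workdir layout that Docker manages internally, here is a hand-rolled overlay mount; all directory names are made up for the example:

    mkdir -p /tmp/lower /tmp/upper /tmp/work /tmp/merged

    # the merged view combines the read-only lower layer with the writable upper layer
    mount -t overlay overlay \
      -o lowerdir=/tmp/lower,upperdir=/tmp/upper,workdir=/tmp/work \
      /tmp/merged

    # files from /tmp/lower show through; writes land in /tmp/upper as deltas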

Looking at the past two years of development, OverlayFS has gained a lot of traction and attention. There is also the OverlayFS2 driver, supported by 4.x+ kernels, with even better performance and additional features. All in all, the future of this storage option looks bright.

ZFS
We will not spend time recounting all the virtues of ZFS. Long story short: it is one of the best file-systems out there. Still, your mileage may vary when you employ it specifically for containers. A quick summary:

• From an operations perspective, somewhat similar to BTRFS;
• Fast snapshots, fast creation, compression...
• Epic performance and reliability, though your mileage may vary;
• ZFS has a reputation for being quite memory hungry – you have to pay for those nifty features somehow. You probably don't want that in a tiny VM;
• Setup and operation may require specific expertise (a minimal setup sketch follows this list).
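
For completeness, a hedged sketch of a ZFS-backed daemon; the pool name and the device /dev/xvdb are assumptions:

    # create a pool on a spare device and mount a dataset where Docker expects its data
    zpool create -f zpool-docker /dev/xvdb
    zfs create -o mountpoint=/var/lib/docker zpool-docker/docker

    # start the daemon with the zfs storage driver
    docker daemon --storage-driver=zfs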

Despite ZFS's reliability and features, one needs to consider the bigger picture and the trade-offs involved in using this great file-system for containers.


Conclusion
The presented research cannot be seen as the ultimate guide to building your hosting platform; however, it does explore the various options and provides an architecture blueprint, thus laying a solid foundation for a production-ready, productized solution.

This research started with exploring containerization technology, its features and its readiness for supporting various workload types. The research then continued by exploring DevOps practices for implementing and supporting the application lifecycle, from the development stage all the way to production deployment. All life-cycle steps were automated and made repeatable by using standardized workflows and naming schemes. It was then time to transition from a bunch of scripts to an orchestrated platform built out of well-defined services.

The proof-of-concept project demonstrated that a PaaS solution employing containers for workload packaging and isolation is a very good fit for web application hosting requirements. That said, nothing speaks against using this platform for hosting other types of workloads and applications that can be decomposed into standalone components or services, namely micro-services based applications. The platform itself follows the same design paradigm and is composed of multiple loosely coupled services with well-defined interfaces.

Special attention was paid to Drupal CMS deployment, configuration and publishing workflows. The main objective was to demonstrate how the presented platform could be used for hosting real-life applications and for implementing a Drupal-as-a-Service solution.

It is expected that this work will serve as a foundation for new products and solutions, while at the same time providing an extensive overview and educational material for a broad audience.