Copyright 2016 All Rights Reserved. Not for disclosure without written permission.
Website in a Box
The Next Generation Hosting Platform
Slava Vladyshevsky, Alex Kostsin
Table of Contents

PLATFORM OVERVIEW
    INFRASTRUCTURE OVERVIEW
    NETWORK SETUP OVERVIEW
PLATFORM USER ROLES
PLATFORM COMPONENTS
    PLATFORM SERVICES
        Stats Collector
        Stats Database
        Image Registry
        Image Builder
        Deployment Service
        Container Provisioning Service
        Reporting Service
        Persistent Volumes
        Volume Sync-Share Service
        Persistent Database Storage
            Database Driver
            Percona XtraDB Cluster Limitations
        Secure Storage
        Identity Management Service
        Load-Balancer Service
        SCM Service
        Workflow Engine
        SonarQube Service
        Sonar Database
        Sonar Scanner
    PLATFORM INTERFACES
        API Endpoints
        Command Line Interfaces
            Platform CLI
            Docker CLI
        Web Portals
            Stats Visualization Portal
            GitLab Portal
            Sonar Portal
            Platform Orchestration Portal
    OTHER COMPONENTS
        Docker Engine
        Docker Containers
PLATFORM CAPACITY MODEL
PLATFORM SECURITY
    USER NAMESPACE REMAP
    DOCKER BENCH FOR SECURITY
    WEB APPLICATION SECURITY
PLATFORM CHANGE MANAGEMENT
DRUPAL HOSTING
    DRUPAL SITE COMPONENTS
    DRUPAL CONTAINER COMPONENTS
    DRUPAL CONTAINER PERFORMANCE
        Sizing Considerations
        Apache vs. NGINX
        Performance Test
        Process Size Conundrum
    DRUPAL PROJECT CREATION
    DRUPAL WEBSITE DEPLOYMENT
        Web Project Deployment
        Web Container Deployment
        Website Deployment Workflow
    EDITORIAL WORKFLOW
    CONTENT PUBLISHING
ACTIVE DIRECTORY STRUCTURE
GITLAB REPOSITORY STRUCTURE
MANAGEMENT TASKS AND WORKFLOWS
PLATFORM STARTUP
BASE OS IMAGE
    THE OS IMAGE INSIDE CONTAINER
    ONE VS. MULTIPLE APPLICATIONS
    PROCESS SUPERVISOR
    QUICK SUMMARY
STORAGE SCALABILITY IN DOCKER
    LOOP LVM
    DIRECT-LVM
    BTRFS
    OVERLAYFS
    ZFS
CONCLUSION

Figure Register

Figure 1 - Infrastructure Diagram
Figure 2 - Foundation Infrastructure Diagram
Figure 3 - High-level Network Diagram
Figure 4 - Platform Components
Figure 5 - cAdvisor Web UI: CPU usage
Figure 6 - InfluxDB Web Console
Figure 7 - Image Builder UI
Figure 8 - Sonar Project Dashboard
Figure 9 - Sonar Issue Report
Figure 10 - Stats Visualization and Analysis Portal
Figure 11 - GitLab Portal
Figure 12 - Sonar Portal
Figure 13 - Platform Orchestration Portal
Figure 14 - Platform Capacity Model
Figure 15 - Drupal CMS: Configuration Portal
Figure 16 - Drupal Site Components
Figure 17 - Web Container Components
Figure 18 - Stress Test Results
Figure 19 - Drupal Project Creation Process
Figure 20 - Drupal Project Deployment Process
Figure 21 - Website Deployment Workflow
Figure 22 - Editorial Workflow
Figure 23 - Content Publishing Process
Figure 24 - Example: MS Active Directory Structure
Platform Overview
This document provides an in-depth overview of the Proof of Concept project, hereinafter POC, for container-based LAMP web hosting. The POC project was performed to verify technical feasibility and architectural assumptions, as well as to demonstrate to the prospective customer our expertise in this domain. It is assumed that this project, or parts of it, will be adopted and productized. No clear requirements were provided; therefore, the overall design and architectural decisions have been mostly governed by the following assumptions:
• The platform must provide fully managed website placeholders that will be populated with customer-provided code and assets;
• The platform must provide a LAMP (Linux, Apache, PHP, MySQL) run-time environment;
• The platform architecture must be similar to the existing Windows hosting platform;
• The platform must guarantee high availability for production workloads;
• The platform must prevent the noisy-neighbor effect, i.e. websites sharing the same infrastructure must not impact each other's performance;
• The platform must support different website sizes and resource allocation profiles;
• The platform must guarantee resources and be able to report on their usage.
From the early project stages it has been assumed that the hosting platform will utilize the Linux container technology popularized by Docker and often referred to as Docker Containers. Docker is an obvious fit for such a hosting platform, since Docker Containers:
• Allow for much higher workload density than VMs;
• Provide sufficient workload isolation and containment;
• Enable granular resource management and reporting;
• Are considered the future of PaaS.
Soon it became apparent that much more than Docker alone is required to support the platform requirements, and that additional services and components are essential for providing reliable hosting services. Over time, the set of Docker containers and the bunch of scripts to manage them evolved into a real platform with well-defined services, components and interfaces between them. Operational procedures and workflows have been automated and exposed via different interfaces to enable future integration and instrumentation. The platform architecture, design approach and processes rely heavily on the Twelve-Factor App principles. For more details see https://12factor.net/.
Infrastructure Overview
Originally, the platform was built on top of a Kubernetes cluster for simplified container scheduling and orchestration. Due to the lack of expertise in the Support Organization and little acceptance within the account team, this approach was discontinued, and the Platform Infrastructure setup followed the existing Windows hosting platform architecture as closely as possible.
Figure 1 - Infrastructure Diagram
The POC farm infrastructure mimics the existing web-farm setup for Windows hosting:
• All inbound network traffic passes through the CDN/WAF;
• The network is split into two security zones: DMZ and TRUST;
• The front-end services and service components are hosted in the DMZ subnet;
• The back-end and secured components are located in the TRUST subnet;
• Coming from the CDN/WAF, network traffic passes through firewalls and load-balancers;
• Production HTTP/S VIPs pass traffic to an HA pair of web instances;
• Other HTTP/S VIPs, e.g. Staging, pass traffic to a single end-point;
• The TRUST subnet contains DB servers: a cluster for production workloads and a single instance for staging use;
• All platform services and components run in corresponding containers, with the exception of DB instances, which run directly on the host OS.

There is an additional shared farm, the so-called Utility or Foundation farm, one per DC, where various utility services shared across multiple farms and websites are hosted. For production deployments it may be beneficial, from a security standpoint, to place some foundation services into the TRUST subnet.
Figure 2 - Foundation Infrastructure Diagram
It is envisioned that the existing foundation farm will need to be extended with at least two additional systems to provide the required foundation services. This assumes that the rest of the existing foundation services, such as Active Directory, DNS, SMTP, NTP, …, will be shared with the new platform.
Network Setup Overview
The diagram below shows a logical view of the hosting network structure. It's worth mentioning that, besides the TRUST and DMZ VLANs, Docker adds one more layer of indirection by creating at least one network bridge per Docker host to pass traffic between containers and the external world. A number of solutions have emerged over the past couple of years, bringing SDN and network virtualization capabilities to the container ecosystem. During this POC project we won't explore these network abstraction solutions and will use the standard network stack provided by Docker.
Figure 3 - High-level Network Diagram
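For a quick look at this extra layer on any given host, the bridge Docker creates can be inspected with standard tooling (docker0 is Docker's default bridge name; a customized setup may use additional bridges):

# listing Docker-managed networks and inspecting the default bridge interface
$ docker network ls
$ ip addr show docker0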
Platform User Roles
The user role definition is tightly bound to its scope. The following scopes are defined:
• Platform Scope – platform-wide scope, including all hosted organizations and applications;
• Organization Scope – includes organization-owned objects and applications;
• Application Scope – includes objects and components pertinent to a given application.
Specific user roles and their mapping will be dictated by the particular use-case and the processes accepted within the hosting organization. For the sake of simplicity, we'll assume the following major roles defined in the scope of the proposed hosting platform:
• Authorized User – a user that has passed authentication and has been assigned the corresponding permissions:
  o Administrator – a management user performing administration tasks;
  o Developer – an individual writing and testing code;
  o Content Manager – an editor, an individual authoring and managing the website content;
• Anonymous User – a website visitor coming from the public Internet.
The Identity Management (IdM) Service performs the mapping between a user identity and its associated roles. This is implemented using LDAP grouping mechanisms.
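As a minimal illustration of the grouping approach (a generic LDAP sketch with hypothetical DNs and names; the actual directory is Active Directory, where object classes and attribute names differ), a user's roles can be resolved by searching for the groups that list that user as a member:

# hypothetical query: find all role groups that include user jdoe
$ ldapsearch -x -H ldaps://idm.poc -D "cn=reader,dc=poc" -W \
    -b "ou=groups,dc=poc" \
    "(&(objectClass=groupOfNames)(member=uid=jdoe,ou=users,dc=poc))" cn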
Things to keep in mind:
• User roles depend on the scope, e.g. one user may be a Developer in one organization and act as a Content Manager in another. While this is possible, such cross-organization role assignments are generally discouraged;
• One may differentiate Platform User Roles from Application User Roles for the applications deployed on the platform. However, both user types are authenticated and authorized using the same IdM Platform Service, so in practice there is no real difference. For example, Drupal user roles are a subset of the platform user roles;
• Both the applications and the Platform currently use the IdM Service; however, this is not a mandatory requirement, and additional or alternative authentication mechanisms may be used. For example, many Platform services keep a local user database and local administrative accounts in order to be able to act autonomously in case of an IdM Service failure or other issues;
• A website visitor is not required to pass authentication and is granted the Anonymous User role by default.
Platform Components
Below is a high-level diagram of the Platform components. Connectors depict the major¹ communication channels and interactions between services and may generally be read as "using" statements. The dotted-line connectors show alternative paths.
Figure 4 - Platform Components
¹ Major refers to the fact that some dependencies are not shown to avoid diagram clutter. E.g. pretty much all platform components depend on Persistent Volumes, and this is not depicted here.
Different components are marked with different colors to differentiate their types:
• Red components are administrative or management portals;
• Yellow components are Platform Services, generally speaking – containers;
• Blue components are development portals;
• Grey components are general-purpose platform building blocks;
• Green components are hosted website instances the user interacts with.

The following platform Actors are defined:
• Admin – platform administrator;
• Dev – website developer;
• Website User – both content manager and public Internet user.
Platform Services
Below is a short overview of the Platform Services. For every service, it provides a description of its role and dependencies, as well as configuration and usage examples. The service startup instructions in this chapter are provided for demonstration purposes only. Normally, services are expected to boot in an automated manner, for example using Docker Compose scripts. By using Compose we can ensure repeatable and consistent configuration, as well as reliable service startup and recovery. See the Platform Startup chapter for additional details.
Stats Collector
The Platform Stats Collector is a stateless service implemented as a container running on every Docker host. It collects the resource usage stats exposed by the Docker Engine using the Google cAdvisor application (https://github.com/google/cadvisor). To quote the project page: "cAdvisor (Container Advisor) provides container users an understanding of the resource usage and performance characteristics of their running containers. It is a running daemon that collects, aggregates, processes, and exports information about running containers. Specifically, for each container it keeps resource isolation parameters, historical resource usage, and histograms of complete historical resource usage and network statistics. This data may be exported either by container or machine-wide. cAdvisor has native support for Docker containers and should support just about any other container type out of the box."

The current setup assumes that the Stats Collector uses the Stats DB service for storing the metrics collected from the Docker Engine. Therefore, the Stats Collector depends on the Stats DB service and the Docker Engine APIs and must be deployed and booted accordingly. Alternatively, it's possible to use https://github.com/kubernetes/heapster for stats aggregation and resource monitoring in more complex deployments, or to query the Docker API directly if more control or flexibility is required.

Although cAdvisor instances may be accessed directly and provide a Web UI for metric visualization, the more practical approach is to export the collected stats to an external database that may be used for arbitrary data aggregation, reporting and analysis tasks. cAdvisor provides multiple storage drivers out of the box. The current implementation uses the InfluxDB time-series database for storing the collected measurements. Below is an example of the chart produced by cAdvisor at runtime. It has quite limited practical use, if any, and is provided just for reference purposes.
Figure 5 - cAdvisor Web UI: CPU usage
Below is an example command for running the cAdvisor container:
$ docker run --name=cadvisor --hostname=`hostname` --detach=true --restart=always \
    --cpu-shares 100 --memory 500m --memory-swap 1G --userns=host --publish=8080:8080 \
    --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw --volume=/sys:/sys:ro \
    --volume=/var/lib/docker/:/var/lib/docker:ro \
    google/cadvisor:v0.24.0 \
    -storage_driver=influxdb -storage_driver_db=cadvisor -storage_driver_host=${INFLUXDB_HOST}:8086 \
    -storage_driver_user=${INFLUXDB_RW_USER} -storage_driver_password="${INFLUXDB_RW_PASS}"
cAdvisor is still an evolving project and, unfortunately, has its own shortcomings; for example, it only accepts configuration values via command-line options. Neither configuration files nor ENV variables are currently supported. One issue directly following from this is that the DB credentials are passed as command-line parameters in clear text and can be seen in the process list. There are several things to keep in mind:
• Unless the default database scheme and credentials are used, they must be provided as storage driver parameters as well. The database scheme must be created prior to storing collected metrics;
• cAdvisor does not store collected metrics for more than 120 seconds by default. Therefore, if the database connection is interrupted, the resource metrics are lost. Depending on your specific environment setup and requirements, it may be a good idea to review and adjust the default buffering and flushing settings;
• A more or less obvious observation: the more containers running on the host, the more resources cAdvisor will consume and the more traffic will flow between the cAdvisor instance and the storage backend. Consequently:
  o It's a good idea to limit cAdvisor resource usage to avoid impacting production workloads. On the other hand, pulling the belt too tight may have adverse effects on metrics collection itself. The constraints provided in the example above are for demonstration purposes only and must be adjusted for the specific setup and environment;
  o For busy hosts with high container density it's recommended to adjust the cAdvisor buffering, caching and flushing parameters for the best performance. For example, cAdvisor collects metrics during a 1-minute time frame and flushes them in a single transaction; in certain scenarios increasing this time frame may improve performance without impacting monitoring granularity (see the sketch after this list);
• cAdvisor requires elevated permissions (--userns=host), since it accesses some objects in the Docker host namespace;
• The cAdvisor project does not enforce security by default, which leaves us with three possible options for running this service. All these options have been explored during the POC project and provide different balances between security and complexity:
  o Insecure: using default credentials for the storage driver; no additional options required;
  o Kind-of-secure: providing storage driver credentials as command-line parameters, so they will show up in the process list;
  o Secure: creating a custom build and image for cAdvisor that will handle and pass credentials securely;
• It's unlikely that the cAdvisor Web UI itself is going to be used for production deployment monitoring, therefore it's recommended to avoid publishing the cAdvisor Web UI ports;
• cAdvisor, being part of the Kubernetes project, is evolving quickly and new versions appear quite often. Although a common practice is to use the "latest" image version, it's recommended to standardize on and run a specific cAdvisor version across all deployments for consistent and predictable behavior and results.
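As a sketch of such tuning (the values are illustrative, not recommendations, and flag availability should be verified against the cAdvisor version in use), the relevant options can be appended to the run command shown earlier:

# illustrative only: sample container stats every 5s instead of every 1s, and
# buffer measurements for 2 minutes before committing them to InfluxDB in one transaction
$ docker run ... google/cadvisor:v0.24.0 \
    -storage_driver=influxdb -storage_driver_db=cadvisor \
    -housekeeping_interval=5s \
    -storage_driver_buffer_duration=2m0s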
Stats Database
All metrics gathered by the Stats Collector service are passed to and persisted by the Stats Database service. This service is implemented as a Docker container located on the utility host in the foundation farm and running the InfluxDB time-series database (https://github.com/influxdata/influxdb). Depending on specific requirements, different storage back-ends may be used in place of InfluxDB. The choice has been made in favor of InfluxDB for the following reasons:
• Simple and self-contained database without external dependencies;
• Purpose-made database for time-series metric storage and querying;
• Supported by and integrated into many modern deployment stacks and platforms;
• Provides several storage engines geared towards real-time data processing;
• REST API driven for management, data ingestion and processing alike;
• Supports the SQL-like InfluxQL language for querying the database;
• Provides flexible controls and data retention policies;
• Scalable and supports clustering.
The Stats Database service indirectly depends on the Image Registry service, since its image is pulled from the registry by the Docker Engine during the service container startup. Other than that, assuming a standalone (non-clustered) deployment, the Stats Database service is self-sufficient. It is used by other services and components such as:
• The Stats Visualization portal – queries the Stats Database for visualized resource metrics;
• The Reporting service – queries the Stats Database for compiling various usage reports;
• The Stats Collector – periodically stores measurements in the Stats Database.

InfluxDB also provides a web console for basic management and querying operations.
Figure 6 - InfluxDB Web Console
Here is an example of running an InfluxDB container:
$ docker run --name=influxdb --detach=true --restart=always \
    --cpu-shares 512 --memory 1G --memory-swap 1G \
    --volume=${VOL_DATA}/influxdb:/influxdb --publish 8083:8083 --publish 8086:8086 \
    --expose 8090 --expose 8099 \
    --env ADMIN_USER="root" --env PRE_CREATE_DB=cadvisor \
    ${REGISTRY}/influxdb
In some cases there may be a need for separate user accounts with varying access levels. A user with write permissions may be used for storing stats in the DB, and a read-only user may be used for reporting and monitoring activities. Let's create users with read and write permissions:
$ cat <<"EOT" | docker exec -i influxdb /usr/bin/influx -username=root -password=root -path -
CREATE DATABASE cadvisor
CREATE USER writer WITH PASSWORD '<writer password>'
CREATE USER reader WITH PASSWORD '<reader password>'
GRANT WRITE ON cadvisor TO writer
GRANT READ ON cadvisor TO reader
EOT
Now, we will list the available databases using the InfluxDB client:
$ echo "show databases" | docker exec -i influxdb /usr/bin/influx -username=root -password=root -path -
Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
Connected to http://localhost:8086 version 0.10.3
InfluxDB shell 0.10.3
> name: databases
---------------
name
cadvisor
_internal
Things to keep in mind:
• For the sake of simplicity, InfluxDB is deployed as a standalone instance and is therefore not resilient to service failures, resulting in data loss until the service is recovered. Deploying an InfluxDB cluster is recommended for production;
• The database size on disk will depend on the retention policies and the amount of metrics collected over time. The policies and retention rules will need to be adjusted for production use on a case-by-case basis;
• The service (container) memory consumption will depend on the configured storage engine, the amount of metrics collected and the configuration settings. Those settings will need to be adjusted for production use, keeping resource constraints in mind;
• InfluxDB provides multiple interfaces for monitoring and data querying, including a database client application, client libraries for the most popular languages, and a REST API endpoint (see the example after this list);
• This project uses a custom-built InfluxDB image to automate and simplify basic setup and management tasks. It may behave differently compared to the default image provided by the vendor.
Image Registry
All container images used by the POC project are stored in the local image repository provided by the Image Registry service. This service is implemented as a Docker container located on the utility host in the foundation farm and running the Docker Distribution application (https://github.com/docker/distribution). Whenever a new container image is built, it is stored in the Image Registry. Whenever a new container is created, its image is pulled from this repository. More details and examples can be found in the Docker Distribution project documentation at https://github.com/docker/distribution/blob/master/docs/deploying.md.

Being one of the base services, the Image Registry is self-contained and does not depend on other Platform services. At the same time, the Image Registry is not used directly by Platform services. Usually it is used indirectly, when the Docker Engine cannot find a required image in the local image storage on a particular host. In that case the image is queried, validated and pulled from the Image Registry.

Here is an example of setting up the Image Registry service. First of all, we'll set up certificates. The SSL keys need to be generated only once, but have to be deployed on every Docker host:
# executed only once: generating a self-signed registry certificate, CN=registry.poc
$ mkdir -p ~/certs
$ openssl req -newkey rsa:4096 -nodes -sha256 -x509 -days 365 \
    -subj "/C=DE/ST=HE/L=Frankfurt/O=VZ/OU=MH/CN=registry.poc/[email protected]" \
    -keyout ~/certs/registry.key -out ~/certs/registry.crt

# executed on each Docker host:
# - deploying certificates to the Docker certificate store
$ mkdir -p /etc/docker/certs.d/registry.poc\:5000
$ cp certs/registry.crt /etc/docker/certs.d/registry.poc\:5000/ca.crt
# - restarting docker to activate certificates
$ systemctl restart docker.service
Next, we’ll set up host volumes and configuration for the Image Registry service container:
$ mkdir -p /var/data/registry/{certs,config,data}
$ [ -d ~/certs ] && cp ~/certs/* /var/data/registry/certs
$ cat <<EOT > /var/data/registry/config/config.yml
version: 0.1
log:
  level: info
  formatter: text
  fields:
    service: registry
    environment: production
storage:
  cache:
    layerinfo: inmemory
  filesystem:
    rootdirectory: /var/lib/registry
http:
  addr: :5000
  tls:
    certificate: /certs/registry.crt
    key: /certs/registry.key
  debug:
    addr: :5001
EOT
Finally, we'll start the registry service and validate that it can be accessed over HTTPS:
# starting the Docker container with the registry service
$ docker run --name registry --hostname registry.poc --detach=true --restart=always \
    --env REGISTRY_HTTP_TLS_CERTIFICATE=/certs/registry.crt \
    --env REGISTRY_HTTP_TLS_KEY=/certs/registry.key \
    --volume /var/data/registry/certs:/certs:ro \
    --volume /var/data/registry/data:/var/lib/registry:rw \
    --volume /var/data/registry/config:/etc/docker/registry:ro \
    --publish 5000:5000 \
    registry:2.5

# verifying the registry is working; registry.poc should resolve to the IP owned by the registry service
$ docker tag busybox registry.poc:5000/poc/busybox:v1
$ docker push registry.poc:5000/poc/busybox:v1
$ curl --cacert ~/certs/registry.crt -X GET https://registry.poc:5000/v2/poc/busybox/tags/list
{"name":"poc/busybox","tags":["v1"]}
Things to keep in mind:
• Most container images are stored in the locally hosted Image Registry; however, some images are pulled from outside repositories to avoid circular dependencies during service startup:
  o The Docker Distribution container image is provided by Docker and pulled from the external registry https://hub.docker.com/r/distribution/registry/
  o The Google cAdvisor container image is provided by Google and pulled from the external registry https://hub.docker.com/r/google/cadvisor/
  o The GitLab container image is provided by the GitLab community and pulled from the external registry https://hub.docker.com/r/gitlab/gitlab-ce/
• For the sake of simplicity, the Image Registry service is deployed as a standalone instance and is therefore not resilient to service failures. An HA deployment is recommended for production use;
• The current implementation does not use any authentication or authorization mechanisms, thus allowing any user to access container images. Although this service is only used inside the internal secure perimeter, it's recommended to implement RBAC policies, or at least a strong authentication mechanism, for production deployments (see the sketch after this list);
• Due to security considerations, all traffic is encrypted and service access is only possible using the HTTPS protocol as a transport. Depending on security requirements, there may be a need to create and sign the service SSL keys using a trusted CA. The current implementation uses a self-signed CA and keys. For this to work, those self-signed keys must be added to the Docker certificate store on every Docker host that communicates with the Image Registry service;
• Obviously, there is a trade-off with known pros and cons when implementing a local registry compared to an externally hosted container registry. For this project it's been decided to use a local registry; however, nothing prevents using an external Image Registry service, assuming that the service integration has been performed and service availability, security and access issues have been addressed.
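As a minimal hardening sketch (the user name and file locations below are illustrative assumptions, not part of the POC setup), registry:2 supports basic authentication backed by a bcrypt htpasswd file:

# create an htpasswd file (registry:2 requires bcrypt entries)
$ mkdir -p /var/data/registry/auth
$ htpasswd -Bbn pocuser '<password>' > /var/data/registry/auth/htpasswd
# then add the following options to the docker run command shown above:
#   --volume /var/data/registry/auth:/auth:ro
#   --env REGISTRY_AUTH=htpasswd
#   --env REGISTRY_AUTH_HTPASSWD_REALM="Registry Realm"
#   --env REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd
# clients must then authenticate before pulling or pushing images
$ docker login registry.poc:5000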
Image Builder
This service is implemented as part of the Platform management tooling. Currently, new image builds have to be triggered manually after the Docker files have been modified; however, nothing speaks against automating this step and triggering an image build upon certain events, for example container image code or configuration changes.
Figure 7 - Image Builder UI
There are no services depending on the Image Builder. The Image Builder itself directly depends on the SCM service and indirectly on the Image Registry, where freshly built images are pushed. Obviously, some secrets such as keys and credentials must be used during the container image build stage. There is a nice write-up providing a good summary of the available solutions and options: http://elasticcompute.io/2016/01/22/build-time-secrets-with-docker-containers/.

Currently, container images can be built in two modes:
• Build: the container image is built from scratch and properly tagged;
• Release: after the image build, the image undergoes tests and, if successful, is pushed to the image repository, thus becoming available for deployment.
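A rough sketch of what the two modes boil down to (the image name and tag are hypothetical; the actual builder drives these steps through its UI and workflows):

# Build: produce a tagged image from the Docker files kept in SCM
$ docker build --tag ${REGISTRY}/poc/nginx-php-fpm:1.2.0 nginx-php-fpm/
# Release: after the tests pass, publish the image to the local Image Registry
$ docker push ${REGISTRY}/poc/nginx-php-fpm:1.2.0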
Things to keep in mind:
• Although the container build workflow does include a step for executing tests, currently there are no actual tests provided. Special care should be taken, and container images must be tested manually prior to deploying and using them;
• Sometimes, when memory becomes scarce (e.g. multiple SonarQube analyses running), the image rebuild process may fail with error messages indicating a lack of memory. This points to memory leaks in Docker and will hopefully be fixed in upcoming releases. It should not occur in environments with sufficient memory allocation;
• The Docker files for images have been written with image caching in mind, so frequent image rebuilds should not create significant load. At the same time, image caching may become a source of hard-to-track issues, so administrators may need to pay special attention to the local image store and cached images on the systems where builds are performed.
Deployment Service
By using the Deployment service we can ensure that all projects follow naming, security, configuration and deployment standards and conventions. They can be easily identified, managed and recreated in a standard and repeatable way. See the Drupal Website Deployment chapter for additional details and examples. All project deployment tasks are handled by this service, namely:
• Checking requested parameters against naming standards;
• Choosing the target location based on user inputs or defaults;
• Validating that the target location is ready for deployment;
• Cloning the requested project version from the code repository;
• Cloning required add-on projects from the code repository;
• Deploying code to the target location;
• Running configuration instructions and setup procedures.
The Deployment service is completely decoupled from containers and other infrastructure semantics. From a high-level perspective, the relationship between the related components can be described as follows:
• The Container Provisioning Service deploys well-defined, pre-configured containers;
• Containers encapsulate applications and are immutable or read-only. All volatile and mutable objects such as content, log files, temporary files, etc. are persisted on volumes or using other persistence mechanisms such as Database Storage;
• The Deployment Service populates host volumes with application objects such as code, configuration, content, etc. Those host volumes are mapped to container volumes and thus become available to the execution runtime inside the corresponding containers (see the illustration below).
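As a hedged illustration of that last point (the repository URL and release tag are assumptions; the paths follow the volume layout used in the provisioning examples below), the service essentially automates populating a host volume along these lines:

# cloning a project release into the host volume that backs the container's /var/www
$ git clone --branch v1.0 https://gitlab.poc/web/d7-demo.git /var/web/stg/root/d7-demo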
The Deployment service is used by Deployment workflows via corresponding Platform CLI calls. The service itself has several dependencies:
• Secure Storage – used to query various credentials and sensitive information;
• SCM Service – used to clone requested projects and their dependencies;
• Persistent Volumes – used as deployment targets to store project-related objects;
• Persistent Database Storage – may be used indirectly by project setup scripts, for example for creating the database scheme for the project or populating required database objects.
Things to keep in mind:
• The Deployment service does not make orchestration decisions and therefore must be provided the target location specification by the upstream caller. This is done on purpose, to keep orchestration logic and mechanisms separate from deployment semantics;
• The Deployment service is a part of the Platform CLI component and as such uses the platform configuration, settings and naming standards;
• Since provisioning tasks may involve multiple hosts or be invoked remotely, password-less (key-based) SSH access must be configured between the master and slave nodes (see the example after this list);
• The Deployment service does just that – deploys projects to target locations according to well-defined rules and naming standards. It neither cares nor makes assumptions about the applications, custom code or content used by the applications deployed inside containers, as long as the projects follow the defined project structure.
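A minimal sketch of that one-time SSH setup (the host and user names are hypothetical):

# on the master node: generate a key pair and distribute the public key to each slave
$ ssh-keygen -t rsa -b 4096 -N '' -f ~/.ssh/id_rsa
$ ssh-copy-id deploy@web1.poc
# verify password-less access
$ ssh deploy@web1.poc hostname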
Container Provisioning Service
All container provisioning and de-provisioning operations are handled by this service, which translates requested actions into corresponding Docker commands and API calls. It is still possible to create arbitrary containers using the Docker client or APIs; however, for the sake of consistency this approach is discouraged. This is best explained by the following example. Let's provision a new web container using the Docker CLI:
$ docker run --name d7-demo --hostname wbs1 --detach=true --restart=on-failure:5 \
    --security-opt no-new-privileges --cpu-shares 16 --memory 64m --memory-swap 1G \
    --publish 10.169.64.232:8080:80 --publish 10.169.64.232:8443:443 \
    --volume /var/web/stg/root/d7-demo:/var/www --volume /var/web/stg/data/d7-demo:/var/data \
    --volume /var/web/stg/logs/d7-demo:/var/log --volume /var/web/stg/temp/d7-demo:/var/tmp \
    --volume /var/web/stg/cert/d7-demo:/etc/ssl/web \
    --tmpfs /run:rw,nosuid,exec,nodev,mode=755 \
    --tmpfs /tmp:rw,nosuid,noexec,nodev,mode=755 \
    --env-file /opt/deploy/container.env \
    --label container.env=stg --label container.size=small \
    --label container.site=d7-demo --label container.type=web \
    registry.poc:5000/poc/nginx-php-fpm
You may have noticed that a number of additional options and parameters are required by the platform itself, its services and naming standards. Although the Container Provisioning Service makes exactly the same call to the Docker engine, there is a lot more happening, hidden under the hood.
Now, let's provision the same web container using the Container Provisioning Service. In addition to creating the Docker container, it performs the following essential steps:
• Checking the container name against naming standards;
• Checking that no container with such a name is already present;
• Validating the IP address:
  o Checking whether the provided IP belongs to the address pool and whether it is not already taken by another container;
  o If no IP address is provided, automatically selecting the next free IP from the pool;
• Checking whether the container host volumes are present and creating them otherwise;
• Adding container labels specifying the website, its environment, size and container type;
• Adding resource constraints and security-related options;
• Using the given image, or the default one if no container image is specified, for creating the new container.
$ /opt/deploy/web container create --farm poc --env stg --site d7-demo --image nginx-php-fpm
web container create: using next free IP: 10.169.64.232
web container create: checking 10.169.64.232 is setup
    inet 10.169.64.232/26 brd 10.169.64.255 scope global secondary enp0s17:
web container create: folder /var/web/stg/root/d7-demo not found, creating
web container create: folder /var/web/stg/data/d7-demo not found, creating
web container create: folder /var/web/stg/logs/d7-demo not found, creating
web container create: folder /var/web/stg/cert/d7-demo not found, creating
web container create: folder /var/web/stg/temp/d7-demo not found, creating
web container create: exporting container ENV variables from /opt/deploy/container.env
web container create: creating container d7-demo
web container create: |-- image-tag: registry.poc:5000/poc/nginx-php-fpm
web container create: |-- resources: small (--cpu-shares 16 --memory 64m --memory-swap 1G)
web container create: |-- published: 10.169.64.232:8080:80
web container create: |-- published: 10.169.64.232:8443:443
web container create: |-- volume: /var/web/stg/cert/d7-demo:/etc/apache2/ssl
web container create: |-- volume: /var/web/stg/logs/d7-demo:/var/log
web container create: |-- volume: /var/web/stg/root/d7-demo:/var/www
web container create: |-- volume: /var/web/stg/data/d7-demo:/var/data
web container create: |-- volume: /var/web/stg/temp/d7-demo:/var/tmp
web container create: |-- volume: tmpfs:/run
web container create: |-- volume: tmpfs:/tmp
web container create: |-- label: container.env=stg
web container create: |-- label: container.size=small
web container create: |-- label: container.site=d7-demo
web container create: \__ label: container.type=web
web container create: started site container cb68618b84b4d3276a77ebd4a0635c5387a8319f1ffaac3759c74820fa32b258
By using the Container Provisioning service we can ensure that all containers follow naming, security, configuration and resource allocation standards. They can be easily identified, managed and recreated in a standard and repeatable way.
$ /opt/deploy/web container list --farm poc --env stg --format table
web container list:
CONTAINER ID   NAMES     STATUS          ENV   SIZE    PORTS
cb68618b84b4   d7-demo   Up 16 minutes   stg   small   10.1.1.2:8080->80/tcp, 10.1.1.2:8443->443/tcp
c953adf92e09   d7        Up 3 weeks      stg   small   10.1.1.2:8080->80/tcp, 10.1.1.2:8443->443/tcp
The Container Provisioning service is used by Deployment workflows via corresponding Platform CLI calls. The service itself has no specific dependencies and uses the Docker CLI for performing container management operations. Things to keep in mind:
• The Container Provisioning service does not make orchestration decisions and therefore must be provided the target location specification by the upstream caller. This is done on purpose, to keep orchestration logic and mechanisms separate from deployment semantics;
• The Container Provisioning service is a part of the Platform CLI component and as such uses the platform configuration, settings and naming standards;
• Since provisioning tasks may involve multiple hosts or be invoked remotely, password-less (key-based) SSH access must be configured between the master and slave nodes;
• The Container Provisioning service does just that – provisions properly configured containers. It neither considers nor makes assumptions about the applications, custom code or content used by the applications deployed inside containers;
• The Container Provisioning service is the only component that has to be adjusted if a different mechanism or API is to be used for provisioning containers, for example CoreOS rkt or LXD;
• In case of using orchestration engines such as Kubernetes, the Container Provisioning service can implement a wrapper around the provisioning functionality they provide.
Reporting Service
The Reporting service is implemented as a Docker container that runs queries against the Stats Database and compiles reports on aggregated resource usage according to specified conditions and parameters. No other services depend on the Reporting service; the Reporting service itself depends on the Stats Database for fetching report data.
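Below is a minimal query sketch, assuming the Stats Database is the platform InfluxDB instance reachable over its HTTP API; the hostname, database name (cadvisor) and measurement are illustrative only, not the platform's actual names:

# aggregate a day of per-container memory readings from the stats database
$ curl -sG 'http://influxdb.poc:8086/query' \
    --data-urlencode 'db=cadvisor' \
    --data-urlencode 'q=SELECT MEAN(value) FROM memory_usage WHERE time > now() - 1d GROUP BY container_name'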
Persistent Volumes
One of the platform design paradigms is to keep containers immutable, or read-only; all volatile and modified data should be stored outside of the container on so-called container volumes. Since we want this data to survive between container runs, these volumes must be persistent. There is another benefit to keeping application data and content outside of the container: it allows achieving the best application performance. Since there is no COW (copy-on-write) indirection layer in between, all I/O operations are handled efficiently by the Linux kernel. Things to keep in mind:
• The current platform design makes no assumptions about the underlying technology and orchestration layer. For the sake of simplicity, container host volumes are used as the persistent volume implementation;
• There are other options to be explored for mapping container volumes to corresponding SAN volumes, NAS volumes or iSCSI targets (a volume-driver sketch follows this list). This would allow containers to take their volumes along with them if restarted on a different Docker host, thus making containers "mobile" and allowing container migrations across available hosts. These options were not explored during this project; however, using them may be essential when running containers on platforms like Kubernetes.
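As one illustration, below is a minimal sketch using Docker's local volume driver to mount an NFS export as a named volume; the NFS server address, export path and volume name are illustrative assumptions:

# create a named volume backed by an NFS export, then attach it to a container
$ docker volume create --driver local \
    --opt type=nfs --opt o=addr=10.169.64.10,rw \
    --opt device=:/export/web/d7-demo d7-demo-data
$ docker run --rm --volume d7-demo-data:/var/www alpine ls /var/www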
Volume Sync-Share Service
Horizontal scaling and high-availability requirements demand that an application span multiple application instances, or containers in this case. Although session state is kept outside of the containers, static content still has to be shared between multiple application instances. Generally speaking, there are two possible ways of resolving this issue: share a file-system or synchronize file-systems. Each solution has its own strengths and weaknesses. Both options have been explored and considered viable; the choice is really dictated by the specific infrastructure, performance and support requirements. The following comparison shall help in selecting the most appropriate option for a specific deployment scenario:

• Implementation approach
o Shared Content: centralized storage holding a single file-system, with many nodes performing access.
o Synchronized Content: share-nothing architecture; many nodes with multi-master replication between file-systems.
• Storage space requirements
o Shared Content: Volume-Size.
o Synchronized Content: Volume-Size x N (# of nodes).
• Storage throughput
o Shared Content: all nodes share the server network link and are capped by its throughput; one node may saturate the link and degrade performance for others. Limited by single-volume IOPs, and quickly degrades with the number of nodes.
o Synchronized Content: throughput and IOPs scale linearly with the number of nodes.
• File-system locking
o Shared Content: file-system locks are maintained to allow concurrent access by multiple nodes to a single object. Can lead to stalled I/O operations and, as a result, to unresponsive applications.
o Synchronized Content: no file-system locks required.
• Change propagation
o Shared Content: instant.
o Synchronized Content: little latency.
• Implementation complexity
o Shared Content: low. Synchronized Content: moderate.
• Support complexity
o Shared Content: moderate. Synchronized Content: low.
• Known limitations
o Shared Content: SendFile kernel support and mmap must be disabled on shared volumes. Orphaned file-system locks may need to be identified and cleaned manually. A storage volume restart may have unpredicted effects on clients; they may need to re-mount storage. File-system caching may produce inconsistent results across clients.
o Synchronized Content: large file-system changes may take some time to propagate to all clients. In rare cases a file may be modified in several locations, producing a conflict that has to be resolved either automatically or manually.
• Specific application
o Shared Content: NFS 4.x server and clients.
o Synchronized Content: SyncThing + inotify.
Given the overview above, one may still wonder which route to choose and whether there is a simple rule of thumb for selecting the most appropriate option. Here it is:
• Implement NFS:
o If you have a storage array capable of serving files using the NFS 4.x protocol;
o If your applications don't require high storage throughput and concurrency;
o If you can tolerate the noisy-neighbor effect at times;
o If the storage volume size (and/or its cost) is significant;
o If you already have expertise in house;
o If other parts of your solution use NFS;
• Implement SyncThing:
o If you don't have a fault-tolerant NFS server and can't afford one for whatever reason;
o If your applications require the highest storage throughput and need to scale as they grow;
o If you absolutely can't tolerate the noisy-neighbor effect or NFS server downtime;
o If you can tolerate the little latency required to propagate changes;
o If the storage volume size is small enough to keep a redundant copy on every client.
Below is an example of how to start the volume sync service:
$ docker run --name datasync --hostname `hostname` --detach=true --restart=always \
    --cpu-shares 100 --memory 100m \
    --publish 22000:22000 --publish 21027:21027/udp --publish 8384:8384 \
    --volume /var/deploy/prd/data/:/var/sync --volume /var/data/datasync:/etc/syncthing \
    --tmpfs /run:rw,nosuid,nodev,mode=755 --tmpfs /tmp:rw,nosuid,nodev,mode=755 \
    registry.poc:5000/poc/syncthing
This service has to be started on all Docker host nodes having data volumes that must be kept in sync. After starting, these services have to be introduced to each other, i.e. perform a handshake, and mutual changes have to be allowed between them; this is a one-time configuration (a pairing sketch follows the list below). All file-system changes are tracked via inotify subscriptions and updated files are exchanged between nodes using an efficient block-exchange protocol similar to BitTorrent. Thus, the change propagation speed grows with the number of nodes participating in the exchange. Things to keep in mind:
• SyncThing is a relatively young, actively developed application. There may be side effects that have not been studied yet;
• The SyncThing configuration can be generated from a template and saved to the configuration file. It can also be adjusted using the API and Web UI. Access to the API and Web UI must be appropriately secured;
• The SyncThing protocol ensures quick delta updates and high performance. During the tests, sync speeds of ~100+ MB/s have been measured;
• Although SyncThing can perform dynamic service and network discovery, static configuration has been used for this project.
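Below is a minimal pairing sketch, assuming the REST API is reachable on port 8384 and an API key has been configured; the host name is illustrative. Each node reports its own device ID, which is then added to its peers' configuration (via the Web UI or the configuration REST endpoint):

# query the local SyncThing instance for its device ID (reported as myID)
$ curl -sk -H "X-API-Key: ${API_KEY}" https://wbp1.poc:8384/rest/system/status | grep myID
# repeat on every node, then add each peer's device ID and the shared folder
# to the other nodes' configuration to allow mutual changes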
Persistent Database Storage
Similar to file-system volumes, persistent database storage is used by applications for persisting structured data. The current design assumes an RDBMS, or more specifically a MySQL-flavored database, which is a very common choice for lightweight DB-driven web applications. Due to the high-availability requirement, the production DB instance is hosted on a MySQL cluster, whereas other environments such as staging or development may use instances deployed on a standalone MySQL host. MySQL comes in multiple flavors and distributions, each with its own strengths and weaknesses:
• MariaDB – Enterprise Cluster Subscription:
o Feature-wise inferior to the other options, although it's quickly catching up;
o Uses Galera multi-master replication;
o Provides the MaxScale database proxy;
o Enterprise support and consulting are available and reasonably priced;
• Oracle MySQL – vendor-supported edition:
o Good performance and features;
o The CGE clustering option is similar to Always-On MSSQL and provides carrier-grade data availability and protection;
o Enterprise support is available but is very expensive;
• Percona XtraDB – free, with vendor support available on demand:
o Outstanding features and performance;
o Performance Schema extensions and a number of analysis and tuning tools;
o XtraDB Cluster uses Galera multi-master replication;
o Enterprise support and consulting are available and reasonably priced.
For this project Percona XtraDB has been selected for its outstanding capabilities and relatively low support cost. However, there are a number of things to keep in mind when implementing this approach in production:
• Unlike the Oracle and MariaDB implementations, Percona XtraDB does not provide a packaged solution for cluster load-balancing; however, there are a number of community papers and vendor articles on using HAproxy or hardware load-balancers to implement this function;
• With either MySQL server option, enforcing DB quotas and even calculating DB sizes can be complicated tasks (a sizing query sketch follows this list). There are some workarounds and solutions, but, generally speaking, this is an InnoDB storage engine limitation. Still, it has to be considered in a hosting environment, where a single DB server (or cluster) is shared by multiple DB instances belonging to different clients or projects;
• With either MySQL server option, DB server sizing is a very complex exercise requiring thorough knowledge of MySQL server innards and a significant amount of stress and capacity testing to find a rough formula that ties infrastructure measurements to application measurements such as TPS (transactions per second) or QPS (queries per second). There is no universal formula, and such sizing must be performed for every application type;
• The best way to perform reliable application sizing is to execute a number of load and stress tests, measure various performance KPIs, baseline DB server capacity in terms of TPS/QPS and map those estimates to specific hardware profiles.
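Below is a minimal sizing query sketch for approximating per-schema database sizes from InnoDB metadata; the host and account are illustrative, and the figures are estimates only, since InnoDB statistics are not exact:

# approximate the data + index footprint of every schema on the server, in MB
$ mysql -h mysql.poc -u admin -p -e "
    SELECT table_schema,
           ROUND(SUM(data_length + index_length) / 1024 / 1024, 1) AS size_mb
    FROM information_schema.tables
    GROUP BY table_schema;"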
Database Driver
Choosing the database distribution is very important. However, there is another, no less important and often overlooked subject: using the proper database API driver for your runtime and applications. For the sake of simplicity we will mostly concentrate on PHP specifics. PHP offers three different APIs to connect to MySQL:
• ext/mysql – the old MySQL extension that is used by default. It has been around since PHP 2.0 and will be discontinued soon. It lacks support for most modern MySQL features;
• PDO_MySQL – implements the PHP Data Objects interface to standardize access from PHP applications to MySQL 3.x–5.x databases. Provides good feature coverage;
• ext/mysqli – the improved MySQL extension, providing access to functionality offered by MySQL 4.1 and above. Most notably it supports:
o Non-blocking, asynchronous queries;
o Server-side prepared statements;
o Stored procedures;
o Multiple statements;
o All MySQL 5.1+ functionality.
Additionally, it's recommended to employ the MySQL Native Driver instead of the MySQL Client Library used by default. This driver provides a number of advantages:
• The MySQL Native Driver uses the PHP memory management system, so its memory usage can be tracked with the memory_get_usage() call. This is not possible with libmysqlclient, because it uses the C function malloc() instead;
• The MySQL Native Driver also provides some special features not available when the MySQL database extensions use the MySQL Client Library:
o Improved persistent connections;
o The special function mysqli_fetch_all();
o Performance- and caching-related statistics calls: mysqli_get_cache_stats(), mysqli_get_client_stats(), mysqli_get_connection_stats(). The performance statistics facility can prove very useful in identifying performance bottlenecks;
o SSL support – the MySQL Native Driver has supported SSL since PHP 5.3.3;
o Compressed protocol support – as of PHP 5.3.2 the MySQL Native Driver supports the compressed client-server protocol.
For additional details and features of the various PHP APIs and MySQL drivers, see http://dev.mysql.com/doc/apis-php/en/apis-php-introduction.html. Things to keep in mind:
• All container images have been tested using the MySQL Native Driver and the mysqli API extension;
• MySQL Native Driver packages are provided for most Linux distributions and can be deployed using standard management mechanisms;
• The MySQL Native Driver must be explicitly enabled and configured in the PHP configuration pertinent to your runtime (a verification sketch follows this list);
• Obviously, no custom drivers and configurations can safeguard against bad code and wrong coding patterns. For example, a high number of opened and not properly closed persistent database connections can quickly exhaust the MySQL server connection pool and thus impact other applications attempting to connect to the same database server.
Percona XtraDB Cluster Limitations
Below is a list of specific limitations that must be considered and taken care of when designing a solution based on the Percona XtraDB Cluster product:
• Currently replication works only with the InnoDB storage engine;
• LOCK/UNLOCK TABLES queries and lock functions are not supported;
• Maximum allowed transaction sizes are defined by the wsrep_max_ws_rows and wsrep_max_ws_size variables; for more documentation please have a look at https://www.percona.com/doc/percona-xtradb-cluster/5.6/wsrep-system-index.html#wsrep_max_ws_rows
• The minimal recommended cluster size is 3 nodes. The 3rd node can be an arbitrator; however, a full node can be beneficial with regard to availability concerns and the performance of the individual nodes during a rebuild;
• XA transactions cannot be supported due to possible rollback on commit;
• When running Percona XtraDB Cluster in cluster mode, avoid ALTER TABLE ... IMPORT/EXPORT workloads. They can lead to node inconsistency if not executed in sync on all nodes;
• The write throughput of the whole cluster is limited by the weakest node: if one node becomes slow, the whole cluster is slow. If you have requirements for stable high performance, it should be supported by corresponding hardware;
• Due to cluster-level optimistic concurrency control, a transaction issuing COMMIT may still be aborted at that stage. There can be two transactions writing to the same rows and committing on separate Percona XtraDB Cluster nodes, and only one of them can successfully commit; the failing one will be aborted.
Other database flavors, MariaDB and Oracle CGE in particular, have their own limitations too. We won't be doing a thorough analysis of the different DB options here; it is beyond the scope of this paper and a subject for a separate research project.
Secure Storage
There is a need to store and pass credentials in a secure and simple manner between various platform components and hosted applications. Multiple approaches are possible, ranging from using temporary files and environment variables to implementing encrypted volumes visible to certain containers. There is a nice write-up about handling runtime secrets with Docker containers that approaches the subject from a more general perspective and explores various options and mechanisms: see http://elasticcompute.io/2016/01/21/runtime-secrets-with-docker-containers/. Build secrets such as credentials, SSH keys and certificates are equally important, so you may want to check the http://elasticcompute.io/2016/01/22/build-time-secrets-with-docker-containers/ article as well. After performing a deep analysis, which is outside of the scope of this document, and comparing various options, it's been decided to use a dedicated service to encapsulate the protected storage functionality. The Secure Storage service is implemented as a Docker container located on the utility host in the foundation farm and running HashiCorp Vault (https://www.vaultproject.io). All data in the vault is encrypted and not accessible to external users without the access key. When the vault is locked, no access is possible until it has been unlocked by an admin using several unlock keys.
Here is an example of how to set up and run the vault container. Since all communication with the vault is only possible over an encrypted transport, first of all we'll create an SSL certificate:
$ mkdir -p ${VOL_DATA}/vault/ssl
$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -subj "/C=DE/ST=HE/L=Frankfurt/O=VZ/OU=MH/CN=vault.poc.local/[email protected]" \
    -keyout ${VOL_DATA}/vault/ssl/vault.key -out ${VOL_DATA}/vault/ssl/vault.crt
Next, we’ll create the configuration:
$ cat <<EOT >/var/data/vault/config.hcl
backend "file" {
  path = "/vault/data"
}
listener "tcp" {
  address = "0.0.0.0:8200"
  tls_disable = 0
  tls_key_file = "/vault/ssl/vault.key"
  tls_cert_file = "/vault/ssl/vault.crt"
}
EOT
Finally, we'll start the container:
$ docker run --name vault --detach=true --cap-add IPC_LOCK --publish ${VAULT_HOST}:8200:8200 \
    --env VAULT_ADDR=https://127.0.0.1:8200 --env VAULT_SKIP_VERIFY=1 \
    --volume /var/data/vault/config.hcl:/vault/config.hcl \
    --volume /var/data/vault/data:/vault/data \
    --volume /var/data/vault/ssl:/vault/ssl \
    registry.poc:5000/poc/vault server -config /vault/config.hcl
WARNING: right after startup the Vault storage is sealed and must be unsealed prior to first use. The following command must be executed 3 times, and 3 out of the 5 vault keys must be provided. These vault keys are generated when a new Vault is initialized (see the initialization sketch below the command) and must be stored elsewhere in a secure location.
$ docker exec -‐it vault vault unseal
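For completeness, below is a minimal initialization sketch, assuming a freshly created Vault and the 2016-era CLI syntax. The init command prints the five unseal keys and the initial root token exactly once; they must be captured and stored securely:

# initialize a brand-new Vault; prints 5 unseal keys and the initial root token
$ docker exec -it vault vault init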
Now that the vault container is running and unsealed, we can query, read and store credentials and other variables as key-value tuples:
$ /opt/deploy/vault dir /secret/poc/dev/d7-vzbase/
SITE_DB_USER
SITE_DB_PASS
$ /opt/deploy/vault get /secret/poc/dev/d7-vzbase/SITE_DB_PASS
a$zrHS+cv}9DH.QR
Things to keep in mind:
• The configuration provided above is for demonstration purposes only. Vault provides a lot more capabilities, such as multiple storage back-ends, authentication and authorization mechanisms, access policies, etc. It's recommended to tailor the configuration for production use according to security requirements;
• Although the Secure Storage service can be started in a completely automated manner, it remains sealed until the required number of unseal keys is provided. This "unseal" operation can also be performed via the API (see the sketch after this list), given that the code calling the API has access to unseal keys stored externally in a secure manner;
• The vault application attempts the mlock syscall to avoid swapping allocated memory pages to disk. To run this application in an unprivileged container the IPC_LOCK capability must be granted; otherwise you may add a disable_mlock=true statement to the configuration in order to disable this functionality;
• Due to security considerations all traffic is encrypted and service access is only possible using the HTTPS protocol as a transport. Depending on security requirements there may be a need to create and sign the service SSL keys using a trusted CA. The current implementation uses a self-signed CA and keys and allows untrusted certificates; for production use, consider keys signed by a trusted CA;
• This project is using the "file" storage backend, which does not support clustering. Consider using a different storage backend to support clustering and deploy the service in a highly available manner.
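Below is a minimal sketch of the API-driven unseal mentioned above; the key value is illustrative. The Vault HTTP API accepts one unseal key per call to /v1/sys/unseal, repeated until the vault reports itself unsealed:

# submit one unseal key; repeat with further keys until "sealed" becomes false
$ curl -sk -X PUT -d '{"key": "<unseal-key-1>"}' https://${VAULT_HOST}:8200/v1/sys/unseal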
Identity Management Service
The Identity Management or IdM service utilizes MS Active Directory for user authentication and authorization as well as for storing service accounts. All access to this service is performed using the LDAPS protocol for additional transport security. You can use openldap or any other LDAP client to query Active Directory objects. It's recommended to deploy the adtool application, which uses the openldap client libraries and provides a very convenient CLI for managing MS Active Directory objects. First of all, let's build and configure the adtool application:
# we need to install compiler and openldap libraries
$ yum install gcc openldap-devel
# downloading and extracting adtool sources
$ curl -LOsS http://gp2x.org/adtool/adtool-1.3.3.tar.gz
$ tar xvzf adtool-1.3.3.tar.gz
# configuring package, building and deploying binaries
$ cd adtool-1.3.3
$ ./configure --prefix=/opt/deploy --datarootdir=/tmp
$ make
$ make install
# since we use openldap client libraries, we need to hint openldap where to look for stuff
$ cat <<EOT >>/etc/openldap/ldap.conf
URI ldaps://poc.local
BASE DC=POC,DC=LOCAL
TLS_REQCERT allow
EOT
Now that adtool is deployed, you can already access the Active Directory services; however, some commands will only work when the LDAPS protocol is used and LDAP SSL is properly set up on the AD server side. Below we'll provide steps for setting up the LDAP SSL transport. Multiple approaches are possible, depending on your organization's PKI management practices. We'll assume here that different systems are used for certificate generation, signing and LDAP services. For demonstration purposes we will be using a self-signed CA and certificates. For production systems you'll need to create a Certificate Signing Request (CSR) having the required extended attributes and have it signed by a Certificate Authority (CA) of your choice. First of all, let's create and sign the SSL certificates:
# 1> Setting a password that will be used for certificate container encryption
$ echo "<Your Very Secure Password>" > passwd
# 2> Create a root CA:
$ openssl req -x509 -newkey rsa:4096 -keyout myCA.key -out myCA.pem -days 3650 \
    -subj "/C=DE/L=Frankfurt/O=Verizon/OU=MH/CN=AMB38997ADS100.POC.LOCAL/[email protected]" \
    -passout file:passwd
# 3> Strip the password from the RSA key:
$ openssl rsa -in myCA.key -out myCA_nopass.key -passin file:passwd
# 4> Export CA certificate bundle along with the private key in PFX format:
$ openssl pkcs12 -export -in myCA.pem -inkey myCA.key \
    -passin pass:$(<passwd) -out CA.pfx -passout file:passwd
# 5> Create CSR configuration:
$ cat <<EOT >myCSR.cnf
basicConstraints=CA:FALSE
keyUsage = nonRepudiation,digitalSignature,keyEncipherment,dataEncipherment
extendedKeyUsage=serverAuth,clientAuth
subjectAltName=DNS:AMB38997ADS100.POC.LOCAL,DNS:POC.LOCAL,IP:10.169.69.11
EOT
# 6> Create a Certificate Signing Request:
$ openssl req -out myCSR.csr -new -newkey rsa:4096 -nodes -keyout myCSR.key \
    -subj "/C=DE/L=Frankfurt/O=Verizon/OU=MH/CN=AMB38997ADS100.POC.LOCAL/[email protected]"
# 7> Sign the request with your CA, using custom config for extended attributes required by AD:
$ openssl x509 -CA myCA.pem -CAkey myCA_nopass.key -CAcreateserial -req -in myCSR.csr -days 3650 \
    -extfile myCSR.cnf -out myCSR.pem
# 8> Export signed server certificate bundle along with the private key in PFX format:
$ openssl pkcs12 -export -in myCSR.pem -inkey myCSR.key -out ldaps.pfx -passout file:passwd
# 9> Do cleanup - we don't need this password any more:
$ rm -f passwd
After copying PFX files, the last steps have to be performed on your AD servers:
# 1> Import CA.pfx to Windows' Trusted Root Certification Authorities container
# 2> Import ldaps.pfx to 'NTDS\Personal' container for 'Active Directory Domain Services'
# 3> Use ldp tool to validate LDAP connectivity.
Now that LDAPS is set up, it's time to test the adtool functionality:
$ adtool useradd testuser 'CN=Users,OU=HOSTING,DC=POC,DC=LOCAL'
# create and rename may be required because of "Name Length Limits from the Schema":
# for backward compatibility the limit is 20 characters for the login name.
$ adtool userrename testuser 'Firstname Lastname'
# set password and unlock
$ adtool setpass "Firstname Lastname" '<password>'
$ adtool userunlock "Firstname Lastname"
# login name
$ adtool attributereplace "Firstname Lastname" userPrincipalName testuser
# logon name (pre-Windows 2000)
$ adtool attributereplace "Firstname Lastname" sAMAccountName testuser
# email
$ adtool attributereplace "Firstname Lastname" mail [email protected]
# first name and last name
$ adtool attributereplace "Firstname Lastname" givenName "Firstname"
$ adtool attributereplace "Firstname Lastname" sn "Lastname"
# display name
$ adtool attributereplace "Firstname Lastname" displayName "Firstname Lastname"
# checking the user has been created
$ adtool list 'CN=Users,OU=HOSTING,DC=POC,DC=LOCAL'
CN=Firstname Lastname,CN=Users,OU=HOSTING,DC=POC,DC=LOCAL
The architecture, deployment and setup considerations for Active Directory services are out of the scope of this document. See the Active Directory Structure chapter for additional details about setting up platform-related LDAP objects. Things to keep in mind:
• The 20-character logon name limitation should be considered only when using very old tools and AD infrastructure. During the project both national alphabet symbols and long logon names have been tested successfully, and no issues were identified;
• LDAPS is required to avoid sending credentials over un-encrypted communication channels. On the Windows side LDAPS is not required, since Windows favors Kerberos. Kerberos wasn't explored during this project, only the LDAP SSL secure transport;
• The adtool may use a configuration file or, alternatively, all options may be passed via command-line parameters. The latter approach is used in the platform management tools;
• The LDAP bind credentials used for accessing and managing AD objects are provided by the Secure Storage service;
• The vzpoc/adtool git repository contains an updated and improved version of adtool with extended commands and functions. In particular, it can manage OU-type containers. You may use the build instructions provided above for configuring and building this project;
• The adtool is a part of the management framework and is also provided as a standalone binary depending on the openldap client libraries only. Building a static self-contained binary does not seem feasible, since openldap itself depends on the Cyrus SASL implementation, which is rather hard to build statically;
• The host system running adtool must have the openldap client libraries deployed.
Load-Balancer Service
A clustered pair of Citrix NetScaler hardware appliances performs the load-balancing function. This is the only service that is not accessed via APIs and programmatically. While technically possible, such access will most probably be prohibited due to the shared nature of those devices. Moving forward, it's planned to use HAproxy for setting up load-balancing services and managing them programmatically. The load-balancers perform several functions:
• Load-balancing. Several load-balancing algorithms are supported, including handling of sticky sessions. It is expected that session stickiness will be avoided and connections will be distributed using round-robin or least-connection algorithms;
• SSL Bridging/Termination. Depending on security and application requirements, SSL traffic can either be decrypted (SSL termination) or not, and forwarded to the elected load-balancer pool member. In some cases, e.g. for traffic inspection or session persistence, the traffic is decrypted and re-encrypted again using either the original or a different SSL key and forwarded to the elected load-balanced pool member (SSL Bridge);
• Network Address Translation. Since VIPs are set up using public Internet addresses resolved by DNS and load-balanced pool members are using private network addresses, the load-balancer is also responsible for performing NAT when forwarding packets to and from the private subnet.
Since programmatic access to the load-balancers is not available, adding and removing VIPs may become a real stumbling block. In order to avoid provisioning delays, a pool of pre-configured VIP slots is created during the hosting farm setup. Every new website then simply claims a preconfigured VIP slot; when a website is removed or migrated to a different hosting farm, its VIP slot is freed up and becomes available for new websites. For this project, the hosting farm got 10 VIP slots for production and 10 VIP slots for staging websites:
wbp1  10.169.64.211-220  213.177.35.146-155  # 1st production web server
wbp2  10.169.64.221-230  213.177.35.146-155  # 2nd production web server
wbs1  10.169.64.231-240  213.177.35.156-165  # 1st staging web server
As you can see from the table above, staging VIPs map public IP addresses to the private IP address space 1-to-1, while the production VIPs balance load between the two web servers wbp1 and wbp2, thus providing a highly available setup. One of the important concepts for the load-balancer service is the so-called health-check script or page. It has several applications:
• Ensuring that the web server is up and running and the whole application stack is available and can accept inbound connections. For simple websites, this page may be just a static HTML file. For more complex deployments, the page may be dynamic and include code to initiate a basic DB transaction, thus validating multiple layers: web server engine, application engine, and DB connection pool health and readiness (a probe sketch follows this list);
• Controlling whether the webserver instance is included in the load-balanced pool or not. This is useful for maintenance work, code deployment, and other operations requiring short-term webserver downtime or re-configuration.
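Below is a minimal probe sketch mirroring what a load-balancer monitor would request from a pool member; the health-check path (/healthcheck.html), host and port are illustrative assumptions:

# request the health-check page and print only the HTTP status code
$ curl -s -o /dev/null -w "%{http_code}\n" http://wbs1.poc:8080/healthcheck.html
200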
Learning from past lessons, the health-check page is deployed outside of the website's <Document Root>, making it independent of the website deployment, its structure and page updates. This page is kept in a dedicated code repository and is deployed separately by the deployment service along with the website code. A dedicated tool is provided for controlling the load-balancer health-check routine, making it possible to validate, add or remove a website instance from the load-balanced pool:
$ /opt/deploy/web vip status --farm poc --env stg --site d7-demo @wbs1
web vip status: d7-demo VIP state: disabled
$ /opt/deploy/web vip up --farm poc --env stg --site d7-demo @wbs1
web vip up: d7-demo VIP state: enabled
$ /opt/deploy/web vip test --farm poc --env stg --site d7-demo @wbs1
web vip test: VIP monitor: success
$ /opt/deploy/web vip down --farm poc --env stg --site d7-demo @wbs1
web vip down: d7-demo VIP state: disabled
$ /opt/deploy/web vip test --farm poc --env stg --site d7-demo @wbs1
web vip test: VIP monitor: failed
Things to keep in mind:
• Load-balancer VIP slots have to be set up when the hosting farm is created, therefore farm capacity will be capped by the number of pre-configured slots;
• Assuming the current network structure and design assumptions, one hosting farm can host up to ~80 websites (2 x prod + 1 x stag VIP), provided that one subnet is used. If multiple subnets are used, this limit can be raised to ~125 websites;
• It is assumed that session state is shared between application instances and web applications can properly handle their sessions, so that sticky sessions are not used. This means that any webserver instance can be removed from the load-balancer pool and its sessions will be picked up by other pool members automatically;
• The current prototype assumes that SSL is enabled for all web sites;
• When a new website instance (container) is deployed, its health-check page is disabled, so the VIP is turned down. It must be explicitly enabled to allow inbound traffic.
SCM Service
All project artifacts are stored in versioned code repositories. This service is implemented as a Docker container located on the utility host in the foundation farm and running the GitLab application. The SCM service depends on:
• IdM Service – controls access to the code repositories;
• Persistent Volumes – store configuration and Git repository content.
Other services in turn depend on the SCM service, among them:
• Workflow Engine – stores configuration in the corresponding repo;
• Deployment Service – fetches projects from the project repo;
• Image Builder – fetches image build files and their dependencies from the image repo.
Additionally, the SCM service provides both User and Admin portals, accessed by Developers and Platform Administrators correspondingly. Besides external web projects, all platform settings, scripts and configurations are also version-controlled and stored in corresponding Git repositories,
following the configuration-as-code paradigm. Here is an example of setting up the GitLab container. First of all we'll set up the certificates:
$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -subj \
    "/C=DE/ST=HE/L=Frankfurt/O=VZ/OU=H/CN=gitlab.poc.local/[email protected]" \
    -keyout ${VOL_DATA}/gitlab/config/ssl/gitlab.poc.local.key \
    -out ${VOL_DATA}/gitlab/config/ssl/gitlab.poc.local.crt
Next, assuming the volume folders are in place, we'll boot up the GitLab container:
$ docker run --name gitlab --detach=true --restart always --hostname gitlab.poc \
    --publish ${UTIL_HOST}:8443:443 --publish ${UTIL_HOST}:8080:80 --publish ${UTIL_HOST}:2222:22 \
    --volume ${VOL_DATA}/gitlab/config:/etc/gitlab \
    --volume ${VOL_DATA}/gitlab/logs:/var/log/gitlab \
    --volume ${VOL_DATA}/gitlab/data:/var/opt/gitlab \
    gitlab/gitlab-ce:8.6.8-ce.0
Now the GitLab application is listening on TCP port 8443 for HTTPS connections and 2222 for SSH access to the git repositories. In order to provide sufficient project and user isolation, all projects are created as private, so they are visible only to users who have been explicitly granted access. The users are set up with the LDAP authentication provider and use the IdM Service for authentication (a configuration sketch follows the list below). The Enterprise Edition (EE) of GitLab provides more advanced features for authorization and access management using LDAP groups and roles out of the box. However, the basic mechanism described above does satisfy the project isolation requirements, and the POC project was implemented using the free Community Edition (CE) of GitLab. See more details about the GitLab setup in the GitLab Repository Structure chapter. Things to keep in mind:
• GitLab is set up for hybrid authentication, using both a local user database and the LDAP user directory. There are two special accounts in GitLab:
o root – the administrative account used for GitLab management and configuration;
o robot – the API bot account used by automation workflows and platform services. This account is mainly used for API access and its API key is stored in the secure store;
• The user logon name is used for GitLab authentication. It's assumed that the IdM service ensures uniqueness of this logon name across the enterprise;
• Although plain HTTP is supported by GitLab for accessing the Web UI and using WebDAV, it is discouraged due to security considerations, and attempts to access GitLab over HTTP will result in a redirect to the HTTPS endpoint;
• Obviously there is a trade-off, with known pros and cons, between a local code repository and a hosted one. For this project it's been decided to use a local GitLab-based repository; however, it's also possible to use an external or hosted repository such as GitHub or BitBucket, assuming that service integration has been performed and service availability, security and access issues have been addressed.
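Below is a minimal sketch of wiring GitLab to the IdM service, assuming the omnibus package used by the container above; the exact syntax varies by GitLab version, and the bind DN, base and password values are illustrative (the real bind password would come from the Secure Storage service):

$ cat <<EOT >>${VOL_DATA}/gitlab/config/gitlab.rb
gitlab_rails['ldap_enabled'] = true
gitlab_rails['ldap_servers'] = YAML.load <<-EOS
main:
  label: 'POC AD'
  host: 'poc.local'
  port: 636
  method: 'ssl'
  uid: 'sAMAccountName'
  bind_dn: 'CN=robot,CN=Users,OU=HOSTING,DC=POC,DC=LOCAL'
  password: '<from-secure-storage>'
  base: 'CN=Users,OU=HOSTING,DC=POC,DC=LOCAL'
EOS
EOT
$ docker exec gitlab gitlab-ctl reconfigure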
Workflow Engine
The Workflow Engine or Orchestration Engine is implemented on the basis of the Jenkins automation framework. Jenkins itself runs in a container on the utility host in the foundation farm. Jenkins configurations and workflows, implemented as Groovy code, are stored in the corresponding code repository provided by the SCM service. Access to the Workflow Engine is controlled by RBAC mechanisms provided by the IdM Service. Jenkins provides multiple possibilities for setting up authorization mechanisms and RBAC rules. The current implementation uses a matrix-based access model, which allows mapping specific LDAP groups (roles) to certain Jenkins access permissions. Obviously, every specific project may call for additional roles and mappings. The POC project has been set up with the following roles and permission matrix:
Permission                          Anonymous   Jenkins Administrators   Jenkins Job Managers
Overall: Administer                             ✔
Overall: Configure Update Center                ✔
Overall: Read                                   ✔                        ✔
Overall: Run Scripts                            ✔                        ✔
Overall: Upload Plugins                         ✔
Credentials: Create                             ✔
Credentials: Delete                             ✔
Credentials: Manage Domains                     ✔
Credentials: Update                             ✔
Credentials: View                               ✔
Agent: Build                                    ✔
Agent: Configure                                ✔
Agent: Connect                                  ✔
Agent: Create                                   ✔
Agent: Delete                                   ✔
Agent: Disconnect                               ✔
Job: Build                                      ✔                        ✔
Job: Cancel                                     ✔                        ✔
Job: Configure                                  ✔
Job: Create                                     ✔
Job: Delete                                     ✔
Job: Discover                                   ✔
Job: Move                                       ✔
Job: Read                                       ✔                        ✔
Job: Workspace                                  ✔                        ✔
Run: Delete                                     ✔
Run: Replay                                     ✔
Run: Update                                     ✔
View: Configure                                 ✔                        ✔
View: Create                                    ✔
View: Delete                                    ✔
View: Read                                      ✔                        ✔
SCM: Tag                                        ✔
Below is an example of setting up the Jenkins container:
# pulling the latest stable Jenkins build https://hub.docker.com/_/jenkins/
$ docker pull jenkins:alpine
$ docker run --name=jenkins --hostname jenkins --detach=true --restart=always \
    --cpu-shares 512 --memory 2G \
    --volume=${VOL_DATA}/jenkins:/var/jenkins_home \
    --publish jenkins.poc.local:8080:8080 \
    --publish jenkins.poc.local:50000:50000 \
    jenkins:alpine
# get the admin password from the container log or /var/jenkins_home/secrets/initialAdminPassword
88345b8ecf904c0ba9eca63fb2cf8d47
Now that Jenkins is up and running, it can be managed via the Web UI as well as the API or CLI, e.g.
# login with the client
$ java -jar ./war/WEB-INF/jenkins-cli.jar -s http://localhost:8080 login \
    --username admin --password '<your admin password>'
# get list of installed plugins
$ java -jar ./war/WEB-INF/jenkins-cli.jar -s http://localhost:8080 list-plugins \
    | cut -c 1-27,92- | awk '{printf "%s:%s\n", $1,$2}' | sort > plugins.txt
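Building on the plugins.txt produced above, below is a minimal sketch for replaying a saved plugin list on a fresh Jenkins instance; it assumes the same jenkins-cli.jar and endpoint as the previous example:

# install each plugin by name (version pinning is dropped; the update center resolves versions)
$ while read plugin; do
    java -jar ./war/WEB-INF/jenkins-cli.jar -s http://localhost:8080 \
        install-plugin "${plugin%%:*}"
  done < plugins.txt
# restart once all plugins are staged
$ java -jar ./war/WEB-INF/jenkins-cli.jar -s http://localhost:8080 safe-restart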
One of the Jenkins benefits is modularity. You may find plug-‐ins for pretty much anything and tailor your Jenkins setup for your particular needs and processes. Below is a list of Jenkins plugins and their dependencies proposed for this POC project:
# The most important and essential plugins are annotated and provided with comments.
# Note: not all plugins are used and most are required to satisfy dependencies
ace-editor:1.1                # JS UI library
active-directory:1.47         # AD authentication and authorization
ansicolor:0.4.2               # ANSI colors in the console output improve readability
ant:1.3
antisamy-markup-formatter:1.5
branch-api:1.10
build-name-setter:1.6.5       # Sets the name of a build to something other than #1, #2, #3, ...
build-pipeline-plugin:1.5.4
build-timeout:1.17.1          # Interrupt task after certain execution time threshold reached
cloudbees-folder:5.12         # Folders can help structuring and organizing jobs
conditional-buildstep:1.3.5
credentials-binding:1.8       # Makes credentials visible as a build parameter
credentials:2.1.4             # Store credentials in Jenkins secure containers
durable-task:1.12
email-ext:2.47
external-monitor-job:1.6
git-client:1.19.7             # Git client for Jenkins
git-server:1.7
git:2.5.3                     # Support for Git SCM
gitlab-plugin:1.3.0           # Support for GitLab SCM
greenballs:1.15               # Makes successful jobs "green" not "blue"
groovy-postbuild:2.3.1
handlebars:1.1.1              # JS UI library
icon-shim:2.0.3
javadoc:1.4
jquery-detached:1.2.1         # JS UI library
jquery:1.11.2-0
junit:1.18
ldap:1.12
mailer:1.17
mapdb-api:1.0.9.0
matrix-auth:1.4               # matrix-based authorization strategies (global and per-project)
matrix-project:1.7.1
maven-plugin:2.13
momentjs:1.1.1                # JS UI library
pam-auth:1.3
parameterized-trigger:2.32
pipeline-build-step:2.2
pipeline-input-step:2.1
pipeline-rest-api:1.7
pipeline-stage-step:2.1
pipeline-stage-view:1.7
plain-credentials:1.2
publish-over-ssh:1.14
rebuild:1.25
role-strategy:2.3.2           # Enables authorization using a role-based strategy
run-condition:1.0
scm-api:1.2
scm-sync-configuration:0.0.10 # Stores Jenkins configuration in SCM (Git) repo
script-security:1.22          # Controls what users can execute what scripts
scriptler:2.9
ssh-credentials:1.12          # Stores SSH credentials in Jenkins secure containers
ssh:2.4                       # Remote job execution using SSH
structs:1.3
subversion:2.6
timestamper:1.8.4             # Adds timestamps to console output
token-macro:1.12.1
uno-choice:1.4                # Interactive input parameter HTML controls
windows-slaves:1.2
workflow-aggregator:2.2
workflow-api:2.1
workflow-basic-steps:2.1
workflow-cps-global-lib:2.2
workflow-cps:2.12
workflow-durable-task-step:2.4
workflow-job:2.5
workflow-multibranch:2.8
workflow-scm-step:2.2
workflow-step-api:2.3
workflow-support:2.2          # Workflow or Pipeline as a code support
ws-cleanup:0.30               # Clean-up workspace upon task completion
For more details on workflows please refer to the Management Tasks and Workflows chapter. Things to keep in mind:
• The Jenkins project is actively developed and new versions appear very often. Unless you require a particular fix implemented in those versions, it's recommended to stick to a certain LTS version rather than always using the latest-greatest one. Although the Jenkins Core is known to be mature and stable, a Core version update may introduce incompatibilities with the deployed plugins, so they will need to be checked for compatibility as well;
• The list of plugins proposed above is by no means definitive and should be seen as an example only. The versions indicated above were current at the time of container deployment, and half of the plugins are already proposing upgrades;
• By default the Active Directory plugin uses the plain LDAP protocol. In order to secure LDAP communication you can use one of the supported mechanisms:
o Active Directory plugin performs TLS upgrade – it connects to domain controllers through insecure LDAP, then from within the LDAP protocol it "upgrades" the
connection to use TLS, achieving the same degree of confidentiality and server authentication as LDAPS does;
o If you insist on using LDAPS rather than the TLS upgrade, you can set the system property hudson.plugins.active_directory.ActiveDirectorySecurityRealm.forceLdaps=true as a startup parameter to force Jenkins to start the connection with LDAPS, even though this buys you nothing over the LDAP+TLS upgrade. You will also need to check inside config.xml to ensure that either the secured port is defined (636 or 3269) or not defined at all;
• Jenkins recognizes all the groups in Active Directory that the user belongs to, so you can use those to make authorization decisions. For example, you can choose matrix-based security as the authorization strategy and perhaps allow "Jenkins Admins" to administer Jenkins;
• Jenkins uses the JVM, and Java is hungry for memory; therefore, for a production setup the Jenkins container memory constraints should be carefully chosen considering JVM memory allocation and garbage collection settings;
• You can set up additional Jenkins slaves if greater concurrency is required;
• Since the Jenkins container does not store any state, it's safe to stop and restart it, assuming there are no tasks in flight and no management activities being performed at the moment. The recommended approach is to use the "Prepare for Shutdown" task in the Jenkins management menu.
SonarQube Service
The SonarQube Analysis Engine or Static Code Analyzer service is implemented as a set of containers located on the utility host in the foundation farm and running the SonarQube application:
• SonarQube container – provides the Web Portal and Code Analysis engine;
• Sonar Scanner container – wraps the Sonar Scanner application and its runtime;
• Sonar Database container – optional container providing the Sonar Database instance.
The SonarQube container also exposes the user interface and APIs, which may be used for both administration and code review, depending on the user's role. The SonarQube service depends on the Sonar Database, Persistent Volumes and IdM services: the Sonar Database serves as persistent storage for analysis reports and artifacts, while the IdM Service provides authentication and authorization. The SonarQube application and its functionality can be extended using various plugins provided either by the vendor or the broad community. For this project the following plugins have been selected and explored:
• LDAP – delegates authentication to LDAP;
• Generic Coverage – imports coverage reports defined in a given format;
• Git – Git SCM provider;
• CSS – enables analysis of CSS and Less files;
• JavaScript – enables scanning of JavaScript source files;
• PHP – enables analysis of PHP projects;
• Web Languages – enables scanning of HTML and JSP/JSF files;
• XML – enables analysis and reporting on XML files.
Things to keep in mind:
• In the current setup the Sonar Database is provided by the Persistent Database Storage service and a dedicated Sonar Database container is not used;
• SonarQube uses the JVM, and Java is hungry for memory; therefore, for a production setup the container memory constraints should be carefully chosen considering JVM memory
allocation and garbage collection settings. Obviously, those settings must be correlated with the container resource constraints for optimal performance;
• An additional sizing exercise must be performed to identify the number of scanner jobs that may be submitted concurrently;
• After successful authentication, all user groups provided by the IdM Service are mapped automatically to the groups known by SonarQube. There are two SonarQube groups set up by default: sonar-users and sonar-administrators. It's sufficient to create such groups on the Active Directory side and add users to them accordingly; after authentication these users will be properly mapped and grouped by the SonarQube application too;
• By default, sonar-users may submit new analysis jobs and review results, while sonar-administrators can manage the application and its settings. However, by default, sonar-administrators cannot submit new analysis jobs;
• The SonarQube project is actively developed and new versions appear quite often. Unless you require a particular fix implemented in those versions, it's recommended to stick to a certain LTS version for the sake of consistency and compatibility with the chosen plugins.
Sonar Database
The Sonar Database service can be implemented as a Docker container or an external database identified by its DSN (Data Source Name). The DB schema required by SonarQube is created automatically upon application startup and first DB access, if the Sonar DB user has sufficient access permissions; otherwise the schema must be pre-created manually. As already mentioned, in the current setup the Sonar Database is provided by the Persistent Database Storage service, therefore a dedicated Sonar Database container is not used.
Sonar Scanner
The Sonar Scanner service is implemented as a Docker container located on the utility host in the foundation farm. Its main purpose is submitting projects to the SonarQube Analysis Engine. The usual workflow looks like the following:
• The Sonar Scanner is executed:
o This may be a standalone task, when a user initiates a code analysis project;
o The code analysis task may be invoked as a part of a bigger workflow;
• The Sonar Scanner submits the project code to the Sonar Analysis Engine and reports the scanner task ID and analysis progress back to the caller;
• The Sonar Analysis Engine validates the code against the defined quality rules and standards. All identified deviations, code defects and issues are stored in the Sonar Database;
• When the analysis is completed, the user can access the SonarQube Portal to review the analysis summary and code quality metrics. From there the user can drill down to every single issue and understand the scope and problems identified with the code fragment;
• After fixing the code quality issues and identified defects, the analysis may be repeated. In this case the Sonar Analysis Engine also calculates various trends and incremental code quality metrics so that users can track code quality improvement over time.
The Sonar Scanner service depends on the SonarQube service. There is no service directly depending on the Sonar Scanner, though indirectly it may be invoked by workflow tasks or triggered by the Platform CLI. For example, let's manually trigger code analysis for the website project. The --login parameter uses a Sonar token instead of a user-password pair, which is the preferred and more secure approach.
$ /opt/deploy/sonar --login 4f758a739160aeec49d5c7ed628a4f... \
    --path /var/web/dev/root/d7-demo/demo/docroot \
    --group demo_agency --name d7_demo --version 7.50
INFO: Scanner configuration file: /opt/sonar-scanner/conf/sonar-scanner.properties
INFO: Project root configuration file: NONE
INFO: SonarQube Scanner 2.6.1
INFO: Java 1.8.0_92-internal Oracle Corporation (64-bit)
INFO: Linux 3.10.0-327.28.3.el7.x86_64 amd64
INFO: User cache: /workspace/cache
INFO: Load global repositories
INFO: Load global repositories (done) | time=180ms
INFO: User cache: /workspace/cache
INFO: Load plugins index
INFO: Load plugins index (done) | time=4ms
INFO: SonarQube server 5.6.1
INFO: Default locale: "en_US", source code encoding: "UTF-8" (analysis is platform dependent)
INFO: Process project properties
INFO: Load project repositories
INFO: Load project repositories (done) | time=48ms
INFO: Load quality profiles
INFO: Load quality profiles (done) | time=75ms
INFO: Load active rules
INFO: Load active rules (done) | time=587ms
INFO: Publish mode
INFO: ------------- Scan d7-demo
INFO: Language is forced to php
INFO: Load server rules
INFO: Load server rules (done) | time=115ms
INFO: Base dir: /workspace
INFO: Working dir: /workspace/.sonar
INFO: Source paths: docroot
INFO: Source encoding: UTF-8, default locale: en_US
INFO: Index files
INFO: 1296 files indexed
INFO: Quality profile for php: Sonar way
INFO: Sensor NoSonar and Commented out LOC Sensor
INFO: Sensor NoSonar and Commented out LOC Sensor (done) | time=1084ms
INFO: Sensor Lines Sensor
INFO: Sensor Lines Sensor (done) | time=60ms
INFO: Sensor PHPSensor
INFO: 1296 source files to be analyzed
INFO: 50/1296 files analyzed, current file: /workspace/docroot/includes/lock.inc
INFO: 150/1296 files analyzed, current file: /workspace/docroot/modules/openid/openid.pages.inc
INFO: 203/1296 files analyzed, current file: /workspace/docroot/modules/simpletest/tests/upgrade/drupal-6.forum.database.php
INFO: 346/1296 files analyzed, current file: /workspace/docroot/profiles/vzbase/modules/ctools/includes/wizard.theme.inc
INFO: 635/1296 files analyzed, current file: /workspace/docroot/profiles/vzbase/modules/features/features.admin.inc
INFO: 847/1296 files analyzed, current file: /workspace/docroot/profiles/vzbase/modules/views/handlers/views_handler_field.inc
INFO: 1088/1296 files analyzed, current file: /workspace/docroot/profiles/vzbase/modules/views/plugins/views_plugin_exposed_form.inc
INFO: 1296/1296 source files have been analyzed
INFO: Sensor PHPSensor (done) | time=86335ms
...
INFO: Analysis report generated in 1676ms, dir size=22 MB
INFO: Analysis reports compressed in 2316ms, zip size=8 MB
INFO: Analysis report uploaded in 700ms
INFO: ANALYSIS SUCCESSFUL, you can browse http://10.169.64.241:9000/dashboard/index/vzpoc
INFO: Note that you will be able to access the updated dashboard once the server has processed the submitted analysis report
INFO: More about the report processing at http://10.169.64.241:9000/api/ce/task?id=AVfCEentql6U738tQ3qu
INFO: ------------------------------------------------------------------------
INFO: EXECUTION SUCCESS
INFO: ------------------------------------------------------------------------
INFO: Total time: 1:50.097s
INFO: Final Memory: 53M/908M
INFO: ------------------------------------------------------------------------
Now, in the Sonar User Portal we can see the analysis summary along with the code quality metrics:
Figure 8 – Sonar Project Dashboard
SonarQube does provide a number of scan-rule packages; however, you may want to include additional rules provided by vendors, in our case for PHP and Drupal, or implement your own rules specific to your organization's quality guidelines and policies. It's important to understand that no automated code analyzer can replace good coding culture and peer code reviews. However, such code analysis tools can be seen as a great helper for identifying many classes of coding errors. We can check the first bug from the list to get an idea about the kinds of issues caught by the SonarQube code analyzer. As you can see from the screenshot below, the alert was caused by a for-loop missing its code block. The code quality topic warrants an involved discussion, and scan results will depend a lot on the scanner rules and applied quality policies. There will definitely be some false positives found.
Figure 9 – Sonar Issue Report
The Sonar Scanner parameters can be adjusted via command-line options. Alternatively, you can use a sonar-project.properties configuration file in the root directory of the project, e.g.
# must be unique in a given SonarQube instance
sonar.projectKey=my:project
# this is the name and version displayed in the SonarQube UI. Was mandatory prior to SonarQube 6.1.
sonar.projectName=My project
sonar.projectVersion=1.0
# Path is relative to the sonar-project.properties file.
# Since SonarQube 4.2, this property is optional if sonar.modules is set.
# If not set, SonarQube starts looking for source code from the directory containing
# the sonar-project.properties file.
sonar.sources=.
# Encoding of the source code. Default is default system encoding
#sonar.sourceEncoding=UTF-8
For more details on scanner parameters see the following links:
http://docs.sonarqube.org/display/SCAN/Analyzing+with+SonarQube+Scanner
http://docs.sonarqube.org/display/SONAR/Analysis+Parameters
Things to keep in mind:
• The Sonar Scanner uses the JVM, and Java is hungry for memory; therefore, for a production setup the container memory constraints should be carefully chosen considering JVM memory
allocation and garbage collection settings. Obviously, those settings must be correlated with the container resource constraints for optimal performance;
• An additional sizing exercise must be performed to identify the number of scanner jobs that may be submitted concurrently;
• The scanner behavior can be refined and controlled by providing additional scanner settings in the corresponding configuration files. This can be done via project- or module-wide settings files and may thus be controlled by the developer.
Platform Interfaces
The platform provides multiple interfaces of various types: Web UI, CLI and API. These different interfaces usually provide access to the same functionality, and the appropriate interface should be chosen depending on the use-case.
API Endpoints
Generally speaking, these API endpoints are for internal use only and are not exposed by the platform for external consumption, with some possible exceptions2. Below is the list of component APIs:
• MS ADC – the LDAPS protocol is used to access the standard APIs implemented by MS ADC
• Grafana – http://docs.grafana.org/reference/http_api/
• InfluxDB – https://docs.influxdata.com/influxdb/v1.0//tools/api/
• cAdvisor – https://github.com/google/cadvisor/blob/master/docs/api.md
• SyncThing – https://docs.syncthing.net/dev/rest.html
• Docker Engine – https://docs.docker.com/engine/reference/api/docker_remote_api/
• Docker Registry – https://docs.docker.com/registry/spec/api/
• Jenkins – https://wiki.jenkins-ci.org/display/JENKINS/Remote+access+API
• GitLab – https://docs.gitlab.com/ce/api/
• SonarQube – http://docs.sonarqube.org/display/DEV/Web+API
• Drupal – https://www.drupal.org/drupalorg/docs/api
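As an illustration of how such component APIs are typically consumed, the sketch below queries the GitLab API (v3, which matches the GitLab CE 8.x used here); the host, port and token are illustrative placeholders:

# list projects visible to the token owner
$ curl --header "PRIVATE-TOKEN: <token>" "http://gitlab.poc:8080/api/v3/projects"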
Command Line Interfaces
The set of CLI tools provides the next abstraction level above the programmatic interfaces (APIs) and is used for particular administrative and troubleshooting tasks. These CLI tools provide input validation and parameter auto-completion, along with other convenient shortcuts and helpers. For day-to-day administrative and support activities, however, it is expected that the Web UI and corresponding portals will be used predominantly.
Platform CLI
The Platform CLI, or management CLI toolkit, is designed with a master-slave or hub-spoke architecture in mind. The platform management tools are deployed on a management (master or hub) node, and management tasks are submitted to the slave or spoke nodes. That said, a distributed or multi-master model is also possible. The CLI can be executed on any node in the farm. If a management task targets the current node, the task is performed locally; otherwise, the target node is contacted and the task is delegated and executed remotely.
2 If the platform has to be integrated with other enterprise tools and services, then yes, specific APIs may be exposed to external applications.
Thus, the management toolkit can be used in both scenarios: with a dedicated management node and without one. The dotted lines on the platform diagram show that an administrative user may use the CLI tools. This is considered low-level access and, generally speaking, it is discouraged in favor of management workflows. Those workflows still rely upon the Platform CLI; however, they implement strict validation and task sequences. Thus, management workflows guarantee task serialization and consistency, which may not always be ensured when the CLI tools are executed manually. For the current project, the Platform CLI has been deployed on the utility host, at the util.poc.local:/opt/deploy location. The following CLI tools belong to the Platform CLI suite. One potential improvement would be packaging the Platform CLI tools into a separate container, making them completely independent of the runtime provided by a particular host. Obviously, this would also remove the dependency on the openldap client libraries on the host where adtool is executed.
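The local-versus-remote dispatch described above can be pictured with the following conceptual shell sketch; this is not the actual tool code, and the variable names are made up for illustration:

# if the task targets another node, re-spawn the same CLI there over SSH;
# otherwise fall through and execute the task locally
if [ "$TARGET_HOST" != "$(hostname -s)" ]; then
    exec ssh "$TARGET_HOST" /opt/deploy/web "$@"
fi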
LDAP/AD CLI
This tool is used to manage hosting-related objects located in LDAP/AD containers. The tool is a wrapper for the aforementioned adtool; besides translating AD management operations into LDAP calls, it ensures the AD structure and validates user inputs against naming standards and rules. The credentials for the LDAP management account are stored in the Vault credential storage.
$ ./vault list /secret/poc/ldap/
binddn
bindpw
$ ./vault get /secret/poc/ldap/binddn
CN=adtool,CN=Service Accounts,OU=HOSTING,DC=POC,DC=LOCAL
$ ./ldap
LDAP/AD CLI tool for managing hosting related objects
Usage: ./ldap <object> <action> [<args 1> ... <arg N>]
Objects and actions:
  group <list|add|del|users|adduser|deluser> <args...>
    list (dir) [--pattern <pattern>] [--format <short|long>]
    add (create) --group <group name>
    del (remove) --group <group name>
    users (members) --group <group name> [--format <short|long>]
    adduser (useradd) --group <group name> --org <org name> --name <user name>
    deluser (userdel) --group <group name> --org <org name> --name <user name>
  org <list|add|del> <args...>
    list (dir) [--pattern <pattern>] [--format <short|long>]
    add (create) --org <org name>
    del (remove) --org <org name>
  user <list|add|del> <args...>
    groups (memberof) --name <user name> [--format <short|long>]
    list (dir) --org <org name> [--pattern <pattern>] [--format <short|long>]
    add (create) --org <org name> --first <first name> --last <last name> --login <user login> --mail <email> --password <password>
    del (remove) --org <org name> --name <user name>
GitLab CLI
This tool is used for managing the GitLab application from the command line. It talks directly to the GitLab APIs and can be used for automating administrative and maintenance activities. The GitLab management token is stored in the Vault and provided by the Secure Storage service at runtime, so the administrative GitLab credentials are not exposed anywhere and can be safely changed, secured or stored in a different credential storage.
This management token belongs to the "API bot" user called robot, so all administrative tasks are performed on its behalf, and the administrative user root is mainly for Web UI administrative use. Both the root and robot administrative users are created in the local GitLab user database. All other users are set up for LDAP authentication using the IdM Service.
# the API token used to manage GitLab
$ /opt/deploy/vault list /secret/poc/gitlab
token
$ /opt/deploy/gitlab
GitLab CLI tool for managing hosting related objects
Usage: ./gitlab <object> <action> [<args 1> ... <arg N>]
Objects and actions:
  group <list|add|del|users|adduser|deluser> <args...>
    list (dir) [--pattern <pattern>] [--format <list|table>]
    add (create) --group <group name> [--description <description>]
    del (remove) --group <group name>
    users (members) --group <group name> [--format <list|table>]
    adduser (useradd) --group <group name> --login <user login> [--access <guest|reporter|developer|master|owner>]
    deluser (userdel) --group <group name> --login <user login>
  project <list|add|del> <args...>
    list (dir) [--group <group name>] [--pattern <pattern>] [--format <list|table>]
    add (create) --project <project name> --group <project group> [--description <project description>] [--clone <project path>]
    del (remove) --project <project name> --group <project group>
  user <list|add|del> <args...>
    list (dir) [--pattern <pattern>] [--format <list|table>]
    add (create) --login <user login> --name <First Last> --mail <email> --org <LDAP org>
    del (remove) --login <user login>
Web CLI
This tool is used for managing web projects, their web containers and load-balancer VIPs. It is one of the key platform management tools. It "knows" how to deploy new web projects from the source code repositories to the target web hosting farm and environment and, like the other CLI tools, is used by the platform management workflows. Strictly speaking, the Web CLI is the only platform tool that depends on SSH connectivity, delegating tasks and re-spawning itself on remote nodes. The tool does not require any credentials; remote host access is performed using password-less, key-based SSH connectivity. It's worth mentioning that SSH is set up to use connection multiplexing, giving an additional speed-up to cross-host connectivity. One possible improvement would be using an SSH agent and storing the key password in the Vault. This would prevent an unauthorized user from hopping between hosts in the farm while still allowing authorized platform services cross-host SSH access.
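For reference, SSH connection multiplexing of the kind mentioned above is typically enabled with client settings similar to the following sketch; the host pattern and timeout are illustrative, not the actual farm configuration:

# ~/.ssh/config: reuse one TCP connection for repeated cross-host calls
Host *.poc.local
    ControlMaster auto
    ControlPath ~/.ssh/mux-%r@%h:%p
    ControlPersist 10m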
$ /opt/deploy/web
CLI tool for managing web projects and containers
Usage: ./web <object> <action> [<args 1> ... <arg N>]
Objects and actions:
  container <deploy|health|ipmap|list|stats|remove> <args...>
    deploy (create) --site <site name> --farm <farm> --env <env> [--host <target>] [--ip <ipaddr>] [--size <small|medium|large>] [--image <image name>]
    health --site <site name> --farm <farm> --env <env> [--host <target>]
    ipmap [--site <site name>] --farm <farm> --env <env> [--host <target>]
    list [--site <site name>] --farm <farm> --env <env> [--host <target>] [--format <list|table>]
    stats [--site <site name>] --farm <farm> --env <env> [--host <target>]
    remove (delete) --site <site name> --farm <farm> --env <env> [--host <target>]
  project <build|deploy|list|remove> <args...>
    list (dir) [--site <site name>] --farm <farm> --env <env> [--host <target>] [--pattern <pattern>] [--format <list|table>]
    build --project <project name> --group <project group> [--description <project description>] [--profile <profile path>]
    deploy --site <site name> --farm <farm> --env <env> [--host <target>] --org <org name> --project <project name> --group <project group>
    remove --site <site name> --farm <farm> --env <env> [--host <target>] --path <archive path | /dev/null>
  vip <down|status|test|up>
    down --site <site name> --farm <farm> --env <env> [--host <target>]
    status --site <site name> --farm <farm> --env <env> [--host <target>]
    test --site <site name> --farm <farm> --env <env> [--host <target>]
    up --site <site name> --farm <farm> --env <env> [--host <target>]
Vault CLI
HashiCorp Vault (https://www.vaultproject.io/) provides its own CLI tool; however, it is more practical to access the secure storage over the API. Moreover, additional input validation has been implemented to make Vault usage safer and more convenient.
$ /opt/deploy/vault
Vault CLI tool for managing objects in secure storage
Usage: ./vault <action> <key path> [<value>]
  <key path> - the key path in the secure storage; has the format /<key>/.../<path> and can be up to 128 chars long. Currently all keys are put under the /secret prefix.
  <value>    - the value string can be up to 1024 chars long and may contain any ASCII characters but double-quotes.
Supported actions:
  list (dir)  <key path>
  get (read)  <key path>
  del (drop)  <key path>
  put (set)   <key path> <value>
The Vault is modular and supports multiple storage and authentication back-ends. For this project, the built-in key-based mechanism is used for authentication, requiring at least 3 out of 5 unlock keys in order to unseal the Vault storage. It is possible to use the LDAP authentication backend too, but it may lead to circular dependencies if the credentials required for accessing LDAP are stored in the Vault, which is locked and must be unsealed first.
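For reference, the 3-out-of-5 unseal scheme corresponds to an initialization like the sketch below, using the stock Vault CLI of that era; the key placeholders are illustrative:

# initialize with 5 key shares, any 3 of which can unseal the storage
$ vault init -key-shares=5 -key-threshold=3
# any three of the generated unseal keys bring the Vault online
$ vault unseal <unseal-key-1>
$ vault unseal <unseal-key-2>
$ vault unseal <unseal-key-3>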
Scanner CLI
This is a wrapper for the sonar-scanner Java application used to submit code analysis tasks to the SonarQube server. The Java application is executed in a container to enforce resource constraints. User inputs are additionally validated against naming standards.
$ /opt/deploy/vault list /secret/poc/sonar
token
$ /opt/deploy/scanner
Sonar-Scanner: submits given project to the SonarQube code analyzer
Usage: ./scanner --option1 <value 1> ... --optionN <value N> [-- -Dsonar.key=val ...]
Supported options:
  --path <project path>         File-system path where the project has been deployed.
  --group <group name>          The project group name
  --name <project name>         The project name
  --version <project version>   The project version
  [--sources <source subdir>]   Optional source folder location relative to project path if sources stored in subdir.
  [--host <Server URL>]         Optional SonarQube server URL
For additional analysis parameters, see http://docs.sonarqube.org/display/SONAR/Analysis+Parameters
Although it's possible to use both a username-password pair and a secure token for SonarQube server authentication, the secure token is preferred. For submitting jobs, this tool uses the authentication token from the Vault.
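A typical invocation, following the usage shown above, might look like this; all values are illustrative:

$ /opt/deploy/scanner --path /var/www/alpha_agency/mysite \
    --group alpha_agency --name mysite --version 1.0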
Sonar CLI
The SonarQube application can be managed either via the Web UI or via the REST API. The latter approach is employed for automating general management tasks, such as managing authorization groups, project permission templates and the projects themselves. In order to provide reliable isolation in a multi-user and multi-project environment, the following strategy has been proposed and implemented:
• Each LDAP organization gets its own LDAP group(s) assigned. This LDAP group is named after the organization name with a Sonar Users suffix added;
• A corresponding Sonar authorization group with the same name as the LDAP group is created. This enables mapping between LDAP and Sonar groups during the authorization phase;
• A permission template is created in Sonar for binding permissions to the projects whose keys match a given pattern;
• Eventually, Sonar authorization group(s) with specific permissions are added to the template, thus implementing the authorization matrix, which defines what users or user groups are allowed to perform certain operations on given projects.
This is better explained with an example. Let's assume our organization is called "Alpha Agency" and we want the organization users in the corresponding LDAP groups to browse Sonar projects and access project code. The following configuration has to be performed to implement an authorization strategy allowing organization and project isolation.
LDAP:  add organization         org name: "Alpha Agency"
LDAP:  add group                group name: "Alpha Agency Sonar Users"
SONAR: add group                group name: "Alpha Agency Sonar Users"
SONAR: add permission template  template name: "Alpha Agency"
                                project key: "alpha_agency:*"
SONAR: add permission           template name: "Alpha Agency"
                                group: "Alpha Agency Sonar Users"
                                permission: "Browse"
SONAR: add permission           template name: "Alpha Agency"
                                group: "Alpha Agency Sonar Users"
                                permission: "See Source Code"
This effectively means that all users in the Alpha Agency Sonar Users group will be granted the Browse and See Source Code permissions for all Sonar projects with project keys matching the "alpha_agency:*" pattern. Such an authorization structure allows us to:
• Separate LDAP organizations and their projects from each other;
• Grant specific organization users access to Sonar projects with the given access-level granularity;
• Add new projects to Sonar so that they inherit the same permission set, thanks to the permission template mechanisms in place.
For more details about Sonar authorization and permission templates, see the following links:
http://docs.sonarqube.org/display/SONARQUBE56/Authorization
http://docs.sonarqube.org/display/SONAR/Authorization
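Under the hood, the Sonar CLI performs web-service calls along the lines of the sketch below. The endpoint names come from the SonarQube permissions web API of that release; the exact parameter names and the token are assumptions and should be verified against the instance's built-in web-API documentation:

# create a permission template bound to a project key pattern (a regex)
$ curl -u <token>: -X POST "http://sonar.poc:9000/api/permissions/create_template" \
    -d "name=Alpha Agency" -d "projectKeyPattern=alpha_agency:.*"
# grant the group Browse ("user") and See Source Code ("codeviewer") permissions
$ curl -u <token>: -X POST "http://sonar.poc:9000/api/permissions/add_group_to_template" \
    -d "templateName=Alpha Agency" -d "groupName=Alpha Agency Sonar Users" -d "permission=user"
$ curl -u <token>: -X POST "http://sonar.poc:9000/api/permissions/add_group_to_template" \
    -d "templateName=Alpha Agency" -d "groupName=Alpha Agency Sonar Users" -d "permission=codeviewer"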
Report CLI
The Report CLI runs predefined InfluxQL scripts and queries against the Stats Database and generates reports in different formats.
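As an illustration, a report query of this kind might look like the sketch below; the measurement, field and tag names assume the cAdvisor schema used by the Stats Database and are given here only as an example:

-- average per-container memory usage over the last day, bucketed by hour
SELECT MEAN(value) FROM memory_usage
WHERE time > now() - 1d
GROUP BY time(1h), container_name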
Test CLI
This is a set of tools validating the platform services and overall platform health. The testing tools are idempotent, meaning that they can be executed repeatedly; in case of failure, the test will continue from the failed step and proceed until completion, when all objects created during the test are automatically cleaned up. The setup validation tool checks whether:
• Platform Services are deployed;
• Platform Services are running and healthy;
• Platform Service endpoints are published and accessible;
• Credentials for accessing Platform Services have been set up and stored in the Vault.
For example:
testSetup: ### checking service "cadvisor"
testSetup: -- found container "cadvisor"
testSetup: -- container "cadvisor" is using image "google/cadvisor:v0.23.8"
testSetup: -- container "cadvisor" is up and running
testSetup: -- 10.169.64.241:18080: connection successful
testSetup: ### checking service "grafana"
testSetup: -- found container "grafana"
testSetup: -- container "grafana" is using image "registry.poc:5000/poc/grafana:2.6.0"
testSetup: -- container "grafana" is up and running
testSetup: -- 10.169.64.241:3000: connection successful
testSetup: ### checking service "influxdb"
testSetup: -- found container "influxdb"
testSetup: -- container "influxdb" is using image "registry.poc:5000/poc/influxdb"
testSetup: -- container "influxdb" is up and running
testSetup: -- 10.169.64.241:8083: connection successful
testSetup: -- 10.169.64.241:8086: connection successful
testSetup: ### checking service "sonar"
testSetup: -- found container "sonar"
testSetup: -- container "sonar" is using image "registry.poc:5000/poc/sonar:5.6.1"
testSetup: -- container "sonar" is up and running
testSetup: -- 10.169.64.241:9000: connection successful
testSetup: ### checking service "vault"
testSetup: -- found container "vault"
testSetup: -- container "vault" is using image "registry.poc:5000/poc/vault"
testSetup: -- container "vault" is up and running
testSetup: -- 10.169.64.241:8200: connection successful
testSetup: ### checking service "jenkins"
testSetup: -- found container "jenkins"
testSetup: -- container "jenkins" is using image "jenkins:alpine"
testSetup: -- container "jenkins" is up and running
testSetup: -- 10.169.64.241:50000: connection successful
testSetup: -- 10.169.64.241:8888: connection successful
testSetup: ### checking service "gitlab"
testSetup: -- found container "gitlab"
testSetup: -- container "gitlab" is using image "gitlab/gitlab-ce:8.6.8-ce.0"
testSetup: -- container "gitlab" is up and running
testSetup: -- 10.169.64.241:8080: connection successful
testSetup: -- 10.169.64.241:8443: connection successful
testSetup: -- 10.169.64.241:2222: connection successful
testSetup: ### checking service "registry"
testSetup: -- found container "registry"
testSetup: -- container "registry" is using image "registry:2.5"
testSetup: -- container "registry" is up and running
testSetup: -- 10.169.64.241:5000: connection successful
testSetup: ### looking up "LDAP Bind Username" in Vault
testSetup: vault:/secret/poc/ldap/binddn found
testSetup: ### looking up "LDAP Bind Password" in Vault
testSetup: vault:/secret/poc/ldap/bindpw found
testSetup: ### looking up "MySQL DBA Username" in Vault
testSetup: vault:/secret/poc/mysql/DBA_USER found
testSetup: ### looking up "MySQL DBA Password" in Vault
testSetup: vault:/secret/poc/mysql/DBA_PASS found
testSetup: ### looking up "GitLab Auth Token" in Vault
testSetup: vault:/secret/poc/gitlab/token found
testSetup: ### looking up "SonarQube Auth Token" in Vault
testSetup: vault:/secret/poc/sonar/token found
The test-workflow validation tool checks the orchestration and management workflows by executing a number of tasks and implementing a complete website life-cycle, including, but not limited to, the following steps:
• Setup:
  - List LDAP orgs (ldap org list)
  - Add LDAP org (ldap org add)
  - Add LDAP user (ldap user add)
  - Test user has been added (ldap user list)
  - List LDAP groups (ldap group list)
  - Add LDAP group (ldap group add)
  - List LDAP group users (ldap group users)
  - Add LDAP user to group (ldap group adduser)
  - Test user is member of the LDAP group (ldap user groups)
  - List GitLab groups (gitlab group list)
  - Add GitLab group (gitlab group add)
  - List GitLab users (gitlab user list)
  - Add GitLab user (gitlab user add)
  - Add GitLab user to group (gitlab group adduser)
  - Test user is member of the GitLab group (gitlab group users)
  - Build web project from base (web project build)
  - Create GitLab project (gitlab project add)
  - List GitLab projects (gitlab project list)
  - Deploy web project (web project deploy)
  - Test web project deployed (web project list)
  - Deploy web container (web container deploy)
  - Test web container deployed (web container list)
  - Enable VIP (web vip up)
  - Test VIP status (web vip status)
• Test:
  - List IP map (web container ipmap)
  - Test VIP monitor (web vip test)
  - Test web container health (web container health)
  - Get web container stats (web container stats)
• Cleanup:
  - Disable VIP (web vip down)
  - Remove web container (web container remove)
  - Remove web project (web project remove)
  - Remove GitLab user from group (gitlab group deluser)
  - Remove GitLab project (gitlab project del)
  - Remove GitLab group (gitlab group del)
  - Remove GitLab user (gitlab user del)
  - Remove LDAP user from group (ldap group deluser)
  - Remove LDAP user (ldap user del)
  - Remove LDAP group (ldap group del)
  - Remove LDAP org (ldap org del)
Docker CLI
The same disclaimer as for the Platform CLI applies to the Docker CLI too. While the Docker CLI may surely be used for all management tasks, generally speaking, for the sake of consistency its use is discouraged in favor of the platform management tools and workflows. For more details on the Docker CLI and particular options, visit the following link:
https://docs.docker.com/engine/reference/commandline/cli/
Web Portals
The Web Portals are used for administrative, system management and development tasks. Often, these portals are provided by existing service components, and the component separation on the diagram is rather logical.
Stats Visualization Portal
The Stats Visualization Portal is implemented using Grafana, an open-source metric analytics and visualization suite. Grafana is preconfigured to use the Stats Database as its data source. Below is an example view of the resource monitoring dashboard.
Figure 10 - Stats Visualization and Analysis Portal
After this basic configuration, the Stats Visualization Portal can be accessed to view and analyze container resource usage. The Stats Visualization Portal may be set up to authenticate against Active Directory. Depending on the assigned role, the user can only see measurements, or can also update and modify the monitoring dashboards and the database queries used for fetching monitoring data. There is a handy script to help set up the DSN and dashboards for the first time.
# running Grafana in container
$ docker run --name=grafana --hostname grafana.poc --detach=true --restart=always \
    --cpu-shares 50 --memory 50m \
    --publish=3000:3000 \
    registry.poc:5000/poc/grafana:2.6.0

# first-time setup (make sure that the setup script has proper credentials for InfluxDB access)
# - creating a Data Source Name (DSN) pointing to InfluxDB
# - setting up custom dashboards for visualizing container metrics
$ ./setup.sh
Usage: setup.sh <db_name> [<dashboards file mask>]
Example: setup.sh cadvisor './dashboards'
$ ./setup.sh cadvisor dashboards
Grafana: data source cadvisor created
Grafana: adding dashboard ./dashboards/ContainerStats.json
Grafana: dashboard ./dashboards/ContainerStats.json created
GitLab Portal
The GitLab Portal is provided by the GitLab application stack.
Figure 11 - GitLab Portal
GitLab uses the IdM service for authentication and authorization, so in order to access this portal one has to provide valid user credentials. Depending on the assigned role, the user can access projects, browse code repositories or perform other tasks. The same portal is presented to developers and administrative users; the level of access depends on the user role granted in the LDAP user directory.
Sonar Portal
The Sonar Portal is provided by the SonarQube application stack. SonarQube uses the IdM service for authentication and authorization, so in order to access this portal one has to provide valid user credentials. Depending on the assigned role, the user can access projects, browse source code or perform management tasks. The same portal is presented to developers and administrative users; the level of access depends on the RBAC setup and the user role granted in the LDAP user directory.
Figure 12 - Sonar Portal
The authorization strategy is set up in such a way that a user is only able to access and perform operations on projects owned by the same LDAP organization the user belongs to.
Platform Orchestration Portal
The Platform Orchestration Portal is implemented using the Jenkins automation framework along with various Jenkins plugins. The Portal and the Workflow Engine are provided by the same container; the separation here is rather logical. In essence, Jenkins provides the orchestration engine, role-based authentication using LDAP integration, as well as the Web UI (and API) to trigger pre-configured tasks and workflows. Those workflows in turn use either the Platform CLI or component APIs to perform the required actions. For additional details about Jenkins tasks, see the Management Tasks and Workflows chapter. The Platform Orchestration Portal is set up to authenticate against Active Directory, so in order to access this portal one has to provide valid user credentials. Depending on the assigned role, the user can see reports, and execute or modify management workflows.
Below is a screenshot showing the most common management tasks and workflows.
Figure 13 - Platform Orchestration Portal
Other Components
The following components are, strictly speaking, not Platform Services, but rather general-purpose platform building blocks encapsulating specific functionality.
Docker Engine
The vendor-provided definition: "The Docker Engine is a lightweight container runtime and robust tooling that builds and runs your container. Docker allows you to package up application code and dependencies together in an isolated container that share the OS kernel on the host system. The in-host daemon communicates with the Docker Client to build, ship and run containers." The vendor site should be consulted for more details; see the following links:
https://www.docker.com/products/docker-engine
https://docs.docker.com/engine/
It is worth mentioning that there is no hard dependency on Docker Engine itself; it may be replaced with CoreOS rkt or LXD alternatives. Obviously, some platform components may need to be adjusted, but due to service encapsulation, this effort is expected to be minimal. A deep dive into the Docker Engine ecosystem and alternatives is out of scope of this paper, and the only subject we will discuss in more detail is storage scalability in Docker.
Docker Containers
As with Docker Engine, there is no direct dependency on Docker Containers specifically. Other alternatives may be used too, depending on specific platform requirements. From a platform perspective, containers are seen as a mere application virtualization construct ensuring component "containment", i.e. packaging, security and isolation. The vendor site should be consulted for more details.
Platform Capacity Model
The proposed capacity model tries to optimize resource usage and simplify calculations and predictive analytics. For this we'll make several assumptions and impose several constraints:
• Each CPU core has a capacity of 1024 compute units3;
• The minimal RAM unit is 1MB;
• The optimal CPU:RAM ratio4 for web applications is 1:4, i.e. 1 CPU unit gets 4 RAM units;
• Several resource allocation profiles are defined: small, medium and large;
• Host resources can be fully utilized by combining different application sizes;
• The next profile size requires 4 times more resources than the given one;
• The host OS resource usage is steady, predictable and well known;
• Resources are not over-committed5, and an application is always guaranteed to get its allocated resources, even under full load;
• If an application demands more CPU resources than expected, it will get them as long as the resource requirements of the other applications are met. In other words, the application will be provided at least the guaranteed resource amount, or even more;
• If multiple applications simultaneously claim more than the guaranteed CPU resource amount, weighted resource distribution is performed;
• Unlike CPU, memory resources are always limited to a specific amount that may be seen as a soft quota. If an application demands more memory than allotted, such memory may still be provided using the OS page-swapping mechanism. The virtual memory amount per application is also limited and may be seen as a hard quota. If an application steps over this hard quota, it will be terminated by the kernel OOM handler.
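To make these assumptions concrete, here is a minimal sketch of how the three profiles could translate into container resource limits on a 4-CPU/16GB host, following the 1:4 CPU:RAM ratio and the 4x step between tiers; the image name and the swap limit are illustrative, not the actual platform settings:

# small:  16 CPU shares /  64 MB RAM (up to 256 per 4-CPU, 16GB host)
# medium: 64 CPU shares / 256 MB RAM (up to 64)
# large: 256 CPU shares /   1 GB RAM (up to 16)
$ docker run --detach --name site-medium \
    --cpu-shares 64 --memory 256m --memory-swap 512m \
    registry.poc:5000/poc/web:latest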
Figure 14 - Platform Capacity Model
3 Actually, this is not an assumption but rather a factual number defined by the Linux kernel.
4 This ratio is based on practical experience and must be seen as a recommendation only, not a hard number.
5 The current model assumes that OS resource usage is very small, and thus sharing the application resource pool with the OS won't produce any adverse effects. If OS load becomes an issue, an additional constant increment may be planned for resources, in order to account for OS demands.
The picture above summarizes the assumptions and provides some resource allocation guidance for various hardware profiles. Thus, a VM with 4 CPUs and 16GB of RAM may host up to 256 small, up to 64 medium, or up to 16 large applications (websites). Obviously, different combinations of various website sizes are possible too; in this case, the free CPU and RAM capacity may be calculated using the formula provided. Given a normal distribution of various application sizes, and assuming that applications (websites) can be moved or migrated between host systems, it's possible to perform the most optimal resource allocation without resource waste and complex predictive calculations. There are two specific use-cases to be considered as part of a broader capacity management discipline:
• Application re-tiering: the application demands more resources than the current tier guarantees, and its tier must be upgraded. Depending on the current host system capacity, there may be several options:
  o There is free capacity to fit the next resource allocation tier. In this case, only the resource allocations for the application container must be adjusted;
  o Defragmentation is needed, i.e. some other applications may be migrated to a different host system to free up the resources required on the given host system for increasing the application tier;
  o If no migration is possible (see the next point), an additional host system must be provisioned.
• Host Capacity Extension: there is no free capacity on the available host system(s) for provisioning the next application. In this case, additional host system(s) must be provisioned.
Things to keep in mind:
• Storage considerations (volume size, IOPS) have been deliberately skipped in the above capacity model, since they are not specific to web containers. Nonetheless, this subject is very important; it is driven more by application sizing agreements and requirements than by the platform itself;
• The capacity model described above considers the most generic use case, where hosted applications can only scale up by choosing one of the pre-defined resource profiles, not scale out;
• Due to the previous point, it is further assumed that the resource allocation decision is performed either manually or semi-manually, or that the orchestration service is aware of the different container sizes and deploys them accordingly;
• In case an application can scale out, it is recommended to standardize on a specific container size or resource allocation profile for optimal resource allocation, and to perform linear application scale-out, i.e. deploy more containers hosting application instances rather than having various container sizes and resource profiles;
• Generally speaking, the scale-out scenario is recommended and preferred over the scale-up approach, since the latter is limited by a single host's resource capacity. Unfortunately, most applications were not designed for scaling out, leaving the scale-up scenario as the sole option;
• Obviously, the scale-out scenario is preferred for orchestration engines like Kubernetes, since it allows for automatic resource ramp-up and ramp-down when application load decreases. Such auto-scaling mechanisms have been explored; however, they are out of scope of this paper.
Platform Security
Platform security is considered an integral part of the platform design. Therefore, wherever possible, the design strives to accommodate specific security controls and practices. This is the list of the specific security mechanisms that have been explored and implemented:
• Centralized Identity and Access Management (IdM);
• Role Based Access Control (RBAC);
• Read-only container file-system;
• Key-based content validation;
• Linux seccomp profiles;
• Linux SELinux policies;
• Linux AppArmor profiles;
• Linux user namespace remap;
• Container and image audits.
Unfortunately, not all security mechanisms get along well with each other. For example, user namespace remapping cannot be used for containers with read-only file-systems. There are some limitations when using SELinux policies, and seccomp profiles are a quite young feature that will mature over time. Different Linux distributions also prefer different mechanisms, such as AppArmor or SELinux. We won't be reviewing all security mechanisms in detail, since in-depth security research is out of scope of this paper. To get up to speed on the existing security mechanisms, however, it's recommended to check the following resources and articles:
• The Understanding Docker Security and Best Practices blog post;
• Project Nautilus: https://blog.docker.com/2016/05/docker-security-scanning/;
• Clair (https://github.com/coreos/clair), an open-source project for static analysis of vulnerabilities in containers;
• Docker Engine security improvements: https://blog.docker.com/2016/02/docker-engine-1-10-security/;
• A comprehensive overview of various container security tools and frameworks: https://www.alfresco.com/blogs/devops/2015/12/03/docker-security-tools-audit-and-vulnerability-assessment/.
User Namespace Remap
Essentially, a user namespace is a special Linux kernel mechanism allowing containers to have their own root user, completely separate from the host root account. For example, the root user in a container would be able to manage its root-owned files in the container, act as any user in the container, and manage its own network interfaces and some of its mount-points (restrictions apply), while at the same time being "mapped" or "translated" to, say, user "container" with UID 1000 on the host system. User namespaces were introduced as early as Linux 3.5 and are considered stable starting with Linux 4.3. User namespace remapping is a relatively new yet promising feature, and below we'll provide some configuration details and examples. Since user namespace mapping is disabled by default, first of all we need to enable this kernel feature:
# adding kernel parameter and rebooting
$ grubby --args="user_namespace.enable=1" --update-kernel=/boot/vmlinuz-3.10.0-327.28.3.el7.x86_64
$ reboot

# verify user namespace enabled
$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-327.28.3.el7.x86_64 root=/dev/mapper/rhel_rheltest2-root ro crashkernel=auto rd.lvm.lv=rhel_rheltest2/root rd.lvm.lv=rhel_rheltest2/swap rhgb quiet LANG=en_US.UTF-8 user_namespace.enable=1
Next, we need to set up some user and group IDs as well as their mapping rules:
# creating container-root user
$ groupadd --gid 100000 dc-root
$ useradd --system --uid 100000 --gid 100000 --home-dir / --no-create-home --shell /bin/false --comment "docker root user" dc-root
$ cat <<EOT >/etc/subuid
dc-root:100000:65535
EOT
$ cat <<EOT >/etc/subgid
dc-root:100000:65535
EOT
Finally, we need to enable user namespace remapping for containers, either by editing the systemd unit:
$ cat <<EOT >/etc/systemd/system/docker.service.d/override.conf
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon --storage-driver=overlay --userns-remap="dc-root"
EOT
Or, as the more preferred approach, by changing the Docker configuration file:
$ cat <<EOT >/etc/docker/daemon.json
{
  "debug": true,
  "selinux-enabled": false,
  "storage-driver": "overlay",
  "userns-remap": "dc-root",
  "live-restore": true
}
EOT
Now we should have the following mapping in place:
User:           root     none     web      redis
_____________________________________________________________
Host UID:       100000   101001   101002   101003
Container UID:  0        1001     1002     1003
Thus, the user web will have UID 1002 inside the container, while outside, from the host perspective, it will have UID 101002. As a last touch, we can hide this complexity to an extent by creating users with corresponding IDs on both the host system and inside the container image:
# On the Host
$ groupadd --gid 101001 none && useradd --system --uid 101001 --gid 101001 --home-dir / --no-create-home --shell /bin/false --comment "Nobody" none
$ groupadd --gid 101002 web && useradd --system --uid 101002 --gid 101002 --home-dir / --no-create-home --shell /bin/false --comment "Web User" web
$ groupadd --gid 101003 redis && useradd --system --uid 101003 --gid 101003 --home-dir /var/lib/redis --no-create-home --shell /bin/false --comment "Redis User" redis

# In the Container
$ groupadd --gid 1001 none && useradd --system --uid 1001 --gid 1001 --home-dir / --no-create-home --shell /bin/false --comment "Nobody" none
$ groupadd --gid 1002 web && useradd --system --uid 1002 --gid 1002 --home-dir / --no-create-home --shell /bin/false --comment "Web User" web
$ groupadd --gid 1003 redis && useradd --system --uid 1003 --gid 1003 --home-dir /var/lib/redis --no-create-home --shell /bin/false --comment "Redis User" redis
For additional details and examples, please see the following links:
http://rhelblog.redhat.com/2015/07/07/whats-next-for-containers-user-namespaces/#more-1004
http://goyalankit.com/blog/2016/06/25/user-namespace-in-red-hat-enterprise-linux-7-dot-2/
https://blog.yadutaf.fr/2016/04/14/docker-for-your-users-introducing-user-namespace/
https://docs.docker.com/engine/reference/commandline/dockerd/#/daemon-user-namespace-options
Although user namespace mapping is a great feature, it does have (at the time of writing) several known limitations:
• Sharing PID or NET namespaces with the host (--pid=host or --network=host);
• A --read-only container file-system (this is a Linux kernel restriction against remounting with modified flags of a currently mounted file-system when inside a user namespace);
• External volume or graph drivers, which are incapable of using daemon user mappings;
• Using the --privileged mode flag on docker run (unless also specifying --userns=host);
• In general, user namespaces are an advanced feature and will require coordination with other capabilities. For example, if volumes are mounted from the host, file ownership will have to be pre-arranged if the user or administrator wishes the containers to have the expected access to the volume contents;
• Finally, while the root user inside a user-namespaced container process has many of the expected admin privileges that go along with being the super-user, the Linux kernel imposes restrictions based on its internal knowledge that this is a user-namespaced process. The most notable restriction that we are aware of at this time is the inability to use mknod. Permission will be denied for device creation, even as container root inside a user namespace.
Things to keep in mind:
• As it stands now, one needs to choose between using a read-only container file-system and using user namespaces. Hopefully, this restriction will be lifted soon so that both mechanisms can be used simultaneously;
• At the time of writing, Red Hat does not support user namespaces yet and considers them an experimental feature.
Docker Bench for Security
Container security has been recognized by the community as one of the biggest hurdles, and at the same time pain points, for companies adopting containers. As a result, a lot more attention has been paid by vendors and developers to this subject. A number of security improvements have been proposed, and most of them are summarized in the CIS Docker Benchmark. The latest version:
https://benchmarks.cisecurity.org/tools2/docker/CIS_Docker_1.12.0_Benchmark_v1.0.0.pdf
There is a handy tool hosted in the Docker project repo: https://github.com/docker/docker-bench-security. Docker Bench for Security is a script that checks for dozens of common best practices around deploying Docker containers in production. The tests are all automated and are inspired by the CIS Docker Benchmark. The tool itself is regularly updated for every Docker release. It's recommended to perform regular platform audits using this benchmarking tool, perform risk analysis, and plan and implement remediation actions as described in the CIS Docker Benchmark paper. This exercise may be added to the list of security management tasks regularly performed on the container platform.
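A minimal sketch of a run, following the project README of the time (the script can also be run from a container; exact options vary between versions):

$ git clone https://github.com/docker/docker-bench-security.git
$ cd docker-bench-security
$ sudo sh docker-bench-security.sh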
Things to keep in mind:
• The CIS Security Benchmark is not a mandatory prescription. It's a guide and a set of checking rules helping to improve overall deployment security. Therefore, it is not necessary to implement every single control; a risk analysis should be performed first;
• The same security risk may be addressed by multiple controls, and it's enough to implement one of them, not all.
Web Application Security
Web application security has always been a tricky subject, and there is no ultimate recipe for making applications secure. For example, OWASP (https://www.owasp.org) lists the following controls:
• Verify for Security Early and Often
• Parameterize Queries
• Encode Data
• Validate All Inputs
• Implement Identity and Authentication Controls
• Implement Appropriate Access Controls
• Protect Data
• Implement Logging and Intrusion Detection
• Leverage Security Frameworks and Libraries
• Error and Exception Handling
Unfortunately, these controls are mostly concerned with the application code and its data management, which is largely out of the control of a hosting platform that provides a deployment placeholder and a reliable runtime. Using code analyzers, secured platform components and security-minded configuration practices can definitely improve overall application security; however, generally speaking, it won't guarantee sufficient protection. This calls for another security layer: the Web Application Firewall, or WAF.
The OWASP definition: "A web application firewall (WAF) is an application firewall for HTTP applications. It applies a set of rules to an HTTP/S conversation. Generally, these rules cover common attacks such as cross-site scripting (XSS) and SQL injection. While proxies generally protect clients, WAFs protect servers. A WAF is deployed to protect a specific web application or set of web applications." Another benefit of a WAF is that it can protect against both known and unknown security attacks, and shield against the DoS and DDoS types of attacks the application itself and the infrastructure may not be able to cope with. Often a WAF is combined with a content distribution (CDN) solution; therefore, we'll refer to a CDN/WAF combination, since most considerations apply to both. During the POC project, the VDMS CDN/WAF solution has been tested and validated. Generally speaking, adding a WAF or CDN must be completely transparent to the web application. However, there are a couple of points that need special attention:
• The WAF and CDN act as a reverse proxy, forwarding application requests on behalf of the user, so additional steps are required to identify the "Real User IP". Usually it is supplied in custom HTTP headers;
• If the CDN or CDN/WAF combination is caching application content or even dynamic pages, those caches must be considered when managing the application lifecycle, for example when flushing and re-populating caches or parts of them;
• Sometimes, for optimal performance and best integration, the application must be aware of the external CDN/WAF layer and be adjusted accordingly. There is a term coined by a major CDN vendor: the application must be "akamized";
• Since most WAF solutions use rule-based scoring systems, sometimes the WAF must be taught and adjusted for specific application behavior, in order to avoid reporting false positives and blocking valid application requests and activities;
• The WAF/CDN solution can also take care of SSL termination or bridging, depending on specific security requirements. In the latter case, when traffic is re-encrypted again, a self-signed CA may be used;
• Although most CDN/WAF solutions are managed via the corresponding vendor portals, using APIs for CDN/WAF service management is becoming more and more ubiquitous. These API calls may be integrated into the platform deployment processes, thus simplifying WAF/CDN management and making it more transparent for a platform user;
• Last, but not least, using a CDN/WAF solution will have a significant positive impact on application capacity, so it must be considered by the application and platform capacity management processes.
Things to keep in mind:
• The VDMS solution worked smoothly with Drupal-based sites, with a couple of small exceptions where AJAX calls were scored as suspicious and eventually blocked by the WAF. This broke some website functionality and required adjusting some scoring rules;
• The VDMS solution provides a set of APIs; however, at the time of writing, SSL certificate management via the APIs is not yet possible;
• Using a CDN/WAF solution may also simplify SSL certificate setup and management procedures. It is expected that most, if not all, hosted applications will be using SSL or TLS for transport security;
• The performance tests did not use the CDN or WAF and were all executed in the same subnet. Obviously, using a CDN can offload most static file requests or even some dynamic page requests.
Platform Change Management
Often, when change and release management is discussed, it relates to Customer applications or other assets deployed on the platform. The platform itself, however, has its own lifecycle and requirements for change and release management. Although specific development processes and practices may vary, it's recommended to implement at least the following environments for platform change and release management:
• Development Environment – this is where all developments, code updates and other platform changes are performed initially and tested;
• Staging Environment – also known as the Integration or QA environment. This is where code releases and platform updates are deployed, tested and validated prior to pushing them to production. Optimally, this environment mirrors the production environment from a setup and infrastructure perspective;
• Production Environment – this is where Customer applications are hosted.
Additionally, there is a need for a fully automated Continuous Delivery workflow that performs platform deployment given the following parameters:
• Platform Release Identifier or Version Tag;
• Target Specification or Location Identifier.
Last but not least, there is a need for a Test Suite to be executed against the deployed platform to ensure that the platform is up and healthy and its services are working as expected. Things to keep in mind:
• Sometimes, especially during major application version upgrades, the data structure format or configuration is not compatible with the new version and thus requires a migration or data format (schema) upgrade. Such upgrade or ETL procedures must be a part of the platform upgrade roll-out;
• For quick deployment roll-back, it's recommended to use a copy-upgrade approach for the application data stored on persistent volumes, rather than updating data in place;
• Although most dependencies between platform components are explicit and well known, there are some implicit or indirect dependencies too. Therefore, when changing one platform component, it's not enough to validate only the dependent components. Optimally, a complete end-to-end test must be performed to ensure that all components are working as expected;
• Due to platform modularity and loose coupling, pretty much any component can be replaced with an external service. This simplifies platform support; however, it makes platform change management even more complicated, since external change management schedules and dependencies must be accounted for. Thus, one may choose to use GitHub or BitBucket as the SCM Service instead of a locally hosted GitLab instance. While this addresses a number of questions related to component support, changes to the GitHub APIs may at the same time result in an unexpected need to upgrade dependent platform components or processes;
• The container ecosystem is still evolving quickly. This results in new features being added to address existing shortcomings, changes to application architecture, or breaking API changes;
• Depending on the platform architecture, an upgrade operation may require maintenance windows, a partial capacity or function downgrade, or a complete platform shutdown.
Drupal Hosting
The platform does not put limits on the applications that can be deployed. The only real constraint can be stated as: whatever may be packaged in a container can be hosted on this platform. Drupal CMS has been chosen to demonstrate the hosting capabilities and a DevOps-style approach to application lifecycle management, as well as code deployment and content publishing workflows.
Figure 15 - Drupal CMS: Configuration Portal
The end goal is to deliver a Drupal CMS as a Service hosting model, where the Customer is only responsible for the so-called creative part (design and content), whereas the Service Provider is responsible for the hosting platform, from the Drupal instance down to infrastructure and application lifecycle management. Drupal CMS itself is a PHP application that uses the so-called LAMP (Linux, Apache, MySQL, PHP) or LEMP (Linux, NGINX, MySQL, PHP) application stack as its runtime.
Drupal Site Components
Figure 16 - Drupal Site Components
The Drupal application is built out of multiple components:
• Drupal Core – the standard release of Drupal. The Drupal core installation can be seen as a bare-bones setup that can be extended per specific project needs and requirements;
• Core Modules – the set of modules included in the standard Drupal release to provide basic CMS features: user account registration and maintenance, menu management, RSS feeds, taxonomy, page layout customization and system administration;
• Contrib Modules – the set of modules provided by the Drupal community and 3rd parties. They offer additional or alternate features such as image galleries, custom content types and content listings, WYSIWYG editors, private messaging, third-party integration tools, integration with enterprise applications, and more. As of November 2016, the Drupal website lists more than 35,800 free modules;
• Libraries – pieces of code shared by multiple modules or even sites, which are not included in the package or distribution for licensing, maintenance or other reasons;
• Themes – define the Drupal site's look and feel. They use standardized formats that may be generated by 3rd-party theme design engines. Some templates use hard-coded PHP; Drupal themes utilize a template engine to further separate HTML/CSS from PHP;
• DB Objects – Drupal CMS is heavily dependent on the database. The DB Objects are used to store units of content, configuration and customization;
• File-system Objects – besides its PHP code and often session-state objects, Drupal CMS stores media assets, caches, compressed files and temporary files as file-system objects.
These components may be grouped as:
• Drupal Base (or Distribution) Components – a common-denominator set of components included in a specific Drupal distribution;
• Site Specific (or Custom) Components – each website may require further customization and a unique set of features and functions not provided by the distribution.
One of the most important objectives is to minimize the number of custom components, which means that the Drupal distribution has to provide the majority of features required for a modern enterprise website. This way, Drupal sites will use a common, secure and tested code-base and differ from each other only by themes, customization and content. This in turn allows simplifying and standardizing website sizing, development, maintenance and management processes, getting closer to the end goal: industrialized website management and delivering Drupal CMS as a Service.
Drupal Container Components
One of the design paradigms is to keep containers immutable (read-only); all mutable application objects must be kept outside of the container itself, on container volumes or in the database.
Figure 17 - Web Container Components
This design paradigm has been applied to web containers. Obviously, for a number of websites using similar LAMP or LEMP runtime components, it's possible to define a common denominator of runtime components and package them in a container image. All website-specific objects (configuration, content, assets, transient and temporary files, session state) are stored outside of the container, either on file-system volumes or in the database. Let's take a closer look at the components included in the web container image:
• Base OS image – although pretty much any Linux-based OS can be used here, the number of viable options is rather limited, each having its own up- and downsides. See the Base OS Image section for more considerations on this subject. For this particular project, either Alpine Linux or a custom-built and hardened "Debian 8.1 Slim" distribution has been used;
• Process Supervisor – another, almost religious, subject: one or many processes per container. Again, please refer to the Base OS Image section for more details. For this project, the S6 process supervisor has been used to provide init-process functionality and manage subordinate process life-cycles in a container-aware manner;
• Web Server – the Apache web server has the longer track record; however, NGINX is much more lightweight and efficient. The images have been built with both httpd and nginx binaries, and the results speak for themselves;
• PHP-FPM – in order to make HTTP request processing more efficient, the PHP engine has been separated from the HTTP pipeline. PHP requests are served by an application server component accessed via the FastCGI Process Manager (FPM) reverse-proxy protocol;
• Redis – the fast in-memory key-value store implemented by Redis can speed up web applications (page rendering times) by orders of magnitude. It can cache DB query results, content blocks, session objects, etc.;
• Cron – applications packaged in a container may and often do require the OS-provided task scheduler. In our particular case, both PHP and Drupal rely upon cron for certain cleanup, validation and other maintenance activities;
• Syslog – applications packaged in a container often rely upon the system logging mechanisms, sending log messages via the /dev/log socket. In our particular case, the cron daemon sends log messages via syslog facilities;
• Smtp-‐Proxy – Applications packages in container often need to send emails using SMTP protocol. There are several possibilities to make it possible:
o Add an MTA package to every container. The most straightforward solution, but goes against stackable applications paradigm. Also sending emails is not a primary function of web container;
o Add a simple SMTP-‐proxy or gateway that will pass SMTP commands from within container to the proper mail-‐relay. This approach has been implemented and all messages being transparently passed to SMTP relay;
• Container Volumes – As you can see above, several container volumes have been defined, each serving a specific purpose (see the sketch after this list):
o ROOT – the web application root folder: the file-system location the web application is deployed to;
o DATA – website content, media assets and persistent file-system objects. These files are guaranteed to be preserved between web application restarts;
o CERT – SSL/TLS certificates: the server.crt and server.key files required to serve HTTPS requests;
o LOGS – web container log files are stored here, including, but not limited to, web server, PHP, Redis, syslog and cron log files;
o TEMP – transient and temporary file-system objects are stored here. These files are not guaranteed to be preserved between web application restarts.
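To make the paradigm concrete, here is a minimal sketch of starting such a container by hand (the image name and mount points are illustrative assumptions; the platform's actual tooling is shown in the Web Container Deployment chapter):

# Sketch: read-only image, all mutable state on bind-mounted volumes.
# ROOT -> /var/www (code), DATA -> /var/data (content), CERT -> TLS material,
# LOGS -> /var/log, TEMP -> /var/tmp; /run and /tmp live on tmpfs.
docker run -d --name d7-site --read-only \
  -v /var/web/stg/root/d7-site:/var/www \
  -v /var/web/stg/data/d7-site:/var/data \
  -v /var/web/stg/cert/d7-site:/etc/nginx/ssl \
  -v /var/web/stg/logs/d7-site:/var/log \
  -v /var/web/stg/temp/d7-site:/var/tmp \
  --tmpfs /run --tmpfs /tmp \
  registry.example:5000/nginx-php-fpm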
Things to keep in mind:
• The Apache server, even with the most modern and efficient Event MPM, can't cope with the number of requests that NGINX handles with ease. Eventually one is presented with a trade-off: either use the more "standard" Apache server, with its huge number of modules and extensions, or choose the relatively new NGINX, for which support skills may be rather scarce;
• The number of threads processing web requests, for both the web server and php-fpm, is dynamic and can be adjusted for various container sizes and resource allocation models;
• Both options – keeping a dedicated Redis instance inside the website container, or making Redis a platform service – are viable choices, each with its own upsides and downsides. The dedicated Redis instance model allows reaping quick benefits with a relatively simple deployment; the platform service approach allows much better economies of scale and resilience in case of failure, at the cost of more involved deployment and management overhead;
• The web container initialization script checks whether valid SSL certificates have been supplied. If they are missing, a new set of self-signed SSL certificates is generated on the fly, so the web container can serve HTTPS requests whether site certificates have been provided or not; a sketch of such a fallback follows.
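A minimal sketch of that fallback, assuming the CERT volume is mounted at /etc/nginx/ssl (the path is an assumption; the real init script may differ):

# Generate a throwaway self-signed pair only when no certificates were supplied.
CERT_DIR=/etc/nginx/ssl
if [ ! -s "$CERT_DIR/server.crt" ] || [ ! -s "$CERT_DIR/server.key" ]; then
  openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
    -subj "/CN=$(hostname)" \
    -keyout "$CERT_DIR/server.key" -out "$CERT_DIR/server.crt"
fi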
Drupal Container Performance
Using containers and enforcing resource constraints requires even more thorough consideration when it comes to application architecture, scalability and configuration. The following chapters provide an overview and useful background for web application sizing and performance.
Sizing Considerations
Historically, PHP applications have been running on the LAMP stack, which stands for Linux, Apache, MySQL and PHP. Usually the PHP runtime is compiled as a shared object and loaded as an Apache module. This allows Apache to handle both static files from the file-system and so-called dynamic pages, i.e. scripts written in PHP. This solution is very elegant and simple to set up and manage; however, it comes with a price. A single Apache process may grow quite large, significantly increasing the web server memory footprint. Truth be told, the biggest part of the process image size is contributed by pages shared between processes, so the math is not really linear here; more details on finding a process memory footprint are provided in the Process Size Conundrum chapter. A much bigger issue is that the PHP engine is, generally speaking, not thread safe due to 3rd-party libraries and extensions, which leaves no option but to use the Prefork MPM. In other words, one HTTP connection is served by one Unix process, and the number of concurrent connections your server may potentially serve is capped by the amount of RAM allotted to the web server. The rough calculation looks as follows:

ConcurrentConnections = (ServerMem – OsMem) / HttpdMem

For example, for a web server with 4GB RAM, ~2GB consumed by the OS, and an Apache process size of ~64MB, the number of concurrent HTTP connections will be (4GB – 2GB) / 64MB = 32.
We should remember that if no CDN or other caching technology is used, the same web server serves both HTML pages and static resources such as images, scripts, CSS files, etc. Assuming a modern web browser sends 3-4 concurrent HTTP requests when accessing a web page, in reality ~8 browser sessions can completely saturate the web server connection pool. In order to increase the number of concurrent users (or connections), you can either grow the web server RAM or decrease the httpd process size. The latter option is not going to help a lot: cutting here and there may help a bit, but it won't be a game changer. This leaves only the options of growing the web server RAM or adding more web servers to spread the load. This is where FastCGI comes to help. The basic idea is to let Apache do what it does well – serving static content – while requests for dynamic content are proxied using the FastCGI protocol to a separate server running the PHP engine, or PHP application server. This approach kills multiple birds with one stone, since the PHP engine is no longer a part of Apache:
• Apache can use multi-threading instead of the multi-process model, significantly reducing resource usage;
• Apache can use other, more efficient MPMs, for example the Event MPM;
• The application server can be deployed and scaled independently from the web server;
• Web server life-cycle management and maintenance is simplified.
For PHP specifically, moving forward we'll be using PHP-FPM. A quote from the vendor site: "PHP-FPM (FastCGI Process Manager) is an alternative PHP FastCGI implementation with some additional features useful for sites of any size, especially busier sites." These features include:
• Adaptive process spawning;
• Basic statistics, a la Apache's mod_status;
• Advanced process management with graceful stop and start;
• Starting workers with different uid/gid/chroot/environment/php.ini (replaces safe_mode);
• Stdout and stderr logging;
• Emergency restart in case of accidental opcode cache destruction;
• Accelerated upload support;
• Support for a "slowlog" – logging long-running requests;
• Enhancements to FastCGI, such as fastcgi_finish_request() – a special function to finish a request and flush all data while continuing to do something time-consuming, e.g. video converting, stats processing, etc.
With PHP-FPM deployed on the same server, the calculation becomes a bit more complex:

ServerMem – OsMem = Nweb * HttpdMem + Nphp * PHPMem, with Nweb > Nphp

This equation has two variables, Nweb and Nphp (the number of web and PHP server instances correspondingly), and therefore has multiple solutions. Generally speaking, we can assume that the number of web server instances must be higher than the number of PHP server instances. The more precise constraint will depend on the ratio of static to dynamic requests processed by the web server, which in turn depends on the particular application architecture and deployment.
Another caveat: as soon as we move away from the Prefork MPM to the Worker or Event MPM, the number of connections depends on the number of threads rather than the number of web server processes. This leads us to a more accurate formula:

ServerMem – OsMem = Nweb * HttpdMem + Nweb * Tweb * ThreadMem + Nphp * PHPMem, with Nweb = Nphp * StaticDynamicRatio

Now let's see a practical example of using this formula. Assume the following inputs:
• The web server is equipped with 4GB RAM;
• The OS and other essential services use ~2GB RAM;
• The httpd process size (HttpdMem) is 4MB;
• The php process size (PHPMem) is 16MB;
• The httpd thread size (ThreadMem) is 256KB, used for the stack and other runtime structures;
• httpd is configured with the Event MPM:
o the MPM is set up to use 4 httpd processes (Nweb);
o with 256 threads per process (Tweb).
Let's put those inputs into our formula:

4GB – 2GB = 4 * 4MB + 4 * 256 * 256KB + Nphp * 16MB

This can be reduced to:

2048MB = 16MB + 256MB + Nphp * 16MB

which gives us the number of PHP processes: Nphp = 111. This number is the limit on PHP requests our web server will be able to handle concurrently. Considering queuing mechanisms and the fact that the number of available threads on the web server side is much higher (1024), the real-life constraint is going to be a bit higher: somewhere around 128-160+ concurrent requests, depending on PHP script complexity and processing time. If parts of the page and SQL query results come from cache, this number can be higher yet. Assuming we have plenty of "free" web server threads to serve static resources, our web server should easily handle 150+ web browser sessions and up to 1000 concurrent HTTP hits. At that point we will likely start seeing other bottlenecks and constraints, such as network and disk subsystem throughput limits. This number looks impressive for such a moderately sized web server, but it can be improved even further. The Apache web server is very capable; however, it was designed when the Internet was young, and it is not always able to cope with the demands of the cloud age. In other words, Apache failed the C10K test; see https://en.wikipedia.org/wiki/C10k_problem.
Apache vs. NGINX It’s time to welcome NGINX! Quote from the vendor site: NGINX is one of a handful of servers written to address the C10K problem. Unlike traditional servers, NGINX doesn’t rely on threads to handle requests. Instead it uses a much more scalable event-‐driven (asynchronous) architecture. This architecture uses small, but more importantly, predictable amounts of memory under load. Even if you don’t expect to handle thousands of simultaneous requests, you can still benefit from NGINX’s high-‐performance and small memory footprint. NGINX scales in all directions: from the smallest VPS all the way up to large clusters of servers.
NGINX supports the FastCGI protocol and integrates well with PHP-FPM, making it an ideal candidate to replace Apache in our web container. NGINX is very lean, so PHP gets even more memory. Some performance tuning guides may be found in the following articles:
http://drupalwxt.github.io/performance/apache-fpm/
https://www.howtoforge.com/configuring-your-lemp-system-linux-nginx-mysql-php-fpm-for-maximum-performance
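To illustrate that integration, here is a minimal sketch of handing PHP requests from NGINX to PHP-FPM over a Unix domain socket (file paths and the socket name are assumptions; the images built for this project may use a different layout):

# Fragment to be included inside a server { } block of the NGINX config.
cat > /etc/nginx/snippets/php-fpm.conf <<'EOF'
location ~ \.php$ {
    # A UDS avoids TCP stack overhead; use fastcgi_pass 127.0.0.1:9000 for TCP mode.
    fastcgi_pass unix:/run/php-fpm.sock;
    fastcgi_index index.php;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
}
EOF
# The PHP-FPM pool must listen on the same socket (www.conf): listen = /run/php-fpm.sock
nginx -t && nginx -s reload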
Performance Test Let’s run some real tests using apache-‐bench. We’ll be testing performance of httpd-‐php-‐fpm and nginx-‐php-‐fpm containers running the same Drupal 7.50 based web application. Containers limited to 64MB memory and 1GB swap. Apache
                             Apache    Apache    NGINX     NGINX     NGINX     NGINX
                             50 TCP    60 TCP    50 TCP    100 TCP   100 UDS   150 UDS
Concurrency Level            50        60        50        100       100       150
Test duration (sec)          95.88     102.46    70.21     71.99     70.08     72.44
Complete requests            100K      100K      100K      100K      100K      100K
Failed requests              0         0         0         0         0         0
Write errors                 0         649       0         0         0         0
Total transferred (MB)       796.13    791.31    792.98    793.65    792.98    795.15
HTML transferred (MB)        757.31    752.63    757.31    757.31    757.31    757.31
Requests/sec (mean)          1043.01   975.96    1424.26   1389.14   1427.03   1380.42
Time per request (ms, mean)  47.94     61.48     35.11     71.99     70.08     108.66
Time per request (ms, max)   274       1231      122       177       169       298
Transfer rate (MB/s)         8.30      7.72      11.29     11.02     11.32     10.98
Figure 18 – Stress Test Results
What can we conclude from those test results?
• Apache 50 TCP: the httpd-php-fpm container coped with 50 concurrent dynamic page requests quite well. It served 1000+ requests per second, with <50ms response time on average. Quite impressive for a little web server with 64MB of RAM, right? The web server provides rock-solid service and is fully capable of handling this load over an extended time;
• Apache 60 TCP: let’s increase the load and run the same test on the fresh created container (to avoid caching effects) with 60 concurrent connections. This is where we’re stepping close to the limit. Closer to the end of the test server is overwhelmed with the load. It starts throwing errors and response time jumps 20 times higher, up to 1200ms. What’s happening inside container? When number of connections is growing, PHP-‐FPM is not able to cope with the load and adding more and more threads. The memory usage is growing too. At some point, when even 1GB of virtual memory is used, the OOM handler is killing some processes. That’s an explanation for the write errors reported by apache-‐bench – listeners disappearing in the middle of HTTP request. Intensive swap usage and process creation is leading to significantly increased latency and response time. It doesn’t make any sense to increase the load even more. We have already reached the bottleneck;
• NGINX 50 TCP: the nginx-php-fpm container coped with 50 concurrent dynamic page requests very well. It served 1400+ requests per second, with <36ms response time on average – an impressive 40% increase in throughput. Looking at the overall container load and behavior, it's clear this container can handle a lot more. Let's increase the load;
• NGINX 100 TCP: the container is still stable, handling the load with grace and delivering very similar figures. The number of requests per second dropped slightly, to just below 1400; the response time doubled, but is still below 72ms. Let's see if we can squeeze a little more performance out of it. NGINX and PHP-FPM have been using the TCP protocol for inter-process communication; since both processes run inside the same container, we can switch to UDS (Unix Domain Sockets) and reduce the overhead of the TCP stack. We shouldn't expect a miracle here, but it's still worth a try;
• NGINX 100 UDS: the same test as before, just using UDS instead of TCP. As expected, the container keeps up with the load without even breaking a sweat. The performance figures improved just a tad: the number of requests per second is again above 1400 and the request processing time is ~70ms. Again as expected, switching from TCP to UDS gives a very small, barely noticeable speed bump. Let's increase the load even more;
• NGINX 150 UDS: an incredible result. A single little container with just 64MB of RAM can serve 150 concurrent PHP page requests! Naturally, the response time increased, up to 300ms, but the container handles the load, delivering consistent and reliable results. This concludes the test; we won't explore the capacity of this container in greater detail, since that goes way beyond the scope of this paper. The main objective was to show that NGINX can be, and is, a great replacement for the Apache web server in a web hosting scenario.
Things to keep in mind:
• To those of you reading with a skeptical face: yes, this test only scratches the surface and is by no means a thorough exploration of web container, or even web server, performance. The whole point was to show that, under equal conditions, NGINX delivers better throughput than Apache;
• You've surely noticed that we're much more concerned with RAM constraints than with CPU limits. The reason is simple: if CPU is scarce and the application gets fewer CPU cycles, or is scheduled onto the CPU a bit later, that most probably won't break the application. Yes, operations will take longer and latency will definitely increase, but the application still does its job. Memory is a different story: once an application has exhausted both its physical and virtual memory allowances, the OOM handler kicks in and the application is killed. End of story.
• Ideally, we should consider all resources, including I/O capacity and storage throughput, and the number of CPUs and threads; however, just to keep things simple, we concentrate on the most important variables;
• Are those test results valid at all? We do have a Redis cache built into the web containers, so, arguably, the test is doing nothing more than bashing the cache… While this is a valid concern, it's not an issue in this particular case, and here is why: Drupal depends heavily on its DB. Without Redis this performance test would result in serious DB server bashing, and that surely wasn't the main point of this test. By using Redis we simply excluded the DB server from the equation, so the test really stresses the HTTP pipeline;
• Still, something does not add up… The container size is 64M – is it a typo? Even with 640MB, by the formula above we could have just 640MB / 16MB (PHP process size) = 40 concurrent PHP requests, not 50 or even 100! Where is the trick? The answer is: there is no trick, and to explain the memory calculations better the next chapter has been added.
Process Size Conundrum
This is a bonus chapter. Originally I didn't plan to include it in this paper, since the subject is not directly in scope. However, understanding some of the concepts described below can help a great deal when sizing web applications and their containers. Let's start with a simple question: "how do we find the amount of memory used by a given application?" Actually, it's one of those great questions that lead to interesting discoveries, and the deeper you drill looking for an answer, the more you learn and realize that the simplest questions have no simple answers. The question is quite straightforward, and the obvious answer is: just run the top or ps command and it will give you the number. Well, let's see. Here is the process list inside our web container:
# ps fauxwww
USER    PID %CPU %MEM    VSZ   RSS TTY STAT START  TIME COMMAND
root  18840  0.0  0.0  20196  1684 ?   Ss   09:18  0:00 /bin/bash
root  19240  0.0  0.0  17488  1148 ?   R+   11:46  0:00 \_ ps fauxwww
root      1  0.0  0.0    184     0 ?   Ss   Oct05  0:00 s6-svscan -t0 /var/run/s6/services
root     29  0.0  0.0    184     0 ?   S    Oct05  0:00 s6-supervise s6-fdholderd
root    228  0.0  0.0    184     0 ?   S    Oct05  0:00 s6-supervise syslog
root    239  0.0  0.0 182856  1452 ?   Ssl  Oct05  0:00 \_ rsyslogd -f /etc/rsyslog.conf -n
root    229  0.0  0.0    184     0 ?   S    Oct05  0:00 s6-supervise redis
redis   238  0.0  0.0  33344  5624 ?   Ssl  Oct05 26:00 \_ redis-server 127.0.0.1:6379
root    231  0.0  0.0    184     0 ?   S    Oct05  0:00 s6-supervise php5-fpm
root    236  0.0  0.1 254592 12452 ?   Ss   Oct05  1:54 \_ php-fpm: master process
web     261  0.0  0.0 254592  1696 ?   S    Oct05  0:00    \_ php-fpm: pool www
web     262  0.0  0.0 254592  1700 ?   S    Oct05  0:00    \_ php-fpm: pool www
web     263  0.0  0.0 254592  1700 ?   S    Oct05  0:00    \_ php-fpm: pool www
web     264  0.0  0.0 254592  1696 ?   S    Oct05  0:00    \_ php-fpm: pool www
web     265  0.0  0.0 254592  1700 ?   S    Oct05  0:00    \_ php-fpm: pool www
web     266  0.0  0.0 254592  2116 ?   S    Oct05  0:00    \_ php-fpm: pool www
web     267  0.0  0.0 254592  1696 ?   S    Oct05  0:00    \_ php-fpm: pool www
web     268  0.0  0.0 254592  1700 ?   S    Oct05  0:00    \_ php-fpm: pool www
root    232  0.0  0.0    184     0 ?   S    Oct05  0:00 s6-supervise nginx
root    237  0.0  0.0  36180  2728 ?   Ss   Oct05  0:00 \_ nginx: master process
web     259  0.0  0.0  37188  2616 ?   S    Oct05  2:39    \_ nginx: worker process
web     260  0.0  0.0  37156  2116 ?   S    Oct05  2:44    \_ nginx: worker process
root    233  0.0  0.0    184     0 ?   S    Oct05  0:00 s6-supervise cron
root    235  0.0  0.0  25896  1028 ?   Ss   Oct05  0:03 \_ cron -f
So, what is the process memory usage? There are two values, RSS and VSZ. According to the man page:
• RSS – resident set size, the non-swapped physical memory that a task has used (in kilobytes);
• VSZ – virtual memory size of the process in KiB (1024-byte units).
So the value in the RSS column should represent the process size in kilobytes, and often one of the following one-liners is used:
$ ps -ylC php5-fpm --sort:rss | awk '!/RSS/ { s+=$8 } END { printf "Total memory used by processes: %dM\n", s/1024}'
Total memory used by processes: 28M
$ ps aux | grep 'php-fpm' | grep -v grep | awk '{s+=$6} END {printf "Total memory used by processes: %dM\n", s/1024}'
Total memory used by processes: 28M
Is it the right answer to our question? Unfortunately, no, it's not. The Linux virtual memory system is not quite so simple. There are many reasons why RSS is not an accurate memory usage estimate; the most important ones are:
• When a process forks, both the parent and the child show up with the same RSS. However, Linux employs a copy-on-write mechanism, so both processes are really using the same memory segments; only when one of the processes modifies the memory is it actually duplicated. In the calculation above we have therefore counted the same memory pages multiple times;
• The RSS value doesn't include shared memory. Shared memory is owned by many processes and not owned by any single one, which is why it isn't included in the per-process RSS field.
So we may conclude that while RSS is indeed the total memory actually held in RAM for a process, it is really misleading when it comes to estimating the process size. What about VSZ? VSZ is the total accessible address space of a process. It also includes memory that may not be resident in RAM, like mallocs that have been allocated but not written to. As we can see, VSZ is of very little use for determining the real memory usage of a process. It's time to drill deeper and use more advanced tools:
$ pmap -d 237
237: nginx: master process /usr/sbin/nginx
Address           Kbytes Mode   Offset           Device    Mapping
0000000000400000     980 r-x--  0000000000000000 0fd:00002 nginx
00000000006f5000     112 rw---  00000000000f5000 0fd:00002 nginx
0000000000711000     124 rw---  0000000000000000 000:00000 [ anon ]
000000000122f000     508 rw---  0000000000000000 000:00000 [ anon ]
00007f136ab8a000      44 r-x--  0000000000000000 0fd:00002 libnss_files-2.19.so
00007f136ab95000    2044 -----  000000000000b000 0fd:00002 libnss_files-2.19.so
00007f136ad94000       4 r----  000000000000a000 0fd:00002 libnss_files-2.19.so
00007f136ad95000       4 rw---  000000000000b000 0fd:00002 libnss_files-2.19.so
…
00007f136cf01000       4 rw---  0000000000000000 000:00000 [ anon ]
00007ffd8d02d000     132 rw---  0000000000000000 000:00000 [ stack ]
00007ffd8d0f3000       8 r-x--  0000000000000000 000:00000 [ anon ]
ffffffffff600000       4 r-x--  0000000000000000 000:00000 [ anon ]
mapped: 36180K    writeable/private: 1300K    shared: 4K
If you go through the output, you will find that the lines with the largest Kbytes numbers are usually the code segments of the included shared libraries, the ones with the .so suffix. What is great about them is that they are the parts that can be shared between processes. If you factor out everything shared between processes, you end up with the "writeable/private" total shown at the bottom of the output. This can be considered the incremental cost of the process in terms of memory consumption, factoring out the shared libraries. Let's check the size of the nginx processes again:
# for pid in `pgrep nginx`; do pmap -d $pid | grep 'private' ; done
mapped: 36180K    writeable/private: 1300K    shared: 4K
mapped: 0K    writeable/private: 0K    shared: 0K
mapped: 0K    writeable/private: 0K    shared: 0K
We can see that 36180KB is the total process addressable space for the nginx master, also known as mapped memory or VSZ. Most of this mapped memory is taken by shared libraries, which are mapped into many processes but do not really "belong" to any single one. The only memory truly pertinent to the nginx processes is just 1300K. In other words, the 3 nginx Unix processes are using about 1300K of memory. Let's check the php-fpm processes:
# for pid in `pgrep php`; do pmap -d $pid | grep 'private' ; done
mapped: 254592K    writeable/private: 6440K    shared: 66052K
mapped: 0K    writeable/private: 0K    shared: 0K
mapped: 0K    writeable/private: 0K    shared: 0K
mapped: 0K    writeable/private: 0K    shared: 0K
mapped: 0K    writeable/private: 0K    shared: 0K
mapped: 0K    writeable/private: 0K    shared: 0K
mapped: 0K    writeable/private: 0K    shared: 0K
mapped: 0K    writeable/private: 0K    shared: 0K
mapped: 0K    writeable/private: 0K    shared: 0K
The same goes for the PHP processes: their memory usage can be estimated as 6440K. Eventually, it looks like we got an answer to the simple question asked at the beginning of this chapter. Not so quick… Those mapped shared libraries that we discounted still take a significant amount of memory. Although the numbers we've found are quite accurate estimates of the "incremental" process memory usage, it would be good to have shared memory somehow distributed and accounted for across the processes using those shared libraries too. That's exactly what the Proportional Set Size metric provides. The Proportional Set Size (PSS) is a more meaningful representation of the amount of memory used by libraries and applications in a virtual memory system. Because large portions of physical memory are typically shared among multiple applications, the standard measure of memory usage known as Resident Set Size (RSS) will significantly overestimate memory usage. PSS instead measures each application's "fair share" of each shared area to give a realistic measure. For example, if three processes all use a shared library that has 30 pages, that library will only contribute 10 pages to the PSS that is reported for each of the three processes. PSS is a very useful number because when the PSS for all processes in the system are summed together, that is a good
representation of the total memory usage in the system. When a process is killed, the shared libraries that contributed to its PSS are proportionally redistributed to the PSS totals of the remaining processes still using those libraries.
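On a modern kernel, PSS can also be read directly from /proc; a quick sketch, summing the per-mapping Pss fields for the nginx master (PID 237) from the listing above:

# Sum the Pss: lines from the kernel's per-mapping accounting (values in kB).
awk '/^Pss:/ { kb += $2 } END { print kb " kB total PSS" }' /proc/237/smaps

The smem utility shown next aggregates the same accounting per process: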
# smem -tk
  PID User  Command                        Swap     USS     PSS     RSS
   29 root  s6-supervise s6-fdholderd      8.0K    4.0K    9.0K   40.0K
  228 root  s6-supervise syslog           16.0K    4.0K    9.0K   40.0K
  229 root  s6-supervise redis            16.0K    4.0K    9.0K   40.0K
  231 root  s6-supervise php5-fpm          8.0K    4.0K    9.0K   40.0K
  232 root  s6-supervise nginx             8.0K    4.0K    9.0K   40.0K
  233 root  s6-supervise cron              8.0K    4.0K    9.0K   40.0K
    1 root  s6-svscan -t0 /var/run/s6/s   16.0K   36.0K   36.0K   40.0K
  235 root  cron -f                      184.0K  112.0K  205.0K    1.0M
  239 root  rsyslogd -f /etc/rsyslog.co  380.0K  776.0K  830.0K    1.7M
  237 root  nginx: master process /usr/  956.0K  288.0K  851.0K    2.7M
18840 root  /bin/bash                    228.0K    1.1M    1.1M    1.8M
19464 root  /usr/bin/python /usr/bin/sm       0    6.3M    6.4M    7.1M
  236 root  php-fpm: master process (/e    4.8M    9.1M    9.7M   12.2M
-------------------------------------------------------------------------
   13 1                                    6.5M   17.7M   19.1M   26.7M
Truth be told, even PSS does not give the ultimate answer, since it is a "proportional", statistical, point-in-time value: it varies depending on the processes running on the system and whether they map the same shared libraries or not. It has taken quite a bit of digging to answer the simple question asked at the beginning of this chapter, and the answer is pretty much "it depends…" ;-) The whole picture is far from complete. If you want a solid understanding of all the bits and pieces contributing to the Linux kernel memory allocation mechanisms, often referred to as the Linux VMM, take a look at the following great five-part article: https://techtalk.intersec.com/2013/07/memory-part-1-memory-types/
Things to keep in mind:
• The considerations in this chapter provide estimates only, not hard figures. These estimates must therefore be periodically verified and adjusted as needed. Ultimately, capacity management should use these estimates for infrastructure, application and container resource sizing;
• To make things even more complicated, containers create additional layers of indirection, and understanding virtual memory in the context of containerization is a subject for a separate book. The homework task for the curious reader is to figure out how the "docker stats" memory usage relates to the memory figures reported for the processes inside the container. Hint: see the source;
• Page swapping is definitely bad, but if your options are killing the application or letting it run very slowly, the choice is obvious;
• One should not forget that the swap memory slices allotted to multiple containers are taken from the host system swap memory pool. For example, 100 small containers with a 1GB swap limit each may, under serious load, all claim their share of swap memory at some point. So size your host system swap partitions accordingly, considering per-container swap allocations; otherwise, prepare for OOM messages in the logs (a quick budgeting sketch follows).
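A back-of-the-envelope sketch of that swap budgeting (the counts are hypothetical):

# Worst-case swap demand if every container maxes out its allowance.
containers=100
swap_per_container_gb=1
echo "potential container swap demand: $((containers * swap_per_container_gb)) GB"
swapon --show   # compare with the swap actually provisioned on the host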
Drupal Project Creation
The Drupal CMS is modular and very flexible when it comes to customization and extending core functionality. A developer may add required modules, themes and features to the Drupal core and thus create a new custom Drupal distribution that may be used for company-wide or brand-specific websites. Drupal profiles provide a programmatic mechanism for defining distribution components, dependencies and rules for building a distribution out of a definition (make) file. From the vendor documentation: "Distributions provide site features and functions for a specific type of site as a single download containing Drupal core, contributed modules, themes, and pre-defined configuration. They make it possible to quickly set up a complex, use-specific site in fewer steps than if installing and configuring elements individually." There are several building blocks used to assemble a new Drupal website project:
• VzBase profile – includes a carefully selected set of Drupal modules, features, configurations and themes to provide a jumpstart and a solid base for enterprise CMS-based websites. Two example installation profiles have been created under the vzpoc repository: vzbase-7.43 and vzbase-7.50;
• Drupal Base Project – a boilerplate structure that simplifies starting a new site by having the most common directory structures and files already included and set up.
The diagram below schematically depicts the most important build stages.
Figure 19 – Drupal Project Creation Process
The whole Drupal project creation process can be described as follows:
1. The Drupal Base project is cloned from git to the new Drupal project location; let's call it SITE_ROOT. This way we ensure that all site projects have a well-defined, standardized structure;
2. The VzBase profile is cloned from git to a temporary location;
3. The VzBase profile is built, and its build result – the custom Drupal distribution – is stored under the SITE_ROOT/docroot path;
4. A new git repository is initialized under SITE_ROOT, including both the Drupal Base project structure and the freshly built VzBase Drupal distribution;
5. The project repository for the new Drupal project is checked into git, i.e. committed and pushed to the origin.
The steps outlined above can be executed using either the Orchestration Portal or the Platform CLI. For example, the following command will build a new Drupal project named d7base in the test_agency project group, using the vzpoc/vzbase-7.50 installation profile:
$ /opt/deploy/web project build --project d7base --group test_agency --profile vzpoc/vzbase-7.50
web project build: found Gitlab project group: test_agency
web project build: creating new GitLab project:
web project build: |-- project group: test_agency
web project build: |-- project name: d7base
web project build: |-- project desc: d7base web project
web project build: \-- response JSON: {
  "id": 133,
  "description": "d7base web project",
  "default_branch": null,
  "tag_list": [],
  "public": false,
  "archived": false,
  "visibility_level": 0,
  "ssh_url_to_repo": "ssh://git@gitlab.poc.local:2222/test_agency/d7base.git",
  "http_url_to_repo": "https://gitlab.poc.local:8443/test_agency/d7base.git",
  "web_url": "https://gitlab.poc.local:8443/test_agency/d7base",
  "name": "d7base",
  "name_with_namespace": "test_agency / d7base",
  "path": "d7base",
  "path_with_namespace": "test_agency/d7base",
  "issues_enabled": true,
  "merge_requests_enabled": true,
  "wiki_enabled": true,
  "builds_enabled": true,
  "snippets_enabled": false,
  "created_at": "2016-11-10T13:21:55.498Z",
  "last_activity_at": "2016-11-10T13:21:57.196Z",
  "shared_runners_enabled": true,
  "creator_id": 7,
  "namespace": {
    "id": 106,
    "name": "test_agency",
    "path": "test_agency",
    "owner_id": null,
    "created_at": "2016-10-21T13:10:16.072Z",
    "updated_at": "2016-10-21T13:10:16.072Z",
    "description": "test_agency project group",
    "avatar": { "url": null },
    "share_with_group_lock": false,
    "visibility_level": 0
}, "avatar_url": null, "star_count": 0, "forks_count": 0, "open_issues_count": 0, "runners_token": "mXyxGNzRXmj-‐nZNFRC8d", "public_builds": true } gitlab: project created successfully web project build: assembling new project d7base@test_agency... web project build: |-‐-‐ cloning drupal-‐base + vzbase-‐7.50 -‐> workspace web project build: |-‐-‐ building vzbase-‐7.50 drupal site profile web project build: \-‐-‐ commiting and pushing workspace -‐> git Cloning into 'd7base'... remote: Counting objects: 200, done. remote: Compressing objects: 100% (112/112), done. Receiving objects: 100% (200/200), 164.68 KiB | 0 bytes/s, done. Resolving deltas: 100% (93/93), done. Checking connectivity... done. Cloning into 'vzbase-‐7.50'... remote: Counting objects: 54, done. remote: Compressing objects: 100% (54/54), done. Receiving objects: 100% (54/54), 26.08 KiB | 0 bytes/s, done. Resolving deltas: 100% (13/13), done. Checking connectivity... done. Wiping target directory ../d7base/docroot ... Building distribution <vzbase> Beginning to build /var/www/vzbase-‐7.50/build-‐vzbase.make. [ok] drupal-‐7.50 downloaded. [ok] vzbase copied from .. [ok] Found makefile: drupal-‐org.make [ok] Project ldap contains 12 modules: ldap_test, ldap_authorization_og, ldap_authorization_drupal_role, ldap_authorization, ldap_feeds, ldap_help, ldap_servers, lda p_authentication, ldap_query, ldap_sso, ldap_views, ldap_user. ldap-‐7.x-‐2.0-‐beta11 downloaded. [ok] … Applying local patches ... /var/www/d7base/docroot/profiles/vzbase /var/www/vzbase-‐7.50 patching file modules/workbench_email/workbench_email.module Hunk #1 succeeded at 802 (offset 6 lines). /var/www/vzbase-‐7.50 Initialized empty Git repository in /var/www/d7base/.git/ [master (root-‐commit) f556005] Initial revision 4859 files changed, 880897 insertions(+) create mode 100644 .gitignore create mode 100644 docroot/.editorconfig … create mode 100644 tests/readme.md Counting objects: 5315, done. Delta compression using up to 2 threads. Compressing objects: 100% (5138/5138), done. Writing objects: 100% (5315/5315), 22.70 MiB | 10.24 MiB/s, done. Total 5315 (delta 513), reused 0 (delta 0) To ssh://[email protected]:2222/test_agency/d7base.git * [new branch] master -‐> master Branch master set up to track remote branch master from origin. web project build: successfully built project
After performing the above steps, the test_agency/d7base website project has been built from scratch using the vzpoc/vzbase-7.50 installation profile. Obviously, there is no need to build the installation profile for every new Drupal project: a much more efficient approach is cloning an existing base distribution, so the following command will produce an equivalent result:
$ /opt/deploy/gitlab project add --project d7base2 --group test_agency --clone vzpoc/d7-vzbase-7.50
gitlab: looking up seed project: vzpoc/d7-vzbase-7.50...
gitlab: project created successfully
gitlab: cloning project vzpoc/d7-vzbase-7.50...
Cloning into 'd7base2'...
remote: Counting objects: 5315, done.
remote: Compressing objects: 100% (4625/4625), done.
Receiving objects: 100% (5315/5315), 22.70 MiB | 29.96 MiB/s, done.
Resolving deltas: 100% (513/513), done.
Checking connectivity... done.
Initialized empty Git repository in /var/www/d7base2/.git/
[master (root-commit) 553ebd3] Initial revision
4859 files changed, 880897 insertions(+)
create mode 100644 .gitignore
create mode 100644 docroot/.editorconfig
…
create mode 100644 tests/readme.md
Counting objects: 5315, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (5138/5138), done.
Writing objects: 100% (5315/5315), 22.70 MiB | 11.12 MiB/s, done.
Total 5315 (delta 513), reused 0 (delta 0)
To ssh://git@gitlab.poc.local:2222/test_agency/d7base2.git
 * [new branch] master -> master
Branch master set up to track remote branch master from origin.
gitlab: project cloned successfully
Things to keep in mind:
• An installation profile can only be used when installing a new Drupal instance. This means you cannot run an installation profile on an existing Drupal site to add extra functionality;
• You can also select only one installation profile per site. This means that, if an existing installation profile has to be extended with new modules or features, a new extended profile must be created;
• The proposed approach is to have a well-defined set of installation profiles for different website types. All website projects must be built from one of the standard installation profiles.
Drupal Website Deployment
Website deployment includes two major steps: Web Project deployment and Web Container deployment. Simply speaking, the Project is the website code, content and everything else that belongs to the website assets. The Web Container is the application engine, or runtime, which executes the code and serves requests to the end users. The management and lifecycle of these two parts are supported by the corresponding Platform Services. Although both steps are completely independent, for the sake of consistency it's recommended to deploy the Web Project first and then deploy the Web Container for that project.
Web Project Deployment Let’s look to the Project Deployment first. The http://Drupal.org website and community resources providing number of step-‐by-‐step guides for installing Drupal projects ranging from manual instructions to fully unattended setup procedures. The proposed project deployment procedure is completely automated and taking into account platform requirements, naming conventions and project structure. Deployment can be executed using either Orchestration Portal or Platform CLI.
Figure 20 – Drupal Project Deployment Process
The following steps are usually executed when deploying a Drupal project:
1. The site-monitor project is cloned from git to the new Drupal project location corresponding to the ROOT container volume; let's call it SITE_ROOT;
2. The website project is cloned from git to the new Drupal project location, or SITE_ROOT. The SITE_ROOT/docroot symlink is created, pointing to the docroot in the web project folder;
3. The site installation scripts are executed. The following tasks are performed at this stage:
a. Create the website database;
b. Deploy file-system objects required for site operation;
c. Deploy certificates;
d. Deploy and adjust configuration files;
e. Adjust file-system object owners and permissions;
4. Populate the database. There are two possible approaches:
a. A new "empty" database is created and initialized by Drupal;
b. An existing site DB snapshot can be transformed and imported;
5. Populate media and assets. These objects are stored in the location corresponding to the DATA container volume. As with the database, two approaches are possible:
a. Generate assets and site media using programmatic routines;
b. Sync all or some real media and assets from an existing site.
This is best demonstrated by example. First, a little background information: every project belongs to a project group, which is in turn owned by a corresponding organization. Or, the other way around: the organization Demo Agency owns the project group demo_agency, which in turn contains the project demo.
$ /opt/deploy/ldap org list
Alpha Agency
Beta Agency
Demo Agency
$ /opt/deploy/gitlab group list
alpha_agency
beta_agency
demo_agency
images
vzpoc
$ /opt/deploy/gitlab project list --group demo_agency --format table
118 Private demo 2016-10-06T08:07:27.041Z https://gitlab.poc.local:8443/demo_agency/demo Drupal 7.50 demo project
Let’s deploy demo_agency/demo project to the d7-‐demo website in staging environment:
$ /opt/deploy/web project deploy --site d7-demo --farm poc --env stg --org 'Demo Agency' --project demo --group demo_agency
web project deploy: looking up Drupal credentials in secure storage...
web project deploy: get secure variable vault:/secret/poc/stg/d7-demo/DRUPAL_ADMIN_NAME
web project deploy: get secure variable vault:/secret/poc/stg/d7-demo/DRUPAL_ADMIN_PASS
web project deploy: Drupal credentials missing, generating new set...
web project deploy: saving Drupal credentials in secure storage...
web project deploy: put secure variable vault:/secret/poc/stg/d7-demo/DRUPAL_ADMIN_NAME
web project deploy: put secure variable vault:/secret/poc/stg/d7-demo/DRUPAL_ADMIN_PASS
@wbs1 web project deploy: looking up Drupal credentials in secure storage...
@wbs1 web project deploy: get secure variable vault:/secret/poc/stg/d7-demo/DRUPAL_ADMIN_NAME
@wbs1 web project deploy: get secure variable vault:/secret/poc/stg/d7-demo/DRUPAL_ADMIN_PASS
@wbs1 web project deploy: found Drupal credentials in secure storage
@wbs1 web project deploy: using Drupal credentials from secure storage
@wbs1 web project deploy: folder /var/web/stg/root/d7-demo not found, creating
@wbs1 web project deploy: folder /var/web/stg/data/d7-demo not found, creating
@wbs1 web project deploy: folder /var/web/stg/logs/d7-demo not found, creating
@wbs1 web project deploy: folder /var/web/stg/cert/d7-demo not found, creating
@wbs1 web project deploy: folder /var/web/stg/temp/d7-demo not found, creating
@wbs1 web project deploy: creating private and public file-storage...
@wbs1 web project deploy: folder /var/web/stg/data/d7-demo/public not found, creating
@wbs1 web project deploy: folder /var/web/stg/data/d7-demo/private not found, creating
@wbs1 web project deploy: website DB host: 10.169.69.12
@wbs1 web project deploy: website DB name: STG_D7_DEMO
@wbs1 web project deploy: looking up DBA credentials in secure storage
@wbs1 web project deploy: get secure variable vault:/secret/poc/mysql/DBA_USER
@wbs1 web project deploy: get secure variable vault:/secret/poc/mysql/DBA_PASS
@wbs1 web project deploy: looking up site DB credentials in secure storage
@wbs1 web project deploy: get secure variable vault:/secret/poc/stg/d7-demo/SITE_DB_USER
@wbs1 web project deploy: get secure variable vault:/secret/poc/stg/d7-demo/SITE_DB_PASS
@wbs1 web project deploy: DB credentials missing, generating new set...
@wbs1 web project deploy: put secure variable vault:/secret/poc/stg/d7-demo/SITE_DB_USER
@wbs1 web project deploy: put secure variable vault:/secret/poc/stg/d7-demo/SITE_DB_PASS
@wbs1 web project deploy: creating DB user if not present
@wbs1 web project deploy: building website ...
Cloning into 'monitor'...
Cloning into 'demo'...
@wbs1 web project deploy: site-install: creating website settings: /var/www/demo/docroot/sites/default/settings.php
@wbs1 web project deploy: site-install: allow site authentication only for users from: Demo Agency
@wbs1 web project deploy: site-install: configuring drupal settings...
@wbs1 web project deploy: site-install: installing new site: d7-demo
You are about to CREATE the 'STG_D7_DEMO' database. Do you want to continue? (y/n): y
Starting Drupal installation. This takes a while. Consider using the --notify global option. [ok]
Installation complete. User name: admin User password: ***************** [ok]
VZbase defaults configured. [status]
@wbs1 web project deploy: site-install: rebuilding node access permissions
The content access permissions have been rebuilt. [status]
@wbs1 web project deploy: site-install: linking /var/www/demo/docroot/sites/default/files -> /var/data/public
@wbs1 web project deploy: site-install: site docroot clean-up
Changing ownership of all contents of /var/www/demo/docroot: user => none group => web
Changing permissions of all directories inside /var/www/demo/docroot to rwxr-x---...
Changing permissions of all files inside /var/www/demo/docroot to rw-r-----...
Changing permissions of files directories in /var/www/demo/docroot/sites to rwxrwx---...
Changing permissions of all files inside all files directories in /var/www/demo/docroot/sites to rw-rw----...
Changing permissions of all directories inside all files directories in /var/www/demo/docroot/sites to rwxrwx---...
Done setting proper permissions on files and directories
@wbs1 web project deploy: project deployed successfully
The following major steps were performed in the example above:
• Since no credentials were provided, generated a set of credentials for the Drupal admin account;
• Stored the Drupal admin credentials in the secure storage;
• Created the folder structure for the project. Later on, those folders will be mapped to container volumes and thus made available to applications inside the container;
• Generated a project-specific database name and a set of user credentials according to the naming standards;
• Created the corresponding database objects and granted access permissions;
• Stored the database user credentials in the secure storage;
• Deployed the demo_agency/demo project from the GitLab code repository;
• Deployed the vzpoc/monitor add-on project from the GitLab code repository;
• Executed project-specific setup steps:
o Created the Drupal site settings;
o Set up LDAP authentication and authorization;
o Installed the new Drupal site;
o Rebuilt Drupal node access permissions;
o Set file-system ownership following security best practices.
Finally, let’s check deployed projects to validate that deployment was successful:
$ /opt/deploy/web project list --farm poc --env stg
d7-demo
Web Container Deployment
Now let's look at the Web Container deployment part. It's worth repeating that website project deployment and website container deployment are two separate steps that are usually executed sequentially but are completely independent of each other. More details about the Web Container deployment process can be found in the Container Provisioning Service chapter. For the sake of simplicity and consistency, the Website ID and the Web Container name are equal. The following Platform CLI command can be used to deploy the Web Container for the d7-demo website:
$ /opt/deploy/web container create --farm poc --env stg --site d7-demo --image nginx-php-fpm
web container create: using next free IP: 10.169.64.232
web container create: checking 10.169.64.232 is setup
inet 10.169.64.232/26 brd 10.169.64.255 scope global secondary enp0s17:
web container create: folder /var/web/stg/root/d7-demo not found, creating
web container create: folder /var/web/stg/data/d7-demo not found, creating
web container create: folder /var/web/stg/logs/d7-demo not found, creating
web container create: folder /var/web/stg/cert/d7-demo not found, creating
web container create: folder /var/web/stg/temp/d7-demo not found, creating
web container create: exporting container ENV variables from /opt/deploy/container.env
web container create: creating container d7-demo
web container create: |-- image-tag: registry.poc:5000/poc/nginx-php-fpm
web container create: |-- resources: small (--cpu-shares 16 --memory 64m --memory-swap 1G)
web container create: |-- published: 10.169.64.232:8080:80
web container create: |-- published: 10.169.64.232:8443:443
web container create: |-- volume: /var/web/stg/cert/d7-demo:/etc/apache2/ssl
web container create: |-- volume: /var/web/stg/logs/d7-demo:/var/log
web container create: |-- volume: /var/web/stg/root/d7-demo:/var/www
web container create: |-- volume: /var/web/stg/data/d7-demo:/var/data
web container create: |-- volume: /var/web/stg/temp/d7-demo:/var/tmp
web container create: |-- volume: tmpfs:/run
web container create: |-- volume: tmpfs:/tmp
web container create: |-- label: container.env=stg
web container create: |-- label: container.size=small
web container create: |-- label: container.site=d7-demo
web container create: \__ label: container.type=web
web container create: started site container cb68618b84b4d3276a77ebd4a0635c5387a8319f1ffaac3759c74820fa32b258
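A quick smoke test against the published ports from the output above might look like this (a sketch; -k accepts the self-signed fallback certificate):

# Expect HTTP status codes back from the freshly started container.
curl -sS  -o /dev/null -w '%{http_code}\n' http://10.169.64.232:8080/
curl -sSk -o /dev/null -w '%{http_code}\n' https://10.169.64.232:8443/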
Website Deployment Workflow
The workflow shown on the screenshot below performs both tasks, i.e. deploying the chosen Web Project for the specified organization into the selected environment and creating the Web Container that serves this website to internet users.
Things to keep in mind:
• Since the Web Project and the Web Container belonging to the same website are independent, they may be managed separately and have their own lifecycles. The Web Container may be re-deployed for an upgrade or patching without impact on the Web Project; the Web Project may be re-deployed and still be served by the existing Web Container. The only touch-point, or interface, between Web Projects and Web Containers is the set of file-system volumes;
• Nothing prevents having two or more instances of the same website, making scale-out scenarios possible. Each website instance has its own web container; whether it has its own (dedicated) or shared project code and content deployed on the container volumes depends on the storage model chosen.
Figure 21 – Website Deployment Workflow
Editorial Workflow
Now that the Drupal CMS instance is up and running, it's time to start adding website content. Drupal is a mature CMS by itself, and its content management capabilities can be extended even further using various modules. The VzBase distribution includes all required modules, features and configuration for implementing the editorial workflow depicted on the diagram below. The workflow defines the following states for a unit of content:
• DRAFT – the content is still being worked on or requires amendments;
• NEEDS REVIEW – the content work is complete and it requires review;
• PUBLISHED – the content passed review and validation and has been approved for publishing. Although the content is published, it's not yet visible to public users;
• PENDING DEPLOY – the content has been scheduled for deployment;
• DEPLOYED – the content has been deployed to the internet-facing environment and is visible to public users.
The editorial workflow also assumes the following user roles:
• Contributor, or Author – responsible for authoring and populating the website content;
• Editor, or Reviewer – responsible for proof-reading, checking and otherwise validating site content against defined rules and quality standards;
• Publisher, or Content Manager – responsible for publishing the content, i.e. making it accessible to public internet users.
Since the VzBase Drupal distribution uses the Platform IdM service, the Drupal roles outlined above are mapped to Active Directory groups. This mapping is fully transparent, so a user's role in the editorial workflow is defined by the user's LDAP group membership.
Figure 22 – Editorial Workflow
Obviously, at any moment in time each unit of content is in one of the states defined above. The workflow actors, i.e. Drupal users with the corresponding workflow roles, can change these states, thus executing workflow transitions for a given unit of content. When content transitions between workflow states, an email notification is sent to the workflow user who has an action pending. For example, when a content author pushes an article into the NEEDS REVIEW state, the content editor in charge gets the following email notification:
Subject: [Publishing-Workflow][Review] content review requested

Dear editor,
Please review the following content [node:url:absolute].

Sincerely, your friendly workflow engine.
Things to keep in mind:
• An important point worth mentioning: although all Drupal users are authenticated against Active Directory, only users belonging to the organization owning the website can pass authentication;
• The proposed roles and transition states are not set in stone and, although they support a real-life editorial process, they are provided as examples only. The workflow can be changed to match the editorial workflow defined in your organization;
• Notification emails can either be sent automatically to all users with a certain role, which is quite practical in smaller organizations, or a specific user with the given role may be selected. For example, a Content Author may be assigned a certain Editor for articles on a given subject; in this case, when content is submitted for review, the Author can choose the assigned Editor, and only this specific Editor will be notified about the pending review.
Content Publishing
Anyone familiar with Drupal may wonder why an additional content publishing or deployment step is needed. Normally, once you publish content in Drupal, it's visible to the site users. That's true: this content management model works fine for simple deployments. However, in more complex scenarios with multiple hosting environments this mechanism falls short, and here is why. Historically, Drupal CMS does not differentiate between the environment where content is created and the environment where content is displayed. The content author (oftentimes editor and site administrator in the same person) logs in to the Drupal site administration area and makes content changes. These changes are stored in the database or in the file-system and are first staged, i.e. not yet visible to the site visitors. Whenever the editor decides the content is ready, she pushes the publish button and voilà, the content is live. In real life and more complex scenarios it does not work that way. What if…
• There are multiple content authors and editors, often belonging to virtual and geographically distributed teams.
• There are multiple versions of the same content. What is the source of truth?
• There are multiple environments and multiple site instances. Where should the content be updated?
• Content updates are only allowed from the intranet or another dedicated and appropriately secured environment.
• The internet-facing website doesn't allow login and site management for security reasons.
• The internet-facing website is hardened for further security, and CMS-related modules and features are disabled or even stripped out.
• Content deployment must be performed during specific, well-defined change windows.
• Content deployment must be done to several target sites.
Although some of the points listed above may be addressed by certain Drupal modules and extensions, eventually we arrive at the need to separate the content authoring and content publishing mechanisms. This way, all content modification activities can be conducted in an isolated and secure "authoring" environment. Whenever the desired result is achieved, the content is pushed (or deployed) to the internet-facing instance or instances.
Figure 23 - Content Publishing Process
From a technical perspective this boils down to copying the content and some site configuration from a Drupal website instance located in one environment to another instance in a different environment. Multiple approaches are possible, ranging from a backup-restore procedure requiring website downtime to incremental, transactional updates that impact neither website uptime nor caches (at least the unaffected parts of them). The proposed approach is to employ drush sync (Unix rsync in the background) for copying modified file-system objects, and to update content entities using the XML-RPC protocol implemented by the Deploy module, as depicted in Figure 23. There is a comprehensive DrupalCon presentation discussing issues and possible solutions for modern publishing systems in more detail: https://www.youtube.com/watch?v=EJDGfye3OuQ.
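To make the file-system half of this concrete, a push from an authoring instance to a production instance could look like the sketch below. The Drush site alias names (@auth, @prod) are assumptions for illustration; the content entities themselves are pushed separately by the Deploy module over XML-RPC:

# copy new and changed managed files from the authoring site to production;
# drush rsync wraps Unix rsync and understands Drush site aliases
$ drush rsync @auth:%files @prod:%files

# drop the relevant caches on the target site so the new content shows up
$ drush @prod cache-clear all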
Things to keep in mind:
• Multiple content deployment models are possible: on-demand updates performed as soon as the content is approved, content batches accumulated over some period of time and pushed all together, or scheduled content deployment jobs executed during well-defined content publishing windows;
• Since no content modification or site administration is performed through the internet-facing instance, it can be stripped of the major part of Drupal modules and code down to the bare minimum and locked down, thus making the attack surface orders of magnitude smaller, or even completely preventing some classes of website attacks;
• The Drupal features must be used for distributing configuration changes; however, this is not exactly content publishing, but rather a configuration management task, which may require different implementation approaches depending on the Drupal version;
• Updating website content at runtime may require additional steps on the CDN (content caching layer). The parts of the cache covering the changed content tree must be flushed and re-populated. This can be further automated using APIs or other content flush mechanisms, depending on the CDN in use (a hypothetical sketch follows after this list);
• Drupal caching mechanisms must also be aligned with the content update strategy, to avoid caching stale content or dropping caches for unchanged content parts. The latter effectively reduces or eliminates the effect of caching, which can put additional stress on the infrastructure if content updates are performed frequently.
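For illustration, a post-deployment hook could clear the Drupal caches and purge the changed content tree from the CDN. This is a hypothetical sketch only: the helper name, purge endpoint and token variable are invented for the example, since the actual call depends entirely on the CDN in use:

#!/usr/bin/env bash
# post-deploy.sh <path> - hypothetical hook, run after content deployment
path=${1:?usage: post-deploy.sh <path>}

# drop Drupal caches on the production instance (Drush site alias assumed)
drush @prod cache-clear all

# purge the updated content tree from the CDN; endpoint and token are
# placeholders - consult the purge API documentation of the CDN in use
curl -X POST "https://cdn.example.com/api/purge" \
     -H "Authorization: Bearer ${CDN_API_TOKEN}" \
     -d "path=${path}"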
Active Directory Structure
The Active Directory structure follows these simple design principles:
• Platform-scoped AD objects are kept separate from hosting-scoped AD objects;
• Hosting-scoped AD objects may refer to platform-scoped AD objects, but not the other way around;
• All AD objects belonging to a hosted organization are encapsulated in a dedicated OU container;
• The Platform may have its own Users, Groups and Service Accounts;
• Each hosted organization may also have its own Users, Groups and Service Accounts.
Figure 24 - Example: MS Active Directory Structure
Let's describe the structure presented above in more detail. For the sake of simplicity, all AD objects are stored under the root-level HOSTING org-unit. It includes the following platform-scoped objects in corresponding containers:
• Groups – platform-scoped LDAP groups;
• Service Accounts – platform-scoped service accounts;
• Users – platform-scoped users.
The hosting-scoped AD objects are further separated and stored under the EXT org-unit. The org-units for the hosted organizations are provisioned here. For example, AD objects for the Test Agency organization are located under the /HOSTING/EXT/Test Agency path in the LDAP tree. As with platform-scoped AD objects, the hosting-scoped AD objects for a particular organization are stored under its org-unit. Following the above example, the Test Agency org-unit has its objects stored in the corresponding containers:
• Groups – hosting-scoped LDAP groups belonging to the specific hosting org-unit;
• Service Accounts – hosting-scoped service accounts belonging to the specific org-unit;
• Users – hosting-scoped users belonging to the specific hosting org-unit.
For example, all users belonging to the Test Agency will be provisioned under the /HOSTING/EXT/Test Agency/Users path in the LDAP tree. These users can belong both to groups from the Platform scope (/HOSTING/Groups) and to groups from the hosted org-unit scope (/HOSTING/EXT/Test Agency/Groups). Eventually, the LDAP tree schematically looks like the following:
/[HOSTING]              - Root OU for hosting-related AD objects
  /Groups               - Platform Groups
  /Service Accounts     - Platform Service Accounts
  /Users                - Platform Users
  /[EXT]                - OU for external hosted organizations
    /[Organization 1]   - OU for each organization
      /Groups           - Organization Groups
      /Service Accounts - Organization Service Accounts
      /Users            - Organization Users
    /[Organization 2]
      /Groups
      /Service Accounts
      /Users
    ...
    /[Organization N]
      /Groups
      /Service Accounts
      /Users
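To make the structure tangible: the users provisioned for a hosted organization can be enumerated with a standard LDAP query. A sketch only – the host name, bind account and domain components (DC=poc,DC=local) are assumptions for illustration:

# list all user accounts provisioned for the Test Agency organization
$ ldapsearch -x -H ldap://dc.poc.local \
    -D "admin@poc.local" -W \
    -b "OU=Users,OU=Test Agency,OU=EXT,OU=HOSTING,DC=poc,DC=local" \
    "(objectClass=user)" sAMAccountName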
Practically speaking, such an LDAP structure allows a user belonging to a particular organization, say Test Agency, to be a member of the GitLab Users platform group and thus gain access to the GitLab code repositories. At the same time, this user may also belong to a local group, Test Agency Sonar Users, and thus gain access to the Test Agency's Sonar projects. Things to keep in mind:
• Obviously, creating one AD structure to fit all possible – especially unknown future – requirements is impossible. The presented AD structure and design principles are intended to provide a foundation flexible enough to be extended and adapted to fit new and changed project requirements;
• Platform Components rely heavily on the given AD structure, i.e. authorization groups, service accounts and platform users are looked up in specific containers. Should the AD structure need adjustments, all Platform Components depending on AD must be reviewed and their configuration adjusted accordingly;
• Due to AD requirements, user login names must be unique within the enterprise. Although you may have two users with the same "First Last" name in different containers, the login name attribute must still be unique;
• Due to AD requirements, group names must always be unique, even if they are located in different LDAP containers and org-units;
• The Platform does not put any limitations on AD object names; in general, the MS ADC documentation must be checked for AD-specific limitations and naming requirements;
• During the POC, Managed Service Accounts were not explored. All Service Accounts are plain accounts;
• Changing Service Account passwords is a simple and straightforward operation, since passwords are stored and managed by the Secure Storage service on the platform side. This makes the regular password update operation very easy to automate and implement programmatically;
• The only exception is the Drupal Service Account, whose credentials are stored in the AD Feature configuration. This feature code may be extended to fetch credentials from the Secure Store at run-time.
GitLab Repository Structure
The GitLab Source Code Management platform access and authorization matrix can be mapped directly to the LDAP structure in GitLab Enterprise Edition (EE). The Community Edition (CE), however, supports only a subset of this functionality, which is nonetheless sufficient for implementing user and project isolation and role-based access. The GitLab platform supports the concepts of a project and a project namespace, sometimes called a project group. These project groups and projects may have different scope and visibility, and users may be assigned certain project roles and permissions for project groups and projects. The following rules have been established:
• Each LDAP organization has one and only one project group assigned;
• The project group name is derived from the LDAP org-name by lowercasing its letters and replacing spaces with the underscore ('_') character. For example, the Test Agency LDAP org gets assigned the test_agency GitLab project group (see the one-liner after this list);
• Each GitLab project group is provisioned with the private visibility level, meaning that only users who have been explicitly granted access permission can access this project group;
• Unless specified otherwise, GitLab users authenticated via LDAP are given the developer project role by default.
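The name derivation rule is trivial to reproduce in shell; a one-line sketch:

# derive the GitLab project group name from an LDAP org name
$ echo "Test Agency" | tr '[:upper:]' '[:lower:]' | tr ' ' '_'
test_agency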
Besides having a dedicated project group for every LDAP organization, several project groups have been reserved for the platform itself. The group holding the platform code is called vzpoc. It contains the following projects:
• adtool – Unix command-line utility for Active Directory administration;
• deploy – contains the Platform CLI: a number of tools, along with their configurations, for managing web containers, deploying code and performing other platform management tasks;
• d7-vzbase-7.50 – Drupal 7.50 built from the VzBase profile with AD integration and the Publishing Workflow features;
• drupal-base – a boilerplate directory structure for starting a new Drupal project;
• foundation – contains the docker-compose scripts for bootstrapping the platform foundation services;
• jenkins – contains Jenkins workflows and the configuration store;
• monitor – an add-on project containing health-check monitors for LB VIPs;
• vzbase-7.50 – contains the VzBase Installation Profile for Drupal 7.50.
Another platform project group, images, has been created to store all container image projects. Eventually, the GitLab project structure looks like the following:
/images/
  /image_1
  /image_2
  ...
  /image_N
/vzpoc/
  /adtool
  /deploy
  /drupal-base
  /foundation
  /jenkins
  /monitor
  /d7-vzbase-7.50
  /vzbase-7.50
  /...
/<ldap_org_name_1>/
  /project_1
  /project_2
  ...
  /project_N
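With this layout, a Test Agency developer addresses a project by its group path; for example (host and project names assumed, SSH port per the GitLab configuration shown in the Platform Startup chapter):

# clone an organization project over SSH (GitLab shell listens on port 2222)
$ git clone ssh://git@gitlab.poc.local:2222/test_agency/project_1.git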
Things to keep in mind:
• The GitLab project group name could actually be made equal to the LDAP org name. This would, however, require maintaining a separate GitLab project path property, which must be HTTP-URL friendly and which is used behind the scenes whenever a specific project with the given name is addressed;
• The platform project names and their group names are not fixed and can be adjusted as needed. Since the platform relies heavily on source control, the platform configuration must be updated to account for any changed names;
• The name validators implemented in the Platform CLI tools usually check against a predefined string length and the subset of allowed characters quoted by the vendor. These validators have to be adjusted per specific project requirements.
Management Tasks and Workflows
All platform management and lifecycle tasks have corresponding jobs in Jenkins, thus allowing RBAC-controlled platform management. These platform management and lifecycle tasks can be divided into two groups: low-level tasks and workflows. Generally speaking, a workflow is an ordered list of lifecycle tasks. There are workflows for creating or adding a new hosting object and, eventually, removing or deleting this object. Workflows for updating or modifying hosting objects are not provided, for practical reasons. Besides performing additional parameter validation and invoking sub-tasks in a predefined order, the workflows provide several more benefits:
• Workflows follow the pipeline-as-code paradigm and are created using Groovy code;
• The Groovy code for workflows is stored under source control;
• The workflow parameters are pre-populated using dynamic lists and responsive controls;
• Workflows can be paused, stopped and continued at any point in time. They can also be replayed at a later point in time.
The following high-level workflows have been defined:
• Organization Setup – creates all required hosting objects and structures for a newly added organization. It includes the following sub-tasks:
  o Create LDAP Org
  o Create GitLab Group
  o Create "Sonar Users" LDAP Group
  o Create SonarQube Authorization Group
  o Create SonarQube Permission Template
  o Add LDAP Group with "Browse" Permission to Template
  o Add LDAP Group with "See Source Code" Permission to Template
• Organization Remove – removes a hosting organization and all related objects. It includes the following sub-tasks:
  o Remove GitLab Group
  o Remove SonarQube Projects
  o Remove SonarQube Permission Template
  o Remove SonarQube Authorization Group
  o Remove "Sonar Users" LDAP Group
  o Remove LDAP Org
• Project Setup – creates a new GitLab project owned by the given organization;
• Project Remove – removes a GitLab project owned by the given organization;
• User Setup – creates a new hosting user for the given organization. It includes the following sub-tasks:
  o Create LDAP Account
  o Add Account to "Sonar Users" LDAP Group
  o Add Account to "GitLab Users" LDAP Group
  o Create GitLab Account
  o Grant Account Access to Organization Projects
• User Remove – removes a hosting user for the given organization. It includes the following sub-tasks:
  o Remove GitLab Account
  o Remove LDAP Account
• Website Setup – deploys a website and all related hosting objects for the given organization and its project. It includes the following sub-tasks:
  o Deploy Website Project
  o Deploy Website Container
  o Enable Website VIP
  o Test Website Monitor
• Website Remove – removes a website and all related hosting objects. It includes the following sub-tasks:
  o Remove Website Container
  o Remove Website Project
• Code Analyzer – submits the specified project owned by the given organization for analysis. The analysis results may be accessed via the Sonar Portal;
• Container Image Builder – performs a container image build from its specification. The image is appropriately tagged, tested and submitted to the Image Registry.
The sub-tasks used by the platform management workflows are stored in the Admin Tools folder. Unlike workflows, the low-level tasks do not pre-populate values and do not use dynamic parameters, thus giving more control and flexibility to the platform administrator.
These low-level tasks may be executed in arbitrary order. The user triggering task execution must ensure the prerequisites are met and provide parameters following the predefined scheme and naming standards. Below is a list of the low-level tasks:
$ java -jar ./war/WEB-INF/jenkins-cli.jar -s http://localhost:8080 list-jobs Admin_Tools
gitlab_group_add
gitlab_group_adduser
gitlab_group_del
gitlab_group_deluser
gitlab_group_list
gitlab_group_users
gitlab_project_add
gitlab_project_del
gitlab_project_list
gitlab_user_add
gitlab_user_del
gitlab_user_list
ldap_group_add
ldap_group_adduser
ldap_group_del
ldap_group_deluser
ldap_group_list
ldap_group_users
ldap_org_add
ldap_org_del
ldap_org_list
ldap_user_add
ldap_user_del
ldap_user_groups
ldap_user_list
sonar_group_add
sonar_group_del
sonar_project_del
sonar_template_add
sonar_template_addgroup
sonar_template_del
sonar_template_delgroup
web_container_deploy
web_container_health
web_container_ipmap
web_container_list
web_container_remove
web_container_stats
web_project_deploy
web_project_remove
web_vip_down
web_vip_status
web_vip_test
web_vip_up
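The same CLI can also trigger any of these jobs. A sketch only – the parameter names are hypothetical and must match the actual job definition:

# trigger the ldap_user_add task with parameters and wait for completion
$ java -jar ./war/WEB-INF/jenkins-cli.jar -s http://localhost:8080 \
    build Admin_Tools/ldap_user_add \
    -p ORG_NAME="Test Agency" -p USER_NAME="jdoe" -s -v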
The Test folder contains the "Test Platform" and "Test Setup" tasks. See the Test CLI chapter for examples and additional details. Things to keep in mind:
• Management workflows use Scriptler scripts for generating and populating dynamic parameter lists. Those scripts are shared among multiple Jenkins tasks;
• Management workflows and lifecycle tasks propagate the return codes from the Platform tools;
• Management workflows do not raise warnings and do not attempt recovery. When an issue occurs in one sub-task, the whole workflow is marked as failed. Conversely, if a workflow execution succeeded, it means that every workflow sub-task completed successfully.
Platform Startup
There are several ways to ensure repeatable and consistent service startup, among them systemd definitions, custom shell scripts and docker-compose files. Without performing a thorough comparison, which goes beyond the scope of this paper, we will say that the latter option has been chosen to manage platform service startup, although it does have its own shortcomings, which will be outlined later. Below is an example of a Compose definition file for booting up all required services:
version: '2'
services:

##############################################################################
### Registry: TLS enabled local docker image repository
### ==========================================================================
## Depends on:
##  1. Host folders: ${VOL_DATA}/registry/{certs,config,data}
##  2. Service config file: ${VOL_DATA}/registry/config/config.yml
##  3. TLS certificates: ${VOL_DATA}/registry/certs
#
## Ensuring dependencies:
##  1. Creating host folders
#   $ mkdir -p ${VOL_DATA}/registry/{certs,config,data}
#
##  2. Creating service config
#   $ cat <<EOT > ${VOL_DATA}/registry/config/config.yml
#   version: 0.1
#   log:
#     level: info
#     fields:
#       service: registry
#       environment: production
#   storage:
#     delete:
#       enabled: true
#     cache:
#       blobdescriptor: inmemory
#     filesystem:
#       rootdirectory: /var/lib/registry
#     maintenance:
#       uploadpurging:
#         enabled: true
#         age: 168h
#         interval: 24h
#         dryrun: false
#     redirect:
#       disable: false
#   http:
#     addr: :5000
#     debug:
#       addr: :5001
#     tls:
#       certificate: /certs/registry.crt
#       key: /certs/registry.key
#     headers:
#       X-Content-Type-Options: [nosniff]
#   EOT
#
##  3. Creating TLS certificates
#   $ mkdir -p ~/certs && cd ~/certs
#   $ openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
#     -subj "/C=DE/ST=HE/L=Frankfurt/O=Verizon/OU=Managed Hosting/CN=registry.poc/emailAddress=admin@registry.poc" \
#     -keyout registry.key -out registry.crt
#
##  4. Deploying certificate to container volume
#   $ sudo cp ~/certs/registry.* ${VOL_DATA}/registry/certs
#
##  5. Deploying certificates to the Docker certificate store (on each docker host)
#   $ sudo mkdir -p /etc/docker/certs.d/registry.poc:5000
#   $ sudo cp ~/certs/registry.crt /etc/docker/certs.d/registry.poc\:5000/ca.crt
#
##  6. Restarting docker
#   $ systemctl restart docker.service
#
##  7. Validating registry service
#   $ docker tag busybox registry.poc:5000/poc/busybox:v1
#   $ docker push registry.poc:5000/poc/busybox:v1
#   $ curl --cacert certs/registry.crt -X GET https://registry.poc:5000/v2/poc/busybox/tags/list
#
## Startup command example:
#   $ docker run --name registry --hostname registry --detach=true --restart=always \
#     --env REGISTRY_HTTP_TLS_CERTIFICATE=/certs/registry.crt --env REGISTRY_HTTP_TLS_KEY=/certs/registry.key \
#     --volume ${VOL_DATA}/registry/certs:/certs:ro --volume ${VOL_DATA}/registry/data:/var/lib/registry:rw \
#     --volume ${VOL_DATA}/registry/config/config.yml:/etc/docker/registry/config.yml:ro \
#     --publish ${UTIL_HOST}:5000:5000 \
#     registry:2.5

  registry:
    container_name: registry
    image: registry:2.5
    hostname: registry
    dns:
      - ${DNS_SERVER1}
      - ${DNS_SERVER2}
    ports:
      - ${UTIL_HOST}:5000:5000
    volumes:
      - ${VOL_DATA}/registry/certs:/certs:ro
      - ${VOL_DATA}/registry/config/config.yml:/etc/docker/registry/config.yml:ro
      - ${VOL_DATA}/registry/data:/var/lib/registry
    environment:
      - REGISTRY_HTTP_TLS_CERTIFICATE=/certs/registry.crt
      - REGISTRY_HTTP_TLS_KEY=/certs/registry.key
    cpu_shares: 512
    mem_limit: 2G
    security_opt:
      - no-new-privileges
    restart: on-failure:5
    read_only: false

##############################################################################
### GitLab: git repository server
### ==========================================================================
## Depends on:
##  1. Host folders: ${VOL_DATA}/gitlab/{config/ssl,logs,data}
##  2. TLS certificates: ${VOL_DATA}/gitlab/config/ssl
#
## Ensuring dependencies:
##  1. Creating host folders
#   $ mkdir -p ${VOL_DATA}/gitlab/{config/ssl,logs,data}
#
##  2. Creating TLS certificates
#   $ openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
#     -subj "/C=DE/ST=HE/L=Frankfurt/O=Verizon/OU=Managed Hosting/CN=gitlab.poc.local/emailAddress=admin@gitlab.poc.local" \
#     -keyout ${VOL_DATA}/gitlab/config/ssl/gitlab.poc.local.key \
#     -out ${VOL_DATA}/gitlab/config/ssl/gitlab.poc.local.crt
#
## Startup command example:
#   $ docker run --name gitlab --detach=true --restart always --hostname gitlab.poc \
#     --publish ${UTIL_HOST}:8443:443 --publish ${UTIL_HOST}:8080:80 --publish ${UTIL_HOST}:2222:22 \
#     --volume ${VOL_DATA}/gitlab/config:/etc/gitlab \
#     --volume ${VOL_DATA}/gitlab/logs:/var/log/gitlab \
#     --volume ${VOL_DATA}/gitlab/data:/var/opt/gitlab \
#     gitlab/gitlab-ce:8.6.8-ce.0

  gitlab:
    container_name: gitlab
    image: gitlab/gitlab-ce:8.6.8-ce.0
    hostname: gitlab
    dns:
      - ${DNS_SERVER1}
      - ${DNS_SERVER2}
    ports:
      - ${UTIL_HOST}:2222:22
      - ${UTIL_HOST}:8080:80
      - ${UTIL_HOST}:8443:443
    volumes:
      - ${VOL_DATA}/gitlab/config:/etc/gitlab
      - ${VOL_DATA}/gitlab/data:/var/opt/gitlab
      - ${VOL_LOGS}/gitlab:/var/log/gitlab
    environment:
      GITLAB_OMNIBUS_CONFIG: |
        external_url 'https://gitlab.poc.local'
        gitlab_rails['gitlab_shell_ssh_port'] = 2222
    cpu_shares: 512
    mem_limit: 2G
    security_opt:
      - no-new-privileges
    restart: on-failure:5

##############################################################################
### Jenkins: CI/automation server
### ==========================================================================
## Depends on:
##  1. Host folders: ${VOL_DATA}/jenkins
#
## Ensuring dependencies:
##  1. Creating host folders
#   $ mkdir -p ${VOL_DATA}/jenkins
#
## Startup command example:
#   $ docker run --name=jenkins --hostname jenkins --detach=true --restart=always \
#     --cpu-shares 512 --memory 2G \
#     --volume=${VOL_DATA}/jenkins:/var/jenkins_home \
#     --publish 10.169.64.245:8080:8080 --publish 10.169.64.245:50000:50000 \
#     jenkins:2.7.3-alpine

  jenkins:
    container_name: jenkins
    image: jenkins:2.7.3-alpine
    hostname: jenkins
    dns:
      - ${DNS_SERVER1}
      - ${DNS_SERVER2}
    ports:
      - ${UTIL_HOST}:8888:8080
      - ${UTIL_HOST}:50000:50000
    volumes:
      - ${VOL_DATA}/jenkins:/var/jenkins_home
    tmpfs:
      - /run
      - /tmp:exec
    cpu_shares: 512
    mem_limit: 2G
    security_opt:
      - no-new-privileges
    restart: on-failure:5
    read_only: false
##############################################################################
### Vault: secure credentials storage
### ==========================================================================
## Depends on:
##  1. Host folders: ${VOL_DATA}/vault/{config,data,ssl}
##  2. Service config file: ${VOL_DATA}/vault/config.hcl
##  3. TLS certificates: ${VOL_DATA}/vault/ssl/vault.{crt,key}
##  4. Services: registry
#
## Ensuring dependencies:
##  1. Creating host folders
#   $ mkdir -p ${VOL_DATA}/vault/{config,data,ssl}
#
##  2. Creating service config
#   $ cat <<EOT > ${VOL_DATA}/vault/config.hcl
#   backend "file" {
#     path = "/vault/data"
#   }
#   listener "tcp" {
#     address       = "0.0.0.0:8200"
#     tls_disable   = 0
#     tls_key_file  = "/vault/ssl/vault.key"
#     tls_cert_file = "/vault/ssl/vault.crt"
#   }
#   EOT
#
##  3. Creating TLS certificates
#   $ openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
#     -subj "/C=DE/ST=HE/L=Frankfurt/O=Verizon/OU=Managed Hosting/CN=vault.poc.local/emailAddress=admin@vault.poc.local" \
#     -keyout ${VOL_DATA}/vault/ssl/vault.key -out ${VOL_DATA}/vault/ssl/vault.crt
#
## Startup command example:
#   $ docker run --name vault --detach=true --cap-add IPC_LOCK \
#     --publish ${UTIL_HOST}:8200:8200 --env VAULT_ADDR=https://127.0.0.1:8200 --env VAULT_SKIP_VERIFY=1 \
#     --volume /var/data/vault/config.hcl:/vault/config.hcl \
#     --volume /var/data/vault/data:/vault/data \
#     --volume /var/data/vault/ssl:/vault/ssl \
#     registry.poc:5000/poc/vault server -config /vault/config.hcl
#
## WARNING: after start the vault storage is sealed and must be unsealed
## prior to first use. The following command must be executed 3 times
## and 3 out of 5 vault keys must be provided.
#
#   $ docker exec -it vault vault unseal

  vault:
    container_name: vault
    image: ${REGISTRY}/vault
    depends_on:
      - registry
    command: server -config /vault/config.hcl
    hostname: vault
    dns:
      - ${DNS_SERVER1}
      - ${DNS_SERVER2}
    ports:
      - ${UTIL_HOST}:8200:8200
    volumes:
      - ${VOL_DATA}/vault/config.hcl:/vault/config.hcl:ro
      - ${VOL_DATA}/vault/ssl:/vault/ssl:ro
      - ${VOL_DATA}/vault/data:/vault/data:rw
    environment:
      - VAULT_ADDR=https://127.0.0.1:8200
      - VAULT_SKIP_VERIFY=1
    cpu_shares: 10
    mem_limit: 25M
    security_opt:
      - no-new-privileges
    cap_add:
      - IPC_LOCK
    restart: on-failure:5
    read_only: false

##############################################################################
### influxDB: time-series database
### ==========================================================================
## Depends on:
##  1. Host folders: ${VOL_DATA}/influxdb
##  2. Services: registry
#
## Ensuring dependencies:
##  1. Creating host folders
#   $ mkdir -p ${VOL_DATA}/influxdb
#
## Startup command example:
#   $ docker run --name=influxdb --detach=true --restart=always \
#     --cpu-shares 512 --memory 1G --memory-swap 1G \
#     --volume=${VOL_DATA}/influxdb:/influxdb --env ADMIN_USER="root" \
#     --publish 8083:8083 --publish 8086:8086 --expose 8090 --expose 8099 \
#     ${REGISTRY}/influxdb

  influxdb:
    container_name: influxdb
    image: ${REGISTRY}/influxdb
    depends_on:
      - registry
    hostname: influxdb
    dns:
      - ${DNS_SERVER1}
      - ${DNS_SERVER2}
    ports:
      - ${UTIL_HOST}:8083:8083
      - ${UTIL_HOST}:8086:8086
    expose:
      - 8090
      - 8099
    volumes:
      - ${VOL_DATA}/influxdb:/var/lib/influxdb
    environment:
      - PRE_CREATE_DB=cadvisor
      - ADMIN_USER=root
    cpu_shares: 512
    mem_limit: 1G
    memswap_limit: 1G
    security_opt:
      - no-new-privileges
    restart: on-failure:5
    read_only: false

##############################################################################
### cAdvisor: container advisor
### ==========================================================================
## Depends on:
##  1. Storage driver defaults (credentials/db name)
##  2. Services: influxdb
#
## Startup command example:
#   $ docker run --name=cadvisor --hostname=`hostname` --detach=true --restart=always \
#     --cpu-shares 100 --memory 500m --memory-swap 1G \
#     --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw \
#     --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro \
#     --publish=8080:8080 \
#     google/cadvisor:v0.23.8 -storage_driver=influxdb -storage_driver_db=cadvisor \
#     -storage_driver_host=${INFLUXDB_HOST}:8086
  cadvisor:
    container_name: cadvisor
    image: google/cadvisor:v0.23.8
    command: -storage_driver=influxdb -storage_driver_host=${INFLUXDB_HOST}:8086
    depends_on:
      - influxdb
    hostname: util
    dns:
      - ${DNS_SERVER1}
      - ${DNS_SERVER2}
    ports:
      - ${UTIL_HOST}:18080:8080
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /var/log:/var/log:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    cpu_shares: 100
    mem_limit: 500m
    memswap_limit: 1G
    security_opt:
      - no-new-privileges
    restart: on-failure:5
    read_only: false

##############################################################################
### Grafana: metric analytics and visualization suite
### ==========================================================================
## Depends on:
##  1. Host folders: ${VOL_DATA}/grafana, ${VOL_LOGS}/grafana
##  2. Services: registry, influxdb
#
## Ensuring dependencies:
##  1. Creating host folders
#   $ mkdir -p ${VOL_DATA}/grafana ${VOL_LOGS}/grafana
#
## Startup command example:
#   $ docker run --name=grafana --hostname grafana.poc --detach=true --restart=always \
#     --cpu-shares 50 --memory 50m --publish=3000:3000 \
#     ${REGISTRY}/grafana:2.6.0
#
## First boot setup for Grafana: creating DSN and adding dashboards
#   $ ./setup.sh cadvisor dashboards

  grafana:
    container_name: grafana
    image: ${REGISTRY}/grafana:2.6.0
    depends_on:
      - registry
      - influxdb
    hostname: grafana
    dns:
      - ${DNS_SERVER1}
      - ${DNS_SERVER2}
    ports:
      - ${UTIL_HOST}:3000:3000
    volumes:
      - ${VOL_DATA}/grafana:/var/lib/grafana
      - ${VOL_LOGS}/grafana:/var/log/grafana
    cpu_shares: 50
    mem_limit: 50m
    memswap_limit: 500m
    security_opt:
      - no-new-privileges
    restart: on-failure:5
    read_only: false

##############################################################################
### SonarQube
### ==========================================================================
## Depends on:
##  1. Host folders: ${VOL_DATA}/sonar/{conf,data,temp}, ${VOL_LOGS}/sonar
##  2. Services: registry
##  3. Persistent DB storage: MySql DB instance
#
## Ensuring dependencies:
##  1. Creating host folders
#   $ mkdir -p ${VOL_DATA}/sonar/{conf,data,temp} ${VOL_LOGS}/sonar

  sonar:
    container_name: sonar
    image: ${REGISTRY}/sonar:5.6.1
    depends_on:
      - registry
    ports:
      - ${UTIL_HOST}:9000:9000
    environment:
      - SONARQUBE_JDBC_URL=jdbc:mysql://${MYSQLDB_HOST}:3306/sonar?useUnicode=true&characterEncoding=utf8&rewriteBatchedStatements=true
    volumes:
      - ${VOL_DATA}/sonar/conf:/opt/sonarqube/conf
      - ${VOL_DATA}/sonar/data:/opt/sonarqube/data
      - ${VOL_DATA}/sonar/temp:/opt/sonarqube/temp
      - ${VOL_LOGS}/sonar:/opt/sonarqube/logs
    tmpfs:
      - /run
      - /tmp
    cpu_shares: 512
    mem_limit: 2G
    security_opt:
      - no-new-privileges
    restart: on-failure:5
    read_only: false

networks:
  default:
    driver: bridge
We can use the following script to start the required foundation services, depending on the host role:
#!/usr/bin/env bash
# enabling strict mode http://redsymbol.net/articles/unofficial-bash-strict-mode/
set -euo pipefail
IFS=$'\n\t'

declare -x HOSTNAME=${HOSTNAME:-$(hostname -s)}

[[ -f .env ]] || { echo "environment settings (.env) missing, exiting..."; exit 127; }

case ${HOSTNAME} in
  util*) yml=util.yml ;;
  wb*)   yml=web.yml ;;
  *)     echo "unknown system role, exiting..."; exit 127 ;;
esac

docker-compose --file ${yml} config --quiet \
  && docker-compose --file ${yml} up -d \
  || { echo "errors in ${yml}, exiting..."; exit 1; }
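The script above refuses to start without a .env file, which supplies the variables referenced throughout the Compose definitions. A minimal example with placeholder values (the actual addresses and paths are site-specific):

# .env - environment settings sourced by docker-compose (placeholder values)
UTIL_HOST=10.169.64.245
DNS_SERVER1=10.169.64.11
DNS_SERVER2=10.169.64.12
VOL_DATA=/var/data
VOL_LOGS=/var/log/platform
REGISTRY=registry.poc:5000/poc
INFLUXDB_HOST=influxdb.poc.local
MYSQLDB_HOST=mysql.poc.local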
Things to keep in mind:
• Although docker-compose starts services in the proper order, defined via depends_on statements, it does not verify that service startup has completed and that the service is able to perform its function. For example, services depending on the image registry may fail to start if they attempt to fetch their images before the registry service is fully up and running. The Compose developers consider checking service startup completion and health to be outside Compose's remit, something to be performed by external tools. Indeed, several solutions have been proposed; they are far from elegant, but they do the trick. See, for example, https://github.com/vishnubob/wait-for-it/ and the sketch after this list;
• The Compose definition is static and does not allow conditions or parameters. The only way to inject variables into the script at the moment is via environment variables. The .env file located in the current folder (by default; another file can be specified via command-line options) is sourced by docker-compose, and the variables defined there may be used in the Compose definition;
• Not all Docker runtime options are implemented by the Compose specification yet. Hopefully it is just a matter of time until the Compose specification catches up and provides the missing definitions.
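As an illustration of the health-gating workaround, the entrypoint of a dependent service can be wrapped with wait-for-it so it blocks until the registry answers on its port. A sketch under the assumption that wait-for-it.sh and the start-service.sh helper are baked into the image:

# block for up to 60 seconds until the registry port accepts connections,
# then hand control over to the real service command
$ ./wait-for-it.sh registry.poc:5000 --timeout=60 --strict -- \
    /usr/local/bin/start-service.sh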
Base OS Image
Often given too little attention or even overlooked entirely, base OS selection is one of the most important topics and must be addressed early, at the design stage. Here is why.

The OS Image Inside the Container
In theory, you may use any Linux-based OS and "it will run" inside a container. There is a huge difference between "just run" and "optimally run", though. Not to mention that the OS image packaged inside a container is not really "running" inside the container; rather, it provides the runtime and dependencies for the applications packaged inside the container, which are still executed on top of the host OS kernel.

Basically, inside the container you need only your application and its dependencies. Indeed, just take a look at the busybox or cadvisor container images: they do not package any OS or libraries along with their statically linked binaries. Unfortunately, not all applications can be compiled statically, and in some cases it does not even make sense to attempt it. The objective is to optimally package applications in containers, not to re-develop and re-build them from scratch. This is where modern OS packagers come to help. Eventually, if you deploy an RPM or DEB package, it shall bring all its dependencies along. While this holds true, very soon we realize that those dependencies have their own dependencies, which have dependencies too, and so on. In reality, a single package may have a long tail of dependencies you would never think of. At the end of the day you end up installing hundreds of binaries, libraries and files in order to satisfy the dependencies of your single required package. Still, this will get you to a better place than just using the vanilla OS image of your choice as your base image. You can actually approach this problem from both directions: either choose a standard, vendor-provided base OS image and remove the unnecessary packages and files, or start from a blank image and add only the required packages and files. Both approaches require expert knowledge of Linux OS innards, though.
What if you do not care about removing unneeded stuff and making your container images lean? After all, storage is quite cheap these days. Well, consider this math:
• A standard Linux OS distribution put into a container will take ~600-700 MB in size;
• Add a J2EE application or some other runtime and you will quickly hit the 1 GB threshold;
• It is not uncommon to see container images 1-2+ GB in size;
• For every one of your 20-30 (or more?) apps you will need another image.
Real-life experience shows that those numbers multiply rather quickly, and you will not notice how your container images come to occupy hundreds of gigabytes. Even with the effective, layered file-system employed by Docker, the image sprawl will sooner or later take its tax. Now, let's take pretty much any Linux OS image with a minimal package set installed. What do we get inside? A kernel, tons of drivers, a couple of shells, several scripting languages, documentation and man pages, application supervisors and startup scripts, etc. Do you really need this stuff inside each and every container? Do you need to store, multiply and carry along hundreds or thousands of copies of this unneeded stuff, absolutely irrelevant to your application and its functions? Sure, a layered or COW container file-system can help to an extent, but do not expect it to solve the issue for you.
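The difference is easy to verify first-hand: pull a full distribution base image next to a minimal one and compare the sizes Docker reports (exact numbers vary by version):

# pull a full distribution base image and a minimal one, then compare sizes
$ docker pull debian:8
$ docker pull alpine:3.4
$ docker images | grep -E 'debian|alpine'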
One vs. Multiple Applications
Another, almost religious, subject: is a container supposed to run a single binary or multiple processes? There is a part of the community which has taken the notion of micro-services and decomposition literally and attempts to package every single binary into a dedicated container. This results in overly complex orchestration mechanisms, integration issues and performance issues. Are there any benefits? Not really. Another part of the community attempts to package everything inside a single container, basically thinking of the container as a "lightweight VM" – a flawed notion spread by multiple analysts. Heck, they even run SSH inside the container, to log in and update packages inside. Obviously, this completely defeats the purpose of containers.

The right approach is a midway: you can have multiple binaries or applications packaged inside a single container, as long as they contribute to the same business function and further decomposition does not bring any tangible benefits. A good example is the web container. Does it make sense to have one container running just Apache and another one running PHP-FPM? Technically it is doable, but what benefits would it bring, not counting the added complexity? There are none. Hence, Apache and PHP-FPM made it into the same container; and since the inter-process communication does not traverse the container border, we will not experience the increased latency that would otherwise surely bite us sooner or later.
Process Supervisor
Now, one might ask: if there are multiple processes inside a single container, how will they start and who will supervise them? This naturally leads us to another important aspect – process supervision inside the container. Can we use the existing mechanisms provided by a modern Linux OS? Not really; they were designed for a different purpose and are often real overkill for our needs. So there is a need for a lightweight process supervisor that takes care of process startup and, even more importantly, of the process lifecycle too. Now it gets a bit spooky… Someone or something has to reap zombies. We will not go into the depths of the Unix process lifecycle; suffice it to mention that there is a special process in a Unix OS, often called the init or PID 1 process. One of its tasks is to take care of orphaned processes
(those having no parent) and, if needed, adopt them and manage their lifecycle. If no one takes care of an orphaned process, at some point it will die and become a zombie. Zombies cannot be killed, obviously, since they are already dead. Over time, if not paid attention to, they can turn system management into a horror movie… Ever seen a system with thousands of zombie processes? Given this, it is a very good idea to have a process supervisor even if your container runs a single process – unless this single process is able to properly handle all signals sent to the container and does not fork child processes.
Quick Summary
Let's summarize what we have learned above:
• Statically linked, self-contained applications do not require any OS at all. They are simple, lean, effective and come close to the dreams and promises of a micro-service architecture;
• Using a whole Linux distribution as the base image for your containers is BAD! It may work "in vitro", but it is a rather bad idea "in vivo";
• Using package managers is good, since they take care of dependencies; however, beware the long tail of dependencies;
• It is fine to have multiple processes and applications inside one container, as long as they fulfill or support the same business function;
• Processes in a container do require a supervisor to pass signals and manage children and orphans! If you do not supervise them, beware the side effects…
Also take a look at the following pages and articles:
https://phusion.github.io/baseimage-docker/#intro
https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/
https://www.ctl.io/developers/blog/post/optimizing-docker-images/
http://jonathan.bergknoff.com/journal/building-good-docker-images
https://www.dajobe.org/blog/2015/04/18/making-debian-docker-images-smaller/

So, what are the viable options? The community has reacted to most of the issues outlined above and explored various options, ranging from stripped-down OS images made out of existing popular Linux distributions to new, purpose-built distributions created to support the specific needs of containerized applications:
• Alpine Linux – https://github.com/gliderlabs/docker-alpine
• Phusion – https://github.com/phusion/baseimage-docker
• Debian Slim (internal image repository)
This POC project has explored the various options, and all images have been built using either the Alpine Linux base image or a custom Debian Slim image. Debian Slim is basically Debian 8.1 stripped down to <50 MB, yet still a fully functional Debian Linux distribution. Besides removing unneeded packages and files without breaking any functionality, several configurations and optimizations have been put in place so that this image runs well in a container environment. Talking of process supervisors, there are multiple choices available too (see the run-script example after this list):
• S6 Supervisor – https://github.com/just-containers/s6-overlay
• Supervisord – http://supervisord.org/
• Dumb-Init – https://github.com/Yelp/dumb-init
• My_init – https://github.com/phusion/baseimage-docker/blob/master/image/bin/my_init
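To give a flavour of the s6-overlay approach: each supervised process gets a small run script under /etc/services.d/<name>/run, and the image entrypoint is the overlay's /init. A minimal sketch for a web container's PHP-FPM process; the config path is an assumption and depends on the image layout:

#!/usr/bin/env sh
# /etc/services.d/php-fpm/run - s6 starts and supervises this process;
# exec replaces the shell, so signals reach PHP-FPM directly
exec php-fpm -F --fpm-config /etc/php/fpm/php-fpm.conf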
The best experience was had with the S6 supervisor, and it has been used for most custom-built images during the POC project. Things to keep in mind:
• Unfortunately, there is no single best answer when it comes to choosing the base image for your container images. It depends a lot on specific project requirements, available experience, and organizational practices and policies. Nonetheless, it should be clear by now that base image selection is worth a separate discussion and must be given careful consideration;
• The same goes for the container supervisor. There are a number of good solutions, and one of them has to be chosen and used as the standard solution across all container images;
• A base image is a living, dynamic construct. It must be adjusted, rebuilt and modified due to changed requirements, security needs and other reasons. Therefore, the image build pipeline should be set up as a Continuous Delivery workflow, and all images must be rebuilt continuously and automatically;
• Creating your own base image is not a one-time task. It has to be maintained and managed over time. The image has to be rebuilt with updated and patched packages and, obviously, all depending images have to be rebuilt as well.
Storage Scalability in Docker
Historically, the Docker project relied upon the AUFS file-system. Unfortunately, AUFS never made it into the upstream Linux kernels and is, because of this, unsupported by the Red Hat OS family and disabled by default on Ubuntu Linux. This leaves us with the following options:
• Device-Mapper loopback (or loop-lvm) – the default configuration for RHEL;
• Device-Mapper (or direct-lvm) – the configuration recommended by Red Hat;
• BTRFS – Docker's upstream default;
• OverlayFS – considered experimental by Red Hat;
• ZFS – just getting popularity and traction in the container community.
More details, in-depth comparisons and overviews can be found in the following articles:
• Red Hat Developers Blog: Comprehensive Overview of Storage Scalability in Docker
  http://developerblog.redhat.com/2014/09/30/overview-storage-scalability-docker/
• The Deep Dive into Docker Storage Drivers
  http://jpetazzo.github.io/assets/2015-07-01-deep-dive-into-docker-storage-drivers.html
• Docker: Understanding images, containers and storage drivers
  https://docs.docker.com/engine/userguide/storagedriver/imagesandcontainers/
Below is a quick summary of those storage options. The good news: there are multiple options to choose from. The not-so-good news: as is often the case, there is no single best choice, and each option has strong and weak sides. Nonetheless, some general guidance can be given (see the configuration sketch after this list):
• If you run a PaaS or another high-density environment, OverlayFS is your best option;
• If you put big writable files on the CoW file-system – something you should not be doing, really – Device Mapper, BTRFS or ZFS would be the right choice;
• If memory is scarce or limited, ZFS should be avoided;
• If you want a mature file-system backend with little or no maintenance, Device Mapper – or Direct-LVM, to be precise – is the way to go.
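Checking which backend a Docker host currently uses, and pinning the choice explicitly, is straightforward. A sketch for a Docker 1.12-era host; note that switching drivers hides previously pulled images:

# show the storage driver and backing file-system currently in use
$ docker info | grep -A2 'Storage Driver'

# pin the storage driver explicitly, then restart the daemon
$ cat <<EOT > /etc/docker/daemon.json
{
  "storage-driver": "overlay"
}
EOT
$ systemctl restart docker.service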
This leaves us with two storage backend candidates for the hosting platform: OverlayFS and Direct-LVM. At the time of writing, Red Hat still considers OverlayFS an experimental technology and recommends using Direct-LVM for production workloads. Things to keep in mind:
• OverlayFS requires a so-called backing file-system. It is possible to use both EXT4 and XFS for this; however, XFS is the clear winner in terms of performance;
• OverlayFS uses a shared page cache, which makes this file-system a clear winner in terms of memory usage and performance;
• Red Hat is actively developing and contributing to the OverlayFS project (the same as for BTRFS and Direct-LVM);
• The POC project has explored both the Direct-LVM and OverlayFS options, and OverlayFS is the clear winner from both the performance and the operations perspective;
• OverlayFS2 has not been tested. It requires a 4.x kernel, which is not provided by RHEL;
• Red Hat performance engineers call OverlayFS the winner in terms of performance. There must be reasons, such as SELinux support on OverlayFS, that keep Red Hat from making this file-system the favored choice. After multiple enquiries, Red Hat did not provide any answers.
Loop-LVM
This is the default configuration in the Red Hat OS family. A quick summary:
• By default, Docker puts data and metadata on a loop device backed by a sparse file;
• This is great from a usability point of view (zero configuration needed);
• But it is terrible from a performance point of view: each time a container writes to a new block, a block has to be allocated from the pool, and when it is written to, a block has to be allocated from the sparse file – and sparse file performance is not great anyway.
This option is best used for getting-started trials and sandbox labs to become familiar with Docker and containers. For anything more demanding in terms of performance, use the other options.
Direct-LVM
The same idea, just using real devices for data and metadata:
• Each container gets its own block device with a real file-system on it;
• So you can also adjust (with --storage-opt):
  o the file-system type;
  o the file-system size;
  o discard (if you use SSDs);
• Caveat: when you start 1000 containers, the files will be loaded 1000 times from disk!
Although Red Hat claims this is the best option to run your containers today, in reality it is not. You get good, but not the best, performance, and often you cannot remove a stopped container because its devices are still mounted. Whether it is just a matter of adding a longer time-out when stopping containers or not, prepare for additional hassle.
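For reference, a Direct-LVM configuration on a 1.12-era Docker host might look like the sketch below; the thin-pool device name is an assumption and must match an LVM thin pool prepared beforehand:

# point the devicemapper driver at a pre-created LVM thin pool
$ cat <<EOT > /etc/docker/daemon.json
{
  "storage-driver": "devicemapper",
  "storage-opts": [
    "dm.thinpooldev=/dev/mapper/docker-thinpool",
    "dm.fs=xfs",
    "dm.use_deferred_removal=true"
  ]
}
EOT
$ systemctl restart docker.service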
BTRFS
Considered by many the most natural fit for Docker: it meets the basic requirement of supporting CoW (copy-on-write), it performs moderately well, and it is being actively developed. BTRFS does not currently support SELinux, nor does it allow page cache sharing. To summarize:
• BTRFS works by dividing its storage into chunks;
• A chunk can contain data or metadata;
• You can run out of chunks (and get "No space left on device") even though df shows space available, because the chunks are not full. So this file-system is not exactly maintenance-free;
• There is not much to tune.
When the BTRFS shortcomings have been addressed and the file-system has matured a bit, it may become an appealing option. For now, it is not really ready for prime time.
OverlayFS
OverlayFS, or simply overlay, is a modern union file-system that also meets the basic Docker requirements and may be considered a successor to AUFS. The quick description of OverlayFS is that it combines a lower (parent) and an upper (child) file-system plus a workdir (on the same file-system as the child). The lower file-system is the base image; when you create a new Docker container, a new upper file-system is created containing the deltas. A quick summary:
• Identical files are hard-linked between images, which avoids doing composed overlays;
• It is very fast;
• It allows page cache sharing;
• There is not much to tune at this point; OverlayFS2, however, provides more options;
• Performance should be slightly better than AUFS:
  o no stat() explosion;
  o good memory use;
  o slow copy-up, still!
Looking at the past two years of development, OverlayFS has gained a lot of traction and attention. There is an OverlayFS2 driver too, supported by 4.x+ kernels, with even better performance and additional features. The future of this storage option is looking bright.
ZFS
We will not spend time talking about all the virtues of ZFS. Long story short: it is one of the best file-systems out there. Still, your mileage may vary when you employ it specifically for containers. A quick summary:
• From an operations perspective it is somewhat similar to BTRFS;
• Fast snapshots, fast creation, compression…
• Epic performance and reliability, though your mileage may vary;
• ZFS has a reputation for being quite memory hungry – yes, you have to pay for those nifty features. You probably do not want that in a tiny VM;
• Setup and operation may require specific expertise.
Despite ZFS's reliability and features, one needs to consider the bigger picture and the trade-offs required for using this great file-system for containers.
Conclusion
The presented research cannot be seen as the ultimate guide for building your hosting platform; however, it does explore various options and provides an architecture blueprint, thus laying down a solid foundation for a production-ready, productized solution.

This research started by exploring containerization technology, its features and its readiness to support various workload types. It then continued by exploring DevOps practices for implementing and supporting the application lifecycle, from the development stage all the way down to production deployment. All lifecycle steps have been automated and made repeatable by using standardized workflows and naming schemes. Then it was time to transition from a bunch of scripts to an orchestrated platform built out of well-defined services.

The proof-of-concept project demonstrated that a PaaS solution employing containers for workload packaging and isolation is a very good fit for web application hosting requirements. That said, nothing speaks against using this platform for hosting other types of workloads and applications that can be decomposed into standalone components or services, namely micro-services based applications. The platform itself follows the same design paradigm and is composed of multiple loosely coupled services with well-defined interfaces.

Special attention has been paid to Drupal CMS deployment, configuration and publishing workflows. The main objective was to demonstrate how the presented platform can be used for hosting real-life applications and implementing a Drupal-as-a-Service solution.

It is expected that this work will serve as a foundation for new products and solutions, at the same time providing an extensive overview and educational material for a broad audience.