Building a multi-container application with vSphere Integrated Containers v1.1.1

Running demo containers on vSphere Integrated Containers (aka VIC) is pretty exciting, but eventually the time comes to get our hands dirty and build an actual application that needs to live on a virtual container host (aka VCH) in a resilient way. So in this post, I'll cover the steps to build an application with two layers (an application/UI layer and a database layer) and how to work around the challenges that the VIC engine imposes. The high-level objectives are:

  • Select a proper application
  • Open the required ports on ESXi servers
  • Create and configure a VCH
  • Create docker-compose configuration file
  • Run the containers and test the application

The component versions that I use for this task are:

  • vSphere Integrated Containers: 1.1.1, build 56a309f
  • docker-compose: 1.13.0, build 1719ceb

Selection of a proper application:

The application that I'm going to deploy is an open source project on GitHub called Gogs. Gogs is described by its creators as a painless, self-hosted Git service. The goal of the project is to provide the easiest, fastest and most painless way of setting up a self-hosted Git service. It is written in Go, which enables an independent binary distribution across all platforms that Go supports, including Linux, Mac OS X, Windows and even ARM.

There are many ways to install Gogs, such as installing from source, from packages or with Vagrant, but of course we will focus on the installation as a Docker container. Normally one container is more than enough to run Gogs, but it also supports remote databases, so we will take advantage of that to create a two-layered application.

Required ports on ESXi servers:

Before we start, I assume that the VIC appliance is deployed and configured properly. We also need a vic-machine host in order to run vic commands; this is going to be a CentOS box in my environment. If we don't have the vic-machine binaries yet, we can easily get them with curl from the VIC appliance. (sddcvic is my appliance; if it uses self-signed certificates, append --insecure to the curl command.)

curl -L -o /root/vic_1.1.1.tar.gz https://sddcvic:9443/vic_1.1.1.tar.gz --insecure
gzip -d /root/vic_1.1.1.tar.gz
tar -xvf /root/vic_1.1.1.tar -C /root

ESXi hosts communicate with the VCHs through port 2377 via Serial Over LAN. For the deployment of a VCH to succeed, port 2377 must be open for outgoing connections on all ESXi hosts before you run vic-machine-xxx create to deploy a VCH. The vic-machine utility includes an update firewall command that we can use to modify the firewall on a standalone ESXi host or on all of the ESXi hosts in a cluster. The command below allows outgoing connections on tcp/2377 for all ESXi servers in the cluster defined with the --compute-resource option.

./vic-machine-linux update firewall \
    --target=sddcvcs.domain.sddc \
    --user=orcunuso \
    --compute-resource=SDDC.pCluster \
    --thumbprint="3F:6E:2F:16:FA:76:53:74:18:3F:26:9D:1A:58:40:AD:E5:D8:3E:52" \
    --allow

Initially, we may not know the thumbprint of the vCenter server. The trick here is to run the command without the --thumbprint option, grab the thumbprint from the resulting error message, add the option and re-run the command.

Deployment of the VCH:

Normally, a few options (such as name, target, user and TLS support mode) will suffice, but in order to deploy a more customized VCH there are many options that we can provide (here is the full list). Below is the command that I used to deploy mine.

./vic-machine-linux create \
    --name=vch02.domain.sddc \
    --target=sddcvcs.domain.sddc/SDDC.Datacenter \
    --thumbprint="3F:6E:2F:16:FA:76:53:74:18:3F:26:9D:1A:58:40:AD:E5:D8:3E:52" \
    --user=orcunuso \
    --compute-resource=SDDC.Container \
    --image-store=VMFS01/VCHPOOL/vch02 \
    --volume-store=VMFS01/VCHPOOL/vch02:default \
    --bridge-network=LSW10_Bridge \
    --public-network=LSW10_Mgmt \
    --client-network=LSW10_Mgmt \
    --management-network=LSW10_Mgmt \
    --dns-server=10.10.100.10 \
    --public-network-ip=10.10.100.22/24 \
    --public-network-gateway=10.10.100.1 \
    --registry-ca=/root/ssl/sddcvic.crt \
    --no-tls

The --no-tls option disables TLS authentication of connections between the docker clients and the VCH, so the VCH uses neither client nor server certificates. In this case, docker clients connect to the VCH via port 2375 instead of port 2376.
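
Once the VCH is up, any plain docker client can talk to it over that port. A quick sanity check, assuming the endpoint answers on the static address we assigned above (10.10.100.22):

export DOCKER_HOST=tcp://10.10.100.22:2375
docker info
docker network ls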

Disabling TLS authentication is not the recommended way of deploying a VCH, because any docker client can then connect to it in an insecure manner. But in this exercise we will use docker-compose to build our application, and I've encountered many issues with a TLS-enabled VCH; sorting those out is on my to-do list.

Creating docker-compose.yml configuration file:

Docker-compose is a tool for defining and running multi-container docker applications. With compose, we use a YAML file to configure our application's services; then, with a single command, we can create and start all the services from that configuration. First we need to install the docker-compose binary if it is not already in place. Simply run curl to download it from GitHub.

curl -L https://github.com/docker/compose/releases/download/1.13.0/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose

Now we need to create our configuration file (docker-compose.yml).

version: "3"
services:
  gogsapp:
    image: sddcvic.domain.sddc/gogs/gogs
    container_name: gogsapp
    restart: on-failure
    depends_on:
      - "gogsdb"
    volumes:
      - "gogsvol2:/data"
    ports:
      - "10080:3000"
      - "10022:22"
    networks:
      - "gogsnet"
  gogsdb:
    image: sddcvic.domain.sddc/library/postgres:9.6
    container_name: gogsdb
    environment:
      - POSTGRES_PASSWORD=gogsdbpassword
      - POSTGRES_USER=gogsuser
      - POSTGRES_DB=gogsdb
      - PGDATA=/var/lib/postgresql/data/data
    restart: on-failure
    volumes:
      - "gogsvol1:/var/lib/postgresql/data"
    networks:
      - "gogsnet"
networks:
  gogsnet:
    driver: bridge
volumes:
  gogsvol1:
    driver: vsphere
  gogsvol2:
    driver: vsphere

From an architectural point of view, we have two services that communicate over a bridge network that we call "gogsnet". The Gogs application requires two ports to be exposed to the outside world, 22 and 3000. The postgres container accepts connections on port 5432, but that port does not have to be exposed because the communication stays internal to the bridge network.

To make this application more reliable, it is good practice to use persistent volumes, so that whenever we recreate the containers from the images already pushed to the private registry, the data is not lost and the application resumes as expected. For this purpose we create two volumes, gogsvol1 and gogsvol2, and map them to the relevant directories. The default volume driver with VIC is vsphere, so VIC will create two VMDK files in the location that we provided with the --volume-store option during the VCH deployment phase and attach those VMDKs to the containers.
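
Once the application is up, we can check what the vsphere driver actually created with the standard volume commands (a quick check, assuming DOCKER_HOST still points at the VCH and the volume subcommands are available there):

docker volume ls
docker volume inspect gogsvol2    # compose may prefix the volume name with the project name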

Normally we are not supposed to specify an alternate location for the postgres data files, but in this case VIC backs the volume with a VMDK, and a freshly formatted volume contains a lost+found folder. That folder causes the postgres init scripts to quit with exit code 1, so the container exits as well. That is why we use the PGDATA environment variable and point it at a subdirectory that will contain the data.

At the end of the day, this is what it will look like:

Run and test the application:

Before running the app, let's make sure that everything is in place. Our checklist includes:

  • A functional registry server
  • Required images that are ready to be pulled from registry server (gogs and postgres)
  • A functional VCH with no-tls support
  • Docker-compose and yaml file

Now let's build our application:

docker-compose -f /root/gogs/docker-compose-vic.yml up -d

Excellent!!! Our containers are up and running. Let's connect to our application via port tcp/10080 and complete the initial configuration that needs to be done on the first run. Here we enter the values that we specified as environment variables for the postgres container in the compose file.
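
Before opening the browser, it doesn't hurt to confirm that both containers are really running (again assuming DOCKER_HOST points at the VCH):

docker-compose -f /root/gogs/docker-compose-vic.yml ps
docker ps --format "table {{.Names}}\t{{.Image}}\t{{.Status}}\t{{.Ports}}"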

And voila!!! Our two-layered, containerized, self-hosted git service is up and running on a virtual container host, backed by the vSphere Integrated Containers registry service (aka Harbor). Enjoy 🙂

Push Windows image layers to Harbor registry

If you ever try to push Windows-based images to a private registry, only the custom layers that you create will be pushed, not the initial layers that come with the base image. To demonstrate this behavior:

Pull microsoft/nanoserver from Docker Hub.

docker pull microsoft/nanoserver

Tag the base image appropriately (sddcvic is my private Harbor registry and windows is the name of the project that I created earlier).

docker tag microsoft/nanoserver sddcvic.domain.sddc/windows/nanoserver:3.0

Log in to Harbor and push the image that we tagged.

docker login -u admin -p password sddcvic.domain.sddc 
docker push sddcvic.domain.sddc/windows/nanoserver:3.0
docker logout sddcvic.domain.sddc

After the push command, the first two layers, which are marked as foreign, are skipped and not pushed to the registry. This behavior is not specific to Harbor; it happens with any private registry service.

P.S. If you use Harbor as the image registry and your push operation errors out with a “blob unknown to registry” message, please refer to my previous post.

As of Docker Registry v2.5.0, a new version of the docker image manifest (Image Manifest Version 2, Schema 2) was introduced. Whenever an image gets pushed to a repository, an image manifest is also uploaded that provides a configuration and a set of layers for the container image. One of the most important changes in this schema version is the introduction of foreign layers, which are widely used by Windows-based containers. By their nature, foreign layers cannot be pushed to any registry other than the URLs that exist in their descriptor files. This is specifically intended to support downloading Windows base layers from Microsoft servers, since only Microsoft is allowed to distribute them. As a consequence, any image built on them requires internet connectivity to download the bits, which is not appropriate for most enterprise environments. Even where it is allowed, the amount of data that has to be downloaded from the internet to every docker host cache is huge (Windows images do not have a good reputation when it comes to size), which contradicts the main benefits of containers such as mobility and flexibility. Luckily, there is a trick to work around this issue.

Disclaimer: This procedure might not be recommended or supported by Microsoft, Docker or VMware!!!

In order to make a layer "non-foreign", we need to manipulate its descriptor file. If there are many images and layers cached on the docker host, it takes some effort to find the right descriptor.json files. The descriptor files live under the folder: C:\ProgramData\docker\image\windowsfilter\layerdb\sha256\*\

First we need to get the hashes.

docker inspect --format "{{.RootFS.Layers}}" microsoft/nanoserver

The hashes that we get here are diff hashes (created during the docker image build process), and each one can be found verbatim in a diff file that sits in the same folder as its descriptor.json. The commands below search for the diff hashes in all the diff files and return the full paths, which point us to the descriptor files.

Select-String -Pattern "6c357baed9f5177e8c8fd1fa35b39266f329535ec8801385134790eb08d8787d" -Path "C:\ProgramData\docker\image\windowsfilter\layerdb\sha256\*\diff"
Select-String -Pattern "0a051a1149b43239af90a8c11824a685a737c9417387caea392b8c8fee7e3889" -Path "C:\ProgramData\docker\image\windowsfilter\layerdb\sha256\*\diff"

Now that we have the full paths to the right descriptor files, we need to modify them. In our scenario, this is what a regular descriptor.json file looks like:

{
   "mediaType": "application/vnd.docker.image.rootfs.foreign.diff.tar.gzip",
   "size": 252691002,
   "digest": "sha256:bce2fbc256ea437a87dadac2f69aabd25bed4f56255549090056c1131fad0277",
   "urls": ["https://go.microsoft.com/fwlink/?linkid=837858"]
}

We need to change the mediaType as below and remove the urls field.

{
   "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
   "size": 252691002,
   "digest": "sha256:bce2fbc256ea437a87dadac2f69aabd25bed4f56255549090056c1131fad0277"
}

As a final step, restart the docker service:

Restart-Service docker

After all those steps, we are ready to push our own image, which we created just by tagging the official nanoserver image… and voila!!

The good side is that we now have the ability to push this image to any repository regardless of its location, including Harbor. The downside is that Microsoft tends to patch its base images every month, so we need to repeat this procedure and recreate our images from time to time.

How to verify that data is really pushed to Harbor

If you have a suspicious mind like me and cannot help wondering whether the data is "really" pushed to the repository, it does no harm to verify that the data is actually there. All this task requires is the hash of the image manifest that was uploaded along with the layers, which we can easily get from the docker push command output. In our case, that is 2261d13476c671ba182e117001f1bc6ff7c0aa188c8225e6fa5bf0cddebce561.
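
If the push output has already scrolled away, the same digest is recorded on the client after a successful push; a quick way to read it back (a hedged example, assuming the tagged image is still present on the client):

docker images --digests sddcvic.domain.sddc/windows/nanoserver
docker inspect --format "{{.RepoDigests}}" sddcvic.domain.sddc/windows/nanoserver:3.0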

On Harbor, the blobs live under the /data/harbor/registry/docker/registry/v2/blobs/ directory, but once again we have to find the right subdirectories, which are (surprise!!) based on hashes. So we start with the hash of the image manifest and read the data inside. Do not forget to add the first two characters of the hash to the full path, as below.

cat /data/harbor/registry/docker/registry/v2/blobs/sha256/22/2261d13476c671ba182e117001f1bc6ff7c0aa188c8225e6fa5bf0cddebce561/data
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
   "config": {
      "mediaType": "application/vnd.docker.container.image.v1+json",
      "size": 774,
      "digest": "sha256:6c367cf4cb9815b10e47545dc9539ee4bd5cd0f8697d33f4d9cb1e1850546403"
   },
   "layers": [
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 259260404,
         "digest": "sha256:35aba4b22d486db55a401eebdef3ffa69d00539497b83d9f4b62e3582cb4ced7"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 125427565,
         "digest": "sha256:e1870ee27293b87c78bf45e078f94acd29c44a0a7963c91932c5c31dfbeeb510"
      }
   ]
}

Now we have a bit more information about the layers, such as the MIME type of the referenced object, the size and the digest of the content. With the help of the layers' digest values, we can locate the blobs and verify that the data was actually uploaded, consumes storage on disk as expected and is ready to be pulled upon request.

root@sddcvic [ / ]# ls -lh /data/harbor/registry/docker/registry/v2/blobs/sha256/35/35aba4b22d486db55a401eebdef3ffa69d00539497b83d9f4b62e3582cb4ced7
total 248M
-rw-r--r-- 1 root root 248M May 31 10:04 data
root@sddcvic [ / ]# ls -lh /data/harbor/registry/docker/registry/v2/blobs/sha256/e1/e1870ee27293b87c78bf45e078f94acd29c44a0a7963c91932c5c31dfbeeb510
total 120M
-rw-r--r-- 1 root root 120M May 31 10:03 data
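
As a final sanity check, Docker Registry v2 stores blobs content-addressed by their SHA-256 hash, so the checksum of each data file should match the digest in its path and in the manifest:

sha256sum /data/harbor/registry/docker/registry/v2/blobs/sha256/35/35aba4b22d486db55a401eebdef3ffa69d00539497b83d9f4b62e3582cb4ced7/data
sha256sum /data/harbor/registry/docker/registry/v2/blobs/sha256/e1/e1870ee27293b87c78bf45e078f94acd29c44a0a7963c91932c5c31dfbeeb510/data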

Now considering that everything is in place, I can have my cup of coffee and enjoy pulling and pushing Windows container images as a whole 🙂

Harbor, an enterprise class registry server

With the GA release of vSphere Integrated Containers (aka VIC) in December, VMware also announced an enterprise-class registry server based on the open source Docker Distribution that allows us to store and distribute container images in our own datacenter. While the VIC Engine (the core component of VIC) is great at providing a container runtime for vSphere, enterprise customers governed by strict regulations need a more private solution for storing and managing container images than the default cloud-based registry service from Docker. Docker already has a commercial offering, Docker Trusted Registry, but the need for enterprise-grade capabilities such as increased control and security, identity integration and management triggered this open source project, Project Harbor.

Project Harbor extends the open source Docker Distribution and adds the following features:

  • Role based access control: Users and repositories are well organized and a user can have different permissions for images under a project.
  • Policy based image replication: Images can be replicated (synchronized) between multiple registry instances.
  • LDAP/AD support: Harbor integrates with existing enterprise LDAP/AD for user authentication and management.
  • Image deletion & garbage collection: Images can be deleted and their space can be recycled.
  • Graphical user portal: Users can easily browse, search repositories and manage projects.
  • Auditing: All the operations to the repositories are tracked.
  • RESTful API: RESTful APIs for most administrative operations, easy to integrate with external systems.
  • Easy deployment: Provides both online and offline installers. In addition, a virtual appliance (OVA) for the vSphere platform is available.

The binary required to install Harbor as a virtual appliance can be downloaded from the My VMware portal. If we are not eligible to get it through the portal, we can always visit the GitHub page of the project for the manual installation packages and the OVA file.

Let's assume that we have decided to use the OVA in the name of ease of deployment. There are many options we need to provide, including the regular virtual appliance settings (hostname, datastore, IP configuration) and application-specific settings (passwords, SMTP configuration and a few Harbor-specific options). The most important one is the authentication mode that Harbor will be configured with: it can be either LDAP authentication or database authentication, and it cannot be changed after the installation; if we need to switch, we have to deploy a fresh instance. LDAP authentication is a good practice because it saves us from managing custom users within the database and is more secure. Below are the options that we need to provide (please adjust them to your own domain; a quick way to validate them follows the list).

  • Authentication mode: ldap_auth
  • LDAP URL: ldap://DomainController.demo.local
  • LDAP Search DN: CN=User2MakeQueries,OU=Users,DC=demo,DC=local
  • LDAP Search Password: Password of the above user
  • LDAP Base DN: OU=OrganizationUnitInWhichUsersWillBeQueried,DC=demo,DC=local
  • LDAP UID: The attribute used to match users at login, such as uid, cn, email, sAMAccountName or any other attribute.
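
Before deploying the appliance, it can save a redeployment to validate the search DN, its password and the base DN with a quick ldapsearch from any Linux box (a sketch; someuser is a placeholder and the filter attribute should match the LDAP UID we plan to use):

ldapsearch -x -H ldap://DomainController.demo.local \
    -D "CN=User2MakeQueries,OU=Users,DC=demo,DC=local" -W \
    -b "OU=OrganizationUnitInWhichUsersWillBeQueried,DC=demo,DC=local" \
    "(sAMAccountName=someuser)" dn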

After the installation, it is highly recommended to change the admin password of Harbor that we provided during the deployment phase, because it persists in the configuration file (/harbor/harbor/harbor.cfg) in plain text.

It takes a few minutes for the installation to complete. If we set the "Permit Root Login" option to true during the deployment phase, we can connect to the server via SSH with root credentials and start to play around. The underlying operating system is Photon OS, and the sub-components of Harbor actually run as containers. When we run docker ps -a, all those running containers come to light (see the formatted example after the list below). Harbor consists of six containers composed by docker-compose:

  • Proxy: This is an NGINX reverse proxy. It forwards requests from browsers and Docker clients to backend services such as the registry service and the core services.
  • Registry: This registry service is based on Docker Registry 2.5 and is responsible for storing Docker images and processing Docker push/pull commands.
  • Core Services: Harbor’s core functions, which mainly provides the following services:
    • UI: A graphical user interface to help users manage images on the Registry
    • Webhook: Webhook is a mechanism configured in the Registry so that image status changes in the Registry can be populated to the Webhook endpoint of Harbor. Harbor uses webhook to update logs, initiate replications, and some other functions.
    • Token service: Responsible for issuing a token for every docker push/pull command according to a user’s role of a project. If there is no token in a request sent from a Docker client, the Registry will redirect the request to the token service.
  • Database: Derived from the official MySQL image, it is responsible for storing the metadata of projects, users, roles, replication policies and images.
  • Job Services: This service is responsible for image replication to other Harbor instances (if there are any).
  • Log Collector: This service is responsible for collecting logs of other modules in a single place.
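
A quick way to see all six of them at once on the appliance (container names may differ slightly between Harbor versions):

docker ps -a --format "table {{.Names}}\t{{.Image}}\t{{.Status}}"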

And this is what it looks like from an architectural point of view:

All the blue boxes shown in the diagram are running as containers. If we would like to know more about those containers and how they are configured, we can always run docker inspect commands on them.


docker inspect nginx
docker inspect harbor-jobservice
docker inspect harbor-db
docker inspect registry
docker inspect harbor-ui
docker inspect harbor-log

As a result of the inspect commands, the thing that catches my attention is the persistent volume configuration. By their nature, containers are immutable and disposable, so in order to keep the data and the service configurations persistent, Harbor takes advantage of volume mounts between the docker host and the containers. These mounts are also where we can modify the configuration of the services and replace the UI certificate.

This is the list of the volume mounts (sources and destinations) used in all containers.
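
The same mapping can also be pulled straight out of a container with a Go template, for example for the registry container (standard docker inspect syntax; use any of the container names listed above):

docker inspect -f '{{range .Mounts}}{{.Source}} -> {{.Destination}}{{println}}{{end}}' registry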

We can now enjoy pushing images to our brand new on-prem registry server.

Introduction to Photon OS

Since the cloud native landscape has been widely embraced by developers and the open source community, we are witnessing increasing momentum behind container runtimes. The center of attention is shifting from virtual machines to containers, and running microservices on minimalistic operating systems is becoming mainstream (ok, not that fast, but we will definitely get there). The increasing popularity of minimal operating systems such as CoreOS, RancherOS, Red Hat Atomic or even Windows Nano Server has pushed VMware to build a lightweight operating system optimized to run containers, in support of its cloud native strategy.

VMware introduced Photon OS as an open source project in April 2015 and released the first generally available version (v1.0) in June 2016. Briefly, Photon OS is a minimal Linux container host designed to have a small footprint and optimized for VMware platforms. With the 1.0 release, the library of packages within Photon OS was greatly expanded, making Photon OS more broadly applicable to a range of use cases while keeping both the disk and memory footprints extremely small (kernel boot times are around 200ms, with a 384MB memory footprint and 396MB on disk for a minimal installation).

Photon OS is compatible with container runtimes such as Docker, rkt and Garden (Pivotal), and with container scheduling frameworks like Kubernetes. It contains a new, open source, yum-compatible package manager (TDNF, Tiny Dandified Yum) that keeps the system as small as possible while preserving robust yum package management capabilities.

P.S. Alongside its lightweight nature, we are already starting to see Photon OS used for more than just running containers, for example as the base of VMware's own virtual appliances. vCenter Server Appliance 6.5 runs on Photon OS, and I expect more virtual appliances to follow vCSA in the near future.

Now, let's see how we can spin up a Photon OS based virtual machine. The first thing we need to do is download the binaries from GitHub. Photon OS is available in a few different pre-packaged binary formats, but two of them make the most sense for us:

  • ISO Image: The full ISO contains everything we need to install either the minimal or full installation of Photon OS. The bootable ISO has a manual installer or can be used with PXE/kickstart environments for automated installations.
  • OVA format: The OVA is a pre-installed minimal environment. Since the OVA is a complete virtual machine definition, it comes in two versions, hardware version 10 and 11, which allows compatibility with several versions of VMware platforms.

Although deploying Photon OS from the OVA is a valid and easy option, I will go with the ISO method so that we can walk through the installation steps. Now, as step 2 (step 1 was to get the binaries), we need to create a fresh virtual machine. These are the values that I used:

  • Disk space: 8 GB
  • Compatibility: Hardware Level 11 (ESXi 6.0)
  • Guest OS version: Other 3.x or later Linux (64-bit)
  • vCPU: 1
  • vRAM: 512 MB

Now, mount the ISO to the CD-ROM drive, make sure that the "Connect at power on" checkbox is selected, and power the VM on. The installation is straightforward: we pass through the welcome page, licensing agreement and disk formatting sections. Probably the most important choice is the Photon OS installation type that we want to install.

Each install option provides a different runtime environment:

  • Photon Minimal: It is a very lightweight version of the container host runtime that is best suited for container management and hosting. There is sufficient packaging and functionality to allow most common operations around modifying existing containers, as well as being a highly performant and full-featured runtime.
  • Photon Full: Photon Full includes several additional packages to enhance the authoring and packaging of containerized applications and system customization. It’s better to use Photon Full for developing and packaging the application that will be run as a container, as well as authoring the container itself.
  • Photon OSTree Host: This installation profile creates a Photon OS instance that will source its packages from a central rpm-ostree server and continue to have the library and state of packages managed by the definition that is maintained on the central rpm-ostree server.
  • Photon OSTree Server: This installation profile will create the server instance that will host the filesystem tree and managed definitions for rpm-ostree managed hosts created with the Photon OSTree Host installation profile.

After selecting the installation type, we are prompted for a hostname. The installer suggests a randomly generated hostname; we can enter our own right now, or we can always modify it after the installation with the hostnamectl command. Lastly, we enter the root password and the installation starts. In my lab environment, it took 157 seconds to complete the full installation.

Voila, we have an up and running Photon OS server. Photon OS is pre-configured to obtain its IP address via DHCP, but if we don't have a DHCP server in our environment, we have to configure the address manually.

Network Configuration:

The network service, which is enabled by default, starts when the system boots. We manage it with systemd components such as the systemd-networkd and systemd-resolved services and the networkctl command. The network configuration is based on .network files that live in the /etc/systemd/network/ and /usr/lib/systemd/network folders. By default, when Photon OS starts, it creates a DHCP network configuration file, but we are free to add our own configuration files. Photon OS applies the configuration files in the alphabetical order of their file names; once an interface has been matched in one file, it is ignored in files processed later in that order. So, to set a static IP address, we (a minimal example follows the list):

  • create a configuration file with .network extension
  • place it in the /etc/systemd/network directory
  • set the file’s mode bits to 644
  • restart the systemd-networkd service
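
For example, a minimal static configuration could look like this (a sketch assuming the interface is eth0 and reusing addressing from my lab; adjust the values to your own network):

# the 10- prefix sorts before the default DHCP .network file, so this configuration wins for eth0
cat > /etc/systemd/network/10-static-eth0.network << "EOF"
[Match]
Name=eth0
[Network]
Address=10.10.100.50/24
Gateway=10.10.100.1
DNS=10.10.100.10
EOF
chmod 644 /etc/systemd/network/10-static-eth0.network
systemctl restart systemd-networkd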

Please refer to this guide for further instructions on systemd.network service.

SSH Configuration:

The default iptables policy accepts SSH connections, but the sshd configuration file on the full version of Photon OS is set to reject root login over SSH. To permit root login over SSH, we need to open /etc/ssh/sshd_config with the vim text editor, set PermitRootLogin to yes and restart the SSH daemon.
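
If we prefer to make the change in one shot from the console, something like this works (a sketch; it assumes the stock sshd_config shipped with the full installation):

sed -i "s/^#\?PermitRootLogin.*/PermitRootLogin yes/" /etc/ssh/sshd_config
systemctl restart sshd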

Enable Docker:

Among all these configuration steps, I have almost forgotten why we are doing this: the ultimate objective is to run containers, right? The full version of Photon OS includes the open source version of Docker (v1.12.1), but it is not started by default, so we need to start the daemon and enable it.

  • Start docker: systemctl start docker
  • Enable docker: systemctl enable docker

Now, we are ready to run docker commands and spin up some containers.
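
And a quick smoke test to confirm that the daemon is healthy and can pull and run images (it needs access to Docker Hub or another reachable registry):

systemctl status docker --no-pager
docker run --rm hello-world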
