Docker Swarm Limitations

As I continue toward my goal of using a container orchestration tool to scale this website behind a load balancer, I'm learning all kinds of things about the pitfalls involved with scalable systems. I thought Docker Swarm would be great for this since it is relatively straightforward to set up, but I've discovered it has a few limitations.

First of all, it has no mechanism to scale the container hosts themselves. I started down the path of scripting that and was fairly successful at adding hosts to a swarm, but then I learned of another limitation: it doesn't seem to have any way to rebalance containers among swarm workers after the initial start of a service. That means in a two-worker-node configuration you could have two container instances running on the same node. I know there are health checks that would in essence 'heal' your application, but it seems silly to have an unused server out there.

A smaller issue was that persistent volumes wouldn't update quite right, even when you created them inline. If there was already a volume with the same name on one of the worker nodes, Swarm would give no errors and silently use that volume's settings and paths on that node. That one was difficult to troubleshoot.
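One workaround I've seen for the rebalancing gap is to force a service update after new workers join; the scheduler then re-places the tasks across the available nodes. A minimal sketch, assuming a manager node and an example service name of "web" (not one of my actual services):

```shell
# Force the scheduler to redeploy all of the service's tasks.
# Note: this restarts the containers, so expect a brief disruption.
docker service update --force web

# Check where the replicas landed afterwards.
docker service ps web
```

It's blunt since it cycles the containers, but it does spread the replicas back out.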

Now I move toward the popular alternative, Kubernetes. Given its popularity right now, it makes sense for me to figure out how it works. I've already stood up some basic services on a hosted cluster; setting it up from scratch seems like an extra challenge. Thinking about my goals lately, I wonder if I want to support this website on a complex setup long term. I want to learn it and know it, but it may not make financial and logistical sense. So my plan now is to set it up and see what making changes is like; otherwise, simple one-server setups are in my future for my personal assets.

Docker Complexity

As I continue my plan to adopt Docker for most of my personal infrastructure, including this website, I am learning a valuable lesson: Docker works best with simple images. I imagine there are some complex images out there that work fine, but I believe it's best to avoid those situations. Supporting changes and updates, and the way components interact with each other, just adds to the layers. For instance, I'm a big fan of CentOS. I know it sacrifices running the latest software for dependability. This caused me issues using it in a Docker image with Apache and PHP 7.2: CentOS doesn't natively support PHP 7.2, so I started down the path of workarounds. I then realized it would be really simple to use a base image that has native support. In the end I accomplished easily with Ubuntu what was becoming more and more complex on CentOS. As an aside, I do hope CentOS 8 is out soon so it can take advantage of some of the newer versions.

Here’s my Dockerfile. I know I need to optimize my commands to reduce the layers, but it works fine:

FROM ubuntu:latest
ARG DEBIAN_FRONTEND=noninteractive
RUN apt update
RUN apt upgrade -y
RUN apt install apache2 php libapache2-mod-php php-mysql -y

EXPOSE 80

CMD ["/usr/sbin/apache2ctl", "-DFOREGROUND"]
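For anyone following along, this is roughly how an image like the one above gets built and smoke-tested locally. The tag "php-web" and host port 8080 are just example values I'm using here, not anything from my actual setup:

```shell
# Build the image from the Dockerfile in the current directory.
docker build -t php-web .

# Run it detached, mapping the exposed container port 80 to host port 8080.
docker run -d -p 8080:80 --name php-web-test php-web

# Quick smoke test against the default Apache page.
curl -I http://localhost:8080/
```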

Docker orchestration

As I work towards my goal of making this website highly available, I've hit a couple of roadblocks. As always in IT, it is important to work through these and treat them as learning experiences. Sometimes, late at night, when I encounter one of these it's tough for me to go to bed, even though I know a rested mind the next day would help me work through the problem. I've been hitting quite a few blocks in my quest lately.

First I always wanted to give Kubernetes a try. I chose a hosted platform on Digital Ocean to help me learn about it. I’m planning on talking more about it later but it gave me a great insight into how it encompasses more than just compute orchestration and covers much more of IT infrastructure. Right now I don’t think I’m ready to tackle using it for my personal infrastructure.

For now my plan is to use Docker Swarm. It offers application healing through container regeneration and scaling of how many container instances are running at a given time. My biggest roadblock has been storage. WordPress web content is dynamic; things like media attachments live on the web server. If I want to load balance across containers, I need the source content to be identical everywhere. NFS seemed like the easiest answer.
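For context, the server side of an NFS setup like this is fairly small. A sketch, assuming a Linux NFS server at the 192.168.79.181 address used later and a hypothetical export path; the subnet and mount options would need adjusting for your environment:

```shell
# Create the directory to export (example path, not my exact one).
sudo mkdir -p /storage/nfs/nginxhttp

# Export it to the swarm's subnet; rw + sync is a safe starting point.
echo '/storage/nfs/nginxhttp 192.168.79.0/24(rw,sync,no_subtree_check)' \
  | sudo tee -a /etc/exports
sudo exportfs -ra

# From a worker node, confirm the export is visible.
showmount -e 192.168.79.181
```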

I tried to create a Docker volume on the manager swarm node based on the NFS path and present it to the Docker service, but it didn't seem to work right: it mounted an empty directory every time. I troubleshot it from an NFS standpoint, but I think I lacked an understanding of Docker Swarm. There is a lot of ambiguity out there, and I have to preface this by saying I may have done something wrong, but here's what I used to create the service with the NFS share and fix it:

docker service create \
   --name nginx-test \
   --replicas 2 \
   --mount 'type=volume,src=nginx-test,volume-driver=local,dst=/usr/share/nginx/html,volume-opt=type=nfs,volume-opt=device=:/storage/nfs/nginxhttp,volume-opt=o=addr=192.168.79.181' \
   --constraint 'node.role != manager' \
   --publish 80:80 \
   nginx:latest

My understanding of the above is that I created the volume inline with the service. I wish I could find something definitive that confirms my suspicion, but I think it needs to be inline so each replica knows where to get the source from. I hope this helps someone, because I couldn't find much out there.
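If you want to verify that each replica actually picked up the NFS mount rather than an empty local volume, these checks (a sketch, run against the service as named above) are what I'd reach for:

```shell
# See which nodes the replicas were scheduled on.
docker service ps nginx-test

# On a worker node, grab one of the service's containers and dump its
# mounts, which should show the NFS device and the html destination.
docker ps --filter name=nginx-test -q | head -n 1 \
  | xargs docker inspect --format '{{json .Mounts}}'
```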

My Goals

With my current enthusiasm for Docker, I've recently been thinking about moving my websites (this one included) to a high availability cluster. Now, I could containerize WordPress fairly easily. The complication comes with horizontal scaling: WordPress files are dynamic because media assets are stored on the web server. There are several ways to overcome this, but my favorite plan has been to have shared storage serve the web content to multiple web servers. The problem is choosing which kind; something like Gluster takes multiple nodes, and a straight NFS server is a single point of failure. Also, do I use a managed load balancer or stand up something like HAProxy? I worry that many of these plans involve either single points of failure or unnecessary expense. In the end I may just leave the websites be and stand up a lab for temporary use so I don't run up a crazy bill.

On the plus side, my monitoring has improved: I can alert easily on log file patterns and performance health. I'd like to revamp uptime monitoring, but simple may be better there. Right now I think I'd just like to get comfortable using Docker Swarm and deploying services to it.

System board replacement

This week I replaced a system board in an HPE DL560 Gen10. The picture doesn't capture it, but this model is quite a bit more involved when it comes to removing the old board: there are chassis components that have to be unscrewed and removed to gain access. Thankfully I didn't lose any screws, and the replacement board booted right up.

Finally Graylog is working

After fighting with Docker, Graylog, Nginx, Elasticsearch, Linux permissions, and Java memory limitations, I finally got Graylog working. It is aggregating a couple of servers' data. Next, I think I need to stand up an email server so it can send out email alerts. Part of my problem was that my VM didn't have enough memory and Elasticsearch kept crashing; of course it didn't really give any good logs to indicate that, but I think Docker or the system was killing the container before it consumed all the host's memory. I also learned about file locks in Docker volumes and various workarounds. I'll return to my monitoring/scaling script and configure logging so I get alerts when conditions are met or it takes an action.
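The memory fix for Elasticsearch came down to capping its JVM heap so the container stays inside the host's budget. A sketch of the idea, assuming the official single-node 6.x image; the 512m figure and version tag are example values, not necessarily what I ran:

```shell
# Cap the Elasticsearch JVM heap (min and max set equal, as recommended)
# so the container can't grow until the OOM killer takes it down.
docker run -d --name elasticsearch \
  -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
  -e "discovery.type=single-node" \
  -p 9200:9200 \
  docker.elastic.co/elasticsearch/elasticsearch:6.8.23
```

When the heap is capped below the host's free memory, a crash shows up as a Java OutOfMemoryError in the container logs instead of a silent kill, which is much easier to diagnose.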