Improved Website

The web server configuration has changed yet again. It used to be highly available, with scaling policies and everything, but that came at a cost. First of all it carried a price tag: I was using AWS EFS to distribute the files to multiple servers. EFS is a great product, but for my use it was also slow. The time to first byte was several seconds. I tried to improve this with memcached and NFS caching utilities, but the lowest I got it to was 2.5 seconds. Now it is down to half a second and feels much snappier. The site is on a single server now, with EBS as the backing drive. I’m still running memcached on a separate instance through AWS’s ElastiCache offering. I’m really tempted to stand up an instance on Vultr or DigitalOcean to see how the speeds compare. I will be sure to post the results.

Moved to AWS!

This website is now being hosted on a cluster of servers within AWS. I definitely use this website as an excuse to learn more about infrastructure. I’ll post more about it soon, but currently we are fault tolerant. We are using two EC2 instances running Apache, EFS holding the files, and an RDS instance running the database. This sits behind an Application Load Balancer, and the JavaScript and pictures are served over CloudFront. It’s a little slow for me, though, and I’m trying to figure out why. If you aren’t logged in you’ll have a better experience. I’m gathering metrics on load times to help see what the story is.

Script Race: Azure vs AWS

Let us compare the data I collected from the two scripts I wrote for AWS and Azure. Each script accomplishes the same things:

  • Deploy two Windows Server VMs from the provider’s official image repository
  • Deploy one internet-facing load balancer with the two servers behind it on port 80
  • Use the provider’s built-in orchestration method to install IIS and place a simple webpage in the root web directory
  • Validate the website is being served over the internet through the Load Balancer

Here is a comparison of the common tasks:

Here is a table of all the data:

Azure

Task                                    Elapsed Seconds  Duration in Seconds  Duration in Minutes
Script Start                            0                                     0.00
Create Load Balancer                    14               14                   0.23
Create VM1                              176              162                  2.70
Install IIS On VM1                      816              640                  10.67
Deploy Website on VM1                   878              62                   1.03
Add VM1 to LoadBalancer                 954              76                   1.27
Create VM2                              1115             161                  2.68
Install IIS On VM2                      1485             370                  6.17
Deploy Website on VM2                   1547             62                   1.03
Add VM2 to LoadBalancer                 1601             54                   0.90
Website Available                       1602             1                    0.02
Script Complete (Total)                 1602             1602                 26.70

AWS

Task                                    Elapsed Seconds  Duration in Seconds  Duration in Minutes
Script Start                            0                                     0.00
Create Load Balancer                    8                8                    0.13
Create VM1                              15               7                    0.12
Create VM2                              21               6                    0.10
Assign SSM IAM Role on VM1              37               16                   0.27
Assign SSM IAM Role on VM2              41               4                    0.07
Deploy System Management Agent on VM1   167              126                  2.10
Deploy System Management Agent on VM2   170              129                  2.15
Execute Install IIS & Website On VM1    172              2                    0.03
Execute Install IIS & Website On VM2    173              1                    0.02
Add VM1 to LoadBalancer                 175              2                    0.03
Add VM2 to LoadBalancer                 177              2                    0.03
Website Available                       260              83                   1.38
Script Complete (Total)                 260              260                  4.30

Certain tasks in AWS do not wait for their execution to complete. Checks were added in the script and the duration column indicates which were essentially run in parallel.

As the data above shows, AWS is much faster in terms of automation: the whole run finished in 260 seconds, against 1,602 seconds (almost 27 minutes) on Azure.
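
The duration columns can be rederived from the elapsed-time readings. The sketch below, written in Python rather than the PowerShell used for the scripts, takes the Azure elapsed seconds from the table above (the Azure steps ran strictly one after another, so each duration is simply the difference from the previous reading) and compares the two totals. The function and variable names are only illustrative.

```python
# Elapsed-seconds readings copied from the Azure table above.
AZURE_ELAPSED = [
    ("Create Load Balancer", 14),
    ("Create VM1", 176),
    ("Install IIS On VM1", 816),
    ("Deploy Website on VM1", 878),
    ("Add VM1 to LoadBalancer", 954),
    ("Create VM2", 1115),
    ("Install IIS On VM2", 1485),
    ("Deploy Website on VM2", 1547),
    ("Add VM2 to LoadBalancer", 1601),
    ("Website Available", 1602),
]
AWS_TOTAL_SECONDS = 260  # "Script Complete (Total)" row of the AWS table


def step_durations(rows):
    """Turn cumulative elapsed readings into per-step durations."""
    durations = []
    previous = 0
    for task, elapsed in rows:
        durations.append((task, elapsed - previous))
        previous = elapsed
    return durations


def speedup(slow_seconds, fast_seconds):
    """How many times longer the slow run took, rounded to one decimal."""
    return round(slow_seconds / fast_seconds, 1)


if __name__ == "__main__":
    for task, seconds in step_durations(AZURE_ELAPSED):
        print(f"{task:<26}{seconds:>5} s {seconds / 60:>6.2f} min")
    azure_total = AZURE_ELAPSED[-1][1]
    print(f"Azure took {speedup(azure_total, AWS_TOTAL_SECONDS)}x as long as AWS")
```

The same subtraction does not work for every AWS row, because several AWS steps overlapped; that is why the AWS durations in the table were measured with explicit checks instead.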

Infrastructure as Code: AWS Edition

As a follow-on to my Azure script that deploys a cluster of two load-balanced Windows servers, installs IIS, and deploys a website, I created a similar script to do the same in AWS. A few things of note that I feel make the AWS script better:

  • Certain tasks are non-blocking and do not wait for the action to complete. I added wait states in the script to make sure the time comparisons are fair.
  • AWS actions are much faster. In my scripts it takes Azure an average of 65 seconds to add a VM to a load balancer, whereas in AWS it’s an average of 2 seconds.
  • AWS’s CLI accepts multiple instance IDs per command, which increases efficiency. My script doesn’t really take advantage of this, which keeps the comparison fairer, since I don’t think Azure’s CLI or PowerShell module allows it.
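
The wait states mentioned in the first bullet all follow the same polling pattern: fire the non-blocking call, then check on a fixed interval until the resource reports ready. A rough Python sketch of that pattern follows; `wait_until` and its parameters are hypothetical names for illustration, not part of the AWS CLI or of the script below.

```python
import time


def wait_until(is_ready, interval_seconds=5.0, timeout_seconds=600.0, sleep=time.sleep):
    """Poll is_ready() until it returns True; return the number of checks made.

    Raises TimeoutError if the condition is still false once timeout_seconds
    have passed.
    """
    deadline = time.monotonic() + timeout_seconds
    checks = 0
    while True:
        checks += 1
        if is_ready():
            return checks
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not ready after {checks} checks")
        sleep(interval_seconds)


if __name__ == "__main__":
    # Stand-in for an instance that reports "running" on the third check.
    states = iter(["pending", "pending", "running"])
    checks = wait_until(lambda: next(states) == "running", interval_seconds=0.01)
    print(f"ready after {checks} checks")
```

The timeout is an addition of this sketch; the loops in the script keep polling until the condition is met.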

Here is the script:

#This script creates a number of Windows VMs, installs IIS, a simple webpage, and places them behind a load balancer
#run this if needed
#aws configure
function elapsedTime {
    $CurrentTime = $(get-date)
    $elapsedTime = $CurrentTime - $StartTime
    $elapsedTime = [math]::Round($elapsedTime.TotalSeconds,2)
    Write-Host "Elapsed time in seconds: " $elapsedTime -BackgroundColor Blue
}
#Captures start time for script elapsed time measurement
$StartTime = $(get-date)

#Sets "Constants" to be used to create VMs
$imageID = "ami-0182e552fba672768" #Amazon's provided windows 2019 datacenter base
$subnet = "subnet-000000000" #my subnet for us-east-2a
$securityGroup = "sg-00000000" #My network security baseline
$instanceType = "t2.medium" #Instance size, 2 vCPUs, 4 GB RAM
$keyPair = "ServerHobbyist" #Keypair to retrieve administrator password
$instanceName = "WebWinApp" #sets base name
$class = "disposable" #sets class tag as disposable for easier identification and cleanup

#creates load balancer
$lbName = "WinWebLB1"
Write-Host "Creating Loadbalancer $($LbName)"
aws elb create-load-balancer `
    --load-balancer-name $lbName `
    --listeners "Protocol=HTTP,LoadBalancerPort=80,InstanceProtocol=HTTP,InstancePort=80" `
    --subnets $subnet `
    --security-groups $securityGroup
aws elb add-tags --load-balancer-name $lbName --tags Key=Class,Value=$class #tags elb with disposable class
aws elb configure-health-check --load-balancer-name $lbName --health-check Target=HTTP:80/,Interval=5,UnhealthyThreshold=2,HealthyThreshold=2,Timeout=3 #sets a lower threshold for health checks
Write-Host "Load Balancer $($Lbname) created"
elapsedTime


$serverCount = 2 #how many VMs to deploy
$instancesDeployed =  New-Object System.Collections.Generic.List[System.Object] #creates array list that will contain instance IDs deployed
for ($i=1; $i -le $serverCount; $i++){
    
    $instanceNameTag = $instanceName + $i
    Write-Host "Creating VM $($instanceNameTag)"
    $instance = aws ec2 run-instances `
        --image-id $imageID `
        --count 1 `
        --instance-type $instanceType `
        --key-name $keyPair `
        --security-group-ids $securityGroup `
        --subnet-id $subnet | ConvertFrom-Json
    aws ec2 create-tags --resources $instance.instances.InstanceId --tags Key=Name,Value=$instanceNameTag #tags instance with name
    aws ec2 create-tags --resources $instance.instances.InstanceId --tags Key=Class,Value=$class #tags instance with class

    $instancesDeployed.Add($instance.Instances.InstanceId)
    Write-Host "VM $($instanceNameTag) created"
    elapsedTime
}
Start-Sleep -Seconds 15

#Checks to make sure each instance deployed from above is in a running state, otherwise it can't receive the IAM role.
foreach ($instanceDeployed in $instancesDeployed){
    $instance = aws ec2 describe-instances --instance-ids $instanceDeployed | ConvertFrom-Json
    $InstanceTags = $Instance.Reservations.Instances.Tags
    $InstanceName = $InstanceTags | Where-Object {$_.Key -eq "Name"}
    $InstanceName = $InstanceName.Value
    Write-Host "Checking if instance $($InstanceName)  is ready to receive IAM role for SSM"
    while ($instance.Reservations.Instances.State.Name -ne "running") {
        Write-Host "Instance $($InstanceName) not ready. Waiting to check again"
        Start-Sleep -Seconds 5
        Write-Host "Checking if instance $($InstanceName) is ready to receive IAM role for SSM"
        $instance = aws ec2 describe-instances --instance-ids $instanceDeployed | ConvertFrom-Json
    }
    Write-Host "Instance $($InstanceName) is now ready, assigning role"
    elapsedTime
    aws ec2 associate-iam-instance-profile --instance-id $instanceDeployed --iam-instance-profile Name=AmazonSSMRoleForInstancesQuickSetup
    
}
Start-Sleep -Seconds 15
#Checks to make sure each instance deployed from above has the SSM agent. Otherwise commands can't be sent through AWS's orchestration system
foreach ($instanceDeployed in $instancesDeployed){
    $instance = aws ec2 describe-instances --instance-ids $instanceDeployed | ConvertFrom-Json
    $InstanceTags = $Instance.Reservations.Instances.Tags
    $InstanceName = $InstanceTags | Where-Object {$_.Key -eq "Name"}
    $InstanceName = $InstanceName.Value
    Write-Host "Checking if instance $($InstanceName) has received the system management agent"
    $ssmTest = aws ssm list-inventory-entries --instance-id $instanceDeployed --type-name "AWS:InstanceInformation" | ConvertFrom-Json
    while ($ssmTest.Entries.AgentType -ne "amazon-ssm-agent"){
        $ssmTest = aws ssm list-inventory-entries --instance-id $instanceDeployed --type-name "AWS:InstanceInformation" | ConvertFrom-Json
        Write-Host "Instance $($InstanceName) has not yet received the SSM agent"
        start-sleep -Seconds 5
    }
    Write-Host "Instance $($InstanceName) has received the SSM agent. Proceeding to next instance or step."
    elapsedTime
}

#Installs IIS and deploys website
foreach ($instanceDeployed in $instancesDeployed){
    Write-Host "Sending command to install IIS and deploy website on $($instanceDeployed)"
    aws ssm send-command `
        --document-name "AWS-RunPowerShellScript" `
        --parameters commands=['Add-WindowsFeature Web-Server; Invoke-WebRequest -Uri "https://serverhobbyist.com/deployment/index.html" -OutFile "c:\inetpub\wwwroot\index.html"'] `
        --targets "Key=instanceids,Values=$($instanceDeployed)" `
        --comment "Installs IIS"
    Write-Host "Command sent to $($instanceDeployed)"
    elapsedTime
}

#adds VMs to load balancer
foreach ($instanceDeployed in $instancesDeployed){
    Write-Host "Registering instance $($instanceDeployed) with LB"
    aws elb register-instances-with-load-balancer --load-balancer-name $lbName --instances $instanceDeployed #registers instance with load balancer
    Write-Host "Registered instance $($instanceDeployed) with LB"
    elapsedTime
}

Write-Host "Checking if website is ready to be served from load balancer"
$lbURL = aws elb describe-load-balancers --load-balancer-name $lbName | ConvertFrom-Json
$lbURL = "http://" + $lbUrl.LoadBalancerDescriptions.CanonicalHostedZoneName
$check = $false
while (-not $check){
    try {
        #Only marks the check as passed if the request does not throw
        $result = Invoke-WebRequest -Uri $lbURL -UseBasicParsing -TimeoutSec 20
        $check = $true
    }
    catch {
        Write-Host "Website failed to load. Trying again"
        Start-Sleep -Seconds 5
    }
}
Write-Host "Website is now loading at $lbURL"
elapsedTime

Write-Host "Script completed" -BackgroundColor Blue
elapsedTime

Infrastructure as Code: Azure Edition

Since I’ve been using AWS as a hobbyist for about a decade, it is the public cloud I am most comfortable with. Lately, to expand my horizons, I’ve been learning about Microsoft’s take on it with Azure. I hope I’m not too biased, since it’s easier for me to favor AWS after using it for so long, but my initial take on Azure is not positive. It is slow. I’m working on a speed comparison between AWS and Azure, with the goal of standing up a two-node load-balanced IIS cluster on each. While I continue to work on a good comparison write-up, here is the code that deploys the Azure resources. It is written in PowerShell and includes a function that measures completion time.

Import-Module Az
function elapsedTime {
    $CurrentTime = $(get-date)
    $elapsedTime = $CurrentTime - $StartTime
    $elapsedTime = [math]::Round($elapsedTime.TotalSeconds,2)
    Write-Host "Elapsed time in seconds: " $elapsedTime -BackgroundColor Green
}
#Run this to connect to Azure account if needed
#Connect-AzAccount
#Captures start time for script elapsed time measurement
$StartTime = $(get-date)
#Sets "Constants" to be used throughout script
$resourceGroup = "DisposableLab"
$location = Get-AzLocation | Where-Object {$_.DisplayName -like "North Central US"}
$vnet = "vnet1"
$subnet = "default"
$securityGroup = "DisposableLabSecurityGroup"
$secpasswd = ConvertTo-SecureString "password" -AsPlainText -Force
$credential = New-Object System.Management.Automation.PSCredential ("username", $secpasswd)
$lbname = "WebAppWinLB"
$availSetName = "WinWebappAvailabilitySet"
#Creates Availability Set to allow both servers to be load balanced
New-AzAvailabilitySet `
   -Location $location.Location `
   -Name $availSetName `
   -ResourceGroupName $resourceGroup `
   -Sku aligned `
   -PlatformFaultDomainCount 2 `
   -PlatformUpdateDomainCount 2

$publicIp = New-AzPublicIpAddress -Name 'LB1PublicIP' -ResourceGroupName $resourceGroup -AllocationMethod Static -Location $location.Location
#sets up the inbound IP pool for the load balancer
$feip = New-AzLoadBalancerFrontendIpConfig -Name 'myFrontEndPool' -PublicIpAddress $publicIp
$bepool = New-AzLoadBalancerBackendAddressPoolConfig -Name 'myBackEndPool' 
#creates health check for load balancer
$probe = New-AzLoadBalancerProbeConfig `
 -Name 'myHealthProbe' `
 -Protocol Http -Port 80 `
 -RequestPath / -IntervalInSeconds 360 -ProbeCount 5
#creates load balancing rule
$rule = New-AzLoadBalancerRuleConfig `
  -Name 'webInbound' -Protocol Tcp `
  -Probe $probe -FrontendPort 80 -BackendPort 80 `
  -FrontendIpConfiguration $feip `
  -BackendAddressPool $bepool
#creates new LB from settings gathered so far
 $lb = New-AzLoadBalancer `
  -ResourceGroupName $ResourceGroup `
  -Name $lbname `
  -Location $location.Location `
  -FrontendIpConfiguration $feip `
  -BackendAddressPool $bepool `
  -Probe $probe `
  -LoadBalancingRule $rule 

elapsedTime

$serversCount = 2
for ($i=1; $i -le $serversCount; $i++) {
  elapsedTime
$VMName = "WebappWin" + $i
Write-Host "Creating VM $VMName"
#Looks up any existing VM and NIC with this name (the results are not used before the VM is created)
$VM = Get-AzVM -Name $VMName
$NIC = Get-AzNetworkInterface -Name $VMName
#creates new VM
New-AzVm `
    -Credential $credential `
    -ResourceGroupName $resourceGroup `
    -Name $VMName `
    -Location $location.Location `
    -VirtualNetworkName $vnet `
    -SubnetName $subnet `
    -SecurityGroupName $securityGroup `
    -PublicIpAddressName "$($VMName)PublicIP" `
    -AvailabilitySetName $availSetName
Write-Host "VM $($VMName) has been created"
elapsedTime
Write-Host "Installing IIS for $VMName"
$PublicSettings = '{"commandToExecute":"powershell Add-WindowsFeature Web-Server"}'
#Waits a few seconds for the VM to become available to receive the extension
Start-Sleep -Seconds 5
Set-AzVMExtension -ExtensionName "IIS" -ResourceGroupName $resourceGroup -VMName $vmName `
  -Publisher "Microsoft.Compute" -ExtensionType "CustomScriptExtension" -TypeHandlerVersion 1.4 `
  -SettingString $PublicSettings -Location $location.location
Write-Host "IIS Installed for $($VMName)"
elapsedTime
Write-Host "Deploying website for $VMName"

Invoke-AzVMRunCommand -ResourceGroupName $resourceGroup -VMName $VMName -CommandId "RunPowerShellScript" -ScriptPath "C:\pathhere\WebsiteTest\deployWebsite.ps1"
Write-Host "Website deployed on $($VMname)"
elapsedTime
#Gets load balancer object based on name
$lb = Get-AzLoadBalancer -Name $lbname
$backendConfig = Get-AzLoadBalancerBackendAddressPoolConfig -LoadBalancer $lb
#Gets NIC from virtual machine
$NIC = Get-AzNetworkInterface -Name $VMName
#Removes VM from LB
#$nic.Ipconfigurations[0].LoadBalancerBackendAddressPools=$null
#Adds VM to LB
Write-Host "Adding $($VMName) to Loadbalancer $lbname"
$nic.IpConfigurations[0].LoadBalancerBackendAddressPools=$lb.BackendAddressPools[0]
Set-AzNetworkInterface -NetworkInterface $nic
Write-Host "VM $($VMName) added to the load balancer"
elapsedTime
}

Write-Host "Script completed" -BackgroundColor Blue
elapsedTime

CI/CD

I learn best by doing. I had heard continuous integration and continuous deployment thrown around for a while, often as buzzwords. My current employer doesn’t really do either of these, though we do use Jenkins as a deployment server. I’ve never seen it in action, but from what I understand it has hooks into some of our Java container hosts like WebLogic. It copies files over and understands compilation errors. Again, I’ve never actually used it, so this is just my understanding from what I’ve heard at work. GitLab gives you 2000 CI minutes, which I had never used. As I was learning Go and pushing my code to GitLab, I started getting template suggestions for testing. The template provided needed a few edits, and then it started compiling my code and testing it for errors. It would place the built binary in a kind of hidden directory. It would do this on every push and took about 5 minutes. It used environment variables to keep credentials out of the config file.

Deployment was more difficult. I had trouble finding a good example out there on how to deploy my web app written in Go. Finally I found an alright example with a shell script that copied files. I augmented it with a service created on my web server.

Here’s my .gitlab-ci.yml:

# This file is a template, and might need editing before it works on your project.
image: golang:latest

variables:
  # Please edit to your GitLab project
  REPO_NAME: gitlab.com/murphyslaw4267/cool_go_app
  APP_NAME: cool_go_app
  S3_BUCKET_NAME: "serverhobbyistohio"
  AWS_ACCESS_KEY_ID: $AWSID
  AWS_SECRET_ACCESS_KEY: $AWSSecret

# The problem is that to be able to use go get, one needs to put
# the repository in the $GOPATH. So for example if your gitlab domain
# is gitlab.com, and that your repository is namespace/project, and
# the default GOPATH being /go, then you'd need to have your
# repository in /go/src/gitlab.com/namespace/project
# Thus, making a symbolic link corrects this.
before_script:
  - mkdir -p $GOPATH/src/$(dirname $REPO_NAME)
  - ln -svf $CI_PROJECT_DIR $GOPATH/src/$REPO_NAME
  - cd $GOPATH/src/$REPO_NAME

stages:
  - test
  - build
  - deploy

format:
  stage: test
  script:
    - go get github.com/aws/aws-sdk-go
    - go fmt $(go list ./... | grep -v /vendor/)
    - go vet $(go list ./... | grep -v /vendor/)
    - go test -race $(go list ./... | grep -v /vendor/)

compile:
  stage: build
  script:
    - go get github.com/aws/aws-sdk-go
    - echo $CI_PROJECT_DIR
    - go build -race -ldflags "-extldflags '-static'" -o $CI_PROJECT_DIR/build/$APP_NAME
  artifacts:
    paths:
      - build/
  
deploy:
  stage: deploy
  only:
  - master
  image: ubuntu
  before_script:
  - apt update -y
  - apt install rsync -y
  - 'which ssh-agent || ( apt-get update -y && apt-get install openssh-client git -y )'
  - eval $(ssh-agent -s)
  - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add - > /dev/null
  - mkdir -p ~/.ssh
  - chmod 700 ~/.ssh
  - ssh-keyscan webapp1.serverhobbyist.com >> ~/.ssh/known_hosts
  - chmod 644 ~/.ssh/known_hosts
  - rsync -ae ssh ./build/* root@webapp1.serverhobbyist.com:/var/www/go/cool_go_app/
  - rsync -ae ssh ./*.html root@webapp1.serverhobbyist.com:/var/www/go/cool_go_app/
  script:
  - bash .gitlab-deploy.sh


Monitoring with Grafana

Monitoring is one of those things I’ve seen get overlooked. Throwing money at it doesn’t always make it better. Taking the time to sit down, understand your systems, and do proper event management is key. Of course I say all of that and I don’t always practice it, especially on my own personal infrastructure. I think I go beyond the basics, however. Originally I would just set up Nagios and get emails and texts when a server didn’t respond to ping. Finding something that could monitor and alert on performance was more challenging. I ended up settling on the Telegraf agent, which sends its data to an InfluxDB instance. Grafana then reads the InfluxDB data and turns it into pretty visuals.
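
For a sense of what travels between Telegraf and InfluxDB, metrics are written in InfluxDB’s line protocol: a measurement name, optional tags, one or more fields, and a timestamp. The Python sketch below formats one such line by hand. The helper name, the `cpu` measurement and the `web1` host are made up for illustration, and it skips the escaping rules a real client would need.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as an InfluxDB line-protocol string.

    Simplified: assumes names and values contain no spaces, commas, or
    equals signs, so no escaping is performed.
    """
    tag_part = "".join(f",{key}={value}" for key, value in sorted(tags.items()))
    field_items = []
    for key, value in fields.items():
        if isinstance(value, bool):
            field_items.append(f"{key}={'true' if value else 'false'}")
        elif isinstance(value, int):
            field_items.append(f"{key}={value}i")  # integers carry an "i" suffix
        elif isinstance(value, float):
            field_items.append(f"{key}={value}")
        else:
            field_items.append(f'{key}="{value}"')  # strings are double-quoted
    return f"{measurement}{tag_part} {','.join(field_items)} {timestamp_ns}"


if __name__ == "__main__":
    line = to_line_protocol(
        "cpu", {"host": "web1"}, {"usage_idle": 97.5}, 1556813561098000000
    )
    print(line)  # cpu,host=web1 usage_idle=97.5 1556813561098000000
```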

Docker Swarm Limitations

As I continue to head towards my goal of using a container orchestration tool to scale this website behind a load balancer, I’m learning all kinds of things about the pitfalls involved with scalable systems. I thought Docker Swarm would be great for this since it is relatively straightforward to set up, but I’ve discovered it has a few limitations.

First of all, it has no mechanism to scale the container hosts. I started down the path of scripting that, and it was fairly successful at adding hosts to a swarm, but then I hit another limitation. It doesn’t seem to have any way to rebalance containers among swarm workers, at least after the initial start of the service. That means in a two-worker configuration you could have both container instances running on the same node. I know there are health checks that would in essence ‘heal’ your application, but it seems silly to have an unused server out there. A smaller issue was that persistent volumes wouldn’t update quite right, even when created inline. If another volume with the same name already existed on one of the worker nodes, Docker would not give any errors and would use that volume’s settings and paths on that node. It was difficult to troubleshoot what was happening on that one.

Now I move towards the popular alternative, Kubernetes. Given its popularity right now, it makes sense for me to figure out how it works. I’ve already stood up some basic services on a hosted cluster. Setting it up from scratch seems like an extra challenge. Thinking about my goals lately, I wonder if I want to support this website on a complex setup long term. I want to learn it and know it, but it may not make financial and logistical sense. So I think my plan now is to set it up and see how easy it is to make changes; otherwise, simple one-server setups are in my future for my personal assets.

Docker Complexity

As I continue my plan to adopt Docker for most of my personal infrastructure, including this website, I am learning a valuable lesson. Docker works best with simple images. I imagine there are some complex images out there that work fine, but I believe it’s best to avoid those situations. Supporting changes and updates, and how items interact with each other, just adds to the layers. For instance, I’m a big fan of CentOS. I know it sacrifices running the latest versions for dependability. This caused me issues when using it in a Docker image with Apache and PHP 7.2. It doesn’t natively support PHP 7.2, so I started down the path of workarounds. I then realized that it would be really simple to use a base image that has native support. In the end I accomplished very easily with Ubuntu what was becoming more and more complex in CentOS. As an aside, I do hope that CentOS 8 is out soon so it can take advantage of some of the newer versions.

Here’s my Dockerfile. I know I need to optimize my commands to reduce the layers, but it works fine:

FROM ubuntu:latest
ARG DEBIAN_FRONTEND=noninteractive
RUN apt update
RUN apt upgrade -y
RUN apt install apache2 php libapache2-mod-php php-mysql -y

EXPOSE 80

CMD ["/usr/sbin/apache2ctl", "-DFOREGROUND"]