Scaling Automation

I’m slowly on my way to automating the creation of droplets in DigitalOcean and adding them to my Docker swarm cluster. I have the creation and configuration working thanks to PowerShell and Ansible. I’m still working on the check logic, but I have the queries I need for InfluxDB; I still need to figure out the best way to query the database. I’ve also started thinking about safeguards so this thing doesn’t take off and end up spawning 100+ servers overnight by mistake (I’ve sketched one idea for that below, after the script). I’m also trying to set up Graylog to drive alerting for actions taken by this script and to serve as a general troubleshooting tool, but I’m having trouble getting it to work inside Docker; I’ll work on it some more tomorrow. Here’s the work-in-progress PowerShell script. The first two functions work, but there are definitely improvements to be made.

#Requirements: Digital Ocean command line, powershell running on linux, ansible
#I can't seem to find a way to find droplets associated with a project so I'm kind of cheating and finding droplets with swarm in the name
Write-Host "Starting script to create and configure new swarm worker node" -ForegroundColor Blue
function Create-Droplet {
    Write-Host "Retrieving existing nodes from DigitalOcean"
    $droplets = doctl compute droplet list --output json | ConvertFrom-Json
    $swarmIDs = New-Object System.Collections.ArrayList
    foreach($droplet in $droplets){
        if($droplet.name -like 'swarm*'){
            #Add() returns the new index; cast to void so it doesn't leak into the function's output
            [void]$swarmIDs.Add($droplet.id)
        }
    }

    #If droplets get destroyed this should allow those numbers to get reused and not over-increment
    Write-Host "Finding next swarm node number"
    $findIncrement = 1
    foreach($swarmID in $swarmIDs){
        $swarmNode = doctl compute droplet get $($swarmID) --output json | ConvertFrom-Json
        if($swarmNode.name -like "*$($findIncrement)*"){
            $findIncrement++
        }
    }
    Write-Host "Found next available number is $($findIncrement)"
    #Sets the name of the droplet to be created
    $newName = "swarm$($findIncrement).serverhobbyist.net"
    Write-Host "Name of node to be created: $($newName)"
    #Deletes the json file from the last created droplet, if present, to keep it current
    if (Test-Path -Path /storage/ansible/nodescreated/latest.json){
        Remove-Item -Path /storage/ansible/nodescreated/latest.json
    }
    #Run Ansible playbook to create the droplet
    $command = "ansible-playbook CreateDroplet.yaml -e `"dropletName=$($newName)`""
    Write-Host "Running ansible playbook with command: $($command)"
    bash -c $command
}
function Setup-Droplet {
    #Gets returned data from Create-Droplet function
    Write-Host "Discovering data about newly created node"
    $createdNode = Get-Content /storage/ansible/nodescreated/latest.json | ConvertFrom-Json
    $IP = $createdNode.data.ip_address
    #$IP = "165.22.178.216"
    #Adds IP to Ansible hosts file
    Write-Host "Writing IP $($IP) to Ansible hosts file"
    Add-Content /etc/ansible/hosts "`n$($IP)"
    #Runs playbook to configure node: updates, telegraf agent + config, docker, epel, htop
    $command = "ansible-playbook SetupDroplet.yaml -e `"IP=$($IP)`""
    Write-Host "Running playbook with command: $($command)"
    bash -c $command
}
function Check-Droplets {
    #Finds current nodes in the account with swarm in the name and adds each name to an array
    Write-Host "Retrieving existing nodes from DigitalOcean"
    $droplets = doctl compute droplet list --output json | ConvertFrom-Json
    $swarmNames = New-Object System.Collections.ArrayList
    foreach($droplet in $droplets){
        if($droplet.name -like 'swarm*'){
            [void]$swarmNames.Add($droplet.name)
        }
    }

    $check = 0
    foreach($swarmName in $swarmNames){
        #Mean usage over the last 5 minutes; usage_idle gets inverted below to get usage
        $influxCheckCPU = "q=SELECT mean(usage_idle), count(usage_idle) FROM cpu WHERE host = '$($swarmName)' AND time > now() - 5m"
        $influxCheckRAM = "q=SELECT mean(used_percent), count(used_percent) FROM mem WHERE host = '$($swarmName)' AND time > now() - 5m"
        $influxCPUResult = curl -s -G 'https://influx.server.com/query?pretty=true' -u user:pw --data-urlencode "db=telegraf" --data-urlencode $influxCheckCPU | ConvertFrom-Json
        $influxRAMResult = curl -s -G 'https://influx.server.com/query?pretty=true' -u user:pw --data-urlencode "db=telegraf" --data-urlencode $influxCheckRAM | ConvertFrom-Json
        if($null -ne $influxCPUResult -and $null -ne $influxRAMResult){
            #Each values row comes back as [time, mean, count], so index 1 is the mean
            $CPUUsage = [math]::Round(100 - ($influxCPUResult.results.series.values)[1], 2)
            $RAMUsage = [math]::Round(($influxRAMResult.results.series.values)[1], 2)
            Write-Host "Results for $($swarmName):" -ForegroundColor Blue
            Write-Host "CPU usage: $($CPUUsage)%" -ForegroundColor Blue
            Write-Host "RAM usage: $($RAMUsage)%" -ForegroundColor Blue

            if($CPUUsage -ge 40){
                Write-Host "CPU usage is now at $($CPUUsage)% which is high, an action might be taken"
                bash -c "logger CPU usage is now at $($CPUUsage)% which is high, an action might be taken"
                $check++
            }
            if($RAMUsage -ge 50){
                Write-Host "RAM usage is now at $($RAMUsage)% which is high, an action might be taken"
                bash -c "logger RAM usage is now at $($RAMUsage)% which is high, an action might be taken"
                $check++
            }
        }
    }
}


#Write-Host "Starting Function: Create-Droplet" -ForegroundColor Green
#Create-Droplet
#Write-Host "Starting Function: Setup-Droplet" -ForegroundColor Green
#Setup-Droplet
Write-Host "Starting Function: Check-Droplet" -ForegroundColor Green
Check-Droplets
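
On the runaway-scaling worry I mentioned above: the simplest safeguard I can think of is a hard cap checked before any create. Here's a rough sketch of what that might look like; the cap value and the Test-NodeLimit function are placeholders I haven't wired into the script yet.

$maxSwarmNodes = 5  #arbitrary cap, tune to taste

function Test-NodeLimit {
    #Counts existing swarm* droplets and refuses to go past the cap
    $droplets = doctl compute droplet list --output json | ConvertFrom-Json
    $swarmCount = @($droplets | Where-Object { $_.name -like 'swarm*' }).Count
    if($swarmCount -ge $maxSwarmNodes){
        Write-Host "Already at $($swarmCount) nodes (cap: $($maxSwarmNodes)), refusing to create more" -ForegroundColor Red
        bash -c "logger Swarm scaling blocked: node cap of $($maxSwarmNodes) reached"
        return $false
    }
    return $true
}

#Intended use once wired in:
#if (Test-NodeLimit) { Create-Droplet; Setup-Droplet }

On the query side, Invoke-RestMethod might end up cleaner than shelling out to curl, since it parses the JSON response itself, but I haven't settled on an approach yet.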

Recreating AWS Fargate (lite)

I’ve recently been on a spree building out my infrastructure as Docker containers. I’m often torn in the debate between the cloud and on-prem infrastructure, and I think the reason is twofold. One: I really like the tangible nature of on-prem hardware. Seeing the servers, storage, and network gear and thinking about all the bits flying all over the place gives me comfort for some reason. Two: I like understanding the logic of how a system works. In the on-prem world, with traditional siloed infrastructure, you have to understand the inner workings to create automation on top of it. I love knowing the inner workings of IT systems.

Lately, however, I’ve been using more and more cloud providers. I played around with AWS Fargate, where they manage the cluster of container hosts for you. You can set auto-scaling rules so it adds more compute power as it creates more replicas of your container. It will even register and deregister the containers from the Elastic Load Balancer, make health decisions, and “self-heal” your application. It integrates with Route 53 for your DNS entries. It redirects your log files and monitors several metrics for you. I can only imagine Amazon invested a lot in this system, so it may be lofty of me to want to recreate it. I’d really like to remain provider-agnostic, but it’s hard when the ecosystem of a provider works so well together right out of the box.

Because my credit card can only take so much from Amazon experiments, I decided to try to create something similar on DigitalOcean. I’m using Ansible to spin up swarm worker nodes, from the inception of the VM through updates, configuration, and joining the swarm, all tied together with PowerShell. There may be a better way, but it’s the scripting language I’m most familiar with, and it seems to work quite well on Linux. I’m using InfluxDB and Telegraf to gather performance metrics, and I hope to have queries against that data drive scaling decisions. I still need to figure out the load balancing aspect, and probably other pieces I haven’t thought of. Here’s the main part that will drive the creation of new droplets on DigitalOcean.

#Requirements: Digital Ocean command line, powershell running on linux, ansible
#I can't seem to find a way to find droplets associated with a project so I'm kind of cheating and finding droplets with swarm in the name
$droplets = doctl compute droplet list --output json | ConvertFrom-Json
$swarmIDs = New-Object System.Collections.ArrayList
foreach($droplet in $droplets){
    if($droplet.name -like 'swarm*'){
        #Add() returns the new index; cast to void so it doesn't end up in the output stream
        [void]$swarmIDs.Add($droplet.id)
    }
}

#If droplets get destroyed this should allow those numbers to get reused and not over-increment
$findIncrement = 1
foreach($swarmID in $swarmIDs){
    $swarmNode = doctl compute droplet get $($swarmID) --output json | ConvertFrom-Json
    if($swarmNode.name -like "*$($findIncrement)*"){
        $findIncrement++
    }
}
#Sets the name of the droplet to be created
$newName = "swarm$($findIncrement).serverhobbyist.net"
#Run Ansible Playbook to Create Droplet
$command = "ansible-playbook CreateDroplet.yaml -e dropletName = $($newName)"
bash -c $command
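
The joining-the-swarm step isn't in this snippet yet. Roughly, I expect it to look something like the following, asking a manager node for the worker join token and then running the join over an Ansible ad-hoc command. The manager hostname here is a placeholder, and $IP is assumed to hold the new droplet's address.

#Grab the current worker join token from a manager, then join the new droplet to the swarm
$joinToken = bash -c "ssh root@manager.serverhobbyist.net 'docker swarm join-token worker -q'"
$command = "ansible $($IP) -m shell -a `"docker swarm join --token $($joinToken) manager.serverhobbyist.net:2377`""
bash -c $command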

One of the best steaks

I was recently in NYC for company travel. I’ve been there several times before, and it’s not a place I necessarily associate with steak; I think of Chinese food, pizza, hot dogs, fresh produce, Italian, and others. On this trip, however, my teammates really wanted to go to a steakhouse. They started calling up famous ones, only to find out that reservations need to be made way in advance. Finally we found one in the basement of the New Yorker hotel, called Butchers and Bankers because it was housed in an old bank; some of the tables were in the former vault. It was fabulous, though it was more than I’ve ever paid for a steak. I ordered the NY strip, and we ordered a side of cauliflower and a side of mushrooms.

Food?

I love food; I mean, who doesn’t? What does food have to do with being a Server Hobbyist and IT infrastructure? Perhaps not much, beyond the need to stay energized. I guess I really just wanted a spot to share pictures of food with the world…

Devops Tools

The DevOps toolchain has always fascinated me. I work for a company that is fairly siloed and sometimes kind of traditional. We use Microsoft System Center Configuration Manager as our CM tool. Now don’t get me wrong, it works, and it is powerful, but boy is it slow. I’ve been told Microsoft wants it to be able to work in environments with millions of servers. I started looking at Chef because another team at work uses it. It was neat to see all of the integrations it has and the amount of documentation out there. I also enjoyed the elegance of structured files dictating configurations, rather than always having to click through SCCM’s sluggish GUI for every modification. Chef uses a pull methodology, as does SCCM. Chef’s polling interval can be set pretty low, and it has some neat bootstrap functions for new infrastructure, but I really wanted something with faster feedback.

I started learning Ansible and began using it in my lab. I set up integrations with DigitalOcean and my on-premise VMWare lab to orchestrate the deployment of virtual servers. I created a PHP web app with a form where you fill out your server specs, and VMWare uses its operating system customization to set things like the name and IP address. I was finally able to spin up a complete machine at home at the click of a button, like Vultr, AWS, or DigitalOcean.

At work we have a more involved build process, and I got really excited to demo my creation. Some were impressed; some even got excited that we could create a server from start to finish in about 5 minutes, down from about 45. However, I was unable to get management buy-in at my demo. They had concerns about the learning curve. I like stepping outside the comfort zone of the OS GUI and doing things with the command line: PowerShell, Bash, YAML files, etc. Others are not so willing to leave that zone. One teammate approached me and agreed to learn it. Slowly but surely I’ve been creating the Ansible playbooks for my organization’s Windows infrastructure. It’s more complicated than my home lab, but I’m getting close to accounting for all the extra configuration our servers need.

Monitoring App

I recently started trying to create a setup that can monitor CPU and RAM usage on a remote server. After quite a bit of thinking, I created a SQL database and a shell script that saved the output into the DB. That wasn’t ideal for remote servers, so I created a poor man’s API with Node and Express. It takes a web call from remote systems running a shell script on a schedule, saves the data to a table, and I’m using Grafana to visualize it. It’s in a super early state with no validation, but it is functional. Here’s my code for Node:

var express = require('express');
var router = express.Router();
var bodyParser = require('body-parser');
var mysql = require('mysql');

var con = mysql.createConnection({
    host: "localhost",
    user: "perfmon",
    password: "passwordhere",
    database: "perfmon"
});

con.connect(function (err) {
    if (err) throw err;
    console.log("Connected to SQL database");
});

/* GET home page. */
router.get('/', function (req, res, next) {
    var serverName = req.query.serverName;
    var CPU = req.query.CPU;
    var RAM = req.query.RAM;
    //No validation yet: this string concatenation is wide open to SQL injection,
    //so parameterized queries are high on the to-do list
    var sqlInsertPerfmon = "INSERT INTO perfdata (serverName, CPU, RAM) VALUES ('" + serverName + "'," + CPU + "," + RAM + ");";
    console.log('SQL Statement: ' + sqlInsertPerfmon);
    con.query(sqlInsertPerfmon, function (err, result) {
        if (err) throw err;
        console.log("Performance Data Logged");
        res.render('perfmon', { title: 'Perfmon', serverName: serverName, CPU: CPU, RAM: RAM });
    });
});

module.exports = router;
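
For reference, the check-in from a remote system can be as simple as a GET with three query parameters matching the route above. Here's a sketch in PowerShell; the host and port are placeholders, and the CPU and RAM numbers would come from whatever the scheduled script actually measures.

#Hypothetical client check-in; Invoke-RestMethod turns the hashtable into ?serverName=...&CPU=...&RAM=...
$metrics = @{ serverName = $(hostname); CPU = 12.5; RAM = 48.2 }
Invoke-RestMethod -Uri "http://monitor.example.com:3000/" -Method Get -Body $metrics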