The Cloud

It has been some time since my last post - I'm going to try to resolve that!

I've not touched VMware for over a year and a half now, having started at Cloudreach in April 2014. In that time I've gained certifications on multiple cloud platforms (Amazon Web Services and Microsoft Azure).

The cloud has changed the face of System Administration and Engineering for the better. No longer do you need to go and buy a server for tens of thousands of pounds and spend two weeks setting it up, upgrading components, stress testing, carrying it to the datacenter and making it live. In the cloud, things are so much easier!

Some key practices to live by:

  • automate everything (everything!)
  • design for failure
  • commit your design to a repository

Why automate everything?
If you automate your environment, every server (also known as an instance or VM) ends up with the same configuration, which lets you use the built-in scaling features that any cloud platform worth its salt provides. Tools you can use for this include Chef (my favourite), Ansible, SaltStack and PowerShell Desired State Configuration (DSC). It also means you don't need to log into specific servers to troubleshoot: you can send logs elsewhere for investigation and rebuild the instance with little or no administrative effort.
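
To make the "same configuration everywhere" point concrete, here is a minimal Python sketch of the desired-state idea that tools like Chef and DSC are built around. It is not the API of any of those tools: the function names and the setting names (web_server, log_forwarding) are made up for illustration, and real tools also handle ordering, removal and error cases that this ignores.

```python
def plan_changes(desired, current):
    """Return only the settings that differ from the desired state."""
    return {key: value for key, value in desired.items()
            if current.get(key) != value}


def converge(desired, current):
    """Apply the planned changes and return (new_state, changes_made).

    Settings present in `current` but absent from `desired` are left
    untouched; this sketch only adds or corrects settings.
    """
    changes = plan_changes(desired, current)
    new_state = dict(current)
    new_state.update(changes)
    return new_state, changes


# Hypothetical desired state shared by every instance in the group.
DESIRED = {"web_server": "installed", "log_forwarding": "enabled"}

if __name__ == "__main__":
    drifted = {"web_server": "installed", "log_forwarding": "disabled"}
    state, changes = converge(DESIRED, drifted)
    print("first run changed:", changes)
    state, changes = converge(DESIRED, state)
    print("second run changed:", changes)
```

The property worth noticing is that running converge a second time changes nothing. That is what makes it safe to apply the same definition to a freshly built instance and to one that has been running for months.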

Why design for failure?
Cloud platforms are generally built on fault-tolerant, up-to-date equipment and communication links, but that doesn't stop a physical server having an issue or a digger cutting through a bunch of connections. Amazon Web Services specifically state you should design for failure and expect an instance to become unavailable at any time. You work around this at the architecture level by running at least two instances per service in two separate locations: Availability Zones in AWS, or Fault Domains in Azure. In fact, in Azure you must deploy at least two VMs in an Availability Set for a service to qualify for the 99.95% SLA.
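
As a rough illustration of the "at least two instances in two locations" rule, the Python sketch below spreads instances across zones round-robin and checks whether a service would still have a healthy instance if any single zone disappeared. The instance and zone names are placeholders, and in practice the platform's own scaling and placement features do this for you; the code only shows the reasoning.

```python
from collections import Counter
from itertools import cycle


def spread_across_zones(instance_names, zones):
    """Assign instances to zones round-robin (names are placeholders)."""
    return dict(zip(instance_names, cycle(zones)))


def survives_zone_loss(placement, min_healthy=1):
    """True if losing any one zone still leaves `min_healthy` instances."""
    if not placement:
        return False
    per_zone = Counter(placement.values())
    total = len(placement)
    return all(total - count >= min_healthy for count in per_zone.values())


if __name__ == "__main__":
    zones = ["zone-a", "zone-b"]
    single = spread_across_zones(["web-1"], zones)
    pair = spread_across_zones(["web-1", "web-2"], zones)
    print("one instance survives a zone outage:", survives_zone_loss(single))
    print("two instances survive a zone outage:", survives_zone_loss(pair))
```

A single instance fails the check however many zones are available, which is the same reasoning behind needing two VMs before the Azure SLA applies.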

Why commit your design to a repository?
When building a cloud environment you can use tools such as CloudFormation in AWS or Resource Manager in Azure. These tools let you define your infrastructure as a JSON template, which you can commit to a source control system just as you would application code or scripts. Your engineers can then update the template for any change to the infrastructure, upload it to the cloud platform and let the platform handle deploying the changes. There's no need to manually resize a volume or change the desired number of instances in an autoscaling group. Of course, you could use the template just to deploy the environment and then manage it manually if you really wanted to.
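
To show what that looks like in practice, here is a small Python sketch that builds a cut-down CloudFormation-style template and renders it as JSON ready to commit. The logical names (WebLaunchConfig, WebGroup), the sizes and the default instance type are illustrative assumptions, and the template has not been validated against AWS, so treat it as a shape rather than something to deploy as-is.

```python
import json


def build_template(desired_capacity):
    """Return a minimal CloudFormation-style template as a dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative web tier (not validated against AWS)",
        "Parameters": {
            # The image ID is supplied at deploy time, not hard-coded.
            "ImageId": {"Type": "String"},
            "InstanceType": {"Type": "String", "Default": "t2.micro"},
        },
        "Resources": {
            "WebLaunchConfig": {
                "Type": "AWS::AutoScaling::LaunchConfiguration",
                "Properties": {
                    "ImageId": {"Ref": "ImageId"},
                    "InstanceType": {"Ref": "InstanceType"},
                },
            },
            "WebGroup": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "AvailabilityZones": {"Fn::GetAZs": ""},
                    "LaunchConfigurationName": {"Ref": "WebLaunchConfig"},
                    "MinSize": "2",
                    "MaxSize": "4",
                    "DesiredCapacity": str(desired_capacity),
                },
            },
        },
    }


def render_template(desired_capacity):
    """Serialise with sorted keys so repository diffs stay small."""
    template = build_template(desired_capacity)
    return json.dumps(template, indent=2, sort_keys=True) + "\n"


if __name__ == "__main__":
    print(render_template(2))
```

Changing the desired number of instances then becomes a one-line change in the rendered file, which shows up in the repository history like any other code change.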

I'd be interested to hear from anyone who is having issues with cloud deployments or migrations, or who has thoughts on this post!

Thanks for reading.