Automate to avoid database cloning disasters.

“Accidentally destroyed production database on first day of a job”

Wow, that headline grabbed my attention.

Earlier this week you may have seen an article reported by The Register about a post in  reddit from a junior software developer going by the name of  “cscareerthrowaway567”, who on his/her first day destroyed a Production database and lost their job.

I was basically given a document detailing how to setup my local development environment. Which involves run a small script to create my own personal DB instance from some test data. After running the command i was supposed to copy the database url/password/username outputted by the command and configure my dev environment to point to that database. Unfortunately instead of copying the values outputted by the tool, i instead for whatever reason used the values the document had.

Unfortunately apparently those values were actually for the production database (why they are documented in the dev setup guide i have no idea). Then from my understanding that the tests add fake data, and clear existing data between test runs which basically cleared all the data from the production database. Honestly i had no idea what i did and it wasn’t about 30 or so minutes after did someone actually figure out/realize what i did.

Now, we can not be sure if the story by “cscareerthrowaway567” is true or not but it does provide an great example of the potential dangers of manual database cloning.

The story raises many issues around process, security, change management, training and not least why someone thought it was a good idea to include Production account details in a database cloning document ?

However, this could have easily been avoided through the use of Automation or a CDM (Copy Data Management) tools e.g. Catalogic, Actifio, Delphix etc..

Many modern All-Flash Arrays (AFA) provide powerful REST API’s which can be called from various programming and scripting languages e.g. Python, Java, Perl, PowerShell etc… to create custom solutions.

AFA Storage API’s can also be consumed by DevOPS Automation and Provisioning software e.g. Ansible, Chef, Puppet etc.. to provide full-stack or end-to-end Automation of Database Cloning, improving security and also removing the chance of human error.

Check out the Pure Storage Developer Community area for code examples.

Getting started with Ansible and Oracle

Introduction

In my previous post An introduction to Ansible I shared some reasons why companies are adopting Ansible and described some of the advantages of using Ansible over other configuration management tools.

Now we know what Ansible is, let’s start using it.

Setting up an Ansible Control Machine

The simplest and quickest way to get up and running with Ansible is to use Vagrant to create a virtual machine. Vagrant ships with out of the box support for VirtualBox, Hyper-V and Docker. Vagrant supports other providers e.g. VMware but these are licenceable

So even though I mainly use VMware Fusion on my MacBook I used the links above to install Vagrant and the excellent Oracle VirtualBox to avoid any licensing requirements.

Using Vagrant

Run the following commands to create a Vagrantfile for an Ubuntu Vagrant machine.
$ mkdir ansible_oracle
$ cd ansible_oracle
$ vagrant init ubuntu/trusty64

A `Vagrantfile` has been placed in this directory. You are now ready to `vagrant up` your first virtual environment! Please read the comments in the Vagrantfile as well as documentation on `vagrantup.com` for more information on using Vagrant.

$ vagrant up
You should now be able to SSH into your Ubuntu VM using ‘vagrant ssh’, however before we try and connect to our new VM let’s check the status of all the local Vagrant machines using the following:
$ vagrant global-status

id       name    provider   state    directory
————————————————————————-
a1995ac  default virtualbox running  /Users/ronekins/ansible_oracle

The above shows information about all known Vagrant environments
on this machine. This data is cached and may not be completely
up-to-date. To interact with any of the machines, you can go to
that directory and run Vagrant, or you can use the ID directly
with Vagrant commands from any directory. For example:
“vagrant destroy 1a2b3c4d”

$ vagrant status a1995ac
Current machine states:

default running (virtualbox)

The VM is running. To stop this VM, you can run `vagrant halt` to
shut it down forcefully, or you can run `vagrant suspend` to simply
suspend the virtual machine. In either case, to restart it again,
simply run `vagrant up`.

$ vagrant ssh
If all has gone well you should be presented with your Ubuntu virtual machine.

Useful vagrant machine (vm) commands

destroy       : stops and deletes all traces of the vm 
global-status : outputs status Vagrant env's for this user 
halt          : stops the vm 
init          : initialises a new Vagrant environment 
provision     : provisions the vm 
reload        : restarts vm, loads new Vagrantfile config 
resume        : resume a suspended vm 
snapshot      : manages snapshots, saving, restoring, etc. 
ssh           : connects to vm via SSH 
status        : outputs status of the vm 
suspend       : suspends the vm 
up            : starts and provisions the vm

Ansible Installation

$ sudo apt-get install software-properties-common
$ sudo apt-add-repository ppa:ansible/ansible
$ sudo apt-get update
$ sudo apt-get install ansible

Update local host file

Add the IP address and database server names to your local host file.
$ sudo vi /etc/hosts

Getting Started

Create Ansible configuration file

$ vi ansible.cfg
[defaults]
hostfile = hosts
ansible_private_key_file=~/.ssh/id_rsa

Create Ansible host file

In the host file we can specify that we want ansible to default to the ‘oracle’ user, the first entry is a server alias, in the example below I have kept it the same as the server name but it can be useful if you have cryptic host names or want to refer to the server by it’s database or application name.
$ vi hosts
[dbservers]
z-oracle         ansible_host=z-oracle        ansible_user=oracle
z-oracle-dr  ansible_host=z-oracle-dr  ansible_user=oracle

Ansible Ping Test

Now let’s try using the Ansible ping module to try to connect to our database server and verify a usable version of python, the ping module will return ‘pong’ on success.
$ ansible all -m ping

Both servers will fail returning UNREACHABLE! as the ssh connection failed, to fix this add a public key to the database servers ‘authorized_keys’file.

Generating RSA Keys

Before we can use password-less SSH we need to create a pair of private and public RSA keys for our Ansible control machine.

$ cd ~/.ssh
$ ssh-keygen -t rsa
$ cat id_rsa.pub

‘Copy’ the id_rsa.pub into your client buffer and ssh onto the database servers as the ‘oracle’, cd to the .ssh directory and ‘paste’ the public key into the ‘authorized_keys’ file.

$ cd ~/.ssh
$ vi authorised_keys

Now return to your Ansible control machine to repeat the Ansible Ping Tests.

Ansible Ping Part II

Ok, now we are ready to check connectivity, first lets trying using the database server names individually.
ping_each
That was great, but as we defined a group ‘dbservers’ we can also perform a ‘ping’ test using the group name as we may want to perform an ansible play against a group of servers e.g. Production, Development, Test etc..

ping_group
Very cool, if required you can use the ‘all’ option to run against all entries in the host file.

ping_all
In my next blog post we will start to use our Ubuntu Ansible control machine to interact with our database servers.

An introduction to Ansible

Why this Blog

Over the last couple of years I have found myself increasingly working with DevOps teams and being exposed to the tools and techniques being adopted. However speaking to other DBA’s and Architects it appears that for many it’s still a bit of a ‘Dark Art’, so I thought it was about time I shared some the knowledge over a series of DevOps focused Blogs posts.

Why is Ansible, Ansible ?

The term Ansible is a Science Fiction reference for a ficitonal communications device that can transfer information faster than the speed of light.

The author Ursula LeGuin invented the concept in her 1966 book ‘Rocannon’s World’, subsequently other SciFi authors have borrowed the term.

Only for a moment, when he had located the control room and found the ansible and sat down before it, did he permit his mind-sense to drift over to the ship that sat east of this one. There he picked up a vivid sensation of a dubious hand hovering over a white Bishop. …

As his fingers (left hand only, awkwardly) struck each key, the letter appeared simultaneously on a small black screen in a room in a city on a planet eight lightyears distant:

From Rocannon’s World, by Ursula LeGuin.

Michael DeHaan the creator of Ansible took inspiration for the name Ansible from the book ‘Enders Game’ by Orson Scott Card (note to self must read book / watch the film) in the book Ansible is used to control a large number of remote ships at once, over large distances.  From now on whenever I mention Ansible it will be to control remote servers not ships, however it would be useful to be able to control my Elite Dangerous craft remotely.

What is Ansible ?

Ansible is often lumped into the DevOps tool category of ‘Configuration Management’ and compared to Puppet, Chef & Salt. The term ‘Configuration Management’ is generally used to describe the management of the state of IT infrastructure, which can include servers, storage arrays and databases etc…

When you need to deploy configuration change across multiple platforms ‘Orchestration’ is often required to ensure the correct sequence of events, e.g. you may need to configure storage volumes, Unix mount points all before you can start a database service. Ansible is pretty good a conductor, orchestrating actions across multiple servers.

Why use Ansible

Ansible and Salt both use a ‘Push’ method of communication that does not not require any agents to be installed on remote servers. Ansible’s only requirements are SSH connectivity to the remote servers and for the servers to have Python 2.5 installed. I have not yet had the opportunity to take Salt for a test ride, so I can’t comment on it’s requirements.

Puppet and Chef have taken a ‘Pull-based’ approach, where agents installed on the remote servers periodically check in with a central server and pull down configuration information.

The ‘Push-based’ approach has a significant advantage over ‘Pull-based’ solutions as you can control when a configuration change is implemented rather than having to wait for a timer to expire in a ‘Pull-based’ solution.

My next Blog Post will be ‘Getting Started with Ansible and Oracle’.

Hope to get it out very soon, if you want to know when it’s ready use the below to follow me.