Automate to avoid database cloning disasters.

“Accidentally destroyed production database on first day of a job”

Wow, that headline grabbed my attention.

Earlier this week you may have seen an article in The Register about a Reddit post from a junior software developer going by the name of “cscareerthrowaway567”, who destroyed a production database on their first day and lost their job.

I was basically given a document detailing how to setup my local development environment. Which involves run a small script to create my own personal DB instance from some test data. After running the command i was supposed to copy the database url/password/username outputted by the command and configure my dev environment to point to that database. Unfortunately instead of copying the values outputted by the tool, i instead for whatever reason used the values the document had.

Unfortunately apparently those values were actually for the production database (why they are documented in the dev setup guide i have no idea). Then from my understanding that the tests add fake data, and clear existing data between test runs which basically cleared all the data from the production database. Honestly i had no idea what i did and it wasn’t about 30 or so minutes after did someone actually figure out/realize what i did.

Now, we cannot be sure whether the story by “cscareerthrowaway567” is true, but it does provide a great example of the potential dangers of manual database cloning.

The story raises many issues around process, security, change management and training, and not least: why did someone think it was a good idea to include production account details in a database cloning document?

However, this could easily have been avoided through the use of automation or a CDM (Copy Data Management) tool, e.g. Catalogic, Actifio or Delphix.

Many modern All-Flash Arrays (AFAs) provide powerful REST APIs which can be called from various programming and scripting languages, e.g. Python, Java, Perl and PowerShell, to create custom solutions.

AFA storage APIs can also be consumed by DevOps automation and provisioning software, e.g. Ansible, Chef and Puppet, to provide full-stack or end-to-end automation of database cloning, improving security and reducing the chance of human error.

Check out the Pure Storage Developer Community area for code examples.
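
As a small illustration of removing the manual copy step, here is a minimal sketch that takes the connection details emitted by a cloning script and turns them straight into a dev configuration, refusing anything that points at a known production host. The names `PRODUCTION_HOSTS` and `build_dev_config`, the hostnames and the URL scheme are all hypothetical and not taken from any real tool.

```python
from urllib.parse import urlparse

# Hypothetical list of hosts that must never appear in a dev configuration.
PRODUCTION_HOSTS = {"prod-db.example.com"}


def build_dev_config(clone_output):
    """Turn the clone tool's output into a dev config, rejecting production hosts."""
    url = clone_output["url"]
    host = urlparse(url).hostname
    if host is None:
        raise ValueError(f"cannot determine host from {url!r}")
    if host in PRODUCTION_HOSTS:
        raise ValueError(f"refusing to configure dev environment against {host}")
    return {
        "DATABASE_URL": url,
        "DATABASE_USER": clone_output["username"],
        "DATABASE_PASSWORD": clone_output["password"],
    }


if __name__ == "__main__":
    # Example values only; a real script would read these from the clone tool.
    clone = {
        "url": "postgresql://dev-clone-01.example.com:5432/app",
        "username": "dev_user",
        "password": "generated-secret",
    }
    print(build_dev_config(clone)["DATABASE_URL"])
```

The point of the design is that nobody types or pastes credentials by hand, and a production hostname fails loudly instead of being used silently.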

How to resize an XFS filesystem

A question I am frequently asked is: how do I resize my Oracle XFS file system?

As I needed to resize an Oracle FRA (fast recovery area) today, I thought this would make a great topic for a Blog post.

OK, let’s start by checking the current size and geometry using the Linux df -h and xfs_growfs -n commands.

[root@z-oracle ~]# df -h
Filesystem                Size  Used Avail Use% Mounted on
..
/dev/mapper/psta-orafra   1.0T   33M  1.0T   1% /u04/app/oracle/fast_recovery_area
..

[root@z-oracle ~]# xfs_growfs /dev/mapper/psta-orafra -n
meta-data=/dev/mapper/psta-orafra isize=256    agcount=4, agsize=67108864 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=268435456, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=131072, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
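
The geometry output also gives the size directly: the data section is bsize multiplied by blocks. A minimal Python sketch of that arithmetic follows; the helper name `xfs_data_size_bytes` is my own, and the sample line is the data line from the output above.

```python
import re


def xfs_data_size_bytes(growfs_output):
    """Return the data section size in bytes (bsize * blocks) from xfs_growfs output."""
    for line in growfs_output.splitlines():
        if line.startswith("data"):
            match = re.search(r"bsize=(\d+)\s+blocks=(\d+)", line)
            if match:
                return int(match.group(1)) * int(match.group(2))
    raise ValueError("no data section found")


SAMPLE = "data = bsize=4096 blocks=268435456, imaxpct=5"

if __name__ == "__main__":
    size = xfs_data_size_bytes(SAMPLE)
    print(size, size / 2**40)  # prints: 1099511627776 1.0
```

So 4096 x 268435456 bytes is exactly 1 TiB, which matches the 1.0T reported by df -h.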

Now run the multipath command to check the current size of the LUN. Look for the friendly device name within the list of devices.

[root@z-oracle ~]# multipath -ll

psta-orafra (3624a937050c939582b0f46c0000a8f84) dm-17 PURE ,FlashArray
size=1.0T features='0' hwhandler='0' wp=rw
`-+- policy='queue-length 0' prio=1 status=active
|- 10:0:3:20 sdgl 132:16 active ready running
|- 10:0:4:20 sdiq 135:160 active ready running
|- 10:0:5:20 sdkq 66:480 active ready running
|- 10:0:6:20 sdmo 70:256 active ready running
|- 10:0:7:20 sdom 129:288 active ready running
|- 1:0:0:20 sdai 66:32 active ready running
|- 1:0:1:20 sdcr 69:240 active ready running
|- 1:0:2:20 sder 129:48 active ready running
|- 1:0:3:20 sdgr 132:112 active ready running
|- 1:0:4:20 sdil 135:80 active ready running
|- 1:0:5:20 sdkk 66:384 active ready running
|- 1:0:6:20 sdmk 69:448 active ready running
|- 1:0:7:20 sdop 129:336 active ready running
|- 10:0:0:20 sdau 66:224 active ready running
|- 10:0:1:20 sdck 69:128 active ready running
`- 10:0:2:20 sdel 128:208 active ready running
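
If you would rather not read through a long device list by eye, the size and path count can be pulled out with a few lines of Python. This is only a sketch: `multipath_summary` is a hypothetical helper, and the parsing assumes the `multipath -ll` layout shown above (the sample is shortened to two paths).

```python
import re


def multipath_summary(output, alias):
    """Return the reported size and path count for one multipath map."""
    size = None
    paths = 0
    in_map = False
    for line in output.splitlines():
        if re.match(r"^\S+ \(\w+\) dm-\d+", line):
            # Header line of a map, e.g. "psta-orafra (3624a...) dm-17 PURE ,FlashArray"
            in_map = line.split()[0] == alias
            continue
        if not in_map:
            continue
        size_match = re.match(r"size=(\S+)", line)
        if size_match:
            size = size_match.group(1)
        elif re.search(r"\d+:\d+:\d+:\d+ sd\w+", line):
            # One line per path, e.g. "|- 10:0:3:20 sdgl 132:16 active ready running"
            paths += 1
    if size is None:
        raise ValueError(f"map {alias!r} not found")
    return {"size": size, "paths": paths}


SAMPLE = """\
psta-orafra (3624a937050c939582b0f46c0000a8f84) dm-17 PURE ,FlashArray
size=1.0T features='0' hwhandler='0' wp=rw
`-+- policy='queue-length 0' prio=1 status=active
  |- 10:0:3:20 sdgl 132:16 active ready running
  `- 10:0:2:20 sdel 128:208 active ready running
"""

if __name__ == "__main__":
    print(multipath_summary(SAMPLE, "psta-orafra"))
```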

Now resize the volume using the Pure FlashArray UI, command line or REST API.

We now need to rescan the SCSI devices on our Linux server to identify any LUNs which have been resized.

[root@z-oracle ~]# rescan-scsi-bus.sh -s
Scanning SCSI subsystem for new devices
Searching for resized LUNs

We can now resize the multipath device using the following command:

[root@z-oracle mapper]# multipathd -k'resize map /dev/dm-17'
ok

Great. Now use the Linux command xfs_growfs to extend the file system. Note that if you do not specify a size with -D, xfs_growfs will grow the file system to use all available space.

[root@z-oracle ~]# xfs_growfs /dev/mapper/psta-orafra
meta-data=/dev/mapper/psta-orafra isize=256    agcount=4, agsize=67108864 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=268435456, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=131072, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 268435456 to 536870912

OK, let’s check the results:

[root@z-oracle ~]# df -h
Filesystem Size Used Avail Use% Mounted on

/dev/mapper/psta-orafra 2.0T 33M 2.0T 1% /u04/app/oracle/fast_recovery_area
..

As you can see from the above, within a few minutes I have been able to increase my Oracle fast_recovery_area from 1TB to 2TB.
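
For anyone wanting to script the host-side steps, here is a minimal Python sketch that assembles the three commands used in this post in the right order. The function name `build_resize_commands` is hypothetical, it takes the multipath map name rather than the /dev/dm-N path, and by default it only prints the commands (a dry run) rather than executing anything.

```python
import shlex


def build_resize_commands(map_name, filesystem):
    """Return the host-side resize steps from this post as argument lists, in order."""
    return [
        ["rescan-scsi-bus.sh", "-s"],                # find resized LUNs
        ["multipathd", f"-kresize map {map_name}"],  # same as -k'resize map <name>'
        ["xfs_growfs", filesystem],                  # grow to all available space
    ]


if __name__ == "__main__":
    for argv in build_resize_commands("psta-orafra", "/dev/mapper/psta-orafra"):
        # Dry run: print each command. To execute for real, run as root and
        # pass argv to subprocess.run(argv, check=True) instead.
        print(shlex.join(argv))
```

The array-side volume resize still has to happen first, via the Pure UI, CLI or REST API.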

An introduction to Ansible

Why this Blog

Over the last couple of years I have found myself increasingly working with DevOps teams and being exposed to the tools and techniques they are adopting. However, speaking to other DBAs and Architects, it appears that for many it’s still a bit of a ‘Dark Art’, so I thought it was about time I shared some of the knowledge over a series of DevOps-focused Blog posts.

Why is Ansible, Ansible?

The term Ansible is a Science Fiction reference to a fictional communications device that can transfer information faster than the speed of light.

The author Ursula K. Le Guin invented the concept in her 1966 book ‘Rocannon’s World’, and other SciFi authors have subsequently borrowed the term.

Only for a moment, when he had located the control room and found the ansible and sat down before it, did he permit his mind-sense to drift over to the ship that sat east of this one. There he picked up a vivid sensation of a dubious hand hovering over a white Bishop. …

As his fingers (left hand only, awkwardly) struck each key, the letter appeared simultaneously on a small black screen in a room in a city on a planet eight lightyears distant:

From Rocannon’s World, by Ursula K. Le Guin.

Michael DeHaan, the creator of Ansible, took inspiration for the name from the book ‘Ender’s Game’ by Orson Scott Card (note to self: must read the book / watch the film). In the book the ansible is used to control a large number of remote ships at once, over large distances. From now on, whenever I mention Ansible it will be to control remote servers, not ships; however, it would be useful to be able to control my Elite Dangerous craft remotely.

What is Ansible?

Ansible is often lumped into the DevOps tool category of ‘Configuration Management’ and compared to Puppet, Chef & Salt. The term ‘Configuration Management’ is generally used to describe the management of the state of IT infrastructure, which can include servers, storage arrays, databases and so on.

When you need to deploy a configuration change across multiple platforms, ‘Orchestration’ is often required to ensure the correct sequence of events, e.g. you may need to configure storage volumes and Unix mount points before you can start a database service. Ansible is a pretty good conductor, orchestrating actions across multiple servers.
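
To make the idea of sequencing concrete, here is a toy Python model of orchestration. It is not Ansible code; it is a sketch that assumes the behaviour of running each task on every host before starting the next task, and dropping a host from the run once a task fails on it. The function `run_play`, the host names and the task functions are all made up for illustration.

```python
def run_play(hosts, tasks):
    """Run each task on every host before moving on; drop hosts that fail."""
    active = list(hosts)
    log = []
    for name, action in tasks:
        still_ok = []
        for host in active:
            ok = action(host)
            log.append((name, host, "ok" if ok else "failed"))
            if ok:
                still_ok.append(host)
        active = still_ok  # failed hosts take no further part
    return log


if __name__ == "__main__":
    tasks = [
        ("configure storage volume", lambda host: True),
        ("create mount point", lambda host: host != "db2"),  # simulate a failure on db2
        ("start database service", lambda host: True),
    ]
    for entry in run_play(["db1", "db2"], tasks):
        print(entry)
```

In this run the database service is only started on db1, because the mount point step failed on db2 and there is no point starting a database without its storage.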

Why use Ansible

Ansible and Salt both use a ‘Push’ method of communication, and Ansible does not require any agents to be installed on remote servers. Ansible’s only requirements are SSH connectivity to the remote servers and for the servers to have Python 2.5 installed. I have not yet had the opportunity to take Salt for a test ride, so I can’t comment on its requirements.

Puppet and Chef have taken a ‘Pull-based’ approach, where agents installed on the remote servers periodically check in with a central server and pull down configuration information.

The ‘Push-based’ approach has a significant advantage over ‘Pull-based’ solutions as you can control when a configuration change is implemented rather than having to wait for a timer to expire in a ‘Pull-based’ solution.

My next Blog Post will be ‘Getting Started with Ansible and Oracle’.

I hope to get it out very soon.

Resizing Oracle ASM disks

Today I thought I would share how easy it is to resize Oracle ASM volumes with Pure Storage.

OK, let’s first check the Oracle ASM disk sizes using ‘asmcmd -p’.

[Screenshot: asmcli_pre.png]

As you can see from the above, I have 3 volumes of 100GB each. For this test let’s increase them all to 1TB using the purevol command from within the CLI.

[Screenshot: Pure_resize.png]

Great, my volumes have now all been resized. I could have achieved the same results with the Pure UI or Web Services, but that’s something for another day.

Linux device rescan

OK, we now need to let Linux and Oracle know about our resized volumes.

As root, rescan the SCSI devices to identify which volumes have been resized using:

rescan-scsi-bus.sh -s

Use multipathd -k'resize map <map_name>' to resize the multipath devices, e.g.

multipathd -k'resize map slob-data'

Before moving on to resizing the Oracle ASM disk groups, check your updated multipath configuration with ‘multipath -ll’. Look for your device name and size, e.g.

slob-data (3624a937050c939582b0f46c000059779) dm-5 PURE,FlashArray
size=1.0T features='0' hwhandler='0' wp=rw
`-+- policy='queue-length 0' prio=1 status=active
...

Oracle ASM resize

As your ‘grid’ user, connect as sysasm from sqlplus, e.g. sqlplus / as sysasm, and run ‘alter diskgroup <dg_name> resize all’ for each disk group.

[Screenshot: sqlplus_resize]
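
With several disk groups it can be handy to generate the statements rather than type them. A minimal Python sketch follows, using the three disk group names from this post; `resize_statements` is a hypothetical helper that only builds the SQL text, leaving you to run it in sqlplus as sysasm.

```python
import re

# Unquoted Oracle-style identifier: a letter followed by letters, digits, _, $ or #.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def resize_statements(diskgroups):
    """Return one ALTER DISKGROUP ... RESIZE ALL statement per disk group name."""
    statements = []
    for name in diskgroups:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unexpected disk group name: {name!r}")
        statements.append(f"ALTER DISKGROUP {name} RESIZE ALL;")
    return statements


if __name__ == "__main__":
    for sql in resize_statements(["CONTROL_REDO", "FRA", "DATA"]):
        print(sql)
```

The name check is there so that a typo or stray character is rejected before it ever reaches the database.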

Great, job done, but before we move on let’s check our work, first using sqlplus as sysasm:

SQL>  select name, total_mb/(1024) "Total GiB" from v$asm_diskgroup;
NAME                           Total GiB
------------------------------ ----------
CONTROL_REDO                   1024
FRA                            1024
DATA                           1024

And now with the ASM command line utility ‘asmcmd’:

[Screenshot: asmcli_post.png]

Or if you prefer the ASM UI ‘asmca’.

[Screenshot: asmca.png]