Tuesday, 10 September 2013

Bourne Again Fun

Some Quality Time

I've recently had an opportunity to get reacquainted with my old friend, bash. I would have called him an acquaintance rather than a friend, since it's been so long since we last truly spent time together, but we're back to friend status. (Remind me to update our Facebook relationship!)

It all started with something that was supposed to be a small script (aren't they always) to help bootstrap a VPC from scratch in Amazon AWS, since getting a VPC ready for instance deployment involves quite a lot of fiddling with its configuration. Most if not all of this could be scripted, so I decided to spend a couple of days getting it done.

It turned out I got to spend several weeks on this, and since I had started it all with bash, I might as well finish with it. Although many people will say, and with reason, that this might not have been the best decision, I considered it a good exercise in really getting to know AWS rather than just a bash exercise.

But without going into too much detail about the whole AWS shenanigans, I decided to share some of the bash intricacies I encountered on the way that are easily forgotten or that people seem to find confusing (I saw quite a lot of links to Stack Overflow when researching these).

Arrays - A String Concerto

Yes, bash has arrays, and they are quite useful, but they also provide loads of pitfalls to step into. I needed quite a lot of them for various reasons, so here are some tips I think you'll find useful.

$ var="angry bird"
$ friends_array=($var)

That's a very basic array built from a space-separated string. Let's echo the first element:
$ echo ${friends_array[0]}
angry

Great, just what you would expect, and index 1 gives you bird in the same way. But what if we wrap the variable in double quotes and then echo the first element?

$ friends_array=("$var")
$ echo ${friends_array[0]}
angry bird

You get both words as a single element, with the double quotes themselves gone, as expected, right? There are cases when you might want this, but in many cases you want a list whose elements each contain spaces, like so:

$ birds_array=("red bird" "blue bird" "yellow bird")
$ echo ${birds_array[1]}
blue bird
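A quick way to see what actually ended up in an array is to print every element inside brackets. This is only a sketch, assuming the default IFS; the array names plain, quoted and listed are made up for the illustration.

```shell
#!/usr/bin/env bash
var="angry bird"

plain=($var)       # unquoted: split on whitespace into two elements
quoted=("$var")    # quoted: kept as a single element
listed=("red bird" "blue bird" "yellow bird")   # each quoted string is one element

# printf reuses its format for every argument, so each element gets its own line
printf '[%s]\n' "${plain[@]}"
printf '[%s]\n' "${quoted[@]}"
printf '[%s]\n' "${listed[@]}"
```

The brackets make stray leading or trailing spaces visible, which a plain echo tends to hide.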

Meet the IFS

Then there are cases when you need arrays but want a different separator. That's where the IFS variable comes in handy. It's easy enough to use, but again full of surprises for the unprepared.

First of all, the default values used for separation are: <space><tab><newline>

You can override it as follows:
$ friends="red bird, chuck"
$ IFS=","
$ friends_array=($friends)
$ echo ${friends_array[0]}
red bird

Also note that if you put a space after the comma, as we did above, that space becomes part of the element and echo will print it as well:
$ echo ${friends_array[1]}
 chuck

So even if your mind is used to neat separation by having space after each comma or colon, bash will take this quite literally. Remember to strip your variables or keep your lists clean.
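One possible way to strip the elements, sketched with plain parameter expansion. The loop and the helper variable item are my own illustration, not something the script above requires; read -a and the here-string are bash features.

```shell
#!/usr/bin/env bash
friends="red bird, chuck"

# IFS is set only for this one read command, so the rest of the shell is untouched
IFS="," read -r -a friends_array <<< "$friends"

for i in "${!friends_array[@]}"; do
    item=${friends_array[$i]}
    item="${item#"${item%%[![:space:]]*}"}"   # drop leading whitespace
    item="${item%"${item##*[![:space:]]}"}"   # drop trailing whitespace
    friends_array[$i]=$item
done

printf '[%s]\n' "${friends_array[@]}"
```

The spaces inside "red bird" survive, since only the whitespace at the ends of each element is removed.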

You might also be clever (in your mind) and think to use the following (as I actually did):

$ IFS=" ,"
$ friends_array=($friends)

Oh, did I mention you need to redefine the array after changing the IFS? The splitting happens when the array variable is assigned; IFS is not applied afterwards as a separator when you echo a single element, for instance. I don't demonstrate that here, but try it and you will see for yourself.

$ echo ${friends_array[0]}
red

Wait, what? Let's see the rest:

$ echo ${friends_array[1]}
bird
$ echo ${friends_array[2]}
chuck

But that's not what I wanted! I thought I created a literal <space><,> as a separator, but instead I made a list of characters to use, much like the IFS has by default. Of course this is what the man pages state as well, but this is for all those brave souls that experiment before reading documentation ;)
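If a literal multi-character separator really is what you want, IFS can't do it, but a small loop with parameter expansion can. This is a rough sketch under my own assumptions: the variable names sep and rest are invented, and empty fields are not handled specially.

```shell
#!/usr/bin/env bash
friends="red bird, chuck"
sep=", "                  # the literal separator: comma followed by space

rest="$friends$sep"       # append one separator so the last field ends like the others
friends_array=()
while [ -n "$rest" ]; do
    friends_array+=("${rest%%"$sep"*}")   # everything before the first separator
    rest="${rest#*"$sep"}"                # drop that field and its separator
done

printf '[%s]\n' "${friends_array[@]}"
```

Quoting "$sep" inside the expansion keeps it literal, so characters such as * in a separator would not be treated as a pattern.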

So, you've met the IFS. Now don't forget him! You've altered his personality, so change it back if you don't want it to affect the rest of your script. Or, even better, use a subshell when you need to change the IFS value, which makes the change temporary.
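Both approaches could look roughly like this. The names old_ifs, tmp and count are placeholders of mine, and the save-and-restore variant assumes IFS was set to begin with (which it is in a fresh bash).

```shell
#!/usr/bin/env bash
friends="red bird,chuck"

# Option 1: save the old value and put it back afterwards
old_ifs=$IFS
IFS=","
friends_array=($friends)
IFS=$old_ifs

# Option 2: change IFS inside a subshell; the change dies with the subshell
count=$(
    IFS=","
    tmp=($friends)
    echo "${#tmp[@]}"
)

echo "elements: ${#friends_array[@]}, counted in subshell: $count"
```

The catch with the subshell is that variables assigned inside it are gone afterwards as well, so you have to pass the result out through its output, as count does here.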

Hoop-a-loop

Back to arrays. They're usually good for one purpose: looping through. But beware, what you learned above applies here as well with some exceptions.

Let's make a small shell script called bird.sh:

#!/bin/bash
arr=("red bird" "blue bird" "yellow bird")
for bird in ${arr[@]}; do
    echo $bird
done

Running this will output:

$ bash bird.sh
red
bird
blue
bird
yellow
bird

Exactly. We forgot the double quotes around the array expansion, so instead of listing the elements we thought we had set up in the array, we just looped through the words using the default IFS separators: space, tab and newline. Change the for clause to:
for bird in "${arr[@]}"; do

And your output is what you would expect:

$ bash bird.sh
red bird
blue bird
yellow bird
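To see the difference in numbers, here is a small sketch that just counts the loop rounds for three ways of expanding the same array. The counter names are made up, and the default IFS is assumed.

```shell
#!/usr/bin/env bash
arr=("red bird" "blue bird" "yellow bird")

unquoted=0
for bird in ${arr[@]}; do unquoted=$((unquoted + 1)); done          # word-split: one round per word

quoted_at=0
for bird in "${arr[@]}"; do quoted_at=$((quoted_at + 1)); done      # one round per element

quoted_star=0
for bird in "${arr[*]}"; do quoted_star=$((quoted_star + 1)); done  # all elements joined into one string

echo "$unquoted $quoted_at $quoted_star"   # prints: 6 3 1
```

In a for loop, "${arr[@]}" is almost always the one you want.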

Bracketeering

You've probably seen some if statements in bash and stepped on the first mine that awaits every developer-minded soul by writing something like:
if [ "foo" == $bar ]; then

Depending on what you want, the first things you usually encounter are that bash is picky about the spaces around the comparator, or that the equals sign need not (or, in some shells, can't) be doubled, and so on. But this is old news. What you might not have known is that [ is not just syntax, it's an actual command. Don't believe me?

$ which [
/bin/[

Whaat, no way! Next, try checking the man page. It's the condition evaluation utility. This makes some comparisons more interesting than others. See the following example:

$ var="angry bird"
$ [ $var = "angry bird" ] && echo match
-bash: [: too many arguments

Bummer. This is caused by the unquoted variable containing a space, so [ receives angry and bird as two separate arguments. Bash no likey. But no worries, that's when we can use double brackets:

$ [[ $var = "angry bird" ]] && echo match
match

Now, why this is so goes deeper than I'd like to venture here, but as a general rule, word splitting of variables is usually not what you expect unless you have specifically prepared for it. Also, had IFS been set to something other than space here, the single bracket case would have worked too, since the variable would then be treated as one string.
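The single bracket version can also be rescued simply by quoting the variable. A minimal sketch, with single and double as throwaway variable names of mine:

```shell
#!/usr/bin/env bash
var="angry bird"

# Quoted, the variable reaches [ as one argument, space and all
if [ "$var" = "angry bird" ]; then
    single="match"
fi

# [[ is shell syntax and does not word-split $var, so the quotes are optional here
if [[ $var = "angry bird" ]]; then
    double="match"
fi

echo "single: $single, double: $double"

# Bash also ships its own [ as a builtin, while [[ is a keyword
type -t [      # prints: builtin
type -t [[     # prints: keyword
```

So the /bin/[ binary exists, but inside bash the builtin is what normally runs when you type [.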

Switch-case (or case-esac in bash, which I always find as funny as if-fi) is more forgiving with these kinds of strings, because the word after case is not word-split. Wrapping the variable in double quotes is still a good habit: it does no harm there and is needed almost everywhere else. Single quotes won't do, since they stop the variable from being expanded at all.
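For case-esac, a minimal sketch with a value that contains a space. The mood variable and its values are invented for the example; the patterns are quoted so that the space is taken literally.

```shell
#!/usr/bin/env bash
var="angry bird"

case "$var" in
    "angry bird") mood="furious" ;;
    "blue bird")  mood="calm" ;;
    *)            mood="unknown" ;;
esac

echo "$mood"   # prints: furious
```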

Fun and Games

The syntax around bash arrays is sometimes something completely different from what an innocent developer is prepared for, but quite easy to remember once you're familiar with it. The most useful piece I've found is the number of elements in an array.

Set the array as follows:

$ friends_array=("red bird" "blue bird" "yellow bird")

List the number of elements:

$ echo ${#friends_array[*]}
3
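The # sign does a few related jobs depending on what follows it, which is easy to mix up. A short sketch, assuming the default IFS; count, first_len and indices are just names I picked:

```shell
#!/usr/bin/env bash
friends_array=("red bird" "blue bird" "yellow bird")

count=${#friends_array[@]}        # number of elements
first_len=${#friends_array[0]}    # length of the first element in characters
indices="${!friends_array[*]}"    # the indices in use, joined with spaces

echo "count=$count first_len=$first_len indices=$indices"
# prints: count=3 first_len=8 indices=0 1 2
```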

Here lies another caveat. Remember IFS? Remember I told you it takes effect only when the array is assigned? Here I must emphasise that it applies to variables being expanded, not to static strings. Let's do this again:

$ IFS=","
$ friends_array=("red bird","blue bird","yellow bird")
$ echo ${#friends_array[*]}
1

Aaargh! Didn't I just tell it to use comma as a separator? You did, but as I said, this applies only to variables. Let's do it via a variable:

$ IFS=","
$ friends="red bird,blue bird,yellow bird"
$ friends_array=($friends)
$ echo ${#friends_array[*]}
3

Voilà ;)
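The trip also works in the other direction: there is one place where IFS does act on output, namely when an array is expanded as "${array[*]}". A small sketch, with joined as a name of my own choosing:

```shell
#!/usr/bin/env bash
friends_array=("red bird" "blue bird" "yellow bird")

# "${array[*]}" joins the elements with the first character of IFS;
# the command substitution runs in a subshell, so IFS is only changed in there
joined=$(IFS=","; echo "${friends_array[*]}")

echo "$joined"   # prints: red bird,blue bird,yellow bird
```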

I could continue on the subject, but where I have ventured, many have gone before me. I did some googling, and this seems to cover much of the bashiness quite nicely.

Have fun.sh!