Perl Primer 014, A Perl hash of hashes, or Multidimensional array. HTML FIX IT . COM

Working with a hash of hashes.

In this little tute, I'll cover another of the widely used, more advanced Perl data constructs: the "hash of hashes". A hash of hashes is potentially more useful than an array of hashes, because the keys of the outer hash act as a central searchable index that you can use to extract data without needing a loop.
This gives a hash of hashes a similar sort of power to an SQL-based database (albeit on a much smaller scale).

Let's begin by creating a hash of hashes to contain the same details I stored in the array of hashes tutorial.

As with most programming languages, good indenting can mean the difference between easy-to-maintain code and an illegible mess. To that end, I have formatted this hash of hashes in the manner I consider most readable for myself and anyone else who comes after me.
The main benefit of a hash of hashes over an array of hashes is that it is much easier to extract data from one of the hashes. Each inner hash (strictly speaking, a reference to it) is the value part of an entry in the parent hash, so by using the parent hash's key you can get at all of that entry's data directly.
I am using the same data that I used in the array of hashes tute so you can see the benefits that a hash of hashes provides. In the case below, we make the file names the keys of the parent hash, and the data for each file (price, description, etc.) goes in the child hash stored under that file name.
Does that make sense? If not, just look at the code I used to create the hash of hashes and it should become clearer. In short, instead of using a loop with array element numbers, we can use the parent and child hash keys to access the data directly.

     # First let's define our hash of hashes.
my %file_attachments = (
             'test1.zip'  => { 'price' => '10.00', 'desc' => 'the 1st test'},
             'test2.zip'  => { 'price' => '12.00', 'desc' => 'the 2nd test'},
             'test3.zip'  => { 'price' => '13.00', 'desc' => 'the 3rd test'},
             'test4.zip'  => { 'price' => '14.00', 'desc' => 'the 4th test'}
                       );
     # Now set a variable to keep track of how many
     # elements are in the parent hash.
my $file_no = scalar(keys(%file_attachments));

OK, now that you know how to create one, I'm going to show you why they are so useful. Say, for example, you want to know what the price for test2.zip is. In the array of hashes, you'd have to do some sort of loop through the array to find the hash value, but with the hash of hashes we can access that information directly.

     # Accessing the 'price' element of test2.zip

print $file_attachments{'test2.zip'}->{'price'};
     #                   ^^                ^^   

     #            parent hash 'key'      child hash 'key'

     # Referring back to the hash, we can see that this
     # will print the output:

12.00
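
To make the contrast with the array of hashes concrete, here is a small side-by-side sketch. The array layout (with a 'name' field holding the file name) is only my assumption for illustration, as are the variable names; the point is just that one lookup needs a scan and the other does not.

```perl
use strict;
use warnings;

# Array of hashes (assumed layout): the file name is just another
# field, so finding one file means scanning until a match turns up.
my @file_list = (
    { 'name' => 'test1.zip', 'price' => '10.00' },
    { 'name' => 'test2.zip', 'price' => '12.00' },
);

my $price_from_array;
foreach my $file (@file_list) {
    if ($file->{'name'} eq 'test2.zip') {
        $price_from_array = $file->{'price'};
        last;   # stop at the first match
    }
}

# Hash of hashes: the file name is the key, so no scan is needed.
my %file_hash = (
    'test1.zip' => { 'price' => '10.00' },
    'test2.zip' => { 'price' => '12.00' },
);

my $price_from_hash = $file_hash{'test2.zip'}->{'price'};

print "$price_from_array\n";
print "$price_from_hash\n";
```

Both print statements give the same price; the difference is only in how much work it took to find it.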

As with the array of hashes, changing the content of an element is as simple as assigning a new value to it. Permit me to demonstrate by changing the price for test2.zip to, say, 18.00.

     # Changing the 'price' element of test2.zip

$file_attachments{'test2.zip'}->{'price'} = '18.00';
     # Printing $file_attachments{'test2.zip'}->{'price'}
     # would now print 18.00
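
A small aside on the syntax: Perl lets you leave out the arrow between two subscripts, so you will often see the same element written without it. This little sketch (the variable names $with_arrow and $without_arrow are just mine) shows that both forms read the same value.

```perl
use strict;
use warnings;

my %file_attachments = (
    'test2.zip' => { 'price' => '18.00', 'desc' => 'the 2nd test' },
);

# Both lines read the same element; the arrow between the two
# subscripts is optional.
my $with_arrow    = $file_attachments{'test2.zip'}->{'price'};
my $without_arrow = $file_attachments{'test2.zip'}{'price'};

print "same value\n" if $with_arrow eq $without_arrow;
```

Which form you use is a matter of taste; I keep the arrow in this tute because it makes the parent/child split easier to see.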

Adding new elements to either the parent or child hashes is likewise easy. Simply assigning to a key that doesn't exist will create that key. To demonstrate, let's add a location key to the child hash of the test2.zip parent key.

     # Adding a new element to the test2.zip child hash.
$file_attachments{'test2.zip'}->{'location'} = '/var/www/html/test2.zip';

     # Printing $file_attachments{'test2.zip'}->{'location'}
     # now returns:

/var/www/html/test2.zip
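
One thing to watch for: keys can spring into existence when you only meant to look. Testing a child key under a parent key that isn't there quietly creates an empty child hash for that parent (Perl calls this autovivification). The sketch below shows the effect and one way around it; the file names 'missing.zip' and 'other.zip' are made up for the example.

```perl
use strict;
use warnings;

my %file_attachments = (
    'test2.zip' => { 'price' => '18.00', 'desc' => 'the 2nd test' },
);

# Merely testing a nested key of a missing file creates an empty
# child hash for that file as a side effect.
my $careless = exists $file_attachments{'missing.zip'}->{'price'};
my $created  = exists $file_attachments{'missing.zip'};
print "missing.zip was created by accident\n" if $created;
delete $file_attachments{'missing.zip'};

# Checking the parent key first avoids that.
my $price;
if (exists $file_attachments{'other.zip'}) {
    $price = $file_attachments{'other.zip'}->{'price'};
}
my $still_absent = !exists $file_attachments{'other.zip'};
print "other.zip was left alone\n" if $still_absent;
```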

Likewise, adding a new element to the parent hash is just as easy. Say for example we wish to add test5.zip and all of its data to the hash of hashes. Here is how we would go about it.

     # Adding test5.zip and data to the hash.
$file_attachments{'test5.zip'} = { 'price' => '20.00', 'desc' => 'the last test' };

     # The value of $file_attachments{'test5.zip'}->{'price'};
     # is now: 20.00;

There you have it, that's most of what you need to know to work with a hash of hashes.
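
Direct access is the big win, but sometimes you do want every entry, for a price list say. Here is a minimal sketch of walking the whole structure; the @report array and the wording of each line are just for illustration.

```perl
use strict;
use warnings;

my %file_attachments = (
    'test1.zip' => { 'price' => '10.00', 'desc' => 'the 1st test' },
    'test2.zip' => { 'price' => '18.00', 'desc' => 'the 2nd test' },
);

# Hash keys come back in no particular order, so sort them
# to get a predictable listing.
my @report;
foreach my $file (sort keys %file_attachments) {
    my $details = $file_attachments{$file};   # reference to the child hash
    push @report, "$file costs $details->{'price'} ($details->{'desc'})";
}

print "$_\n" for @report;
```

Grabbing the child hash reference into $details once per pass keeps the inner lookups short and readable.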

One other really useful tool I should discuss is the Data::Dumper module, which ships with Perl and is also available from CPAN. This most handy module is great for seeing just what is in the hash of hashes (or any other data structure) at any given time. In our case, we would use it like so:

     # Finding out what data is in the hash of hashes.

use Data::Dumper;
print Dumper(%file_attachments);

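The output of that will look something like this (I have formatted it a bit to be more readable, and the order of the keys may differ when you run it). Because we handed Dumper the hash itself, it sees a flat list of keys and values, so each one gets its own $VAR number: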

$VAR1 = 'test1.zip';
$VAR2 = { 'desc' => 'the 1st test', 'price' => '10.00' };

$VAR3 = 'test2.zip';
$VAR4 = { 'desc' => 'the 2nd test', 'location' => '/var/www/html/test2.zip', 'price' => '18.00' };

$VAR5 = 'test3.zip';
$VAR6 = { 'desc' => 'the 3rd test', 'price' => '13.00' };

$VAR7 = 'test4.zip';
$VAR8 = { 'desc' => 'the 4th test', 'price' => '14.00' };

$VAR9 = 'test5.zip';
$VAR10 = { 'desc' => 'the last test', 'price' => '20.00' };
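
A variation worth knowing is to pass Dumper a reference to the hash (note the backslash) rather than the hash itself. The whole structure then comes out as a single nested $VAR1. The sketch below also sets $Data::Dumper::Sortkeys, a standard Data::Dumper setting, so the keys appear in sorted order; the trimmed-down data and the $dump variable are just for the example.

```perl
use strict;
use warnings;
use Data::Dumper;

my %file_attachments = (
    'test1.zip' => { 'price' => '10.00', 'desc' => 'the 1st test' },
    'test2.zip' => { 'price' => '18.00', 'desc' => 'the 2nd test' },
);

# Sort the keys at every level so the dump is the same on each run.
$Data::Dumper::Sortkeys = 1;

# The backslash passes a reference to the hash, so Dumper sees one
# structure instead of a flat list of keys and values.
my $dump = Dumper(\%file_attachments);
print $dump;
```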

This tute and the tute about arrays of hashes are not the be-all and end-all of Perl data constructs. You can take it as far as you like; for example, you could have a hash of arrays of hashes of hashes, although in my opinion you'd need a very good reason to want to do that. I would have long since moved to a relational database like MySQL for anything near that level of complexity.
Another good reason to use a relational database in some cases is size. If you are storing huge numbers of elements/values in a construct like this, they will all be held in the server's memory while the script is running, and the bigger the construct gets, the more memory it will consume. For example, parsing a 50,000 line log file and using it to dynamically populate a multidimensional array like this could conceivably run your server right out of memory. For a relatively small amount of data, though, using a multidimensional array is much faster than, say, a MySQL database for the insertion and extraction of data.
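
For what "dynamically populate" looks like in practice, here is a rough sketch that builds the hash of hashes from lines of text. The "name|price|description" line format and the @lines array are purely hypothetical stand-ins for whatever file you are really parsing.

```perl
use strict;
use warnings;

# Hypothetical input: one "name|price|description" record per line.
my @lines = (
    'test1.zip|10.00|the 1st test',
    'test2.zip|12.00|the 2nd test',
);

my %file_attachments;
foreach my $line (@lines) {
    my ($name, $price, $desc) = split /\|/, $line;
    # Each record becomes one child hash, keyed by file name.
    $file_attachments{$name} = { 'price' => $price, 'desc' => $desc };
}

my $file_no = scalar(keys(%file_attachments));
print "$file_no files loaded\n";
```

Every record read this way stays in memory for as long as the hash does, which is exactly why the size of the input matters.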

One last suggestion. I have noticed a trend among young coders: they discover multidimensional arrays and from that point on use them for everything they can possibly think of, I assume because of the "cool" or "l33t" factor. I strongly suggest you resist this urge. Multidimensional arrays are tools like any other: use them where they benefit you, and use something else where they don't.
Multidimensional arrays can add considerable complexity to a script that might otherwise have been quite simple, so only use the tool when it's appropriate.
