My boss tells me I have "an unhealthy fascination with map". That may be true, but it's only because map is a perfect example of what makes Perl Perl. Let's take a quick stroll through some of the ins and outs of this wondrous little function, shall we?
Here's the setup. You have a list. You want to make another list with just as many elements, where every element of the new list is created by some sort of operation on the corresponding element of the first list. Huh, you say? Okay, for a concrete example, let's say you have a list of words, and you want a list of the lengths of those words. Your first instinct may be a loop:
my @lengths;
foreach my $word (@words) {
    push @lengths, length($word);
}
You may even think a fancy statement modifier would do the trick nicely:
push @lengths, length($_) foreach @words;
But what you really want is map:
my @lengths = map {length($_)} @words;
or alternatively:
my @lengths = map(length($_), @words);
It stands head and shoulders above the loop implementation for several reasons:
Building on this, we can use map to build hashes:
my %word_length = map { $_ => length($_) } @words;
Note how in the case above, map returns two list elements for each element of @words (remember that the "fat comma" operator "=>" is still basically just a comma, so the expression evaluates to a two-element list). This sort of approach can also be used to create longer lists, e.g.:
my @repeated_words = map { ($_) x length($_) } @words;
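To see both of these in action, here is a small sketch. The sample words are made up for illustration; everything else is just the two statements above run against them.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the original examples.
my @words = qw(a bb ccc);

# Each input element contributes a key/value pair to the output list.
my %word_length = map { $_ => length($_) } @words;

# Each input element contributes length($_) copies of itself.
my @repeated_words = map { ($_) x length($_) } @words;

print "$_ => $word_length{$_}\n" for sort keys %word_length;
print "@repeated_words\n";   # a bb bb ccc ccc ccc
```

The output list has six elements even though the input had three, which is the point: map does not have to be one-in, one-out.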
You can embed one map inside another, but remember that $_ in the innermost expression will refer to something different from $_ in the outer expression:
@matrix = map {my $x = $_; map {[$x, $_]} @y_values} @x_values;
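Here is the same nesting with some made-up values plugged in, as a rough sketch of what comes out the other end. The variable name @pairs and the sample data are my own assumptions.

```perl
use strict;
use warnings;

# Hypothetical sample values.
my @x_values = (1, 2);
my @y_values = qw(a b);

# Copy the outer $_ into $x, because the inner map rebinds $_
# to each element of @y_values.
my @pairs = map { my $x = $_; map { [$x, $_] } @y_values } @x_values;

# Each element is an array reference holding one [x, y] combination.
print join(' ', map { '[' . join(',', @$_) . ']' } @pairs), "\n";
# prints: [1,a] [1,b] [2,a] [2,b]
```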
So when may it be desirable to opt for a loop when map could be used to accomplish the same task? Here are a few general guidelines:
Map, its kid brother grep, and (under certain circumstances) for/foreach have an interesting--er--feature, if you will. You see, the $_ that you use in your expression or code block is actually an alias for the element of the list that is being operated on. So if you change the value of $_, you are changing the original element itself (just like operating on @_ within a subroutine) and can therefore create all manner of side effects, intended or--more likely--otherwise. This is what is referred to in the Perl world as "action-at-a-distance". I recently had my first run-in with action-at-a-distance bugs as I was building an object-oriented system (a la Moose) which at some point in the hierarchy wraps some older code that I was much less familiar with. Now it seems perfectly reasonable to have a collection of objects and want to do something like get the sum of the values of a certain attribute. Something like:
sum map { $_->get_population } @cities   # sum() as provided by List::Util
Now if you like being lazy as any great programmer should, you might not actually construct those population attributes until they're needed, and Moose makes such deferred evaluation a snap. But what if, somewhere in the process of constructing the attribute, some code along the way decides to use $_ for its own nefarious purposes? This is exactly what I ran into. Somewhere someone decided to assign a value to $_ (which is not necessarily a bad thing, as there are occasions where this is quite convenient), but because $_ is a global variable, that assignment reached the map statement I had written, through many layers of code. My array, which originally contained blessed object references, all of a sudden had some stray strings, and sometimes undefs, in it. A most unwelcome turn of events.
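The real code is too tangled to reproduce here, but the failure mode boils down to something like the following sketch. The sub careless_population and the city names are hypothetical stand-ins for the legacy code and the Moose objects; plain strings are used so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical stand-in for code buried under a lazy attribute builder:
# it assigns to the global $_ without localizing it first.
sub careless_population {
    my ($city) = @_;          # copy the argument before anything else
    $_ = 'scratch value';     # clobbers whatever $_ is currently aliased to
    return length $city;      # pretend this is the expensive lookup
}

my @cities = ('Oslo', 'Lima', 'Cairo');
my @populations = map { careless_population($_) } @cities;

print "@populations\n";   # 4 4 5
print "@cities\n";        # scratch value scratch value scratch value
```

The mapped results look fine, which is what makes this kind of bug hard to spot: the damage is done to the input list, not the output.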
What to do? Well, there are a few precautions that can and should be taken to prevent such things from happening. First of all, any time $_ is used, it should be localized or even lexicalized in scope. Constructs which generate $_ for you typically automatically limit the scope to the construct itself (in the case of map, the map statement itself) as a byproduct of aliasing, i.e.:
$_ = 'a'; @b = 1..3; for (@b) {++$_}
In the outer scope, $_ still has the value "a", but incrementing $_ in the loop's scope has incremented all the values of the array @b so now @b is 2..4. Nutty, huh? But that value of "a" could be problematic to some other scope, so to be a good citizen, I should preface that first assignment with local or (if you have Perl 5.10) my. If the code I had been calling had taken such precautions, I wouldn't have had so much trouble. (A word of caution on the my option: lexical $_ was later marked experimental in Perl 5.18 and removed in 5.24, so local is the one that works everywhere.)
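A minimal sketch of what good citizenship looks like, assuming a hypothetical sub named good_citizen that needs $_ for its own work:

```perl
use strict;
use warnings;

my @b = (1 .. 3);

# Hypothetical helper that wants to use $_ internally.
sub good_citizen {
    local $_ = 'a';            # the caller's $_ comes back when the sub returns
    return "inside: $_";
}

$_ = 'outer';
for (@b) { ++$_ }              # $_ is aliased to each element of @b in turn
my $message = good_citizen();

print "@b\n";                  # 2 3 4
print "$_\n";                  # outer
print "$message\n";            # inside: a
```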
However, there are precautions that can be taken at the other end as well. Even though I went back and sanitized the underlying code to have good scoping hygiene, I still applied the following so that my code would stay robust if someone reintroduced the same problem at the lower level. So how can we prevent the value of the alias from changing? If we are sure that our expression/code-block doesn't change the value of $_ within its lexical scope, we could (again, if you have Perl 5.10) lexicalize $_. However, if you have no such guarantee, or you don't have 5.10, then you probably need to localize $_, like so:
@a = 100..200; @b = map { local $_ = $_; s/^/1/; $_ } @a;   # end with $_ because s/// returns the substitution count, not the string
This leaves @a intact but leaves @b equivalent to 1100..1200. If you want a simple way to do this more often, you can roll your own "safe" version of map:
sub safe_map (&@) { my $code = shift; map {local $_ = $_; $code->()} @_ }
Now you can just drop that in where you usually use map (except if you are using the EXPR form, which won't work with the prototype) and you can ensure your mapped lists won't get corrupted. There are ways the output list could still get corrupted by poorly-scoped subroutines, but that's a much easier problem to recognize and deal with.
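For what it's worth, here is roughly how that looks in use. The sample range is shortened for readability, and the block ends with $_ so that it returns the modified string rather than the return value of s///.

```perl
use strict;
use warnings;

# safe_map as defined above; the prototype has to be seen before the call
# so that the bare block is accepted as the first argument.
sub safe_map (&@) { my $code = shift; map { local $_ = $_; $code->() } @_ }

my @a = (100 .. 102);

# The block modifies $_, but only the localized copy.
my @b = safe_map { s/^/1/; $_ } @a;

print "@a\n";   # 100 101 102
print "@b\n";   # 1100 1101 1102
```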
Hopefully this is enough to make you want to use map but want to do so carefully. For with great power comes great responsibility. It is a beautiful tool, but in the wrong hands it can unleash a force so terrible, few have lived to tell the tale. So forge on, intrepid adventurers. Excelsior!