Tutorial: Perl 5.10 and Koha

The current stable version of Perl is 5.18.0 … but for very good reasons, Koha doesn’t require the latest and greatest. For a very long time, Koha required a minimum version of 5.8.8. It wasn’t until October 2011, nearly four years after Perl 5.10.0 was released, that a patch was pushed setting 5.10.0 as Koha’s minimum required version.

Why so long? Since Perl is used by a ton of core system scripts and utilities, OS packagers are reluctant to push ahead too quickly. Debian oldstable has 5.10.1 and Debian stable ships with 5.14.2. Ubuntu tracks Debian in this respect. RHEL5 ships with Perl 5.8 and won’t hit EOL until 2017.

RHEL5 takes it too far in my opinion, unless you really need that degree of stasis, and I’m personally not convinced that staying that far behind the cutting edge necessarily gives one much more in the way of security. Then again, I don’t work for a bank. Suffice it to say, if you must run a recent version of Koha on RHEL5, you have your work cut out for you: compiling Perl from a tarball or using something like Perlbrew to get at least 5.10 is a good idea. That will still leave you with rather a lot of modules to install from CPAN.

But since we, as Koha hackers, can count on having Perl 5.10, we can make the most of it. Here are a few constructs that were added in 5.10 that I find particularly useful for hacking on Koha.

Defined-OR operator

The defined-or operator, //, returns its left operand unless its value is undefined, in which case it returns the right operand. It lets you write:

my $a = get_a_possibly_undefined_value();
$a //= '';
print "Label: $a\n"; # won't throw a warning if the original value was undefined

or

my $a = get_a_possibly_undefined_value() // '';

rather than

my $a = get_a_possibly_undefined_value();
$a = '' unless defined($a);

or (horrors!)

my $a = get_a_possibly_undefined_value();
$a ||= ''; # if $a started out as 0...

Is this just syntactic sugar? Sure, but since Koha is a database-driven application whose schema has a lot of nullable columns, and since use of the Perl warnings pragma is mandated, it’s a handy one.
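To make the difference between // and || concrete, here is a minimal sketch. The helper names with_defined_or and with_logical_or are made up for this illustration and are not part of Koha; think of the values as what might come back from a nullable database column.

```perl
use strict;
use warnings;
use 5.010;    # the // operator needs Perl 5.10 or later

# Hypothetical helper: fall back to the default only when the value is undef.
sub with_defined_or {
    my ($value, $default) = @_;
    return $value // $default;
}

# Hypothetical helper: || also replaces defined-but-false values such as 0 and ''.
sub with_logical_or {
    my ($value, $default) = @_;
    return $value || $default;
}

for my $value (undef, 0, '', 5) {
    my $shown = defined $value ? "'$value'" : 'undef';
    printf "%-6s  // gives '%s'   || gives '%s'\n",
        $shown,
        with_defined_or($value, 'n/a'),
        with_logical_or($value, 'n/a');
}
```

Only the // version keeps a legitimate 0 or empty string intact, which is usually what you want when the value came from the database.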

Named capture buffers

This lets you give a name to a regular expression capture group, allowing you to use the name rather than (say) $1, $2, etc. For example, you can write

if ($str =~ /tag="(?<tag>[0-9]{3})"/ ){
    print $+{tag}, "\n"; # %+ is a magic hash that contains the named capture groups' contents
}

rather than

if ($str =~ /tag="([0-9]{3})"/ ){
    print $1, "\n";
}

There’s a bit of a trade-off with this because the regular expression is now a little more difficult to read. However, since the code that uses the results can avoid declaring unnecessary temporary variables and is more robust in the face of changes to the number of capture groups in the regex, that trade-off can be worth it.
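As a rough sketch of that robustness, the following pulls several named groups out of a MARCXML-like datafield start tag. The parse_datafield helper and the sample string are invented for this illustration; real Koha code would use a proper MARC or XML parser rather than a regex.

```perl
use strict;
use warnings;
use 5.010;    # named captures need Perl 5.10 or later

# Hypothetical helper: returns a hashref of the named captures, or undef on no match.
sub parse_datafield {
    my ($str) = @_;
    if ($str =~ /tag="(?<tag>[0-9]{3})"\s+ind1="(?<ind1>[^"]*)"\s+ind2="(?<ind2>[^"]*)"/) {
        # Copy %+ right away: its contents change with the next successful match.
        my %captures = %+;
        return \%captures;
    }
    return;
}

my $field = parse_datafield('<datafield tag="245" ind1="1" ind2="0">');
print "tag=$field->{tag} ind1=$field->{ind1} ind2=$field->{ind2}\n";
```

If another group is later added to the front of the regex, the code reading $field->{tag} keeps working, whereas code relying on $1 would silently pick up the wrong value.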


UNITCHECK blocks

The UNITCHECK block joins BEGIN, END, INIT and CHECK as a way of designating a block of code to execute at a specific point in the life of a Perl program. UNITCHECK code is executed right after the compilation unit that contains it (for example, a module file) has been compiled. In the patch I’m proposing for bug 10503, I found this handy to allow module initialization code to make use of functions defined in that same module.
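The timing difference between BEGIN and UNITCHECK can be seen in a small standalone script. This is only an illustration and not the code from the bug 10503 patch; the default_settings sub and the @events array are made up for the example.

```perl
use strict;
use warnings;
use 5.010;    # UNITCHECK needs Perl 5.10 or later

our @events;

BEGIN {
    # Runs as soon as this block has been parsed, before the sub further down is compiled.
    push @events, 'BEGIN: '
        . (defined &default_settings ? 'sub visible' : 'sub not yet visible');
}

UNITCHECK {
    # Runs once the whole file has been compiled, so every sub in it is available.
    push @events, 'UNITCHECK: '
        . (defined &default_settings ? 'sub visible' : 'sub not yet visible');
}

# Hypothetical initialization helper, defined after the blocks that want to use it.
sub default_settings { return { itemtype => 'BK' } }

print "$_\n" for @events;
```

Running it prints "BEGIN: sub not yet visible" followed by "UNITCHECK: sub visible", which is why UNITCHECK suits initialization code that calls functions defined later in the same file.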

Warning, warning!

There are some constructs that were added in Perl 5.10, including the given/when keywords and the smart match operator ~~, that were reclassified as experimental as of Perl 5.18 and now emit warnings when used. Consequently, I will say no more about them other than this: don’t use them! Maybe the RHEL5 adherents have a point after all.

CC BY-SA 4.0 Tutorial: Perl 5.10 and Koha by Galen Charlton is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

2 thoughts on “Tutorial: Perl 5.10 and Koha”

  1. Bummer about smartmatch operator, but named capture groups are a major advantage. After all these years, Perl continues to be the single best environment for Regular Expressions because of how much is baked into the language itself.
