
Chapter 14. Tied Variables

Contents:

Tying Scalars
Tying Arrays
Tying Hashes
Tying Filehandles
A Subtle Untying Trap
Tie Modules on CPAN

Some human endeavors require a disguise. Sometimes the intent is to deceive, but more often, the intent is to communicate something true at a deeper level. For instance, many job interviewers expect you to dress up in a tie to indicate that you're seriously interested in fitting in, even though both of you know you'll never wear a tie on the job. It's odd when you think about it: tying a piece of cloth around your neck can magically get you a job. In Perl culture, the tie operator plays a similar role: it lets you create a seemingly normal variable that, behind the disguise, is actually a full-fledged Perl object that is expected to have an interesting personality of its own. It's just an odd bit of magic, like pulling Bugs Bunny out of a hat.

Put another way, the funny characters $, @, %, or * in front of a variable name tell Perl and its programmers a great deal--they each imply a particular set of archetypal behaviors. You can warp those behaviors in various useful ways with tie, by associating the variable with a class that implements a new set of behaviors. For instance, you can create a regular Perl hash, and then tie it to a class that makes the hash into a database, so that when you read values from the hash, Perl magically fetches data from an external database file, and when you set values in the hash, Perl magically stores data in the external database file. In this case, "magically" means "transparently doing something very complicated". You know the old saying: any technology sufficiently advanced is indistinguishable from a Perl script. (Seriously, people who play with the guts of Perl use magic as a technical term referring to any extra semantics attached to variables such as %ENV or %SIG. Tied variables are just an extension of that.)

Perl already has built-in dbmopen and dbmclose functions that magically tie hash variables to databases, but those functions date back to the days when Perl had no tie. Now tie provides a more general mechanism. In fact, Perl itself implements dbmopen and dbmclose in terms of tie.

You can tie a scalar, array, hash, or filehandle (via its typeglob) to any class that provides appropriately named methods to intercept and emulate normal accesses to those variables. The first of those methods is invoked at the point of the tie itself: tying a variable always invokes a constructor, which, if successful, returns an object that Perl squirrels away where you don't see it, down inside the "normal" variable. You can always retrieve that object later using the tied function on the normal variable:

tie VARIABLE, CLASSNAME, LIST;  # binds VARIABLE to CLASSNAME
$object = tied VARIABLE;

Those two lines are equivalent to:

$object = tie VARIABLE, CLASSNAME, LIST;

Once it's tied, you treat the normal variable normally, but each access automatically invokes methods on the underlying object; all the complexity of the class is hidden behind those method invocations. If later you want to break the association between the variable and the class, you can untie the variable:

untie VARIABLE;

You can almost think of tie as a funny kind of bless, except that it blesses a bare variable instead of an object reference. It also can take extra parameters, just as a constructor can--which is not terribly surprising, since it actually does invoke a constructor internally, whose name depends on which type of variable you're tying: either TIESCALAR, TIEARRAY, TIEHASH, or TIEHANDLE.[1] These constructors are invoked as class methods with the specified CLASSNAME as their invocant, plus any additional arguments you supplied in LIST. (The VARIABLE is not passed to the constructor.)

[1] Since the constructors have separate names, you could even provide a single class that implements all of them. That would allow you to tie scalars, arrays, hashes, and filehandles all to the same class, although this is not generally done, since it would make the other magical methods tricky to write.

These four constructors each return an object in the customary fashion. They don't really care whether they were invoked from tie, nor do any of the other methods in the class, since you can always invoke them directly if you'd like. In one sense, all the magic is in the tie, not in the class implementing the tie. It's just an ordinary class with funny method names, as far as the class is concerned. (Indeed, some tied modules provide extra methods that aren't visible through the tied variable; these methods must be called explicitly as you would any other object method. Such extra methods might provide services like file locking, transaction protection, or anything else an instance method might do.)

So these constructors bless and return an object reference just as any other constructor would. That reference need not refer to the same type of variable as the one being tied; it just has to be blessed, so that the tied variable can find its way back to your class for succor. For instance, our long TIEARRAY example will use a hash-based object, so it can conveniently hold additional information about the array it's emulating.
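As a minimal sketch of that point, here is a tied scalar whose underlying object is a hash rather than a scalar. The class name RememberingScalar and its history method are hypothetical, invented for this illustration; the hash simply gives the object room for bookkeeping next to the value.

```perl
package RememberingScalar;      # hypothetical class, for illustration only
use strict;
use warnings;

# The constructor receives the class name plus the LIST given to tie.
# The tied variable itself is not passed in.
sub TIESCALAR {
    my ($class, $initial) = @_;
    my $self = { value => $initial, history => [] };
    return bless $self, $class;     # a hash-based object behind a scalar
}

sub FETCH { $_[0]{value} }

sub STORE {
    my ($self, $value) = @_;
    push @{ $self->{history} }, $self->{value};   # remember the old value
    return $self->{value} = $value;
}

# An extra method, reachable only through the object, not the variable.
sub history { @{ $_[0]{history} } }

package main;

tie my $mood, "RememberingScalar", "calm";
$mood = "curious";
$mood = "delighted";

my @past = tied($mood)->history;
print "now: $mood; before: @past\n";   # now: delighted; before: calm curious
```

The extra method is invisible through $mood itself; it has to be called on the object that tied returns.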

The tie function will not use or require a module for you--you must do that yourself explicitly, if necessary, before calling the tie. (On the other hand, the dbmopen function will, for backward compatibility, attempt to use one or another DBM implementation. But you can preempt its selection with an explicit use, provided the module you use is one of the modules in dbmopen's list of modules to try. See the online docs for the AnyDBM_File module for a fuller explanation.)

The methods called by a tied variable have predetermined names like FETCH and STORE, since they're invoked implicitly (that is, triggered by particular events) from within the innards of Perl. These names are in ALLCAPS, a convention we often follow for such implicitly called routines. (Other special names that follow this convention include BEGIN, CHECK, INIT, END, DESTROY, and AUTOLOAD, not to mention UNIVERSAL->VERSION. In fact, nearly all of Perl's predefined variables and filehandles are in uppercase: STDIN, SUPER, CORE, CORE::GLOBAL, DATA, @EXPORT, @INC, @ISA, @ARGV, and %ENV. Of course, built-in operators and pragmas go to the opposite extreme and have no capitals at all.)

The first thing we'll cover is extremely simple: how to tie a scalar variable.

14.1. Tying Scalars

To implement a tied scalar, a class must define the following methods: TIESCALAR, FETCH, and STORE (and possibly DESTROY). When you tie a scalar variable, Perl calls TIESCALAR. When you read the tied variable, it calls FETCH, and when you assign a value to the variable, it calls STORE. If you've kept the object returned by the initial tie (or if you retrieve it later using tied), you can access the underlying object yourself--this does not trigger its FETCH or STORE methods. As an object, it's not magical at all, but rather quite objective.

If a DESTROY method exists, Perl invokes it when the last reference to the tied object disappears, just as for any other object. That happens when your program ends or when you call untie, which eliminates the reference used by the tie. However, untie doesn't eliminate any outstanding references you might have stored elsewhere; DESTROY is deferred until those references are gone, too.
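The deferral can be observed with a small sketch. The class name Noisy and its @log array are hypothetical; the class only records when its DESTROY method runs.

```perl
package Noisy;                  # hypothetical class, for illustration only
use strict;
use warnings;

our @log;                       # records events in the order they happen

sub TIESCALAR { my $class = shift; my $value; return bless \$value, $class }
sub FETCH     { ${ $_[0] } }
sub STORE     { ${ $_[0] } = $_[1] }
sub DESTROY   { push @log, "DESTROY" }

package main;

my $object = tie my $var, "Noisy";   # we keep a second reference

{
    no warnings "untie";        # silence the "inner references" warning
    untie $var;                 # drops only the reference held by the tie
}
push @Noisy::log, "after untie";

undef $object;                  # last reference gone, so DESTROY runs now
push @Noisy::log, "after undef";

print join(", ", @Noisy::log), "\n";   # after untie, DESTROY, after undef
```

Without the no warnings line, Perl would warn at the untie that an inner reference still exists, which is usually a hint worth heeding.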

The Tie::Scalar and Tie::StdScalar packages, both found in the standard Tie::Scalar module, provide some simple base class definitions if you don't want to define all of these methods yourself. Tie::Scalar provides elemental methods that do very little, and Tie::StdScalar provides methods that make a tied scalar behave like a regular Perl scalar. (Which seems singularly useless, but sometimes you just want a bit of a wrapper around the ordinary scalar semantics, for example, to count the number of times a particular variable is set.)
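Here is one way such a wrapper might look: a sketch that inherits from Tie::StdScalar and overrides only STORE in order to count assignments. The class name CountedScalar and its stores method are hypothetical, and keeping the counts in a hash keyed by object address is just one possible design.

```perl
package CountedScalar;          # hypothetical class, for illustration only
use strict;
use warnings;
use Tie::Scalar;                # this one file defines Tie::StdScalar too
use Scalar::Util qw(refaddr);

our @ISA = ("Tie::StdScalar");  # inherit TIESCALAR and FETCH

my %stores;                     # store counts, keyed by object address

sub STORE {
    my ($self, $value) = @_;
    $stores{ refaddr $self }++;
    return $self->SUPER::STORE($value);   # then do the ordinary thing
}

sub stores  { $stores{ refaddr $_[0] } || 0 }
sub DESTROY { delete $stores{ refaddr $_[0] } }

package main;

tie my $total, "CountedScalar";
$total  = 10;                   # one STORE
$total += 5;                    # a FETCH, then another STORE
my $times = tied($total)->stores;
print "value $total, set $times times\n";   # value 15, set 2 times
```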

Before we show you our elaborate example and complete description of all the mechanics, here's a taste just to whet your appetite--and to show you how easy it really is. Here's a complete program:

#!/usr/bin/perl
package Centsible;
sub TIESCALAR { bless \my $self, shift }
sub STORE { ${ $_[0] } = $_[1] }  # do the default thing
sub FETCH { sprintf "%.02f", ${ my $self = shift } } # round value

package main;
tie $bucks, "Centsible";
$bucks = 45.00;
$bucks *= 1.0715; # tax
$bucks *= 1.0715; # and double tax!
print "That will be $bucks, please.\n";

When run, that program produces:

That will be 51.67, please.

To see the difference it makes, comment out the call to tie; then you'll get:

That will be 51.66505125, please.

Admittedly, that's more work than you'd normally go through to round numbers.

14.1.1. Scalar-Tying Methods

Now that you've seen a sample of what's to come, let's develop a more elaborate scalar-tying class. Instead of using any canned package for the base class (especially since scalars are so simple), we'll look at each of the four methods in turn, building an example class named ScalarFile. Scalars tied to this class contain regular strings, and each such variable is implicitly associated with a file where that string is stored. (You might name your variables to remind you which file you're referring to.) Variables are tied to the class this way:

use ScalarFile;       # load ScalarFile.pm
tie $camel, "ScalarFile", "/tmp/camel.lot";

Once the variable has been tied, its previous contents are clobbered, and the internal connection between the variable and its object overrides the variable's normal semantics. When you ask for the value of $camel, it now reads the contents of /tmp/camel.lot, and when you assign a value to $camel, it writes the new contents out to /tmp/camel.lot, obliterating any previous occupants.

The tie is on the variable, not the value, so the tied nature of a variable does not propagate across assignment. For example, let's say you copy a variable that's been tied:

$dromedary = $camel;

Instead of reading the value in the ordinary fashion from the $camel scalar variable, Perl invokes the FETCH method on the associated underlying object. It's as though you'd written this:

$dromedary = (tied $camel)->FETCH();

Or if you remember the object returned by tie, you could use that reference directly, as in the following sample code:

$clot = tie $camel, "ScalarFile", "/tmp/camel.lot";
$dromedary = $camel;          # through the implicit interface
$dromedary = $clot->FETCH();  # same thing, but explicitly

If the class provides methods besides TIESCALAR, FETCH, STORE, and DESTROY, you could use $clot to invoke them manually. However, one normally minds one's own business and leaves the underlying object alone, which is why you often see the return value from tie ignored. You can still get at the object via tied if you need it later (for example, if the class happens to document any extra methods you need). Ignoring the returned object also eliminates certain kinds of errors, which we'll cover later.
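To make the "variable, not the value" rule concrete, here is a small self-contained sketch. It uses a hypothetical class named Shouter (not ScalarFile) so that it can run without any files; the class uppercases whatever is read through the tie.

```perl
package Shouter;                # hypothetical class, for illustration only
use strict;
use warnings;

sub TIESCALAR { my $class = shift; my $text = ""; return bless \$text, $class }
sub FETCH     { uc ${ $_[0] } }            # reads come back in uppercase
sub STORE     { ${ $_[0] } = $_[1] }

package main;

my $clot = tie my $camel, "Shouter";
$camel = "one hump";

my $dromedary = $camel;         # FETCH runs; only the value is copied
$dromedary    = "two humps";    # ordinary assignment, no STORE involved

print "camel:     $camel\n";                      # ONE HUMP
print "dromedary: $dromedary\n";                  # two humps
print "explicit:  ", $clot->FETCH(), "\n";        # ONE HUMP
print "copy is ", (tied $dromedary ? "tied" : "not tied"), "\n";   # not tied
```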

Here's the preamble of our class, which we will put into ScalarFile.pm:

package ScalarFile;
use Carp;                # Propagate error messages nicely.
use strict;              # Enforce some discipline on ourselves.
use warnings;            # Turn on lexically scoped warnings.
use warnings::register;  # Allow user to say "use warnings 'ScalarFile'".
my $count = 0;           # Internal count of tied ScalarFiles.

The standard Carp module exports the carp, croak, and confess subroutines, which we'll use in the code later in this section. As usual, see Chapter 32, "Standard Modules", or the online docs for more about Carp.

The following methods are defined by the class.

CLASSNAME->TIESCALAR(LIST)

The TIESCALAR method of the class is triggered whenever you tie a scalar variable. The optional LIST contains any parameters needed to initialize the object properly. (In our example, there is only one parameter: the name of the file.) The method should return an object, but this doesn't have to be a reference to a scalar. In our example, though, it is.

sub TIESCALAR {           # in ScalarFile.pm
    my $class    = shift;
    my $filename = shift;
    $count++;             # A file-scoped lexical, private to class.
    return bless \$filename, $class;
}

Since there's no scalar equivalent to the anonymous array and hash composers, [] and {}, we merely bless a lexical variable's referent, which effectively becomes anonymous as soon as the name goes out of scope. This works fine (you could do the same thing with arrays and hashes) as long as the variable really is lexical. If you try this trick on a global, you might think you're getting away with it, until you try to create another camel.lot. Don't be tempted to write something like this:

sub TIESCALAR { bless \$_[1], $_[0] }    # WRONG, could refer to global.

A more robustly written constructor might check that the filename is accessible. We check first to see if the file is readable, since we don't want to clobber the existing value. (In other words, we shouldn't assume the user is going to write first. They might be treasuring their old Camel Lot file from a previous run of the program.) If we can't open or create the filename specified, we'll indicate the error gently by returning undef and optionally printing a warning via carp. (We could just croak instead--it's a matter of taste whether you prefer fish or frogs.) We'll use the warnings pragma to determine whether the user is interested in our warning:

sub TIESCALAR {           # in ScalarFile.pm
    my $class    = shift;
    my $filename = shift;
    my $fh;
    if (open $fh, "<", $filename or
        open $fh, ">", $filename)
    {
        close $fh;
        $count++;
        return bless \$filename, $class;
    }
    carp "Can't tie $filename: $!" if warnings::enabled();
    return;
}

Given such a constructor, we can now associate the scalar $string with the file camel.lot:

tie ($string, "ScalarFile", "camel.lot") or die;

(We're still assuming some things we shouldn't. In a production version of this, we'd probably open the filehandle once and remember the filehandle as well as the filename for the duration of the tie, keeping the handle exclusively locked with flock the whole time. Otherwise we're open to race conditions--see "Timing Glitches" in Chapter 23, "Security".)
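A rough sketch of that production-minded variant might look like the following. Everything here is an assumption made for illustration: the class name LockedScalarFile, the hash layout of the object, and the choice of sysopen plus an exclusive flock held for the life of the tie. It is not the ScalarFile class developed in this section.

```perl
package LockedScalarFile;       # hypothetical variant, not the book's class
use strict;
use warnings;
use Carp;
use Fcntl qw(:DEFAULT :flock :seek);
use IO::Handle;                 # for autoflush

sub TIESCALAR {
    my ($class, $filename) = @_;
    # Open for reading and writing, creating the file if needed,
    # without truncating whatever is already there.
    sysopen(my $fh, $filename, O_RDWR | O_CREAT)
        or do { carp "Can't open $filename: $!"; return };
    flock($fh, LOCK_EX)
        or do { carp "Can't lock $filename: $!"; return };
    $fh->autoflush(1);
    return bless { name => $filename, fh => $fh }, $class;
}

sub FETCH {
    my $self = shift;
    my $fh   = $self->{fh};
    seek($fh, 0, SEEK_SET) or croak "can't rewind $self->{name}: $!";
    local $/;                   # slurp mode: read the whole file
    my $value = <$fh>;
    return defined $value ? $value : "";
}

sub STORE {
    my ($self, $value) = @_;
    my $fh = $self->{fh};
    seek($fh, 0, SEEK_SET)  or croak "can't rewind $self->{name}: $!";
    truncate($fh, 0)        or croak "can't truncate $self->{name}: $!";
    print {$fh} $value      or croak "can't write to $self->{name}: $!";
    return $value;
}

package main;
use File::Temp qw(tempfile);

my (undef, $path) = tempfile(UNLINK => 1);   # scratch file for the demo
tie my $string, "LockedScalarFile", $path or die "tie failed";
$string  = "first line\n";
$string .= "second line\n";
print $string;
```

The lock goes away when the handle is closed, which happens when the object is destroyed, so an untie with no other outstanding references releases the file.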

SELF->FETCH

This method is invoked whenever you access the tied variable (that is, read its value). It takes no arguments beyond the object tied to the variable. In our example, that object contains the filename.

sub FETCH {
    my $self  = shift;
    confess "I am not a class method" unless ref $self;
    return unless open my $fh, "<", $$self;  # three-arg open: read only
    read($fh, my $value, -s $fh);  # NB: don't use -s on pipes!
    return $value;
}

This time we've decided to blow up (raise an exception) if FETCH gets something other than a reference. (Either it was invoked as a class method, or someone miscalled it as a subroutine.) There's no other way for us to return an error, so it's probably the right thing to do. In fact, Perl would have raised an exception in any event as soon as we tried to dereference $self; we're just being polite and using confess to spew a complete stack backtrace onto the user's screen. (If that can be considered polite.)

We can now see the contents of camel.lot when we say this:

tie($string, "ScalarFile", "camel.lot");
print $string;

SELF->STORE(VALUE)

This method is run when the tied variable is set (assigned). The first argument, SELF, is as always the object associated with the variable; VALUE is whatever was assigned to the variable. (We use the term "assigned" loosely--any operation that modifies the variable can call STORE.)

sub STORE {
    my($self,$value) = @_;
    ref $self                   or confess "not a class method";
    open my $fh, ">", $$self    or croak "can't clobber $$self: $!";
    syswrite($fh, $value) == length $value
                                or croak "can't write to $$self: $!";
    close $fh                   or croak "can't close $$self: $!";
    return $value;
}

After "assigning" it, we return the new value--because that's what assignment does. If the assignment wasn't successful, we croak out the error. Possible causes might be that we didn't have permission to write to the associated file, or the disk filled up, or gremlins infested the disk controller. Sometimes you control the magic, and sometimes the magic controls you.

We can now write to camel.lot when we say this:

tie($string, "ScalarFile", "camel.lot");
$string  = "Here is the first line of camel.lot\n";
$string .= "And here is another line, automatically appended.\n";
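The append in that last line is a read followed by a write, which is why "assigned" is meant loosely. A small tracing class makes the sequence visible; the class name Traced and its @calls array are hypothetical, and the value is kept in memory rather than in a file so the sketch runs anywhere.

```perl
package Traced;                 # hypothetical class, for illustration only
use strict;
use warnings;

our @calls;                     # names of the methods Perl invoked, in order

sub TIESCALAR { my $class = shift; my $text = ""; return bless \$text, $class }
sub FETCH     { push @calls, "FETCH"; ${ $_[0] } }
sub STORE     { push @calls, "STORE"; ${ $_[0] } = $_[1] }

package main;

tie my $string, "Traced";
$string  = "first ";            # plain assignment: STORE
$string .= "second";            # append: FETCH the old value, STORE the new

print "@Traced::calls\n";       # STORE FETCH STORE
```

For a file-backed class like ScalarFile, that means every append rereads and rewrites the whole file.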

SELF->DESTROY

This method is triggered when the object associated with the tied variable is about to be garbage collected, in case it needs to do something special to clean up after itself. As with other classes, such a method is seldom necessary, since Perl deallocates the moribund object's memory for you automatically. Here, we'll define a DESTROY method that decrements our count of tied files:

sub DESTROY {
    my $self = shift;
    confess "wrong type" unless ref $self;
    $count--;
}

We might then also supply an extra class method to retrieve the current count. Actually, it doesn't care whether it's called as a class method or an object method, but you don't have an object anymore after the DESTROY, now do you?

sub count {
    # my $invocant = shift;
    $count;
}

You can call this as a class method at any time like this:

if (ScalarFile->count) {
    warn "Still some tied ScalarFiles sitting around somewhere...\n";
}

That's about all there is to it. Actually, it's more than all there is to it, since we've done a few nice things here for the sake of completeness, robustness, and general aesthetics (or lack thereof). Simpler TIESCALAR classes are certainly possible.

14.1.2. Magical Counter Variables

Here's a simple Tie::Counter class, inspired by the CPAN module of the same name. Variables tied to this class increment themselves by 1 every time they're used. For example:

tie my $counter, "Tie::Counter", 100;
@array = qw /Red Green Blue/;
for my $color (@array) {               # Prints:
    print "  $counter  $color\n";      #   100  Red
}                                      #   101  Green
                                       #   102  Blue

The constructor takes as an optional extra argument the first value of the counter, which defaults to 0. Assigning to the counter will set a new value. Here's the class:

package Tie::Counter;
sub FETCH     { ${ $_[0] }++ }  # return the current value, then increment
sub STORE     { ${ $_[0] } = $_[1] }
sub TIESCALAR {
    my ($class, $value) = @_;
    $value = 0 unless defined $value;
    bless \$value => $class;
}
1;  # if in module

See how small that is? It doesn't take much code to put together a class like this.
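A short sketch of how such a counter behaves when it is read, peeked at, and reset. To keep the example self-contained it defines its own stand-in class under the hypothetical name MyCounter; the stand-in returns the current value and then increments, so the first read yields the starting value.

```perl
package MyCounter;              # hypothetical stand-in, for illustration only
use strict;
use warnings;

sub TIESCALAR {
    my ($class, $value) = @_;
    $value = 0 unless defined $value;
    return bless \$value, $class;
}
sub FETCH { ${ $_[0] }++ }      # hand back the current value, then bump it
sub STORE { ${ $_[0] } = $_[1] }

package main;

tie my $counter, "MyCounter", 100;
my $first  = $counter;          # 100
my $second = $counter;          # 101

my $object = tied $counter;
my $peek   = $$object;          # 102: looking at the object skips FETCH

$counter = 7;                   # STORE sets a new starting point
my $third = $counter;           # 7

print "$first $second $peek $third\n";   # 100 101 102 7
```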

14.1.3. Magically Banishing $_

This curiously exotic tie class is used to outlaw unlocalized uses of $_. Instead of pulling in the module with use, which invokes the class's import method, this module should be loaded with no to call the seldom-used unimport method. The user says:

no Underscore;

And then all uses of $_ as an unlocalized global raise an exception.

Here's a little test suite for the module:

#!/usr/bin/perl
no Underscore;
@tests = (
    "Assignment"  => sub { $_ = "Bad" },
    "Reading"     => sub { print },
    "Matching"    => sub { $x = /badness/ },
    "Chop"        => sub { chop },
    "Filetest"    => sub { -x },
    "Nesting"     => sub { for (1..3) { print } },
);

while ( ($name, $code) = splice(@tests, 0, 2) ) {
    print "Testing $name: ";
    eval { &$code };
    print $@ ? "detected" : " missed!";
    print "\n";
}

which prints out the following:

Testing Assignment: detected
Testing Reading: detected
Testing Matching: detected
Testing Chop: detected
Testing Filetest: detected
Testing Nesting: 123 missed!

The last one was "missed" because it was properly localized by the for loop and thus safe to access.

Here's the curiously exotic Underscore module itself. (Did we mention that it's curiously exotic?) It works because tied magic is effectively hidden by a local. The module does the tie in its own initialization code so that a require also works.

package Underscore;
use Carp;
sub TIESCALAR { bless \my $dummy => shift }
sub FETCH { croak 'Read access to $_ forbidden'  }
sub STORE { croak 'Write access to $_ forbidden' }
sub unimport { tie($_, __PACKAGE__) }
sub import   { untie $_ }
tie($_, __PACKAGE__) unless tied $_;
1;

It's hard to usefully mix calls to use and no for this class in your program, because they all happen at compile time, not run time. You could call Underscore->import and Underscore->unimport directly, just as use and no do. Normally, though, to renege and let yourself freely use $_ again, you'd just use local on it, which is the whole point.
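The run-time route can be sketched in a single file by defining the package inline and calling the two methods by hand. The variable names below are hypothetical, and the sketch assumes a reasonably recent Perl, where local $_ hides the tie for the duration of the block.

```perl
package Underscore;             # same methods as the listing above
use strict;
use warnings;
use Carp;
sub TIESCALAR { bless \my $dummy => shift }
sub FETCH     { croak 'Read access to $_ forbidden'  }
sub STORE     { croak 'Write access to $_ forbidden' }
sub unimport  { tie($_, __PACKAGE__) }
sub import    { untie $_ }

package main;

Underscore->unimport;           # what "no Underscore" does, but at run time

my $blocked = !eval { my $copy = $_; 1 };   # FETCH croaks inside the eval

my $inside;
{
    local $_ = "localized";     # the tie is hidden within this block
    $inside = $_;
}

Underscore->import;             # what "use Underscore" does: untie $_
$_ = "free again";

print "blocked while tied\n" if $blocked;
print "inside the block: $inside\n";
print "after import: $_\n";
```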




Copyright © 2001 O'Reilly & Associates. All rights reserved.