Linux Magazine / July 1999 / PERL OF WISDOM
TIEing Up Loose Ends
 
by Randal L. Schwartz

Perl has a lot of cool stuff. Certainly, the basic: print "Hello, world!\n"; gets people started without knowing much about the language, but the question "Is there a way to do (X) in Perl?" can usually be answered "Yes!"

For example, the neat way that a DBM can appear rather transparently to be a hash in Perl is done with a mechanism called "tied variables". But tied variables aren't limited to DBMs -- we can make scalars, arrays, hashes, and filehandles all have similar magic.

What? Filehandles? Yes. Imagine a "magic" filehandle that appears to the rest of the program to be a normal filehandle (albeit already opened). But, every time the program "reads a line" from the filehandle, a subroutine gets invoked; and, for every operation on this so-called filehandle, a different subroutine gets invoked. Well, that's what a tied filehandle does.
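To make that concrete before we get to the real thing, here is a minimal sketch of a tied filehandle. It is not the IncludeHandle class this column builds: the class name LineList and the handle name DEMO are made up for illustration. TIEHANDLE and READLINE, on the other hand, are the method names Perl itself looks for when a handle is tied and then read.

```perl
use strict;
use warnings;

package LineList;    # hypothetical class, not part of the column's code

# Called by tie(); stores the lines that later "reads" will hand back.
sub TIEHANDLE {
    my ($class, @lines) = @_;
    return bless { lines => [@lines] }, $class;
}

# Called for every <HANDLE>; wantarray tells us scalar vs. list context.
sub READLINE {
    my $self = shift;
    if (wantarray) {
        my @rest = @{ $self->{lines} };
        $self->{lines} = [];
        return @rest;
    }
    return shift @{ $self->{lines} };    # undef once the list is empty
}

package main;

tie *DEMO, 'LineList', "one\n", "two\n", "three\n";

my $first = <DEMO>;    # scalar context: one line
my @rest  = <DEMO>;    # list context: everything that is left
print $first, @rest;
```

The rest of the program just sees a handle named DEMO; no file is involved at all, because every read is answered by the READLINE subroutine.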

One use of a magic filehandle is to automatically expand "include" specifications, where some part of the contents indicates that other files must be consulted as well. For example, Perl's require operator brings in additional Perl code from other files, and the C preprocessor (CPP) looks for lines like #include "file.h" to bring in more C code.

The advantage of the filehandle having all the smarts is that we can re-use existing code or libraries that expect a filehandle, and yet get the include-file expansion done transparently.

Listing One: ihtest

 1  #!/usr/bin/perl -w
 2  use strict;
 3
 4  use IncludeHandle;
 5
 6  {
 7    local *FRED;
 8    tie *FRED, 'IncludeHandle', "localfile", q/^#include "(.*)"/
 9      or die "Cannot tie: $!";
10
11    while (<FRED>) {
12      print;
13    }
14  }
15
16  {
17    local *BARNEY;
18    tie *BARNEY, 'IncludeHandle', "localfile", qr/^#include "(.*)"/
19      or die "Cannot tie: $!";
20
21    my @a = <BARNEY>;
22    print @a;
23
24  }
25
26  {
27    local *DINO;
28    tie *DINO, 'IncludeHandle', "localfile", sub {
29      /^#include \"(.*)\"/ ? $1 :
        /^#include <(.*)>/ ? $1 : undef
30    }
31      or die "Cannot tie: $!";
32
33    print <DINO>;
34  }
 


While getting a filehandle to be tied may seem difficult, the process is actually rather straightforward. Just create a class library (IncludeHandle, for example), and then create handles with tie rather than open. To demonstrate this, I've written a program that uses the IncludeHandle class. [See Listing One.]

Lines 1 and 2 turn on warnings, and enable compiler restrictions.

Line 4 pulls in the IncludeHandle module, described later. Because this is an object-oriented class, we won't be importing any functions.

Lines 6 through 14 demonstrate the first use of a handle tied to IncludeHandle. I'm setting up a "naked block" (a block that is not otherwise part of a larger construct, like an if or a while), so that the local on the soon-to-be tied (or is that fit to be tied?) filehandle found in *FRED will disappear when I'm done.

The local *FRED in line 7 creates a temporary value for all kinds of things that share the name FRED. One of these is our filehandle, and although the others (like $FRED and %FRED) are also localized, that doesn't make much difference. This temporary value will get undone in line 14.
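Here's a small sketch of that side effect, separate from Listing One. The variable $seen_inside is just a name I made up to capture what the block sees; the point is that local *FRED hides the scalar $FRED too, and that the old value comes back when the block ends.

```perl
use strict;
use warnings;

our $FRED = "outer value";    # a package variable that shares the name FRED
my $seen_inside;

{
    local *FRED;              # saves every FRED slot: $FRED, @FRED, %FRED, the handle...
    $seen_inside = defined $FRED ? $FRED : "(undef)";
}                             # ...and restores them all here

print "inside: $seen_inside\n";    # prints "inside: (undef)"
print "after:  $FRED\n";           # prints "after:  outer value"
```

For a program like Listing One, which has no other variables named FRED, the extra localizing is harmless.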

Lines 8 and 9 tie the filehandle FRED (indicated by passing the symbol name *FRED to tie), using the designated parameters. The first parameter must be a class name (a package name with certain subroutines defined within that package). Here, I've designated the IncludeHandle class to handle the tie. The parameters of localfile and a quoted string that looks like a C-language preprocessor include-file directive get passed to the TIEHANDLE method, described later. If this succeeds, the tie returns true; otherwise, the die is executed with $! having an appropriate brief error code.
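The "or die" idiom works because tie hands back whatever the class's TIEHANDLE method returned. As a hedged sketch (this is not the IncludeHandle code, and the class name OpenOrFail and the path are invented for the demonstration), a TIEHANDLE that returns undef when open fails leaves $! set for the caller:

```perl
use strict;
use warnings;

package OpenOrFail;    # hypothetical class, for illustration only

sub TIEHANDLE {
    my ($class, $file) = @_;
    open(my $fh, '<', $file)
        or return undef;           # $! still holds the reason open failed
    return bless { fh => $fh }, $class;
}

sub READLINE {
    my $fh = $_[0]{fh};
    return wantarray ? <$fh> : scalar <$fh>;
}

package main;

my $missing = "/no/such/directory/localfile";    # assumed not to exist
my $message = "";

{
    local *FRED;
    tie *FRED, 'OpenOrFail', $missing
        or $message = "Cannot tie: $!";    # a real program would die here
}

print "$message\n";
```

The sketch stores the message instead of dying only so that it runs to completion; the exact wording of $! depends on the operating system.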

I've defined the first parameter after the classname to be treated as a filename to open. You can think of this as if it were:

open(FRED, "localfile") or die "Cannot open: $!";

except that any include files (denoted by lines that match the second additional parameter after the classname) will be expanded in place. Thus, the normal-looking loop in lines 11 through 13 will dump out the contents of this file. If any line of localfile matches the pattern ^#include "(.*)", however, the part returned as $1 in that pattern will be opened as a new file, and its contents will be inserted in place of the line. This is a recursive operation: included files may themselves contain include-file lines. We'll see how this all works later when I describe the class file.
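To show just the recursive expansion, without any tie magic, here is a sketch written as an ordinary function. It is my illustration of the idea, not how the IncludeHandle class does it; the function name expand_includes is invented, and the demo writes its two files into a temporary directory (with the full path in the include line) so that it can run anywhere.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Returns the lines of $file, replacing each line that matches $pattern
# with the (recursively expanded) contents of the file captured in $1.
sub expand_includes {
    my ($file, $pattern) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    my @out;
    while (defined(my $line = <$fh>)) {
        if ($line =~ $pattern) {
            my $name = $1;    # copy before recursing; $1 changes on the next match
            push @out, expand_includes($name, $pattern);
        }
        else {
            push @out, $line;
        }
    }
    close $fh;
    return @out;
}

# Build two throwaway files so the sketch runs anywhere.
my $dir     = tempdir(CLEANUP => 1);
my $incfile = File::Spec->catfile($dir, "incfile");
my $local   = File::Spec->catfile($dir, "localfile");

open(my $inc, '>', $incfile) or die "Cannot write $incfile: $!";
print $inc "111\n222\n333\n";
close $inc;

open(my $loc, '>', $local) or die "Cannot write $local: $!";
print $loc qq{aaa\n#include "$incfile"\nbbb\n};
close $loc;

my @expanded = expand_includes($local, qr/^#include "(.*)"/);
print @expanded;
```

Note that this sketch has no guard against a file that includes itself; it would simply recurse until Perl runs out of filehandles or memory.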

Lines 16 to 24 show a similar example. Note, however, that the include-file pattern specification is being passed as a compiled regular expression, rather than just a string. That's helpful if the tie is being executed in a loop, so that the expression doesn't have to continue to be recompiled on each iteration. Again, I'm just showing off the versatility of this particular tie usage.

Also note here that lines 21 and 22 invoke the filehandle read-line operator in a list context instead of a scalar context. We'll see later how this is supported.

Lines 26 to 34 show a more interesting and complicated use of that same second parameter. If the parameter is a "coderef" (a reference to a named or anonymous subroutine), then the subroutine is called for each line read from the file, with $_ set to the line. The subroutine can return undef to indicate that the line is an ordinary text line to be returned as part of the read operation, or can return a string indicating a new filename to open.

Lines 28 through 30 define an anonymous subroutine that looks for two different kinds of include lines -- both kinds that the C-preprocessor understands. If we wanted, we could even create a "search path" for names found within angle brackets, just like the C-preprocessor. Again, note in line 33 that we're invoking the read-line operator in a list context, here being passed directly back to print.
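One way a class could accept all three forms of that second parameter is to boil each of them down to a coderef up front. The following is only a sketch under that assumption, not the IncludeHandle implementation; make_matcher and included_file are names I've invented for it.

```perl
use strict;
use warnings;

# Turn a string, a qr// object, or a coderef into one uniform coderef that
# inspects $_ and returns a filename to include, or undef for ordinary text.
sub make_matcher {
    my $spec = shift;
    return $spec if ref($spec) eq 'CODE';
    my $re = ref($spec) eq 'Regexp' ? $spec : qr/$spec/;
    return sub { $_ =~ $re ? $1 : undef };
}

# Helper for the demo: run a matcher against one line.
sub included_file {
    my ($matcher, $line) = @_;
    local $_ = $line;
    return $matcher->();
}

my $from_string = make_matcher(q/^#include "(.*)"/);
my $from_qr     = make_matcher(qr/^#include "(.*)"/);
my $from_code   = make_matcher(sub {
    /^#include "(.*)"/ ? $1 :
    /^#include <(.*)>/ ? $1 : undef
});

for my $line (qq{#include "incfile"\n}, "#include <incfile>\n", "aaa\n") {
    for my $m ($from_string, $from_qr, $from_code) {
        my $file = included_file($m, $line);
        print defined $file ? "include $file\n" : "plain text\n";
    }
}
```

With everything reduced to a coderef, the code that reads lines needs only one test: call the matcher, and open a new file if it returns something defined.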

To test this, we can create a local file localfile that might look like this:

aaa
#include "incfile"
bbb
#include <incfile>
ccc

and then another file incfile that contains this:

111
222
333

and the output will look like Figure 1.

Note that the line with the angle-bracketed name is expanded only in the third block, because the first two used a simple regular expression that did not include the angle-bracketed form.
