Embedding Perl in HTML with Mason
Dave Rolsky
Ken Williams



Chapter 11: Recipes

No, we are not going to teach you how to make a delicious tofu and soybean stew. But this is almost as good. This chapter shows how to do some common Mason tasks, some of them with more than one implementation.

Sessions

For many of our session examples, we will be using the Apache::Session module. Despite its name, this module doesn't actually require mod_perl or Apache, though that is the context in which it was born and in which it's most often used. It implements a simple tied hash interface to a persistent object.1 It has one major gotcha: you must make sure that the session object gets cleaned up properly (usually by letting it go out of scope), so that it will be written to disk after each access.

Without Touching httpd.conf

Here is an example that doesn't involve changing any of your Apache configuration settings. The following code should be placed in a top-level autohandler. Any component that needs to use the session will have to inherit from this component, either directly or via a longer inheritance chain.

It uses cookies to store the session.

  <%once>
   use Apache::Cookie;
   use Apache::Session::File;
  </%once>
  <%init>
   my %c = Apache::Cookie->fetch;
   my $session_id =
       exists $c{masonbook_session} ? $c{masonbook_session}->value : undef;

First, it loads the necessary modules. Normally we recommend that you do this at server startup via a PerlModule directive in your httpd.conf file or in your handler.pl file to save memory, but we load them here just to show you which ones we are using. The component uses the Apache::Cookie module to fetch any cookies that might have been sent by the browser. Then we check for the existence of a cookie called masonbook_session, which if it exists should contain a valid session ID.

  local *MasonBook::Session;
  
  eval {
      tie %MasonBook::Session, 'Apache::Session::File', $session_id, {
          Directory => '/tmp/sessions',
          LockDirectory => '/tmp/sessions',
      };
  };
  
  if ($@) {
      die $@ unless $@ =~ /Object does not exist/;  # Re-throw
  
      $m->redirect('/bad_session.html');       
  }

The first line ensures that when this component ends, the session variable will go out of scope, which triggers Apache::Session's cleanup mechanisms. This is quite important, as otherwise the data will never be written to disk. Even worse, Apache::Session may still be maintaining various locks internally, leading to deadlock. We use local() to localize the symbol table entry *MasonBook::Session; it's not enough to localize just the hash %MasonBook::Session, because the tie() magic is attached to the symbol table entry. It's also worth mentioning that we use a global variable rather than a lexical one, because we want this variable to be available to all components.
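The scoping behavior can be observed without Apache::Session at all. The following is a minimal sketch, assuming a stand-in tied class (Demo::Session and %Demo::Store are hypothetical names) whose DESTROY method plays the role of Apache::Session's save-and-unlock step:

```perl
use strict;
use warnings;

# Demo::Session is a hypothetical stand-in for Apache::Session::File.
package Demo::Session;
require Tie::Hash;
our @ISA = ('Tie::StdHash');
our @events;

# Called when the tied object is freed; a real session class would
# write its data to storage and release its locks at this point.
sub DESTROY { push @events, 'session saved' }

package main;

sub handle_request {
    # Localize the whole glob, then tie the hash that lives in it.
    local *Demo::Store;
    tie %Demo::Store, 'Demo::Session';

    $Demo::Store{visits}++;
    push @Demo::Session::events, 'request finished';
}   # leaving the sub restores the glob, which frees the tied hash

handle_request();
print "$_\n" for @Demo::Session::events;
```

The cleanup event is recorded as soon as handle_request() returns, rather than at program exit, which is the property the autohandler relies on.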

If the value in the $session_id variable is undef, that is not a problem. The Apache::Session module simply creates a new session ID. However, if $session_id is defined but does not represent a valid session, an exception will be thrown. This means either that the user's session has expired or that she's trying to feed us a bogus ID. Either way, we want to tell her what's happened, so we redirect to another page that will explain things. To trap the exception, we wrap the tie() in an eval {} block.

If an exception is thrown, we check $@ to see whether the message indicates that the session isn't valid. Any other error is fatal. If the session isn't valid, we use the redirect() method provided by the request object.
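The trap-and-rethrow pattern can be tried in isolation. This is only a sketch: open_session() and try_session() are hypothetical helpers standing in for the tie() call and the surrounding component code, and the return values stand in for the redirect.

```perl
use strict;
use warnings;

# open_session() is a hypothetical stand-in for the tie() call; it
# fails in two different ways so both branches can be exercised.
sub open_session {
    my $id = shift;
    die "Object does not exist in the data store\n" if $id eq 'expired';
    die "Disk full\n"                               if $id eq 'broken';
    return { _session_id => $id };
}

# Returns 'ok' or 'redirect'; any unexpected error is re-thrown.
sub try_session {
    my $id = shift;
    my $session = eval { open_session($id) };
    if ($@) {
        die $@ unless $@ =~ /Object does not exist/;   # re-throw
        return 'redirect';   # where the component would call $m->redirect
    }
    return 'ok';
}

print try_session('abc123'),  "\n";
print try_session('expired'), "\n";

my $result = eval { try_session('broken') };
my $fatal  = $@;
print "fatal error: $fatal" unless defined $result;
```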

Finally, we send the user a cookie:

  Apache::Cookie->new( $r,
                       name => 'masonbook_session',
                       value => $MasonBook::Session{_session_id},
                       path => '/',
                       expires => '+1d',
                     )->bake;

This simply uses the Apache::Cookie module to ensure that a cookie will be sent to the client with the response headers. This cookie is called 'masonbook_session' and is the one we checked for earlier. It doesn't hurt to send the cookie every time a page is viewed, though this will reset the expiration time of the cookie each time it is set. If you want the cookie to persist for only a certain fixed length of time after the session is created, don't resend the cookie.

  $m->call_next;
  </%init>

This line simply calls the next component in the inheritance chain. Presumably, other components down the line may change the contents of %MasonBook::Session, and those modifications will be written to disk at the end of the request.

Example 11-1 shows the entire component.

Example 11-1. session-autohandler-Apache-Session.comp
  <%once>
   use Apache::Cookie;
   use Apache::Session::File;
  </%once>
  <%init>
   my %c = Apache::Cookie->fetch;
   my $session_id =
       exists $c{masonbook_session} ? $c{masonbook_session}->value : undef;
   
   local *MasonBook::Session;
   
   eval {
       tie %MasonBook::Session, 'Apache::Session::File', $session_id, {
           Directory => '/tmp/sessions',
           LockDirectory => '/tmp/sessions',
       };
   };
   
   if ($@) {
       die $@ unless $@ =~ /Object does not exist/;  # Re-throw
       
       $m->redirect('/bad_session.html');
   }
   
   Apache::Cookie->new( $r,
                        name => 'masonbook_session',
                        value => $MasonBook::Session{_session_id},
                        path => '/',
                        expires => '+1d',
                      )->bake;
   
    $m->call_next;
  </%init>

Predeclaring the Global via an httpd.conf File

It'd be nice to be able to simply use the global session variable without having to type the fully qualified name, %MasonBook::Session, in every component. That can be done by adding this line to your httpd.conf file:

  PerlSetVar MasonAllowGlobals %session

Of course, if you're running more than one Mason-based site that uses sessions, you may need to come up with a unique variable name.

Adding this to your httpd.conf means you can simply reference the %session variable in all of your components, without a qualifying package name. The %session variable would actually end up in the HTML::Mason::Commands package, rather than MasonBook.

Predeclaring the Global via a handler.pl Script

If you have a handler.pl script, you could also use the session-making code we just saw. If you wanted to declare a %session global for all your components, you'd simply pass the allow_globals parameter to your interpreter when you make it, like this:

  my $ah =
      HTML::Mason::ApacheHandler->new( comp_root => ...,
                                       data_dir  => ...,
                                       allow_globals => [ '%session' ] );

You might also choose to incorporate the session-making code into your handler subroutine rather than placing it in a component. This would eliminate the need to make sure that all components inherit from the session-making component.

Using Cache::Cache for Sessions

Just to show you that you don't have to use Apache::Session, here is a simple alternative using Cache::Cache, which is integrated into Mason via the request object's cache() method.

This version also sets up the session in a top-level autohandler just like our first session example. It looks remarkably similar.

  <%once>
   use Apache::Cookie;
   use Cache::FileCache;
   use Digest::SHA1;
  </%once>

Again, for memory savings, you should load these modules at server startup.

  <%init>
   my $cache =
       Cache::FileCache->new( { namespace  => 'Mason-Book-Session',
                                cache_root => '/tmp/sessions',
                                default_expires_in  => 60 * 60 * 24, # 1 day
                                auto_purge_interval => 60 * 60 * 24, # 1 day
                                auto_purge_on_set => 1 } );

This creates a new cache object that will be used to store sessions. Without going into too much detail, the object will store data on the filesystem under /tmp/sessions.2 The namespace is basically equivalent to a subdirectory in this case, and the remaining options tell the cache that, by default, stored data should be purged after one day and that it should check for purgeable items once per day.

  my %c = Apache::Cookie->fetch;
  
  if (exists $c{masonbook_session}) {
      my $session_id = $c{masonbook_session}->value;
      $MasonBook::Session = $cache->get($session_id);
  }
  
  $MasonBook::Session ||=
      { _session_id => Digest::SHA1::sha1_hex( time, rand, $$ ) };

These lines simply retrieve an existing session based on the session ID from the cookie, if such a cookie exists. If this fails or if there was no session ID in the cookie, we make a new one with a randomly generated session ID. The algorithm used here for generating the session ID is more or less the same as the one provided by Apache::Session's Apache::Session::Generate::MD5 module, except that it uses the SHA1 digest module. This algorithm should provide more than enough randomness to ensure that there will never be two identical session IDs generated. It may not be enough to keep people from guessing possible session IDs, though, so if you want to make sure that a session cannot be hijacked, you should incorporate a secret into the digest algorithm input.
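Mixing a secret into the digest input might look like the following sketch. It assumes the core Digest::SHA module, whose sha1_hex() function is called the same way as the Digest::SHA1 one used in this chapter; the $secret value and the new_session_id() name are hypothetical.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Hypothetical server-side secret. In practice it would be loaded
# from somewhere the client can never see it.
my $secret = 'change me to something long and random';

# Same inputs as before (time, rand, process ID) plus the secret.
sub new_session_id {
    return sha1_hex( time, rand, $$, $secret );
}

my $id = new_session_id();
print "$id\n";
```

Someone who can guess the time and process ID still cannot reproduce an ID without also knowing the secret.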

  Apache::Cookie->new( $r,
                       name => 'masonbook_session',
                       value => $MasonBook::Session->{_session_id},
                       path => '/',
                       expires => '+1d',
                     )->bake;

We then set a cookie in the browser that contains the session ID. This cookie will expire in one day. Again, this piece is identical to what we saw when using Apache::Session.

  eval { $m->call_next };
  
  $cache->set( $MasonBook::Session->{_session_id} => $MasonBook::Session );

Unlike with Apache::Session, we need to explicitly tell our cache object to save the data. This means we need to wrap the call to $m->call_next() in an eval {} block in order to catch any exceptions thrown in other components. Otherwise, this part looks almost exactly like our example using Apache::Session.

After saving the session, we rethrow any exception we may have caught, using die $@ if $@; as the last statement of the <%init> block.
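The order of operations (run, save, then rethrow) can be sketched outside Mason. Here %saved is a hypothetical stand-in for the cache object and run_and_save() for the autohandler body. The sketch copies $@ into a lexical before saving, on the assumption that the save step could itself run code that resets $@.

```perl
use strict;
use warnings;

my %saved;   # hypothetical stand-in for the cache object

sub run_and_save {
    my ( $session, $work ) = @_;

    eval { $work->($session) };
    my $error = $@;   # copy it before anything else can reset $@

    # Stand-in for $cache->set(...): runs whether or not $work died.
    $saved{ $session->{_session_id} } = { %$session };

    die $error if $error;
    return 1;
}

my $session = { _session_id => 'abc123' };
eval { run_and_save( $session, sub { $_[0]{cart} = 3; die "oops\n" } ) };
my $error_seen = $@;

print "saved cart: $saved{abc123}{cart}, error: $error_seen";
```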

The entire component is shown in Example 11-2.

Example 11-2. session-autohandler-Cache-Cache.comp
  <%once>
   use Apache::Cookie;
   use Cache::FileCache;
   use Digest::SHA1;
  </%once>
  <%init> 
   my $cache =
       Cache::FileCache->new( { namespace  => 'Mason-Book-Session',
                                cache_root => '/tmp/sessions',
                                default_expires_in  => 60 * 60 * 24, # 1 day
                                auto_purge_interval => 60 * 60 * 24, # 1 day
                                auto_purge_on_set => 1 } );
   
   my %c = Apache::Cookie->fetch;
  
   if (exists $c{masonbook_session}) {
       my $session_id = $c{masonbook_session}->value;
       $MasonBook::Session = $cache->get($session_id);
   }
  
   $MasonBook::Session ||=
       { _session_id => Digest::SHA1::sha1_hex( time, rand, $$ ) };
   
   Apache::Cookie->new( $r,
                        name => 'masonbook_session',
                        value => $MasonBook::Session->{_session_id},
                        path => '/',
                        expires => '+1d',
                      )->bake;
  
   eval { $m->call_next };
   
   $cache->set( $MasonBook::Session->{_session_id} => $MasonBook::Session );
   
   die $@ if $@;
  </%init>

Sessions with Cache::Cache have these major differences from those with Apache::Session:

  • The session itself is not a tied hash. Objects are faster than tied hashes but not as transparent.
  • No attempt is made to track whether or not the session has changed. It is always written to the disk at the end of each request. This trades the performance boost of Apache::Session's behavior for the assurance that the data is always written to disk.

    When using Apache::Session, many programmers are surprised that changes to a nested data structure in the session hash, like:

      $session{user}{name} = 'Bob';

    are not seen as changes to the top-level %session hash. If no changes to this hash are seen, Apache::Session will not write the hash out to storage.

    As a workaround, some programmers may end up doing something like:

      $session{force_a_write}++;

    or:

      $session{last_accessed} = time( );

    after the session is created. Using Cache::Cache and explicitly saving the session every time incurs the same penalty as always changing a member of an Apache::Session hash.
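The nested-structure surprise is a general property of tied hashes and can be reproduced with a small sketch. Demo::Tracked is a hypothetical class that, like the behavior described above, notices only stores into the top-level hash:

```perl
use strict;
use warnings;

# Demo::Tracked is a hypothetical tied hash that sets a flag
# whenever a top-level element is stored.
package Demo::Tracked;
require Tie::Hash;
our @ISA = ('Tie::StdHash');
our $modified = 0;

sub STORE {
    my ( $self, $key, $value ) = @_;
    $modified = 1;
    $self->{$key} = $value;
}

package main;

tie my %session, 'Demo::Tracked';
$session{user} = {};                   # top-level store
$Demo::Tracked::modified = 0;          # pretend the session was just loaded

$session{user}{name} = 'Bob';          # changes only the nested hash
my $after_nested = $Demo::Tracked::modified;

$session{last_accessed} = time;        # top-level store
my $after_top = $Demo::Tracked::modified;

print "after nested change: $after_nested, after top-level change: $after_top\n";
```

The nested assignment only fetches the existing reference from the tied hash, so the flag stays unset until a top-level key is assigned.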

Putting the Session ID in the URL

If you don't want to, or cannot, use cookies, you can store the session ID in the URL. This can be somewhat of a hassle because it means that you have to somehow process all the URLs you generate. Using Mason, this isn't as bad as it could be. There are two ways to do this:

One would be to put a filter in your top-level autohandler that looks something like this:

  <%filter>
   s/href="([^"]+)"/'href="' . add_session_id($1) . '"'/eg;
   s/action="([^"]+)"/'action="' . add_session_id($1) . '"'/eg;
  </%filter>

The add_session_id() subroutine, which should be defined in a module, would look something like this:

  sub add_session_id {
      my $url = shift;
      return $url if $url =~ m{^\w+://}; # Don't alter external URLs
      if ($url =~ /\?/) {
          $url =~ s/\?/?session_id=$MasonBook::Session{_session_id}&/;
      } else {
          $url .= "?session_id=$MasonBook::Session{_session_id}";
      }
  
      return $url;
  }

This routine accounts for external links as well as links with or without an existing query string. However, it doesn't handle links with fragments properly.
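One possible way to handle fragments is to set the fragment aside, add the parameter, and then put the fragment back. The sketch below is an illustration rather than a drop-in replacement: it takes the session ID as a second argument instead of reading the global, and it appends the parameter to an existing query string rather than prepending it.

```perl
use strict;
use warnings;

# Unlike the version above, this sketch takes the session ID as a
# parameter instead of reading a global (an assumption for the demo).
sub add_session_id {
    my ( $url, $session_id ) = @_;
    return $url if $url =~ m{^\w+://};   # don't alter external URLs

    # Set any fragment aside so the query string ends up before it.
    my $fragment = $url =~ s/(#.*)\z//s ? $1 : '';

    my $separator = $url =~ /\?/ ? '&' : '?';
    return $url . $separator . "session_id=$session_id" . $fragment;
}

print add_session_id( '/cart.html#total',  'abc123' ), "\n";
print add_session_id( '/cart.html?item=7', 'abc123' ), "\n";
```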

The drawback to putting this in the <%filter> is that it filters URLs only in the content body, not in headers. Therefore you'll need to handle those cases separately.

The other solution would be to create all URLs (including those intended for redirects) via a dedicated component or subroutine that would add the session ID. This latter solution is probably a better idea, as it handles redirects properly. The drawback with this strategy is that you'll have a Mason component call for every link, instead of just regular HTML.

We'll add a single line (the assignment to $query{session_id} in Example 11-3) to the /lib/url.mas component we saw in Chapter 8. Now this component expects there to be a variable named %UserSession.

Example 11-3. url-plus-session-id.mas
  <%args>
   $scheme   => 'http'
   $username => undef
   $password => ''
   $host     => undef
   $port     => undef
   $path
   %query    => ( )
   $fragment => undef
  </%args>
  <%init>
   my $uri = URI->new;
  
   if ($host) {
      $uri->scheme($scheme);
      if (defined $username) {
        $uri->authority( "$username:$password" );
      }
  
      $uri->host($host);
      $uri->port($port) if $port;
   }
  
   # Sometimes we may want to pass in a query string as part of the
   # path but the URI module will escape the question mark.
   my $q;
  
   if ( $path =~ s/\?(.*)$// ) {
      $q = $1;
   }
  
   $uri->path($path);
   # If there was a query string, we integrate it into the query
   # parameter.
  
   if ($q) {
      %query = ( %query, split /[&=]/, $q );
   }
  
   $query{session_id} = $UserSession{session_id};
  
   # $uri->query_form doesn't handle hash ref values properly
   while ( my ( $key, $value ) = each %query ) {
      $query{$key} = ref $value eq 'HASH' ? [ %$value ] : $value;
   }
  
   $uri->query_form(%query) if %query;
  
   $uri->fragment($fragment) if $fragment;
  </%init>
  <% $uri->canonical | n %>\

Making Use of Autoflush

Every once in a while you may have to output a very large component or file to the client. Simply letting this accumulate in the buffer could use up a lot of memory. Furthermore, the slow response time may make the user think that the site has stalled.

Example 11-4 sends out the contents of a potentially large file without sucking up lots of memory.

Example 11-4. send_file-autoflush.comp
  <%args>
   $filename
  </%args>
  <%init>
   local *FILE;
   open FILE, "< $filename" or die "Cannot open $filename: $!";
   $m->autoflush(1);
   while (<FILE>) {
       $m->print($_);
   }
   $m->autoflush(0);
  </%init>

If each line isn't too large, you might just flush the buffer every once in a while, as in Example 11-5.

Example 11-5. send_file-flush-every-10.comp
  <%args>
   $filename
  </%args>
  <%init>
   local *FILE;
   open FILE, "< $filename" or die "Cannot open $filename: $!";
   while (<FILE>) {
       $m->print($_);
       $m->flush_buffer unless $. % 10;
   }
   $m->flush_buffer;
  </%init>

The unless $. % 10 bit makes use of the special Perl variable $., which is the current line number of the file being read. If this number modulo 10 is equal to zero, we flush the buffer. This means that we flush the buffer every 10 lines. Replace the number 10 with any desired value.
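The effect of the modulo test can be checked outside Mason. In this sketch an in-memory filehandle stands in for the file and the @flushed array (a hypothetical name) records what each call to $m->flush_buffer would have sent:

```perl
use strict;
use warnings;

# Read from an in-memory "file" so the sketch needs no real file.
my $data = join '', map { "line $_\n" } 1 .. 25;
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $buffer = '';
my @flushed;   # stand-in for what $m->flush_buffer would send

while ( my $line = <$fh> ) {
    $buffer .= $line;
    unless ( $. % 10 ) {          # true on lines 10, 20, 30, ...
        push @flushed, $buffer;
        $buffer = '';
    }
}
push @flushed, $buffer if length $buffer;   # final flush
close $fh;

printf "%d flushes\n", scalar @flushed;
```

With 25 input lines, the loop flushes after lines 10 and 20, and the final flush sends the remaining five lines.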

User Authentication and Authorization

One problem that web sites have to solve over and over is user authentication and authorization. These two topics are related but, contrary to what some might think, not the same. Authentication is the process of figuring out if someone is who he says he is, and usually involves checking passwords or keys of some sort. Authorization comes after this, when we want to determine whether or not a particular person is allowed to perform a certain action.

There are a number of modules on CPAN intended to help do these things under mod_perl. In fact, Apache has separate request-handling phases for both authentication and authorization that mod_perl can handle. It is certainly possible to use these modules with Mason.

You can also do authentication and authorization using Mason components (as seen in Chapter 8). Authentication will usually involve some sort of request for a login and password, after which you give the user some sort of token (either in a cookie or a session) that indicates that he has been authenticated. You can then check the validity of this token for each request.

If you have such a token, authorization simply consists of checking that the user to whom the token belongs is allowed to perform a given action.

Using Apache::AuthCookie

The Apache::AuthCookie module, available from CPAN, handles both authentication and authorization via mod_perl and can be easily hooked into Mason. Let's just skip all the details of configuring Apache::AuthCookie, which requires various settings in your server config file, and show how to make the interface to Mason.

Apache::AuthCookie requires that you create a "login script" that will be executed the first time a browser tries to access a protected area. Calling this a script is actually somewhat misleading since it is really a page rather than a script (though it could be a script that generates a page). Regardless, using a Mason component for your login script merely requires that you specify the path to your Mason component for the login script parameter.

We'll call this script AuthCookieLoginForm-login.comp, as shown in Example 11-6.

Example 11-6. AuthCookieLoginForm-login.comp
  <html>
  <head>
  <title>Mason Book AuthCookie Login Form</title>
  </head>
  <body>
  <p>
  Your attempt to access this document was denied
  (<% $r->prev->subprocess_env("AuthCookieReason") %>).  Please enter
  your username and password.
  </p>
  
  <form action="/AuthCookieLoginSubmit">
  <input type="hidden" name="destination" value="<% $r->prev->uri %>">
  <table align="left">
   <tr>
    <td align="right"><b>Username:</b></td>
    <td><input type="text" name="credential_0" size="10" maxlength="10"></td>
   </tr>
   <tr>
    <td align="right"><b>Password:</b></td>
    <td><input type="password" name="credential_1" size="8" maxlength="8"></td>
   </tr>
   <tr>
    <td colspan="2" align="center"><input type="submit" value="Continue"></td>
   </tr>
  </table>
  </form>
  
  </body>
  </html>

This component is a modified version of the example login script included with the Apache::AuthCookie distribution.

The action used for this form, /AuthCookieLoginSubmit, is configured as part of your AuthCookie configuration in your httpd.conf file.

That is about it for interfacing this module with Mason. The rest of authentication and authorization is handled by configuring mod_perl to use Apache::AuthCookie to protect anything on your site that needs authorization. A very simple configuration might include the following directives:

  PerlSetVar MasonBookLoginScript /AuthCookieLoginForm.comp
  
  <Location /AuthCookieLoginSubmit>
    AuthType MasonBook::AuthCookieHandler
    AuthName MasonBook
    SetHandler  perl-script
    PerlHandler MasonBook::AuthCookieHandler->login
  </Location>
  
  <Location /protected>
    AuthType MasonBook::AuthCookieHandler
    AuthName MasonBook
    PerlAuthenHandler MasonBook::AuthCookieHandler->authenticate
    PerlAuthzHandler  MasonBook::AuthCookieHandler->authorize
    require valid-user
  </Location>

The MasonBook::AuthCookieHandler module would look like this:

  package MasonBook::AuthCookieHandler;
  
  use strict;
  
  use base qw(Apache::AuthCookie);
  
  use Digest::SHA1;
  
  my $secret = "You think I'd tell you?  Hah!";
  
  sub authen_cred {
      my $self = shift;
      my $r = shift;
      my ($username, $password) = @_;
  
      # implementing _is_valid_user() is out of the scope of this chapter
      if ( _is_valid_user($username, $password) ) {
          my $session_key =
            $username . '::' . Digest::SHA1::sha1_hex( $username, $secret );
          return $session_key;
      }
  }
  
  sub authen_ses_key {
      my $self = shift;
      my $r = shift;
      my $session_key = shift;
  
      my ($username, $mac) = split /::/, $session_key;
  
      if ( Digest::SHA1::sha1_hex( $username, $secret ) eq $mac ) {
          return $session_key;
      }
  }

This provides the minimal interface an Apache::AuthCookie subclass needs to provide to get authentication working.
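The key format used by these two methods can be exercised on its own. The sketch below assumes the core Digest::SHA module in place of Digest::SHA1 (sha1_hex() is called the same way), and the helper names make_session_key() and session_key_is_valid() are hypothetical:

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

my $secret = 'a hypothetical secret phrase';

# Same construction as authen_cred(): "username::digest".
sub make_session_key {
    my $username = shift;
    return $username . '::' . sha1_hex( $username, $secret );
}

# Same comparison as authen_ses_key(), reduced to a true/false answer.
sub session_key_is_valid {
    my $session_key = shift;
    my ( $username, $mac ) = split /::/, $session_key, 2;
    return 0 unless defined $mac;
    return sha1_hex( $username, $secret ) eq $mac ? 1 : 0;
}

my $good   = make_session_key('faye');
my $forged = 'admin::' . sha1_hex('admin');   # built without the secret

print "good key valid:   ", session_key_is_valid($good),   "\n";
print "forged key valid: ", session_key_is_valid($forged), "\n";
```

A client can read the username out of the key but cannot produce a matching digest for a different username without knowing the secret.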

Authentication Without Cookies

But what if you don't want to use Apache::AuthCookie? Your site may need to work without using cookies.

First, we will show an example authentication system that uses only Mason and passes the authentication token around via the URL (actually, via a session).

This example assumes that we already have some sort of session system that passes the session ID around as part of the URL, as discussed previously.

We start with a quick login form. We will call this component login_form.html, as shown in Example 11-7.

Example 11-7. login_form.html
  <%args>
   $username => ''
   $password => ''
   $redirect_to => ''
   @errors => ( )
  </%args>
  <html>
  <head>
  <title>Mason Book Login</title>
  </head>
  
  <body>
  
  % if (@errors) {
  <h2>Errors</h2>
  %   foreach (@errors) {
  <b><% $_ | h %></b><br>
  %   }
  % }
  
  <form action="login_submit.html">
  <input type="hidden" name="redirect_to" value="<% $redirect_to | h %>">
  <table align="left">
   <tr>
    <td align="right"><b>Login:</b></td>
    <td><input type="text" name="username" value="<% $username | h %>"></td>
   </tr>
   <tr>
    <td align="right"><b>Password:</b></td>
    <td><input type="password" name="password" value="<% $password | h %>"></td>
   </tr>
   <tr>
    <td colspan="2" align="center"><input type="submit" value="Login"></td>
   </tr>
  </table>
  </form>
  
  </body>
  </html>

This form uses some of the same techniques we saw in Chapter 8 to prepopulate the form and handle errors.

Now let's make the component that handles the form submission. This component, called login_submit.html and shown in Example 11-8, will check the username and password and, if they are valid, place an authentication token into the user's session.

Example 11-8. login_submit.html
  <%args>
   $username
   $password
   $redirect_to
  </%args>
  <%init>
   if (my @errors = check_login($username, $password)) {
       $m->comp( 'redirect.mas',
                  path => 'login_form.html',
                  query => { errors => \@errors,
                             username => $username,
                             password => $password,
                             redirect_to => $redirect_to } );
   }
   
   $MasonBook::Session{username} = $username;
   $MasonBook::Session{token} =
       Digest::SHA1::sha1_hex( 'My secret phrase', $username );
   
   $m->comp( 'redirect.mas',
             path => $redirect_to );
  </%init>

This component simply checks (via magic hand waving) whether the username and password are valid and, if so, generates an authentication token that is added to the user's session. To generate this token, we take the username, which is also in the session, and combine it with a secret phrase. We then generate a MAC from those two things.

The authentication and authorization check looks like this:

  if ( $MasonBook::Session{token} ) {
      if ( $MasonBook::Session{token} eq
           Digest::SHA1::sha1_hex( 'My secret phrase',
                                   $MasonBook::Session{username} ) ) {
  
          # ... valid login, do something here
      } else {
          # ... someone is trying to be sneaky!
      }
  } else { # no token
       my $wanted_page = $r->uri;
  
       # Append query string if we have one.
       $wanted_page .= '?' . $r->args if $r->args;
  
       $m->comp( 'redirect.mas',
                  path => '/login/login_form.html',
                  query => { redirect_to => $wanted_page } );
  }

We could put all the pages that require authorization in a single directory tree and have a top-level autohandler in that tree do the check. If there is no token to check, we redirect the browser to the login page; after a successful login, the user is sent back to the page she originally requested.

Access Controls with Attributes

The components we saw previously assumed that there are only two access levels, unauthenticated and authenticated. A more complicated version of this code might involve checking that the user has a certain access level or role.

In that case, we'd first check that we had a valid authentication token and then go on to check that the user actually had the appropriate access rights. This is simply an extra step in the authorization process.

Using attributes, we can easily define access controls for different portions of our site. Let's assume that we have four access levels, Guest, User, Editor, and Admin. Most of the site is public and viewable by anyone. Some parts of the site require a valid login, while some require a higher level of privilege.

We implement our access check in our top-level autohandler, /autohandler, from which all other components must inherit in order for the access control code to be effective.

  <%init>
   my $user = get_user( );  # again, hand waving
  
   my $required_access = $m->base_comp->attr('required_access');
  
   unless ( $user->has_access_level($required_access) ) {
      # ... do something like send them to another page
   }
  
   $m->call_next;
  </%init>
  <%attr>
   required_access => 'Guest'
  </%attr>

It is crucial that we set a default access level in this autohandler. By doing this, we are saying that, by default, all components are accessible by all people, since every visitor will have at least Guest access.

We can override this default elsewhere. For example, in a component called /admin/autohandler, we might have:

  <%attr>
   required_access => 'Admin'
  </%attr>

As long as all the components in the /admin/ directory inherit from the /admin/autohandler component and don't override the required_access attribute, we have effectively limited that directory (and its subdirectories) to admin users only. If we for some reason had an individual component in the /admin/ directory that we wanted editors to be able to see, we could simply set the required_access attribute for that component to 'Editor' .
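The interaction between inherited attributes and access levels can be modeled in plain Perl. This is a simulation only, not Mason's API: the %rank table, has_access_level(), and required_access() are hypothetical, and each hash stands in for the <%attr> block of one component in the inheritance chain.

```perl
use strict;
use warnings;

# Hypothetical ranking of the four access levels named in the text.
my %rank = ( Guest => 0, User => 1, Editor => 2, Admin => 3 );

sub has_access_level {
    my ( $user_level, $required ) = @_;
    return $rank{$user_level} >= $rank{$required} ? 1 : 0;
}

# Simulated attribute lookup: walk from the requested component up
# through its parents and use the first required_access that is set.
sub required_access {
    my @chain = @_;   # most specific component first
    for my $attrs (@chain) {
        return $attrs->{required_access} if exists $attrs->{required_access};
    }
    return;
}

my %root_autohandler  = ( required_access => 'Guest' );
my %admin_autohandler = ( required_access => 'Admin' );
my %editor_page       = ( required_access => 'Editor' );
my %plain_page        = ();

my $admin_default = required_access( \%plain_page,  \%admin_autohandler, \%root_autohandler );
my $admin_special = required_access( \%editor_page, \%admin_autohandler, \%root_autohandler );

print "plain admin page requires $admin_default\n";
print "overriding page requires $admin_special\n";
print "editor allowed on plain admin page: ", has_access_level( 'Editor', $admin_default ), "\n";
```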

Co-Branding Color Schemes

One common business practice these days is to take a useful site and offer "co-branded" versions of it to other businesses. A co-branded site might display different graphics and text for each client while retaining the same basic layout and functionality across all clients.

Mason is extremely well-suited to this task. Let's look at how we might apply a new color scheme to each co-brand.

For the purpose of these examples, we're going to assume that the name of the co-brand has already been determined and is being passed to our components as a variable called $cobrand. This variable could be set up by including the co-brand in the query string, in a session, or as part of a hostname.

With Stylesheets

One way to do this is to use stylesheets for all of your pages. Each co-brand will then have a different stylesheet. However, since most of each stylesheet will be the same for every client, you'll probably want to have a parent stylesheet that all the others inherit from.

Of course, while it is supposed to be possible to inherit stylesheets, some older browsers like Netscape 4.x don't support that at all, so we will generate the stylesheet on the fly using Mason instead. This gives you all the flexibility of inheritance without the compatibility headaches.

The stylesheet will be called via:

  <link rel="stylesheet" href="/styles.css?cobrand=<% $cobrand %>">

Presumably, this snippet would go in a top-level autohandler.3 The styles.css component might look something like Example 11-9.

Example 11-9. styles.css
  % while (my ($name, $def) = each %styles) {
  <% $name %> <% $def %>
  % }
  <%args>
   $cobrand
  </%args>
  <%init>
   my %styles;
   
   die "Security violation, cobrand=$cobrand" unless $cobrand =~ /^\w+$/;
   foreach my $file ('default.css', "$cobrand.css") {
       local *FILE;
       open FILE, "< /var/styles/$file"
           or die "Cannot read /var/styles/$file: $!";
       while (<FILE>) {
           next unless /(\S+) \s+ (\S.*)/x;
           $styles{$1} = $2;
       }
       close FILE;
   }
   
   $r->content_type('text/css');
  </%init>

Of course, this assumes that each line of the stylesheet represents a single style definition, something like:

  .foo_class { color: blue }

This isn't that hard to enforce for a project, but it limits you to just a subset of CSS functionality. If this is not desirable, check out the CSS and CSS::SAC modules on CPAN.

This component first grabs all the default styles from the default.css file and then overwrites any styles that are defined in the co-brand-specific file.
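The override behavior comes from nothing more than the order in which the files are read into the hash. A small sketch with in-memory strings in place of the files (merge_styles() and the "acme" co-brand are hypothetical):

```perl
use strict;
use warnings;

# Later sources override earlier ones, one style definition per line.
sub merge_styles {
    my @sources = @_;
    my %styles;
    for my $css (@sources) {
        for my $line ( split /\n/, $css ) {
            next unless $line =~ /(\S+) \s+ (\S.*)/x;
            $styles{$1} = $2;
        }
    }
    return %styles;
}

my $default = ".title { color: black }\n.menu { color: gray }\n";
my $acme    = ".title { color: red }\n";    # hypothetical co-brand file

my %styles = merge_styles( $default, $acme );
print "$_ $styles{$_}\n" for sort keys %styles;
```

The co-brand's .title rule replaces the default one, while .menu falls through from the defaults.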

One nice aspect of this method is that if the site designers are not programmers, they can just work with plain old stylesheets, which should make them more comfortable.

With Code

Another way to do this is to store the color preferences for each co-brand in a component or perhaps in the database. At the beginning of each request, you could fetch these colors and pass them to each component.

For example, in your top-level autohandler you might have:

  <%init>
   my $cobrand = determine_cobrand( ); # magic hand waving again
  
   my %colors = cobrand_colors($cobrand);
  
   $m->call_next(%ARGS, colors => \%colors);
  </%init>

The cobrand_colors() subroutine could be made to use defaults whenever they were not overridden for a given co-brand.
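One simple way to get that fallback behavior is to merge a hash of defaults with a per-co-brand hash of overrides. The color values and the "acme" co-brand below are made up for illustration, and a real implementation might fetch them from a database instead:

```perl
use strict;
use warnings;

# Hypothetical data; these values are made up for the sketch.
my %default_colors = ( body_bgcolor => '#ffffff', footer_text => '#333333' );
my %overrides      = ( acme => { body_bgcolor => '#ffeecc' } );

sub cobrand_colors {
    my $cobrand = shift;
    my $custom  = defined $cobrand ? $overrides{$cobrand} : undef;

    # Later pairs win, so overrides replace the matching defaults.
    return ( %default_colors, %{ $custom || {} } );
}

my %acme    = cobrand_colors('acme');
my %unknown = cobrand_colors('nobody');

print "acme body: $acme{body_bgcolor}, acme footer: $acme{footer_text}\n";
print "unknown body: $unknown{body_bgcolor}\n";
```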

Then the components might do something like this:

  <%args>
   %colors
  </%args>
  
  <html>
  <head>
  <title>Title</title>
  </head>
  
  <body bgcolor="<% $colors{body_bgcolor} %>">
  ...

This technique is a bit more awkward, as it requires that you have a color set for every possibility ($colors{left_menu_table_cell}, $colors{footer_text}, ad nauseam). It also works only for colors, whereas stylesheets allow you to customize fonts and layouts. But if you're targeting browsers that don't support stylesheets or you don't know CSS, this is a possible alternative.

Developer Environments

Having a development environment is a good thing for many reasons. Testing potential changes on your production server is likely to get you fired, for one thing.

Ideally, you want each developer to have his own playground where changes he makes don't affect others. Then, when something is working, it can be checked into source control and everyone else can use the updated version.

Multiple Component Roots

A fairly simple way to achieve this goal is by giving each developer his own component root, which will be checked before the main root.

Developers can work on components in their own private roots without fear of breaking anything for anyone else. Once changes are made, the altered component can be checked into source control and moved into the shared root, where everyone will see it.

This means that one HTML::Mason::ApacheHandler object needs to be created for each developer. This can be done solely by changing your server configuration file, but it is easiest to do this using an external handler.

The determination of which object to use can be made either by looking at the URL path or by using a different hostname for each developer.

By path

This example checks the URL to determine which developer's private root to use:

  use Apache::Constants qw(DECLINED);
  use HTML::Mason::ApacheHandler;
  
  my %ah;
  
  sub handler {
      my $r = shift;
      my $uri = $r->uri;
      # remove the developer name from the path, declining if there isn't one
      $uri =~ s,^/(\w+),, or return DECLINED;
  
      my $developer = $1;
  
      $r->uri($uri);  # set the uri to the new path
  
      $ah{$developer} ||=
        HTML::Mason::ApacheHandler->new
            ( comp_root => [ [ dev  => "/home/$developer/mason" ],
                             [ main => '/var/www' ] ],
              data_dir => "/home/$developer/data" );
  
      return $ah{$developer}->handle_request($r);
  }

We first examine the URL of the request to find the developer name, which we assume will always be the first part of the path, like /faye/index.html. We use a regex to remove this from the URL, which we then change to be the altered path.

If there is no developer name we simply decline the request.
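
The path-splitting step can be tried outside of Apache with a small sketch. The split_developer() helper is hypothetical and simply wraps the same substitution used in the handler.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (developer, remaining path), or an empty
# list when the path does not start with a word.
sub split_developer {
    my ($uri) = @_;
    return unless $uri =~ s,^/(\w+),,;
    return ( $1, $uri );
}

foreach my $uri ( '/faye/index.html', '/faye/reviews/list.html', '/' ) {
    my ( $developer, $rest ) = split_developer($uri);
    if ( defined $developer ) {
        print "$uri: developer=$developer path=$rest\n";
    }
    else {
        print "$uri: no developer name, request would be declined\n";
    }
}
```

Note that the pattern treats any leading word as a developer name, so a request for /index.html would yield the developer "index" rather than being declined.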

The main problem with this approach is that it would then require that all URLs on the site be relative in order to preserve the developer's name in the path. In addition, some Apache features like index files and aliases won't work properly either. Fortunately, there is an even better way.

By hostname

This example lets you give each developer their own hostname:

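  use Apache::Constants qw(DECLINED);
  use HTML::Mason::ApacheHandler;
  
  my %ah;
  
  sub handler {
      my $r = shift;
  
      # decline requests whose hostname has no leading developer name
      my ($developer) = $r->hostname =~ /^(\w+)\./
          or return DECLINED;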
  
      $ah{$developer} ||=
        HTML::Mason::ApacheHandler->new
            ( comp_root => [ [ dev  => "/home/$developer/mason" ],
                             [ main => '/var/www' ] ],
              data_dir => "/home/$developer/data" );
  
      return $ah{$developer}->handle_request($r);
  }

This example assumes that for each developer there is a DNS entry like dave.dev.masonbook.com. You could also insert a CNAME wildcard entry in your DNS. The important part is that the first piece is the developer name.

Of course, with either method, developers will have to actively manage their development directories. Any component in their directories will block their access to a component of the same name in the main directory.

Multiple Server Configurations

The multiple component root method has several downsides:

  • Modules are shared by all the developers. If a change is made to a module, everybody will see it. This means that API changes are forced out to everyone at once, and a runtime error will affect all the developers. Additionally, you may need to stop and start the server every time a module is changed, interrupting everyone (although you could use Apache::Reload from CPAN to avoid this).
  • You can't test different server configurations without all the developers being affected.
  • Truly catastrophic errors that bring down the web server affect everyone.
  • The logs are shared, so if you like to send messages to the error log for debugging you'd better hope that no one else is doing the same thing or you'll have a mess.

The alternative is to run a separate daemon for each developer, each on its own port. This means maintaining either one fairly complicated configuration file with a lot of <IfDefine> directives, or separate configuration files for each developer.

The latter is probably preferable as it gives each developer total freedom to experiment. The configuration files can be generated from a template (possibly using Mason) or a script. Then each developer's server can listen on a different hostname or port for requests.

You can have each server's component root be the developer's working directory, which should mirror the layout of the real site. This means that there is no need to tweak any paths in the components.

This method's downside is that it will inevitably use up more memory than having a single server. It also requires a greater initial time investment in order to generate the configuration file templates. But the freedom it gives to individual developers is very nice, and the time investment is fixed.

Of course, since each developer has a computer, there is nothing to stop a developer from simply setting up Apache and mod_perl locally. And the automation would be even easier since there's no need to worry about dealing with unique port numbers or shared system resources. Even better (or worse, depending on your point of view), a developer can check out the entire system onto a laptop and work on the code without needing to be on the office network.

Managing DBI Connections

Not infrequently, we see people on the Mason users list asking questions about how to handle caching DBI connections.

Our recipe for this is really simple:

  use Apache::DBI

Rather than reinventing the wheel, use Apache::DBI, which provides the following features:

  • It is completely transparent to use. Once you've loaded it (before any other module that uses DBI), you simply call DBI->connect() as always and Apache::DBI gives you an existing handle if one is available.
  • It makes sure that the handle is live, so that if your RDBMS goes down and then back up, your connections still work just fine.
  • It does not cache handles made before Apache forks, as many DBI drivers do not support using a handle after a fork.

Using Mason Outside of Dynamic Web Sites

So far we've spent a lot of time telling you how to use Mason to generate spiffy web stuff on the fly, whether that be HTML, WML, or even dynamic SVG files.

But Mason can be used in lots of other contexts. For example, you could write a Mason app that recursively descends a directory tree and calls each component in turn to generate a set of static pages.

How about using Mason to generate configuration files from templates? This could be quite useful if you had to configure a lot of machines similarly but with each one slightly different (for example, a web server farm).

Generating a Static Site from Components

Many sites might be best implemented as a set of static files instead of as a set of dynamically created responses to requests. For example, if a site's content changes only once or twice a week, generating each page dynamically upon request is probably overkill. In addition, you can often find much cheaper web hosting if you don't need a mechanism for generating pages dynamically.

But we'd still like some of the advantages a Mason site can give us. We'd like to build the site based on a database of content. We'd also like to have a nice consistent set of headers and footers, as well as automatically generate some bits for each page from the database. And maybe, just maybe, we also want to be able to make look-and-feel changes to the site without resorting to a multi-file find-and-replace. These requirements suggest that Mason is a good choice for site implementation.

For our example in this section, we'll consider a site of film reviews. It is similar to a site that one of the authors actually created for Hong Kong film reviews. Our example site will essentially be a set of pages that show information about films, including the film's title, year of release, director, cast, and of course a review. We'll generate the site from the Mason components on our home GNU/Linux box and then upload the site to the host.

First, we need a directory layout. Assuming that we're starting in the directory /home/dave/review-site, here's the layout:

  /home/dave/review-site (top level)
    /htdocs
      - index.html
      /reviews
        - autohandler
        - Anna_Magdalena.html
        - Lost_and_Found.html
        - ... (one file for each review)
  
    /lib
      - header.mas
      - footer.mas
      - film_header_table.mas

The index page will be quite simple. It will look like Example 11-10.

Example 11-10. review-site/htdocs/index.html
  <& /lib/header.mas, title => 'review list' &>
  <h1>Pick a review</h1>
  <ul>
  % foreach my $title (sort keys %pages) {
   <li><a href="<% $pages{$title} | h %>"><% $title | h %></a></li>
  % }
  </ul>
  <%init>
   my %pages;
  
   local *DIR;
   my $dir = File::Spec->catfile( File::Spec->curdir, 'reviews' );
   opendir DIR, $dir
       or die "Cannot open $dir dir: $!";
  
   foreach my $file ( grep { /\.html$/ } readdir DIR ) {
       next if $file =~ /index\.html$/;
  
       my $comp = $m->fetch_comp("reviews/$file")
         or die "Cannot find reviews/$file component";
  
       my $title = $comp->attr('film_title');
  
       $pages{$title} = "reviews/$file";
   }
  
   closedir DIR
       or die "Cannot close $dir dir: $!";
  </%init>

This component simply makes a list of the available reviews, based on the files ending in .html in the /home/dave/review-site/htdocs/reviews subdirectory. We assume that the actual film title is kept as an attribute (via an <%attr> section) of the component, so we load the component and ask it for the film_title attribute. If it doesn't have one, Mason will throw an exception, which we think is better than having an empty link. If this were a dynamic web site, we might instead want to simply skip that review and go on to the next one, but here we're assuming that this script is being executed by a human being capable of fixing the error.

We make sure to HTML-escape both the filename in the <a> tag's href attribute and the film title in the link text. It's not unlikely that a film title could contain an ampersand character (&), and we want to generate proper HTML.

Next, let's make our autohandler for the reviews subdirectory (Example 11-11), which will take care of all the repeated work that goes into displaying a review.

Example 11-11. review-site/htdocs/reviews/autohandler
  <& /lib/header.mas, title => $film_title &>
  <& /lib/film_header_table.mas, comp => $m->base_comp &>
  % $m->call_next;
  <& /lib/footer.mas &>
  
  <%init>
   my $film_title = $m->base_comp->attr('film_title');
  </%init>

Again, a very simple page. We grab the film title so we can pass it to the header component. Then we call the film_header_table.mas component, which will use attributes from the component it is passed to generate a table containing the film's title, year of release, cast, and director.

Then we call the review component itself via call_next() and finish up with the footer.

Our header (Example 11-12) is quite straightforward.

Example 11-12. review-site/lib/header.mas
  <html>
  <head>
  <title><% $real_title | h %></title>
  </head>
  <body>
  
  <%args>
   $title
  </%args>
  <%init>
   my $real_title = "Dave's Reviews - $title";
  </%init>

This is a nice, simple header that generates the basic HTML pieces every page needs. Its only special feature is that it will make sure to incorporate a unique title, based on what is passed in the $title argument.

The footer (Example 11-13) is the simplest of all.

Example 11-13. review-site/lib/footer.mas
  <p>
  <em>Copyright &copy; David Rolsky, 1996-2002</em>.
  </p>
  
  <p>
  <em>All rights reserved.  No part of the review may be reproduced or
  transmitted in any form or by any means, electronic or mechanical,
  including photocopying, recording, or by any information storage and
  retrieval system, without permission in writing from the copyright
  owner.</em>
  </p>
  
  </body>
  </html>

There's one last building block piece left before we get to the reviews, the /lib/film_header_table.mas component (Example 11-14).

Example 11-14. review-site/lib/film_header_table.mas
  <table width="100%">
   <tr>
    <td colspan="2" align="center"><h1><% $film_title | h %></h1></td>
   </tr>
  % foreach my $field ( grep { exists $data{$_} } @optional ) {
   <tr>
    <td><strong><% ucfirst $field %></strong>:</td>
    <td><% $data{$field} | h %></td>
   </tr>
  % }
  </table>
  
  <%args>
   $comp
  </%args>
  
  <%init>
   my %data;
   my $film_title = $comp->attr('film_title');
   my @optional = qw( year director cast );
   foreach my $field (@optional)  {
       my $data = $comp->attr_if_exists($field);
       next unless defined $data;
       $data{$field} = ref $data ? join ', ', @$data : $data;
   }
  </%init>

This component just builds a table based on the attributes of the component passed to it. The only required attribute is the film's title, but we can also accommodate the year, director(s), and cast.

There are only two slightly complex lines.

The first is:

  % foreach my $field ( grep { exists $data{$_} } @optional ) {

Here we are iterating through the fields in @optional that have matching keys in %data. We could have simply called keys %data, but we want to display things in a specific order while still skipping nonexistent keys.

The other line that bears some explaining is:

  $data{$field} = ref $data ? join ', ', @$data : $data;

We check whether the value is a reference so that the attribute can contain an array reference, which is useful for something like the cast, which is probably going to have more than one person in it. If it is an array reference, we join all its elements together into a comma-separated list. Otherwise, we simply use it as-is.
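
Both lines can be exercised without Mason in a short sketch. The %attrs hash here is hypothetical sample data standing in for the values that $comp->attr_if_exists() would return; there is deliberately no director entry.

```perl
use strict;
use warnings;

# Hypothetical attribute values standing in for a review component's
# <%attr> section.
my %attrs = (
    year => 1996,
    cast => [ 'Kelly Chan Wai-Lan', 'Takeshi Kaneshiro' ],
);

my @optional = qw( year director cast );

my %data;
foreach my $field (@optional) {
    my $value = $attrs{$field};
    next unless defined $value;

    # Array references become comma-separated lists; scalars pass through.
    $data{$field} = ref $value ? join( ', ', @$value ) : $value;
}

# The missing director is skipped; year and cast keep @optional's order.
foreach my $field ( grep { exists $data{$_} } @optional ) {
    print ucfirst($field), ": $data{$field}\n";
}
```

Iterating over keys %data instead would print the same fields, but in whatever order Perl's hash happens to return them.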

Let's take a look at what one of the review components might look like:

  <%attr>
   film_title => 'Lost and Found'
   year => 1996
   director => 'Lee Chi-Ngai'
   cast => [ 'Kelly Chan Wai-Lan', 'Takeshi Kaneshiro', 'Michael Wong Man-Tak' ]
  </%attr>
  
  <p>
   Takeshi Kaneshiro plays a man who runs a business called Lost and
   Found, which specializes in searching for lost things and people. In
   the subtitles, his name is shown as That Worm, though that seems
   like a fairly odd name, cultural barriers notwithstanding. Near the
   beginning of the film, he runs into Kelly Chan. During their first
   conversation, she says that she has lost something. What she says
   she has lost is hope. We soon find out that she has leukemia and
   that the hope she seeks seems to be Michael Wong, a sailor who works
   for her father's shipping company.
  </p>
  
  <p>
   blah blah blah...
  </p>

This makes writing new reviews really easy. All we do is type in the review and a small number of attributes, and the rest of the framework is built automatically.

A more complex version of this site might store some or all of the data, including the reviews, in a database, which would make it easier to reuse the information in another context. But this is certainly good enough for a first pass.

All that's left is the script that will generate the static HTML files. See Example 11-15.

Example 11-15. review-site/generate_html.pl
  #!/usr/bin/perl -w
  
  use strict;  # Always use strict!
  
  use Cwd;
  use File::Basename;
  use File::Find;
  use File::Path;
  use File::Spec;
  use HTML::Mason;
  
  # These are directories.  The canonpath method removes any cruft
  # like doubled slashes.
  my ($source, $target) = map { File::Spec->canonpath($_) } @ARGV;
  
  die "Need a source and target\n"
      unless defined $source && defined $target;
  
  # Make target absolute because File::Find changes the current working
  # directory as it runs.
  $target = File::Spec->rel2abs($target);
  
  my $interp =
      HTML::Mason::Interp->new( comp_root => File::Spec->rel2abs(cwd) );
  
  find( \&convert, $source );
  
  sub convert {
      # We don't want to try to convert our autohandler or .mas
      # components.  $_ contains the filename
      return unless /\.html$/;
  
      my $buffer;
      # This will save the component's output in $buffer
      $interp->out_method(\$buffer);
  
      # We want to split the path to the file into its components and
      # join them back together with a forward slash in order to make
      # a component path for Mason
      #
      # $File::Find::name has the path to the file we are looking at,
      # including the source directory that was passed to find()
      my $comp_path = join '/', File::Spec->splitdir($File::Find::name);
  
      $interp->exec("/$comp_path");
      # Strip off leading part of path that matches source directory
      my $name = $File::Find::name;
      $name =~ s/^\Q$source\E//;
  
      # Generate absolute path to output file
      my $out_file = File::Spec->catfile( $target, $name );
      # In case the directory doesn't exist, we make it
      mkpath(dirname($out_file));
  
      local *RESULT;
      open RESULT, "> $out_file" or die "Cannot write to $out_file: $!";
      print RESULT $buffer or die "Cannot write to $out_file: $!";
      close RESULT or die "Cannot close $out_file: $!";
  }

We take advantage of the File::Find module included with Perl, which can recursively descend a directory structure and invoke a callback for each file found. We simply have our callback (the convert() subroutine) call the HTML::Mason::Interp object's exec() method for each file ending in .html. We then write the results of the component call out to disk in the target directory.

We also use a number of other modules, including Cwd, File::Basename, File::Path, and File::Spec. These modules are distributed as part of the Perl core and provide useful functions for dealing with the filesystem in a cross-platform-compatible manner.
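
The mapping from a source file to its output location can be sketched on its own. Note that this sketch swaps in File::Spec's abs2rel() method in place of the regex substitution used in convert(); the target_path() helper and the /tmp/site directory are hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: map a file found under $source to its output
# location under $target.
sub target_path {
    my ( $source, $target, $found ) = @_;

    # abs2rel() gives the path of $found relative to $source.
    my $relative = File::Spec->abs2rel( $found, $source );

    return File::Spec->catfile( $target, $relative );
}

print target_path( 'htdocs', '/tmp/site',
                   'htdocs/reviews/Lost_and_Found.html' ), "\n";
```

On a Unix-like system this prints /tmp/site/reviews/Lost_and_Found.html. Because abs2rel() works on path segments rather than patterns, it is not affected by regex metacharacters in the source directory name.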

You may have noticed in Example 9-1 that when we invoked the Interpreter's exec() method directly, it didn't attempt to handle any of the web-specific elements of the request.

The same method is employed again here in our HTML generation script, and this same methodology could be applied in other situations that have little or nothing to do with the web.

Generating Config Files

Config files are a good candidate for Mason. For example, your production and staging web server config files might differ in only a few areas. Changes to one will usually need to be propagated to the other. This is especially true with mod_perl, where web server configuration can basically be part of a web-based application.

And if you adopt the per-developer server solution discussed earlier, a template-driven config file generator becomes even more appealing.

Example 11-16 is a simple script to drive this generation.

Example 11-16. config_maker.pl
  #!/usr/bin/perl -w
  
  use strict;
  
  use Cwd;
  use File::Spec;
  use HTML::Mason;
  use User::pwent;
  
  my $comp_root =
      File::Spec->rel2abs( File::Spec->catfile( cwd( ), 'config' ) );
  
  my $output;
  my $interp =
      HTML::Mason::Interp->new( comp_root  => $comp_root,
                                out_method => \$output,
                              );
  
  my $user = getpwuid($<);
  
  $interp->exec( '/httpd.conf.mas', user => $user );
  
  my $file =  File::Spec->catfile( $user->dir, 'etc', 'httpd.conf' );
  open FILE, ">$file" or die "Cannot open $file: $!";
  print FILE $output or die "Cannot write to $file: $!";
  close FILE or die "Cannot close $file: $!";

The httpd.conf.mas component might look like Example 11-17.

Example 11-17. config/httpd.conf.mas
  ServerRoot <% $user->dir %>
  PidFile <% File::Spec->catfile( $user->dir, 'logs', 'httpd.pid' ) %>
  LockFile <% File::Spec->catfile( $user->dir, 'logs', 'httpd.lock' ) %>
  Port <% $user->uid + 5000 %>
  
  # loads Apache modules, defines content type handling, etc.
  <& standard_apache_config.mas &>
  
  <Perl>
   use lib '<% File::Spec->catfile( $user->dir, 'project', 'lib' ) %>';
  </Perl>
  
  DocumentRoot <% File::Spec->catfile( $user->dir, 'project', 'htdocs' ) %>
  
  PerlSetVar MasonCompRoot <% File::Spec->catfile( $user->dir, 'project', 'htdocs' ) %>
  PerlSetVar MasonDataDir <% File::Spec->catfile( $user->dir, 'mason' ) %>
  PerlModule HTML::Mason::ApacheHandler
  
  <LocationMatch "\.html$">
   SetHandler  perl-script
   PerlHandler HTML::Mason::ApacheHandler
  </LocationMatch>
  
  <%args>
  $user
  </%args>

This points the server's document root to the developer's working directory. Similarly, it adds the project/lib directory to Perl's @INC via use lib so that the user's working copy of the project's modules is seen first. The server will listen on a port equal to the user's user ID plus 5,000.

Obviously, this is an incomplete example. It doesn't specify where logs, or other necessary config items, will go. It also doesn't handle generating the config file for a server intended to be run by the root user on a standard port.

Footnotes

1. If you are not familiar with Perl's tied variable feature, we suggest reading the perltie manpage (perldoc perltie).

2. See the documentation accompanying the Cache::Cache modules for more detail.

3. The overachieving reader may want to imagine a dhandler-based solution with URLs like /styles/<cobrand>.css.



These HTML pages were created by running this script against the pseudo-POD source.